Bayesian Space-time Downscaling Fusion Model (Downscaler) - Derived Estimates of Air Quality for 2021

EPA-454/R-24-002
October 2024

U.S. Environmental Protection Agency
Office of Air Quality Planning and Standards
Air Quality Assessment Division
Research Triangle Park, NC

Authors:
Adam Reff (EPA/OAR)
Alison Eyth (EPA/OAR)
David Mintz (EPA/OAR)
Janice Godfrey (EPA/OAR)
Jeff Vukovich (EPA/OAR)
Julia Black (EPA/OAR)
Karl Seltzer (EPA/OAR)
Sharon Phillips (EPA/OAR)

Acknowledgements:
The following people served as reviewers of this document: Julia Black (EPA/OAR) and David Mintz (EPA/OAR).

Contents
1.0 Introduction
2.0 Air Quality Data
    2.1 Introduction to Air Quality Impacts in the United States
    2.2 Ambient Air Quality Monitoring in the United States
    2.3 Air Quality Indicators Developed for the EPHT Network
3.0 Emissions Data
    3.1 Introduction to Emissions Data Development
    3.2 Emission Inventories and Approaches
    3.3 Emissions Modeling Summary
    3.4 Emissions References
4.0 CMAQ Air Quality Model Estimates
    4.1 Introduction to the CMAQ Modeling Platform
    4.2 CMAQ Model Version, Inputs and Configuration
    4.3 CMAQ Model Performance Evaluation
5.0 Bayesian space-time downscaling fusion model (downscaler) - Derived Air Quality Estimates
    5.1 Introduction
    5.2 Downscaler Model
    5.3 Downscaler Concentration Predictions
    5.4 Downscaler Uncertainties
    5.5 Summary and Conclusions
Appendix A - Acronyms
Appendix B - Emissions Totals by Sector

1.0 Introduction
This report describes estimates of daily ozone (maximum 8-hour average) and fine particulate matter (PM2.5) (24-hour average) concentrations throughout the contiguous United States during the 2021 calendar year generated by EPA's recently developed data fusion method termed the "downscaler model" (DS).
Air quality monitoring data from the State and Local Air Monitoring Stations (SLAMS) and numerical output from the Community Multiscale Air Quality (CMAQ) model were both input to DS to predict concentrations at the 2010 and 2020 U.S. census tract centroids encompassed by the CMAQ modeling domain. Information on EPA's air quality monitors, CMAQ model, and DS is included to provide the background and context for understanding the data output presented in this report. These estimates are intended for use by statisticians and environmental scientists interested in the daily spatial distribution of ozone and PM2.5.

DS operates by calibrating CMAQ data to the observational data and then using the resulting relationship to predict "observed" concentrations at new spatial points in the domain. Although similar in principle to a linear regression, DS incorporates spatial modeling aspects to improve the model fit, and it uses a Bayesian [1] approach to fitting in order to generate an uncertainty value associated with each concentration prediction. The uncertainties that DS produces are a major feature distinguishing it from earlier fusion methods used by EPA, such as the "Hierarchical Bayesian" (HB) model (McMillan et al., 2009). The term "downscaler" refers to the fact that DS takes grid-averaged data (CMAQ) as input and produces point-based estimates, thus "scaling down" the area of data representation. Although this allows air pollution concentration estimates to be made at points where no observations exist, caution is needed when interpreting any within-grid-cell spatial gradients generated by DS, since they may not exist in the input datasets. The theory, development, and initial evaluation of DS can be found in the earlier papers of Berrocal, Gelfand, and Holland (2009, 2010, and 2011).
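The calibration idea described above can be illustrated with a minimal sketch. It shows only the "similar in principle to a linear regression" step: an ordinary least squares fit of monitor observations against co-located CMAQ values, followed by prediction at new points. It is not the downscaler itself, since it omits the spatial modeling aspects and the Bayesian fitting that produces uncertainties. The function names and all numeric values are hypothetical and chosen for illustration.

```python
import numpy as np

def fit_linear_calibration(cmaq_at_monitors, observed):
    """Least squares fit of: observed = b0 + b1 * cmaq (illustrative only)."""
    cmaq = np.asarray(cmaq_at_monitors, dtype=float)
    obs = np.asarray(observed, dtype=float)
    # Design matrix with an intercept column and the CMAQ value column.
    design = np.column_stack([np.ones_like(cmaq), cmaq])
    coef, *_ = np.linalg.lstsq(design, obs, rcond=None)
    return coef  # array holding [b0, b1]

def predict_at_points(coef, cmaq_at_new_points):
    """Apply the fitted relationship to CMAQ values at unmonitored points."""
    cmaq_new = np.asarray(cmaq_at_new_points, dtype=float)
    return coef[0] + coef[1] * cmaq_new

# Hypothetical CMAQ grid-cell values at four monitor locations and the
# matching monitor observations (constructed so that obs = 4 + 0.8 * cmaq).
cmaq_values = [30.0, 40.0, 50.0, 60.0]
monitor_obs = [28.0, 36.0, 44.0, 52.0]

coef = fit_linear_calibration(cmaq_values, monitor_obs)
prediction = predict_at_points(coef, [45.0])
print(round(float(prediction[0]), 2))  # approximately 40.0
```

In this simplified form a single intercept and slope apply everywhere in the domain, and the prediction carries no uncertainty estimate; those are the gaps that the spatial and Bayesian components of DS are described as addressing.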
[1] Bayesian statistical modeling refers to methods that are based on Bayes' theorem and model the world in terms of probabilities based on previously acquired knowledge.

The Office of Air Quality Planning and Standards (OAQPS), within EPA's Office of Air and Radiation (OAR), provides air quality monitoring data and model estimates to the Centers for Disease Control and Prevention (CDC) for use in their Environmental Public Health Tracking (EPHT) Network. CDC's EPHT Network supports the linkage of air quality data with human health outcome data for use by various public health agencies throughout the U.S. The EPHT Network Program is a multidisciplinary collaboration that involves the ongoing collection, integration, analysis, interpretation, and dissemination of data from: environmental hazard monitoring activities; human exposure assessment information; and surveillance of noninfectious health conditions. As part of the National EPHT Program efforts, the CDC led the initiative to build the National EPHT Network (https://www.cdc.gov/nceh/tracking/). The National EPHT Program, with the EPHT Network as its cornerstone, is the CDC's response to requests calling for improved understanding of how the environment affects human health. The EPHT Network is designed to provide the means to identify, access, and organize hazard, exposure, and health data from a variety of sources and to examine, analyze, and interpret those data based on their spatial and temporal characteristics. Since 2002, EPA has collaborated with the CDC on the development of the EPHT Network.
On September 30, 2003, the Secretary of Health and Human Services (HHS) and the Administrator of EPA signed a joint Memorandum of Understanding (MOU) with the objective of advancing efforts to achieve mutual environmental public health goals. [2] HHS, acting through the CDC and the Agency for Toxic Substances and Disease Registry (ATSDR), and EPA agreed to expand their cooperative activities in support of the CDC EPHT Network and EPA's Central Data Exchange Node on the Environmental Information Exchange Network in the following areas:
• Collecting, analyzing, and interpreting environmental and health data from both agencies (HHS and EPA).
• Collaborating on emerging information technology practices related to building, supporting, and operating the CDC EPHT Network and the Environmental Information Exchange Network.
• Developing and validating additional environmental public health indicators.
• Sharing reliable environmental and public health data between their respective networks in an efficient and effective manner.
• Consulting and informing each other about dissemination of results obtained through work carried out under the MOU and the associated Interagency Agreement (IAG) between EPA and CDC.

The best available statistical fusion model, air quality data, and CMAQ numerical model output were used to develop the estimates. Fusion results can vary with different inputs and fusion modeling approaches. As new and improved statistical models become available, EPA will provide updates.

Although these data have been processed on a computer system at the EPA, no warranty expressed or implied is made regarding the accuracy or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution of the data constitute any such warranty. It is also strongly recommended that careful attention be paid to the contents of the metadata file associated with these data to evaluate data set limitations, restrictions, or intended use.
The EPA shall not be held liable for improper or incorrect use of the data described and/or contained herein.

[2] The original HHS and EPA MOU is available at https://www.cdc.gov/nceh/tracking/pdfs/epa mou 2007.pdf.

The four remaining sections and appendices in the report are as follows:
• Section 2 describes the air quality data obtained from EPA's nationwide monitoring network and the importance of the monitoring data in determining potential health risks.
• Section 3 details the emissions inventory data, how they are obtained, and how they are processed into a key input to the CMAQ air quality computer model.
• Section 4 describes the CMAQ computer model and its role in providing estimates of pollutant concentrations based on 12-km grid cells over the contiguous U.S.
• Section 5 explains the downscaler model used to statistically combine air quality monitoring data and air quality estimates from the CMAQ model to provide daily air quality estimates for the 2010 and 2020 U.S. census tract centroid locations within the contiguous U.S.
• Appendix A provides a description of acronyms used in this report.
• Appendix B is a separate spreadsheet that shows emissions totals for the modeling domain and for each emissions modeling sector (see Section 3 for more details).

2.0 Air Quality Data
To compare health outcomes with air quality measures, it is important to understand the origins of those measures and the methods for obtaining them. This section provides a brief overview of the origins and process of air quality regulation in this country. It provides a detailed discussion of ozone (O3) and particulate matter (PM). The EPHT program has focused on these two pollutants, since numerous studies have found them to be most pervasive and harmful to public health and the environment, and there are extensive monitoring and modeling data available.
2.1 Introduction to Air Quality Impacts in the United States

2.1.1 The Clean Air Act
In 1970, the Clean Air Act (CAA) was signed into law. Under this law, EPA sets limits on how much of a pollutant can be in the air anywhere in the United States. This ensures that all Americans have the same basic health and environmental protections. The CAA has been amended several times to keep pace with new information. For more information on the CAA, go to https://www.epa.gov/clean-air-act-overview.

Under the CAA, the EPA has established standards, or limits, for six air pollutants known as the criteria air pollutants: carbon monoxide (CO), lead (Pb), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone (O3), and particulate matter (PM). These standards, called the National Ambient Air Quality Standards (NAAQS), are designed to protect public health and the environment. The CAA established two types of air quality standards. Primary standards set limits to protect public health, including the health of "sensitive" populations such as asthmatics, children, and the elderly. Secondary standards set limits to protect public welfare, including protection against decreased visibility and damage to animals, crops, vegetation, and buildings. The CAA requires EPA to review these standards at least every five years. For more specific information on the NAAQS, go to https://www.epa.gov/criteria-air-pollutants/naaqs-table. For general information on the criteria pollutants, go to https://www.epa.gov/criteria-air-pollutants.

When these standards are not met, the area is designated as a nonattainment area. States must develop state implementation plans (SIPs) that explain the regulations and controls they will use to clean up the nonattainment areas. States with an EPA-approved SIP can request that the area be redesignated from nonattainment to attainment by providing three consecutive years of data showing NAAQS compliance.
The state must also provide a maintenance plan to demonstrate how it will continue to comply with the NAAQS and demonstrate compliance over a 10-year period, and what corrective actions it will take should a NAAQS violation occur after designation. EPA must review and approve the NAAQS compliance data and the maintenance plan before designating the area; thus, a person may live in an area designated as nonattainment even though no NAAQS violation has been observed for quite some time. For more information on ozone designations, go to https://www.epa.gov/ozone-designations and for PM designations, go to https://www.epa.gov/particle-pollution-designations.

2.1.2 Ozone
Ozone is a colorless gas composed of three oxygen atoms. Ground level ozone is formed when pollutants released from cars, power plants, and other sources react in the presence of heat and sunlight. It is the prime ingredient of what is commonly called "smog." When inhaled, ozone can cause acute respiratory problems, aggravate asthma, cause inflammation of lung tissue, and even temporarily decrease the lung capacity of healthy adults. Repeated exposure may permanently scar lung tissue. EPA's Integrated Science Assessments and Risk and Exposure documents are available at https://www.epa.gov/naaqs/ozone-o3-air-quality-standards. The current NAAQS for ozone (last revised in 2015) is a daily maximum 8-hour average of 0.070 parts per million (ppm) (for details, see https://www.epa.gov/ozone-pollution/setting-and-reviewing-standards-control-ozone-pollution#standards). The CAA requires EPA to review the NAAQS at least every five years and revise them as appropriate in accordance with Section 108 and Section 109 of the Act. The standards for ozone are shown in Table 2-1.

Table 2-1.
Ozone National Ambient Air Quality Standards

Form of the Standard (parts per million, ppm)                             1997    2008    2015
Annual 4th highest daily max 8-hour average, averaged over three years    0.08    0.075   0.070

2.1.3 Particulate Matter
PM air pollution is a complex mixture of small and large particles of varying origin that can contain hundreds of different chemicals, including cancer-causing agents like polycyclic aromatic hydrocarbons (PAH), as well as heavy metals such as arsenic and cadmium. PM air pollution results from direct emissions of particles as well as particles formed through chemical transformations of gaseous air pollutants. The characteristics, sources, and potential health effects of particulate matter depend on its source, the season, and atmospheric conditions.

As a practical convention, PM is divided by size into classes with differing health concerns and potential sources. [3] Particles less than 10 micrometers in diameter (PM10) pose a health concern because they can be inhaled into and accumulate in the respiratory system. Particles less than 2.5 micrometers in diameter (PM2.5) are referred to as "fine" particles. Because of their small size, fine particles can lodge deeply into the lungs. Sources of fine particles include all types of combustion (motor vehicles, power plants, wood burning, etc.) and some industrial processes. Particles with diameters between 2.5 and 10 micrometers (PM10-2.5) are referred to as "coarse" or PMc. Sources of PMc include crushing or grinding operations and dust from paved or unpaved roads. The distribution of PM10, PM2.5 and PMc varies from the eastern U.S. to arid western areas.

[3] The measure used to classify PM into sizes is the aerodynamic diameter. The measurement instruments used for PM are designed and operated to separate large particles from the smaller particles. For example, the PM2.5 instrument only captures and thus measures particles with an aerodynamic diameter less than 2.5 micrometers.
The EPA method to measure PMc is designed around taking the mathematical difference between measurements for PM10 and PM2.5.

Particle pollution - especially fine particles - contains microscopic solids and liquid droplets that are so small that they can get deep into the lungs and cause serious health problems. Numerous scientific studies have linked particle pollution exposure to a variety of problems, including premature death in people with heart or lung disease, nonfatal heart attacks, irregular heartbeat, aggravated asthma, decreased lung function, and increased respiratory symptoms, such as irritation of airways, coughing or difficulty breathing. Additional information on the health effects of particle pollution and other technical documents related to PM standards are available at https://www.epa.gov/pm-pollution.

The current NAAQS for PM2.5 (last revised in 2024) includes both a 24-hour standard to protect against short-term effects, and an annual standard to protect against long-term effects. The annual average PM2.5 concentration must not exceed 9.0 micrograms per cubic meter (µg/m3) based on the annual mean concentration averaged over three years, and the 24-hour average concentration must not exceed 35 µg/m3 based on the 98th percentile 24-hour average concentration averaged over three years. More information is available at https://www.epa.gov/pm-pollution/setting-and-reviewing-standards-control-particulate-matter-pm-pollution#standards. The standards for PM2.5 are shown in Table 2-2.

Table 2-2.
PM2.5 National Ambient Air Quality Standards

Form of the Standard (micrograms per cubic meter, µg/m3)      1997    2006    2012    2024
Annual mean of 24-hour averages, averaged over 3 years        15.0    15.0    12.0    9.0
98th percentile of 24-hour averages, averaged over 3 years    65      35      35      35

During June to August 2024, EPA updated the PM2.5 data in AQS that had been collected since 2017 with Teledyne Advanced Pollution Instrumentation T640/T640X Federal Equivalent Method (FEM) monitors, to make those data more comparable to data collected by Federal Reference Method (FRM) monitors. PM2.5 data retrieved from AQS after August 2024 reflect this update, including the 2021 PM2.5 downscaler input dataset documented in this report, which was retrieved in November 2024. See the PM2.5 Data Advisory for more details.

2.2 Ambient Air Quality Monitoring in the United States

2.2.1 Monitoring Networks
The CAA (Section 319) requires establishment of an air quality monitoring system throughout the U.S. The monitoring stations in this network have been called the State and Local Air Monitoring Stations (SLAMS). The SLAMS network consists of approximately 4,000 monitoring sites set up and operated by state and local air pollution agencies according to specifications prescribed by EPA for monitoring methods and network design. All ambient monitoring networks selected for use in SLAMS are tested periodically to assess the quality of the SLAMS data being produced. Measurement accuracy and precision are estimated for both automated and manual methods. The individual results of these tests for each method or analyzer are reported to EPA. Then, EPA calculates quarterly integrated estimates of precision and accuracy for the SLAMS data.

The SLAMS network experienced accelerated growth throughout the 1970s. The networks were further expanded in 1999 based on the establishment of separate NAAQS for fine particles (PM2.5) in 1997.
The NAAQS for PM2.5 were established based on their link to serious health problems ranging from increased symptoms, hospital admissions, and emergency room visits, to premature death in people with heart or lung disease. While most of the monitors in these networks are located in populated areas of the country, "background" and rural monitors are an important part of these networks. For more information on SLAMS, as well as EPA's other air monitoring networks, go to https://www.epa.gov/amtic. In 2023, approximately 35 percent of the U.S. population was living within 10 kilometers of ozone and PM2.5 monitoring sites. Highly populated areas in the eastern U.S. and California are well covered by both ozone and PM2.5 monitoring networks (Figure 2-1).

[Figure 2-1 map legends]
Distance to Active Ozone Monitors
    < 10 km (100.7 million people)
    10 km - 25 km (129.7 million people)
    25 km - 50 km (58.8 million people)
    50 km - 75 km (21.2 million people)
    75 km - 100 km (8.8 million people)
    100 km - 150 km (8.4 million people)
    > 150 km (5.4 million people)
Distance to Active PM2.5 Monitors
    < 10 km (115.1 million people)
    10 km - 25 km (114 million people)
    25 km - 50 km (59 million people)
    50 km - 75 km (24.6 million people)
    75 km - 100 km (10.9 million people)
    100 km - 150 km (6.6 million people)
    > 150 km (2.9 million people)

Figure 2-1. Distances from U.S. Census Tract centroids to the nearest monitoring site, 2023.

In summary, state and local agencies and tribes implement a quality-assured monitoring network to measure air quality across the U.S. The EPA provides guidance to ensure a thorough understanding of the quality of the data produced by these networks. These monitoring data have been used to characterize the status of the nation's air quality and the trends across the U.S. (see https://www.epa.gov/air-trends).
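As a rough check on the coverage figures, the population counts in the Figure 2-1 legend can be converted to percentages. The sketch below assumes that the seven distance bands for each pollutant are mutually exclusive and together cover the whole population represented in the figure; the dictionary and function names are illustrative and not part of the report.

```python
# Population (millions of people) by distance band, copied from the
# Figure 2-1 legend. Band labels are shortened for readability.
OZONE_BANDS = {
    "<10": 100.7, "10-25": 129.7, "25-50": 58.8, "50-75": 21.2,
    "75-100": 8.8, "100-150": 8.4, ">150": 5.4,
}
PM25_BANDS = {
    "<10": 115.1, "10-25": 114.0, "25-50": 59.0, "50-75": 24.6,
    "75-100": 10.9, "100-150": 6.6, ">150": 2.9,
}

def percent_in_bands(bands, labels):
    """Percent of the total population that falls in the listed bands."""
    total = sum(bands.values())
    return 100.0 * sum(bands[label] for label in labels) / total

ozone_within_10km = percent_in_bands(OZONE_BANDS, ["<10"])
pm25_within_10km = percent_in_bands(PM25_BANDS, ["<10"])
ozone_within_25km = percent_in_bands(OZONE_BANDS, ["<10", "10-25"])
pm25_within_25km = percent_in_bands(PM25_BANDS, ["<10", "10-25"])

print(round(ozone_within_10km, 1), round(pm25_within_10km, 1))
print(round(ozone_within_25km, 1), round(pm25_within_25km, 1))
```

Under that assumption, the legend values give roughly 35 percent of the population within 10 km of a PM2.5 monitor and roughly 30 percent within 10 km of an ozone monitor, with a little over two thirds of the population within 25 km of either type of monitor.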
2.2.2 Air Quality System Database EPA's Air Quality System (AQS) database contains ambient air monitoring data collected by EPA, state, local, and tribal air pollution control agencies from thousands of monitoring stations. AQS also contains meteorological data, descriptive information about each monitoring station (including its geographic location and its operator), and data quality assurance and quality control information. State and local agencies are required to submit their air quality monitoring data into AQS within 90 days following the end of the quarter in which the data were collected. This ensures timely submission of these data for use by state, local, and tribal agencies, EPA, and the public. EPA's OAQPS and other AQS users rely upon the data in AQS to assess air quality, assist in compliance with the NAAQS, evaluate SIPs, perform modeling for permit review analysis, and perform other air quality management functions. For more details, including how to retrieve data, go to https://www.epa.gov/aqs. 2.2.3 Advantages and Limitations of the Air Quality Monitoring and Reporting System Air quality data is required to assess public health outcomes that are affected by poor air quality. The challenge is to get surrogates for air quality on time and spatial scales that are useful for EPHT activities. The advantage of using ambient data from EPA monitoring networks for comparison with health outcomes is that these measurements of pollution concentrations are the best characterization of the concentration of a given pollutant at a given time and location. Furthermore, the data are supported by a comprehensive quality assurance program, ensuring data of known quality. One disadvantage of using the ambient data is that it is usually out of spatial and temporal alignment with health outcomes. 
This spatial and temporal 'misalignment' between air quality monitoring data and health outcomes is influenced by two key factors: (1) the living and/or working locations (microenvironments) where a person spends their time are not co-located with an air quality monitor; and (2) the time(s)/date(s) when a patient experiences a health outcome/symptom (e.g., asthma attack) do not coincide with the time(s)/date(s) when an air quality monitor records ambient concentrations of a pollutant high enough to affect the symptom (e.g., asthma attack either during or shortly after a high PM2.5 day). To compare/correlate ambient concentrations with acute health effects, daily local air quality data is needed. [4]

[4] EPA uses exposure models to evaluate the health risks and environmental effects associated with exposure. These models are limited by the availability of air quality estimates; see https://www.epa.gov/technical-air-pollution-resources.

Spatial gaps exist in the air quality monitoring network, especially in rural areas, since the air quality monitoring network is designed to focus on measurement of pollutant concentrations in high population density areas. Temporal limits also exist. Hourly ozone measurements are aggregated to daily values (the daily max 8-hour average is relevant to the ozone standard). Ozone is typically monitored during the ozone season (the warmer months, approximately April through October). However, year-long data is available in many areas and is extremely useful to evaluate whether ozone is a factor in health outcomes during the non-ozone seasons. PM2.5 is generally measured year-round. Most Federal Reference Method (FRM) PM2.5 monitors collect data one day in every three days, due in part to the time and costs involved in collecting and analyzing the samples. Additionally, continuous monitors have become available which can automatically collect, analyze, and report PM2.5 measurements on an hourly basis.
These monitors are available in most of the major metropolitan areas. Some of these continuous monitors have been determined to be equivalent to the FRM monitors for regulatory purposes and are called Federal Equivalent Methods (FEM).

2.2.4 Use of Air Quality Monitoring Data
Air quality monitoring data has been used to provide the information for the following situations:
(1) Assessing effectiveness of SIPs in addressing NAAQS nonattainment areas
(2) Characterizing local, state, and national air quality status and trends
(3) Associating health and environmental damage with air quality levels/concentrations

For the EPHT effort, EPA is providing air quality data to support efforts associated with (2) and (3) above. Data supporting (3) is generated by EPA through the use of its air quality data and its downscaler model.

Most studies that associate air quality with health outcomes use air monitoring as a surrogate for exposure to the air pollutants being investigated. Many studies have used the monitoring networks operated by state and federal agencies. Some studies perform special monitoring that can better represent exposure to the air pollutants: community monitoring, near residences, in-house or workplace monitoring, and personal monitoring. For the EPHT program, special monitoring is generally not supported, though it could be used on a case-by-case basis.

From proximity-based exposure estimates to statistical interpolation, many approaches have been developed for estimating exposures to air pollutants using ambient monitoring data (Jerrett et al., 2005). Depending upon the approach and the spatial and temporal distribution of ambient monitoring data, exposure estimates to air pollutants may vary greatly in areas farther from monitors (Bravo et al., 2012).
Factors like limited temporal coverage (e.g., many PM2.5 monitors record only every third day, and some ozone monitors operate only during part of the year) and limited spatial coverage (i.e., most monitors are located in urban areas and rural coverage is limited) hinder most of the interpolation techniques that use monitoring data alone as the input. Take the example of Voronoi Neighbor Averaging (VNA) (referred to as Nearest Neighbor Averaging in most literature), where rural estimates would be biased towards the urban estimates. To further explain this point, assume the scenario of two cities with monitors and no monitors in the rural areas between, which is very plausible. Since exposure estimates are guaranteed to be within the range of monitors in VNA, estimates for the rural areas would be pulled up toward the urban values in this scenario.

Air quality models may overcome some of the limitations that monitoring networks possess. Models such as CMAQ can estimate concentrations in reasonable temporal and spatial resolutions. However, these sophisticated air quality models are prone to systematic biases since they depend upon so many variables (e.g., meteorological models and emission models) and complex chemical and physical process simulations.

Combining monitoring data with air quality models (via fusion or regression) may provide the best results in terms of estimating ambient air concentrations in space and time. EPA's eVNA [5] is an example of an earlier approach for merging air quality monitor data with CMAQ model predictions. DS attempts to address some of the shortcomings in these earlier attempts to statistically combine monitor and model predicted data; see the published papers referenced in Section 1 for more information about DS.
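The two-city scenario used earlier in this section to explain the VNA limitation can be illustrated with a minimal sketch. As a stated simplification, the code uses an inverse-distance-weighted average of the k nearest monitors as a stand-in; a full VNA implementation would select neighbors from a Voronoi (Delaunay) construction, which is omitted here. The function name, coordinates, and concentrations are hypothetical.

```python
import numpy as np

def idw_estimate(monitor_xy, monitor_values, target_xy, k=2, power=2.0):
    """Inverse-distance-weighted average of the k nearest monitors.

    Simplified stand-in for VNA: neighbors are chosen by distance only,
    not by Voronoi adjacency.
    """
    xy = np.asarray(monitor_xy, dtype=float)
    values = np.asarray(monitor_values, dtype=float)
    dist = np.linalg.norm(xy - np.asarray(target_xy, dtype=float), axis=1)
    nearest = np.argsort(dist)[:k]
    # A target that coincides with a monitor takes that monitor's value.
    if dist[nearest[0]] == 0.0:
        return float(values[nearest[0]])
    weights = 1.0 / dist[nearest] ** power
    return float(np.sum(weights * values[nearest]) / np.sum(weights))

# Two hypothetical urban monitors 100 km apart (coordinates in km) with
# hypothetical PM2.5 concentrations, and no monitor in between.
monitors = [(0.0, 0.0), (100.0, 0.0)]
urban_values = [12.0, 10.0]

rural_midpoint = idw_estimate(monitors, urban_values, (50.0, 0.0))
rural_near_city = idw_estimate(monitors, urban_values, (25.0, 0.0))
print(round(rural_midpoint, 2), round(rural_near_city, 2))
```

Because the result is a weighted average with positive weights, both rural estimates fall between the two urban values regardless of what the true rural concentration is, which is the limitation the text describes.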
As discussed in the next section, there are two methods used in EPHT to provide estimates of ambient concentrations of air pollutants: air quality monitoring data and the downscaler model estimate, which is a statistical 'combination' of air quality monitor data and photochemical air quality model predictions (e.g., CMAQ).

[5] eVNA is described in the "Regulatory Impact Analysis for the Final Clean Air Interstate Rule", EPA-452/R-05-002, March 2005, Appendix F.

2.3 Air Quality Indicators Developed for the EPHT Network
Air quality indicators have been developed for use in the Environmental Public Health Tracking Network by CDC using the ozone and PM2.5 data from EPA. The approach used divides "indicators" into two categories. First, basic air quality measures were developed to compare air quality levels over space and time within a public health context (e.g., using the NAAQS as a benchmark). Next, indicators were developed that mathematically link air quality data to public health tracking data (e.g., daily PM2.5 levels and hospitalization data for acute myocardial infarction). Table 2-3 and Table 2-4 describe the issues impacting calculation of basic air quality indicators.

Table 2-3. Public Health Surveillance Goals and Current Status

Goal: 1) Air data sets and metadata required for air quality indicators are available to EPHT state Grantees.
Status: Data are available through state agencies and EPA's AQS. EPA and CDC developed an interagency agreement, where EPA provides air quality data along with statistically combined AQS and CMAQ data, associated metadata, and technical reports that are delivered to CDC.

Goal: Estimate the linkage or association of PM2.5 and ozone on health to:
    a) Identify populations that may have higher risk of adverse health effects due to PM2.5 and ozone,
    b) Generate hypotheses for further research, and
    c) Provide information to support prevention and pollution control strategies.
Status: Regular discussions have been held on health-air linked indicators and CDC/HFI/EPA convened a workshop January 2008. CDC has collaborated on a health impact assessment (HIA) with Emory University, EPA, and state grantees that can be used to facilitate greater understanding of these linkages.

Goal: 2) Produce and disseminate basic indicators and other findings in electronic and print formats to provide the public, environmental health professionals, and policymakers with current and easy-to-use information about air pollution and the impact on public health.
Status: Templates and "how to" guides for PM2.5 and ozone have been developed for routine indicators. Calculation techniques and presentations for the indicators have been developed.

Table 2-4. Basic Air Quality Indicators used in EPHT, derived from the EPA data delivered to CDC

Ozone (daily 8-hr period with maximum concentration, ppm, by FRM)
• Number of days with maximum ozone concentration over the NAAQS (or other relevant benchmarks) (by county and MSA)
• Number of person-days with maximum 8-hr average ozone concentration over the NAAQS and other relevant benchmarks (by county and MSA)

PM2.5 (daily 24-hr integrated samples, µg/m3, by FRM)
• Average ambient concentrations of particulate matter (< 2.5 microns in diameter), compared to the annual PM2.5 NAAQS (by state)
• Percent of population exceeding annual PM2.5 NAAQS (by state)
• Percent of days with PM2.5 concentration over the daily NAAQS (or other relevant benchmarks) (by county and MSA)
• Number of person-days with PM2.5 concentration over the daily NAAQS and other relevant benchmarks (by county and MSA)

2.3.1 Rationale for the Air Quality Indicators
The CDC EPHT Network is initially focusing on ozone and PM2.5.
These air quality indicators are based mainly around the NAAQS health findings and program-based measures (measurement, data, and analysis methodologies). The indicators will allow comparisons across space and time for EPHT actions. They are in the context of health-based benchmarks. By bringing population into the measures, they roughly distinguish between potential exposures (at broad scale).

2.3.2 Air Quality Data Sources
The air quality data will be available in the EPA's AQS database based on the state/federal air program's data collection and processing. The AQS database contains ambient air pollution data collected by EPA, state, local, and tribal air pollution control agencies from thousands of state or local air monitoring stations (SLAMS).

2.3.3 Use of Air Quality Indicators for Public Health Practice
The basic indicators can be used to inform policymakers and the public regarding the air quality within a state and across states (national). For example, the number of days per year that ozone is above the NAAQS can be used to communicate to sensitive populations (such as asthmatics) the number of days that they may be exposed to unhealthy levels of ozone. This short-term NAAQS level is the same level used in the AQI to inform sensitive populations when and how to reduce their exposure. These indicators, however, are not a surrogate measure of exposure and therefore will not be linked with health data.

3.0 Emissions Data

3.1 Introduction to Emissions Data Development
The U.S. Environmental Protection Agency (EPA) developed an air quality modeling platform for air toxics and criteria air pollutants that represents the year 2021. The platform is based on the 2020 National Emissions Inventory (2020 NEI) published in April 2023 (EPA, 2023) along with other data specific to the year 2021.
The air quality modeling platform consists of all the emissions inventories and ancillary data files used for emissions modeling, as well as the meteorological, initial condition, and boundary condition files needed to run the air quality model. This section focuses on the emissions modeling aspects of the 2021 modeling platform, including the emission inventories, the ancillary data files, and the approaches used to transform inventories for use in air quality modeling.

The modeling platform includes all criteria air pollutants and precursors (CAPs), two groups of hazardous air pollutants (HAPs), and diesel particulate matter. The first group of HAPs are those explicitly used by the chemical mechanism in the Community Multiscale Air Quality (CMAQ) model (Appel, 2018) for ozone/particulate matter (PM): chlorine (Cl), hydrogen chloride (HCl), naphthalene, benzene, acetaldehyde, formaldehyde, and methanol (the last five are abbreviated as NBAFM in subsequent sections of the document). The second group of HAPs consists of 52 HAPs or HAP groups (such as polycyclic aromatic hydrocarbon groups) that are included in CMAQ for the purposes of air quality modeling for a HAP+CAP platform.

Emissions were prepared for CMAQ version 5.4 (see footnote 6), which was used to model ozone (O3), particulate matter (PM), and HAPs. CMAQ requires hourly and gridded emissions of the following inventory pollutants: carbon monoxide (CO), nitrogen oxides (NOx), volatile organic compounds (VOC), sulfur dioxide (SO2), ammonia (NH3), particulate matter less than or equal to 10 microns (PM10), and individual component species for particulate matter less than or equal to 2.5 microns (PM2.5).
In addition, the Carbon Bond mechanism version 6 (CB6) with chlorine chemistry within CMAQ allows for explicit treatment of the VOC HAPs naphthalene, benzene, acetaldehyde, formaldehyde, and methanol (NBAFM), includes anthropogenic HAP emissions of HCl and Cl, and can model additional HAPs as described in Section 3. The short abbreviation for the modeling case name was "2021hb", where 2021 is the year modeled, 'h' represents that it was based on the 2020 NEI, and 'b' represents that it was the second version of a 2020 NEI-based platform.

Although not used for this downscaler analysis, emissions were also prepared for an air dispersion modeling system: American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) (EPA, 2018). AERMOD was run for 2021 for all NEI HAPs (about 130 more than covered by CMAQ) in a similar way as was done for the 2018 version of AirToxScreen (EPA, 2022a). This TSD focuses on the CMAQ aspects of the 2021 emissions modeling platform from which ozone and PM data were developed for this study.

The effort to create the emission inputs for this study included development of emission inventories to represent emissions during the year 2021, along with application of emissions modeling tools to convert the inventories into the format and resolution needed by CMAQ. The emissions modeling platform includes point sources, nonpoint sources, onroad mobile sources, nonroad mobile sources, biogenic emissions, and fires for the U.S., Canada, and Mexico. Some platform categories use more disaggregated data than are made available in the NEI.

6 CMAQ version 5.4: https://zenodo.org/record/7218076. CMAQ is also available from https://www.epa.gov/cmaq and the Community Modeling and Analysis System (CMAS) Center at: https://www.cmascenter.org.
For example, in the platform, onroad mobile source emissions are represented as hourly emissions by vehicle type, fuel type, process, and road type, while the NEI emissions are aggregated to vehicle type/fuel type totals and annual temporal resolution. Emissions used in the CMAQ modeling from Canada are provided by Environment and Climate Change Canada (ECCC) and those from Mexico are mostly provided by SEMARNAT; neither set is part of the NEI. Year-specific emissions were used for fires, biogenic sources, fertilizer, point sources, and onroad and nonroad mobile sources. Where available, hourly continuous emission monitoring system (CEMS) data were used for electric generating unit (EGU) emissions.

The primary emissions modeling tool used to create the CMAQ model-ready emissions was the Sparse Matrix Operator Kernel Emissions (SMOKE) modeling system. SMOKE version 5.0 was used to create CMAQ-ready emissions files for a 12-km grid covering the continental United States. Additional information about SMOKE is available from http://www.cmascenter.org/smoke.

The gridded meteorological data used as input to the emissions modeling were developed using the Weather Research and Forecasting Model (WRF, https://ral.ucar.edu/solutions/products/weather-research-and-forecasting-model-wrf) version 4.1.1, Advanced Research WRF core (Skamarock, et al., 2008). The WRF Model is a mesoscale numerical weather prediction system developed for both operational forecasting and atmospheric research applications. The WRF model was run for 2021 over a domain covering the continental U.S. at a 12-km resolution with 35 vertical layers. The run for this platform included high resolution sea surface temperature data from the Group for High Resolution Sea Surface Temperature (GHRSST) (see https://www.ghrsst.org/) and is given the EPA meteorological case abbreviation "21k."
The full case abbreviation includes this suffix following the emissions portion of the case name, so that the case is fully specified as "2021hb_cb6_21k." CMAQ was run on a 12-km modeling domain over the continental United States. The outputs from CMAQ provide the overall mass, chemistry, and formation for specific hazardous air pollutants (HAPs) formed secondarily in the atmosphere (e.g., formaldehyde, acetaldehyde, and acrolein).

Data files and summaries for this platform are available from the "2021 Data Files and Summaries" link on the air emissions modeling website: https://www.epa.gov/air-emissions-modeling/2021-emissions-modeling-platform.

This chapter contains two additional sections. Section 3.2 contains high-level information about the inventories input to SMOKE and summaries of the emissions used for the study. Section 3.3 contains high-level information on the emissions modeling performed to convert the inventories into the format and resolution needed by CMAQ. Additional details on the development of the emissions inputs to CMAQ are provided in the publication Technical Support Document (TSD): Preparation of Emissions Inventories for the 2021 North American Emissions Modeling Platform (EPA, 2024).

3.2 Emission Inventories and Approaches

This section describes the emissions inventories created for input to SMOKE, which are based on the April 2023 version of the 2020 NEI with updates to reflect emissions in 2021. The NEI includes five main data categories: a) nonpoint sources; b) point sources; c) nonroad mobile sources; d) onroad mobile sources; and e) fires. For CAPs, the NEI data are largely compiled from data submitted by state, local and tribal (S/L/T) agencies. HAP emissions data are often augmented by EPA when they are not voluntarily submitted by S/L/T agencies. The NEI was compiled using the Emissions Inventory System (EIS).
EIS collects and stores facility inventory and emissions data for the NEI and includes hundreds of automated QA checks to improve data quality. It also supports release point (stack) coordinates separately from facility coordinates. EPA collaboration with S/L/T agencies helped prevent duplication between point and nonpoint source categories such as industrial boilers. The 2020 NEI Technical Support Document describes in detail the development of the 2020 emission inventories and is available at https://www.epa.gov/air-emissions-inventories/2020-national-emissions-inventory-nei-technical-support-document-tsd (EPA, 2023).

A complete set of emissions for all source categories is developed for the NEI every three years, with 2020 being the most recent year represented with a full "triennial" NEI. S/L/T agencies are required to submit all applicable point sources to the NEI in triennial years, including the year 2020. Because only point source emissions were submitted by S/L/T agencies to develop the NEI for 2021, emissions for any point sources not submitted for 2021, and not marked as shutdown, were pulled forward from the 2020 NEI.

The SMARTFIRE2 system and the BlueSky Pipeline (https://github.com/pnwairfire/bluesky) emissions modeling system were used to develop the fire emissions. SMARTFIRE2 categorizes all fires as either prescribed burning or wildfire, and the BlueSky Pipeline system includes fuel loading, consumption, and emission factor estimates for both types of fires. Onroad and nonroad mobile source emissions were developed for this project using MOVES4 (https://www.epa.gov/moves).

With the exception of fire emissions, Canadian emissions were provided by Environment and Climate Change Canada (ECCC) for the years 2020 and 2023, and most 2021 emissions were developed by interpolating between 2020 and 2023.
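The interpolation of 2021 values from the 2020 and 2023 inventories can be pictured with a short sketch. The report states only that the values were interpolated; the linear form used here is an assumption, and the function name and sample totals are hypothetical.

```python
def interpolate_emissions(e_start, e_end, year_start, year_end, year_target):
    """Linearly interpolate annual emissions between two inventory years."""
    if year_start == year_end or not (year_start <= year_target <= year_end):
        raise ValueError("target year must lie between two distinct inventory years")
    # Weight is the fraction of the interval elapsed at the target year.
    weight = (year_target - year_start) / (year_end - year_start)
    return e_start + weight * (e_end - e_start)


# Hypothetical NOx totals (short tons/yr) for one province and sector.
nox_2021 = interpolate_emissions(900.0, 600.0, 2020, 2023, 2021)
```

With inventory years 2020 and 2023, the 2021 estimate sits one third of the way from the 2020 value toward the 2023 value.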
For Mexico, inventories from the 2019 emissions modeling platform (EPA, 2022b) were used as the starting point, with data for border states supplemented with data for 2018 developed by SEMARNAT in collaboration with U.S. EPA.

The emissions modeling process was performed using SMOKE v5.0. Through this process, the emissions inventories were apportioned into the grid cells used by CMAQ and temporally allocated into hourly values. In addition, the pollutants in the inventories (e.g., NOx, PM, and VOC) were split into the chemical species needed by CMAQ. For the purposes of preparing the CMAQ-ready emissions, the NEI emissions inventories by data category were split into emissions modeling platform "sectors", and emissions from sources other than the NEI were added, such as the Canadian, Mexican, and offshore inventories. Each sector groups related emissions source categories and was run through the appropriate SMOKE programs independently of the other sectors, except for the final merge. The final merge program, Mrggrid, combines low-level sector-specific gridded, speciated, and temporalized emissions to create the final CMAQ-ready emissions inputs. For biogenic and fertilizer emissions, CMAQ allows these emissions either to be included in the CMAQ-ready emissions inputs or to be computed within CMAQ itself (the "inline" option). This study used the option to compute biogenic emissions within the model and the CMAQ bidirectional ammonia process to compute the fertilizer emissions.

Table 3-1 presents the sectors in the emissions modeling platform used to develop the year 2021 emissions for this project. The sector abbreviations are provided in italics; these abbreviations are used in the SMOKE modeling scripts, the inventory file names, and throughout the remainder of this section.

Table 3-1.
Platform Sectors Used in the Emissions Modeling Process

Platform Sector: abbreviation | NEI Data Category | Description and resolution of the data input to SMOKE
EGU units: ptegu | Point | 2021 NEI point source EGUs, replaced with hourly Continuous Emissions Monitoring System (CEMS) values for NOx and SO2, and the remaining pollutants temporally allocated according to CEMS heat input where the units are matched to the NEI. Emissions for all sources not matched to CEMS data come from 2021 NEI point inventory. Annual resolution for sources not matched to CEMS data, hourly for CEMS sources. EGUs closed in 2021 are not part of the inventory.
Point source oil and gas: pt_oilgas | Point | 2021 NEI point sources that include oil and gas production emissions processes for facilities with North American Industry Classification System (NAICS) codes related to Oil and Gas Extraction, Natural Gas Distribution, Drilling Oil and Gas Wells, Support Activities for Oil and Gas Operations, Pipeline Transportation of Crude Oil, and Pipeline Transportation of Natural Gas. Includes U.S. offshore oil production.
Aircraft and ground support equipment: airports | Point | 2021 NEI point source emissions from airports, including aircraft and airport ground support emissions projected to 2021 based on the 2022 Terminal Area Forecast (TAF). Annual resolution.
Remaining non-EGU point: ptnonipm | Point | All 2021 NEI point source records not matched to the airports, ptegu, or pt_oilgas sectors. Includes 2020 NEI rail yard emissions projected to 2021. Annual resolution.
Livestock: livestock | Nonpoint | 2021 nonpoint livestock emissions developed using a similar method to 2020 NEI but with adjusted animal counts and using 2021 meteorology. Livestock includes ammonia and other pollutants (except PM2.5). County and annual resolution.
Agricultural Fertilizer: fertilizer | Nonpoint | 2021 agricultural fertilizer ammonia emissions computed inline within CMAQ.
Area fugitive dust: afdust_adj | Nonpoint | PM10 and PM2.5 fugitive dust sources from the 2020 NEI nonpoint inventory, including building construction, road construction, agricultural dust, and paved and unpaved road dust, where paved and unpaved road dust were adjusted to 2021 based on VMT differences. The emissions modeling system applies a transportable fraction reduction and zero-out adjustments based on the year-specific gridded hourly meteorology (precipitation and snow/ice cover). Emissions are county and annual resolution.
Biogenic: beis | Nonpoint | Year 2021 emissions from biogenic sources. These were left out of the CMAQ-ready merged emissions, in favor of inline biogenic emissions produced during the CMAQ model run itself. Version 4 of the Biogenic Emissions Inventory System (BEIS) was used with Version 6 of the Biogenic Emissions Landuse Database (BELD6). These CMAQ-generated emissions are similar to the 2021 biogenic emissions generated through running SMOKE, but they are not exactly the same.
Category 1, 2 CMV: cmv_c1c2 | Nonpoint | 2021 Category 1 (C1) and Category 2 (C2) commercial marine vessel (CMV) emissions based on 2021 Automatic Identification System (AIS) data categorized using SCCs specific to ship type. Point and hourly resolution.
Category 3 CMV: cmv_c3 | Nonpoint | 2021 Category 3 (C3) commercial marine vessel (CMV) emissions based on 2021 AIS data categorized using SCCs specific to ship type. Point and hourly resolution.
Locomotives: rail | Nonpoint | Line haul rail locomotives emissions from 2020 NEI projected to 2021 using 5 percent growth based on Annual Energy Outlook (AEO) changes from 2020 to 2021. County and annual resolution.
Nonpoint source oil and gas: np_oilgas | Nonpoint | Nonpoint emissions from oil and gas-related processes for 2021 computed using activity data from 2021. County and annual resolution.
Residential Wood Combustion: rwc | Nonpoint | 2020 NEI nonpoint sources with residential wood combustion (RWC) processes, projected to 2021 with state-level adjustment factors derived from the State Energy Data System (SEDS). County and annual resolution.
Solvents: np_solvents | Nonpoint | Emissions of solvents for 2021 based on methods used for the 2020 NEI (Seltzer, 2021). Includes household cleaners, personal care products, adhesives, architectural and aerosol coatings, printing inks, and pesticides. Annual and county resolution.
Remaining nonpoint: nonpt | Nonpoint | 2020 NEI nonpoint sources not included in other platform sectors. County and annual resolution.
Nonroad: nonroad | Nonroad | 2021 nonroad equipment emissions developed with MOVES4, including the updates made to spatial apportionment that were developed with the 2016v1 platform. MOVES4 was used for all states except California, which submitted their own emissions for 2020 and 2023 from which an interpolation to 2021 was performed. County and monthly resolution.
Onroad: onroad | Onroad | Onroad mobile source gasoline and diesel vehicles from parking lots and moving vehicles for 2021 developed using VMT data from 2020 NEI projected to 2021 using factors based on FHWA VM-2 data. Includes the following emission processes: exhaust, extended idle, auxiliary power units, evaporative, permeation, refueling, vehicle starts, off network idling, long-haul truck hoteling, and brake and tire wear. MOVES4 was run for 2021 to generate year-specific emission factors.
Onroad California: onroad_ca_adj | Onroad | California-provided 2020 and 2023 CAPs that were interpolated to 2021. HAPs speciated from CAPs. Onroad mobile source gasoline and diesel vehicles from parking lots and moving vehicles based on Emission Factor (EMFAC), gridded and temporalized based on outputs from MOVES4.
Point source agricultural fires: ptagfire | Nonpoint | Agricultural fire sources for 2021 developed by EPA as point and day-specific emissions (see footnote 7). Only EPA-developed data were used in this study, thus 2020 NEI state submissions are not included. Agricultural fires are in the nonpoint data category of the NEI, but in the modeling platform, they are treated as day-specific point sources. Updated HAP-augmentation factors were applied.
Point source prescribed fires: ptfire-rx | Nonpoint | Point source day-specific prescribed fires for 2021 computed using SMARTFIRE 2 and BlueSky Pipeline. The ptfire emissions were run as two separate sectors: ptfire-rx (prescribed, including Flint Hills / grasslands) and ptfire-wild.
Point source wildfires: ptfire-wild | Nonpoint | Point source day-specific wildfires for 2021 computed using SMARTFIRE 2 and BlueSky Pipeline.
Non-U.S. fires: ptfire_othna | N/A | Point source day-specific wildfires and agricultural fires outside of the U.S. for 2021. Canadian fires were computed using SMARTFIRE 2 and BlueSky Pipeline. Mexico, Caribbean, Central American, and other international fires are from v2.5 of the Fire INventory (FINN) from National Center for Atmospheric Research (Wiedinmyer, C., 2023).
Canada Area Fugitive dust sources: canada_afdust | N/A | Area fugitive dust sources from ECCC for 2021 (interpolated between provided 2020 and 2023 emissions) with transport fraction and snow/ice adjustments based on 2021 meteorological data. Annual and province resolution.
Canada Point Fugitive dust sources: canada_ptdust | N/A | Point source fugitive dust sources from ECCC for 2021 (interpolated between provided 2020 and 2023 emissions) with transport fraction and snow/ice adjustments based on 2021 meteorological data. Monthly and province resolution.
Canada and Mexico stationary point sources: canmex_point | N/A | Canada and Mexico point source emissions not included in other sectors. Canada point sources were provided by ECCC for 2020 and 2023, and interpolated to 2021. Mexico point source emissions for border states represent 2018 and were developed by SEMARNAT in collaboration with EPA, while emissions for all other states were carried forward from 2019ge (EPA, 2022b). Annual and monthly resolution.
Canada and Mexico agricultural sources: canmex_ag | N/A | Canada and Mexico agricultural emissions. Canada emissions were provided by ECCC for 2020 and 2023, and interpolated to 2021. Mexico agricultural emissions were provided by SEMARNAT and include updated emissions for border states representing 2018 developed by SEMARNAT in collaboration with EPA, while emissions for all other states were carried forward from 2019ge. Annual resolution.
Canada low-level oil and gas sources: canada_og2D | N/A | Canada emissions from upstream oil and gas, provided by ECCC for 2020 and 2023, and interpolated to 2021. This sector contains the portion of oil and gas emissions which are not subject to plume rise. The rest of the Canada oil and gas emissions are in the canmex_point sector. Annual resolution.
Canada and Mexico nonpoint and nonroad sources: canmex_area | N/A | Canada and Mexico nonpoint source emissions not included in other sectors. Canada: ECCC provided surrogates and 2020 and 2023 inventories that were interpolated to 2021. Mexico: includes updated emissions for border states representing 2018 developed by SEMARNAT in collaboration with EPA, while emissions for all other states were carried forward from 2019ge. Annual and monthly resolution.
Canada onroad sources: canada_onroad | N/A | Canada onroad emissions. 2020 and 2023 Canada inventories provided by ECCC and interpolated to 2021; processed using updated surrogates. Province and monthly resolution.

7 Only EPA-developed agricultural fire data were included in this study; data submitted by states to the NEI were excluded.
Mexico onroad sources: mexico_onroad | N/A | Mexico onroad emissions. 2020 and 2023 emissions output from MOVES-Mexico were interpolated to 2021. Municipio and monthly resolution.

Ocean chlorine emissions were also merged in with the above sectors. The ocean chlorine gas emission estimates are based on the build-up of molecular chlorine (Cl2) concentrations in oceanic air masses (Bullock and Brehme, 2002). Ocean chlorine data at 12-km resolution were available from earlier studies and were not modified, except that the name "CHLORINE" was changed to "CL2", the name required by the CMAQ model.

The emission inventories in SMOKE input formats for the platform are available from EPA's Air Emissions Modeling website: https://www.epa.gov/air-emissions-modeling/2021-emissions-modeling-platform. The platform informational text file indicates the zipped files associated with each platform sector. Some emissions data summaries are available with the data files for the 2021 platform. The types of reports include state summaries of inventory pollutants and model species by modeling platform sector and county annual totals by modeling platform sector.

Annual summaries of the emissions in the contiguous U.S. and of emissions within the 12-km domain but outside of the U.S. are shown in Table 3-2 and Table 3-3, respectively. State total emissions for each sector are provided in Appendix B, a workbook entitled "Appendix_B_2021_emissions_totals_by_sector.xlsx".

Table 3-2. 2021 Contiguous United States Emissions by Sector (short tons/yr in 48 states + D.C.)
Sector | CO | NH3 | NOX | PM10 | PM2_5 | SO2 | VOC
afdust_adj | | | | 6,027,656 | 821,738 | |
airports | 333,660 | 0 | 83,674 | 8,521 | 7,533 | 9,126 | 50,041
cmv_c1c2 | 19,892 | 68 | 134,167 | 3,662 | 3,548 | 615 | 5,116
cmv_c3 | 10,252 | 44 | 81,846 | 2,507 | 2,307 | 5,767 | 4,687
fertilizer | | 1,275,333 | | | | |
livestock | | 2,824,644 | | | | | 225,971
nonpt | 2,173,885 | 145,073 | 723,480 | 711,625 | 622,905 | 106,258 | 1,005,040
nonroad | 11,037,304 | 1,998 | 816,810 | 80,205 | 75,312 | 917 | 945,175
np_oilgas | 654,275 | 43 | 728,663 | 14,048 | 13,880 | 139,514 | 2,876,480
np_solvents | 0 | 0 | 0 | 0 | 0 | 0 | 2,716,884
onroad | 14,391,846 | 183,954 | 2,258,178 | 188,833 | 74,375 | 8,748 | 1,039,569
ptegu | 467,560 | 21,482 | 879,533 | 125,564 | 109,306 | 968,652 | 26,731
ptagfire | 773,523 | 172,492 | 33,830 | 114,547 | 74,469 | 13,729 | 125,668
ptfire-rx | 7,825,125 | 68,537 | 125,890 | 1,267,230 | 1,131,103 | 80,356 | 1,586,259
ptfire-wild | 17,682,184 | 178,672 | 163,750 | 3,826,054 | 2,386,263 | 166,480 | 4,865,824
ptnonipm | 1,226,638 | 61,712 | 793,938 | 350,108 | 228,948 | 456,300 | 726,236
pt_oilgas | 174,223 | 9,095 | 318,687 | 12,460 | 11,916 | 31,186 | 195,937
rail | 96,705 | 296 | 444,124 | 11,360 | 10,982 | 369 | 18,367
rwc | 2,940,341 | 22,616 | 44,790 | 448,615 | 446,995 | 11,894 | 453,043
beis | 3,314,764 | | 989,492 | | | | 28,539,802
CONUS + beis | 63,122,176 | 4,966,059 | 8,620,851 | 13,192,993 | 6,021,580 | 1,999,910 | 45,406,832

Table 3-3.
Non-US Emissions by Sector within the 12US1 Modeling Domain (short tons/yr)

Sector | CO | NH3 | NOX | PM10 | PM2_5 | SO2 | VOC
Canada ag | | 500,395 | | 6,562 | 1,875 | | 124,257
Canada oil and gas 2D | | 8 | | | | | 306,206
Canada afdust | | | | 1,028,722 | 194,713 | |
Canada ptdust | | | | 3,588 | 443 | |
Canada area | 2,040,850 | 5,983 | 317,182 | 184,382 | 134,440 | 14,175 | 711,153
Canada onroad | 1,669,722 | 6,994 | 356,236 | 24,858 | 13,378 | 893 | 118,094
Canada point | 1,021,439 | 18,569 | 538,357 | 112,670 | 42,409 | 483,703 | 148,235
Canada fires | 18,068,782 | 259,108 | 302,681 | 3,543,123 | 3,141,541 | 173,644 | 5,070,468
Canada cmv_c1c2 | 3,179 | 10 | 20,497 | 541 | 525 | 64 | 720
Canada cmv_c3 | 7,750 | 27 | 60,418 | 1,498 | 1,378 | 3,331 | 3,773
Mexico ag | | 137,778 | | 53,862 | 11,638 | |
Mexico area | 98,400 | 26,201 | 57,960 | 42,108 | 20,576 | 21,937 | 425,809
Mexico onroad | 1,418,503 | 2,509 | 350,527 | 13,377 | 9,349 | 5,778 | 127,181
Mexico point | 158,097 | 979 | 199,367 | 90,822 | 53,973 | 341,038 | 32,822
Mexico fires | 415,564 | 6,820 | 24,903 | 54,701 | 45,743 | 4,240 | 204,334
Mexico cmv_c1c2 | 157 | 0 | 1,016 | 27 | 26 | 4 | 42
Mexico cmv_c3 | 9,601 | 87 | 82,079 | 4,907 | 4,514 | 12,970 | 4,596
Offshore cmv_c1c2 | 4,445 | 14 | 28,377 | 743 | 721 | 88 | 1,065
Offshore cmv_c3 | 51,349 | 309 | 414,286 | 17,467 | 16,069 | 43,957 | 25,126
Offshore pt_oilgas | 28,548 | 5 | 34,658 | 422 | 416 | 321 | 31,400
Can/Mex/offshore total | 24,996,385 | 965,797 | 2,788,544 | 5,184,380 | 3,693,725 | 1,106,143 | 7,335,281

3.3 Emissions Modeling Summary

The CMAQ air quality model requires hourly emissions of specific gas and particle species for the horizontal and vertical grid cells contained within the modeled region (i.e., modeling domain). To provide emissions in the form and format required by the model, it is necessary to "pre-process" the "raw" emissions (i.e., emissions input to SMOKE) for the sectors described above. In brief, the process of emissions modeling transforms the emissions inventories from their original temporal resolution, pollutant resolution, and spatial resolution into the hourly, speciated, gridded, and vertical resolution required by the air quality model.
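To make the temporal part of this transformation concrete, the sketch below allocates an annual total to the 24 hours of an average day in one month. It is only an illustration of the idea: the function, the monthly fraction, and the flat diurnal profile are hypothetical and are not taken from the SMOKE temporal profile files.

```python
def allocate_annual_to_hourly(annual_total, month_fraction, days_in_month, diurnal_profile):
    """Split an annual total into 24 hourly values for an average day of one month."""
    if len(diurnal_profile) != 24 or abs(sum(diurnal_profile) - 1.0) > 1e-9:
        raise ValueError("diurnal profile must hold 24 fractions that sum to 1")
    monthly_total = annual_total * month_fraction   # annual -> month
    daily_total = monthly_total / days_in_month     # month -> average day
    return [daily_total * f for f in diurnal_profile]  # day -> hours


# Hypothetical case: 8760 tons/yr, a 31-day month's share of the year,
# and a flat diurnal profile (every hour gets 1/24 of the day).
flat_profile = [1.0 / 24] * 24
hourly = allocate_annual_to_hourly(8760.0, 31 / 365, 31, flat_profile)
```

Because each step multiplies by fractions that sum to one, the hourly values add back up to the daily share of the annual total.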
Emissions modeling includes temporal allocation, spatial allocation, and pollutant speciation. Emissions modeling sometimes includes the vertical allocation (i.e., plume rise) of point sources, but many air quality models also perform this task because it greatly reduces the size of the input emissions files if the vertical layers of the sources are not included.

The temporal resolutions of the emissions inventories input to SMOKE vary across sectors and may be hourly, daily, monthly, or annual total emissions. The spatial resolution may be individual point sources; totals by county (U.S.), province (Canada), or municipio (Mexico); or gridded emissions. This section provides some basic information about the tools and data files used for emissions modeling as part of the modeling platform.

3.3.1 The SMOKE Modeling System

SMOKE version 5.0 was used to process the raw emissions inventories for each modeling sector into emissions inputs in a format compatible with CMAQ. SMOKE executables and source code are available from the Community Modeling and Analysis System (CMAS) Center at http://www.cmascenter.org. Additional information about SMOKE is available from http://www.smoke-model.org.

For sectors that have plume rise, the in-line plume rise capability allows for the use of emissions files that are much smaller than full three-dimensional gridded emissions files. For quality assurance of the emissions modeling steps, emissions totals by species for the entire model domain are output as reports that are then compared to reports generated by SMOKE on the input inventories to ensure that mass is not lost or gained during the emissions modeling process.

3.3.2 Key Emissions Modeling Settings

When preparing emissions for the air quality model, emissions for each sector are processed separately through SMOKE, and then the final merge program (Mrggrid) is run to combine the model-ready, sector-specific 2-D gridded emissions across sectors.
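Conceptually, the merge and the mass check described above amount to a cell-by-cell sum followed by a comparison of totals. The sketch below illustrates that idea on tiny hypothetical grids; it is not how Mrggrid or the SMOKE reports are implemented, and the function names, sector values, and tolerance are assumptions.

```python
def merge_sector_grids(sector_grids):
    """Sum same-shaped 2-D grids (lists of rows) cell by cell."""
    grids = list(sector_grids.values())
    rows, cols = len(grids[0]), len(grids[0][0])
    merged = [[0.0] * cols for _ in range(rows)]
    for grid in grids:
        if len(grid) != rows or any(len(r) != cols for r in grid):
            raise ValueError("all sector grids must share one shape")
        for i in range(rows):
            for j in range(cols):
                merged[i][j] += grid[i][j]
    return merged


def mass_is_conserved(sector_grids, merged, rel_tol=1e-9):
    """Compare the domain total before and after the merge."""
    total_in = sum(sum(map(sum, g)) for g in sector_grids.values())
    total_out = sum(map(sum, merged))
    return abs(total_in - total_out) <= rel_tol * max(abs(total_in), 1.0)


# Hypothetical 2x2 grids of one species for one hour, in arbitrary mass units.
sectors = {
    "onroad": [[1.0, 2.0], [0.0, 3.0]],
    "nonpt": [[0.5, 0.0], [1.5, 1.0]],
}
merged = merge_sector_grids(sectors)
conserved = mass_is_conserved(sectors, merged)
```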
The SMOKE settings in the run scripts and the data in the SMOKE ancillary files control the approaches used by the individual SMOKE programs for each sector. Table 3-4 summarizes the major processing steps of each platform sector, with the columns as follows.

The "Spatial" column shows the spatial approach used: "point" indicates that SMOKE maps the source from a point location (i.e., latitude and longitude) to a grid cell; "surrogates" indicates that some or all of the sources use spatial surrogates to allocate county emissions to grid cells; and "area-to-point" indicates that some of the sources use the SMOKE area-to-point feature to grid the emissions.

The "Speciation" column indicates that all sectors use the SMOKE speciation step, though biogenics speciation is done within the Tmpbeis3 program and not as a separate SMOKE step.

The "Inventory resolution" column shows the inventory temporal resolution from which SMOKE needs to calculate hourly emissions. Note that for some sectors (e.g., onroad, beis), there is no input inventory; instead, activity data and emission factors are used in combination with meteorological data to compute hourly emissions.

Finally, the "Plume rise" column indicates the sectors for which the "in-line" approach is used. These sectors are the only ones with emissions in aloft layers based on plume rise. The term "in-line" means that the plume rise calculations are done inside of the air quality model instead of being computed by SMOKE.

In all of the "in-line" sectors, all sources are output by SMOKE into point source files which are subject to plume rise calculations in the air quality model. In other words, no emissions are output to layer 1 gridded emissions files from those sectors as has been done in past platforms. The air quality model computes the plume rise using stack parameters, the Briggs algorithm, and the hourly emissions in the SMOKE output files for each emissions sector.
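Once the plume top and bottom are known, emissions are assigned to the layers the plume intersects in proportion to each layer's share of the pressure difference across the plume. A minimal sketch of that weighting follows; it does not compute the Briggs plume rise itself, and the function name, layer-edge pressures, and plume pressures are hypothetical.

```python
def plume_layer_fractions(layer_edges_pa, plume_bottom_pa, plume_top_pa):
    """Fraction of a plume's emissions per layer, weighted by pressure overlap.

    layer_edges_pa lists pressures at layer interfaces, surface first, so the
    values decrease with height. The plume bottom has the higher pressure.
    """
    span = plume_bottom_pa - plume_top_pa
    if span <= 0:
        raise ValueError("plume bottom pressure must exceed plume top pressure")
    fractions = []
    for lower, upper in zip(layer_edges_pa[:-1], layer_edges_pa[1:]):
        # Pressure thickness of the part of this layer inside the plume.
        overlap = min(lower, plume_bottom_pa) - max(upper, plume_top_pa)
        fractions.append(max(overlap, 0.0) / span)
    return fractions


# Hypothetical three-layer column and a plume spanning 98500 Pa to 95500 Pa.
edges = [100000.0, 99000.0, 97000.0, 94000.0]
fracs = plume_layer_fractions(edges, plume_bottom_pa=98500.0, plume_top_pa=95500.0)
```

In this example the plume misses the lowest layer and is split evenly between the two layers above it.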
The height of the plume rise determines the model layers into which the emissions are placed. The plume top and bottom are computed, along with the plumes' distributions into the vertical layers that the plumes intersect. The pressure difference across each layer divided by the pressure difference across the entire plume is used as a weighting factor to assign the emissions to layers. This approach gives plume fractions by layer and source. Day-specific point fire emissions are treated differently in CMAQ. After plume rise is applied, there are emissions in every layer from the ground up to the top of the plume.

Table 3-4. Key emissions modeling steps by sector

Platform sector | Spatial | Speciation | Inventory resolution | Plume rise
afdust_adj | Surrogates | Yes | Annual |
airports | Point | Yes | Annual | None
beis | Pre-gridded land use | in BEIS4 | computed hourly in CMAQ |
fertilizer | EPIC | No | computed hourly in CMAQ |
livestock | Surrogates | Yes | Daily |
cmv_c1c2 | Point | Yes | hourly | in-line
cmv_c3 | Point | Yes | hourly | in-line
nonpt | Surrogates & area-to-point | Yes | Annual |
nonroad | Surrogates | Yes | monthly |
np_oilgas | Surrogates | Yes | Annual |
onroad | Surrogates | Yes | monthly activity, computed hourly |
onroad_ca_adj | Surrogates | Yes | monthly activity, computed hourly |
canada_onroad | Surrogates | Yes | monthly |
mexico_onroad | Surrogates | Yes | monthly |
canada_afdust | Surrogates | Yes | annual & monthly |
canmex_area | Surrogates | Yes | monthly |
canmex_point | Point | Yes | monthly | in-line
canada_ptdust | Point | Yes | annual | None
canada_og2D | Point | Yes | monthly | None
canmex_ag | Surrogates | Yes | annual |
ptagfire | Point | Yes | daily | in-line
pt_oilgas | Point | Yes | annual | in-line
ptegu | Point | Yes | daily & hourly | in-line
ptfire-rx | Point | Yes | daily | in-line
ptfire-wild | Point | Yes | daily | in-line
ptfire_othna | Point | Yes | daily | in-line
ptnonipm | Point | Yes | annual | in-line
rail | Surrogates | Yes | annual |
rwc | Surrogates | Yes | annual |
np_solvents | Surrogates | Yes | annual |

Note that SMOKE has the option of grouping sources so that they are treated as a single stack when computing plume rise. For the modeling cases discussed in this document, no grouping was performed because grouping combined with "in-line" processing will not give results identical to "offline" processing (i.e., when SMOKE creates 3-dimensional files). The difference arises when stacks with different stack parameters or different latitudes and longitudes are grouped, which changes the parameters of one or more sources. The most straightforward way to get the same results between in-line and offline processing is to avoid the use of stack grouping.

Biogenic emissions can be modeled two different ways in the CMAQ model. The BEIS model in SMOKE can produce gridded biogenic emissions that are then included in the gridded CMAQ-ready emissions inputs, or alternatively, CMAQ can be configured to create "in-line" biogenic emissions within CMAQ itself. For this study, the in-line biogenic emissions option was used, and thus biogenic emissions from BEIS were not included in the gridded CMAQ-ready emissions.

3.3.3 Spatial Configuration

For this study, SMOKE was run for the larger 12-km CONtinental United States "CONUS" modeling domain (12US1) shown in Figure 3-1, but the air quality model was run on the smaller 12-km domain (12US2). The grid used a Lambert-Conformal projection, with Alpha = 33, Beta = 45 and Gamma = -97, with a center of X = -97 and Y = 40. Later sections provide details on the spatial surrogates and area-to-point data used to accomplish spatial allocation with SMOKE. WRF, SMOKE, and CMAQ all presume the Earth is a sphere with a radius of 6370000 m.

Figure 3-1. CMAQ Modeling Domain.

3.3.4 Chemical Speciation Configuration

Chemical speciation involves the process of translating emissions from the inventory into the chemical mechanism-specific "model species" needed by an air quality model.
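In its simplest form, this translation can be pictured as multiplying an inventory pollutant total by per-species split factors from a speciation profile. The sketch below splits an inventory NOx total into the NO, NO2, and HONO model species; the split factors are placeholders chosen for illustration, not values from SPECIATE or the platform's GSPRO files, and the function is hypothetical. Unit and molecular-weight conversions are left out.

```python
def speciate(inventory_mass, split_factors):
    """Apportion one inventory pollutant's mass to model species."""
    total = sum(split_factors.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("split factors must sum to 1 so that mass is conserved")
    return {species: inventory_mass * f for species, f in split_factors.items()}


# Placeholder profile: illustrative fractions only.
placeholder_nox_profile = {"NO": 0.9, "NO2": 0.09, "HONO": 0.01}
species_mass = speciate(1000.0, placeholder_nox_profile)
```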
Using the CB6R5_AE7 chemical mechanism as an example, these model species either represent explicit chemical compounds (e.g., acetone, benzene, ethanol) or groups of species (i.e., "lumped species"; e.g., PAR, OLE, KET). Table 3-5 lists the model species produced by SMOKE in the platform for the mechanism used for this study.

Table 3-5. Emission model species produced for CB6R5_AE7 for CMAQ

| Inventory Pollutant | Model Species | Model species description |
|---|---|---|
| Cl2 | CL2 | Atomic gas-phase chlorine |
| HCl | HCL | Hydrogen Chloride (hydrochloric acid) gas |
| CO | CO | Carbon monoxide |
| NOx | NO | Nitrogen oxide |
| NOx | NO2 | Nitrogen dioxide |
| NOx | HONO | Nitrous acid |
| SO2 | SO2 | Sulfur dioxide |
| SO2 | SULF | Sulfuric acid vapor |
| NH3 | NH3 | Ammonia |
| NH3 | NH3_FERT | Ammonia from fertilizer |
| VOC | AACD | Acetic acid |
| VOC | ACET | Acetone |
| VOC | ALD2 | Acetaldehyde |
| VOC | ALDX | Propionaldehyde and higher aldehydes |
| VOC | APIN | Alpha pinene |
| VOC | BENZ | Benzene |
| VOC | CAT1 | Methyl-catechols |
| VOC | CH4 | Methane |
| VOC | CRES | Cresols |
| VOC | CRON | Nitro-cresols |
| VOC | ETH | Ethene |
| VOC | ETHA | Ethane |
| VOC | ETHY | Ethyne |
| VOC | ETOH | Ethanol |
| VOC | FACD | Formic acid |
| VOC | FORM | Formaldehyde |
| VOC | GLY | Glyoxal |
| VOC | GLYD | Glycolaldehyde |
| VOC | IOLE | Internal olefin carbon bond (R-C=C-R) |
| VOC | ISOP | Isoprene |
| VOC | ISPD | Isoprene Product |
| VOC | IVOC | Intermediate volatility organic compounds |
| VOC | KET | Ketone Groups |
| VOC | MEOH | Methanol |
| VOC | MGLY | Methylglyoxal |
| VOC | NAPH | Naphthalene |
| VOC | NVOL | Non-volatile compounds |
| VOC | OLE | Terminal olefin carbon bond (R-C=C) |
| VOC | PACD | Peroxyacetic and higher peroxycarboxylic acids |
| VOC | PAR | Paraffin carbon bond |
| VOC | PRPA | Propane |
| VOC | SESQ | Sesquiterpenes (from biogenics only) |
| VOC | SOAALK | Secondary Organic Aerosol (SOA) tracer |
| VOC | TERP | Terpenes (from biogenics only) |
| VOC | TOL | Toluene and other monoalkyl aromatics |
| VOC | UNR | Unreactive |
| VOC | XYLMN | Xylene and other polyalkyl aromatics, minus naphthalene |
| Naphthalene | NAPH | Naphthalene from inventory |
| Benzene | BENZ | Benzene from the inventory |
| Acetaldehyde | ALD2 | Acetaldehyde from inventory |
| Formaldehyde | FORM | Formaldehyde from inventory |
| Methanol | MEOH | Methanol from inventory |
| PM10 | PMC | Coarse PM > 2.5 microns and < 10 microns |
| PM2.5 | PEC | Particulate elemental carbon < 2.5 microns |
| PM2.5 | PNO3 | Particulate nitrate < 2.5 microns |
| PM2.5 | POC | Particulate organic carbon (carbon only) < 2.5 microns |
| PM2.5 | PSO4 | Particulate Sulfate < 2.5 microns |
| PM2.5 | PAL | Aluminum |
| PM2.5 | PCA | Calcium |
| PM2.5 | PCL | Chloride |
| PM2.5 | PFE | Iron |
| PM2.5 | PK | Potassium |
| PM2.5 | PH2O | Water |
| PM2.5 | PMG | Magnesium |
| PM2.5 | PMN | Manganese |
| PM2.5 | PMOTHR | PM2.5 not in other AE6 species |
| PM2.5 | PNA | Sodium |
| PM2.5 | PNCOM | Non-carbon organic matter |
| PM2.5 | PNH4 | Ammonium |
| PM2.5 | PSI | Silica |
| PM2.5 | PTI | Titanium |

The TOG and PM2.5 profiles used to speciate emissions are part of the SPECIATE v5.2 database (https://www.epa.gov/air-emissions-modeling/speciate). The SPECIATE database is developed and maintained by the EPA's Office of Research and Development (ORD), Office of Transportation and Air Quality (OTAQ), and the Office of Air Quality Planning and Standards (OAQPS), in cooperation with Environment Canada (EPA, 2019). These profiles are processed using the EPA's S2S-Tool (https://github.com/USEPA/S2S-Tool) to generate the GSPRO and GSCNV files needed by SMOKE. As with previous platforms, some Canadian point source inventories are provided from Environment Canada as pre-speciated emissions.

Speciation profiles (GSPRO files) and cross-references (GSREF files) for this platform are available in the SMOKE input files for the platform. VOC and PM2.5 emissions by county, sector, and profile for all sectors other than onroad mobile can be found in the sector summaries. Total emissions for each model species by state and sector can be found in the state-sector totals workbook.

The following updates to profile assignments were made to this modeling platform and differ from prior years:

• For PM2.5:
o All GSPRO files were generated by the S2S-Tool, dated 09-11-2023, and utilized SPECIATE v5.3.
o Update of the CMV speciation cross-reference files to utilize the SCC updates for this sector and use the new CROC profiles introduced in SPECIATE v5.3.
o Update onroad and nonroad mobile cross-reference files to utilize the CROC profiles introduced in SPECIATE v5.3.

• For VOC:
o All GSPRO and GSCNV files were generated by the S2S-Tool, dated 09-11-2023, and utilized SPECIATE v5.3.
o Profile assignments for all oil and gas well completion and abandoned well emissions were updated (or added, in the case of abandoned wells) from 1101 and 8949 to 95404 and 95403, respectively. However, this update was not performed for basin-specific profiles that were output by the O&G Tool.
o Update of the CMV speciation cross-reference files to utilize the SCC updates for this sector and use the new GROC profiles introduced in SPECIATE v5.3.
o Update usage of 95120a to 95120c.
o Update onroad and nonroad mobile cross-reference files to utilize the GROC profiles introduced in SPECIATE v5.3.

The base emissions inventory for this modeling platform includes total VOC and individual HAP emissions. Often, individual HAPs are components of VOC (HAP-VOC), and these HAP-VOCs are included ("integrated") in the speciation process. This HAP integration is performed in a way that ensures double counting of emitted mass does not occur, and it requires specific data processing by the S2S-Tool and user input in SMOKE.

To incorporate HAP emissions from the base inventory into the modeling platform, one of two methods is used.

(1) Integrate, HAP-use is a method where the mass of integrated HAP-VOCs is summed and subtracted from VOC, and the residual mass (NONHAPVOC) is speciated using a renormalized speciation profile that does not include the integrated HAP-VOCs (they are subtracted from the profile and then the profile is renormalized to 100%).
(2) No-Integrate, HAP-use is a method where the mass of VOC is speciated using a speciation profile that does not include the integrated HAP-VOCs (they are subtracted from the profile and the profile is not renormalized to 100%). In this scenario, the HAP-VOC and VOC portions of the inventory are difficult to harmonize, and it is assumed that the proportions of HAPs from these sources are adequately captured in the speciation profile used to speciate the VOC emissions (which is why there is no renormalization).

In addition, HAPs can be introduced into a modeling platform using speciation profiles. In this scenario, HAP-VOC emissions are "generated" through VOC speciation and are not incorporated from the base inventory. This method is called "Criteria" speciation. The integration methods used for each platform sector are shown in Table 3-6.

Table 3-6. Integration status for each platform sector

| Platform Sector | Approach for Integrating NEI emissions of Naphthalene (N), Benzene (B), Acetaldehyde (A), Formaldehyde (F) and Methanol (M) |
|---|---|
| afdust | N/A - sector contains no VOC |
| airports | No integration, use NBAFM in inventory |
| beis | N/A - sector contains no inventory pollutant "VOC"; but rather specific VOC species |
| cmv_c1c2 | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| cmv_c3 | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| fertilizer | N/A - sector contains no VOC |
| livestock | Full integration (NBAFM) |
| nonpt | Partial integration (NBAFM) |
| nonroad | Full integration (internal to MOVES) |
| np_oilgas | Partial integration (NBAFM) |
| onroad | Full integration (internal to MOVES) |
| canada_onroad | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| mexico_onroad | Full integration (internal to MOVES-Mexico); however, MOVES-MEXICO speciation was older CB6, so post-SMOKE emissions were converted to CB6R3AE6 |
| canada_afdust | N/A - sector contains no VOC |
| canmex_area | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canmex_point | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canada_ptdust | N/A - sector contains no VOC |
| canada_og2D | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canmex_ag | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| pt_oilgas | No integration, use NBAFM in inventory |
| ptagfire | Full integration (NBAFM) |
| ptegu | No integration, use NBAFM in inventory |
| ptfire-rx | Full integration (NBAFM) |
| ptfire-wild | Partial integration (NBAFM) |
| ptfire_othna | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| ptnonipm | No integration, use NBAFM in inventory |
| rail | Full integration (NBAFM) |
| rwc | Full integration (NBAFM) |
| np_solvents | Partial integration (NBAFM) |

The HAPs integrated from the base inventory into the modeling platform are sector and chemical mechanism specific. In recent years, CB6R3_AE7 has been the primary chemical mechanism used at the EPA. Within that mechanism, naphthalene (NAPH), benzene (BENZ), acetaldehyde (ALD2), formaldehyde (FORM), and methanol (MEOH) are explicit HAP-VOCs, and these compounds are collectively referred to as NBAFM. Since NBAFM are explicitly modeled in CB6R3_AE7, these species have become the default collection of integrated HAP species at the EPA. MOVES, the EPA's mobile emissions model, features additional species that are explicitly modeled (e.g., ethanol). These species are also incorporated directly into modeling platforms. To incorporate these species, additional files from the S2S-Tool are required.

For California, speciation of NONHAPTOG is performed on CARB's VOC submissions using the county-specific speciation profile assignments generated by MOVES in California.

In the NEI, NOx emissions are inventoried on a NO2 weighted basis, but must be speciated into NO, NO2, and HONO. Table 3-7 provides the NOx speciation profiles used in EPA's modeling platforms.
The only difference between the two profiles is the allocation of some NO2 mass to HONO in the "HONO" profile. HONO emissions from mobile sources have been identified in tunnel studies, and their inclusion in emissions inventories is important for urban chemistry. Here, a HONO to NOx ratio of 0.008 was selected (Sarwar, 2008). In this modeling platform, all non-mobile sources use the "NHONO" profile, all non-onroad mobile sources (including nonroad, cmv, and rail) use the "HONO" profile, and all onroad NOx speciation occurs within MOVES. For further details on NOx speciation within MOVES, please see the associated technical report.

Table 3-7. NOx speciation profiles

| Profile | Pollutant | Species | Split factor |
|---|---|---|---|
| HONO | NOx | NO2 | 0.092 |
| HONO | NOx | NO | 0.9 |
| HONO | NOx | HONO | 0.008 |
| NHONO | NOx | NO2 | 0.1 |
| NHONO | NOx | NO | 0.9 |

3.3.5 Temporal Processing Configuration

Temporal allocation is the process of distributing aggregated emissions to a finer temporal resolution, thereby converting annual emissions to hourly emissions as is required by CMAQ. While the total emissions are important, the timing of the occurrence of emissions is also essential for accurately simulating ozone, PM, and other pollutant concentrations in the atmosphere. Many emissions inventories are annual or monthly in nature. Temporal allocation takes these aggregated emissions and distributes the emissions to the hours of each day. This process is typically done by applying temporal profiles to the inventories in this order: monthly, day of the week, and diurnal, with monthly and day-of-week profiles applied only if the inventory is not already at that level of detail.

The temporal factors applied to the inventory were selected using some combination of country, state, county, SCC, and pollutant. Table 3-8 summarizes the temporal aspects of emissions modeling by comparing the key approaches used for temporal processing across the sectors.
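The profile ordering described above (monthly, then day of the week, then diurnal) can be illustrated with a minimal Python sketch. It is not SMOKE's implementation: the function name, the profile values, and the way day-of-week weights are normalized over the days of a month are all assumptions made for illustration.

```python
import calendar


def allocate_annual_to_hourly(annual_total, monthly_frac, weekday_weight,
                              diurnal_frac, year, month, day):
    """Return the 24 hourly emissions for one date, given an annual total."""
    # Step 1 (monthly profile): annual total -> total for this month.
    month_total = annual_total * monthly_frac[month - 1]

    # Step 2 (day-of-week profile): month total -> total for this day.
    # Each day is weighted by its weekday factor (0 = Monday ... 6 = Sunday),
    # normalized over all days in the month so that mass is conserved.
    n_days = calendar.monthrange(year, month)[1]
    weights = [weekday_weight[calendar.weekday(year, month, d)]
               for d in range(1, n_days + 1)]
    day_total = month_total * weights[day - 1] / sum(weights)

    # Step 3 (diurnal profile): day total -> 24 hourly values.
    return [day_total * f for f in diurnal_frac]


# Hypothetical profiles for illustration only (not SMOKE defaults).
MONTHLY = [1.0 / 12.0] * 12                     # flat across months
WEEKDAY = [1.0, 1.0, 1.0, 1.0, 1.0, 0.8, 0.6]   # Mon-Fri, Sat, Sun
_raw = [1.0] * 6 + [3.0] * 12 + [2.0] * 6       # night, daytime, evening
DIURNAL = [w / sum(_raw) for w in _raw]         # 24 fractions summing to 1

hourly = allocate_annual_to_hourly(1200.0, MONTHLY, WEEKDAY, DIURNAL,
                                   2021, 7, 4)
print(len(hourly), round(sum(hourly), 4))
```

For an inventory that is already monthly, step 1 would be skipped and the monthly value passed directly into step 2, which mirrors the rule that a profile is applied only when the inventory is not already at that level of detail.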
In the table, "Daily temporal approach" refers to the temporal approach for getting daily emissions from the inventory using the SMOKE Temporal program. The values given are the values of the SMOKE L_TYPE setting. The "Merge processing approach" refers to the days used to represent other days in the month for the merge step. If this is not "all," then the SMOKE merge step runs only for representative days, which could include holidays as indicated by the right-most column. The values given are those used for the SMOKE M_TYPE setting (see below for more information).

Table 3-8. Temporal Settings Used for the Platform Sectors in SMOKE

| Platform sector short name | Inventory resolutions | Monthly profiles used? | Daily temporal approach | Merge processing approach | Process holidays as separate days |
|---|---|---|---|---|---|
| afdust_adj | Annual | Yes | week | all | Yes |
| airports | Annual | Yes | all | all | No |
| beis | Hourly | | n/a | all | No |
| cmv_c1c2 | Annual & hourly | | all | all | No |
| cmv_c3 | Annual & hourly | | all | all | No |
| fertilizer | Monthly | | met-based | all | Yes |
| livestock | Daily | | met-based | all | No |
| nonpt | Annual | Yes | week | week | Yes |
| nonroad | Monthly | | mwdss | mwdss | Yes |
| np_oilgas | Annual | Yes | aveday | aveday | No |
| onroad | Annual & monthly¹ | | all | all | Yes |
| onroad_ca_adj | Annual & monthly¹ | | all | all | Yes |
| canada_afdust | Annual & monthly | Yes | week | all | No |
| canmex_area | Monthly | | week | week | No |
| canada_onroad | Monthly | | week | week | No |
| mexico_onroad | Monthly | | week | week | No |
| canmex_point | Monthly | Yes | mwdss | mwdss | No |
| canada_ptdust | Annual | Yes | week | all | No |
| canmex_ag | Annual | Yes | mwdss | mwdss | No |
| canada_og2D | Monthly | | mwdss | mwdss | No |
| pt_oilgas | Annual | Yes | mwdss | mwdss | Yes |
| ptegu | Annual & hourly | Yes² | all | all | No |
| ptnonipm | Annual | Yes | mwdss | mwdss | Yes |
| ptagfire | Daily | | all | all | No |
| ptfire-rx | Daily | | all | all | No |
| ptfire-wild | Daily | | all | all | No |
| ptfire_othna | Daily | | all | all | No |
| rail | Annual | Yes | aveday | aveday | No |
| rwc | Annual | No³ | met-based³ | all | No³ |
| np_solvents | Annual | Yes | aveday | aveday | No |

1. Note the annual and monthly "inventory" actually refers to the activity data (VMT, VPOP, starts) for onroad. The actual emissions are computed on an hourly basis.
2. Only units that do not have matching hourly CEMs data use monthly temporal profiles.
3. Except for 2 SCCs that do not use met-based temporalization.

The following values are used in the table. The value "all" means that hourly emissions were computed for every day of the year and that emissions potentially have day-of-year variation. The value "week" means that hourly emissions were computed for all days in one "representative" week, representing all weeks for each month. This means emissions have day-of-week variation, but not week-to-week variation within the month. The value "mwdss" means hourly emissions for one representative Monday, representative weekday (Tuesday through Friday), representative Saturday, and representative Sunday for each month. This means emissions have variation between Mondays, other weekdays, Saturdays and Sundays within the month, but not week-to-week variation within the month. The value "aveday" means hourly emissions computed for one representative day of each month, meaning emissions for all days within a month are the same. Special situations with respect to temporal allocation are described in the following subsections.

In addition to the resolution, temporal processing includes a ramp-up period for several days prior to January 1, 2021, which is intended to mitigate the effects of initial condition concentrations. The ramp-up period was 10 days (December 22-31, 2020). For all anthropogenic sectors, emissions from December 2021 were used to fill in emissions for the end of December 2020. For biogenic emissions, December 2020 emissions were computed using year 2020 meteorology.

The FF10 inventory format for SMOKE provides a consolidated format for monthly, daily, and hourly emissions inventories. With the FF10 format, a single inventory file can contain emissions for all 12 months and the annual emissions in a single record.
This helps simplify the management of numerous inventories. Similarly, daily and hourly FF10 inventories contain individual records with data for all days in a month and all hours in a day, respectively.

SMOKE prevents the application of temporal profiles on top of the "native" resolution of the inventory. For example, a monthly inventory should not have annual-to-month temporal allocation applied to it; rather, it should only have month-to-day and diurnal temporal allocation. This becomes particularly important when specific sectors have a mix of annual, monthly, daily, and/or hourly inventories. The flags that control temporal allocation for a mixed set of inventories are discussed in the SMOKE documentation. The modeling platform sectors that make use of monthly values in the FF10 files are nonroad, onroad (for activity data), and all Canada and Mexico inventories except for agriculture. Commercial marine vessels in cmv_c3 and cmv_c1c2 use hourly data in the FF10 files.

3.3.6 Vertical Allocation of Emissions

Table 3-4 specifies the sectors for which plume rise is calculated. If there is no plume rise for a sector, the emissions are placed into layer 1 of the air quality model. Vertical plume rise was performed in-line within CMAQ for all of the SMOKE point-source sectors (i.e., ptegu, ptnonipm, pt_oilgas, ptfire-rx, ptfire-wild, ptagfire, ptfire_othna, othpt, and cmv_c3). The in-line plume rise computed within CMAQ is nearly identical to the plume rise that would be calculated within SMOKE using the Laypoint program. The selection of point sources for plume rise is pre-determined in SMOKE using the Elevpoint program. The calculation is done in conjunction with the CMAQ model time steps with interpolated meteorological data and is therefore more temporally resolved than when it is done in SMOKE.
Also, the calculation of the location of the point sources is slightly different than the one used in SMOKE and this can result in slightly different placement of point sources near grid cell boundaries.

For point sources, the stack parameters are used as inputs to the Briggs algorithm, but point fires do not have traditional stack parameters. However, the ptfire-rx, ptfire-wild, ptagfire, and ptfire_othna inventories do contain data on the acres burned (acres per day) and fuel consumption (tons fuel per acre) for each day. CMAQ uses these additional parameters to estimate the plume rise of emissions into layers above the surface model layer. Specifically, these data are used to calculate heat flux, which is then used to estimate plume rise. In addition to the acres burned and fuel consumption, heat content of the fuel is needed to compute heat flux. The heat content was assumed to be 8000 Btu/lb of fuel for all fires because specific data on the fuels were unavailable in the inventory. The plume rise algorithm applied to the fires is a modification of the Briggs algorithm with a stack height of zero.

CMAQ uses the Briggs algorithm to determine the plume top and bottom, and then computes the plumes' distributions into the vertical layers that the plumes intersect. The pressure difference across each layer divided by the pressure difference across the entire plume is used as a weighting factor to assign the emissions to layers. This approach gives plume fractions by layer and source. Note that the implementation of fire plume rise in CMAQ differs from the implementation of plume rise in SMOKE. This study uses CMAQ to compute the fire plume rise.

3.3.7 Emissions Modeling Spatial Allocation

The methods used to perform spatial allocation are summarized in this section. For the modeling platform, spatial factors are typically applied by county and SCC. Spatial allocation was performed for the 12US1 modeling grid.
To accomplish this, SMOKE used national 12-km spatial surrogates and a SMOKE area-to-point data file. For the U.S., the surrogates use circa 2020 data. For Canada, shapefiles for generating new surrogates were provided by ECCC for use with their 2020 inventories. The U.S., Mexican, and Canadian 12-km surrogates cover the entire CONUS domain 12US1. While highlights of information are provided below, the file Surrogate_specifications_2021_platform_US_Can_Mex.xlsx documents the complete configuration for generating the surrogates and can be referenced for more details.

3.3.7.1 Surrogates for U.S. Emissions

There are more than 80 spatial surrogates available for spatially allocating U.S. county-level emissions to the 12-km grid cells used by the air quality model. Note that an area-to-point approach overrides the use of surrogates for a limited set of sources. Table 3-9 lists the codes and descriptions of the surrogates. Surrogate names and codes listed in italics are not directly assigned to any sources for this platform, but they are sometimes used to gap-fill other surrogates. When the source data for a surrogate have no values for a particular county, gap filling is used to provide values for the spatial surrogate in those counties to ensure that no emissions are dropped when the spatial surrogates are applied to the emission inventories.

The surrogates for the platform are based on a variety of geospatial data sources, including the American Community Survey (ACS) for census-related data and the National Land Cover Database (NLCD). Onroad surrogates are based on average annual daily traffic counts (AADT) from the Highway Performance Monitoring System (HPMS).

Surrogates for the U.S. were generated using the Surrogate Tools DB with the Java-based Surrogate tools used to perform gap filling and normalization where needed.
The tool and documentation for the original Surrogate Tool are available at https://www.cmascenter.org/sa-tools/documentation/4.2/SurrogateToolUserGuide_4_2.pdf, and the tool and documentation for the Surrogate Tools DB are available from https://www.cmascenter.org/surrogate_tools_db/. The file "Surrogate_specifications_2021_platform_US_Can_Mex.xlsx" documents the configuration for generating the surrogates.

Table 3-9. U.S. Surrogates available for the modeling platform

| Code | Surrogate Description | Code | Surrogate Description |
|---|---|---|---|
| N/A | Area-to-point approach (see 3.6.2) | 6696 | All Abandoned CBM Wells - Plugged |
| 100 | Population | 6697 | All Abandoned Oil Wells - Unplugged |
| 110 | Housing | 6698 | All Abandoned Gas Wells - Unplugged |
| 135 | Detached Housing | 670 | Spud Count - CBM Wells |
| 136 | Single and Dual Unit Housing | 671 | Spud Count - Gas Wells |
| 137 | Single + Dual Unit + Manufactured Housing | 672 | Gas production - oil wells |
| 150 | Residential Heating - Natural Gas | 674 | Unconventional Well Completion Counts |
| 170 | Residential Heating - Distillate Oil | 676 | Well count - all producing |
| 180 | Residential Heating - Coal | 677 | Well count - all exploratory |
| 190 | Residential Heating - LP Gas | 678 | Completions at Gas Wells |
| 205 | Extended Idle Locations | 679 | Completions at CBM Wells |
| 239 | Total Road AADT | 681 | Spud Count - Oil Wells |
| 240 | Total Road Miles | 683 | Produced Water at All Wells |
| 242 | All Restricted AADT | 6831 | Produced water at CBM wells |
| 244 | All Unrestricted AADT | 6832 | Produced water at gas wells |
| 258 | Intercity Bus Terminals | 6833 | Produced water at oil wells |
| 259 | Transit Bus Terminals | 685 | Completions at Oil Wells |
| 261 | NTAD Total Railroad Density | 686 | Completions - all wells |
| 271 | NTAD Class 1 2 3 Railroad Density | 687 | Feet Drilled at All Wells |
| 300 | NLCD Low Intensity Development | 689 | Gas Produced - Total |
| 304 | NLCD Open + Low | 691 | Well Counts - CBM Wells |
| 305 | NLCD Low + Med | 692 | Spud Count - All Wells |
| 306 | NLCD Med + High | 693 | Well Count - All Wells |
| 307 | NLCD All Development | 694 | Oil Production at Oil Wells |
| 308 | NLCD Low + Med + High | 695 | Well Count - Oil Wells |
| 309 | NLCD Open + Low + Med | 696 | Gas Production at Gas Wells |
| 310 | NLCD Total Agriculture | 697 | Oil production - gas wells |
| 319 | NLCD Crop Land | 698 | Well Count - Gas Wells |
| 320 | NLCD Forest Land | 699 | Gas Production at CBM Wells |
| 321 | NLCD Recreational Land | 711 | Airport Areas |
| 340 | NLCD Land | 801 | Port Areas |
| 350 | NLCD Water | 850 | Golf Courses |
| 401 | FAO 2010 Cattle | 860 | Mines |
| 402 | FAO 2010 Pig | 861 | Sand and Gravel Mines |
| 403 | FAO 2010 Chicken | 862 | Lead Mines |
| 404 | FAO 2010 Goat | 863 | Crushed Stone Mines |
| 405 | FAO 2010 Horse | 900 | OSM Fuel |
| 406 | FAO 2010 Sheep | 901 | OSM Asphalt Surfaces |
| 508 | Public Schools | 902 | OSM Unpaved Roads |
| 650 | Refineries and Tank Farms | 4011 | FAO 2010 Large Cattle Operations |
| 669 | All Abandoned Wells | 4012 | NPDES 2020 Beef Cattle |
| 6691 | All Abandoned Oil Wells | 4013 | NPDES 2020 Dairy Cattle |
| 6692 | All Abandoned Gas Wells | 4021 | NPDES 2020 Swine |
| 6693 | All Abandoned CBM Wells | 4031 | NPDES 2020 Chicken |
| 6694 | All Abandoned Oil Wells - Plugged | 4041 | NPDES 2020 Goat |
| 6695 | All Abandoned Gas Wells - Plugged | 4071 | NPDES 2020 Turkey |

For the onroad sector, the on-network (RPD) emissions were spatially allocated differently from other off-network processes (i.e., RPV, RPP, RPHO, RPS, RPH). Surrogates for on-network processes are based on AADT data, and off-network processes (including the off-network idling included in RPHO) are based on land use surrogates as shown in Table 3-10. Emissions from the extended (i.e., overnight) idling of trucks were assigned to surrogate 205, which is based on locations of overnight truck parking spaces. The underlying data for this surrogate were updated during the development of the 2016 platforms to include additional data sources and corrections based on comments received, and these updates were carried into this platform.

Table 3-10.
Off-Network Mobile Source Surrogates

| Source type | Source Type name | Surrogate ID | Description |
|---|---|---|---|
| 11 | Motorcycle | 307 | NLCD All Development |
| 21 | Passenger Car | 307 | NLCD All Development |
| 31 | Passenger Truck | 307 | NLCD All Development |
| 32 | Light Commercial Truck | 308 | NLCD Low + Med + High |
| 41 | Other Bus | 306 | NLCD Med + High |
| 42 | Transit Bus | 259 | Transit Bus Terminals |
| 43 | School Bus | 508 | Public Schools |
| 51 | Refuse Truck | 306 | NLCD Med + High |
| 52 | Single Unit Short-haul Truck | 306 | NLCD Med + High |
| 53 | Single Unit Long-haul Truck | 306 | NLCD Med + High |
| 54 | Motor Home | 304 | NLCD Open + Low |
| 61 | Combination Short-haul Truck | 306 | NLCD Med + High |
| 62 | Combination Long-haul Truck | 306 | NLCD Med + High |

For the oil and gas sources in the np_oilgas sector, the spatial surrogates were updated to those shown in Table 3-11 using 2021 data consistent with what was used to develop the nonpoint oil and gas emissions. The exploration and production of oil and gas have increased in terms of quantities and locations over the last seven years, primarily through the use of new technologies, such as hydraulic fracturing. Census-tract, 2-km, and 4-km sub-county Shapefiles were developed, from which the year-specific oil and gas surrogates were generated. All spatial surrogates for np_oilgas are developed based on known locations of oil and gas activity for year 2021.

Table 3-11. Spatial Surrogates for Oil and Gas Sources

| Surrogate Code | Surrogate Description |
|---|---|
| 669 | All Abandoned Wells |
| 6691 | All Abandoned Oil Wells |
| 6692 | All Abandoned Gas Wells |
| 6693 | All Abandoned CBM Wells |
| 6694 | All Abandoned Oil Wells - Plugged |
| 6695 | All Abandoned Gas Wells - Plugged |
| 6696 | All Abandoned CBM Wells - Plugged |
| 6697 | All Abandoned Oil Wells - Unplugged |
| 6698 | All Abandoned Gas Wells - Unplugged |
| 670 | Spud Count - CBM Wells |
| 671 | Spud Count - Gas Wells |
| 672 | Gas Production at Oil Wells |
| 673 | Oil Production at CBM Wells |
| 674 | Unconventional Well Completion Counts |
| 676 | Well Count - All Producing |
| 677 | Well Count - All Exploratory |
| 678 | Completions at Gas Wells |
| 679 | Completions at CBM Wells |
| 681 | Spud Count - Oil Wells |
| 683 | Produced Water at All Wells |
| 685 | Completions at Oil Wells |
| 686 | Completions at All Wells |
| 687 | Feet Drilled at All Wells |
| 689 | Gas Produced - Total |
| 691 | Well Counts - CBM Wells |
| 692 | Spud Count - All Wells |
| 693 | Well Count - All Wells |
| 694 | Oil Production at Oil Wells |
| 695 | Well Count - Oil Wells |
| 696 | Gas Production at Gas Wells |
| 697 | Oil Production at Gas Wells |
| 698 | Well Count - Gas Wells |
| 699 | Gas Production at CBM Wells |
| 6831 | Produced water at CBM wells |
| 6832 | Produced water at gas wells |
| 6833 | Produced water at oil wells |

3.3.7.2 Allocation Method for Airport-Related Sources in the U.S.

There are numerous airport-related emission sources in the NEI, such as aircraft, airport ground support equipment, and jet refueling. The modeling platform includes the aircraft and airport ground support equipment emissions as point sources. For the modeling platform, the EPA used the SMOKE "area-to-point" approach only for jet refueling in the nonpt sector. The following SCCs use this approach: 2501080050 and 2501080100 (petroleum storage at airports), and 2810040000 (aircraft/rocket engine firing and testing). The ARTOPNT file that lists the nonpoint sources to locate using point data was unchanged from the 2005-based platform.
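The surrogate-based allocation and gap filling described in Section 3.3.7.1 can be illustrated with a short Python sketch. It is a simplified illustration rather than the SMOKE or Surrogate Tools implementation: the function name, the county identifiers, the grid cell indices, and the surrogate fractions are all hypothetical, and a single fallback surrogate stands in for whatever gap-filling sequence a real surrogate specification defines.

```python
def allocate_to_grid(county_emis, primary, fallback):
    """Distribute county totals to grid cells using surrogate fractions.

    county_emis: {county_id: emissions}
    primary, fallback: {county_id: {grid_cell: fraction}}, with the fractions
    for each county summing to 1.
    """
    gridded = {}
    for county, emis in county_emis.items():
        # Gap filling: use a second surrogate when the primary surrogate has
        # no data for this county, so that no emissions are dropped.
        fractions = primary.get(county) or fallback.get(county)
        if not fractions:
            raise ValueError(f"no surrogate data for county {county}")
        for cell, frac in fractions.items():
            gridded[cell] = gridded.get(cell, 0.0) + emis * frac
    return gridded


# Hypothetical county IDs, (column, row) grid cells, and fractions.
county_emis = {"A": 100.0, "B": 40.0}
primary = {"A": {(10, 5): 0.75, (11, 5): 0.25}}   # no data for county "B"
fallback = {"A": {(10, 5): 0.5, (11, 5): 0.5},
            "B": {(11, 5): 0.5, (12, 5): 0.5}}

gridded = allocate_to_grid(county_emis, primary, fallback)
print(gridded)
```

Because the fractions for each county sum to 1, the gridded total equals the sum of the county totals, which is the property that gap filling is meant to preserve.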
3.3.7.3 Surrogates for Canada and Mexico Emission Inventories

The surrogates used to spatially allocate the Canadian emissions are based on the 2020 Canadian inventories and associated data. The spatial surrogate data came from ECCC, along with cross references. The shapefiles they provided were used in the Surrogate Tool (previously referenced) to create spatial surrogates. The Canadian surrogates used for this platform are listed in Table 3-12. The population surrogate for Mexico was updated and is based on the 2015 GPW v4 (see https://sedac.ciesin.columbia.edu/data/collection/gpw-v4/sets/browse). The other surrogates for Mexico are circa 1999 and 2000 and were based on data obtained from the Sistema Municipal de Bases de Datos (SIMBAD) de INEGI and the Bases de datos del Censo Economico 1999. The surrogates for Mexico in this platform are shown in Table 3-13.

Table 3-12. Canadian Spatial Surrogates

| Code | Canadian Surrogate Description | Code | Description |
|---|---|---|---|
| 100 | Population | 925 | Manufacturing and Assembly |
| 101 | total dwelling | 926 | Distribution and Retail (no petroleum) |
| 102 | urban dwelling | 927 | Commercial Services |
| 103 | rural dwelling | 933 | Rail-Passenger |
| 104 | capped total dwelling | 934 | Rail-Freight |
| 105 | capped meat cooking dwelling | 935 | Rail-Yard |
| 106 | ALL INDUST | 940 | PAVED ROADS NEW |
| 113 | Forestry and logging | 945 | Commercial Marine Vessels |
| 116 | Total Resources | 946 | Construction and mining |
| 200 | Urban Primary Road Miles | 948 | Forest |
| 210 | Rural Primary Road Miles | 949 | Combination of Dwelling |
| 211 | Oil and Gas Extraction | 951 | Wood Consumption Percentage |
| 212 | Mining except oil and gas | 952 | Residential Fuel Wood Combustion (PIRD) |
| 220 | Urban Secondary Road Miles | 955 | UNPAVED ROADS AND TRAILS |
| 221 | Total Mining | 960 | TOTBEEF |
| 222 | Utilities | 961 | 80110 Broilers |
| 230 | Rural Secondary Road Miles | 962 | 80111_Cattle_dairy_and_Heifer |
| 233 | Total Land Development | 963 | 80112_Cattle_non-Dairy |
| 240 | capped population | 964 | 80113_Laying_hens_and_Pullets |
| 308 | Food manufacturing | 965 | 80114 Horses |
| 321 | Wood product manufacturing | 966 | 80115_Sheep_and_Lamb |
| 323 | Printing and related support activities | 967 | 80116 Swine |
| 324 | Petroleum and coal products manufacturing | 968 | 80117_Turkeys |
| 326 | Plastics and rubber products manufacturing | 969 | 80118 Goat |
| 327 | Non-metallic mineral product manufacturing | 970 | TOTPOUL |
| 331 | Primary Metal Manufacturing | 971 | 80119 Buffalo |
| 340 | Construction - Oil and Gas | 972 | 80120_Llama_and_Alpacas |
| 350 | Water | 973 | 80121 Deer |
| 412 | Petroleum product wholesaler-distributors | 974 | 80122 Elk |
| 448 | clothing and clothing accessories stores | 975 | 80123 Wild boars |
| 562 | Waste management and remediation services | 976 | 80124 Rabbit |
| 601 | SCL12003 Petroleum Liquids Transportation (PIRD) | 977 | 80125 Mink |
| 602 | SCL12007 Oil Sands In-Situ Extraction and Processing (PIRD) | 978 | 80126 Fox |
| 603 | SCL12010 Light Medium Crude Oil Production (PIRD) | 980 | TOTSWIN |
| 604 | SCL 12011 Well Drilling (PIRD) | 981 | Harvest Annual |
| 605 | SCL 12012 Well Servicing (PIRD) | 982 | Harvest Perennial |
| 606 | SCL 12013 Well Testing (PIRD) | 983 | Synthfert_Annual |
| 607 | SCL 12014 Natural Gas Production (PIRD) | 984 | Synthfert_Perennial |
| 608 | SCL 12015 Natural Gas Processing (PIRD) | 985 | Tillage_Annual |
| 609 | SCL 12016 Heavy Crude Oil Cold Production (PIRD) | 990 | TOTFERT |
| 610 | SCL:12018 Disposal and Waste Treatment (PIRD) | 996 | urban area |
| 611 | SCL:12019 Accidents and Equipment Failures (PIRD) | 1251 | OFFR TOTFERT |
| 612 | SCL:12020 Natural Gas Transmission and Storage (PIRD) | 1252 | OFFR MINES |
| 651 | MEIT C1C2 Anchored | 1253 | OFFR Other Construction not Urban |
| 652 | MEIT C1C2 Underway | 1254 | OFFR Commercial Services |
| 653 | MEIT C1C2 Berthed | 1255 | OFFR Oil Sands Mines |
| 661 | MEIT C3 Anchored | 1256 | OFFR Wood industries CANVEC |
| 662 | MEIT C3 Underway | 1257 | OFFR UNPAVED ROADS RURAL |
| 663 | MEIT C3 Berthed | 1258 | OFFR Utilities |
| 901 | AIRPORT | 1259 | OFFR total dwelling |
| 902 | Military LTO | 1260 | OFFR water |
| 903 | Commercial LTO | 1261 | OFFR ALL INDUST |
| 904 | General Aviation LTO | 1262 | OFFR Oil and Gas Extraction |
| 905 | Air Taxi LTO | 1263 | OFFR ALLROADS |
| 921 | Commercial Fuel Combustion | 1264 | OFFR AIRPORT |
| 923 | TOTAL INSTITUTIONAL AND GOVERNMENT | 1265 | OFFR_RAILWAY |
| 924 | Primary Industry | | |

Table 3-13. Mexican Spatial Surrogates

| Code | SURROGATE | WEIGHT SHAPEFILE | WEIGHT ATTRIBUTE |
|---|---|---|---|
| 10 | MEX Population | mex_population_2020 | gridcode_Y |
| 22 | MEX Total Road Miles | mex_roads | NONE |
| 24 | MEX Total Railroads Miles | mex_railroads | NONE |
| 26 | MEX Total Agriculture | mex_agriculture | NONE |
| 36 | MEX Commercial plus Industrial Land | mex_com_ind_land | NONE |
| 44 | MEX Airports Area | mex_airports_area | NONE |
| 45 | MEX Airports Point | mex_airports_point | NONE |
| 48 | MEX Brick Kilns | mex_brick_kilns | NONE |
| 50 | MEX Border Crossings | mex_border_crossings | SUM_Value |

3.4 Emissions References

Appel, K.W., Napelenok, S., Hogrefe, C., Pouliot, G., Foley, K.M., Roselle, S.J., Pleim, J.E., Bash, J., Pye, H.O.T., Heath, N., Murphy, B., Mathur, R., 2018. Overview and evaluation of the Community Multiscale Air Quality Model (CMAQ) modeling system version 5.2. In Mensink C., Kallos G. (eds), Air Pollution Modeling and its Application XXV. ITM 2016. Springer Proceedings in Complexity. Springer, Cham. Available at https://doi.org/10.1007/978-3-319-57645-9_11.

Bullock Jr., R., and K. A. Brehme (2002) "Atmospheric mercury simulation using the CMAQ model: formulation description and analysis of wet deposition results." Atmospheric Environment 36, pp 2135-2146. Available at https://doi.org/10.1016/S1352-2310(02)00220-0.

EPA, 2018. AERMOD Model Formulation and Evaluation Document. EPA-454/R-18-003. U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711. Available at https://www3.epa.gov/ttn/scram/models/aermod/aermod_mfed.pdf.

EPA, 2019. Final Report, SPECIATE Version 5.0, Database Development Documentation, Research Triangle Park, NC, EPA/600/R-19/988. Available with Addenda for versions 5.1, 5.2, and 5.3 at https://www.epa.gov/air-emissions-modeling/speciate-51-and-50-addendum-and-final-report.

EPA, 2022a.
Technical Support Document EPA's Air Toxics Screening Assessment - 2018 AirToxScreen TSD. EPA-452/B-22-002. Available at: https://www.epa.gov/AirToxScreen/airtoxscreen-technical-support-document.

EPA, 2022b. Technical Support Document: Preparation of Emissions Inventories for the 2019 North American Emissions Modeling Platform. EPA-454/B-24-011. Available at: https://www.epa.gov/air-emissions-modeling/2019-emissions-modeling-platform-technical-support-document.

EPA, 2023. 2020 National Emission Inventory Technical Support Document. EPA-454/R-23-001. U.S. Environmental Protection Agency, OAQPS, Research Triangle Park, NC 27711. Available at: https://www.epa.gov/air-emissions-inventories/2020-national-emissions-inventory-nei-technical-support-document-tsd.

EPA, 2024. Technical Support Document (TSD): Preparation of Emissions Inventories for the 2021 North American Emissions Modeling Platform. EPA-454/B-24-011. Available at https://www.epa.gov/air-emissions-modeling/2021-emissions-modeling-platform-technical-support-document.

Luecken, D., Yarwood, G., Hutzell, W.T., 2019. Multipollutant modeling of ozone, reactive nitrogen and HAPs across the continental US with CMAQ-CB6. Atmospheric Environment, 201, 62-72.

Sarwar, G., S. Roselle, R. Mathur, W. Appel, R. Dennis, "A Comparison of CMAQ HONO predictions with observations from the Northeast Oxidant and Particle Study", Atmospheric Environment 42 (2008) 5760-5770. Available at https://doi.org/10.1016/j.atmosenv.2007.12.065.

Seltzer, K. M., Pennington, E., Rao, V., Murphy, B. N., Strum, M., Isaacs, K. K., and Pye, H. O. T., 2021: "Reactive organic carbon emissions from volatile chemical products", Atmos. Chem. Phys. 21, 5079-5100, 2021. https://doi.org/10.5194/acp-21-5079-2021 and https://acp.copernicus.org/articles/21/5079/2021/.

Skamarock, W., J. Klemp, J. Dudhia, D. Gill, D. Barker, M. Duda, X. Huang, W. Wang, J. Powers, 2008. A Description of the Advanced Research WRF Version 3.
NCAR Technical Note. National Center for Atmospheric Research, Mesoscale and Microscale Meteorology Division, Boulder, CO. June 2008. Available at: http://www2.mmm.ucar.edu/wrf/users/docs/arw_v3_bw.pdf.

Wiedinmyer, C., Y. Kimura, E. C. McDonald-Buller, L. K. Emmons, R. R. Buchholz, W. Tang, K. Seto, M. B. Joseph, K. C. Barsanti, A. G. Carlton, and R. Yokelson, Volume 16, issue 13, GMD, 16, 3873-3891, 2023. https://gmd.copernicus.org/articles/16/3873/2023/.

Yarwood, G., R. Beardsley, Y. Shi, and B. Czader: Revision 5 of the Carbon Bond 6 Mechanism (CB6r5). Presented at the Annual CMAS Conference, Chapel Hill, NC, 2020.

4.0 CMAQ Air Quality Model Estimates

4.1 Introduction to the CMAQ Modeling Platform

The Clean Air Act (CAA) provides a mandate to assess and manage air pollution levels to protect human health and the environment. EPA has established National Ambient Air Quality Standards (NAAQS), requiring the development of effective emissions control strategies for such pollutants as ozone and particulate matter. Air quality models are used to develop these emission control strategies to achieve the objectives of the CAA.

Historically, air quality models have addressed individual pollutant issues separately. However, many of the same precursor chemicals are involved in both ozone and aerosol (particulate matter) chemistry; therefore, the chemical transformation pathways are interdependent. Thus, modeled abatement strategies of pollutant precursors, such as VOC and NOx to reduce ozone levels, may exacerbate other air pollutants such as particulate matter. To meet the need to address the complex relationships between pollutants, EPA developed the Community Multiscale Air Quality (CMAQ) modeling system.⁸ The primary goals for CMAQ are to:

• Improve the environmental management community's ability to evaluate the impact of air quality management practices for multiple pollutants at multiple scales.
• Improve the scientist's ability to better probe, understand, and simulate chemical and physical interactions in the atmosphere.

The CMAQ modeling system brings together key physical and chemical functions associated with the dispersion and transformations of air pollution at various scales. It was designed to approach air quality as a whole by including state-of-the-science capabilities for modeling multiple air quality issues, including tropospheric ozone, fine particles, toxics, acid deposition, and visibility degradation. CMAQ relies on emission estimates from various sources, including the U.S. EPA Office of Air Quality Planning and Standards' current emission inventories, observed emissions from major utility stacks, and model estimates of natural emissions from biogenic and agricultural sources. CMAQ also relies on meteorological predictions that include assimilation of meteorological observations as constraints. Emissions and meteorology data are fed into CMAQ and run through various algorithms that simulate the physical and chemical processes in the atmosphere to provide estimated concentrations of the pollutants. Traditionally, the model has been used to predict air quality across a regional or national domain and then to simulate the effects of various changes in emission levels for policymaking purposes. For health studies, the model can also be used to provide supplemental information about air quality in areas where no monitors exist.

8 Byun, D.W., and K. L. Schere, 2006: Review of the Governing Equations, Computational Algorithms, and Other Components of the Models-3 Community Multiscale Air Quality (CMAQ) Modeling System. Applied Mechanics Reviews, Volume 59, Number 2 (March 2006), pp. 51-77.

CMAQ was also designed to have multi-scale capabilities so that separate models were not needed for urban and regional scale air quality modeling.
The CMAQ simulation performed for this 2021 assessment used a single domain that covers the entire continental U.S. (CONUS) and large portions of Canada and Mexico using 12-km by 12-km horizontal grid spacing. Currently, 12-km x 12-km resolution is sufficient as the highest resolution for most regional-scale air quality model applications and assessments.⁹ With the temporal flexibility of the model, simulations can be performed to evaluate longer-term (annual to multi-year) pollutant climatologies as well as short-term (weeks to months) transport from localized sources. By making CMAQ a modeling system that addresses multiple pollutants and different temporal and spatial scales, CMAQ has a "one atmosphere" perspective that combines the efforts of the scientific community. Improvements will be made to the CMAQ modeling system as the scientific community further develops the state-of-the-science.

For more information on CMAQ, go to https://www.epa.gov/cmaq or http://www.cmascenter.org.

4.1.1 Advantages and Limitations of the CMAQ Air Quality Model

An advantage of using the CMAQ model output for characterizing air quality for use in comparing with health outcomes is that it provides complete spatial and temporal coverage across the U.S. CMAQ is a three-dimensional Eulerian photochemical air quality model that simulates the numerous physical and chemical processes involved in the formation, transport, and destruction of ozone, particulate matter, and air toxics for given input sets of initial and boundary conditions, meteorological conditions, and emissions. The CMAQ model includes state-of-the-science capabilities for conducting urban to regional scale simulations of multiple air quality issues, including tropospheric ozone, fine particles, toxics, acid deposition, and visibility degradation. However, CMAQ is resource intensive, requiring significant data inputs and computing resources.
Uncertainty in CMAQ output falls into two categories. Structural uncertainties concern how physical and chemical processes are represented in the model. These include the choice of chemical mechanism used to characterize reactions in the atmosphere, the choice of land surface model, and the choice of planetary boundary layer scheme. Parametric uncertainties concern the model inputs: hourly meteorological fields, hourly 3-D gridded emissions, initial conditions, and boundary conditions. Uncertainties due to initial conditions are minimized by using a 10-day ramp-up period from which model results are not used in the aggregation and analysis of model outputs. Evaluations of models against observed pollutant concentrations build confidence that the model performs with reasonable accuracy despite the uncertainties listed above. A detailed model evaluation for ozone and PM2.5 species provided in Section 4.3 shows generally acceptable model performance, which is equivalent to or better than typical state-of-the-science regional modeling simulations as summarized in Simon et al., 2012.¹⁰

9 U.S. EPA (2018), Modeling Guidance for Demonstrating Air Quality Goals for Ozone, PM2.5, and Regional Haze, pp 205. https://www3.epa.gov/ttn/scram/guidance/guide/O3-PM-RH-Modeling_Guidance-2018.pdf.

10 Simon, H., Baker, K.R., and Phillips, S. (2012) Compilation and interpretation of photochemical model performance statistics published between 2006 and 2012. Atmospheric Environment 61, 124-139.

4.2 CMAQ Model Version, Inputs and Configuration

This section describes the air quality modeling platform used for the 2021 CMAQ simulation. A modeling platform is a structured system of connected modeling-related tools and data that provide a consistent and transparent basis for assessing the air quality response to changes in emissions and/or meteorology.
A platform typically consists of a specific air quality model, emissions estimates, a set of meteorological inputs, and estimates of boundary conditions representing pollutant transport from source areas outside the region modeled. We used the CMAQ modeling system as part of the 2021 Platform to provide a national scale air quality modeling analysis. The CMAQ model simulates the multiple physical and chemical processes involved in the formation, transport, and destruction of ozone and PM2.5.

This section provides a description of each of the main components of the 2021 CMAQ simulation along with the results of a model performance evaluation in which the 2021 model predictions are compared to corresponding measured ambient concentrations.

4.2.1 CMAQ Model Version

CMAQ is a non-proprietary computer model that simulates the formation and fate of photochemical oxidants, including PM2.5 and ozone, for given input sets of meteorological conditions and emissions. As mentioned previously, CMAQ includes numerous science modules that simulate the emission, production, decay, deposition, and transport of organic and inorganic gas-phase and particle pollutants in the atmosphere. This 2021 analysis employed CMAQ version 5.4.¹¹ The 2021 CMAQ run included CB6r5 chemistry¹²,¹³, the AERO7 aerosol module¹⁴ with non-volatile Primary Organic Aerosol (POA), and updated halogen chemistry¹⁵. The CMAQ community model versions 5.2 and 5.3 were most recently peer-reviewed in May of 2019 for the U.S. EPA.¹⁶

11 CMAQ version 5.4: United States Environmental Protection Agency. (2022). CMAQ (Version 5.4) [Software]. Available from https://doi.org/10.5281/zenodo.7218076; https://www.epa.gov/cmaq. CMAQ v5.4 is also available from the Community Modeling and Analysis System (CMAS) at: http://www.cmascenter.org.

12 Luecken, D. J., Yarwood, G., and Hutzell, W.
T.: Multipollutant modeling of ozone, reactive nitrogen and HAPs across the continental US with CMAQ-CB6, Atmos Environ, 201, 62-72, 10.1016/j.atmosenv.2018.11.060, 2019.

13 Yarwood, G., Beardsley, R., Shi, Y., Czader, B.: Revision 5 of the Carbon Bond 6 Mechanism (CB6r5), CMAS 2020, October 27, 2020. https://www.cmascenter.org/conference/2020/slides/BeardsleyR_CMAS2020_CarbonBond6_Revision5_clean.pdf

14 Xu, L., Pye, H. O. T., He, J., Chen, Y. L., Murphy, B. N., and Ng, N. L.: Experimental and model estimates of the contributions from biogenic monoterpenes and sesquiterpenes to secondary organic aerosol in the southeastern United States, Atmos Chem Phys, 18, 12613-12637, 10.5194/acp-18-12613-2018, 2018.

15 Kang, D.; Willison, J.; Sarwar, G.; Madden, M.; Hogrefe, C.; Mathur, R.; Gantt, B.; and Saiz-Lopez, A.: Improving the Characterization of Natural Emissions in CMAQ, Environmental Manager, A&WMA, October 2021.

16 Barsanti, K.C., Pickering, K.E., Pour-Biazar, A., Saylor, R.D., Stroud, C.A., (June 19, 2019). Final Report: Sixth Peer Review of the Community Multiscale Air Quality (CMAQ) Modeling System, https://www.epa.gov/sites/default/files/2019-08/documents/sixth_cmaq_peer_review_comment_report_6.19.19.pdf. This peer review was focused on CMAQ v5.2, which was released in June of 2017, as well as CMAQ v5.3, which was released in August of 2019. It is available from the Community Modeling and Analysis System (CMAS) as well as previous peer-review reports at: http://www.cmascenter.org.

4.2.2 Model Domain and Grid Resolution

The CMAQ modeling analyses were performed for a domain covering the continental United States, as shown in Figure 4-1. This single domain covers the entire continental U.S. (CONUS) and large portions of Canada and Mexico using 12-km by 12-km horizontal grid spacing. The 2021 simulation used a Lambert Conformal map projection centered at (-97, 40) with true latitudes at 33 and 45 degrees north.
The 12-km CMAQ domain consisted of 459 by 299 grid cells and 35 vertical layers. Table 4-1 provides some basic geographic information regarding the 12-km CMAQ domain. The model extends vertically from the surface to 50 millibars (approximately 17,600 meters) using a sigma-pressure coordinate system. Table 4-2 shows the vertical layer structure used in the 2021 simulation. Air quality conditions at the outer boundary of the 12-km domain were taken from the GEOS-Chem global model (discussed in Section 4.2.4).

Table 4-1. Geographic Information for 2021 12-km Modeling Domain

| National 12 km CMAQ Modeling Configuration | |
|---|---|
| Map Projection | Lambert Conformal Projection |
| Grid Resolution | 12 km |
| Coordinate Center | 97 W, 40 N |
| True Latitudes | 33 and 45 N |
| Dimensions | 459 x 299 x 35 |
| Vertical Extent | 35 Layers: Surface to 50 mb level (see Table 4-2) |

Table 4-2. Vertical layer structure for 2021 CMAQ simulation (heights are layer top).

| Vertical Layers | Sigma P | Pressure (mb) | Approximate Height (m) |
|---|---|---|---|
| 35 | 0.0000 | 50.00 | 17,556 |
| 34 | 0.0500 | 97.50 | 14,780 |
| 33 | 0.1000 | 145.00 | 12,822 |
| 32 | 0.1500 | 192.50 | 11,282 |
| 31 | 0.2000 | 240.00 | 10,002 |
| 30 | 0.2500 | 287.50 | 8,901 |
| 29 | 0.3000 | 335.00 | 7,932 |
| 28 | 0.3500 | 382.50 | 7,064 |
| 27 | 0.4000 | 430.00 | 6,275 |
| 26 | 0.4500 | 477.50 | 5,553 |
| 25 | 0.5000 | 525.00 | 4,885 |
| 24 | 0.5500 | 572.50 | 4,264 |
| 23 | 0.6000 | 620.00 | 3,683 |
| 22 | 0.6500 | 667.50 | 3,136 |
| 21 | 0.7000 | 715.00 | 2,619 |
| 20 | 0.7400 | 753.00 | 2,226 |
| 19 | 0.7700 | 781.50 | 1,941 |
| 18 | 0.8000 | 810.00 | 1,665 |
| 17 | 0.8200 | 829.00 | 1,485 |
| 16 | 0.8400 | 848.00 | 1,308 |
| 15 | 0.8600 | 867.00 | 1,134 |
| 14 | 0.8800 | 886.00 | 964 |
| 13 | 0.9000 | 905.00 | 797 |
| 12 | 0.9100 | 914.50 | 714 |
| 11 | 0.9200 | 924.00 | 632 |
| 10 | 0.9300 | 933.50 | 551 |
| 9 | 0.9400 | 943.00 | 470 |
| 8 | 0.9500 | 952.50 | 390 |
| 7 | 0.9600 | 962.00 | 311 |
| 6 | 0.9700 | 971.50 | 232 |
| 5 | 0.9800 | 981.00 | 154 |
| 4 | 0.9850 | 985.75 | 115 |
| 3 | 0.9900 | 990.50 | 77 |
| 2 | 0.9950 | 995.25 | 38 |
| 1 | 0.9975 | 997.63 | 19 |
| 0 | 1.0000 | 1000.00 | 0 |

Figure 4-1. Map of the 2021 CMAQ Modeling Domain. The blue box denotes the 12-km national modeling domain.
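The pressure column of Table 4-2 appears to follow a linear mapping from sigma to pressure between the 50 mb model top and a 1000 mb surface. The following is a minimal sketch of that relationship, not the model's own code: the 1000 mb reference surface pressure is an assumption inferred from the layer-0 row of the table, and the function and constant names are hypothetical.

```python
# Sketch: reproduce the "Pressure (mb)" column of Table 4-2 from "Sigma P".
# Assumption: pressure varies linearly in sigma between the model top and
# a 1000 mb reference surface (inferred from the table, not stated in the text).

P_TOP_MB = 50.0     # model top, from Table 4-1
P_SURF_MB = 1000.0  # reference surface pressure implied by layer 0 (assumption)


def sigma_to_pressure_mb(sigma, p_top=P_TOP_MB, p_surf=P_SURF_MB):
    """Return the pressure (mb) at a sigma level (0 at model top, 1 at surface)."""
    return p_top + sigma * (p_surf - p_top)


# A few (sigma, pressure) pairs copied from Table 4-2 for comparison.
TABLE_4_2_SAMPLE = [(0.0, 50.0), (0.05, 97.5), (0.74, 753.0), (0.985, 985.75), (1.0, 1000.0)]

for sigma, p_table in TABLE_4_2_SAMPLE:
    p_calc = sigma_to_pressure_mb(sigma)
    print(f"sigma={sigma:.4f}  computed={p_calc:8.2f} mb  table={p_table:8.2f} mb")

# Horizontal extent implied by 459 x 299 cells at 12-km spacing.
NX, NY, DX_KM = 459, 299, 12
print(f"domain extent: {NX * DX_KM} km x {NY * DX_KM} km")
```

The approximate heights in the table depend on the temperature profile used to convert pressure to altitude, which the text does not specify, so they are not reproduced here.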
4.2.3 Modeling Period / Ozone Episodes

The 12-km CMAQ modeling domain was modeled for the entire year of 2021. The annual simulation included a spin-up period, comprised of 10 days before the beginning of the simulation, to mitigate the effects of initial concentrations. All 365 model days were used in the annual average levels of PM2.5. For the 8-hour ozone, we used modeling results from the period between May 1 and September 30. This 153-day period generally conforms to the ozone season across most parts of the U.S. and contains the majority of days that observed high ozone concentrations.

4.2.4 Model Inputs: Emissions, Meteorology, and Boundary Conditions

2021 Emissions: The emissions inventories used in the 2021 air quality modeling are described in Section 3, above.

2021 Meteorological Input Data: The gridded meteorological data for the entire year of 2021 at the 12-km continental United States scale domain was derived from the publicly available version 4.1.1 of the Weather Research and Forecasting Model (WRF), Advanced Research WRF (ARW) core.¹⁷ The WRF Model is a state-of-the-science mesoscale numerical weather prediction system developed for both operational forecasting and atmospheric research applications (http://wrf-model.org). The 12US WRF model was initialized using the 12-km North American Model (12NAM)¹⁸ analysis product provided by the National Climatic Data Center (NCDC). Where 12NAM data was unavailable, the 40-km Eta Data Assimilation System (EDAS) analysis (ds609.2) from the National Center for Atmospheric Research (NCAR) was used. Analysis nudging for temperature, wind, and moisture was applied above the boundary layer only. The model simulations were conducted continuously. The 'ipxwrf' program was used to initialize deep soil moisture at the start of the run using a 10-day spin-up period.
The 2021 WRF simulation was based on the 2011 National Land Cover Database (NLCD).¹⁹ The WRF simulation included the physics options of the Pleim-Xiu land surface model (LSM), Asymmetric Convective Model version 2 planetary boundary layer (PBL) scheme, Morrison double moment microphysics, Kain-Fritsch cumulus parameterization scheme utilizing the moisture-advection trigger²⁰ and the RRTMG long-wave and shortwave radiation (LWR/SWR) scheme.²¹ In addition, the Group for High Resolution Sea Surface Temperatures (GHRSST)²²,²³ 1-km SST data was used for SST information to provide more resolved information compared to the coarser data in the NAM analysis. Additionally, the hybrid-vertical coordinate system was employed, where the model is terrain-following (Eta) near the surface and isobaric aloft, reducing the influence of surface features on upper-level dynamics.

2021 Initial and Boundary Conditions: The 2021 annual lateral boundary and initial species concentrations were provided using a global 3-D GEOS-Chem v14.0.1 simulation. GEOS-Chem is a 3-D model of atmospheric chemistry driven by meteorological inputs from the Goddard Earth Observing System of the National Aeronautics and Space Administration (NASA) Global Modeling Assimilation Office. GEOS-Chem was run using the standard (or default) options and full atmospheric chemistry.²⁴ The GEOS-Chem simulation was performed at 2 x 2.5-degree horizontal resolution with a 72-layer vertical structure (36 layers in troposphere, hybrid terrain following coordinate). The simulation used full chemistry including an online stratosphere, non-local planetary boundary layer, and simple secondary organic aerosols. The 2021 simulation required extending the methane inputs to the year 2021, updating lightning inputs, and other parameters for 2021. Emissions included the online Model of Emissions of Gases and Aerosols from Nature (MEGAN) version 2.1²⁵, online DUST module, and online sea salt module. Global Fire Emissions Database (GFED)²⁶ emissions were monthly means. Anthropogenic emissions included fugitive, combustion, and industrial dust (Philip et al. 2017).²⁷ Marine emissions were based on Community Emissions Data System (CEDS) version 2 including shipping vessels.²⁸ Aircraft emissions used Aircraft Emissions Inventory Code (AEIC)²⁹ monthly aircraft input data. The 2021 GEOS-Chem run was spun up from the previous 2020 CEDS, and AEIC was scaled by the COvid-19 adjustmeNt Factors fOR eMissions (CONFORM) dataset.³⁰ Meteorology used in this 2021 GEOS-Chem run was from Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2)³¹ meteorology at 2 x 2.5-degree. With the exception of input updates for 2021, these were the default options and inputs distributed with v14.0.1.

17 Skamarock, W.C., Klemp, J.B., Dudhia, J., Gill, D.O., Barker, D.M., Duda, M.G., Huang, X., Wang, W., Powers, J.G., 2008. A Description of the Advanced Research WRF Version 3.

18 North American Model Analysis-Only, http://nomads.ncdc.noaa.gov/data.php; download from ftp://nomads.ncdc.noaa.gov/NAM/analysis_only/.

19 National Land Cover Database 2011, http://www.mrlc.gov/nlcd2011.php.

20 Ma, L-M. and Tan, Z-M, 2009. Improving the behavior of the Cumulus Parameterization for Tropical Cyclone Prediction: Convection Trigger. Atmospheric Research 92 Issue 2, 190-211. http://www.sciencedirect.com/science/article/pii/S0169809508002585.

21 Gilliam, R.C., Pleim, J.E., 2010. Performance Assessment of New Land Surface and Planetary Boundary Layer Physics in the WRF-ARW. Journal of Applied Meteorology and Climatology 49, 760-774.

22 Stammer, D., F.J. Wentz, and C.L. Gentemann, 2003, Validation of Microwave Sea Surface Temperature Measurements for Climate Purposes, J. Climate, 16, 73-87.

23 Global High-Resolution SST (GHRSST) analysis, https://www.ghrsst.org/.

24 GEOS-Chem, https://geoschem.github.io/index.html
4.3 CMAQ Model Performance Evaluation

An operational model performance evaluation for ozone and PM2.5 and its related speciated components was conducted for the 2021 simulation using state/local monitoring sites data in order to estimate the ability of the CMAQ modeling system to replicate the 2021 base year concentrations for the 12-km continental U.S. domain.

There are various statistical metrics available and used by the science community for model performance evaluation. For a robust evaluation, the principal evaluation statistics used to evaluate CMAQ performance were two bias metrics, mean bias and normalized mean bias; and two error metrics, mean error and normalized mean error.

25 Guenther, A.B., Jiang, X., Heald, C.L., Sakulyanontvittaya, T., Duhl, T., Emmons, L.K., and Wang, X. The Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): an extended and updated framework for modeling biogenic emissions, 2012, GMD, Volume 5, Issue 6, 1471-1492.

26 https://www.globalfiredata.org/

27 Philip, S., Martin, R.V., Snider, G., Weagle, C.L., van Donkelaar, A., Brauer, M., Henze, D.K., Klimont, Z., Venkataraman, C., Guttikunda, S.K., and Zhang, Q., April 2017. "Anthropogenic fugitive, combustion and industrial dust is a significant, underrepresented fine particulate matter source in global atmospheric models." Environmental Research Letters; Bristol, Vol. 12, Iss. 4. doi:10.1088/1748-9326/aa65a4.

28 A Community Emissions Data System (CEDS) for Historical emissions, https://www.pnnl.gov/projects/ceds

29 Simone, N.W., Stettler, M.E.J., Barrett, S.R.H., 2013. Rapid estimation of global civil aviation emissions with uncertainty quantification, Transportation Research Part D: Transport and Environment, Volume 25, 33-41, ISSN 1361-9209, https://doi.org/10.1016/j.trd.2013.07.001.
30 Doumbia, T., Granier, C., Elguindi, N., Bouarar, I., Darras, S., Brasseur, G., Gaubert, B., Liu, Y., Shi, X., Stavrakou, T., Tilmes, S., Lacey, F., Deroubaix, A., and Wang, T., 2021: Changes in global air pollutant emissions during the COVID-19 pandemic: a dataset for atmospheric modeling, Earth Syst. Sci. Data, 13, 4191-4206, https://doi.org/10.5194/essd-13-4191-2021.

31 Global Modeling and Assimilation Office (GMAO). inst3_3d_asm_Cp: MERRA-2 IAU State Meteorology Instantaneous 3-hourly (p-coord, 0.625x0.5L42), version 5.12.4, Greenbelt, MD, USA: Goddard Space Flight Center (GSFC DAAC), 2015. doi: 10.5067/VJAFPLI1CSIV.

Mean bias (MB) is the sum of the differences (predicted - observed) divided by the total number of replicates (n). Mean bias is defined as:

$$\mathrm{MB} = \frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)$$

where P = predicted and O = observed concentrations.

Mean error (ME) is the sum of the absolute values of the differences (predicted - observed) divided by the total number of replicates (n). Mean error is defined as:

$$\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - O_i\right|$$

Normalized mean bias (NMB) is used as a normalization to facilitate a range of concentration magnitudes. This statistic divides the sum of the differences (model - observed) by the sum of observed values. NMB is a useful model performance indicator because it avoids overinflating the observed range of values, especially at low concentrations. Normalized mean bias is defined as:

$$\mathrm{NMB} = \frac{\sum_{i=1}^{n}\left(P_i - O_i\right)}{\sum_{i=1}^{n} O_i} \times 100$$

where P = predicted concentrations and O = observed concentrations.

Normalized mean error (NME) is similar to NMB, where the performance statistic is used as a normalization of the mean error. NME divides the sum of the absolute values of the differences (model - observed) by the sum of observed values. Normalized mean error is defined as:

$$\mathrm{NME} = \frac{\sum_{i=1}^{n}\left|P_i - O_i\right|}{\sum_{i=1}^{n} O_i} \times 100$$

The performance statistics were calculated using predicted and observed data that were paired in time and space on an 8-hour basis.
Statistics were generated for each of the nine National Oceanic and Atmospheric Administration (NOAA) climate regions³² of the 12-km U.S. modeling domain (Figure 4-2). The regions include the Northeast, Ohio Valley, Upper Midwest, Southeast, South, Southwest, Northern Rockies, Northwest, and West³³,³⁴ as were originally identified in Karl and Koss (1984).³⁵

32 NOAA, National Centers for Environmental Information scientists have identified nine climatically consistent regions within the contiguous U.S., http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php.

33 The nine climate regions are defined by States where: Northeast includes CT, DE, ME, MA, MD, NH, NJ, NY, PA, RI, and VT; Ohio Valley includes IL, IN, KY, MO, OH, TN, and WV; Upper Midwest includes IA, MI, MN, and WI; Southeast includes AL, FL, GA, NC, SC, and VA; South includes AR, KS, LA, MS, OK, and TX; Southwest includes AZ, CO, NM, and UT; Northern Rockies includes MT, NE, ND, SD, WY; Northwest includes ID, OR, and WA; and West includes CA and NV.

34 Note most monitoring sites in the West region are located in California (see Figure 4-2), therefore statistics for the West will be mostly representative of California ozone air quality.

35 Karl, T. R. and Koss, W. J., 1984: "Regional and National Monthly, Seasonal, and Annual Temperature Weighted by Area, 1895-1983." Historical Climatology Series 4-3, National Climatic Data Center, Asheville, NC, 38 pp.

Figure 4-2. NOAA Nine Climate Regions (source: http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php#references).

In addition to the performance statistics, regional maps which show the MB, ME, NMB, and NME were prepared for the ozone season, May through September, at individual monitoring sites as well as on an annual basis for PM2.5 and its component species.
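The four evaluation statistics defined above (MB, ME, NMB, NME) can be illustrated with a short calculation. This is a minimal sketch rather than the evaluation software used for this report: the function name and the sample concentrations are hypothetical, and the inputs are assumed to be already paired in time and space.

```python
# Sketch of the four model performance statistics: MB, ME, NMB, NME.
# The sample values below are made up for illustration only.

def performance_stats(predicted, observed):
    """Return MB and ME (input units) and NMB and NME (percent) for paired data."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("predicted and observed must be non-empty and the same length")
    n = len(predicted)
    diffs = [p - o for p, o in zip(predicted, observed)]
    sum_diff = sum(diffs)                      # sum of (P - O)
    sum_abs_diff = sum(abs(d) for d in diffs)  # sum of |P - O|
    sum_obs = sum(observed)                    # sum of O
    return {
        "MB": sum_diff / n,
        "ME": sum_abs_diff / n,
        "NMB": 100.0 * sum_diff / sum_obs,
        "NME": 100.0 * sum_abs_diff / sum_obs,
    }


# Hypothetical paired 8-hour daily maximum ozone values (ppb).
predicted_ppb = [50.0, 62.0, 41.0, 70.0]
observed_ppb = [48.0, 65.0, 40.0, 60.0]

stats = performance_stats(predicted_ppb, observed_ppb)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Because over and under predictions cancel in MB and NMB but not in ME and NME, the error metrics are always at least as large in magnitude as the corresponding bias metrics.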
Evaluation for 8-hour Daily Maximum Ozone: The operational model performance evaluation for 8-hour daily maximum ozone was conducted using the statistics defined above. Ozone measurements in the continental U.S. were included in the evaluation and were taken from the 2021 state/local monitoring site data in AQS and the Clean Air Status and Trends Network (CASTNet). The 8-hour ozone model performance bias and error statistics for each of the nine NOAA climate regions and each season are provided in Table 4-4. Seasons were defined as: winter (December-January-February), spring (March-April-May), summer (June-July-August), and fall (September-October-November). In some instances, observational data were excluded from the analysis and model evaluation based on a completeness criterion of 75 percent. Spatial plots of the MB, ME, NMB, and NME for individual monitors are shown in Figures 4-3 through 4-6, respectively. The statistics shown in these figures were calculated over the ozone season, April through September, using data pairs on days with observed 8-hour ozone of greater than or equal to 60 ppb.

In general, the model performance statistics indicate that the 8-hour daily maximum ozone concentrations predicted by the 2021 CMAQ simulation closely reflect the corresponding 8-hour observed ozone concentrations in space and time in each subregion of the 12-km modeling domain. As indicated by the statistics in Table 4-4, bias and error for 8-hour daily maximum ozone are relatively low in each subregion, not only in the summer when concentrations are highest, but also during other times of the year. Generally, 8-hour ozone at the AQS and CASTNet sites in the summer is over predicted in all climate regions (NMB ranging from 1.2 to 21.2 percent) except in the Southwest, Northwest, West, Northern Rockies, and Upper Midwest at CASTNet sites only where there is a slight under prediction.
Likewise, 8-hour ozone at the AQS and CASTNet sites in the fall is typically over predicted across the contiguous U.S. (NMB ranging from 1.0 to 16.5 percent) except in the Southwest and West at CASTNet sites only. In the winter, 8-hour ozone is over predicted in all climate regions at AQS and CASTNet sites (NMB ranging from 0.2 to 20.5 percent) except in the Southwest at CASTNet sites. However, in the spring, 8-hour ozone concentrations are under predicted at CASTNet sites in all NOAA climate regions except the South, and at AQS sites in the Northeast, Southwest, Northern Rockies, Northwest, and West (with NMB magnitudes less than approximately 10 percent in each subregion); the remaining region and network combinations show a slight over prediction (NMB ranging between 0.5 and 6.2 percent).

Model bias at individual sites during the ozone season is similar to that seen on a subregional basis for the summer. Figure 4-3 shows the mean bias for 8-hour daily maximum ozone greater than 60 ppb is generally within ± 15 ppb across the AQS and CASTNet sites. Likewise, the information in Figure 4-5 indicates that the normalized mean bias for days with observed 8-hour daily maximum ozone greater than 60 ppb is within ± 20 percent at the vast majority of monitoring sites across the U.S. domain. Model error, as seen from Figures 4-4 and 4-6, is generally 2 to 16 ppb and 30 percent or less at most of the sites across the U.S. modeling domain. Somewhat greater error is evident at sites in several areas, most notably in central California, the Northern Rockies, the Upper Midwest, and the Southeast.

Table 4-4. Summary of CMAQ 2021 8-Hour Daily Maximum Ozone Model Performance Statistics by NOAA Climate Region, by Season and Monitoring Network.

  Season    No. of Obs  MB (ppb)  ME (ppb)  NMB (%)  NME (%)

Northeast, AQS
  Winter        10,552       5.0       3.6     12.1     15.4
  Spring        16,053      -0.0       4.4     -0.1     10.2
  Summer        16,608       5.0       7.5     12.1     18.0
  Fall          12,728       5.5       6.5     16.5     19.4
Northeast, CASTNet
  Winter         1,225      2.85       4.1      8.1     12.0
  Spring         1,264      -1.4       4.4     -3.1      9.7
  Summer         1,265       3.4       6.5      8.4     16.1
  Fall           1,252       4.5       5.9     13.6     17.6
Ohio Valley, AQS
  Winter         5,773       6.3       7.1     20.5     23.2
  Spring        20,787       1.1       4.3      2.5     10.1
  Summer        20,461       4.9       7.4     11.2     16.9
  Fall          15,400       5.6       6.7     15.3     18.3
Ohio Valley, CASTNet
  Winter         1,586       4.9       6.3     15.0     19.1
  Spring         1,615      -1.0       4.5     -2.2      9.8
  Summer         1,616       3.9       6.6      9.5     16.0
  Fall           1,606       3.9       5.6     10.8     15.5
Upper Midwest, AQS
  Winter         1,794       5.5       5.9     16.6     17.8
  Spring         8,332       0.7       4.6      1.7     11.0
  Summer         8,789       0.5       6.4      1.2     14.2
  Fall           6,051       4.0       5.7     11.5     16.3
Upper Midwest, CASTNet
  Winter           443       4.6       4.9     13.9     14.7
  Spring           456      -0.9       4.6     -2.1      9.9
  Summer           439      -0.5       5.8     -1.3     13.7
  Fall             444       3.5       4.2     10.9     15.2
Southeast, AQS
  Winter         7,092       3.8       5.4     11.0     15.7
  Spring        15,348       0.3       4.4      0.7      9.7
  Summer        14,822       7.2       8.1     21.2     23.9
  Fall          12,018       6.3       7.1     17.3     19.5
Southeast, CASTNet
  Winter           983       2.3       4.9      6.5     13.8
  Spring         1,023      -1.7       4.4     -3.7      9.4
  Summer         1,067       5.5       7.2     15.6     20.2
  Fall           1,077       3.9       5.8     10.6     15.5
South, AQS
  Winter        10,192       4.7       6.6     14.8     20.6
  Spring        12,797       2.6       5.5      6.2     13.2
  Summer        12,338       7.1       9.0     18.3     23.3
  Fall          11,840       3.5       6.3      8.9     15.7
South, CASTNet
  Winter           507       4.2       6.3     12.4     18.5
  Spring           531       0.2       4.2      0.5      9.6
  Summer           538       4.8       7.8     12.4     20.1
  Fall             531       3.1       5.9      7.9     15.0
Southwest, AQS
  Winter        10,325       1.1       4.6      2.9     11.9
  Spring        11,348      -2.1       5.3     -4.2     10.3
  Summer        11,235      -5.8       7.7     -9.9     13.3
  Fall          11,018       0.4       4.8      1.0     10.6
Southwest, CASTNet
  Winter           991      -0.2       3.2     -0.5      7.5
  Spring         1,063      -2.9       4.7     -5.5      9.0
  Summer         1,061      -4.6       6.6     -8.2     11.8
  Fall           1,070      -0.0       3.7     -0.0      8.0
Northern Rockies, AQS
  Winter         4,177       4.0       5.0     11.0     14.0
  Spring         4,539      -0.5       4.6     -1.2     10.5
  Summer         4,481      -5.5       8.1    -10.5     15.5
  Fall           4,173       1.3       4.3      3.4     11.2
Northern Rockies, CASTNet
  Winter           775       2.4       4.0      6.2     10.2
  Spring           795      -1.8       3.4     -4.0      9.1
  Summer           803      -6.6       7.9    -12.4     15.0
  Fall             790       0.9       4.2      2.3     10.3
Northwest, AQS
  Winter           745       2.8       4.7      8.6     14.5
  Spring         1,522      -3.1       5.1     -7.3     12.2
  Summer         2,384      -3.2       6.6     -7.5     15.7
  Fall           1,290       0.9       5.4      2.4     14.9
Northwest, CASTNet
  Winter           256       3.5       4.5      9.8     12.5
  Spring           271      -2.7       4.3     -6.1      9.7
  Summer           273      -7.5       8.1    -14.7     16.0
  Fall             268       1.9       5.5      5.1     14.9
West, AQS
  Winter        14,139       2.4       5.0      7.2     14.6
  Spring        16,287      -1.3       4.7     -2.8     10.2
  Summer        16,179      -3.5       7.9     -6.7     15.2
  Fall          15,267       0.5       5.5      1.1     12.8
West, CASTNet
  Winter           592       0.1       3.4      0.2      8.5
  Spring           619      -5.2       5.8    -10.1     11.3
  Summer           623     -10.8      11.2    -17.7     18.3
  Fall             591      -2.4       5.0     -5.1     10.6

Figure 4-3. Mean Bias (ppb) of 8-hour daily maximum ozone greater than 60 ppb over the period April-September 2021 at AQS and CASTNet monitoring sites in the continental U.S. modeling domain.

Figure 4-4. Mean Error (ppb) of 8-hour daily maximum ozone greater than 60 ppb over the period April-September 2021 at AQS and CASTNet monitoring sites in the continental U.S. modeling domain.

Figure 4-5. Normalized Mean Bias (%) of 8-hour daily maximum ozone greater than 60 ppb over the period April-September 2021 at AQS and CASTNet monitoring sites in the continental U.S. modeling domain.

Figure 4-6. Normalized Mean Error (%) of 8-hour daily maximum ozone greater than 60 ppb over the period April-September 2021 at AQS and CASTNet monitoring sites in the continental U.S. modeling domain.

Evaluation for Annual PM2.5 Components: The PM evaluation focuses on PM2.5 components including sulfate (SO4), nitrate (NO3), total nitrate (TNO3 = NO3 + HNO3), ammonium (NH4), elemental carbon (EC), and organic carbon (OC). The bias and error performance statistics were calculated on an annual basis for each of the nine NOAA climate subregions defined above (provided in Table 4-5). PM2.5 measurements for 2021 were obtained from the following networks for model evaluation: Chemical Speciation Network (CSN, 24-hour average), Interagency Monitoring of Protected Visual Environments (IMPROVE, 24-hour average), and Clean Air Status and Trends Network (CASTNet, weekly average). For PM2.5 species that are measured by more than one network, we calculated separate sets of statistics for each network by subregion. In addition to the tabular summaries of bias and error statistics, annual spatial maps which show the mean bias, mean error, normalized mean bias, and normalized mean error by site for each PM2.5 species are provided in Figures 4-7 through 4-30.

As indicated by the statistics in Table 4-5, annual average sulfate is consistently under predicted at CASTNet, IMPROVE, and CSN monitoring sites across the 12-km modeling domain (with MB values ranging from -0.0 to -0.6 µg/m3) except at IMPROVE and CSN sites in the Northwest (over prediction of 0.1 and 0.2 µg/m3, respectively). Sulfate performance shows moderate error in the eastern subregions (average of approximately 30-50 percent) while western subregions show slightly larger error (ranging from 30 to 80 percent). Figures 4-7 through 4-10 suggest spatial patterns vary by region. Sulfate is under predicted in most of the Northeast, Southeast, Ohio Valley, and Southwest states, with model bias within ± 40 percent. The model bias appears to be greater in the Northwest, with over predictions of up to approximately 60-80 percent at individual monitors. Model error also shows a spatial trend by region: error in much of the eastern states is 30 to 50 percent, while in the western and central U.S. states it is 40 to 100 percent.

Annual average nitrate is under predicted at the rural IMPROVE monitoring sites in all NOAA climate subregions (NMB averaging -40 percent), except in the Northeast, Ohio Valley, Southeast, and Northwest where nitrate is over predicted (between 30 and 93 percent). At CSN urban sites, annual average nitrate is over predicted in all subregions, except in the Southwest (-40.3 percent), Northern Rockies (-27.1 percent), and West (-50.0 percent) where nitrate is under predicted. Likewise, model performance of total nitrate at sub-urban CASTNet monitoring sites shows an under prediction in all subregions (NMB in the range of -4.2 to -47.8 percent), except in the Northeast (24.7 percent), Ohio Valley (6.4 percent), Southeast (2.0 percent), and South (27.2 percent). Model error for nitrate and total nitrate is somewhat greater for each of the nine NOAA climate subregions as compared to sulfate. Model bias at individual sites indicates over prediction of greater than 10 percent at monitoring sites along the upper Northeast and Northwest coastlines as well as in the South and Southeast, as indicated in Figure 4-13. The exception to this is in the Southwest, Northern Rockies, and Western U.S. portions of the modeling domain, where there appears to be a greater number of sites with under prediction of nitrate of 10 to 80 percent.

As indicated in Table 4-5, the model tends to under predict annual average ammonium across CASTNet sites (NMB ranging from -13.2 to -75.6 percent).
Ammonium performance across the urban CSN sites shows an over prediction in all NOAA climate subregions (ranging from 6.7 to >100 percent), except for under predictions in the Southwest (-51.9 percent), Northern Rockies (-7.6 percent), and West (-53.4 percent). The spatial variation of ammonium across the majority of individual monitoring sites in the Eastern U.S. shows bias within ± 50 percent (Figures 4-19 and 4-21). A larger bias is seen in the Northeast and in the Northern Rockies (over prediction bias on average 80 to 100 percent). The urban monitoring sites exhibit slightly larger errors than rural sites for ammonium.

Annual average elemental carbon is under predicted in all of the nine climate regions at urban and rural sites (biases between -11.1 and -55.9 percent) except at Northwest sites (over prediction ranging between 0.5 and 33.0 percent). There is not a large variation in error statistics from subregion to subregion or at urban versus rural sites.

Like elemental carbon, annual average organic carbon is under predicted in all of the nine climate regions at urban and rural sites (biases between -4.0 and -69.2 percent) except at urban Northwest CSN sites (over prediction of 64.2 percent). Similarly, model error does not show a large variation from subregion to subregion or at urban versus rural sites.

Table 4-5. Summary of CMAQ 2021 Annual PM Species Model Performance Statistics by NOAA Climate Region, by Monitoring Network.

  Subregion         No. of Obs  MB (µg/m3)  ME (µg/m3)  NMB (%)  NME (%)

Sulfate, CSN
  Northeast              3,069        -0.3         0.4    -34.6     41.5
  Ohio Valley            2,261        -0.3         0.4    -27.7     39.4
  Upper Midwest          1,062        -0.2         0.3    -22.4     36.9
  Southeast              1,740        -0.2         0.3    -21.9     40.8
  South                  1,066        -0.3         0.5    -28.5     43.6
  Southwest              1,116        -0.2         0.3    -48.6     53.3
  Northern Rockies         548        -0.1         0.2    -24.2     42.6
  Northwest                724         0.2         0.3     52.5     75.5
  West                   1,853        -0.2         0.4    -30.0     50.6
Sulfate, IMPROVE
  Northeast              1,959        -0.2         0.2    -38.9     44.9
  Ohio Valley              922        -0.4         0.4    -40.9     44.6
  Upper Midwest            923        -0.2         0.2    -31.8     42.4
  Southeast              1,583        -0.3         0.4    -36.3     46.0
  South                  1,080        -0.3         0.5    -37.9     46.4
  Southwest              3,775        -0.2         0.2    -48.8     54.1
  Northern Rockies       2,121        -0.1         0.1    -28.5     47.2
  Northwest              1,905         0.1         0.2     39.0     89.4
  West                   2,352        -0.1         0.3    -30.8     66.7
Sulfate, CASTNet
  Northeast                883        -0.4         0.4    -48.4     48.5
  Ohio Valley              894        -0.5         0.5    -47.3     47.6
  Upper Midwest            250        -0.3         0.3    -42.8     43.6
  Southeast                631        -0.5         0.5    -53.8     54.1
  South                    381        -0.6         0.6    -50.9     51.2
  Southwest                444        -0.2         0.2    -58.9     58.9
  Northern Rockies         517        -0.2         0.2    -46.7     47.9
  Northwest                 99        -0.0         0.1    -25.9     37.1
  West                     280        -0.3         0.4    -60.2     65.5
Nitrate, CSN
  Northeast              3,068         0.3         0.6     31.3     67.8
  Ohio Valley            2,260         0.7         0.6     14.2     49.9
  Upper Midwest          1,061         0.1         0.6      7.5     41.1
  Southeast              1,739         0.3         0.5     95.2     >100
  South                  1,064         0.0         0.4      2.8     69.3
  Southwest              1,116        -0.3         0.6    -40.3     69.0
  Northern Rockies         545        -0.2         0.4    -27.1     49.5
  Northwest                724         0.6         0.9     >100     >100
  West                   1,853        -1.1         1.4    -50.0     63.0
Nitrate, IMPROVE
  Northeast              1,958         0.3         0.3     93.1     >100
  Ohio Valley              922         0.1         0.4     29.4     78.2
  Upper Midwest            920        -0.0         0.3     -2.2     50.2
  Southeast              1,582         0.1         0.3     62.7     >100
  South                  1,080        -0.0         0.3     -5.3     71.7
  Southwest              3,774        -0.1         0.1    -62.7     84.9
  Northern Rockies       2,121        -0.0         0.1    -24.2     69.9
  Northwest              1,890         0.0         0.2     29.7     >100
  West                   2,350        -0.2         0.3    -46.3     70.1
Total Nitrate (NO3 + HNO3), CASTNet
  Northeast                883         0.2         0.4     24.7     40.2
  Ohio Valley              894         0.1         0.4      6.4     26.9
  Upper Midwest            250        -0.0         0.3     -4.2     24.9
  Southeast                631         0.0         0.5      2.0     53.1
  South                    381        -0.1         0.3     27.2    -12.3
  Southwest                444        -0.1         0.2    -26.1     38.4
  Northern Rockies         517        -0.1         0.2    -24.3     34.2
  Northwest                 99        -0.0         0.1     -5.3     28.6
  West                     280        -0.6         0.6    -47.8     51.8
Ammonium, CSN
  Northeast              3,068         0.0         0.2     21.5     68.5
  Ohio Valley            2,261         0.0         0.2     17.9     57.1
  Upper Midwest          1,062         0.0         0.2     14.3     48.4
  Southeast              1,738         0.0         0.2     32.1     90.0
  South                  1,065         0.0         0.2      6.7     62.7
  Southwest              1,114        -0.1         0.2    -51.9     75.6
  Northern Rockies         548        -0.0         0.1     -7.6     53.6
  Northwest                721         0.1         0.2     >100     >100
  West                   1,850        -0.3         0.5    -53.4     70.0
Ammonium, CASTNet
  Northeast                883        -0.0         0.1    -13.2     45.9
  Ohio Valley              894        -0.1         0.2    -21.3     38.8
  Upper Midwest            250        -0.0         0.1    -22.5     38.3
  Southeast                631        -0.0         0.1    -26.4     56.8
  South                    381        -0.1         0.1    -28.7     43.7
  Southwest                444        -0.1         0.1    -65.1     67.1
  Northern Rockies         517        -0.1         0.1    -54.3     60.3
  Northwest                 99        -0.0         0.1    -58.8     69.0
  West                     280        -0.1         0.2    -75.6     80.9
Elemental Carbon, CSN
  Northeast              3,032        -0.1         0.3    -26.9     47.6
  Ohio Valley            2,238        -0.2         0.3    -42.6     50.0
  Upper Midwest          1,163        -0.1         0.2    -27.7     48.5
  Southeast              1,617        -0.4         0.4    -49.5     54.5
  South                  1,072        -0.1         0.1    -46.0     58.5
  Southwest              1,119        -0.2         0.3    -30.7     52.5
  Northern Rockies         528        -0.2         0.2    -49.0     59.7
  Northwest                730         0.2         0.4     33.0     71.9
  West                   1,235        -0.3         0.4    -35.2     47.6
Elemental Carbon, IMPROVE
  Northeast              1,804        -0.0         0.1    -11.1     50.3
  Ohio Valley              922        -0.1         0.1    -47.5     51.9
  Upper Midwest          1,029        -0.1         0.1    -40.1     58.3
  Southeast              1,673        -0.1         0.1    -45.9     51.2
  South                  1,021        -0.2         0.3    -44.5     50.3
  Southwest              3,688        -0.1         0.1    -55.9     61.9
  Northern Rockies       2,172        -0.0         0.1    -13.9     64.9
  Northwest              1,811         0.0         0.1      0.5     76.9
  West                   2,225        -0.0         0.1    -27.6     61.4
Organic Carbon, CSN
  Northeast              3,032        -0.0         1.1     -4.0     54.8
  Ohio Valley            2,237        -0.6         0.9    -30.6     42.7
  Upper Midwest          1,163        -0.6         1.0    -31.5     49.9
  Southeast              1,615        -0.3         1.0    -13.4     41.5
  South                  1,021        -0.8         1.1    -37.0     50.7
  Southwest              1,119        -0.7         1.2    -36.2     58.4
  Northern Rockies         528        -1.3         1.4    -65.8     70.9
  Northwest                730         1.2         2.0     64.2     >100
  West                   1,235        -0.9         1.4    -31.9     48.5
Organic Carbon, IMPROVE
  Northeast              1,819        -0.2         0.5    -19.5     53.0
  Ohio Valley              923        -0.4         0.5    -34.1     43.5
  Upper Midwest          1,045        -0.5         0.7    -43.6     60.8
  Southeast              1,693        -0.3         0.6    -23.2     49.6
  South                  1,080        -0.5         0.6    -45.1     56.3
  Southwest              3,749        -0.6         0.7    -69.2     72.4
  Northern Rockies       2,224        -0.9         1.1    -62.4     74.9
  Northwest              1,877        -0.3         1.2    -24.2     87.6
  West                   2,286        -0.9         1.3    -47.5     70.3

Figure 4-7. Mean Bias (µg/m3) of annual sulfate at monitoring sites in the continental U.S. modeling domain.

Figure 4-8. Mean Error (µg/m3) of annual sulfate at monitoring sites in the continental U.S. modeling domain.

Figure 4-9. Normalized Mean Bias (%) of annual sulfate at monitoring sites in the continental U.S. modeling domain.
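The bias and error statistics tabulated in Table 4-5 and mapped in these figures can be illustrated with a minimal Python sketch. It assumes the conventional definitions of MB, ME, NMB, and NME used in air quality model evaluation (the report defines its statistics in an earlier section that is not reproduced here); the function name `performance_stats` and the sample values are hypothetical.

```python
from typing import Sequence


def performance_stats(model: Sequence[float], obs: Sequence[float]) -> dict:
    """Return MB, ME, NMB (%), and NME (%) for paired model/observed values.

    Assumed (conventional) definitions, with P = model and O = observation:
      MB  = mean(P - O)                 ME  = mean(|P - O|)
      NMB = 100 * sum(P - O) / sum(O)   NME = 100 * sum(|P - O|) / sum(O)
    """
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and the same length")
    diffs = [p - o for p, o in zip(model, obs)]
    abs_diffs = [abs(d) for d in diffs]
    n = len(diffs)
    total_obs = sum(obs)
    return {
        "MB": sum(diffs) / n,
        "ME": sum(abs_diffs) / n,
        "NMB": 100.0 * sum(diffs) / total_obs,
        "NME": 100.0 * sum(abs_diffs) / total_obs,
    }


# Hypothetical paired values (ug/m3), for illustration only.
stats = performance_stats(model=[1.0, 2.0, 3.0], obs=[2.0, 2.0, 2.0])
print(stats)
```

In this toy case the over and under predictions cancel, so MB and NMB are zero while ME and NME are not. Under these definitions ME can never be smaller than the magnitude of MB, which is a useful consistency check when reading tabulated values.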
Figure 4-10. Normalized Mean Error (%) of annual sulfate at monitoring sites in the continental U.S. modeling domain.

Figure 4-11. Mean Bias (µg/m3) of annual nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-12. Mean Error (µg/m3) of annual nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-13. Normalized Mean Bias (%) of annual nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-14. Normalized Mean Error (%) of annual nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-15. Mean Bias (µg/m3) of annual total nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-16. Mean Error (µg/m3) of annual total nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-17. Normalized Mean Bias (%) of annual total nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-18. Normalized Mean Error (%) of annual total nitrate at monitoring sites in the continental U.S. modeling domain.

Figure 4-19. Mean Bias (µg/m3) of annual ammonium at monitoring sites in the continental U.S. modeling domain.

Figure 4-20. Mean Error (µg/m3) of annual ammonium at monitoring sites in the continental U.S. modeling domain.

Figure 4-21. Normalized Mean Bias (%) of annual ammonium at monitoring sites in the continental U.S. modeling domain.

Figure 4-22. Normalized Mean Error (%) of annual ammonium at monitoring sites in the continental U.S. modeling domain.

Figure 4-23. Mean Bias (µg/m3) of annual elemental carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-24. Mean Error (µg/m3) of annual elemental carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-25. Normalized Mean Bias (%) of annual elemental carbon at monitoring sites in the continental U.S. modeling domain.
Figure 4-26. Normalized Mean Error (%) of annual elemental carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-27. Mean Bias (µg/m3) of annual organic carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-28. Mean Error (µg/m3) of annual organic carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-29. Normalized Mean Bias (%) of annual organic carbon at monitoring sites in the continental U.S. modeling domain.

Figure 4-30. Normalized Mean Error (%) of annual organic carbon at monitoring sites in the continental U.S. modeling domain.

5.0 Bayesian space-time downscaling fusion model (downscaler) - Derived Air Quality Estimates

5.1 Introduction

The need for greater spatial coverage of air pollution concentration estimates has grown in recent years as epidemiology and exposure studies that link air pollution concentrations to health effects have become more robust and as regulatory needs have increased. Direct measurement of concentrations is the ideal way of generating such data, but prohibitive logistics and costs limit the possible spatial coverage and temporal resolution of such a database. Numerical methods that extend the spatial coverage of existing air pollution networks with a high degree of confidence are thus a topic of current investigation by researchers.
The downscaler model (DS) is the result of the latest research efforts by EPA for performing such predictions. DS utilizes both monitoring and CMAQ data as inputs and attempts to take advantage of the measurement data's accuracy and CMAQ's spatial coverage to produce new spatial predictions. This chapter describes methods and results of the DS application that accompany this report, which utilized ozone and PM2.5 data from AQS and CMAQ to produce predictions at continental U.S. 2020 census tract centroids for 2021.

5.2 Downscaler Model

DS develops a relationship between observed and modeled concentrations, and then uses that relationship to spatially predict what measurements would be at new locations in the spatial domain based on the input data. This process is separately applied for each time step (daily in this work) of data, and for each of the pollutants under study (ozone and PM2.5). In its most general form, the model can be expressed in an equation similar to that of linear regression:

$Y(s) = \tilde{\beta}_0(s) + \beta_1 \, x(s) + \epsilon(s)$    (Equation 1)

Where:
• $Y(s)$ is the observed concentration at point $s$. Note that $Y(s)$ could be expressed as $Y_t(s)$, where $t$ indicates the model being fit at time $t$ (in this case, $t = 1, \ldots, 365$ would represent day of the year).
• $x(s)$ is the point-level regressor based on the CMAQ concentration at point $s$. This value is a weighted average of both the gridcell containing the monitor and neighboring gridcells.
• $\tilde{\beta}_0(s)$ is the intercept, where $\tilde{\beta}_0(s) = \beta_0 + \beta_0(s)$ is composed of both a global component $\beta_0$ and a local component $\beta_0(s)$ that is modeled as a mean-zero Gaussian Process with exponential decay.
• $\beta_1$ is the global slope; local components of the slope are contained in the $x(s)$ term.
• $\epsilon(s)$ is the model error.

DS has additional properties that differentiate it from linear regression:

1. Rather than just finding a single optimal solution to Equation 1, DS uses a Bayesian approach so that uncertainties can be generated along with each concentration prediction. This involves drawing random samples of model parameters from built-in "prior" distributions and assessing their fit on the data on the order of thousands of times. After each iteration, properties of the prior distributions are adjusted to try to improve the fit of the next iteration. The resulting collection of $\beta_0$ and $\beta_1$ values at each space-time point are the "posterior" distributions, and the means and standard deviations of these are used to predict concentrations and associated uncertainties at new spatial points.

2. The model is "hierarchical" in structure, meaning that the top-level parameters in Equation 1 (i.e., $\tilde{\beta}_0(s)$, $\beta_1$, $x(s)$) are actually defined in terms of further parameters and sub-parameters in the DS code. For example, the overall slope and intercept are each defined to be the sum of a global (one value for the entire spatial domain) and local (values specific to each spatial point) component. This gives more flexibility in fitting a model to the data to optimize the fit (i.e., minimize $\epsilon(s)$).

Further information about the development and inner workings of the current version of DS can be found in Berrocal, Gelfand, and Holland (2012) [36] and references therein. The DS outputs that accompany this report are described below, along with some additional analyses that include assessing the accuracy of the DS predictions. Results are then summarized, and caveats are provided for interpreting them in the context of air quality management activities.

5.3 Downscaler Concentration Predictions

In this application, DS was used to predict daily concentration and associated uncertainty values at the 2020 U.S. census tract centroids across the continental U.S. using measurement and CMAQ data as inputs.
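The core idea of Equation 1, a regression of observations on CMAQ values whose Bayesian fit yields a prediction together with an uncertainty, can be sketched in a heavily simplified form. The Python below is not the DS code: it assumes a constant intercept and slope, independent Gaussian priors, and a known error variance, and it omits the spatially varying intercept, the weighted-average CMAQ regressor, and the iterative sampling described in Section 5.2. All names and numbers (`fit_posterior`, `predict`, `noise_var`, `prior_var`, the sample data) are hypothetical.

```python
import math


def fit_posterior(x, y, noise_var=1.0, prior_var=100.0):
    """Posterior mean and covariance of (b0, b1) for y = b0 + b1*x + e.

    Assumes e ~ N(0, noise_var) and independent N(0, prior_var) priors.
    """
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    # Posterior precision [[a, b], [b, d]] = X'X / noise_var + I / prior_var.
    a = n / noise_var + 1.0 / prior_var
    b = sx / noise_var
    d = sxx / noise_var + 1.0 / prior_var
    det = a * d - b * b
    cov = [[d / det, -b / det], [-b / det, a / det]]  # inverse of the 2x2 precision
    rhs = (sy / noise_var, sxy / noise_var)           # X'y / noise_var
    mean = (cov[0][0] * rhs[0] + cov[0][1] * rhs[1],
            cov[1][0] * rhs[0] + cov[1][1] * rhs[1])
    return mean, cov


def predict(x_new, mean, cov, noise_var=1.0):
    """Predictive mean and standard deviation at one new regressor value."""
    pred_mean = mean[0] + mean[1] * x_new
    # Parameter uncertainty (1, x)' cov (1, x) plus residual error variance.
    param_var = cov[0][0] + 2.0 * x_new * cov[0][1] + x_new * x_new * cov[1][1]
    return pred_mean, math.sqrt(param_var + noise_var)


# Hypothetical CMAQ values (x) paired with monitor observations (y), in ppb.
cmaq = [30.0, 40.0, 50.0, 60.0]
observed = [5.0 + 0.9 * v for v in cmaq]  # noise-free toy data
mean, cov = fit_posterior(cmaq, observed)
pred, pred_sd = predict(45.0, mean, cov)
print(mean, pred, pred_sd)
```

The predictive standard deviation combines uncertainty in the fitted parameters with the residual error, which is the same kind of quantity DS reports as a standard error alongside each prediction.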
For ozone, the concentration unit is the daily maximum 8-hour average in ppb and for PM2.5 the concentration unit is the 24-hour average in µg/m3.

5.3.1 Summary of 8-hour Ozone Results

Figure 5-1 summarizes the AQS, CMAQ, and DS ozone data over the year 2021. It shows the 4th max daily maximum 8-hour average ozone for AQS observations, CMAQ model predictions, and DS model results. The DS model estimated that for 2021, about 42% of the U.S. Census tracts (35384 out of 83776) experienced at least one day with an ozone value above the NAAQS of 70 ppb.

[36] Berrocal, V., Gelfand, A., and D. Holland. Space-Time Data Fusion Under Error in Computer Model Output: An Application to Modeling Air Quality. Biometrics. 2012. September; 68(3): 837-848. doi:10.1111/j.1541-0420.2011.01725.

Figure 5-1: Annual 4th max (daily max 8-hour ozone concentrations) derived from AQS, CMAQ, and DS data.

5.3.2 Summary of PM2.5 Results

Figures 5-2 and 5-3 summarize the AQS, CMAQ, and DS PM2.5 data over the year 2021. Figure 5-2 shows annual means and Figure 5-3 shows 98th percentiles of 24-hour PM2.5 concentrations for AQS observations, CMAQ model predictions, and DS model results. The DS model estimated that for 2021 about 43% of the U.S. Census tracts (35753 out of 83776) experienced at least one day with a PM2.5 value above the 24-hour NAAQS of 35 µg/m3.

Figure 5-2: Annual mean PM2.5 concentrations derived from AQS, CMAQ, and DS data.

Figure 5-3: 98th percentile 24-hour average PM2.5 concentrations derived from AQS, CMAQ, and DS data.

5.4 Downscaler Uncertainties

5.4.1 Standard Errors

As mentioned above, the DS model works by drawing random samples from built-in distributions during its parameter estimation. The standard errors associated with each of these populations provide a measure of uncertainty associated with each concentration prediction. Figures 5-4 and 5-5 show the percent errors resulting from dividing the DS standard errors by the associated DS prediction. The black dots on the maps show the location of EPA sampling network monitors whose data were input to DS via the AQS datasets (Chapter 2). The maps show that, in general, errors are relatively smaller in regions with more densely situated monitors (i.e., the eastern U.S.), and larger in regions with more sparse monitoring networks (i.e., western states). These standard errors could potentially be used to estimate the probability of an exceedance for a given point estimate of a pollutant concentration.

Figure 5-4: Annual mean relative errors (standard errors divided by predictions) from the DS 2021 runs for ozone. The black dots show the locations of monitors that generated the AQS data used as input to the DS model.

Figure 5-5: Annual mean relative errors (standard errors divided by predictions) from the DS 2021 runs for PM2.5. The black dots show the locations of monitors that generated the AQS data used as input to the DS model.
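The closing remark of Section 5.4.1, that the standard errors could be used to estimate the probability of an exceedance, can be illustrated with a short Python sketch. It assumes the predictive distribution at a point is normal with mean equal to the DS prediction and standard deviation equal to the DS standard error; that distributional assumption, the function name `exceedance_probability`, and the example numbers are illustrative and are not taken from the DS outputs.

```python
import math


def exceedance_probability(prediction, std_error, threshold):
    """P(concentration > threshold), assuming a normal predictive distribution."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = (threshold - prediction) / std_error
    # Upper-tail probability of the standard normal via the complementary
    # error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical DS ozone prediction of 68 ppb with a 5 ppb standard error,
# compared against the 70 ppb ozone NAAQS level cited in this chapter.
p = exceedance_probability(prediction=68.0, std_error=5.0, threshold=70.0)

# Percent error as mapped in Figures 5-4 and 5-5: standard error / prediction.
relative_error_pct = 100.0 * 5.0 / 68.0
print(p, relative_error_pct)
```

A prediction just below the standard can still carry a substantial exceedance probability when its standard error is large, which is why the relative error maps matter when interpreting point estimates near a threshold.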
5.4.2 Cross Validation

To check the quality of its spatial predictions, DS can be set to perform "cross-validation" (CV), which involves leaving a subset of AQS data out of the model run and predicting the concentrations of those left-out points. The predicted values are then compared to the actual left-out values to generate statistics that provide an indicator of the predictive ability. In the DS runs associated with this report, 10% of the data was chosen randomly by the DS model to be used for the CV process. The resulting CV statistics are shown below in Table 5-1.

Table 5-1: Cross-validation statistics associated with the 2021 DS runs.

  Pollutant  Monitor Count  Mean Bias   RMSE  Mean Coverage
  PM2.5                967      0.121  3.578          0.952
  O3                  1237      0.020  4.329          0.960

The statistics indicated by the columns of Table 5-1 are as follows:

• Mean Bias: The bias of each prediction is the DS prediction minus the AQS value. This column is the mean of all biases across the CV cases.
• Root Mean Squared Error (RMSE): The bias is squared for each CV prediction, then the square root of the mean of all squared biases across all CV predictions is obtained.
• Mean Coverage: A value of 1 is assigned if the measured AQS value lies in the 95% confidence interval of the DS prediction (the DS prediction ± the DS standard error), and 0 otherwise. This column is the mean of all those 0's and 1's.

5.5 Summary and Conclusions

The results presented in this report are from an application of the DS fusion model for characterizing national air quality for ozone and PM2.5. DS provided spatial predictions of daily ozone and PM2.5 at 2020 U.S. census tract centroids by utilizing monitoring data and CMAQ output for 2021. Large-scale spatial and temporal patterns of concentration predictions are generally consistent with those seen in ambient monitoring data.
Both ozone and PM2.5 were predicted with lower error in the eastern versus the western U.S., presumably due to the greater monitoring density in the east.

An additional caution that warrants mentioning is related to the capability of DS to provide predictions at multiple spatial points within a single CMAQ grid cell. Care needs to be taken not to over-interpret any within-grid cell gradients that might be produced by a user. Fine-scale emission sources in CMAQ are diluted into the grid cell averages, but a given source within a grid cell might or might not affect every spatial point contained therein equally. Therefore DS-generated fine-scale gradients are not expected to represent actual fine-scale atmospheric concentration gradients, unless possibly where multiple monitors are present in the grid cell.

Appendix A - Acronyms

ARW                 Advanced Research WRF core model
BEIS                Biogenic Emissions Inventory System
BlueSky             Emissions modeling framework
BSP                 BlueSky Pipeline modeling system
CAIR                Clean Air Interstate Rule
CAMD                EPA's Clean Air Markets Division
CAP                 Criteria Air Pollutant
CAR                 Conditional Auto Regressive spatial covariance structure (model)
CARB                California Air Resources Board
CEM                 Continuous Emissions Monitoring
CHIEF               Clearinghouse for Inventories and Emissions Factors
CMAQ                Community Multiscale Air Quality model
CMV                 Commercial marine vessel
CO                  Carbon monoxide
CSN                 Chemical Speciation Network
DQO                 Data Quality Objectives
EGU                 Electric Generating Units
Emission Inventory  Listing of elements contributing to atmospheric release of pollutant substances
EPA                 Environmental Protection Agency
EMFAC               Emission Factor (California's onroad mobile model)
FAA                 Federal Aviation Administration
FDDA                Four-Dimensional Data Assimilation
FIPS                Federal Information Processing Standards
HAP                 Hazardous Air Pollutant
HC                  Hydrocarbon
HMS                 Hazard Mapping System
ICS-209             Incident Status Summary form
IPM                 Integrated Planning Model
ITN                 Itinerant
LSM                 Land Surface Model
MOBILE              OTAQ's model for estimation of onroad mobile emissions factors
MODIS               Moderate Resolution Imaging Spectroradiometer
MOVES               Motor Vehicle Emission Simulator
NEEDS               National Electric Energy Database System
NEI                 National Emission Inventory
NERL                National Exposure Research Laboratory
NESHAP              National Emission Standards for Hazardous Air Pollutants
NH3                 Ammonia
NMIM                National Mobile Inventory Model
NONROAD             OTAQ's model for estimation of nonroad mobile emissions
NOx                 Nitrogen oxides
OAQPS               EPA's Office of Air Quality Planning and Standards
OAR                 EPA's Office of Air and Radiation
ORD                 EPA's Office of Research and Development
ORIS                Office of Regulatory Information Systems (code) - a 4 or 5 digit number assigned by the Department of Energy's (DOE) Energy Information Agency (EIA) to facilities that generate electricity
ORL                 One Record per Line
OTAQ                EPA's Office of Transportation and Air Quality
PAH                 Polycyclic Aromatic Hydrocarbon
PFC                 Portable Fuel Container
PM2.5               Particulate matter less than or equal to 2.5 microns
PM10                Particulate matter less than or equal to 10 microns
PMc                 Particulate matter greater than 2.5 microns and less than 10 microns
Prescribed Fire     Intentionally set fire to clear vegetation
RIA                 Regulatory Impact Analysis
RPO                 Regional Planning Organization
RRTM                Rapid Radiative Transfer Model
SCC                 Source Classification Code
SMARTFIRE           Satellite Mapping Automatic Reanalysis Tool for Fire Incident Reconciliation
SMOKE               Sparse Matrix Operator Kernel Emissions
TSD                 Technical support document
VOC                 Volatile organic compounds
VMT                 Vehicle miles traveled
Wildfire            Uncontrolled forest fire
WRAP                Western Regional Air Partnership
WRF                 Weather Research and Forecasting Model

Appendix B - Emissions Totals by Sector

Please see the independent spreadsheet Appendix_B_2021_emissions_totals_by_sector.xlsx that provides inventory and speciation emissions totals for each emissions modeling sector.

United States Environmental Protection Agency
Office of Air Quality Planning and Standards, Air Quality Assessment Division, Research Triangle Park, NC
Publication No. EPA-454/R-24-002, October 2024