Investigation of Current State, Local, and Tribal Facility Inventory Data Sources and Data Flows

1 Overview

The goal of this report is to explore how facility data workflows would have to operate between CAERS and State, Local, and Tribal authorities (SLTs), so that CAERS may be developed to accommodate such SLT workflows with as much flexibility as possible. Previously, four use cases of SLT workflows had been identified at a conceptual level, depending on whether an SLT would prefer to keep some or all of its current SLT interface and backend. These cases are:

Case 1: The SLT interface and backend are retained (CAERS receives data from the SLT system) to distribute to the federal programs.
Case 2: The SLT interface and backend are retained (CAERS receives data from facilities and pushes data to state interfaces).
Case 3: CAERS replaces the SLT interface, but state databases are retained.
Case 4: The SLT uses CAERS for reporting instead of its previous reporting system or method.

The current SLTs using CAERS (GA, Washington D.C., Pima County AZ, and Rhode Island) are all considered to fall under Cases 3 or 4. However, details about how facility data are updated and maintained by SLTs require research because, regardless of the case an SLT falls under, SLTs may choose not to use CAERS as the primary repository of their facility inventory data. Furthermore, SLTs may require additional data fields for their facility inventory that would need to be accommodated in CAERS to enable SLTs to take full advantage of combined air emissions reporting.

Overall, while SLTs may focus on the uses of their facility inventory to meet their own needs (permitting compliance, billing, emissions inventory analysis), the ability to share the same facility inventory with the federal programs via CAERS would keep emissions inventory data reporting at the facility and sub-facility levels consistent across programs.
This means that each program would have a similar version of a facility in terms of facility and sub-facility data, instead of the slightly different versions of the same facility that exist currently. Also, reporters would not have to report emissions data associated with different versions of the same facility and sub-facility components to each program. Facility and sub-facility components shared among SLT and federal programs would ultimately give data analysts, inventory developers, and modelers the ability to conduct multipollutant analyses across several types of pollutants: hazardous air pollutants (HAPs)/toxics, criteria air pollutants (CAPs), and greenhouse gases (GHGs).

To ensure that CAERS can accommodate specific facility data workflows, so that more SLTs may use CAERS, the Facility Research & Development (R&D) Team conducted a research project to find out what data sources and data flows SLTs currently use for obtaining and updating facility inventory information for their respective emission inventories. The results will help inform how CAERS would need to be enhanced to send facility inventory data to and receive facility inventory data from SLTs. Enhancements in CAERS could include workflows in both directions (the ability to send data to and receive data from a given SLT's system), additional SLT-specific data fields that might need to be included in CAERS, and potential data transformations, which could be done by the SLT before sending data to CAERS or as part of an SLT CAERS module. An SLT CAERS module refers to a set of code that would be added in CAERS, or as part of an SLT's workflow with CAERS, to meet SLT-specific requirements (that other SLTs do not need) without requiring a rebuild of CAERS.

2 Method

The R&D team gathered input about information needed to understand SLT facility inventory data workflows from team members and the PDT.
State members in the R&D team designed and finalized a questionnaire based on that input, and in April 2022 ECOS sent the survey to SLT contacts who report to the National Emissions Inventory (NEI). The questionnaire referenced the new Emissions Inventory System (EIS) Consolidated Emissions Reporting Schema (CERS) v2.0 to ensure that respondents were familiar with the data fields and definitions mentioned in the survey. CERS is used by EIS to provide a standardized structure for delegated authorities to submit emissions inventory data to the United States Environmental Protection Agency (EPA) to meet the Air Emissions Reporting Rule (AERR) requirements. The questionnaire used simplified data components (Figure 1) to cover all information specified in CERS for facility inventory that is currently being collected.

Figure 1. Simplified Representation of CERS Facility and Sub-facility Inventory Components Used in the Questionnaire

The online questionnaire used a multiple-choice format (see Appendix A). It was sent to SLT point source contacts via e-mail on 4/5/2022. Two follow-up reminders were sent on 4/18/2022 and 4/27/2022. ECOS received 55 jurisdictional responses out of the 78 jurisdictions contacted; one incomplete response had to be excluded, leaving 54 valid responses (a 69% response rate). Table 1 summarizes the number of SLT responses to the questionnaire.

Table 1. Number of Questionnaire Responses by Respondent Type

Respondent Type | Local | State | Tribal | Incomplete | Total | Valid
Individuals Contacted | 19 | 76 | 2 | | 107 | 107
Jurisdictions Contacted | 23 | 53 | 2 | | 78 | 78
Individuals Responded | 16 | 41 | 1 | 1 | 59 | 58
Jurisdictions Responded | 16 | 37 | 1 | 1 | 55 | 54

To clarify the responses further, SLT team members sent follow-up emails to 18 SLTs. Data clean-up was also performed to make responses consistent.
For example, one responder did not select "State permitting program" when answering the question about information sources for facility sites. The responder instead chose "other" and entered "Permittees submit annual throughput info to the Permitting Section, which calculates the emissions and charges the Permittees an annual emission fee. The Emissions Inventory (EI) group uses the calculated emissions from the Permitting calculation sheets." This implies the jurisdiction obtains facility site information from a permitting program. Therefore, "State permitting program" was recorded as the actual response. In another example, a responder entered "Inspections" under the "other" choice, which indicates that "State compliance program" was the appropriate choice, so that choice was recorded as the response. For 4 states, multiple people responded to the questionnaire with inconsistent answers. These responders were contacted, and the correct answers were confirmed and consolidated into one complete answer for each jurisdiction.

3 Analysis of Questionnaire Responses

3.1 Type of Facilities in Emission Inventories

All 54 valid responses indicated the inclusion of facilities with Title V permits in their emission inventory, while 42 also included facilities with other types of permits (such as minor or synthetic minor permits). Of the responses, 14 jurisdictions indicated they have non-permitted facilities in their point source emission inventory.
Those non-permitted facilities include greenhouse gas (GHG) emission facilities that are not subject to reporting criteria air pollutants (CAPs) under EPA's Air Emissions Reporting Rule (AERR) and/or SLT requirements; facilities subject to other state requirements; Toxics Release Inventory (TRI) facilities; and small facilities with locational coordinates to support reporting for EPA nonpoint source emission inventory requirements, such as landfills and publicly owned treatment works (POTWs). Table 2 and Figure 2 show a further breakdown of the types of facilities handled for emission inventories by different jurisdictions. This indicates that CAERS could be used by SLTs not only for facilities that are subject to federal reporting requirements but also for facilities subject to non-federal reporting requirements in use cases 2-4 of SLT workflows. Otherwise, SLTs would need one reporting system for facilities that also report to one or more federal programs and a separate reporting system for those that do not, which may or may not be preferred by SLTs.

Table 2. Types of Facilities by Jurisdiction

Jurisdiction | Title V Permit | Other Permit | Non-Permit
Local | 16 | 13 | 2
State | 37 | 28 | 11
Tribal | 1 | 1 | 1
Total | 54 | 42 | 14

Figure 2. Number of SLTs with Facilities in Point Source Inventories by Permit Type
(Bar chart by permit type: Title V Permit, Other Permit, Non-Permit; series: Local, State, Tribe.)

3.2 Number of Facilities in Emission Inventories

The total number of unique facilities within a given SLT point source emission inventory ranges from tens of thousands (for example, 30,000 in Colorado and 28,000 in Wyoming) to fewer than 10 (5 in Knox County, Tennessee). From the data collected, the median number of facilities in SLT point source inventories is 271. SLTs may not report to the NEI all the facilities for which they collect emissions data. The number of facilities an SLT reports to the NEI ranges from as many as 28,000 in Wyoming to as few as 5 in Knox County, TN. The median number of facilities reported to the NEI by SLTs is 149.
This information speaks to the need for CAERS to have the capacity to hold large facility inventories for SLTs that may need them and to ensure that system speed and performance are not degraded by large facility inventories.

3.3 Data in Tribal Emission Inventories

The U.S. government officially recognizes 574 Indian tribes in the contiguous 48 states and Alaska. Although each tribe is different, most tribes do not have their own emission inventory and do not report to the NEI, as many are not required to report. Tribes with Treatment as a State (TAS) for emissions inventories are subject to AERR reporting, and tribes can voluntarily accept this responsibility in coordination with states and EPA. Eight tribes reported to the 2017 NEI. Table 3 provides details.

Table 3. Number of Facilities Reported by Tribes for the 2017 NEI

Tribal Name | EIS Code | Program System Code | Number of Facilities
Coeur d'Alene Tribe of the Coeur d'Alene Reservation, Idaho | 88181 | TR181 | 2
Confederated Tribes and Bands of the Yakama Nation, Washington | 88124 | TR124 | 28
Nez Perce Tribe of Idaho | 88182 | TR182 | 1
Northern Cheyenne Tribe of the Northern Cheyenne Indian Reservation, Montana | 88207 | TR207 | 1
Salt River Pima-Maricopa Indian Community of the Salt River Reservation, Arizona | 88615 | TR615 | 16
Shoshone-Bannock Tribes of the Fort Hall Reservation of Idaho | 88180 | TR180 | 1
Southern Ute Indian Tribe | 88750 | TR750 | 293
Ute Mountain Tribe of the Ute Mountain Reservation, Colorado, New Mexico & Utah | 88751 | TR751 | 1
Grand Total | | | 343

The Southern Ute Indian Tribe reported the most point source data to the 2017 NEI, with 293 facilities. This tribe responded to the questionnaire, and the results of this report reflect that response. CAERS is expected to be able to facilitate emission inventory development by tribes by providing them a way to collect point source data because, in most cases, tribal emission inventories are similar to State and Local (SL) emission inventories.
If CAERS handling of facility inventory data is enhanced, tribes might find the use of CAERS more advantageous and could be encouraged to report more than they do now.

3.4 Facility Inventory Information in Emission Inventories

Twenty-eight SLTs do not collect emission inventory (EI) information beyond what is identified in CERS v2.0, while 26 SLTs collect EI information beyond that found in CERS v2.0. A variety of additional information is collected, depending on specific SLT needs. Examples are:

Idaho collects extra geographic data for the facility and release points, design capacity for all emission units (versus the subset required to do so for NEI purposes), some extra control fields, and a Federal Employer ID to support emission fee billing.
Massachusetts collects information on potential-to-emit and permitted limits for throughput and emissions, and detailed information on emission units (especially emergency generators), monitors, and controls.
Minnesota collects information on permit type, source type, comment fields for each data component, and NAICS codes for emission units, whereas the federal program requires only facility-level NAICS. Also, CERS requires one field for the percent reduction achieved for a pollutant when all control measures are operating as designed for a site control measure, whereas Minnesota collects two fields for a pollutant controlled by a site control measure: control efficiency and capture efficiency.
Wyoming collects permit limits at the unit-process level, U.S. Well Numbers for wellhead facilities, and an Oil & Gas Facility Category for facilities in the oil and gas sector.
Illinois collects additional information that pertains to air emissions rules specific to the state.
Colorado collects data related to permit processing and analysis.
Wisconsin collects the number of employees, the area from fence line to fence line, whether an Environmental Management System (EMS) exists and whether the EMS is reviewed by a third party, and whether the facility is a small business (fewer than 100 employees and annual receipts of not more than $750,000).

Several respondents also indicated that the information they collect for a facility inventory depends on the type of facility. For example, North Carolina has facility data for both permitted and permit-exempt facilities, but only permitted facilities report emissions. Some SLs, such as Colorado; Maine; the City of Louisville, Kentucky; Washington; and Wyoming, also indicated that their EIs cover more pollutants than the NEI. However, pollutant emissions are not included in the facility inventory data, so the analysis of pollutant differences is excluded from this report.

CAERS workflows with SLTs would have to be designed with these SLT-specific data fields in mind. This means CAERS would have to accommodate additional codes for existing fields, additional data fields that SLTs require, and QA checks, for example to enforce reporting of fields that an SLT requires where the EPA does not require them.

3.5 Techniques to Obtain and Update Facility Inventory Data in EIs

The techniques used to obtain and update facility data within EIs vary by jurisdiction. Within any one jurisdiction, they may also depend on facility characteristics, such as whether a facility is new or existing, whether it is permitted, and its permit type. One jurisdiction may use more than one technique to obtain facility data for emission inventories. For example:

Oklahoma maintains an Air Quality facility database (called TEAM) used for its permitting, compliance and enforcement, and emissions inventory programs. Oklahoma collects facility emissions data using a version of SLEIS from Windsor Solutions.
SLEIS and TEAM are not dynamically connected, but data do flow in both directions between the systems in an ad hoc way. Data in TEAM relevant to EIs are only facility-level data, not sub-facility-level data. Facility reporters are responsible for reporting all data for emission units, release points, processes, etc., that is, data considered to be facility data only within the context of EI reporting. Oklahoma EI staff, within operational bounds, also make updates to data during the QC process.

In contrast, EI staff at SLTs that have a small number of facilities may create and update facility inventory data only manually, for example: Hawaii; Lane County, Oregon; and Chattanooga City, TN.

Thirty (56%) of the SLT EI programs obtain facility inventory data "with data flows", which means they do so directly from other within-SLT program systems and databases (e.g., permitting, compliance, state master database), using system integration with other within-SLT systems, linking with other SLT databases, sharing tables, and/or snapshot synchronization from other SLT programs and/or databases. Table 4 shows the number of SLTs that use each technique; some SLTs may use more than one "with data flows" technique.

Table 4. Number of SLTs Using Techniques to Obtain and Update Facility Inventory Data in EIs

Technique to Obtain and Update Facility Inventory Data in EIs* | Number of SLTs
EI system is part of an integrated/within state database used by other programs (e.g., permitting, compliance, state master DB) | 19
EI system is a unique application/database linked to other state database(s) | 12
EI system shares certain tables with state database(s) for other programs | 8
Snapshot sync** with state database for other programs | 3
Manually created and updated by EI staff | 32
Reported by facility operator | 34

*Note: Georgia relies on CAERS for facility inventory information, which was not listed as a technique.
Before using CAERS in 2019, its EI system was a unique application/database linked to other state database(s).
**Note: The sync function can be performed automatically or manually, and the timing of the sync depends on needs. The emphasis here is on "snapshot" rather than a live link. For example, Minnesota can sync all facilities or individual facilities from a state master database automatically at any desired time.

Conversely, it is notable that 24 SLTs do not have data flows such as those described above. Table 5 shows the distribution of the 24 jurisdictions "without data flows". This means EI staff and/or facility operators manually create and update facility inventory data to report relevant facility configuration changes.

Table 5. Number of SLTs that Create and Update Facility Inventory Data Manually

Jurisdiction | No Data Flow | Total Responded to the Questionnaire | % No Data Flow from Total Number of Respondents
Local | 12 | 16 | 75%
State | 11 | 37 | 30%
Tribal | 1 | 1 | 100%

Table 5 shows higher percentages of local and tribal jurisdictions without data flows as compared to states. This result is consistent with the smaller number of facilities that need to be included in emissions inventories for local and tribal jurisdictions. For example:

Chattanooga City, TN obtains information for 170 facility sites from permitting staff during annual inspections. Data are manually entered.
Clark County, Nevada obtains the facility site information during the permitting process, retains it in the EI database, and reports 70 facilities to the NEI.
At Southwest Clean Air Agency, Washington, facility site data are entered into the EI system at the time of permitting or in response to an enforcement action. The EI of Southwest Clean Air Agency contains data for only 19 facilities.

Table 6 shows the cases where one jurisdiction uses more than one technique to obtain facility data for facility inventories. Note that values in Table 6 include automated and manually conducted techniques.
For example, the manual (without data flow) tribal case in Table 5 appears in Table 6 under two techniques because the tribal respondent indicated that they make manual updates and also obtain data from facility operators, both of which are manual.

Table 6. Number of Techniques Used by SLTs to Obtain Facility Inventory Data

Number of Techniques Used | Local | State | Tribal | Total SLTs
1 | 10 | 12 | | 22
2 | 4 | 12 | 1 | 17
3 | | 9 | | 9
4 | 2 | 3 | | 5
5 | | 1 | | 1
Total | | | | 54

Although 22 SLTs use only one technique in creating and updating facility inventory data, Connecticut uses 5 techniques (all except the snapshot sync with state database for other programs) to obtain a complete set of facility inventory information, due to the specific structure of its database and EI system. The nature of these updates indicates that SLTs with more than one technique might need more time than other SLTs to update their facility inventory before reporting begins in CAERS every year. Therefore, the time needed to build and update the facility inventory should be factored into the annual workflow process between CAERS and the SLTs in question. Similarly, if SLTs would like to get updates on the facility inventory from CAERS itself, time would have to be factored into the process so that the SLTs' relevant systems can absorb the changes.

Finally, while in the case of EI reporting for the NEI the SLT manages the facility inventory data, it is industry reporters themselves who edit and report facility data directly to other EPA programs (for example, CEDRI). SLTs that prefer their own facility inventory databases, rather than direct industry input, would need to work with EPA to design their workflow in CAERS so that they may provide input on the correct facility inventory data to be shared between CAERS and the other federal programs.

3.6 Information Sources for Facility Site, Emissions Unit, and Process

In survey responses, SLTs commonly indicated using multiple information sources to compile their facility inventory.
Some information sources may be popular for one data component but not for others. Table 7 shows the number of SLTs using the different information sources for facility sites, emission units, and processes. Responses do not add up to the total number of respondents because an SLT can use multiple sources.

Table 7. Number of SLTs Using Different Sources of Information for Facility Site, Emission Unit, and Process

Information Sources | Facility Site | Emission Unit | Process
State permitting program | 37 | 37 | 30
State compliance program | 23 | 18 | 17
State master database | 18 | 13 | 8
Combination SLT permitting program, compliance program, and/or master database | 40 | 39 | 33
Facility operator | 43 | 45 | 46
NEI facility inventory | 1 | 1 | NA
Previous EI | 32 | 31 | 31
Auto created by EI system | NA | NA | 10
Manually created by EI staff | NA | NA | 24

Note: "NA" means the choice was not in the survey questionnaire, where only the most common information sources for each component were listed.

Depending on the SLT, multiple programs can be supported by a central database, which supports interrelated handling of facility data. In many SLTs, the state master database may include databases for a permitting program and/or a compliance program. Some respondents selected all three sources (State permitting program, State compliance program, and State master database) when asked what sources they used for creating and updating facility inventory data, while others selected only a single master database. The entry in Table 7 listed as "Combination SLT permitting program, compliance program, and/or master database" counts each such SLT once, which avoids the double counting that would result from adding the three rows above it. For example, Alaska reported obtaining data from a combination of the state permitting program, the state compliance program, and the state master database. Alaska is counted as 1 state in the "Combination SLT permitting program, compliance program, and/or master database" row.
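
The counting rule behind the combination row can be illustrated with a short sketch. The SLT names and selections below are hypothetical and are not survey data, and the function names are illustrative only. The sketch assumes an SLT is counted once in the combination row if it selected at least one of the three SLT database sources, which is why the combined count can be smaller than the sum of the three individual rows.

```python
# Hypothetical responses (not survey data): SLT -> set of sources it selected.
responses = {
    "SLT A": {"State permitting program", "State compliance program",
              "State master database"},
    "SLT B": {"State master database"},
    "SLT C": {"State permitting program", "Facility operator"},
    "SLT D": {"Facility operator", "Previous EI"},
}

# The three sources folded into the combination row.
COMBINED_SOURCES = {
    "State permitting program",
    "State compliance program",
    "State master database",
}


def count_source(responses, source):
    """Number of SLTs that selected one specific source."""
    return sum(1 for selected in responses.values() if source in selected)


def count_combined(responses):
    """Number of SLTs that selected at least one combined source (each SLT counted once)."""
    return sum(1 for selected in responses.values() if selected & COMBINED_SOURCES)


individual = {s: count_source(responses, s) for s in sorted(COMBINED_SOURCES)}
combined = count_combined(responses)
print(individual)
print("Combination row:", combined)
```

In this made-up example, "SLT A" contributes to all three individual rows but only once to the combination row, in the same way Alaska is counted once in Table 7.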
About 74% (40 out of 54) of SLTs use information directly from combined SLT databases for facility sites. Although the number of SLTs using the combined information source decreases for emission units and decreases further for processes, about 61% (33 out of 54) of SLTs still use the combined information source for processes. The decrease occurs because SLT permitting, compliance, and/or master databases contain facility site information but less unit-level or process-level information. For example, Oklahoma maintains an air quality facility database that contains data only at the facility site level, not at the sub-facility level. About 80% of SLTs also obtain facility site information from facility operator reporting. The share of SLTs obtaining information from industry reporting increases for emission units (83%) and increases further for processes (85%).

Only 1 SLT uses information from the NEI facility inventory to initiate some facility and emission unit data updates (Table 7). More than 57% of SLTs carry the previous EI forward as the starting point for the current EI. For example:

Washington, Maricopa County, and Arizona use the previous EI report as the starting point for a new report and require facilities to review and update all information.
Minnesota copies the previous EI to the current EI and then synchronizes only new information in the master database with the current EI.

As described at the end of the previous section, for SLTs that want to use their previous year's reported facility inventory as a starting point, the workflow would have to include steps to ensure that data sent to CAERS by SLTs include the necessary updated information from all of their systems or databases. The workflow would also have to include steps to ensure that data from CAERS, once received by the SLT, are distributed to any systems or databases the SLT wants to update.
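
The carry-forward approach described for Minnesota above (copy the previous EI, then synchronize only new information from the master database) can be sketched as follows. This is a minimal illustration under stated assumptions, not any SLT's actual implementation: the record layout, facility IDs, and the function name `start_new_inventory` are hypothetical.

```python
from copy import deepcopy


def start_new_inventory(previous_ei, master_db):
    """Carry the previous EI forward, then apply master-database records that are new or changed.

    Both arguments map a facility ID to a dict of facility attributes (hypothetical layout).
    Returns the new inventory and the sorted IDs of facilities that were added or updated.
    """
    current_ei = deepcopy(previous_ei)  # step 1: copy last year's inventory
    synced = []
    for facility_id, record in master_db.items():
        if current_ei.get(facility_id) != record:  # step 2: sync only differences
            current_ei[facility_id] = deepcopy(record)
            synced.append(facility_id)
    return current_ei, sorted(synced)


# Hypothetical example data.
previous_ei = {
    "F001": {"name": "Plant One", "status": "operating"},
    "F002": {"name": "Plant Two", "status": "operating"},
}
master_db = {
    "F001": {"name": "Plant One", "status": "operating"},    # unchanged
    "F002": {"name": "Plant Two", "status": "shut down"},    # changed
    "F003": {"name": "Plant Three", "status": "operating"},  # new
}

current_ei, synced = start_new_inventory(previous_ei, master_db)
print(synced)
```

In this sketch, a facility that exists in the previous EI but not in the master database is kept unchanged rather than deleted; whether that is the desired behavior would depend on the SLT's own rules.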
While a jurisdiction may use multiple information sources for facility sites, emission units, and processes, the information sources used could depend on facility permit types. For example, MN obtains information from its master database for permitted facilities that are required to report to the state EI program. However, the master database contains information at different levels of detail for facilities with different types of permits, due to the state EI rule requirements. For small facilities with state general permits, the master database has information only for facility-level attributes. For large facilities with Title V permits and state permits other than general permits, available information includes facility sites and emission units but not processes. For non-permitted facilities that do not report to the MN EI program, such as TRI sources, landfills, POTWs, dry cleaners, and human crematories, MN uses the Toxics Release Inventory (TRI) and other information sources. In all cases, MN uses the previous EI as the starting point. A large facility needs to add new processes or shut down existing processes itself. For small facilities, the system auto-generates emission units and processes based on the process source classification codes (SCCs) selected by the facilities.

These examples indicate that, in some cases, it might be beneficial for CAERS to obtain the updated facility inventory data directly from the SLT permitting system or database, instead of from the SLT EI system that is updated from the SLT permitting system or database. The detailed SLT data flows should be considered in the development of the SLT CAERS modules. Where SLTs may not need sub-facility data for some of their facility types, additional thought is required as to how the CAERS module for each SLT could accommodate the different types of facilities and thus potentially different levels of detail in their inventory.
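
As one illustration of the small-facility case described above, the sketch below auto-generates emission units and processes from the SCCs a facility selects. The source does not describe the actual generation rule, so the rule used here (one emission unit with one process per distinct SCC), the ID formats, and all class and function names are assumptions made for illustration. The SCC strings are placeholders, not real codes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    process_id: str
    scc: str


@dataclass
class EmissionUnit:
    unit_id: str
    processes: List[Process] = field(default_factory=list)


def generate_units_from_sccs(facility_id, selected_sccs):
    """Create one emission unit with one process for each distinct SCC (assumed rule)."""
    units = []
    # dict.fromkeys removes duplicate SCCs while keeping the order selected.
    for n, scc in enumerate(dict.fromkeys(selected_sccs), start=1):
        unit = EmissionUnit(unit_id=f"{facility_id}-EU{n:03d}")
        unit.processes.append(Process(process_id=f"{unit.unit_id}-P01", scc=scc))
        units.append(unit)
    return units


# Placeholder SCC strings, not real codes; "F100" is a hypothetical facility ID.
units = generate_units_from_sccs("F100", ["SCC-A", "SCC-B", "SCC-A"])
for unit in units:
    print(unit.unit_id, [p.scc for p in unit.processes])
```

A structure like this keeps the facility, emission unit, and process levels separate, so a facility type that needs only facility-level attributes could simply carry an empty list of units.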
3.7 Information Sources for Site Control and Control Path

EIS contains information about the types of emissions controls that are present at the facility. Site control paths provide information on how the site controls relate to each other and to processes and release points at the facility. Recently, the EPA changed the way it collects data on control devices to improve the configuration of control characterizations in the NEI. CERS v1.0 was designed to collect that data using a control "approach". The new way of reporting controls instead uses information on "paths", and this change is reflected in CERS v2.0. While SLTs may have implemented their EI programs based on the AERR with EI systems based on CERS v1.0, those SLT EI programs may not collect the information for control path configurations and need time to catch up to the new way of representing control information.

Furthermore, while all SLTs are required to submit data for facility site, emission unit, and process to the NEI, SLTs are not currently required to collect or submit data for site control and site control path because the requirements are not included in the AERR. These components are also not included as an automated quality check by EIS or listed in the NEI submission guidance "cheat sheet" for facility inventory data. The EPA is considering whether new control information included in CERS v2.0 should be more explicitly required. Therefore, additional questions were included to understand the variation of SLT data practices with respect to controls data. Table 8 shows responses about control and control path reporting practices.

Table 8. Number of SLTs Using Site Control Device and Path Reporting Practices

SLT Practice | Site Control Data Total SLTs | Site Control Path Total SLTs
Collect and Report | 30 | 22
Collect Only | 12 | 5
Report Data Only | 3 | 3
Neither | 8 | 24
Blank | 1 | 0
Total | 54 | 54

It is interesting to note that the number of SLTs that collect and report data to the NEI for controls (30) is larger than the number that do so for control paths (22). Also, some SLTs collect data but do not report it to the NEI. Furthermore, 3 SLTs do not collect data but report data for the control device and control path components that they have created. Eight SLTs neither collect nor report control information, and 24 SLTs neither collect nor report control path information. Table 9 shows the information sources for controls and control paths, including the same "SLT permitting program, compliance program, and/or master database" information source as described in Section 3.6.

Table 9. Number of SLTs Using Different Sources of Information for Control Devices and Paths

Information Sources | Number of SLTs Using Source for Control Data | % Surveyed* | Number of SLTs Using Source for Control Path Data | % Surveyed**
State permitting program | 29 | 63% | 17 | 55%
State compliance program | 13 | 28% | 5 | 16%
State master database | 8 | 17% | 4 | 13%
SLT permitting, compliance, and/or master database | 31 | 67% | 20 | 65%
Facility reporting | 42 | 91% | 29 | 94%
NEI Facility inventory | 3 | 7% | 2 | 6%
Previous EI | 23 | 50% | 17 | 55%
Based on emission units/SCCs | 11 | 24% | 7 | 23%

* Forty-six SLTs collect data for the site control component.
** Thirty-one SLTs collect data for the site control path component.

A high percentage of the SLTs that collect site control data, about 67% (31 out of 46), incorporate information from SLT permitting, compliance, and/or master databases. About 91% (42 out of 46) also use information from facility reporting.
About half of SLTs use previously reported EI data as the starting point of the facility inventory for the next EI report. A few SLTs use information from the NEI facility inventory. About 24% of SLTs generate information based on emission units and processes (SCCs).

There are many different requirements and business practices for handling controls data. SLTs collect data based on SLT regulations and needs, which may not match those of the federal program. To meet the CERS v2.0 specifications for items required by the AERR, some SLTs may thus need to transform their data, either during facility reporting or afterwards when the SLTs elect to report control components to the NEI. SLTs may make assumptions or gap-fill information to the best of their knowledge to transform the data. The following list provides some examples to illustrate the range of SLT practices in data collection and transformation.

In ID, controls are reported by facilities. The names for the control paths required for the CERS v2.0 control tables are generated by the Department of Environmental Quality after receiving data from the facilities; facilities do not see them. Controls are then manually added to the CERS staging tables. ID reported control path names to EPA for the first time for the 2020 NEI.
In OK, facilities report control device sequence, percent capture, percent uptime, the pollutants controlled by the control device, and control percent reduction efficiency. These data are carried forward into the next inventory cycle and can be amended by reporters as equipment configurations and/or processes change. OK has not implemented the site control path as a reporting requirement or as an option in the front end of its SLEIS (OK's custom reporting system). OK transforms its data into the new control path schema during the EIS submission to align with CERS v2.0.
MA has built simple linear path data into its CAP emissions reporting system.
More complex paths are rarely observed at facilities in MA, and MA is considering the best approach for future implementation.
The CT emission reporting system includes a sequence order in the reporting of controls but is based on a simple unit-level control that does not model controls at a facility level, or sophisticated control configurations such as controls in parallel.
In MI, emission units are associated or linked with controls in the state system. Facilities can select which controls are installed on each emissions unit when filling out their annual emissions report. Additionally, MI creates an Excel template to collect supplemental control information from facilities to populate the CERS v2.0 data fields. This Excel template supports facilities in creating control paths to the associated processes (SCCs).
Puerto Rico does not collect control path data but creates control paths using the control data available from facilities and reports them to the NEI.
WA manually creates new tables using previous data to meet AERR reporting requirements and CERS v2.0 format definitions. WA does not collect "Sequence Number", so the state's software generates those values automatically when converting its data into the XML format for EIS.
The MN EI system is designed based on CERS v1.0. It only collects control information for large facilities from its master database, where site controls are connected to emission units. One emission unit can connect to multiple site controls either in parallel or in series, but there is no information on connections between site controls. The code that exports data from this database for submission to the NEI has built-in assumptions to allow upload using CERS v2.0. MN does not collect or report control information for small facilities and non-permitted facilities.

CAERS reporters are currently submitting data to EIS directly, and the data are formatted in accordance with the CERS v2.0 specification.
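Several of the examples above describe an SLT deriving control paths from unit-level control data after collection (for example, generating path names or sequence numbers that facilities never reported). A minimal sketch of such a transformation is shown below. It is not the CERS v2.0 schema or any SLT's actual code: the class names, field names, and the `PATH-<unit>` naming rule are hypothetical, and the sketch assumes all controls on a unit run in series, so it cannot represent controls in parallel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UnitControl:
    """A control linked to an emission unit (hypothetical field names)."""
    unit_id: str
    control_id: str
    sequence: Optional[int] = None  # None when the SLT does not collect it


@dataclass
class PathAssignment:
    """One step of a generated control path (hypothetical field names)."""
    path_id: str
    sequence: int
    control_id: str


def build_linear_paths(controls: List[UnitControl]) -> List[PathAssignment]:
    """Create one simple in-series path per emission unit.

    Controls with a reported sequence come first, in that order; controls
    without one follow in the order received. Sequence numbers are then
    regenerated as 1, 2, 3, ... within each path.
    """
    by_unit: Dict[str, List[UnitControl]] = {}
    for control in controls:
        by_unit.setdefault(control.unit_id, []).append(control)

    assignments: List[PathAssignment] = []
    for unit_id, unit_controls in by_unit.items():
        ordered = sorted(
            enumerate(unit_controls),
            key=lambda pair: (pair[1].sequence is None,
                              pair[1].sequence or 0,
                              pair[0]),
        )
        path_id = f"PATH-{unit_id}"  # assumed naming rule, generated by staff
        for new_seq, (_, control) in enumerate(ordered, start=1):
            assignments.append(
                PathAssignment(path_id, new_seq, control.control_id)
            )
    return assignments


if __name__ == "__main__":
    sample = [
        UnitControl("EU1", "C2", 2),
        UnitControl("EU1", "C1", 1),
        UnitControl("EU2", "C3"),  # no sequence collected
    ]
    for step in build_linear_paths(sample):
        print(step)
```

A real export would also need the path-to-process links and any pollutant-level details the target format requires; those are left out here because the survey responses do not describe them.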
The CAERS schema allows facilities to report data under both the federal requirements of the AERR and SLT-specific requests. SLTs can request data fields in CAERS as a part of their overall SLT inventory. While CAERS may be able to collect SLT-specific data, special care must be taken to ensure that technical solutions do not introduce technological artifacts (artificial requirements) and do not create unnecessary difficulty for industry reporters or SLTs.

3.8 Information Sources for Release Point and Release Point Apportionment

Although SLTs must identify release points and release point apportionments when submitting data to the NEI, SLTs have diverse requirements, business practices, and approaches in data handling. There are two types of release points: stacks and fugitives. Table 10 shows the results of the survey question asking about the data source for release point and release point apportionment data.

Table 10. Number of SLTs Using Different Sources of Information for Release Points and Release Point Apportionment

Information Source                                                   Stack Release Point   Fugitive Release Point   Release Point Apportionment
State permitting program                                             37                    29                       23
State compliance program                                             17                    13                       11
State master database                                                10                    8                        7
SLT permitting program, compliance program, and/or master database   38                    31                       27
Facility reporting                                                   45                    38                       37
NEI Facility inventory                                               4                     6                        3
Previous EI                                                          28                    23                       20
Auto created by EI system                                            9                     10                       8
Manually created by EI staff                                         18                    16                       20

The observations again show a strong dependence (at least 50%, 27 out of 54 SLTs) on the SLT permitting program, compliance program, and/or master database as sources of information for release point and release point apportionment data.
This reliance on SLT non-EI programs and databases for release point and release point apportionment information is less than for data on facilities, emission units, and processes (at least 61%, 33 out of 54 SLTs). This result shows that SLTs are less likely to have information readily available for release points and release point apportionments than they are to have facility-, unit-, or process-level data. Therefore, they may have to take additional steps to create and report release point data to the NEI. Figure 3 illustrates the reliance on data sourced from non-EI datasets for facility and sub-facility information. It compares each of the data types previously described by showing the percentage of SLTs that rely on other databases within the SLT to get that information.

Figure 3. Percent of SLTs Relying on SLT Permit Program, Compliance Program and/or Master Database for Facility and Sub-facility Data
[Figure categories: Facility Site, Emission Unit, Process, Stack Release Point, Fugitive Release Point, Release Point Apportionment]

Because of the lack of release point information, it is possible SLTs will rely on their NEI facility inventory data more or exclusively to maintain and update their release point information. Specifically, 4 SLTs use the NEI facility inventory as the source for their stack parameters and 6 SLTs use it for fugitive release parameters. For the release point apportionment, 3 SLTs use the NEI facility inventory, which is more than those that use it for facility (2 SLTs) and emission units (1 SLT). Fewer SLTs use previous EI information stored at the SLT, particularly for release point apportionments (36%), than for facility sites, emission units, and processes (for which at least 53% of SLTs use the previous EI).
Many SLTs need their EI information from the NEI on release point apportionments and paths, as that information has only recently been required and SLTs may not have had time to adjust their other facility inventory sources to take in and update the new information if they desire to have it stored there. Facility reporting is the most popular information source for release point and release point apportionment information. Finally, about 15% of SLTs use information created manually by EI staff and 10% use data created automatically by the EI system.

A single jurisdiction may use a variety of approaches to obtain data on release points and release point apportionments. The data collected from facilities may be different from the data an SLT transforms and submits to the EIS. SLTs must use assumptions and gap-filling when transforming data to the CERS v2.0 format. In doing this, some SLTs collect total emissions and assume 100% of emissions go through stacks, and do not consider fugitive release points at all, even though doing so is not expected by the AERR. Examples are provided below.

MA does not collect release point apportionment data. The release point apportionment information is auto created by their EI system based on some very basic assumptions.
Pima County, AZ, has not collected or reported release point data for fugitive emissions. Fugitive emissions have been reported but not the associated release point data. Therefore, all emissions (100%), including fugitive emissions, are reported as being released through stacks. Because Pima is a CAERS user, and because CAERS is set up for facility reporters to apportion data to fugitives, Pima could refine its data on fugitive emissions in the future.
In WI, the EI system assumes all emissions from emissions units are fugitive if the emissions are not partitioned to any stacks.
Indiana's database does not support apportionment. IN gathers the information through an online system used to prepare the emissions statements.
IN saves the data in an Oracle database based on a very early version of the NEI and then submits it through the node. IN assumes 100% of emissions are released through stacks.
MI created an Excel template in addition to its EI system to fulfill the requirements of the new CERS v2.0. This includes the creation of release point apportionments for any associated stacks.
MN's master database contains stack information for about 428 large facilities but only identifies fugitive emission units, not fugitive release points. Fugitive release points are auto generated by the EI system. MN does not report any parameters associated with fugitive release points, such as width and length. For about 1,700 small permitted and non-permitted facilities, the EI system auto generates one dummy release point (which could be a stack or a fugitive release point) for an emission unit based on the SCC of its first process. The dummy release point is used by all processes under that emission unit.
Lincoln/Lancaster County, Nebraska assumes 100% of emissions are released through stacks.
In CO, process-pollutant level emissions are 100% assigned to a single stack (AIRS/AFS model).
HI uses the default SLEIS input format for 2020 emissions. This means collecting data meeting the CERS 1.0 format and then letting the Bridge Tool convert to CERS 2.0.

Because CAERS is already set up to obtain detailed data for release points, SLTs taking advantage of CAERS for reporting would benefit by getting this data from industry reporters. In addition, having release point apportionment data that include fugitives would mean a better alignment with TRI-reported emissions. TRI requires fugitives to be reported, so NEI and TRI reporting can be mismatched solely because a facility reports fugitives to TRI but not to NEI.

4 Discussion and Recommendations for CAERS development

The Facility PDT R&D team discussed the survey responses and their implications for CAERS.
The following recommendations were issued from those discussions.

1. CAERS capacity: CAERS needs to be able to handle a large volume of facilities, about 30,000 facilities in one jurisdiction. This need applies to data management within CAERS and data transfer between CAERS and SLTs. Additionally, many SLTs include non-NEI reporting facilities in their inventories and some SLTs keep such facilities in EIS already. CAERS would be used by SLTs not only for facilities that are subject to federal reporting requirements but also for non-federal reporting requirements in use cases 2-4 of SLT workflows. Otherwise, the SLTs would need to have one reporting system for facilities that also report to one or more federal programs, and a separate reporting system for those that don't. CAERS system performance should not be slowed down for SLTs with a large volume of data.

2. SLT-specific data fields beyond NEI requirements: These include data fields, codes, and types of facilities, including differing levels of detail for facility and sub-facility level data depending on the type of facility (for example, size, type of permit). Some implications of these differences are:

a. CAERS workflows with SLTs would have to be designed with SLT-specific data fields in mind. This means CAERS would have to accommodate additional codes for existing fields, additional data fields that SLTs require, as well as QA checks, for example, to enforce reporting of required fields where the EPA does not require them. Considerations include.

b. The facility inventory information beyond CAERS is not needed by more than 50% of SLTs. The CAERS development team should use the information collected here to help decide which data elements should be included in the CAERS core information model, which should not be included, and which could optionally be included depending on further evaluation and specific SLT needs.

c.
The need for CAERS to support facility inventory information beyond CERS v2.0 depends on SLT regulations and requirements. CAERS will support SLT-specific facility data fields already included in CERS v2.0 when web services between CAERS and SLTs are established. In addition, at the time this report was being finalized, the PDT had discussed additional SLT-specific data fields that SLTs may want to see in CAERS in the future. A need for one jurisdiction may not be a need for others, and CAERS should account for that.

d. The CAERS codes and functions (including QA checks) for using those data fields that meet SLT-specific needs should be considered in future CAERS versions as SLTs onboard so their specific needs may be met. However, where an onboarding SLT would like to take advantage of the fact that CAERS is already set up to take in information using a certain format, the SLT may want its module to take advantage of this setup. For example, through CAERS, the facility reporter may add fugitive release points, and associate controls and control paths to stack release points. This data could also be fed back to the SLT.

3. Differences in SLT data collection and updating methods: Many SLTs, from 31% to 59% of respondents depending on the component, use their previous EI as a starting point for the current EI. By default, CAERS already copies the previous year's EI to the current EI as a starting point for facility reports. However, many SLTs' EI systems have inherent relationships with other SLT programs such as permitting and compliance. The systems for those other programs are also SLT-specific. CAERS could interact with SLT emission inventory systems and/or directly interact with databases for other SLT programs (for example, permitting) depending on the SLT's specific needs.
Or the SLT could create, as part of its workflow with CAERS, the compilation of the facility inventory data that the SLT wants to have in CAERS for reporting; SLTs could develop more specialized modules with guidance and help from EPA for this purpose. The SLT-specialized CAERS module could have specialized interfaces and perhaps special web services that use the SLT data fields between the CAERS facility data and the multiple SLT data systems. There are multiple scenarios that CAERS should handle:

a. Differences in the SLT starting facility inventory for a given reporting year.

i. An SLT may rely on CAERS exclusively to obtain facility inventory updates as reported by industry.

ii. An SLT may rely on some of the data in CAERS but may want to supplement the facility inventory with updates from some of its other databases or systems, and/or manually. In either of these cases, if the SLT would like to get updates on the facility inventory from CAERS, this would require a workflow that would allow them to get that data when and how they would like to get it. In addition, if the SLT would like to parse out the facility inventory data to update different SLT databases and systems, this step would also have to be factored into its overall workflow with CAERS.

iii. The SLT may not want to rely on the data in CAERS as a starting point for the facility inventory for a given inventory year and thus may want to pull its facility inventory into CAERS every year. SLTs that would like to have their EI in CAERS before annual reporting begins, and that use multiple sources of data, would need to factor in more time to update their facility inventory in CAERS, and their workflow with CAERS could involve coordination amongst multiple databases. If those SLTs also want to take advantage of CAERS to allow facility reporters to update the facility inventory, they could do so with the help of QA checks in CAERS that would ensure data is entered correctly.
Then, a workflow that brings those updates back to the SLT database(s) would have to be designed and implemented. SLTs that prefer their own databases for facility inventory updates, rather than direct industry input, would also need to think about when and how to update CAERS with the goal of ensuring their most updated facility inventory is shared with other EPA programs such as CEDRI. This is because, for other EPA programs, industry reporters directly report and edit the facility data themselves, and thus the SLT would have to weigh in to determine the correct version of the facility's inventory data to be shared from CAERS. Further research and discussions are needed to this effect.

4. Control device data: In terms of controls and control paths, there are three broad use cases for SLTs:

a. SLTs require detailed control and path information: CAERS is already equipped to allow facility reporters to build paths from processes to release points and "reuse" those paths where applicable for more process-release point combinations. While the process for collecting control data is straightforward for facilities with single controls between processes and release points, complex control configurations sometimes require assistance from SLTs and EPA to set up appropriately in CAERS, and this assistance has been available to facility reporters since the CAERS MVP. However, in the dataflow between CAERS and the SLT, the SLT may require additional functionality to be able to update the information in CAERS as needed, if its facility inventory will not be based on the data in CAERS.

b. SLTs do not have rules to require facilities to report but encourage facilities to report: Ideally, CAERS could include a mechanism to auto-generate control path configurations for SLTs that do not require that information. While the data would not be as robust, this could be done based on existing information for controls, their connections and/or SCCs.
The need for this feature in CAERS is highly dependent on the need to support different SLT regulations and practices, which could change over time depending on any future changes to the AERR. Because CAERS already has the capability to obtain control path data from facility reporters, SLTs may want to take advantage of CAERS reporting. Where SLTs wish to send their data to CAERS (Case 1 SLT), their use cases would have to be reviewed to see how this automated functionality might be added to their module. This means that the workflow between CAERS and the SLT would have to include additional code for this transformation.

c. SLTs do not have rules to require facilities to report and do not see the value of collecting detailed information on the site control component and site control path component in the SLT: The workflow would not require CAERS to return the updated control and path data to the SLT. In this case, the SLT-specific CAERS module could auto populate the CAERS site control component and site control path component with control information at a higher level (such as 80% control for PM10-FIL) and basic assumptions provided by the SLT.

5. Release point data: Handling fugitive release points and release point apportionment is another challenging task. Some SLTs may not have regulations to support it, and facility reporters may not report what is needed correctly. Although some SLTs allow facilities to report the information, facilities may not work diligently to obtain the information without SLT requirements. SLTs have made assumptions to be able to include this information in their NEI submissions. SLTs adopting CAERS may benefit from allowing their facilities to report the release point apportionments and incorporating that information into their systems, or from applying assumptions to their data in CAERS so it is set up with fugitive release information before sending it to EIS.
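The kinds of release point assumptions described in Section 3.8 (sending 100% of emissions through stacks, or treating emissions not assigned to any stack as fugitive) can be written as a small gap-filling rule. The sketch below is illustrative only: the function name, the default fugitive identifier, and the choice to assign everything to the first listed stack are assumptions made here, not CAERS or EIS behavior.

```python
from typing import Dict, List


def gap_fill_apportionment(
    reported: Dict[str, float],
    stack_ids: List[str],
    fugitive_id: str = "FUG-DEFAULT",  # hypothetical placeholder identifier
) -> Dict[str, float]:
    """Return {release_point_id: percent} for one process.

    1. If the facility reported apportionments, keep them (they must
       total 100%).
    2. Otherwise, if the unit has stacks, assume 100% goes through the
       first listed stack (a simplification of the "all emissions
       through stacks" assumption).
    3. Otherwise, assume 100% is released as fugitive.
    """
    if reported:
        total = sum(reported.values())
        if abs(total - 100.0) > 1e-6:
            raise ValueError(f"apportionment totals {total}%, expected 100%")
        return dict(reported)
    if stack_ids:
        return {stack_ids[0]: 100.0}
    return {fugitive_id: 100.0}


if __name__ == "__main__":
    print(gap_fill_apportionment({"S1": 60.0, "F1": 40.0}, ["S1"]))
    print(gap_fill_apportionment({}, ["S1", "S2"]))
    print(gap_fill_apportionment({}, []))
```

Keeping the reported values whenever they exist means facility-supplied apportionments, such as those entered through CAERS, always take precedence over the defaults.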
The same applies to release point parameters, such as stack heights and stack diameters. Currently, some SLT EI systems auto fill missing values by using SCC-based default stack parameters. If those kinds of needs arise, CAERS or, more likely, SLT-specific CAERS modules should have this function. The current CAERS has QA checks to make sure no data are missing.

6. Time series: While not explicitly mentioned in the survey analysis, the team discussed that a good facility inventory should be able to track components through time. Date stamps would be helpful to include for many facility inventory data fields. This capability is already being explored in CAERS so that SLT-specific start and end date fields may be included in the future for SLTs that would like to have them. As new data is added, these time stamps may also be generated in CAERS.

7. Data editors: Another important aspect that the team derived from the above discussion, and that has already been mentioned, has to do with who should edit which parts of a facility inventory. SLTs may want to prevent facility reporters from editing certain data fields. SLT EI staff need to be able to manually create and update certain facility inventory data before submitting data to the NEI. Different SLTs may require different data field restrictions. CAERS already allows SLTs to determine which data they enter versus which data they want facilities to enter. To alleviate the burden on SLTs, an option that some onboarded SLTs have chosen is to issue warnings when a facility reporter has edited certain fields. This allows the SLT to be aware of the change and accept it if desired, without having to spend SLT staff time entering the data themselves.

8.
Editing functionality: Given the variety of needs, and that some SLTs have certain functionalities to assist in the creation and editing of their inventories, the team considered that it would be helpful for CAERS to have future enhancements or new capabilities, such as copying and pasting records and auto filling sequential numbers, to make manual data editing easier and less time consuming.

9. Reporting by tribes: It is noteworthy that only one tribal jurisdiction responded to the survey. Tribes are not currently compelled to report to the NEI and may not have the resources to do so. CAERS could facilitate emission inventory development by tribes by providing them a way to collect point source data because, in most cases, there are similarities between tribal emission inventories and state and local emission inventories. By enhancing how CAERS handles facility inventory data, tribes might find the use of CAERS more advantageous and could be encouraged to report more than they do now.

10. Future research.

a. Facility management: EPA's Facility Registry Service (FRS) integrates facility information from several different national and state information systems (https://www.epa.gov/frs), including compliance and permit data for stationary sources of air pollution regulated by the EPA, state, and local air pollution agencies. The R&D facility sub team explored the possibility of using FRS as one information source for CAERS. Initial work had already been conducted to explore the use of FRS as a data source for facility information in CAERS, including workflows with the SLTs. However, the team found that FRS is not designed to meet future CAERS needs for sharing facility inventory data with other federal programs or SLTs without a major overhaul of the system, so at this time it is not feasible to use FRS with CAERS.

b.
QA/QC checks: A subsequent report from this team will explore the types of quality assurance (QA) and quality control (QC) checks that CAERS should have to allow SLTs to customize their use of CAERS.

c. SLT facility inventory updates and how they should be shared with the other federal programs via CAERS is also a topic for future research, especially in situations where SLTs do not wish to rely on facility reporters to obtain facility inventory information.

Integration between CAERS and SLT systems will be crucial to keep facility inventory data updated. Also, EI systems are linked with other SLT systems. This means that there are different cases for CAERS facility workflows, ranging from SLTs that would use CAERS as their starting point for reporting the following year, to workflows between CAERS and SLT facility inventory databases, and potentially allowing CAERS to receive certain data from different SLT databases to mimic that SLT's current practices. Differences in the facility inventory content (types of facilities, data fields and codes) beyond the NEI also exist. CAERS will require much flexibility with respect to facility inventories to accommodate future SLTs and their workflows.

Appendix A - Original Letter and Questionnaire
See Appendix A. Original Letter and Questionnaire.docx

Appendix B. EIS facility staging table requirements
See Appendix B. Facility Staging Requirements.xlsx

Appendix C. Responses to the Questionnaire
See Appendix C. All Original Responses.pdf

Appendix D. Data analysis
See Appendix D. SLT Facility Inventory Research.xlsx