AERMOD Model Evaluation

EPA-454/B-24-006
November 2024

U.S. Environmental Protection Agency
Office of Air Quality Planning and Standards
Air Quality Assessment Division
Research Triangle Park, NC

Table of Contents

Table of Contents
Figures
Tables
1. Introduction
2. Database descriptions
2.1. Martin's Creek
2.2. Tracy Power Plant
2.3. Lovett Power Plant
2.4. Westvaco Mill
2.5. Duane Arnold Energy Center
2.6. Experimental Organically Cooled Reactor
2.7. Alaska North Slope
2.8. Prairie Grass
2.9. Indianapolis
2.10. Kincaid
2.11. AGA
2.12. Millstone Nuclear Power Plant
2.13. Bowline
2.14. Baldwin Power Plant
2.15. Clifty Creek Power Plant
3. Evaluation methodology
3.1. AERMET/AERMOD comparisons
3.2. Evaluation procedures
3.2.1. Robust highest concentrations
3.2.2. EPA Protocol for determining best performing model
3.3. Results
3.3.1. Turbulence cases
3.3.2. Non-turbulence cases
3.3.3. Statistical evaluations
4. Summary/Conclusions
5. References

Figures

Figure 1. Martin's Creek study area
Figure 2. Tracy power plant study area
Figure 3. Lovett study area
Figure 4. Westvaco study area
Figure 5. DAEC study area (SF6 releases)
Figure 6. Terrain map featuring the entire EOCR grid with the source at the grid center (SF6 releases). Arcs are at distances of about 40, 80, 200, 400, 800, 1200, and 1600 m
Figure 7. Depiction of Alaska North Slope Oil Gathering Center turbine stack, meteorological tower (X), and camera locations used to visualize plume rise
Figure 8. Prairie Grass study area
Figure 9. Map showing the location of the Perry-K Station (A), the Hoosier Dome (B), and the central Indianapolis business district (C). The downtown surface meteorological site is located at (D) and the "bank tower" site was on top of the building at (E)
Figure 10. Indianapolis meteorological sites and emissions site (Perry K Station)
Figure 11. Kincaid study area
Figure 12. Plan view of the locations of tracer samplers at Site 1, AGA field study (SF6 releases)
Figure 13. Millstone study area (SF6 and freon releases)
Figure 14. Bowline Point study area (SO2 releases)
Figure 15. Baldwin study area
Figure 16. Clifty Creek study area

Tables

Table 1. AERMOD evaluation databases used for comparisons of AERMOD 23132 and AERMOD 24142. Databases in gray are also subject to the EPA's protocol for determining best performing model
Table 2. Hourly, 3-hour, and 24-hour RHC for turbulence cases
Table 3. Hourly, 3-hour, and 24-hour RHC for non-turbulence cases
Table 4. Composite Performance Measure (CPM) for turbulence cases
Table 5. Composite Performance Measure (CPM) for non-turbulence databases
Table 6. Martins Creek Model Comparison Measure (MCM) results
Table 7. Lovett Model Comparison Measure (MCM) results
Table 8. Westvaco Model Comparison Measure (MCM) results
Table 9. Kincaid Model Comparison Measure (MCM) results
Table 10. Bowline Model Comparison Measure (MCM) results
Table 11. Baldwin Model Comparison Measure (MCM) results
Table 12. Clifty Creek Model Comparison Measure (MCM) results

1. Introduction

This evaluation presents a benchmark of model performance based on the original field studies presented in Cimorelli et al. (2005) and Perry et al. (2005). The evaluation focused on the performance of version 24142 of the AERMOD modeling system compared with the previous version, 23132. The statistical analysis determines the best performing version of the model for 15 of the original 17 databases, including the adjust u* option¹, which was formally adopted as a regulatory option in version 16216r of AERMOD.

¹ The adjust u* option accounts for low wind speeds when calculating u* in AERMET.

2. Database descriptions

The 15 databases used in this evaluation are briefly described in this section and summarized in Table 1. The stack heights, terrain complexity, urban/rural status, importance of downwash, inclusion of turbulence parameters, and meteorological data included for each database are listed for each area. A more complete description of these databases can be found in U.S. EPA, 2003. The databases are arranged by the following hierarchy: two categories of turbulence inclusion (turbulence included or no turbulence); within each of those categories, databases are ordered by complexity of terrain (complex or flat); and within those two categories, databases are ordered by increasing stack height.

Table 1. AERMOD evaluation databases used for comparisons of AERMOD 23132 and AERMOD 24142. Databases in gray are also subject to the EPA's protocol for determining best performing model.

Location | Stack heights | Urban/rural | Terrain | Downwash | Turbulence parameters | Site-specific AERMET inputs
Martins Creek | 59, 76, 183 m | Rural | Complex | Yes | 10 m σv, σw | 10 m wind, temperature; 90-420 m wind (every 30 m)
Tracy | 91 m | Rural | Complex | No | σv, σw | 10 and 50-400 m (every 25 m) wind, temperature
Lovett | 145 m | Rural | Complex | No | σv, σw | 10, 50, and 100 m wind, temperature
Westvaco | 190 m | Rural | Complex | No | σv, σw | 30, 210, 326, 366, and 416 m wind, temperature²
DAEC | 1 m, 24 m, 46 m | Rural | Flat | Yes | σv | Insolation; 10, 23.5, and 50 m wind, temperature
EOCR | 1, 25, 30 m | Rural | Flat | Yes | σv | 4, 10, and 30 m wind, temperature
Alaska | 39.2 m | Rural | Flat | Yes | σv, σw | 33 m wind, temperature
Prairie Grass | 0.46 m | Rural | Flat | No | 2 H σv, σw | 1, 2, 4, 8, and 16 m temperature; 1 m wind; u*; mixing height; sky cover
Indianapolis | 84 m | Urban | Flat | No | σv, σw | Station pressure; net radiation; 10 m wind, temperature
Kincaid | 187 m | Rural | Flat | No | σv, σw | Net radiation, insolation; 10, 30, and 50 m wind, temperature
AGA | 9.8, 14.5, 24.4 m | Rural | Flat | Yes | None | 10 m wind and temperature
Millstone | 3 stacks, 29 m (freon); 48 m (SF6) | Rural | Flat | Yes | None | 10 m wind speed; 43.3 m wind and temperature
Bowline | 2 stacks, 86.87 m | Rural | Flat | Yes | None | 100 m winds and temperature
Baldwin | 3 stacks, 184.4 m | Rural | Flat | Yes | None² | 10 and 100 m wind, temperature
Clifty Creek | 3 stacks, 207.9 m | Rural | Flat/Elev | No | None | 10 m temperature; 60 m wind

² 30 m observations removed from AERMOD profile before running AERMOD.

2.1. Martin's Creek

The Martins Creek Steam Electric Station is located in a rural area along the Delaware River on the Pennsylvania/New Jersey border, approximately 30 km northeast of Allentown, PA and 95 km north of Philadelphia, PA (Figure 1). The area is characterized by complex terrain rising above the stacks. Sources include multiple tall stacks ranging from 59 to 183 m in height, including Martins Creek and three background sources located between 5 and 10 km from Martins Creek. The seven SO2 monitors were located on Scotts Mountain, which is about 2.5 to 8 km southeast of the Martins Creek facility.

On-site meteorological data covered the period from May 1, 1992 through May 19, 1993. Hourly temperature, wind speed, wind direction, and sigma-theta (standard deviation of the horizontal wind direction) at 10 m were recorded from an instrumented tower located in a flat area approximately 2.5 km west of the plant. In addition, hourly multi-level wind measurements were taken by a sound detection and ranging (SODAR) instrument located approximately three kilometers southwest of the Martins Creek station.

Figure 1. Martin's Creek study area.

2.2. Tracy Power Plant

The Tracy Power Plant is located 27 km east of Reno, Nevada in the rural Truckee River valley, completely surrounded by mountainous terrain (Figure 2). A field tracer study was conducted at the power plant in August 1984, with SF6 released with the moderately buoyant plume from a 91-m stack. A total of 128 hours of data were collected over 14 experimental periods. Stable atmospheric conditions were dominant for this study. Site-specific meteorological data (wind, temperature, and turbulence) for Tracy were collected from an instrumented 150-m tower located 1.2 km east of the power plant. The wind measurements from the tower were extended above 150 meters using a Doppler acoustic sounder, and temperature measurements were extended with a tethersonde.

Figure 2. Tracy power plant study area.

2.3. Lovett Power Plant

The Lovett Power Plant study consisted of a buoyant, continuous release of SO2 from a 145-m tall stack located in a rural, complex-terrain area of New York State (Figure 3). The data spanned one year from December 1987 through December 1988. Data were collected from 12 monitoring sites (ten on elevated terrain and two near stack-base elevation) that were located about 2 to 3 km from the plant. The monitors provided hourly-averaged concentrations. The important terrain features rise approximately 250 m to 330 m above stack base at about 2 to 3 km downwind from the stack.
Meteorological data include winds, turbulence, and ΔT from a tower instrumented at 10 m, 50 m, and 100 m. National Weather Service surface data were available from a station 45 km away.

Figure 3. Lovett study area.

2.4. Westvaco Mill

The Westvaco Corporation's pulp and paper mill in rural Luke, Maryland is located in a complex terrain setting in the Potomac River valley (Figure 4). A single 183-m buoyant source was modeled for this evaluation. There were 11 SO2 monitors surrounding the facility, with eight monitors well above stack top on the high terrain east and south of the mill at a distance of 800 to 1,500 m. Hourly meteorological data (wind, temperature, and turbulence) were collected between December 1980 and November 1991 at three instrumented towers: the 100-m Beryl tower in the river valley about 400 m southwest of the facility; the 30-m Luke Hill tower on a ridge 900 m north-northwest of the facility; and the 100-m Met tower located 900 m east-southeast of the facility on a ridge across the river.

Figure 4. Westvaco study area.

2.5. Duane Arnold Energy Center

The Duane Arnold Energy Center (DAEC) is located in rural Iowa, about 16 km northwest of Cedar Rapids. It sits in a river valley with some bluffs on the east side. Terrain varies by about 30 m across the receptor network, with the eastern half of the semicircular receptor arcs being flat and the western half elevated. The tracer study³⁵ involved SF6 releases from two rooftops (46-m and 24-m levels) and the ground (1-m level). Building tiers for the rooftop releases were 43 and 24 m high, respectively. The 1-m and 24-m releases were non-buoyant and non-momentum, while the 46-m release was close to ambient temperature but had an exit velocity of about 10 m/s. The number of tracer release hours was 12, 16, and 11 from the release heights of 46 m, 24 m, and 1 m, respectively.
There were two arcs of monitors at downwind distances of 300 and 1000 m (see Figure 5). Meteorological data consisted of winds at 10, 24, and 50 m. The meteorological conditions were mostly convective (30 out of 39 hours), with fairly light wind speeds. Only one hour had a wind speed above 4 m/s (4.6 m/s), and almost half of the hours had wind speeds below 2 m/s.

Figure 5. DAEC study area (SF6 releases).

2.6. Experimental Organically Cooled Reactor

The Experimental Organically Cooled Reactor (EOCR) study involved the simultaneous release of three tracer gases (SF6, F12, and Freon-12B2) at three levels around the EOCR test reactor building at the Idaho National Engineering Laboratory in southeast Idaho. The terrain was flat with low-lying shrubs. The main building was 25 m high with an effective width of 25 m. The tracer releases typically occurred simultaneously and were conducted during 22 separate time periods. Tracer sampler coverage was provided at eight concentric rings at distances of about 50, 100, 200, 400, 800, 1200, and 1600 m from the release points (see Figure 6). The stability classes ranged from stable to unstable. The 10-m wind speeds for the cases selected ranged from 3 to 8 m/s.

Figure 6. Terrain map featuring the entire EOCR grid with the source at the grid center (SF6 releases). Arcs are at distances of about 40, 80, 200, 400, 800, 1200, and 1600 m.

2.7. Alaska North Slope

The Alaska North Slope tracer study (see Figure 7) involved 44 hours of buoyant SF6 releases from a 39-m high turbine stack. Tracer sampler coverage ranged over seven arcs from 50 to 3,000 m downwind.
Meteorological data, including wind speed, wind direction, temperature, sigma-theta, and sigma-w, were available from an on-site tower at the 33-m level. Atmospheric stability and wind speed profiles were influenced by the smooth snow-covered tundra surface with negligible levels of solar radiation in the autumn months. All experiments (44 usable hours) were conducted during the abbreviated daylight hours (0900 to 1600). Wind speeds taken at the 33-m level during the tests were less than 6 m/s during one and part of another test, between 6 and 15 m/s during four tests, and in excess of 15 m/s during three tests. Stability conditions were generally neutral or slightly stable.

Figure 7. Depiction of Alaska North Slope Oil Gathering Center turbine stack, meteorological tower (X), and camera locations used to visualize plume rise.

2.8. Prairie Grass

The Prairie Grass study used a near-surface, non-buoyant tracer release in a flat rural area in Nebraska. The tracer was SO2, released at 0.46 m above the surface. Surface sampling arrays (arcs) were positioned from 50 m to 800 m downwind. Meteorological data included the 2-m level wind direction and speed, the root-mean-square wind direction fluctuation, and the temperature difference (ΔT) between 2 m and 16 m. Other surface parameters, including friction velocity, Monin-Obukhov length, and lateral plume spread, were estimated. Wind, turbulence, and temperature were obtained from a multi-level instrumented 16-m meteorological tower. A total of 44 ten-minute sampling periods were used, including both convective and stable conditions.

Figure 8. Prairie Grass study area.

2.9. Indianapolis

The Indianapolis study consisted of an elevated, buoyant tracer (SF6) released in a flat-terrain urban to suburban area from a single 84-m stack (Figure 9).
Data are available for approximately a four- to five-week period, with 177 monitors providing 1-hour averaged samples along arcs from 250 m to 12 km downwind for a total of 1,297 arc-hours. Meteorological data included wind speed, wind direction, and sigma-theta on a 94-meter tower, and wind speed, ΔT (2 m - 10 m), and other supporting surface data at three other 10-m towers (Figure 10). Observed plume rise and estimates of plume sigma-y are also available from the database.

Figure 9. Map showing the location of the Perry-K Station (A), the Hoosier Dome (B), and the central Indianapolis business district (C). The downtown surface meteorological site is located at (D) and the "bank tower" site was on top of the building at (E).

Figure 10. Indianapolis meteorological sites and emissions site (Perry K Station).

2.10. Kincaid

The Kincaid SO2 study was conducted in a flat rural area of Illinois (Figure 11). It involved a buoyant, continuous release of SO2 from a 187-m stack. The study included about six months of data between April 1980 and June 1981 (a total of 4,614 hours of samples). There were 30 SO2 monitoring stations providing 1-hour averaged samples from about 2 km to 20 km downwind of the stack. Meteorological data included wind speed, direction, and temperature from a tower instrumented at the 2, 10, 50, and 100 m levels, and nearby National Weather Service (NWS) data.

Figure 11. Kincaid study area.

2.11. AGA

The AGA experiments occurred during spring and summer 1980 at gas compressor stations in Texas and Kansas (Figure 12). At each test facility, one of the gas compressor stacks was retrofitted to accommodate SF6 tracer gas emissions.
In addition, stack height extensions were provided for some of the experiments (with the normal stack height close to 10 m). The stack height to building height ratios for the tests ranged from 0.95 to 2.52. There were a total of 63 tracer releases over the course of the tests, and the tracer samplers were located between 50 and 200 m away from the release point (see Figure 12). An instrumented 10-m tower was operated at both experimental sites. The tracer releases were generally restricted to daytime hours. Stability classes ranged from neutral to extremely unstable, except for three hours that were slightly stable. Wind speeds ranged from 2 to 11 m/s over the 63 hours.

Figure 12. Plan view of the locations of tracer samplers at Site 1, AGA field study (SF6 releases).

2.12. Millstone Nuclear Power Plant

The Millstone nuclear power plant is located on the Connecticut coast, near Niantic. The model evaluation database features 36 hours of SF6 emissions from a 48-m reactor stack and 26 hours of Freon emissions from a 29-m turbine stack. Exit temperatures were close to ambient (about 295 K), with exit velocities of about 10 m/s for both the reactor stack (48.3 m) and the three turbine stacks (29.1 m). These stacks were associated with 45-m and 28-m building tiers, respectively. The monitoring data consisted of three arcs at 350, 800, and 1,500 m. Meteorological data were available from an on-site tower at the 10-m and 43-m levels. There was about an even split between stable and unstable hours, with mostly onshore winds and fairly high wind speeds. Only 3 stable hours had wind speeds below 4 m/s; the majority were above about 7 m/s, and several were above 10 m/s. Figure 13 shows the layout of the study area.

Figure 13. Millstone study area (SF6 and freon releases).

2.13. Bowline

The Bowline Point site³³, located in the Hudson River valley in New York State, is shown in Figure 14 (topographic map). The electric utility site, in a rural area, included two 600-MW units, each with an 86.9-m stack and a dominant roof tier 65.2 m high. There were four monitoring sites, as shown in Figure 14, that ranged from about 250 to 850 m from the stacks. Hourly emissions data were determined from load data, coal analyses, and site-specific relationships between loads and fuel consumption. Meteorological data were obtained from a 100-m tower at the site. This site was also used as an independent evaluation database with the entire year included.

Figure 14. Bowline Point study area (SO2 releases).

2.14. Baldwin Power Plant

The Baldwin Power Plant is located in a rural, flat terrain setting of southwestern Illinois and has three identical 184-m stacks aligned approximately north-south with a horizontal spacing of about 100 m (Figure 15). There were 10 SO2 monitors surrounding the facility at distances of 2 to 10 km. On-site meteorological data were available during the study period of April 1, 1982 through March 31, 1983 and consisted of hourly averaged wind speed, wind direction, and temperature measurements taken at 10 m, and wind speed and wind direction at 100 m.

Figure 15. Baldwin study area.

2.15. Clifty Creek Power Plant

The Clifty Creek Power Plant is located in rural southern Indiana along the Ohio River, with emissions from three 208-m stacks during this study (Figure 16). The area immediately north of the facility is characterized by cliffs rising about 115 m above the river and intersected by creek valleys. Six nearby SO2 monitors (out to 16 km from the stacks) provided hourly averaged concentration data. Meteorological data from a nearby 60-m tower covered the two-year period from January 1, 1975 through December 31, 1976, although only the data from 1975 were used in this evaluation. This database was also used in a major EPA-funded evaluation of rural air quality dispersion models in the early 1980s.

Figure 16. Clifty Creek study area. Note: Grade elevation at the Clifty Creek Power Plant site is 143 m. The stack-top elevation is 351 m.

3. Evaluation methodology

3.1. AERMET/AERMOD comparisons

Two versions of AERMET/AERMOD are compared using robust highest concentrations and the EPA Protocol for determining the best performing model. AERMET 23132/AERMOD 23132 is compared against AERMET 24142/AERMOD 24142 with various combinations of adjusted or non-adjusted surface friction velocity (u*) and inclusion or exclusion of turbulence parameters (σv and σw). The modeled scenarios are:

• 23132_no_u*_with_turb: AERMET/AERMOD 23132 with no u* adjustment and turbulence included in the meteorological data.
• 23132_with_u*_no_turb: AERMET/AERMOD 23132 with u* adjustment and no turbulence included in the meteorological data.
• 23132_no_u*_no_turb: AERMET/AERMOD 23132 with no u* adjustment and no turbulence included in the meteorological data.
• 24142_no_u*_with_turb: AERMET/AERMOD 24142 with no u* adjustment and turbulence included in the meteorological data.
• 24142_with_u*_no_turb: AERMET/AERMOD 24142 with u* adjustment and no turbulence included in the meteorological data.
• 24142_no_u*_no_turb: AERMET/AERMOD 24142 with no u* adjustment and no turbulence included in the meteorological data.

3.2. Evaluation procedures

3.2.1. Robust highest concentrations

Robust highest concentrations (RHC) were calculated for each averaging period of each database. The RHC statistic is calculated as:

$$RHC = X(N) + \left[\bar{X} - X(N)\right] \times \ln\left[\frac{3N - 1}{2}\right] \qquad (1)$$

where $X(N)$ is the Nth largest value, $\bar{X}$ is the average of the $N-1$ largest values, and $N$ is the number of values exceeding the threshold value, usually 26. For the 1-hour RHC, the RHC is calculated based on N = 26 across all modeled and monitored values (i.e., not paired in time or space). For the 3-hour and 24-hour averages, the RHC is calculated separately for each monitor within the network for observations and modeled values. The highest observed RHC is then compared to the highest modeled RHC.

3.2.2. EPA Protocol for determining best performing model

AERMOD output, among the different meteorological datasets, was evaluated using the EPA's Protocol for Determining the Best Performing Model, or Cox-Tikvart method (U.S. EPA, 1992; Cox and Tikvart, 1990). The protocol uses a two-step process for determining the better performing model when comparing models. The first step is a screening test that eliminates models that fail to perform at a minimal operational level. The second step applies to the models that pass the screening test and uses bootstrapping to generate a probability distribution of feasible outcomes (U.S. EPA, 1992). This section discusses the methodology using the evaluation cases as examples.
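As an illustration of the RHC statistic from Section 3.2.1, the following is a minimal Python sketch. It assumes the standard Cox and Tikvart form of Equation 1, RHC = X(N) + [X̄ - X(N)] ln[(3N - 1)/2], with X̄ taken as the mean of the N - 1 largest values. The function name `robust_highest_concentration` and its arguments are hypothetical and are not part of AERMOD or any EPA tool.

```python
import math


def robust_highest_concentration(values, n=26):
    """Sketch of Equation 1 (assumed Cox-Tikvart form).

    values: iterable of concentrations (observed or modeled).
    n:      rank N used in the statistic; 26 is the usual choice in the text.
    """
    ranked = sorted(values, reverse=True)
    if len(ranked) < n:
        raise ValueError("need at least n values to compute the RHC")
    x_n = ranked[n - 1]                      # X(N): the Nth largest value
    x_bar = sum(ranked[: n - 1]) / (n - 1)   # mean of the N-1 largest values
    return x_n + (x_bar - x_n) * math.log((3 * n - 1) / 2)


if __name__ == "__main__":
    # Hypothetical concentrations, for illustration only.
    sample = [float(v) for v in range(1, 101)]
    print(robust_highest_concentration(sample))
```

For the 3-hour and 24-hour averages described above, the same function would be applied monitor by monitor and the largest resulting value retained.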
The first step is to perform a screening test based on fractional bias:

$$FB = 2\left[\frac{OB - PR}{OB + PR}\right] \qquad (2)$$

where FB is the fractional bias, OB is the average of the highest 25 observed concentrations, and PR is the average of the highest 25 predicted concentrations. The fractional bias is also calculated for the standard deviation, where OB and PR refer to the standard deviation of the highest 25 observed and predicted concentrations, respectively. This is done across all monitors and modeled receptors, unpaired in time and space, for the 3-hour and 24-hour averaging periods. The fractional bias of the means is plotted against the fractional bias of the standard deviation. Biases that exceed a factor-of-two under-prediction or over-prediction are considered grounds for excluding a model from further evaluation (U.S. EPA, 1992). Models that pass the screening test are subjected to a more comprehensive statistical comparison that involves both an operational and a scientific component using the RHC (Eq. 1). For the evaluations presented here, the screening step was skipped.

The operational component measures the model's ability to estimate the concentration statistics most directly used for regulatory purposes, and the scientific component evaluates the model's ability to perform accurately throughout the range of meteorological conditions and the geographic area of concern (U.S. EPA, 1992).

The operational component of the evaluation compares performance in terms of the largest network-wide RHC test statistic. The RHC is calculated separately for each monitor within the network for observations and modeled values. The highest observed RHC is then compared to the highest modeled RHC using Equation 2, where RHC now replaces the means of the top 25 values of observed or modeled concentrations. Absolute fractional bias (the absolute value of fractional bias), AFB, is calculated for 3-hour and 24-hour averages.
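The screening statistics and the absolute fractional bias described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the protocol's reference implementation: the function names are hypothetical, the sample standard deviation is used (the choice between sample and population forms does not change FB when both sets hold the same number of values, because only the ratio matters), and the 2/3 limit is simply the value of |FB| from Equation 2 when PR is twice or half of OB.

```python
from statistics import mean, stdev


def fractional_bias(observed, predicted):
    """Equation 2: FB = 2 (OB - PR) / (OB + PR)."""
    return 2.0 * (observed - predicted) / (observed + predicted)


def screening_fractional_biases(observed, predicted, top_n=25):
    """FB of the means and of the standard deviations of the top_n values.

    The two series are ranked independently, i.e. unpaired in time and space.
    """
    obs_top = sorted(observed, reverse=True)[:top_n]
    pre_top = sorted(predicted, reverse=True)[:top_n]
    fb_mean = fractional_bias(mean(obs_top), mean(pre_top))
    fb_std = fractional_bias(stdev(obs_top), stdev(pre_top))
    return fb_mean, fb_std


def passes_screening(fb_mean, fb_std, limit=2.0 / 3.0):
    """Assumed factor-of-two test: both biases must lie within +/- limit."""
    return abs(fb_mean) <= limit and abs(fb_std) <= limit


def absolute_fractional_bias(rhc_observed, rhc_modeled):
    """AFB: Equation 2 applied to RHC values, then the absolute value."""
    return abs(fractional_bias(rhc_observed, rhc_modeled))
```

For instance, the Kincaid 3-hour RHC values reported in Table 2 (observed 618, modeled 615 for the scenario without u* adjustment and with turbulence) give `absolute_fractional_bias(618, 615)` of about 0.005.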
The scientific component of the evaluation is also based on absolute fractional bias, but the bias is calculated using the RHC for each meteorological condition and monitor. The meteorological conditions are a function of atmospheric stability and wind speed. For the purposes of these studies, six unique conditions were defined based on two wind speed categories (below and above 2.0 m/s) and three stability categories: unstable, neutral, and stable.³ In this evaluation, only 1-hour concentrations are used, and the AFB is based on RHC values paired in space and stability/wind speed combination.

³ In U.S. EPA (1992), the three stability categories are related to the Pasquill-Gifford categories: unstable being A, B, and C; neutral being D; and stable being E and F. Because AERMOD does not use the stability categories, the stability class was determined from the Monin-Obukhov length and surface roughness using the methodology from AERMOD subroutine LTOPG.

A composite performance measure (CPM) is calculated from the 1-hour, 3-hour, and 24-hour AFBs:

$$CPM = \frac{1}{3} \times \overline{AFB_{i,j}} + \frac{2}{3} \times \left[\frac{AFB_3 + AFB_{24}}{2}\right] \qquad (3)$$

where $AFB_{i,j}$ is the absolute fractional bias for monitor i and meteorological condition j, $\overline{AFB_{i,j}}$ is the average absolute fractional bias across all monitors and meteorological conditions, $AFB_3$ is the absolute fractional bias for the 3-hour average, and $AFB_{24}$ is the absolute fractional bias for the 24-hour average.

Once CPM values have been calculated for each model, a model comparison measure (MCM) is calculated to compare the models:

$$MCM_{A,B} = CPM_A - CPM_B \qquad (4)$$

where $CPM_A$ is the CPM for model A and $CPM_B$ is the CPM for model B. When more than two models are being compared simultaneously, the number of MCM values is equal to the number of unique combinations of two models. For Martins Creek, Lovett, Westvaco, and Kincaid, there are four scenarios each, so there were six MCM comparisons for each location. For Bowline, Baldwin, and Clifty Creek, there are three scenarios each, resulting in three MCM comparisons for each location.

To determine whether the difference between models was statistically significant, the standard error was calculated. A bootstrapping technique was used to create 1000 sample years based on the methodology outlined in U.S. EPA (1992). The original data are divided into 3-day blocks. Within each season, the 3-day blocks are sampled with replacement until a total season is created. The process is repeated until 1000 bootstrap years are created.⁴ The standard error is calculated as the standard deviation of the bootstrap-generated outcomes for the MCM.

⁴ The bootstrapping was completed using the SAS® SURVEYSELECT procedure with resampling for 1000 replicates.

The magnitude and sign of the MCM are indicative of the relative performance of each pair of models. The smaller the CPM, the better the overall performance of the model. This means that for two models, A and B, a negative difference between the CPM for A and the CPM for B implies that model A is performing better (model A has a smaller CPM), while a positive difference indicates that model B is performing better.

Since more than two scenarios are being evaluated in these studies, simultaneous confidence intervals of 90 and 95 percent were calculated. These were calculated by finding the 90th and 95th percentiles of the distribution across all MCM values from the bootstrapping procedure for all model comparisons. The confidence intervals were then found by:

$$CI_{X,A,B} = MCM_{A,B} \pm c_X s_{A,B} \qquad (5)$$

where $CI_{X,A,B}$ is the confidence interval for X percent (90 or 95) for models A and B, $MCM_{A,B}$ is as defined in Equation 4, $c_X$ is the X percentile of the MCM values from the bootstrap results, and $s_{A,B}$ is the standard deviation of the bootstrap MCM results for models A and B. Note that in Equation 5, $MCM_{A,B}$ is the MCM value from the original data, not the bootstrap results.

For each pair of model comparisons, the significance of the model comparison measure depended on whether the confidence interval overlapped zero. If the confidence interval overlapped zero, the two models were not performing at a level that was considered statistically different. If it did not overlap zero, there was a statistically significant difference between the two models.

3.3. Results

3.3.1. Turbulence cases

Table 2 lists the hourly observed and modeled RHC, as well as the 3-hour and 24-hour RHC for applicable databases, for the databases that initially included turbulence. Table 3 lists the RHC values for those databases initially without turbulence. The modeled scenario(s) closest to the observed RHC are highlighted in gray for each database. Results in Table 2 indicate that the 23132 and 24142 modeled RHCs are identical. Results in Table 2 also indicate that, for the most part, for the databases with turbulence data, the 23132 or 24142 cases without the u* adjustment and with turbulence data were the better performers against observations. In a few instances, depending on the averaging period, the cases with the u* adjustment and no turbulence, or the cases with no u* adjustment and no turbulence, were the better performers. Table 3 indicates that for the non-turbulence databases, the use of adjusted u* increased model performance in some cases and decreased or did not change model performance in others, depending on the averaging period or stack height. For the databases that had multiple averaging periods (Martins Creek, Lovett, Westvaco, and Kincaid), no single scenario performed consistently better across the averaging periods. For example, for Martins Creek, 23132_with_u*_no_turb and 24142_with_u*_no_turb performed better for the 24-hour averaging period, while 23132_no_u*_with_turb and 24142_no_u*_with_turb performed better for the 1-hour and 3-hour periods.
For DAEC, which had observed concentrations for emissions from different stack heights, the better performing model appeared to depend on stack height. Overall, it appears that the use of adjusted u* did not increase model performance for most of the cases and that the inclusion of turbulence is more important to model performance than the u* adjustment.

Table 2. Hourly, 3-hour, and 24-hour RHC for turbulence cases. The best performing models compared to the observed RHC are highlighted in gray.

                                 AERMOD 23132                     AERMOD 24142
Database       Avg.    Observed  No u*      With u*    No u*      No u*      With u*    No u*
               period            with turb  no turb    no turb    with turb  no turb    no turb
               (hr)
Martins Creek  1       1216      1133       1034       1427       1133       1034       1427
               3       461       497        505        655        497        505        655
               24      79        143        132        158        143        132        158
Tracy          1       15        13         18         25         13         18         25
Lovett         1       426       374        538        622        374        538        622
               3       187       169        239        254        169        239        254
               24      52        48         63         68         48         63         68
Westvaco       1       2757      2460       1252       2091       2460       1252       2091
               3       1575      1731       783        1654       1731       783        1654
               24      480       522        457        613        522        457        613
DAEC (h=1m)    1       346       240        188        222        240        188        222
DAEC (h=24m)   1       253       84         71         75         84         71         75
DAEC (h=46m)   1       140       91         59         99         91         59         99
EOCR           1       3763      5822       5731       8250       5822       5731       8250
Alaska         1       6         5          8          8          5          8          8
Prairie Grass  1       925087    987307     867946     883444     987307     867946     883444
Indianapolis   1       6         4          4          5          4          4          5
Kincaid        1       1611      1312       717        717        1312       717        717
               3       618       615        470        470        615        470        470
               24      113       101        167        167        101        167        167

3.3.2. Non-turbulence cases

Table 3 lists the RHC values for the non-turbulence databases for 23132 and 24142. In these databases, because of the lack of turbulence in the meteorological data, the u* adjustment has more impact in improving model performance. Also, the results indicate that the changes made to AERMOD between 23132 and 24142 did not impact these findings.

Table 3. Hourly, 3-hour, and 24-hour RHC for non-turbulence cases. The best performing models compared to the observed RHC are highlighted in gray.

                                     AERMOD 23132          AERMOD 24142
Database           Avg.    Observed  With u*    No u*      With u*    No u*
                   period            no turb    no turb    no turb    no turb
                   (hr)
AGA                1       296       262        281        262        281
Millstone (Freon)  1       76        96         101        96         101
Millstone (SF6)    1       79        33         35         33         35
Bowline            1       763       552        547        552        547
                   3       469       514        523        514        523
                   24      204       307        290        307        290
Baldwin            1       2348      3531       3531       3531       3531
                   3       920       1183       1184       1183       1184
                   24      209       230        230        230        230
Clifty Creek       1       1451      1360       1360       1360       1360
                   3       796       871        870        871        870
                   24      243       170        165        170        165

3.3.3. Statistical evaluations

While the review of RHC can indicate general model performance, the EPA Protocol for Determining Best Performing Model (U.S. EPA, 1992) provides a statistical basis for determining the best performing model. Tables 4 and 5 show the composite performance measure (CPM) for the turbulence databases and non-turbulence databases, respectively. For the databases with turbulence (Table 4), the best performing models for Martins Creek were the cases with adjusted u* and no turbulence, but for the remaining areas, the better performing models were the scenarios with no adjusted u* and with turbulence. This means that, in general, the use of adjusted u* did not increase model performance and that the use of turbulence was important to model performance. For the non-turbulence databases (Table 5), the use of adjusted u* increased model performance for Baldwin and Clifty Creek, while for Bowline, the use of adjusted u* slightly decreased model performance. For all cases, the CPM values were identical for the 23132 and 24142 model versions, suggesting that the changes between 23132 and 24142 had minimal to no impact on model performance. This was expected given the changes made to AERMET and AERMOD and the absence of changes to the adjusted u* equations.

Table 4. Composite Performance Measure (CPM) for turbulence cases. Scenarios with the lowest CPMs for each study location are highlighted in gray.
Scenario                 Martins Creek  Lovett   Westvaco   Kincaid
23132 no u* with turb    0.35           0.40     0.41       0.37
23132 with u* no turb    0.31           0.52     0.60       0.56
23132 no u* no turb      0.49           0.58     0.44       0.56
24142 no u* with turb    0.35           0.40     0.41       0.37
24142 with u* no turb    0.31           0.52     0.60       0.56
24142 no u* no turb      0.49           0.58     0.44       0.56

Table 5. Composite Performance Measure (CPM) for non-turbulence databases. Scenarios with the lowest CPMs for each study location are highlighted in gray.

Scenario                 Bowline  Baldwin  Clifty Creek
23132 no u* no turb      0.47     0.46     0.51
23132 with u* no turb    0.50     0.45     0.49
24142 no u* no turb      0.47     0.46     0.51
24142 with u* no turb    0.50     0.45     0.49

Tables 6 through 9 show the model comparison measure (MCM) for the turbulence databases, while Tables 10 through 12 show the MCM for the non-turbulence databases. Also shown are the 90% and 95% confidence intervals of the MCM based on the bootstrapping results. Confidence intervals highlighted in gray indicate statistical significance for the specific MCM cases. The original pairings of 23132 scenarios to other 23132 scenarios are shown for comparison to the analogous 24142 pairings. MCM pairings for the same u*/turbulence scenarios between 24142 and 23132 are also shown to indicate whether the model changes made a difference to the results. For all such cases, the MCM is zero or negligibly small (on the order of 10⁻⁵).

Martins Creek (Table 6): The better performing models were 23132 and 24142 with adjusted u* and no turbulence. However, the MCM results indicate that the difference between adjusted u* with no turbulence and no adjusted u* with turbulence is not statistically significant for either 23132 or 24142.
The MCM pairings that were statistically significant at the 90% confidence level were the differences between the no adjusted u*/no turbulence case and the other two cases (no adjusted u* with turbulence, or adjusted u* with no turbulence) for both 23132 and 24142, indicating that using neither adjusted u* nor turbulence noticeably decreases model performance. At the 95% confidence level, the two statistically significant differences for 24142 were between no adjusted u*/no turbulence and no adjusted u*/with turbulence, and between no adjusted u*/no turbulence and adjusted u*/no turbulence.

Lovett (Table 7): All AERMET/AERMOD 23132 cases are statistically insignificant when compared to other AERMET/AERMOD 23132 cases at both the 90% and 95% CI, with the exception of the no u* and no turbulence case compared to the no u* with turbulence case. For 24142, all cases are statistically insignificant compared to each other at the 90% CI, with the exception of the 24142 no u* and no turbulence case compared to the 24142 no u* with turbulence case. However, the lower bound of the 90% CI is close to zero.

Westvaco (Table 8): The use of adjusted u* decreases model performance significantly at both the 90% and 95% CI for both 23132 and 24142. The use of no adjusted u* and no turbulence also decreases model performance at a statistically significant level for both 23132 and 24142.

Kincaid (Table 9): None of the MCM differences were statistically significant at the 90% or 95% CI. The better performers were 23132 or 24142 with no u* adjustment and inclusion of turbulence, but as previously stated, these were not statistically different from the adjusted u* case or the case with no adjusted u* and no turbulence.

For the non-turbulence databases (Tables 10-12), the use of adjusted u* was statistically insignificant compared to not using adjusted u* and, as expected, the MCM values indicated no difference between 23132 and 24142.
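The relationship between the CPM tables and the MCM tables can be checked with a short Python sketch. It assumes, consistent with the sign convention described in Section 3.2.2, that the MCM for a pair is the CPM of the first scenario minus the CPM of the second; the function names are hypothetical, and the numbers are the rounded values printed in Tables 5, 6, and 10, so small rounding differences from the published MCMs are possible.

```python
def model_comparison_measure(cpm_a, cpm_b):
    """MCM for the pair (A, B): negative favors scenario A, positive favors scenario B."""
    return cpm_a - cpm_b


def is_significant(lower, upper):
    """A difference is treated as significant when the interval excludes zero."""
    return not (lower <= 0.0 <= upper)


# CPM values from Table 5, 23132 "no u* no turb" minus 23132 "with u* no turb".
bowline = model_comparison_measure(0.47, 0.50)
clifty_creek = model_comparison_measure(0.51, 0.49)
print(round(bowline, 2), round(clifty_creek, 2))

# 90% confidence bounds as listed for specific pairings.
print(is_significant(-0.11, 0.05))  # Bowline, Table 10, first row
print(is_significant(0.07, 0.29))   # Martins Creek, Table 6, no u* no turb vs with u* no turb
```

The two differences round to -0.03 and 0.02, matching the MCM values listed for Bowline and Clifty Creek, and the interval check reproduces the reading of the tables: the Bowline interval spans zero, while the Martins Creek interval does not.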
Table 6. Martins Creek Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.

                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 with u* no turb - 23132 no u* with turb  -0.03      -0.14        0.07         -0.16        0.09
23132 no u* no turb - 23132 no u* with turb    0.14       0.03         0.26         -0.003       0.29
23132 no u* no turb - 23132 with u* no turb    0.18       0.07         0.29         0.04         0.31
24142 no u* no turb - 23132 no u* no turb      0          -0.13        0.13         -0.16        0.16
24142 no u* with turb - 23132 no u* with turb  0          -0.10        0.10         -0.12        0.12
24142 with u* no turb - 23132 with u* no turb  0          -0.12        0.12         -0.14        0.14
24142 with u* no turb - 24142 no u* with turb  -0.03      -0.14        0.06         -0.15        0.09
24142 no u* no turb - 24142 no u* with turb    0.14       0.03         0.26         0.007        0.28
24142 no u* no turb - 24142 with u* no turb    0.18       0.07         0.29         0.05         0.31

Table 7. Lovett Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.

                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 with u* no turb - 23132 no u* with turb  0.12       -0.05        0.30         -0.08        0.34
23132 no u* no turb - 23132 no u* with turb    0.18       0.01         0.35         -0.0         0.39
23132 no u* no turb - 23132 with u* no turb    0.05       -0.05        0.14         -0.06        0.17
24142 no u* no turb - 23132 no u* no turb      0          -0.12        0.12         -0.14        0.14
24142 no u* with turb - 23132 no u* with turb  0          -0.13        0.12         -0.15        0.15
24142 with u* no turb - 23132 with u* no turb  0          -0.11        0.11         -0.13        0.13
24142 with u* no turb - 24142 no u* with turb  0.12       -0.04        0.30         -0.08        0.33
24142 no u* no turb - 24142 no u* with turb    0.18       0.001        0.36         -0.03        0.39
24142 no u* no turb - 24142 with u* no turb    0.05       -0.04        0.15         -0.06        0.16

Table 8. Westvaco Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.
                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 with u* no turb - 23132 no u* with turb  0.19       0.05         0.33         0.02         0.36
23132 no u* no turb - 23132 no u* with turb    0.03       -0.05        0.12         -0.07        0.13
23132 no u* no turb - 23132 with u* no turb    -0.16      -0.31        -0.01        -0.34        0.02
24142 no u* no turb - 23132 no u* no turb      0          -0.09        0.09         -0.11        0.11
24142 no u* with turb - 23132 no u* with turb  0          -0.08        0.08         -0.09        0.09
24142 with u* no turb - 23132 with u* no turb  0          -0.07        0.07         -0.09        0.09
24142 with u* no turb - 24142 no u* with turb  0.19       0.04         0.34         0.01         0.37
24142 no u* no turb - 24142 no u* with turb    0.03       -0.05        0.11         -0.07        0.13
24142 no u* no turb - 24142 with u* no turb    -0.16      -0.31        -0.01        -0.34        0.02

Table 9. Kincaid Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.

                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 with u* no turb - 23132 no u* with turb  0.19       -0.27        0.66         -0.32        0.70
23132 no u* no turb - 23132 no u* with turb    0.19       -0.29        0.67         -0.34        0.72
23132 no u* no turb - 23132 with u* no turb    -5.1×10⁻⁴  -0.13        0.13         -0.15        0.15
24142 no u* no turb - 23132 no u* no turb      2.0×10⁻⁵   -0.14        0.14         -0.16        0.16
24142 no u* with turb - 23132 no u* with turb  6.0×10⁻⁵   -0.56        0.51         -0.61        0.61
24142 with u* no turb - 23132 with u* no turb  2.0×10⁻⁵   -0.14        0.14         -0.15        0.15
24142 with u* no turb - 24142 no u* with turb  0.19       -0.27        0.65         -0.32        0.70
24142 no u* no turb - 24142 no u* with turb    0.19       -0.28        0.66         -0.33        0.71
24142 no u* no turb - 24142 with u* no turb    -5.1×10⁻⁴  -0.13        0.13         -0.14        0.14

Table 10. Bowline Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.
                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 no u* no turb - 23132 with u* no turb    -0.03      -0.11        0.05         -0.12        0.06
24142 no u* no turb - 23132 no u* no turb      0.0        -0.10        0.10         -0.12        0.12
24142 with u* no turb - 23132 with u* no turb  0.0        -0.09        0.09         -0.12        0.12
24142 no u* no turb - 24142 with u* no turb    -0.03      -0.10        0.04         -0.12        0.06

Table 11. Baldwin Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.

                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 no u* no turb - 23132 with u* no turb    0.002      -0.07        0.08         -0.09        0.09
24142 no u* no turb - 23132 no u* no turb      2.0×10⁻⁵   -0.10        0.10         -0.12        0.12
24142 with u* no turb - 23132 with u* no turb  2.0×10⁻⁵   -0.10        0.10         -0.12        0.12
24142 no u* no turb - 24142 with u* no turb    0.002      -0.07        0.08         -0.09        0.09

Table 12. Clifty Creek Model Comparison Measure (MCM) results. Confidence intervals highlighted in gray are significant at that percent.

                                                          90% confidence interval   95% confidence interval
MCM comparison                                 MCM        Lower bound  Upper bound  Lower bound  Upper bound
23132 no u* no turb - 23132 with u* no turb    0.02       -0.04        0.07         -0.05        0.08
24142 no u* no turb - 23132 no u* no turb      3×10⁻⁵     -0.07        0.07         -0.08        0.08
24142 with u* no turb - 23132 with u* no turb  3×10⁻⁵     -0.06        0.06         -0.08        0.08
24142 no u* no turb - 24142 with u* no turb    0.02       -0.04        0.07         -0.05        0.08

4. Summary/Conclusions

Based on the results of the RHC comparisons and the EPA protocol for determining best performing model, in situations involving turbulence, the use of turbulence without adjusting u* usually led to better performance than using adjusted u* without turbulence, especially in areas of complex terrain. In some instances, the adjusted u* cases performed statistically significantly worse than the non-adjusted u* cases.
For situations where turbulence is not in the meteorological data, the use of adjusted u* often resulted in little change or some increase in model performance. However, the databases without turbulence were in flat terrain and had tall stacks, so model performance for non-turbulence cases with complex terrain cannot be determined from these results. The results of the RHC comparisons and the EPA protocol also indicate that the changes made in AERMOD 24142 produced no unexpected differences from AERMOD 23132.

5. References

Cimorelli, A. J., S. G. Perry, A. Venkatram, J. C. Weil, R. J. Paine, R. B. Wilson, R. F. Lee, W. D. Peters, and R. W. Brode, 2005: AERMOD: A dispersion model for industrial source applications. Part I: General model formulation and boundary layer characterization. J. Appl. Meteor., 44, 682-693.

Cox, W. M., and J. A. Tikvart, 1990: A statistical procedure for determining the best performing air quality simulation model. Atmos. Environ., 24A(9), 2387-2395.

Perry, S. G., A. J. Cimorelli, R. J. Paine, R. W. Brode, J. C. Weil, A. Venkatram, R. B. Wilson, R. F. Lee, and W. D. Peters, 2005: AERMOD: A dispersion model for industrial source applications. Part II: Model performance against seventeen field-study databases. J. Appl. Meteor., 44, 694-708.

U.S. Environmental Protection Agency, 1992: Protocol for Determining Best Performing Model. EPA-454/R-92-025, U.S. Environmental Protection Agency, RTP, NC.

U.S. Environmental Protection Agency, 2003: AERMOD: Latest Features and Evaluation Results. EPA-454/R-03-003, U.S. Environmental Protection Agency, RTP, NC.