United States Environmental Protection Agency
Research and Development
Atmospheric Research and Exposure Assessment Laboratory
Research Triangle Park, NC 27711
EPA/600/SR-92/221
January 1993

EPA Project Summary

Application of a Data-Assimilating Prognostic Meteorological Model to Two Urban Areas

Sharon G. Douglas

A data-assimilating prognostic meteorological model, the Systems Applications International Mesoscale Model (SAIMM), was applied to generate meteorological fields suitable for photochemical modeling of two urban areas: Los Angeles, California, and the Lower Lake Michigan area, which includes Chicago, Illinois. The objectives of this study were to test the ability of the SAIMM to provide accurate meteorological fields for photochemical modeling of the Los Angeles and Lower Lake Michigan urban areas and to investigate the meteorological data requirements needed to support the use of the SAIMM four-dimensional data assimilation (FDDA) procedure.

Testing of the SAIMM/FDDA methodology was accomplished through a series of nudging-effectiveness and data-reduction simulations. For Los Angeles, the SAIMM/FDDA procedure was tested using observational data collected during the 1987 Southern California Air Quality Study (SCAQS) and was applied to 25 June (one of the SCAQS episode days); for the Lower Lake Michigan area, the procedure was tested using observational data collected during the 1991 Lake Michigan Ozone Study (LMOS) and was applied to 26 June (one of the LMOS episode days). The results of the nudging-effectiveness experiments for both the Los Angeles and Lower Lake Michigan areas indicate that assimilation of both wind and temperature data provides the best representation of the meteorological fields.
The data-reduction simulation results indicate that even when the intensive SCAQS and LMOS data sets are reduced to what might be considered routine data sets, assimilation of the available wind and temperature data provides an improved representation of the meteorological fields when compared with a simulation in which no data assimilation was performed. Appropriate specification of the episode- and domain-dependent analysis and modeling parameters is essential to successful application of the technique.

This Project Summary was developed by EPA's Atmospheric Research and Exposure Assessment Laboratory, Research Triangle Park, NC, to announce key findings of the research project that is fully documented in a separate report of the same title (see Project Report ordering information at back).

Introduction

In this study we have used a data-assimilating prognostic meteorological model, the Systems Applications International Mesoscale Model (SAIMM), to generate meteorological fields suitable for photochemical modeling of two urban areas: Los Angeles, California, and the Lower Lake Michigan area, which includes Chicago, Illinois; Milwaukee, Wisconsin; Gary, Indiana; and Muskegon, Michigan. These areas were selected for study because both (1) have been designated ozone non-attainment areas by the U.S. Environmental Protection Agency (EPA) and continue to experience exceedances of the National Ambient Air Quality Standard (NAAQS) for ozone, (2) are characterized by complex mesoscale meteorology that cannot be accurately represented by routinely collected meteorological data, and (3) have been the setting for recent intensive air quality/meteorological data collection studies, and thus enhanced surface and upper-air meteorological data are available for a number of ozone episode days.

The objectives of this study were to test the ability of the SAIMM to provide accurate meteorological fields for the Los Angeles and Lower Lake Michigan urban areas and to investigate the meteorological data requirements needed to support the use of the SAIMM four-dimensional data assimilation (FDDA) procedure. To this end, a series of nudging-effectiveness and data-reduction experiments were performed for each of the areas. Model performance for Los Angeles was evaluated (both graphically and statistically) using data from the 1987 Southern California Air Quality Study (SCAQS); for the Lower Lake Michigan area, data from the 1991 Lake Michigan Ozone Study (LMOS) were used.

Testing and Evaluation Procedures

Modeling Procedures

Prognostic meteorological models numerically solve an approximation of the equations that govern atmospheric behavior. Beginning with a set of initial conditions that represent the state of the atmosphere at the initial time, a prognostic model simulates the response of the atmosphere within the domain of interest to differential heating of the earth's surface. Prognostic models provide a dynamically consistent, physically realistic, three-dimensional representation of the wind as well as other meteorological variables, such as potential temperature and planetary-boundary-layer height. However, the prognostic model solution may not always replicate observations and, therefore, may not accurately represent day-specific meteorology as is required if the fields are to be used for air quality modeling. Numerical approximations, physical parameterizations, and initialization problems represent a few of the potential sources of error in meteorological models that can cause the model solution to deviate from actual atmospheric behavior.

The objective of four-dimensional data assimilation (FDDA) is to improve the agreement between the simulated fields and observed data and, thus, provide more accurate meteorological fields for photochemical modeling of historical episode days. Using this procedure, observed data are incorporated into the prognostic model solution during the course of a simulation. The most common approach to FDDA is Newtonian "nudging," in which the prognostic variables are relaxed or "nudged" toward the observational data by additional forcing terms in the prognostic model equations. The general form of the prognostic equation for a variable $\alpha$ is

$$\frac{\partial \alpha}{\partial t} = F(\alpha, \mathbf{x}, t) + G \, w_t \, w_{xyz} \, (\alpha' - \alpha) \qquad (1)$$

The first term on the right-hand side of equation (1), $F$, represents all of the model's physical processes. The second term is the nudging term. $G$ determines the relative weight of the nudging term with respect to the model's physical processes. Typical values of $G$ are $10^{-3}\ \mathrm{s}^{-1}$ for strong nudging and $10^{-4}\ \mathrm{s}^{-1}$ for weak nudging. The variables $w_t$ and $w_{xyz}$ are temporal and spatial weighting functions, and the quantity $\alpha'$ represents the analyzed or observed value.

Data assimilation is accomplished in the SAIMM using the Newtonian nudging technique. An objective analysis of the observational data is performed and a spatial weighting factor (based on data availability) is calculated. A temporal weighting factor varies linearly from 0 to 1 throughout the data assimilation interval. The degree to which the prognostic variables are then nudged toward the objective analysis is determined by the weighting information.

Overview of the Numerical Experiments

The numerical experiments were designed to test the ability of the SAIMM/FDDA procedure to provide accurate meteorological fields for the Los Angeles and Lower Lake Michigan urban areas and to investigate the meteorological data requirements needed to support the use of the SAIMM/FDDA methodology in each of these areas.
For Los Angeles, the SAIMM/FDDA procedure was used to simulate the meteorology of 25 June 1987 (one of the SCAQS episode days). For the Lower Lake Michigan area, the simulations were focused on 26 June 1991 (one of the LMOS episode days).

The first series of numerical simulation experiments, referred to as the nudging-effectiveness simulations, was designed to examine the effectiveness of nudging the prognostic wind and temperature variables separately and in combination with one another and to determine (roughly) the optimum nudging coefficients for each.

Use of the intensive data from SCAQS and LMOS allowed us, in a second series of experiments, to investigate (through stepwise reduction of the input data) the data requirements for successful FDDA. For the data-reduction experiments, the input data base was reduced in three stages so that after the third reduction the data base approximated that which would be available from a routine monitoring network. For these experiments, the spatial density of the monitoring sites was reduced, but the temporal distribution of the observations at each site was not changed. A model run was performed with the full data set, and then again after each site reduction, to determine how the model adjusted to a decrease in the spatial density of the observational data.

Evaluation Measures

A number of graphical and statistical analysis products were used to evaluate the simulation results. Graphical analysis was used to subjectively assess how well the assimilated data were represented in the meteorological fields and the effect of the data on the simulated fields in areas removed from the monitoring locations. Graphical analysis products include x-y, x-z, y-z, and z-t cross sections of the wind and temperature fields for several simulation times and locations.

Statistical analysis was used to quantify the differences between the simulated fields and the observed data and, thus, provided a basis for evaluating both the nudging-effectiveness and data-reduction experiments. The statistical analysis included the calculation of a number of statistical measures of bias including the mean residual, mean unsigned error, mean relative error (normalized bias), mean unsigned relative error (gross error), and the root mean square error.

Results

Testing and Evaluation for the Los Angeles Domain

The SAIMM/FDDA procedure was tested using observational data collected during the 1987 Southern California Air Quality Study (SCAQS) and was applied to 25 June (one of the SCAQS episode days).

Specification of the modeling domain (including the horizontal and vertical resolution) was based on geographical and meteorological considerations. The complex meteorology of this region is strongly influenced by the diurnal land/sea breeze cycle and by slope flows that develop along the steep terrain encompassing the Los Angeles basin. Therefore, the modeling domain includes the Los Angeles basin, adjacent offshore areas, and the surrounding terrain. The domain consists of 65 grid points in the west-east direction, 36 grid points in the south-north direction, and 22 vertical levels. The horizontal grid spacing is 5 km.

The simulation period used in this study included a full diurnal cycle, beginning and ending at 2300 PST. The SAIMM simulations were initialized using domain-scale profiles of temperature and specific humidity that were based on available meteorological sounding data from the Ontario, CA (ONT) monitoring site. The geostrophic wind, which is used to initialize the wind field and as a forcing term in the pressure gradient term of the momentum equations, was set equal to zero for all simulations.
Because the assimilated data contain information on all scales of motion (including those too large to be resolved within the modeling domain), we have assumed that it is not necessary to artificially impose the large-scale forcing.

Nudging-effectiveness simulations in which wind data, temperature data, and wind and temperature data, respectively, were assimilated indicate that, for this SAIMM application, assimilation of wind data alone improves the agreement between the simulated and observed winds and the representation of the wind field but does little to improve the agreement between the simulated and observed upper-air temperatures. Assimilation of temperature data alone improves the accuracy with which the upper-air temperatures are simulated but does little to improve the agreement between the simulated and observed winds. Assimilation of both the wind and temperature data not only improves the agreement between the simulated and observed winds and the agreement between the simulated and observed upper-air temperatures, but actually results in better agreement between the simulated and observed upper-air temperatures than does assimilation of temperature data alone.

The first two nudging-effectiveness simulations used nudging coefficients of 0.001 for wind and 0.0001 for temperature, respectively. The importance of the wind field in air-quality modeling and the indirect benefits derived from the assimilation of the wind data support the use of a larger nudging coefficient for the assimilation of the wind data than for the assimilation of the temperature data. However, further analysis of the simulation results indicated that strong nudging of the wind components toward the analyzed data created some unrealistic airflow patterns over regions where data were not available. Therefore, the nudging coefficient for assimilation of the wind data was reduced to 0.0005 in the third nudging-effectiveness simulation.
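
The effect of the nudging coefficient in equation (1) can be illustrated with a minimal sketch. The code below is not the SAIMM implementation; it assumes a single scalar variable, a forward-Euler time step, a constant analyzed value, and (by default) no model physics, so that only the nudging term acts. The function names, the 30-s time step, and the 1-hour assimilation interval are hypothetical choices made for illustration. The temporal weight ramps linearly from 0 to 1 over the assimilation interval, as described under Modeling Procedures.

```python
def temporal_weight(t, t_start, t_end):
    """Linear ramp from 0 at t_start to 1 at t_end, clamped outside the interval."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    frac = (t - t_start) / (t_end - t_start)
    return min(1.0, max(0.0, frac))


def nudging_tendency(alpha, alpha_analyzed, G, w_t, w_xyz):
    """Nudging term of equation (1): G * w_t * w_xyz * (alpha' - alpha)."""
    return G * w_t * w_xyz * (alpha_analyzed - alpha)


def integrate(alpha0, alpha_analyzed, G, w_xyz, dt, n_steps,
              physics=lambda alpha, t: 0.0):
    """Forward-Euler integration of equation (1) for one scalar variable.

    `physics` stands in for F (all model physical processes); the default
    of zero is an assumption made purely to isolate the nudging term.
    """
    alpha = alpha0
    t_end = dt * n_steps
    for n in range(n_steps):
        t = n * dt
        w_t = temporal_weight(t, 0.0, t_end)
        alpha += dt * (physics(alpha, t)
                       + nudging_tendency(alpha, alpha_analyzed, G, w_t, w_xyz))
    return alpha


if __name__ == "__main__":
    # Hypothetical case: simulated wind component of 2 m/s, analyzed value 5 m/s,
    # one-hour assimilation interval with a 30-s time step.
    for G in (0.001, 0.0005, 0.0001):
        result = integrate(2.0, 5.0, G, w_xyz=1.0, dt=30.0, n_steps=120)
        print(f"G = {G}: alpha after 1 h = {result:.2f} m/s")
```

In this idealized setting a larger coefficient pulls the variable closer to the analyzed value within the interval, and a spatial weight of zero (no nearby data) leaves the model solution untouched. The sketch does not reproduce the unrealistic airflow patterns noted above, which arise from the interaction of strong nudging with the full model dynamics over data-sparse regions.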

A series of data-reduction experiments, using the SCAQS data base, was performed to investigate the response of the SAIMM/FDDA methodology to varying levels of data availability. To accomplish this, monitoring sites were eliminated from the data set in a series of three site-reduction exercises. A model run was performed with the full data set, and then again after each site reduction, to determine how the model adjusted to decreased amounts of observational data.

Although necessarily somewhat subjective, the data reductions were based primarily on geographic considerations. The goal was to produce, after the third site reduction, a data set which represented a routine meteorological monitoring network. For the Los Angeles basin, which contains an extraordinary number of routine surface meteorological monitoring sites, this meant reducing the number of sites beyond what is normally available for this area. The number of surface and upper-air monitoring sites used for each of the data-reduction experiments is given in Table 1. The nudging coefficients were assigned based on the results of the nudging-effectiveness simulations and were set equal to 0.0005 for the u and v wind components and 0.0001 for temperature.

Table 1. SCAQS Site Reductions

Data         Surface       Upper-Air     Upper-Air
Reduction    Wind Sites    Wind Sites    Temperature Sites
0            71            15            14
1            51            10            10
2            32             7             7
3            15             4             4

A thorough graphical and statistical analysis of the data-reduction simulation results indicates that, even when the data set is reduced to what might be considered a routine data set, assimilation of the available wind and temperature data provides an improved representation of the meteorological fields when compared with the no-FDDA simulation. The influence of the data on the simulations is not confined to the monitoring site locations but is propagated within the modeling domain and influences the evolution of the meteorology over data-sparse areas as well.
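
The statistical measures named under Evaluation Measures can be sketched as follows. The summary does not give their formulas, so the definitions here are assumptions based on common usage: the residual is taken as simulated minus observed, and the relative measures are normalized by the observed value. The function name and the sample values are hypothetical.

```python
import math


def evaluation_statistics(simulated, observed):
    """Compute paired simulated-vs-observed error statistics.

    Assumed conventions (not stated in the source): residual = simulated - observed,
    and relative errors are normalized by the observed value, which must be nonzero.
    """
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be nonzero for relative errors")

    n = len(simulated)
    residuals = [s - o for s, o in zip(simulated, observed)]
    relative = [r / o for r, o in zip(residuals, observed)]

    return {
        "mean_residual": sum(residuals) / n,
        "mean_unsigned_error": sum(abs(r) for r in residuals) / n,
        "normalized_bias": sum(relative) / n,            # mean relative error
        "gross_error": sum(abs(r) for r in relative) / n,  # mean unsigned relative error
        "rmse": math.sqrt(sum(r * r for r in residuals) / n),
    }


if __name__ == "__main__":
    # Hypothetical wind speeds (m/s) at two monitoring sites.
    stats = evaluation_statistics(simulated=[2.0, 6.0], observed=[4.0, 4.0])
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

The sample case shows why both signed and unsigned measures are reported: an underprediction at one site and an equal overprediction at another cancel in the mean residual and normalized bias, while the mean unsigned error, gross error, and root mean square error still reveal the disagreement.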

Testing and Evaluation for the Lower Lake Michigan Domain

For the Lower Lake Michigan area, the SAIMM/FDDA procedure was tested using observational data collected during the 1991 Lake Michigan Ozone Study (LMOS) and was applied to 26 June (one of the LMOS episode days).

The Lower Lake Michigan area includes Chicago, Illinois; Milwaukee, Wisconsin; Gary, Indiana; and Muskegon, Michigan. The lake breeze (driven by the horizontal temperature gradients created by the differential heating of the land and water surfaces) plays an important role in determining the meteorology of this area and results in complex mesoscale circulation patterns along the lake shore. To allow resolution of the lake-induced circulations, the entire southern portion of Lake Michigan is included in the modeling domain. The domain consists of 50 grid points in the west-east direction, 52 grid points in the south-north direction, and 20 vertical levels. The horizontal grid spacing is 5 km.

The simulation period for the Lower Lake Michigan area simulations included a full diurnal cycle, extending from 2300 CST 25 June to 2300 CST 26 June. The SAIMM simulations were initialized using domain-scale profiles of temperature and specific humidity that were based on sounding data from the Kankakee, IL (KANK) monitoring site. As in the SCAQS simulations, the geostrophic wind was set equal to zero.

The results of the first three nudging-effectiveness simulations, in which wind data, temperature data, and wind and temperature data, respectively, were assimilated, were quite disappointing. While assimilation of the observed data improved the agreement between the simulated and observed winds and upper-air temperatures locally, some physically unrealistic meteorological features appeared in the simulated wind and temperature fields.
Apparently, the information provided by the data had little effect on the simulation in data-sparse areas (i.e., this information was not propagated throughout the modeling domain). In preparing the analyses for FDDA, the user must specify maximum radii of influence for the interpolation of the data at the surface and aloft. In these initial simulations, the maximum radius of influence for the surface level was 20 km; aloft this value was set equal to 50 km. Additional objective analyses were prepared using maximum radii of influence of 50 and 100 km for the surface and upper levels, respectively. An additional nudging-effectiveness simulation was performed using the revised analyses. Increasing the radii of influence in the objective analysis of the data constrained the simulation over a broader geographical area and contributed to much improved simulation results. Use of a larger radius of influence for the interpolation of data over the Lower Lake Michigan domain than for the Los Angeles domain is justifiable due to the absence of terrain in the Lake Michigan area.

Nudging coefficients for this simulation were 0.0005 and 0.0001, respectively, for the wind and temperature variables. Assimilation of both the wind and temperature data improved the agreement between the simulated and observed winds and the agreement between the simulated and observed upper-air temperatures. As in the SCAQS nudging-effectiveness simulations, assimilation of the wind data in combination with the temperature data resulted in better agreement between the simulated and observed upper-air temperatures than assimilation of temperature data alone.

A series of data-reduction experiments, using the LMOS data base, was performed to investigate the response of the SAIMM/FDDA methodology to varying levels of data availability. To accomplish this, monitoring sites were eliminated from the data set in a series of three site-reduction exercises.
A model run was performed with the full data set, and then again after each site reduction, to determine how the model adjusted to decreased amounts of observational data, in a similar manner to the SCAQS runs.

The nudging coefficients were assigned based on the results of the nudging-effectiveness simulations and were set equal to 0.0005 for the u and v wind components and 0.0001 for temperature. The analyses for the data-reduction simulations were prepared using the larger radii of influence (50 km at the surface and 100 km aloft).

A thorough graphical and statistical analysis of the data-reduction simulation results indicates that even when the data set is reduced to what might be considered a routine data set, assimilation of the available wind and temperature data provides an improved representation of the meteorological fields when compared with the no-FDDA simulation.

Summary and Recommendations

Testing of the SAIMM/FDDA methodology for application to the Los Angeles and Lower Lake Michigan urban areas was accomplished through a series of nudging-effectiveness and data-reduction simulations. For Los Angeles, the SAIMM/FDDA procedure was tested using observational data collected during the 1987 Southern California Air Quality Study (SCAQS) and was applied to 25 June (one of the SCAQS episode days); for the Lower Lake Michigan area, the procedure was tested using observational data collected during the 1991 Lake Michigan Ozone Study (LMOS) and was applied to 26 June (one of the LMOS episode days).

To provide a basis from which to assess the FDDA simulations, the SAIMM was first exercised for both areas without data assimilation. The SAIMM simulation without FDDA seems to capture many of the important meteorological features of the SCAQS episode day, such as the sea breeze and the upslope and downslope flows; however, the observed data are not always well represented in the simulated fields.
In particular, the SAIMM simulation indicates westerly flow (outflow from the Los Angeles basin) earlier than observed, and the southerly flow that develops aloft during the evening hours is not simulated.

The SAIMM with a zero geostrophic wind and without FDDA is not able to simulate the 26 June meteorology of the Lower Lake Michigan region. While the model generates some physically realistic mesoscale circulation patterns, the prevailing southwesterly flow that is observed over the region during this episode day is not simulated. Due to the large differences between the LMOS no-FDDA simulated fields and the observed data, effective use of the FDDA methodology for this simulation represented a much greater challenge than for the SCAQS simulation.

The nudging-effectiveness experiments were designed to examine the effects of nudging the prognostic wind and temperature variables separately and in combination with one another and to determine (roughly) the optimum nudging coefficients for each. The simulation results for both the Los Angeles and Lower Lake Michigan areas indicate that assimilation of both wind and temperature data provides the best representation of the meteorological fields. The importance of the wind field in air-quality modeling and the indirect benefits derived from the assimilation of the wind data support the use of a larger nudging coefficient for the assimilation of the wind data than for the assimilation of the temperature data. However, strong nudging of the wind components can create some unrealistic airflow patterns over data-sparse regions. In both the SCAQS and LMOS simulations, the best overall simulation results were achieved with a 0.0005 nudging coefficient for assimilation of the wind data and a 0.0001 nudging coefficient for assimilation of the temperature data.

Specification of the maximum radii of influence for the interpolation of the data was an important consideration in the simulation of the LMOS episode.
Increasing the radii of influence in the objective analysis of the data constrained the simulation over a broader geographical area and contributed to much improved simulation results.

A series of data-reduction experiments, using the SCAQS and LMOS data bases, was performed to investigate the response of the SAIMM/FDDA methodology to varying levels of data availability. To accomplish this, monitoring sites were eliminated from the data sets in a series of three site-reduction exercises. A model run was performed with the full data set, and then again after each site reduction, to determine how the model adjusted to decreased amounts of observational data. The data-reduction simulation results for both the SCAQS and LMOS episode days indicate that even when the data set is reduced to what might be considered a routine data set, assimilation of the available wind and temperature data provides an improved representation of the meteorological fields when compared with the no-FDDA simulation. As the number of sites is reduced, the simulation errors increase and the effectiveness of the FDDA decreases (i.e., some unusual airflow patterns were simulated over data-sparse or unconstrained subregions of the modeling domain).

The SAIMM/FDDA methodology appears to be a promising technique for the generation of meteorological fields for photochemical modeling. Appropriate specification of the analysis and modeling parameters is essential to successful application of the technique. As these parameters will necessarily be episode- and domain-dependent, thorough testing of the model (including a no-FDDA simulation) and evaluation of the simulation results are recommended for each application. Guidelines for evaluation of the meteorological fields will be developed under Phase II of this study. Further study is required and anticipated in order to assess the utility of the SAIMM/FDDA methodology to provide accurate inputs for photochemical modeling.

Sharon G. Douglas is with Systems Applications International, San Rafael, CA 94903.
James M. Godowitch and Shao-Hang Chu are the EPA Project Officers (see below).
The complete report, entitled "Application of a Data-Assimilating Prognostic Meteorological Model to Two Urban Areas," (Order No. PB93-126571; Cost: $19.50, subject to change) will be available only from:
    National Technical Information Service
    5285 Port Royal Road
    Springfield, VA 22161
    Telephone: 703-487-4650
The EPA Project Officers can be contacted at:
    James M. Godowitch
    Atmospheric Research and Exposure Assessment Laboratory
    U.S. Environmental Protection Agency (MD-80)
    Research Triangle Park, NC 27711

    Shao-Hang Chu
    Office of Air Quality Planning and Standards
    U.S. Environmental Protection Agency (MD-14)
    Research Triangle Park, NC 27711