United States Environmental Protection Agency
Research and Development
Atmospheric Research and Exposure Assessment Laboratory
Research Triangle Park, NC 27711
EPA/600/SR-92/221  January 1993

EPA Project Summary

Application of a Data-Assimilating Prognostic Meteorological Model to Two
Urban Areas

Sharon G. Douglas
                A data-assimilating prognostic me-
              teorological model, the Systems Appli-
              cations International Mesoscale Model
              (SAIMM), was applied to generate me-
              teorological fields suitable for photo-
              chemical modeling of two urban areas:
              Los Angeles, California, and the Lower
              Lake Michigan area which includes Chi-
              cago,  Illinois. The  objectives of this
              study  were to test the  ability of the
              SAIMM to  provide accurate meteoro-
              logical fields for photochemical model-
              ing of the Los Angeles and Lower Lake
              Michigan urban areas and to investi-
              gate the meteorological  data require-
              ments needed  to support the use of
              the SAIMM four-dimensional  data as-
              similation (FDDA) procedure.
                Testing of the SAIMM/FDDA method-
              ology was accomplished  through a se-
              ries of nudging-effectiveness and
              data-reduction simulations. For Los
              Angeles, the SAIMM/FDDA procedure
              was tested using observational data
              collected during the 1987 Southern Cali-
              fornia  Air Quality Study  (SCAQS) and
              was applied to  25  June (one of the
              SCAQS episode days); for the Lower
              Lake Michigan area the procedure was
              tested  using observational data col-
              lected  during the 1991 Lake Michigan
              Ozone Study (LMOS) and was applied
              to 26 June  (one of the LMOS episode
              days).  The  results of the nudging-ef-
              fectiveness experiments  for both the
              Los Angeles and Lower Lake Michigan
              areas indicate that assimilation of both
              wind and temperature data provides the
              best representation of the meteorologi-
              cal fields.  The data-reduction simula-
 tion results indicate that even when
 the intensive SCAQS and LMOS data
 sets are reduced to what might be con-
 sidered routine data sets, assimilation
 of the available wind and temperature
 data provides an improved representa-
 tion of the meteorological fields when
 compared with a simulation in which
 no  data assimilation was performed.
 Appropriate specification of the episode
 and domain-dependent analysis and
 modeling parameters  is essential to
 successful application of the technique.
   This  Project Summary was devel-
 oped by EPA's Atmospheric Research
 and Exposure Assessment Laboratory,
 Research Triangle Park, NC,  to an-
 nounce key findings of  the research
 project that is fully documented in a
 separate report of the same title (see
 Project Report ordering information at
 back).

 Introduction
   In this study we have  used a data-
 assimilating  prognostic meteorological
 model, the Systems Applications Interna-
 tional Mesoscale Model (SAIMM), to gen-
 erate meteorological  fields suitable for
 photochemical modeling of  two urban ar-
 eas: Los Angeles, California, and the Lower
 Lake Michigan area which  includes Chi-
 cago, Illinois; Milwaukee, Wisconsin; Gary,
 Indiana; and Muskegon, Michigan. These
 areas were  selected for study because
 both of the  areas (1) have been desig-
 nated ozone non-attainment areas by the
 U.S.  Environmental Protection Agency
 (EPA) and  continue to experience
 exceedances of the National Ambient Air
Quality Standard (NAAQS) for ozone, (2)
are characterized by complex mesoscale
meteorology  that cannot  be accurately
represented by  routinely  collected me-
teorological data, and (3) have been the
setting for recent intensive  air quality/
meteorological data collection studies, and
thus enhanced surface and upper-air me-
teorological data are available for a num-
ber of ozone episode days.
   The objectives of this study were to test
the ability of the SAIMM to provide accu-
rate meteorological fields for the Los An-
geles and Lower Lake Michigan  urban
areas and to investigate the  meteorologi-
cal data  requirements needed to support
the use  of the SAIMM  four-dimensional
data assimilation (FDDA)  procedure. To
this end,  a series of nudging-effectiveness
and data-reduction experiments were per-
formed for each of the areas. Model per-
formance for Los Angeles was evaluated
(both graphically and statistically) using
data from the 1987 Southern  California Air
Quality Study (SCAQS); for the Lower Lake
Michigan area, data from  the  1991  Lake
Michigan Ozone Study (LMOS) were used.

Testing and Evaluation
Procedures

Modeling Procedures
    Prognostic meteorological models nu-
merically solve an  approximation  of the
equations that govern atmospheric behav-
ior. Beginning with a set of initial conditions
that represent the state of the atmosphere
at the initial time, a prognostic model simu-
lates the  response of the atmosphere within
the domain of interest to differential heat-
ing of the earth's surface. Prognostic mod-
els  provide a  dynamically consistent,
physically realistic, three-dimensional rep-
resentation of the wind as well as other
meteorological variables, such as potential
temperature and planetary-boundary-layer
height. However, the prognostic model so-
lution  may not always replicate observa-
tions and, therefore,  may not accurately
represent day-specific meteorology as is
required if the fields are to be used for air
quality modeling. Numerical  approxima-
tions, physical parameterizations,  and ini-
tialization problems represent a few of the
potential sources of error in meteorological
 models that can cause the model  solution
to deviate from actual atmospheric behav-
 ior.
    The objective of four-dimensional data
 assimilation (FDDA)  is to  improve the
 agreement between the simulated fields
 and observed data and, thus, provide more
 accurate meteorological fields for photo-
 chemical modeling of historical  episode
 days. Using this procedure, observed data
are incorporated into the prognostic model
solution during the course of a simulation.
The most common approach to FDDA is
Newtonian  "nudging" in  which the  prog-
nostic variables are relaxed or "nudged"
toward the observational data by additional
forcing terms in the prognostic model equations.
The general form of the prognostic
equation for a variable $\alpha$ is

   \[
   \frac{\partial \alpha}{\partial t} = F(\alpha, \mathbf{x}, t) + G \, w_t \, w_{xyz} \, (\alpha' - \alpha) \tag{1}
   \]

   The first term on the right-hand side of
equation (1), F, represents all of the model's
physical processes. The second term is
the nudging term. G determines the relative
weight of the nudging term with respect
to the model's physical processes.
Typical values of G are $10^{-3}$ s$^{-1}$ for strong
nudging and $10^{-4}$ s$^{-1}$ for weak nudging. The
variables $w_t$ and $w_{xyz}$ are temporal and spatial
weighting functions, and the quantity $\alpha'$
represents the analyzed or observed value.
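
   The effect of the nudging coefficient can be illustrated with a minimal sketch, which is not the SAIMM code. It integrates the nudging equation for a single scalar with a forward Euler step. The function name `nudge`, the time step, the default of no physical forcing, and the sample values are assumptions chosen only for illustration.

```python
def nudge(alpha0, alpha_obs, G, dt, n_steps,
          forcing=lambda alpha, t: 0.0, w_t=1.0, w_xyz=1.0):
    """Integrate d(alpha)/dt = F + G*w_t*w_xyz*(alpha_obs - alpha) with forward Euler.

    forcing stands in for the model physics F (zero by default, an assumption).
    """
    alpha = alpha0
    for k in range(n_steps):
        t = k * dt
        tendency = forcing(alpha, t) + G * w_t * w_xyz * (alpha_obs - alpha)
        alpha += dt * tendency
    return alpha


# One hour of integration with a 60 s step, starting 5 units below the observation.
strong = nudge(alpha0=0.0, alpha_obs=5.0, G=1e-3, dt=60.0, n_steps=60)
weak = nudge(alpha0=0.0, alpha_obs=5.0, G=1e-4, dt=60.0, n_steps=60)
print(f"strong nudging: {strong:.2f}, weak nudging: {weak:.2f}")
```

   With no other forcing, the strongly nudged value closes most of the gap to the observation within the hour, while the weakly nudged value closes only part of it.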
   Data assimilation is  accomplished in
the SAIMM using the Newtonian nudging
technique.  An objective  analysis of  the
observational data is performed and a spa-
tial weighting factor (based on data avail-
ability) is calculated. A temporal weighting
factor varies linearly from 0 to 1 throughout
the data assimilation interval.  The degree
to which the prognostic variables are then
nudged  toward the objective analysis is
determined by the weighting information.
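
   As a rough sketch of how the two weighting factors might combine, the functions below compute a linear temporal ramp and the resulting nudging tendency. The summary does not give the form of the spatial weighting factor, so it is treated here simply as a number between 0 and 1. All names and sample values are hypothetical.

```python
def temporal_weight(t, t_start, t_end):
    """Linear ramp from 0 at t_start to 1 at t_end, clamped outside the interval."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    fraction = (t - t_start) / (t_end - t_start)
    return min(1.0, max(0.0, fraction))


def nudging_tendency(alpha, alpha_analysis, G, w_t, w_xyz):
    """Nudging term G * w_t * w_xyz * (alpha_analysis - alpha)."""
    return G * w_t * w_xyz * (alpha_analysis - alpha)


# Halfway through a one-hour assimilation interval, at a grid point with full
# data support (w_xyz = 1, an assumed value), wind 2 m/s below the analysis.
w_t = temporal_weight(t=1800.0, t_start=0.0, t_end=3600.0)
tendency = nudging_tendency(alpha=10.0, alpha_analysis=12.0, G=5e-4, w_t=w_t, w_xyz=1.0)
```

   A grid point with no nearby data would carry a spatial weight of zero in this sketch, so the model physics alone would determine the solution there.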

Overview of the Numerical
Experiments
   The  numerical  experiments  were  de-
signed  to test the  ability  of the SAIMM/
FDDA procedure to provide accurate  me-
teorological fields for the Los Angeles and
Lower Lake Michigan urban areas and to
investigate the meteorological data require-
ments needed to support the use  of the
SAIMM/FDDA  methodology  in  each of
these areas. For Los Angeles, the SAIMM/
FDDA procedure was used to simulate the
meteorology  of 25 June 1987 (one of the
SCAQS episode days). For the Lower Lake
Michigan area, the simulations were fo-
cused on 26 June 1991 (one of the LMOS
episode days).
    The first series of numerical simulation
 experiments, referred to as the nudging-
 effectiveness simulations, were designed
to examine the effectiveness of nudging
the prognostic wind and temperature vari-
 ables separately and  in combination  with
 one  another and  to determine (roughly)
the optimum nudging coefficients for each.
    Use of  the intensive data from SCAQS
 and LMOS allowed us, in  a second series
 of experiments, to investigate  (through
stepwise reduction of the input data), the
data requirements for successful FDDA.
For the data-reduction experiments, the
input data base was reduced in three stages
so that after the third reduction the data
base approximated that which would be
available from a  routine monitoring  net-
work. For these experiments,  the spatial
density  of the  monitoring sites was  re-
duced, but the temporal distribution of the
observations at each site was not changed.
A model run was performed with the full
data set, and  then again after each site
reduction to determine how the model ad-
justed to a decrease in the spatial density
of the observational data.

Evaluation Measures
   A number  of graphical and statistical
analysis  products were  used to evaluate
the simulation results. Graphical analysis
was used to subjectively assess how well
the assimilated data  were represented in
the meteorological fields and the effect of
the data on  the simulated fields in areas
removed from  the monitoring locations.
Graphical analysis products include x-y, x-
z, y-z, and z-t cross  sections  of the wind
and temperature fields for several simula-
tion times and locations.
    Statistical analysis was used to quan-
tify the differences between the simulated
fields and the observed data and, thus,
provided a basis for evaluating both the
nudging-effectiveness and data-reduction
experiments. The statistical analysis in-
cluded the calculation of a number of sta-
tistical measures of bias including the mean
residual, mean unsigned error, mean rela-
tive error (normalized bias), mean unsigned
relative error  (gross error), and the root
mean square error.
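
    The measures named above can be sketched as follows. The summary does not state the sign convention or the normalization, so this sketch assumes that the residual is simulated minus observed and that relative errors are normalized by the observed value. The function name and sample values are hypothetical.

```python
import math


def evaluation_statistics(simulated, observed):
    """Return the five summary measures for paired simulated/observed values."""
    if len(simulated) != len(observed) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    residuals = [s - o for s, o in zip(simulated, observed)]
    # Relative errors are undefined where the observation is zero, so skip those pairs.
    relative = [(s - o) / o for s, o in zip(simulated, observed) if o != 0]

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "mean_residual": mean(residuals),
        "mean_unsigned_error": mean([abs(r) for r in residuals]),
        "mean_relative_error": mean(relative),
        "mean_unsigned_relative_error": mean([abs(r) for r in relative]),
        "rmse": math.sqrt(sum(r * r for r in residuals) / len(residuals)),
    }


# Three made-up wind speed pairs (m/s), for illustration only.
stats = evaluation_statistics([2.0, 4.0, 6.0], [1.0, 5.0, 6.0])
```

    In this example the positive and negative residuals cancel in the mean residual, which is why the unsigned measures and the root mean square error are reported alongside it.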

Results

 Testing and Evaluation for the
Los Angeles Domain
    The  SAIMM/FDDA  procedure  was
tested using observational  data collected
during the 1987 Southern California Air
Quality Study (SCAQS) and was applied
to  25 June (one of  the  SCAQS episode
days).
    Specification of the  modeling domain
 (including the horizontal and vertical reso-
 lution) was based on  geographical  and
 meteorological considerations. The com-
 plex meteorology of this region is strongly
 influenced by the diurnal land/sea breeze
 cycle and by slope flows that develop along
the steep terrain  encompassing the Los
 Angeles basin. Therefore, the modeling
 domain  includes the Los Angeles basin,
 adjacent offshore areas, and the surrounding
terrain. The domain consists of 65 grid
points  in the west-east direction, 36 grid
points in the south-north direction, and 22
vertical levels. The horizontal grid spacing
is 5 km.
   The simulation period used in this study
included a full diurnal cycle, beginning and
 ending at 2300 LST. The SAIMM simulations
were initialized using domain-scale
profiles of temperature and specific humid-
ity that were based on available meteoro-
logical sounding data from the Ontario, CA
(ONT),  monitoring site. The geostrophic
wind, which is used to initialize the wind
field  and as a forcing term  in the pressure
gradient term of the momentum equations,
was  set equal to  zero for  all simulations.
Because the assimilated data contain  in-
formation on all scales of motion (including
those too  large to be resolved within the
modeling domain), we have assumed that
it is not necessary to artificially impose the
large-scale forcing.
   Nudging-effectiveness  simulations in
which wind data, temperature data, and
wind and temperature data, respectively,
were assimilated  indicate that, for this
SAIMM application, assimilation of wind
data alone improves the agreement be-
tween the simulated and observed winds
and  the representation of the wind field
but does little to improve the agreement
between the simulated and observed up-
per-air temperatures. Assimilation of tem-
perature data alone improves accuracy
with which the upper-air temperatures are
simulated  but does little to  improve the
agreement between the simulated and
observed winds. Assimilation of both the
wind and temperature data not only im-
proves the agreement between the simu-
lated  and  observed  winds  and the
agreement between the simulated and
observed upper-air temperatures, but ac-
tually results in better agreement between
the simulated and observed upper-air tem-
peratures than does  assimilation of tem-
perature data alone.
   The first two  nudging-effectiveness
simulations  utilized nudging coefficients
equal to 0.001 for wind and 0.0001 for
temperature, respectively. The importance
of the wind field in air-quality modeling and
the indirect benefits derived from the as-
similation of the wind data support the use
of a larger nudging coefficient for the as-
similation of the wind data than for the
assimilation of the temperature data. How-
ever, further analysis of the simulation re-
sults indicated that strong  nudging  of the
wind components toward the analyzed data
created some  unrealistic airflow patterns
over  regions where data were not avail-
able. Therefore, the nudging coefficient for
assimilation of the wind data was reduced
to 0.0005 in the third nudging-effectiveness
simulation.
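
   One way to read these coefficients is as relaxation rates. Assuming they are expressed in s^-1, like the typical values of G quoted in the Modeling Procedures section, the reciprocal 1/G is the e-folding time over which nudging alone would reduce a model-observation difference by a factor of e. The short sketch below works out that arithmetic; the function name and labels are hypothetical.

```python
def e_folding_time_minutes(G):
    """Relaxation time scale 1/G, converted from seconds to minutes."""
    if G <= 0:
        raise ValueError("G must be positive")
    return 1.0 / G / 60.0


coefficients = [
    ("wind, first simulations", 1e-3),
    ("wind, third simulation", 5e-4),
    ("temperature", 1e-4),
]
for label, G in coefficients:
    minutes = e_folding_time_minutes(G)
    print(f"{label}: G = {G:g} s^-1, e-folding time = {minutes:.1f} min")
```

   Under that assumption, halving the wind coefficient doubles the relaxation time from roughly a quarter of an hour to roughly half an hour, which gives the model dynamics more room to adjust the flow in areas without data.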
   A series of data-reduction experiments,
using the SCAQS data base, were per-
formed to investigate the response of
SAIMM/FDDA methodology to varying lev-
els of data availability. To accomplish this,
monitoring sites were eliminated from the
data set in a series of three site-reduction
exercises.  A model  run was performed
with the full data set, and then again after
each site  reduction to determine how the
model adjusted to decreased  amounts of
observational data.
   Although  necessarily  somewhat sub-
jective, the data  reductions were  based
primarily on geographic considerations. The
goal was  to produce, after the third  site
reduction, a data set which represented a
routine meteorological monitoring network.
For the Los Angeles basin, which contains
an extraordinary number of routine surface
meteorological monitoring sites, this meant
reducing the number of sites beyond what
is normally available for this area. The
number of surface and upper-air monitor-
ing sites used for each of the  data-reduc-
tion experiments is given in Table 1. The
nudging coefficients were assigned based
on the results of the nudging-effectiveness
simulations and were set equal to 0.0005
for the u and  v wind components and
0.0001 for temperature.

Table 1. SCAQS Site Reductions

Data         Surface      Upper        Upper
Reduction    Wind         Wind         Temperature
             Sites        Sites        Sites

0            71           15           14
1            51           10           10
2            32            7            7
3            15            4            4

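
   To put the reductions in proportion, the sketch below computes the fraction of the full SCAQS network retained at each stage. It reads the Table 1 counts as surface wind, upper wind, and upper temperature sites, in that order. The variable and function names are hypothetical.

```python
# Table 1 counts per data reduction: (surface wind, upper wind, upper temperature).
site_counts = {
    0: (71, 15, 14),
    1: (51, 10, 10),
    2: (32, 7, 7),
    3: (15, 4, 4),
}


def retained_fractions(counts):
    """Fraction of the full data set (reduction 0) kept at each reduction stage."""
    full = counts[0]
    return {
        stage: tuple(n / n_full for n, n_full in zip(sites, full))
        for stage, sites in counts.items()
    }


fractions = retained_fractions(site_counts)
for stage, (surface, upper_wind, upper_temp) in fractions.items():
    print(f"reduction {stage}: surface {surface:.0%}, "
          f"upper wind {upper_wind:.0%}, upper temperature {upper_temp:.0%}")
```

   By the third reduction, roughly one fifth of the surface wind sites and about one quarter of the upper-air sites remain.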
   A  thorough graphical  and statistical
analysis of the data-reduction simulation
results indicates that, even when the data
set is reduced to what might be considered
a routine data set, assimilation of the avail-
able wind and temperature data provides
an improved representation of the meteo-
rological  fields  when compared with the
no-FDDA simulation. The influence of the
data on the simulations is not confined to
the monitoring site locations but is propa-
gated within the modeling domain and in-
fluences the evolution of the  meteorology
over data-sparse areas as well.

Testing and Evaluation for the
Lower Lake Michigan Domain
   For the Lower Lake Michigan area, the
SAIMM/FDDA procedure was tested using
observational data collected during the
1991 Lake Michigan Ozone Study (LMOS)
and was applied to 26 June (one of the
LMOS episode days).
    The Lower Lake Michigan area includes
Chicago,  Illinois; Milwaukee, Wisconsin;
Gary, Indiana; and Muskegon,  Michigan.
The lake breeze (driven by the  horizontal
temperature gradients created by the dif-
ferential  heating  of the  land and water
surfaces) plays an important role in deter-
mining the meteorology of this  area and
results in complex mesoscale circulation
patterns  along the lake  shore. To  allow
resolution of the lake-induced circulations,
the entire southern portion of Lake Michi-
gan is included in the modeling domain.
The domain consists of 50 grid points in
the west-east direction, 52 grid points in
the south-north direction, and 20  vertical
levels. The horizontal grid spacing is 5 km.
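
    The grid dimensions quoted for the two domains translate into horizontal extents as sketched below. The sketch assumes that the grid points span the domain from edge to edge, so that the extent is (n - 1) times the 5 km spacing. The function name is hypothetical.

```python
def domain_summary(nx, ny, nz, dx_km):
    """Horizontal extent (assuming points on both edges) and total grid-point count."""
    return {
        "extent_west_east_km": (nx - 1) * dx_km,
        "extent_south_north_km": (ny - 1) * dx_km,
        "grid_points": nx * ny * nz,
    }


los_angeles = domain_summary(nx=65, ny=36, nz=22, dx_km=5.0)
lake_michigan = domain_summary(nx=50, ny=52, nz=20, dx_km=5.0)
print("Los Angeles:", los_angeles)
print("Lower Lake Michigan:", lake_michigan)
```

    Under this assumption the Los Angeles domain covers 320 km by 175 km and the Lower Lake Michigan domain covers 245 km by 255 km, so the two applications involve a similar number of grid points.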
    The simulation period for the  Lower
Lake Michigan area simulations included a
full diurnal cycle—extending from  2300 CST
25 June to 2300 CST 26 June. The SAIMM
simulations were initialized using domain-
scale profiles of temperature and specific
humidity that were based on sounding data
from the  Kankakee, IL (KANK) monitoring
site. As  in the SCAQS  simulations, the
geostrophic wind was set equal to zero.
   The results of the first three nudging-
effectiveness simulations in which  wind
data, temperature data, and wind and tem-
perature  data, respectively,  were  assimi-
lated were  quite disappointing.  While
assimilation of the observed data improved
the agreement between the simulated and
observed winds  and upper-air  tempera-
tures locally, some physically unrealistic
meteorological features appeared in the
simulated wind  and temperature fields.
Apparently, the  information  provided by
the data  had  little effect on the simulation
in data-sparse areas (i.e., this information
was not propagated throughout the model-
ing domain). In preparing the analyses for
FDDA, the user must  specify maximum
radii of influence for the interpolation of the
data at the surface and aloft. In these initial
simulations, the maximum radius of influ-
ence for the surface level was 20 km; aloft
this value was set equal to 50 km. Addi-
tional objective analyses were  prepared
using maximum radii of influence of 50
and 100 km  for the surface and  upper
levels, respectively. An additional nudging-
effectiveness simulation  was performed
using the revised analyses. Increasing the
radii of influence in the objective analysis
of the data constrained the simulation over
a broader geographical area and contrib-
uted to much improved simulation results.
Use of a  larger radius of influence for the
interpolation of data over the Lower Lake
Michigan domain than for the Los Angeles
domain is justifiable due to the absence of
terrain in the Lake Michigan area. Nudging
coefficients for this simulation were 0.0005
and 0.0001, respectively, for the wind and
temperature variables. Assimilation of both
the wind and temperature data improved
the agreement between the simulated and
observed winds  and the agreement be-
tween the simulated and observed upper-
air temperatures.  As  in  the SCAQS
nudging-effectiveness simulations, assimi-
lation of the wind data in combination with
the temperature data resulted in better
agreement between the simulated and ob-
served upper-air temperatures than  as-
similation of temperature data alone.
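
   The role of the maximum radius of influence can be illustrated with a small sketch. The summary does not name the interpolation scheme used in the objective analysis, so a Cressman-type distance weighting is substituted here as an assumption. The station positions, values, and function names are made up for illustration.

```python
def cressman_analysis(grid_points, stations, radius_km):
    """Analyzed value at each grid point, or None when no station is within range.

    grid_points: list of (x_km, y_km); stations: list of (x_km, y_km, value).
    Uses Cressman weights w = (R^2 - d^2) / (R^2 + d^2) for d < R (assumed scheme).
    """
    r2 = radius_km ** 2
    analysis = []
    for gx, gy in grid_points:
        weight_sum = 0.0
        value_sum = 0.0
        for sx, sy, value in stations:
            d2 = (gx - sx) ** 2 + (gy - sy) ** 2
            if d2 < r2:
                w = (r2 - d2) / (r2 + d2)
                weight_sum += w
                value_sum += w * value
        analysis.append(value_sum / weight_sum if weight_sum > 0 else None)
    return analysis


def coverage(analysis):
    """Fraction of grid points that received an analyzed value."""
    return sum(v is not None for v in analysis) / len(analysis)


# Two hypothetical surface stations 100 km apart, grid points every 10 km between them.
stations = [(0.0, 0.0, 10.0), (100.0, 0.0, 14.0)]
grid = [(10.0 * i, 0.0) for i in range(11)]
small_radius = cressman_analysis(grid, stations, radius_km=20.0)
large_radius = cressman_analysis(grid, stations, radius_km=50.0)
print(f"coverage at 20 km: {coverage(small_radius):.2f}")
print(f"coverage at 50 km: {coverage(large_radius):.2f}")
```

   In this toy layout the 20 km radius leaves most of the points between the stations without an analyzed value, while the 50 km radius constrains nearly all of them, mirroring the broader geographical constraint described above.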
   A series of data-reduction experiments,
using the LMOS data base, were performed
to investigate the  response  of SAIMM/
FDDA methodology to varying levels  of
data availability. To accomplish this, moni-
toring sites were eliminated from the data
set in a series of three site-reduction exer-
cises. A model run was performed with the
full data set, and then again after each site
reduction to determine how the model ad-
justed to decreased amounts of observa-
tional data  in a similar manner to the
SCAQS runs.
   The nudging coefficients were assigned
based on the results of the nudging-effec-
tiveness simulations and were set equal to
0.0005 for the u and v wind components
and 0.0001 for temperature. The analyses
for the data-reduction simulation were pre-
pared using the larger radii of influence (50
km at the surface and 100 km aloft).
   A thorough graphical and  statistical
analysis of the data-reduction simulation
results indicates that even when the data
set is reduced to what might be considered
a routine data set, assimilation of the avail-
able wind and temperature  data provides
an improved representation of the meteo-
rological fields when compared with  the
no-FDDA simulation.

Summary and
Recommendations
   Testing of the SAIMM/FDDA methodol-
ogy for application to the Los Angeles and
Lower Lake Michigan urban areas was
accomplished through a series of nudging-
effectiveness and data-reduction simula-
tions. For Los Angeles the SAIMM/FDDA
procedure was tested using observational
data collected during the 1987 Southern
California Air Quality Study (SCAQS) and
was applied to 25 June (one of the SCAQS
episode days); for the Lower Lake Michi-
gan area the procedure was tested using
observational data collected  during the
1991 Lake Michigan Ozone Study (LMOS)
and was applied to 26 June (one of the
LMOS episode days).
   To provide a basis from which to as-
sess the FDDA simulations, the SAIMM
was first exercised for both areas without
data assimilation. The SAIMM simulation
without  FDDA seems to capture many of
the important meteorological  features  of
the SCAQS  episode day such as the sea
breeze  and  the upslope  and  downslope
flows; however, the observed data are not
always well  represented in the simulated
fields. In particular, the SAIMM simulation
indicates westerly flow (outflow  from the
Los Angeles basin) earlier than observed,
and the southerly flow that develops aloft
during the evening hours is not simulated.
The SAIMM with a zero geostrophic wind
and without  FDDA is  not able  to simulate
the 26  June meteorology of  the Lower
Lake  Michigan region. While the model
generates some physically realistic me-
soscale circulation patterns, the prevailing
southwesterly flow that is observed over
the region during this episode day is not
simulated. Due to the large  differences
between the LMOS  no-FDDA simulated
fields and the observed data, effective use
of the FDDA methodology for this simula-
tion represented a much greater challenge
than for the SCAQS simulation.
    The nudging-effectiveness experiments
were  designed  to examine the effects of
nudging the  prognostic wind and tempera-
ture variables separately and in combina-
tion with one another and to determine
(roughly) the optimum nudging coefficients
for each. The simulation results for both
the Los  Angeles and Lower Lake Michigan
areas indicate  that assimilation of both
wind and  temperature data provides the
best representation of the meteorological
fields. The importance of the wind field in
air-quality modeling and the indirect ben-
efits derived from the assimilation of the
wind data support the use of a larger nudg-
ing coefficient for the assimilation of the
wind data than for the assimilation of the
temperature data. However, strong nudg-
ing of the wind components  can create
some unrealistic airflow patterns over
data-sparse regions. In both the SCAQS and
LMOS simulations, the best overall simula-
tion results were achieved with a 0.0005
nudging coefficient for assimilation of the
wind data and  a 0.0001  nudging coeffi-
cient for assimilation of the temperature
data. Specification of the maximum radii of
influence for the interpolation of the data
was an important consideration in the simu-
lation of the LMOS episode. Increasing the
radii of influence in the objective analysis
of the data constrained the simulation over
a broader geographical area and contrib-
uted to much improved simulation results.
   A series of data-reduction experiments,
using the SCAQS and LMOS data bases,
were performed to investigate the response
of SAIMM/FDDA methodology to varying
levels of data  availability. To accomplish
this, monitoring sites were eliminated from
the data sets  in  a series of  three site-
reduction exercises. A model run was per-
formed  with the full  data set, and then
again after each site reduction  to deter-
mine how the model adjusted to decreased
amounts of observational  data. The data-
reduction simulation  results for both the
SCAQS and LMOS episode days indicate
that even when the data set is reduced to
what might be considered a routine data
set, assimilation of the available wind and
temperature data provides an improved
representation of the meteorological fields
when compared with the no-FDDA simula-
tion. As the number of sites is reduced, the
simulation errors increase and the  effec-
tiveness of the FDDA decreases (i.e., some
unusual airflow patterns  were simulated
over data-sparse or unconstrained subre-
gions of the modeling domain).
   The SAIMM/FDDA  methodology ap-
pears to be a promising technique for the
generation of meteorological fields for pho-
tochemical  modeling. Appropriate specifi-
cation  of  the  analysis  and  modeling
parameters is essential to successful ap-
plication of the technique. As these param-
eters will  necessarily be episode- and
domain-dependent, thorough testing of the
model (including no-FDDA simulation) and
evaluation of the simulation results is rec-
ommended for each application. Guide-
lines for evaluation of the meteorological
fields will be developed under Phase II of
this study.  Further study  is required and
anticipated in order to assess the utility of
the SAIMM/FDDA methodology to provide
accurate inputs for photochemical model-
ing.

  Sharon G. Douglas is with Systems Applications International, San Rafael, CA
    94903.
  James M. Godowitch and Shao-Hang Chu are the EPA Project Officers (see
    below).
  The complete report, entitled "Application of a Data-Assimilating Prognostic
    Meteorological Model to Two Urban Areas," (Order No. PB93-126571; Cost: $19.50,
    subject to change) will be available only from:
         National Technical Information Service
         5285 Port Royal Road
         Springfield, VA 22161
         Telephone: 703-487-4650
  The EPA Project Officers can be contacted at:
         James M. Godowitch
         Atmospheric Research and Exposure Assessment Laboratory
         U.S. Environmental Protection Agency (MD-80)
         Research Triangle Park, NC 27711

         Shao-Hang Chu
         Office of Air Quality Planning and Standards
         U.S. Environmental Protection Agency (MD-14)
         Research Triangle Park, NC 27711
United States
Environmental Protection Agency
Center for Environmental Research Information
Cincinnati, OH 45268
