c/EPA
              United States
              Environmental Protection
              Agency
              Environmental Monitoring
              and Support Laboratory
              PO Box 15027
              Las Vegas NV 89114
EPA-600/4-78-036
June 1978
              Research and Development
Environmental
Monitoring Series

Air Monitor
Siting  by Objective

-------
                RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into nine series. These nine broad cate-
gories were established to facilitate further development and application of en-
vironmental technology.  Elimination  of traditional grouping was  consciously
planned to foster technology transfer and a maximum interface in related fields.
The nine series are:

      1.   Environmental  Health Effects Research
      2.   Environmental  Protection Technology
      3.   Ecological Research
      4.   Environmental  Monitoring
      5.   Socioeconomic Environmental Studies
      6.   Scientific and Technical  Assessment Reports (STAR)
      7.   Interagency  Energy-Environment Research and Development
      8.   "Special"  Reports
      9.   Miscellaneous Reports

This report has been assigned to the ENVIRONMENTAL MONITORING  series.
This series describes research conducted to develop new or improved methods
and  instrumentation for the identification and  quantification of environmental
pollutants at the lowest conceivably significant concentrations. It also includes
studies to determine the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia  22161.

-------
                                             EPA-600/4-78-036
                                             June 1978
         AIR MONITOR SITING BY OBJECTIVE


                        by
        Masato Koda and John H.  Seinfeld
       Department of Chemical  Engineering
       California Institute of Technology
           Pasadena, California  91125
            Contract No.  68-03-2441
                Project Officer
               James  L. McElroy
Monitoring Systems Research and Development Division
  Environmental Monitoring and Support Laboratory
              Las Vegas, Nevada 89114
  ENVIRONMENTAL MONITORING AND SUPPORT LABORATORY
         OFFICE OF RESEARCH AND DEVELOPMENT
        U.S. ENVIRONMENTAL PROTECTION AGENCY
              LAS VEGAS, NEVADA  89114

-------
                                 DISCLAIMER

     This report has been reviewed by the Environmental Monitoring and Support
Laboratory-Las Vegas, U.S. Environmental Protection Agency, and approved for
publication.  Approval does not signify that the contents necessarily reflect
the views and policies of the U.S. Environmental Protection Agency, nor does
mention of trade names or commercial  products constitute endorsement or recom-
mendation for use.

-------
                                  FOREWORD

     Protection of the environment requires effective regulatory actions which
are based on sound technical and scientific information.   This information
must include the quantitative description and linking of pollutant sources,
transport mechanisms, interactions,  and resulting effects on man and his en-
vironment.  Because of the complexities involved, assessment of specific pol-
lutants in the environment requires  a total systems approach which transcends
the media of air, water, and land.  The Environmental Monitoring and Support
Laboratory-Las Vegas contributes to  the formation and enhancement of a sound
monitoring data base for exposure assessment through programs designed to:

     •    develop and optimize systems and strategies for moni-
          toring pollutants and their impact on the environment.

     •    demonstrate new monitoring systems and technologies by
          applying them to fulfill special monitoring needs of
          the Agency's operating programs.

     This report discusses the theoretical basis for a method of designing
air quality monitoring networks with regard to the specific objective(s) of
the network.  The method is presently only applicable to nonreactive pollu-
tants and requires the existence of limited air quality monitoring data.
Regional and local agencies may find this method useful in planning or adjust-
ing their air quality monitoring networks.  The Monitoring Systems Design and
Analysis Staff may be contacted for  further information on the subject.
                           -         /        /
                              George B. Morgari
                                  Di rector
               Environmental Monitoring and Support Laboratory
                                  Las Vegas
                                     m

-------
                                 ABSTRACT

     A method is developed whereby measured pollutant concentrations can be
used in conjunction with a mathematical  air quality model  to estimate the
full spatial  and temporal concentration  distributions of the pollutants over
a given region.   The method is based on  the application of estimation theory
to systems described by partial  differential equations, such as the atmo-
spheric diffusion equation.  A computer  code has been developed that can pro-
cess monitoring data to produce concentration distribution estimates.  The
code has been tested extensively on a hypothetical  airshed, designed to illus-
trate the key features of the method.  Once concentration distributions have
been estimated, new monitoring stations  can be located based on several sit-
ing criteria.
                                    IV

-------
                                CONTENTS

Foreword  	    iii
Abstract	    iv
List of Figures   	    vii
List of Tables  	    viii

I.   Introduction   	    1
II.  Design of a Monitoring System  	    3
     A.   Monitoring Requirements of Different Pollutants ....    4
     B.   Siting Criterion  	    7
III. Optimal Estimation of Air Pollutant Concentrations  	    9
     A.   Mathematical Models of Urban Air Pollution   	    9
     B.   Formulation of the Problem  	    11
     C.   Basic Regional Model for Application of Filtering
          Theory	    16
          1.   Scale Transformation in Vertical Direction ....    20
          2.   Mesh Parameters   	    21
          3.   Measurements and Other Conditions	    24
     D.   Simulation Results   	    24
          1.   Performance of the Filter.  Base Case	    26
          2.   Performance of the Filter.  Effect of Number
               of Measurement Locations  	    28
     E.   Application of the Present Method to Monitor Siting
          Problems  	    48
IV.  Conclusions  	    51

References   	    52
Appendices

     A.   Distributed Parameter  Filtering Theory	    55
     B.   Finite Difference Approximation and Square Root
          Implementation of Distributed Parameter Filter	    59
          B.I  Finite Difference Approximation of Distributed
               Parameter Filter  	    59
          B.2  Square Root Implementation of Distributed
               Parameter Filter  	    63
     C.   Square Root Matrix   	    67
          C.I  Lower Triangular  Cholesky Decomposition   	    67
          C.2  Inversion of lower triangular matrix  	    68
          C.3  Modified Gram-Schmidt Orthogonalization	    68
     D.   Discussion of Approaches to the Design of a
          Monitoring System    	    69
     E.   Documentation of Program	    73
          E.I  Function	    73

-------
                  Contents (Continued)

E.2  Program Flowsheet and Subroutines	    73
E.3  Input Variables   	    75
E.4  Output Variables	    77
E.5  Error Messages	    77

-------
                              LIST OF FIGURES
Number                                                                Page
  1       Basic filtering procedure for estimating air pollutant
          concentrations   	     14
  2       Scale transformation in vertical  direction ••	     22
  3       Basic grid system for computation  	     23
  4       Spatial  distribution of pollutant surface flux  	     25
  5       Computed filter variance, case A  	     29
  6       Computed filter variance, case B  	     29
  7       Spatial  distribution of filter variance, case A  ....     29
  8       Spatial  distribution of filter variance, case B  •  .  •  •     32
  9       Comparison of estimated, true and measured
          concentrations, case A   	     35
  10      Comparison of estimated, true, and measured
          concentrations, case C   	•  •     37
  11      Comparison of spatial distributions of estimated
          and true concentrations, case A	     39
  12      Comparison of spatial distributions of estimated
          and true concentrations, case C	     41
  13      Actual estimation error variances, case A   	     44
  14      Actual estimation error variances, case C	     44
  15      Comparison of estimated and true concentrations
          at nonmeasurement points, case A	     45
  16      Comparison of estimated concentrations, base case A  •  •     46
  17      Comparison of variances, base case A	     47
  E.I     Flowsheet of the program	     74
                                     vn

-------
                              LIST OF TABLES
Number                                                               Page
  1      Siting Criteria for Different Objectives  for
         Monitoring  	   8
  2      Optimal Distributed-Parameter Filter	   15
  3      Stability Conditions	   20
  4      Parameters Used in Base Case Simulations	   25
  5      Measurement Conditions   	   28
  D.I    Summary of Previous Approaches to Optimal  Air
         Pollutant Monitor Siting  	   72
                                   vm

-------
                                  SECTION 1

                                INTRODUCTION

      Urban air pollutant monitoring systems consist of an array of stations
located throughout the region at which pollutant concentrations are measured
continuously or intermittently.   At present in most systems the data are trans-
mitted to a central facility and stored for possible future use.  The object
of this study is to develop a methodology by which air monitoring data from
an urban monitoring system may be processed in real time to produce continuous
estimates of the spatial  and temporal  concentration distributions of pollu-
tants over a region, and  thereby to locate pollutant maxima under various
meteorological  conditions.

      Mathematical urban  air pollution models enable the prediction of the
spatial and temporal concentration distributions of pollutants under varying
meteorological  and source emission conditions.  Given a region with no moni-
toring stations it is possible to estimate pollutant concentration distribu-
tions in the airshed as long as  pollutant source emissions and meteorological
data are available.  In such a case it is possible to perform numerical ex-
periments with a mathematical model under various source and meteorological
conditions to find the locations of the highest concentrations of particular
pollutants and then to design a  network initially for the region based on
these predicted pollutant concentrations.

      The more common situation  is that in which several monitoring stations
are already in existence  in a region.   In addition to routine surveillance it
may be desired either to  move those stations or to place new stations within
the region.  In such a case, it  is desirable to use the information available
from the existing stations as well as tta information that can be obtained by
exercising a mathematical model.  The basic problem involved is that of coordi-
nating in a consistent manner the measurements available from the stations
and the predictions of the mathematical model.  The development of a frame-
work for this integration is the object of this work.  Thus, our objective is
to develop means by which urban  air pollution models can be used together with
air monitoring data to estimate the complete spatial and temporal concentra-
tion distributions over a region.  As noted above, if this can be done, then
additional stations can be located or existing stations can be moved for bet-
ter or more effective surveillance of pollutant concentrations.  In Section II
we cite several objectives by which one would locate stations.  Each of these
objectives can be met if one has the knowledge of concentration distributions
over the whole region and how the concentration distributions vary with changes
in meteorology and sources.

-------
      The basic problem we consider, therefore, is the use of data from an
existing monitoring network in conjunction with an air quality simulation
model to produce estimates of pollutant concentrations over the entire region.
Thus, we are developing a means by which point data can be used in conjunction
with a mathematical model to produce area-wide concentration distributions as
a function of time.  This is essentially an optimization problem in that we
want to estimate that concentration distribution that is as close as possible
to both the monitoring data and the mathematical model.  Because of inherent
fluctuations in atmospheric measurements and because mathematical models are
inexact, there will never be complete agreement between model predictions and
measured data.

      The optimization problem associated with estimating concentration dis-
tributions is formulated and solved.  The main consideration in implementing
the theory is the computational requirements.  In short, the solution requires
the solution of an equation similar to the mathematical model for the esti-
mated concentrations and of an additional set of equations for the statistics
(variances) of the concentration estimates.  The equations for the variances
are cumbersome.  A significant portion of the present study has been devoted
to determining the most efficient numerical way to implement the model and
variance equations so that ultimately the algorithm can be exercised by the
U.S. Environmental Protection Agency (EPA) with reasonable amounts of comput-
ing time for an urban region.  In this report we describe a systematic inves-
tigation of each of the available numerical approximate methods for solution
of the model and variance equations.  We present a computer program capable of
producing concentration estimates from an arbitrary number of monitoring
points in a given region with a 3-dimensional model.  The program is struc-
tured in a general way for a 2-dimensional wind field, vertical diffusivity,
and source emissions, and an arbitrary number of monitoring locations.  The
program will, given measurements of pollutant concentrations, produce esti-
mates of the full concentration distribution over the region.  In its present
version the program relates only to inert pollutants and does not include a
vertical component in the wind field.  Such conditions could be added if a
comprehensive air quality simulation model were to be integrated into this
routine.

-------
                                 SECTION II

                       DESIGN OF A MONITORING SYSTEM

      Monitoring ambient air quality is an indispensable,  and perhaps  the
single most important, activity in the evaluation and control of air pollu-
tion.  Without reliable measured data, one cannot establish  a quantitative re-
lationship between atmospheric pollutant levels and source emissions,  nor can
one assess the effects of polluted air on man and his environment.

      With the passage of the Clean Air Act, including the 1977 amendments,
ambient air monitoring programs have become an essential  part of state imple-
mentation plans.  In its 1975 document on design of air monitoring  networks,
EPA states that "much more consideration, in both manpower and monetary re-
sources, should be applied to the issue of siting monitoring facilities than
is currently the common practice...  It is considered inconsistent  to under-
take a monitoring effort involving resources in the tens of  thousands of dol-
lars without investigating the far smaller effort involved in... proper siting
of the monitoring instruments."  (Environmental Protection Agency,  1975.)

      The design of an ambient air quality monitoring network will  depend on
the purpose of the network.  The following general monitoring objectives have
been delineated by the EPA:

      (1)  To establish a basis for comparison of air quality standards with
           actual air quality levels, to measure progress toward compliance,
           and to establish the degree to which compliance is achieved.

      (2)  To provide a basis for ascertaining long term trends.  (The imple-
           mentation of most air pollution control strategies takes time.
           The effectiveness of these strategies, as reflected by the gradual
           changes in air quality, can be evaluated only through careful com-
           parisons of historical records of air quality data.)

      (3)  To provide air quality data during episodes

      (4)  To monitor source compliance with regulations

      (5)  To provide data to support enforcement actions

      (6)  To provide data for research.

      The development of a permanent air quality monitoring  network involves
the determination of the number and location of sampling sites, selection of
appropriate instrumentation, determination of the frequency  and schedule of

-------
sampling, and establishment of instrument and probe siting criteria (Environ-
mental Protection Agency, 1975).  In this work we confine our attention to
the first aspect, namely the determination of the number and location of sam-
pling sites.  As noted above, the configuration of an air quality monitoring
network involves two elements—the number of sensors and their geographical
location.  Decisions on the two elements can be made in either order, that is,
the number of stations can be prescribed based on a criterion of cost and then
distributed geographically in some optimal fashion, or the number of stations
and their locations may be chosen on the basis of the monitoring criteria.
The minimally required number of monitoring stations can be judged, in general,
from factors such as the absolute levels of pollutant concentrations, the vari-
ability of the spatial concentration distribution, and the physical size and
population distribution of the region.

      Air pollutants are emitted from a variety of sources and are transported
and dispersed by an atmosphere,the turbulent processes in which can only be
described statistically.  As a result, pollutant concentrations not only fluc-
tuate in time; they generally differ from place to place as well.  The most
difficult question confronting a designer of monitoring systems can be there-
fore expressed as: "Where in this highly variable field should the stations be
placed so that representative samples can be collected?"  Clearly, an answer
lies in the thorough understanding of parameters that can affect the spatial
distribution of air pollutants.  A variety of non-technical considerations
will influence the choice of where to locate monitoring stations, including
the security of the site, its accessibility, the availability of utilities,
the ability to accommodate future modifications or expansions, and the avail-
ability of the site.  Harrison (1972) has noted that the prime consideration
in selecting sites for the Chicago air monitoring network is security.

A.    MONITORING REQUIREMENTS OF DIFFERENT POLLUTANTS

      The pollutants commonly monitored in an urban area are carbon monoxide
(CO), sulfur dioxide ($02), total hydrocarbons, oxides of nitrogen (NO + N02),
oxidant, and total suspended particulate matter.  Each has certain atmospheric
characteristics that suggest different monitoring requirements.

      Carbon monoxide is essentially a non-reactive pollutant generated by
motor vehicles.  The highest concentrations of CO are observed in urban areas
near roadways carrying high volumes of traffic (Chang and Weinstock, 1973).
In order to ascertain whether air quality standards for CO are being met, it
is therefore most important to monitor in the regions of highest traffic den-
sities.   In assessing long term trends it may also be desirable to measure
CO at points removed from roadways as well as near roadways.  According to
recent EPA guidance on siting of CO monitors, six types of sites are dis-
cussed and assigned the priorities shown below (Environmental Protection
Agency,  1975):

          Type of site	            Priority
          Peak street canyon                1
          Peak neighborhood                 1
          Average street canyon             2

-------
           Type of site            Priority
           Corridor                   3
           Background                 4
           Average Neighborhood       5

Ott and Eliassen (1973) have suggested that present patterns in the siting of
stations for CO measurement may not permit adequate assessment of compliance
with air quality standards.  Ott (1971) conducted a comprehensive survey of
air quality (for CO) in the vicinity of a single monitoring station in San
Jose, California.  Samples were collected along the sidewalk at "breathing
level" while walking at random spatially over a specified grid, and in special-
ly   prescribed patterns near the station.  Ott found that concentrations sig-
nificantly higher than those recorded at the monitoring station prevailed
along the sidewalks of downtown San Jose streets and that these concentrations
showed little correlation with values recorded at the station.  Near-street
measured values (long-term averages) were on the order of a factor of two
greater than values recorded at the station.

      Ott (1975) has attempted to formulate a set of uniform criteria for CO
monitoring.   He suggests a dual monitoring approach, in which two monitoring
stations are used continuously in each area of the region, one to monitor the
lower urban neighborhood concentration and one to monitor the higher concen-
trations to which pedestrians are exposed near traffic.  Ludwig and Keoloha
(1975) have suggested procedures for selecting CO monitoring sites representa-
tive of downtown street canyon areas, along major traffic corridors, and urban
neighborhoods.  They make specific recommendations for the heights of moni-
toring ports, distances from major and minor roadways, and placement relative
to urban areas.

      Sulfur dioxide is emitted from fossil fuel combustion in power plants
and space heating units and from certain industrial operations.  Once emitted,
S02 is oxidized to sulfates on  a time scale the order of hours,  with sub-
stantial amounts of the original gaseous sulfur ending up in airborne par-
ticles.  Because of the nature of its sources, SOg is usually emitted above
ground level from stacks or from the tops of buildings.  Highest concentra-
tions might be expected to occur therefore at rooftop levels directly down-
wind of major sources.  There is mounting evidence that the most serious S02-
related health effects are those resulting from exposure to sulfate-bearing
particulate matter.  These effects would be manifest most strongly well down-
wind of the S02 sources themselves, since time is required to convert gaseous
S02  to particulate sulfate.  With the exception of monitoring downwind of
certain strong sources of S02, such as a power plant, monitoring requirements
for S02 suggest area-wide measurements.

      Hydrocarbons are emitted from motor vehicles and a large number of in-
dustrial sources.  There are currently no air quality standards for hydrocar-
bons based on health effects, although there does exist a standard of 0.24ppmC
(parts-per-million of carbon) for a 3-hour average based on subsequent oxi-
dant formation.  There is no need therefore to measure hydrocarbons for
possible health effects.  The primary reason for monitoring hydrocarbon con-
centrations is based on the relationship of hydrocarbon levels to oxidant
formation.


                                      5

-------
      The oxides of nitrogen, NO and NCU, have rather different spatial dis-
tributions in the atmosphere when there are appreciable emissions of NO.
Nitric oxide (NO) is emitted from motor vehicles and stationary combustion op-
erations and can be classed as a primary pollutant.  Its highest concentra-
tions can be expected to occur in the vicinity of sources, particularly near
heavily travelled roadways.  As in the case of hydrocarbons, there is cur-
rently no health standard  for NO. Nitrogen dioxide (N02) is almost totally an
oxidation product of NO.  A health standard does exist for N02 (0.05 ppm
annual average) so that measurement to assess compliance with the standard is
necessary.  Since N02 is formed in the atmosphere from NO only after the NO
has been mixed with emitted hydrocarbons and allowed to react for a period of
an hour or so, local hot spots of N02 are not to be expected. Area-wide moni-
toring at locations downwind of main NO sources is the basic strategy called
for.

      Photochemical oxidant, primarily ozone, is the major product in photo-
chemical smog. Oxidant forms during prolonged irradiation of hydrocarbon/NOx
mixtures, usually well downwind of where the hydrocarbons and NO were emitted.
Clearly, area-wide monitoring is suggested for oxidant, with one proviso.
Ozone reacts very quickly with NO. Thus, in the vicinity of local sources of
NO, such as roadways, ozone levels are generally significantly depressed
relative to ambient levels due to rapid scavenging by NO. Thus, it is neces-
sary to locate monitors for oxidant beyond the immediate vicinity of NO
sources.

      The final category of pollutant routinely measured in urban areas is
total  suspended particulate matter (TSP). Particulate matter is emitted from
a wide variety of sources, and the monitoring needs of a region will be dic-
tated somewhat by the major sources of particulate matter in that region.
Primary particulate matter is emitted from motor vehicles, aircraft, power
plants, and industrial operations.  The largest particles generally settle
out rapidly near the sources, whereas those in the micrometer range and smaller
become airborne for relatively long periods of time.

      The brief discussion above leads one to the conclusion that it is pos-
sible to identify two basic types of monitoring sites, pJiaxAmate. and unban
level. Proximate sites refer to those situated in the immediate vicinity of
a source, and are of primary interest in the measurement of CO. In those in-
stances in which significant S02 emissions occur from a single source, proxi-
mate monitoring may also be called for. The selection of proximate sites will
depend on the particular source, its configuration and the local topography.
Sources for which proximate monitoring may be necessary are elevated and de-
pressed roadways, street canyons, airports, and perimeters of power plants.
The site is to be chosen at the point when the highest concentration levels
are expected to occur. A detailed consideration of the selection of proximate
sites for CO monitoring has been carried out by Ludwig and Kealoha (1975).

      Urban level sites are used to enable the estimation of concentrations of
pollutants over broad areas of the entire region or certain subareas of the
region.  Thus, these sites should be reasonably removed from strong local
sources so that each station provides data representative of the "region" of

-------
the airshed in the vicinity of the station.   In most cases,  "airsheds"  have
been designated by legislation.   As a practical matter,  an airshed can  be
considered as that region in which pollutant concentrations  are largely the
result of emissions in the airshed.  The airshed is usually  defined as  a re-
gion large enough so that a different "airshed" need not be  defined for each
pollutant.  Urban level sites are the type called for, in general, in measure-
ment of S02, hydrocarbons, NOX,  oxidant, and particulate matter.

B.   SITING CRITERION

     Certain objectives of monitoring have been delineated above.   Table 1
indicates the various criteria one would consider in attempting to meet the
six objectives.  Based on Table  1 we can summarize the following criteria for
siting of monitoring stations:

     (1)  Locate stations so that the pollutant concentration distribution
          over the region can be estimated most accurately.

     (2)  Locate stations where  the expected frequencies of  violation of the
          air quality standards  are highest.

     (3)  Locate stations at points of maximum sensitivity of the pollutant
          concentrations to source parameters.

     The problem of siting on the basis of maximum sensitivity of concentra-
tions to source emission level changes has been considered previously (Sein-
feld, 1972).  We do not consider this approach in the present work.  We
choose to focus on the objective of ascertaining compliance  with regulations,
the most important objective of  air quality monitoring.   Thus, the criterion
with which we will be concerned  is to locate stations to enable the best esti-
mation of pollutant concentration distributions over the region, a criterion
that includes detection of the places of greatest expected violation of
standards.

     The criterion of locating stations so that optimal  estimates of the full
concentration distribution over  the region can be obtained is compatible with
other criteria being considered  by the EPA's Environmental Monitoring and
Support Laboratory-Las Vegas (EMSL-LV).  For example, in Liu et al . (1977)  a
Figure of Merit,  F, was defined as the product of an air quality index (either
observed or predicted) at a particular location and the associated frequency
or probability of occurrence,

              F = I (Air Quality Index) •  (Frequency of Occurrence)

The summation is to be performed over all meteorological scenarios that lead
to air pollution episodes.  The  Figure of Merit is weighted by the frequen-
cies of occurrence of the scenarios because the pollutant concentration at
any location varies significantly over a year, e.g.,McElroy  et al. (1978).

                  Y"1   / Frequency of OccurrenceX  / Max. l-hr.or8-hr.surface\
        F(i,j) =  Z_j   I  of Meteorological       L[  CO Concentration  at    I
                          Pattern I               I    Grid Point i ,j under
                        x                        '                          ;
                                                      Pattern fc

-------
    TABLE  1.  SITING CRITERIA FOR DIFFERENT OBJECTIVES FOR MONITORING
Monitoring Objective
               Siting Criterion
 1.  Assess compliance with air
    quality standards
2.  Assess long-term trends
3.  Provide data during episodes
    Monitor source compliance
    with regulations
    Provide data to support
    enforcement actions
6.  Provide data for research
Locate stations where concentrations are
expected to be largest or locate stations
where the spatial concentration distribu-
tions can be estimated most accurately.

Locate stations where concentrations are
expected to be largest or locate stations
where the spatial concentration distribu-
tions can be estimated most accurately.

Locate stations where concentrations are
expected to be largest under conditions
of stagnation or locate stations where the
spatial concentration distributions can be
estimated most accurately.

Locate stations at points where the sensi-
tivity of concentration levels to source
emission level changes is greatest.

Locate stations at points where the sensi-
tivity of concentration levels to source
emission level changes is greatest.

For the evaluation of diffusion models,
locate stations where the spatial concen-
tration distributions can be estimated
most accurately.
In this approach,  McElroy et al.  (1977), the CO concentration is predicted by
an urban air pollution model for each situation corresponding to the various
meteorological  patterns.  Air quality data are not necessary to implement
that approach (only, of course, for the validation of the air quality model);
with source emission and meteorological inputs the concentrations can be pre-
dicted in the absence of air monitoring data.   When there are no air monitor-
ing stations in a region then the technique of'Liu    et al. (1977) must be
used.  The approach that we have chosen to follow is based on the availability
of air monitoring data.  In short, we seek to utilize past air monitoring
data to estimate the full pollutant concentration distribution over the re-
gion.  Once this distribution has  been estimated, then the Figure of Merit
can be computed.  Our approach recognizes, therefore, the fact that air
quality model predictions will not perfectly match ambient data and attempts
to reconcile the model predictions and the data to produce an "optimum"
estimated concentration distribution.

-------
                                SECTION III

             OPTIMAL ESTIMATION OF AIR POLLUTANT CONCENTRATIONS

     In Section II we discussed several objectives of monitoring and the re-
sultant siting criteria.  We arrived at the criterion of locating stations
to enable the optimal estimation of pollutant concentration distributions
over the region as one that satisfies many of the principal objectives of
monitoring.  In this section we outline the theory underlying the implementa-
tion of this criterion.

A.   MATHEMATICAL MODELS OF URBAN AIR POLLUTION

     In attempting to quantify the siting criterion and determine an opti-
mum design it will be necessary to employ a model that relates emissions to
air quality.  Models of this type can be classified as dynamic or long-term.
Dynamic models describe the evolution of pollutant concentrations in the re-
gion during a day, given source emission and meteorological information.
Long-term models predict the yearly average pollutant concentrations as a
function of location in the airshed and incorporate information on the fre-
quencies of occurrence of various meteorological conditions in the region over
a typical year (Kumar et al . , 1976).  The monitoring location problem can be
formulated on the basis of  the dynamic behavior of pollutants on typical days
(requiring the use of a dynamic model) or on the basis of  the long-term, aver-
age concentration distributions over the region (necessitating the use of
long-term models).  Since dynamic models, if accurate, incorporate much more
detail than long-term models, particularly when episode conditions are of
interest, siting on the basis of the dynamic behavior of pollutants is pre-
ferable to that based on long-term averages.  Consequently, we confine our
attention here to siting on the basis of dynamic behavior..  For an example of
an approach to monitor siting based on long-term averages  we refer the reader
to Darby et al. (1974).

     All conventional atmospheric diffusion models are based on the equation
of conservation of mass


  3c.      9c.      3c.     9c.
                        + S.(x,y,z,t)                                 (1)

where  c-j  is the  concentration of species i; u, v, and w are the fluid veloci
ties in the three  coordinate directions; P-J is the molecular diffusivity of
species i  in  air;  R-j  is  the rate of  generation (or the negative of the rate

-------
 of disappearance)  of species  i  by  chemical  reactions  at  temperature  T;  and
 S..  is  the  rate  of  injection of  species  i  into  the  fluid  from  sources.

     Because  the atmosphere is  a turbulent  flow, the  velocities  u, v, and w
 are random functions of  space and  time.   Consequently, the  concentration c.
 is  also  a  random function  of  space and  time.   Solutions  of  (1) with  realistic
 atmospheric velocities are very difficult to obtain,  even in  the case in which
 R^  = 0 (inert species).   In order  to  render (1) solvable, the  fluid velocities
 are decomposed_into  mean and  fluctuating  components,  u = u  +  u',  etc.   The
 quantities u, v, and w represent the  ensemble  mean velocities of an  infinite
 number of  realizations of  the same flow.  Correspondingly,  we can divide c.
 into c.  +  c!, where  c. is  the ensemble  mean concentration.                n

     Upon  substituting the mean and fluctuating terms into  (1) and averaging
 the resulting equation over the ensemble  of flows  we  obtain the  equation
 governing  Cj.   In  atmospheric applications, the molecular diffusion  term is
 negligible when compared to that representing  advective  transport.   Thus,
 neglecting the  contribution of  molecular  diffusion, the  equation  for c. is
       3c.     3c.     3c.   _ 3c.

       ot      9x      3y      <3z
                           cN,T) + S.(x,y,z,t)                            (2)
     We note the emergence of the new variables u'cj, v'c-}, and w'c-}, which
represent the fluxes of species  i in the three coordinate directions as a
result of the velocity fluctuations, u1, v1 and w\  If species i is involved
in second-order chemical reactions then the term R. will also lead to new
dependent variables of the form  c!c'..
                                  ' *J
     Equation (2) is a rigorously valid equation for c,- (neglecting, of
course, molecular diffusion), _and if the variables ITcj , v'c] , w'c: , as well
as any of those arising from FL- , are known as functions of space and time,
it can be solved in principle to yield Cj.  Unfortunately, u c-} , etc., can-
not be measured at all points in an atmospheric flow, and cannot be predicted
exactly because of the closure problem of nonlinear stochastic equations.
Thus, we must resort to models for these terms.  The model employed- in vir-
tually all cases in which atmospheric flows are involved is that based on the
concept of eddy diffusivities:
                       3c.                -        3c.

                                           '
                                          3C.
                              w'ci ' -
The eddy diffusivities KH and Ky are postulated to be functions of space and
time (and not of c. or any of its gradients).  In addition, all models employ
the approximation
                                     10

-------
               IVC1 ..... cN,T)  = R.(c. .....  EN,T)                        (4)

     The result of using (3)  and (4)  in  (2)  is  the  so-called atmospheric
diffusion equation,


3c.     3c.     ac.      ac.           ac. \       /    ac.
.        .        .        .
L + M _ 1- + v _ 1 + u/ _ L -
                        --
    + M
      u
                                                       L
                _      _            _             _    .
3t      ax      9T     9F-- a7    H^T    ay    H 9r/   91


                + R^c.,..., cN,T) + Sjfx.y.z.t)                         (5)

Equation (5) is the  fundamental  equation  upon which  most  dynamic  urban air
pollution models are based.

     The validity of the atmospheric  diffusion  equation relates to  how closely
the predicted mean concentration Cj corresponds to the true ensemble mean con-
centration.   If the  true ensemble mean  velocities and concentrations are
known for an atmospheric flow,  then it  is  relatively straightforward to assess
the validity of (5)  for specified forms of KH and Ky.  Unfortunately, for any
atmospheric  flow the ensemble mean velocities and concentrations  can never  be
computed since the atmosphere presents  only one realization of the  flow at
any time.  (Of course, for a  statistically  stationary flow,  ensemble averages
can be replaced by time averages.   The  atmosphere is, however, seldom in  a
stationary condition for any appreciable  period of time.)

B.   FORMULATION OF THE PROBLEM

     We denote the measured  value of  the  concentration of pollutant i at  moni-
toring site  i at time t|< by  w.  n(t.), k =  1>2>'-  Because  there  is always
some amount  of instrument error associated with a measurement, w. .(t. ) is
not precisely equal  to the instantaneous  concentration of species'  i  but
differs by an error.   If it  is assumed  that this measurement error  depends
only on the  pollutant and not on where  or when  the measurement is made,
then we can  write
where c- (Xp,t) is the instantaneous concentration of i  at location x»
[x = (x,y7z)] at time t, and e. is the random measurement error for~species  i.
As we noted above, the atmospheric diffusion equation predicts the theoreti-
cal ensemble mean concentration c.(x,t).   There exists  an unknown discrep-
ancy between any instantaneous value and  the theoretical  mean.  In addition,
although the atmospheric diffusion equation predicts in principle the  mean
concentration at all locations, in reality the equation assumes a certain
amount of spatial averaging (Lamb and Seinfeld, 1973).   In addition,  the
implementation of the equation requires solution on a finite grid, introduc-
ing implicit spatial averages into the computed concentrations.  The  measure-
ment is, of course, truly a point measurement.  Thus, equation (6) becomes
                                     11

-------
 when  5.  includes  all  the discrepancies between w. and c..  Because c. is
 a  random function,  £. will  itself  be a random function.

      The random errors  in  (7) can  be combined, so
 where  r\.  is  assumed  to have zero mean.

     Subsequently we will  restrict our attention to  inert pollutants or those
 that decay linearly  (R. =  - k.c.).  Thus, we need consider only a scalar con-
 centration c(x,t) and scalar measurements , w^t^) =  c(x^,tk) + l^'1^)'

     The  basic problem of  interest is to develop a way of siting new stations
 in a region  with at  least  some current monitoring stations in which we use
 the available data and a mathematical air quality model.  Our approach is to
 use the available data and the model to estimate the full spatial and temporal
 concentration distribution over the region.  We consider, therefore, the fol-
 lowing problem:

        Estimate the mean  concentration c(x,t) for t j> t  based on
        the  measurements w.(t. ).

 In mathematical terms this is a sequential estimation or filtering problem,
 i.e., we  desire to estimate the state of a dynamic system based on noisy mea-
 surements carried out on the system.  Filtering theory is a well -developed
 aspect of mathematical system and control theory.  (See, for example, Sage
 and Melsa, 1971; McGarty,  1974; Jazwinski, 1970.)  The air pollution estima-
 tion problem has been posed here as a filtering problem, the concentrations
 being  the states.  In this case, the system is governed by a set of partial
 differential equations (5), a so-called  distributed parameter system.   Fil-
 tering in distributed parameter systems has received considerable attention.
 (See,  for example, Tzafestas and Nightingale, 1968ab; Tzafestas, 1973;
 Sakawa, 1972; Hwang  et al . , 1972; Koda, 1976.)  The elements of the distribu-
 ted parameter filtering theory are given in Appendix A.

     The  atmospheric diffusion equation (5) is not the "true" equation  for
 the mean  concentration c(x,t) because of approximations involved in repre-
 senting turbulent diffusion (assuming that the fluid velocities are exact.)
 The discrepancy between the true, but unknown, value of c and that predicted
 by (5) can be presented in a simple way by a random error term added to the
 right hand side (R.H.S.)  of (5), ?(x,t),
^- + il a£ + 0 *t + w 8c *  8  (K  3c\    3  /K  3c\ + JL /K  3c\
at + u ax + v ay" + w 3?   ^ VKH ax ) + W \ H ay J + az VS/ 3z J

                             + SjU.y.z.t) + £(x,y,z,t)                  (5')

Thus, we assume that the inaccuracies in the meteorological variables and
those arising from inadequate representation of turbulent diffusion can be
"lumped" and then quantified as noise in the manner in (51).  Such an addi-
tive random error almost certainly does not adequately represent the

                                     12

-------
discrepancy, although not enough is known about the deviation between the
time ensemble mean and that predicted by (5) to attempt to account for the
deviation in a more detailed manner.   Consequently, we will  henceforth assume
that a random error 5(x,t) exists on  the R.H.S. of (5) having zero mean and
known variance.  (The variance of £ is, of course, not known; it must be esti-
mated based on our judgment as to the essential adequacy of (5) for the given
situation.)

     The basic filtering procedure is outlined in Figure 1.   The basic fea-
ture of the filter is that the difference between the measured and estimated
concentrations is processed to yield  a better estimate of the concentration.
In effect, then, the filter is simply a learning algorithm that gets "smarter"
as it processes more and more data.  The net result is better and better esti-
mates of the concentrations as time progresses.

     As illustrated in Figure 1, the  "filter" consists of the differential
equations for c(x,t) and P(x,y,t) that "process" the measurements w.(t, ) to
give c(x,t).  The key problem in the  implementation of the distributed
parameter filtering theory is the spatial dimensionality of the filter vari-
ance P.  For a 3-dimensional situation, P is a function of six spatial vari-
ables, i.e., P(x,y,t) in vector notation or P(x,y,z,x',y',z-',t) in expanded
notation.  Thus^implementation of the filter requires the numerical solution
of the partial differential Riccati equation in six spatial  dimensions.

     The distributed parameter filtering problem can, in general, be treated
in either of two ways with respect to finite-dimensional approximation.  The
distributed parameter system can be approximated by a lumped-parameter sys-
tem (i.e., ordinary differential equations) at the very beginning of the
problem, and the filtering problem can be solved with respect to the lumped
system.  This approach can be called  "approximation at the beginning."  On
the other hand, the distributed nature of the problem can be reta.ined
throughout the analysis, and only at  the point where numerical implementation
of the partial differential equations is necessary is a finite-dimensional
approximation introduced.  This approach can be termed "approximation at the
end."  From a numerical point of view there does not appear to exist a funda-
mental advantage for either approach, although approximation at the end does
preserve the distributed character of the problem as long as possible.  Here
we have applied the finite difference approximation at the end.

     Clearly, considerations of computation time and storage are paramount.
Because of the size of the numerical  problem involved in implementing the
filter, numerical considerations of accuracy and stability are also important.
Covariance square root algorithms are generally more accurate and stable than
conventional non-square root algorithms.  For this reason we have adopted
that approach in solving the filter equations  (Appendix C).  Technical details
of the filter are presented in Appendixes A and B.  Table 2 summarizes^the
general equations of the filter. We note that the filter equation for c de-
pends on all the same variables and phenomena as the original atmospheric
diffusion equation.  On the other hand, the variance equation for P is inde-
pendent of the source emissions.
                                     13

-------
    Actual
    Atmosphere
Measurement
Wj(tk)
Mathematical Model
(Atmospheric Diffu-
sion Equation)
    Prediction
                     c(x.,t. )
                       ~J   K-
                      Updated value of c(x,t)
                                                                             -J Filter Gain
                                                                                          P(x,y,t)
                                                                         Variance of the  Estimate
                                                                                              THE
                                                                                              FILTER
           Figure  1.   Basic filtering procedure for estimating air pollutant concentrations

-------
                                TABLE 2.   OPTIMAL DISTRIBUTED-PARAMETER FILTER
Diffusion
ion Process     f^.     (x>t)  +
                                  x
                          c(x,t) =
                                         x


                                         x
                                                                    n
Measurement
                                                        Jo ~"
                                                          K ""
Estimate
                        Time Update Filtering Equations   (t.  <_ t £
                                = L c(x,t)
                                   A  ***
L. c(x,t) =  <(>(x,t)             ,         x G  9ft
  D  ***         **»                          **•*


3P(x,y,t)

	—	= L P(x,y,t) + P(x,y,t)L' + Q(x,y,t)   ,
   \J \4         /\-i^«j      ^  ~*t rv    Jf     f<* *v
                                                                                  x, y
Variance

Lbp
P(x
(x
,y
,y,t) = o
,t)Lb = o
Measurement Update
Estimate
Variance
c(x
P(x
,t
»y
+. _ ^ ..
.J - C^X,ti)
^ •>* l\
-.tf) = P(x,y,


Filtering
M M
+ i-1 '-I
M
t"u) - J.
'
»




X
y
Equations
P(x
M
I
,5^ ,t
P(x,
P
s.
[A
,t
E-9S
€ 9J
(t =
•'^l
u)[A"
7
1
V
<> VA' - =(!j-
"^tJl^Pfs.^.^)


tk)l

                                               1=1 j=l
                                                +R.j(tk)

-------
      Table  2  summarizes  the  filter  in  general  form, based on the development
 in  Appendix A.   In  the next  section we develop a  basic  regional air pollution
 model  based on  a  3-dimensional  form of the  atmospheric  diffusion equation and
 present  the specific  equations  of the  filter corresponding  to that model.  A
 computer code has been developed that^solves the  atmospheric diffusion equa-
 tion  together with  the equations for c(x,y,z,t) and P(x,y,z,x' ,y' ,z' ,t).  The
 code  is  detailed  in Appendix  E.  The code is applicable to  all problems that
 can be posed  as  obeying  the  general model of Section  III.C.  For the purposes
 of  illustrating  the performance of  the filter  we  choose in  Section III.C a
 set of specific  parameter  values.   Section  III.D  is devoted to the results of
 applying the  general  code  to  the specific situation defined in Section III.C.

 C.    BASIC  REGIONAL MODEL  FOR APPLICATION OF FILTERING  THEORY

      We  adopt the following assumptions in  regard to  the basic atmospheric
 diffusion equation  (5):

      1)  The  surface  of  the ground  is  flat  so  that atmospheric motion is 2-
         dimensional , and  vertical wind velocity  components can be neglected.

      2)  Turbulent diffusion  in the horizontal direction is negligible as
         compared with horizontal advection.

      3)  The  location of sources is defined through a boundary condition in
         the  form of  surface  flux.

      4)  There is a temperature inversion layer at a  given elevation.

      Using  the same notation  as in the previous section, the diffusion equa-
 tion  governing a single  species in the atmosphere, under these assumptions is
 (dropping the overbar for  convenience)


                                         te(x>y,Z,t)                    (9)


where  E,  is  an artificial  random forcing term to account for discrepancies
between  c as  predicted by,  (5) and the  true  (but unknown) ensemble mean con-
centration.    It is unlikely that an additive random term properly accounts
 for this modeling error,  although without further information we resort to
this assumption.  Further, we shall  assume that £ is well -represented by a
distributed white Gaussian process with the following properties:

         E(c(x,y,z,t)} =  0

                                = Q(x,y,z,x' ,y' ,z' ,t)6(t-r)
where the variance function Q is positive semi-definite and symmetric with
respect to the spatial coordinates (x,y,z) and (x',y',z').

     The vertical  boundary conditions used are
                                     16

-------
           K   £~. = ^(x y t)          at z = 0

                                                                         (U]

         8z = °                       at z = Zmax

where K   is the eddy diffusivity at ground level and S(x,y,t) is the surface
flux ana  is a specified deterministic function.  Other boundary conditions
are given at the upstream boundaries:

         c(0,y,z,t) = c,
                       b                                                 (12)
         c(x,0,z,t) = c,

where c.  can be a function of time.  It can be assumed that the exact value
of c.     is not given as a priori information.

     The initial conditions necessary for an unsteady state model (9) are now
known precisely, and only the estimated mean concentration

         E{c(x,y,z,0)} = c0(x,y,z)                                       (13)

and its variance

     E{[c(x,y,z,0) - c0(x,y,z)][c(xl,y',zl,0) - CQ(X',y',z')]}

              = P0(x,y,z,x',y',z')                                       (14)

are assumed to be known.

     The measurement data are usually available at discrete times, tk, k = 1,
2,..., only at discrete locations.  Hence the measurement process is repre-
sented as


     W = ^WVV + W '      * = 1'2""' M              (15)

where nnUk) is the error associated with a measurement WnU..).  We shall as-
sume that  n,,(tk) is a white Gaussian noise sequence with properties:

         E(n.(t )} = 0
            i  n
where Rjj(tn) is a positive variance.

     We shall assume that the random variables £(x,y,z,t), c(x,y,z,0), and
ri£(t. ) are mutually statistically  independent.   It  is  then necessary  to speci-
fy tne variances Q(x,y,z,x',y* ,z',t) and R,-,-(O.   The proper specification
of these functions is a major problem  in the ultimate  implementation  of the
theory.  For example, the form of  R,-.:(tk) will depend  on  how the  discrepan-
cies between data and prediction arejexpected to vary  with location.  Such

                                     17

-------
 information  can  be  obtained  through  a  statistical  analysis of  past modeling
 studies  comparing predictions  to  data.   At  present, we  do not  assume  that
 correlations exist  between the errors  in measurements and the  modeling errors
 of different locations.   Hence we can  write
                                                                (17)
      Q(x,y,z,x',y',z',t)  = Q(x,y,z,t)S(x'-x)<5(y'-y)6(z'-z)
      Thus, we  have  constructed a  simple,  but  sufficiently  general, model of
 air  pollution  for application of  the filtering theory.

      For  a convenience, a dimensionless form  of  the model  equations  is util
 ized.   Let Cr, T^,  Lr, IL,  Kr, and Sr denote  arbitrary  reference  values for
 concentration, time, spatial length, wind speed, eddy diffusivity, and sur-
 face  flux, respectively.  Then, using these reference values, the dimension-
 less  form of (9) becomes
       9t*
where the superscript * denotes a dimensionless quantity.  In a similar man-
ner, other equations above can be easily transformed to a dimensionless form.

     The following set of reference values is specified to unity throughout
the simulation:
Cr = 1
                  [ppm]

                  [m2/s]
         Ur = 1   [m/s]

Using these special reference values, the computer input and output variables,
i.e., the dimensionless values K*, u*, v*, and c*, correspond to their -actual
physical values but are dimensionless.  Other reference values will be deter-
mined based on the grid square size employed in the discretization.'1'

     In the  process of discretization necessary to implement the filter, we have
used finite difference methods.  The primary source of errors associated with
the finite difference approximation arises in the discretization of the spa-
tial coordinates.  Although the analysis shows that the higher order trunca-
tion terms always vanish when higher order finite difference schemes are used,
this phenomenon does not necessarily imply that these schemes are more suit-
able.  A notorious example can be found in airshed modeling: near localized
sources, higher order schemes predict negative concentrations, whereas simple
first-order schemes do not.
f Process by which a continuous spatial  domain is represented by an array of
  cells for the purposes of numerical computation.
                                     18

-------
     In order to ascertain numerical  stability and accuracy we have applied
the time-splitting technique to the finite difference implementation of (19).
(For details see Appendix B.)  In this time-splitting concept, the original
advection-diffusion equation (19) is  decomposed into the 2-dimensional  "advec-
tion part," and into the 1-dimensional "diffusion part."

     (a)  Advection part:

                           A -A.      11 _ J_ I
                                                                          (20)
                           dX"      dy /


     (b)  Diffusion part:


                          f   . ^
                                                                          (21)
 Then  stable  finite  difference methods  suitable  for  (20)  and  (21)  are  applied
 independently.   We  have  applied  Fromm's  (1968)  second-order,  zero-average,
 phase error  method  to  the  advection  part (20).  The standard  Crank-Nicolson
 second-order method has  been applied to  the  diffusion  part  (21).   These
 second-order schemes  improved the numerical  accuracy  and stability and  did
 not  lead  to  negative concentrations.

      The  stability  of  Fromm's second-order,  zero-average phase  error  method
 can  be easily evaluated  for the  advection part  (20).   The evaluation  of  the
 artificial  (or pseudo) diffusion term  associated  with  the method  leads to  the
 Courant condition for  stability:


          TrUr
          -P1  (a + 3) < 1                                               (22)
            r

 where a and  B denote the Courant numbers defined  as


            _ u*At*
          01  " Ax*
                                                                          (23)

          „  _ v*At*


 where Ax*,  Ay*, and At*  are the  dimensionless mesh  spacing  in the x,  y,  and  t
 coordinate  directions, respectively

      Using  the Courant condition (22), we can generate the  following  table
 for  the various combinations  of  Lr and T .
                                     19

-------
                       TABLE 3.   STABILITY CONDITIONS
   Tp[sec]         Lr[ra]     TrUr/Lr    TrKr/Lr        Stability Condition


      1            100        0.01      0.0001        Ax* ^ 0.02(u*)    At*
                                                                  II tu A


                                                    (Ay* > 0.02(v*)maxAt*)

     30           1000        0.03      0.00003       Ax* > 0.06(u*)mAt*
                                                        —         max


     60           1000        0.06      0.00006       Ax* > 0.12(u*)mAt*
                                                        —         max
 We  note  that  the  stability  of  the  method  depends  on  the  dimensionless wind
 velocity components  (u*)     and  (v*)    ,  as well  as  on the  spacing Ax*, Ay*,
 and At*.   The spacing  Ax™  Ay*,  and     At* are  usually determined by the com-
 puter  memory  storage and  computation  time requirements.   For most air pollu-
 tion situations,  both  u*  and v*  range in  value  between 0 and 10 on the dimen-
 sionless  scale.   Consideration of  actual  air  pollution applications leads to
 the following reference values for T   and L  :

          Tr = 60   [sec] = 1  [min]

          Lr   = 1000  [m] =   1 [km]

 1.   Scale Transformation in Vertical  Direction

     In  the estimation problem, we are mainly concerned  with the estimation
 of  concentrations  near the ground  level.  Most of the stations in a given
 area provide  measurement readings  representative of  the  average grid-square
 concentrations at  the height of receptor  location.   Measurements reported by
 the  monitoring stations then are used  as  inputs to the filter to compute
 estimates  near the ground level averaged  over the square  grid area.  Hence,
 it may be  very useful if a filter  can  quite effectively  calculate the esti-
 mates representative of the  average grid-square concentrations at the level
 of measurement height.  For  this reason,  a fairly detailed  finite difference
 representation is performed  for the vertical diffusion part (21).

     Consider the following  scale  transformation in  the  vertical direction,

         z* = h(exp z - l)/(exp Az - 1)
                                                                         (24)
         zk =  (k - l)Az

where z is the/transformed dimensionless  value of z*, Az  is the spacing in
this new coordinate system,   and h is the  effective height of measurement.
The definition of the effective height of measurement is apparent from (24).


                                     20

-------
Using this scale transformation in the finite difference implementation of
the diffusion part (21), we can calculate the concentrations near the effec-
tive height of measurement more accurately than by using a standard equally-
space, finite-difference representation.  This transformation is illustrated
in Figure 2 for Az = 0.5 and h = 15.

     (a)  Advection part:


          H=.0.06[uff+vf)                                      (25)

     (b)  Diffusion part:
 £ = 0 00006   exPz-D2 9    __  c                        (26)
9t   u-uuuub   h'expz   8z                                  ^  }
                                               9c "\
                                               8z J
where, for the convenience of notation, we have dropped the superscript *.
Thus, we now apply the finite difference methods to (25) and (26).

2.   Mesh Parameters

     The mesh used in the computation is shown in Figure 3.  We have selected
the following mesh parameters:

        At  =1.5
        Ax  = Ay = 2
        Az  = 0.5
         h  = 15/1000
         Nx = Ny = 13

         Nz = 6

where Nx, Ny, and Nz are mesh numbers along x-, y-, and z-direction, respec-
tively.  This mesh system, which is used for the computation of the estimated
concentrations, is a refinement of the mesh that is used for the computation
of the variances.  The coarse mesh which is indicated by the bold line in
Figure 3 is used for the computation of the filter variances.  The mesh param-
eters used in the computation of the filter variances are:

        At  =1.5
        Ax  = Ay  =4
        Az  = 0.5
         h  = 15/1000
         Nx = Ny - 7
These mesh parameters are determined from the considerations of computer
memory storage and computation time requirements.  Using these mesh systems,
the computer storage required for  the filter is the order of one megabyte

                                     21

-------
          7*
         250
         200
        150
        100
         50
          0
        = EFFECTIVE HEIGHT OF MEASUREMENT
Figure 2.  Scale transformation in vertical direction
                        22

-------
                                    7     8
10  11   12    13
 2



 3
 6




 7



 8



 9




10




11



12
                       MONITORING  STATION LOCATION
            Figure 3.   Basic  grid system for computation
                               23

-------
 or more for the  IBM  370/158  system.   In the measurement update computation
 of the  filter state,  intermesh  (spatial) interpolation calculations have
 been  carried out  for the  values at  refined grid locations.

 3.    Measurements  and Other  Conditions

      The measurements are given by

         W =  c(x5,'y£'h)tk) + W'     *=1>2	M             (27)

 where h is  the effective height of measurement.  We consider five monitoring
 stations, i.e., M  = 5, and the locations of all of these monitoring stations
 are shown in  Figure 3.  We chose to accept any point within a unit radius of
 a monitoring  station as being "at" that station.  The measurement interval is
 assumed to  be 15,  i.e., t.+, - t.  = 15.

      We used  a surface flux  pattern as shown in Figure 4.  Since the numerical
 errors generated by the finite difference approximation originate primarily
 from  inhomogeneities in the  concentration distributions, spatial variations
 of the emissions will have a strong effect on the performance of the filter.
 Hence, we have used a rather smooth pattern for the surface flux as shown in
 Figure 4.   However, as we have already observed in Section III.B, the compu-
 tation of the filter variance is completely independent of the surface flux
 pattern.

     A uniform wind field is assumed over an entire area, i.e., the wind ve-
 locity components  are given  as u = v = 2.  Eddy diffusivities used are K  =
 0.7 and KV  = 0.35.  White Gaussian noises are generated by using a standard
 subroutine:  We have used the following variances:  Q(x.,y.,z.,t. ) = 0.01,
 R.(t.  ) = 0.01.  The process noise is added to each grid location at each
measurement instant.   The measurement noise is added to each measurement lo-
cation at each measurement instant.  These noise statistics were used to gen-
erate a diffusion  process and a sequence of measurement data {wp(t.)}, upon
which the filter algorithm would operate.

D.   SIMULATION RESULTS

      In the intended application of the filter, actual monitoring data are
 used as input to produce the estimated mean concentration field.  Because of
the sequential nature of the filter it operates in real time, accepting data
as soon as they are taken and processing those data to give a continuous, cur-
 rent estimate of the concentration field.   As such, the algorithm is well-
suited for a central  computer in an air pollution  control  district.

     In the present study the monitoring data were simulated numerically by
integrating the model developed in Section III.C and artificially corrupt-
 ing the "exact" concentrations with random dynamic, £(x,t), and measurement,
n(x,t), errors to produce the "data,"  Wj(tk).   The value of such a numerical
experiment is that the true concentrations are known, and, consequently, the
performance of the filter can be evaluated quantitatively under a variety
of circumstances.   Two basic situations were studied:
                                     24

-------
\. WIND VECTOR
x 1
i C
2
3
5
6
* 7
8
9
10
11
12
17 /
.2345
f

- ao<
r> __
iii
)A/
/w;
0.i
ftl
in?
a
£
01
°\
0.
ni

x>2:
?/- 	
Y
i 6 7





8 9 10 11 12 1
f
i i

A
V
05- 	 005-
K—0.



or
0



1 1 1
«
rt /
o/
01
/j /;

)02
02.
.n

00/
.oo/
— n
Figure 4.   Spatial  distribution of pollutant surface flux
                       25

-------
     1.  The performance of the filter was studied as a function of some
         of the basic parameters of the problem for the base case described
         in Section III.C, namely five monitoring stations in the hypothetical
         region of Figure 3.

     2.  The performance of the filter was studied as a function of the num-
         ber of monitoring stations.

We now discuss the results of the simulations in these two situations.

1.  Performance of the Filter.  Base Case

         Three cases were simulated to study the performance of the filter.
The parameters for the three cases are given in Table 4.  The data were gen-
erated by solving the atmospheric diffusion equation numerically using the
same wind velocities, eddy diffusivity, and source emissions for all three
cases.   In addition, the same random noise was employed in each of the three
cases to produce the noisy measured concentrations.  The dynamic and measure-
ment noise, having the variances given in Table 4, were generated by the same
subroutine for each of the three cases.
             TABLE 4.  PARAMETERS USED IN BASE CASE SIMULATIONS
                u(x,y,z,t)  = 2     Kv(z,t)  = 0.7      Q(x,y,z,t)  = 0.01

                v(x,y,z,t)  = 2     KVo(t) = 0.35      R.(tk)  =  0.01
Parameters
c(x,y,z,0)
cQ(x,y,z)
P0(x,y,z,x,y,z)
PQ(Osy,zsO,y,z)
P0(x,0,z,x,0,z)
Pn(x,y,z,x' ,y' ,z')
Case A
0.1
0
102
(1/4)2
(1/4)2
0
Case B
0.1
0
104
(1/4)2
(1/4)2
0
Case C
0.3
0
102
(1/4)2
(1/4)2
0
     x1 / x,y' f y, z1
                                     26

-------
     Numerical ill-conditioning of the filter often results when there are
large initial uncertainties and relatively small noise variances.  Thus, we
specifically considered such a situation to test the convergence of the filter
to the actual concentrations.  The noise variances, Q(x,y,z,t) and R-jCt^)
were chosen to be relatively small (0.01) so that measurements at low con-
centration were the same order as the errors.  A large initial uncertainty
was assumed in case B, P0(x,y,z,x,y,z) = 10^, whereas relatively small initial
uncertainties were chosen for cases A and C, P0(x,y,z,x,y,z) = 10^.  Initial
uncertainties at the upstream boundaries are small, (1/4)S because we have
assumed that these boundary concentrations are relatively well known.  The
true initial conditions are different in cases A and C, whereas the initial
estimates are taken to be the same.

     Let us consider first the behavior of the variance P(x,y,z,x',y',z',t)
for the simulated cases.  (Note that P is independent of the actual data and
depends only on P , Q and R.) Figures 51" and 6 show P(x. ,y. ,h,x. ,y. ,h,t. ]
cases A and B of Table 3 at each of the five measurement locations.  These
                                                                   ,h,tj for
                                                                is.'
values represent the variance of the estimate error at height h,

        P(x.,yi,h,x.,yi,h,t|<) = E{[c(x1,y1,h,t|() - c(xi ,yi ,h,tk)]2}

Figures 7 and 8 show P(x,y,h,x,y,h,t. ).   The algorithm is seen to guarantee
nonnegativity of the computed variances.   Moreover, the variances show the
same general trend in cases A and B.  The numerical stability of the square
root algorithm is evidenced by the lack of sensitivity of P to the choice of
a priori variance for the low noise level cases.

     The estimated concentrations at each of the five measurement locations
are compared with the true and measured concentrations in Figures 9 and 10
for cases A and C, respectively.  In both cases the estimates display the
same general agreement with the true concentrations, and the estimates are,
as expected, closer to the true concentrations than are the measurements.
The true and estimated concentration distributions are compared in Figure 11
for case A.  Similar distributions are shown in Figure 12 for case C.  Simu-
lation and accuracy results for case C were consistently similar to those
for case A.

     Estimation error variances are shown in Figures 13 and 14 for cases A
and C, respectively.  The actual error variances are considerably smaller
than those predicted by the filter shown in Figures 5 and 6.  Thus, good
filter performance is demonstrated in situations with relatively low levels
of dynamic and measurement noise, situations often prone to numerical errors.

     The estimated concentration is compared with the true concentration for
case A in Figure 15.  The estimated concentration field tends to be smoother
than the actual field.  This behavior reflects the smoothing aspects of the
filter.

     The computer storage and time requirements for the simulations of Table 3
were (IBM 370/158)	
 fin this and subsequent figures, numbers on the curves refer to the monitor-
  ing locations in Figure 2.

                                     27

-------
          Core memory               1100 K bytes
          Computation time (CPU)    78.5 min

2.  Performance of the Filter.  Effect of Number of Measurement Locations


          Simulation studies were performed to evaluate the effects of the
number of measurement locations.  Case A of the former section was selected
as the base case.   The five measurement locations shown in Figure 3 represent
the potential monitoring sites.  Three measurement situations have been con-
sidered.  The conditions for the three cases are given in Table 5. Other
parameters of the simulation are the same as those given in Table 4 for case A.
Monitoring station 3, which is located at the center of the region, is used
in all three cases.   The three monitoring stations used in Case II were cho-
sen arbitrarily.

     The estimation  results are compared at each of the potential  monitoring
sites in Figure 16.   The most surprising result of this simulation is the
fact that the filtering algorithm is able to generate meaningful,  but not
accurate, state estimates even in case I in which only one measurement sta-
tion exists.   The results of case I at the potential  monitoring site 4 are
not acceptable since the concentrations are poorly estimated at the end of
the simulation.  The results of cases II and III are acceptable, however.

     The calculated  filter variances and the actual  estimation error vari-
ances are shown in Figure 17 as a function of the number of measurements.
In Figure 17, the plotted values represent the calculated filter variance
                      TABLE 5.   MEASUREMENT CONDITIONS
Base Case
Measurement Situations
Number of Measurements


Monitoring Station
(Figure 3)




1
2
3
4
5

Case I
1
-
-
0
_
Case A
Case II
3
0
0
0
_

Case III
5
0
0
0
0
0
                                     28

-------
                  60    90    120     150
                                                   I   I   I   I   I   I    I   II
                                                            60     90     120    150
Figure 5.  Computed filter variance,
           case A.
Figure 6.   Computed filter  variance,
            case B.
           P(x,y,h,x,y,h,tJ
     Figure 7.   Spatial distribution  of filter variance, case A  (a-k),

                                       29

-------
                                             e
                                            g
Figure 7.  (continued)
         30

-------
Figure 7.  (continued)
         31

-------
Figure 8.   Spatial  distribution of filter variance, case B (a-k),
                               32

-------
                              t, -75
                            t. • 105
Figure 8.   (continued)
         33

-------
Figure 8.   (continued)
         34

-------
                           True  Concentration

                           Measured Concentration

                           Estimated Concentration
       0.5
    V)

    §0.4
    c
    a>
    o
    c
    O
    o
        0
                  30
          60
90
                t




                2
r
               7
  I     i
                   30
           60
90
120
                                                          150
120
150
       0.4
                         I     I     I     I
                   30
           60        90

                 t
          120
          150
Figure 9.  Comparison of estimated, true and measured concentrations,

           case A (a-d).
                                 35

-------
I     I     I      I     I     I     I      I     I
          I     I     I     I      I     I     I
    30
      60       90
             t
Figure 9.  (continued)
120        150
                 36

-------
                                     True Concentration

                                     Measured Concentration

                                     Estimated Concentration
                  i     i    i     i     i
                         i     i    i     i
             0
30
60
90
120       150
          0.5 r-
         c
         o
         c
         a>
         o
         c
         o
        O
                  I     I    I
                      30
          60
          90       120
                   150
          0.5
                  j	 i     i     i    i     i     i     i     i    i
                      30
          60
          90
          120       150
Figure 10.  Comparison of  estimated,  true, and measured concentrations,
            case C  (a-d).
                                  37

-------
 c
 o
 c
 CD
 O
 C
 o
O
            I     I                 I     I     I
                                                            150
                                 (continued)
                              38

-------
                             True
                                                                                   Estimated
CO
<£>
                            True
                                                                                  Estimated
       Figure  11.   Comparison  of spatial  distributions of estimated and true concentrations,
case A (a-e).

-------
True
Estimated
True
Estimated
                 Figure 11.  (continued)

-------
                         True
 Estimated
Figure 11.   (continued)
                         True
Estimated
Figure 12.  Comparison of spatial  distributions of estimated  and  true  concentrations, case C (a-e).

-------
                           True
                                                                                   Estimated
ro

                            True
Estimated
                                                 Figure 12.  (continued)

-------
                              True
Estimated
00
                                                            e
                              True
Estimated
                                                 Figure 12.  (continued)

-------
       I    I   I    I    1   1    I    I   I
ID'5
  Figure 13.   Actual  estimation  error
              variances,  case A.
                                                          ii
Figure 14.   Actual estimation error
            variances, case C.

-------
               True Concentration         '
         	Estimated Concentration /
                              c(IO.IOt2)
         II    I     I     I     II     i
                             t

Figure 15.  Comparison of estimated and true concentrations
            at nonmeasurement points, case A
                          45

-------
            	CASE m (M = 5)
            	CASE n (M = 3)
            	CASE I (M= I)
0.5i-
                                         150
               I    I   I    I    I    I    I    I
          30      60     90      120     150
                                             b   .2
I    I     I    I    I     I    I     I
                                                                  I     I    I    I     I    I    I
           Figure 16-   Comparison of estimated concentrations, base case A  (a-d).

-------
         Computed Filter

             Variance
   ^3_
        Actual Estimation Error

                Variance

      tf_
  10*
   10'
in
a>
o
c
  10
    pi
  10
   -2
     0
J

 5
10
 p2
                 M
                   M
            Figure 17.  Comparison of variances, base case A
                               47

-------
           5

           I  p(vyrh'xryrh'V
          v~ I
           5
           I  E{[c(x ,yp,h,t.)  -
          £=1        £  *    K
and the actual  estimation  error variance

          5
          I [c(xry£,h,tk)  -  c(x£,yrh,tk)]2
         X"~ I

where  the summation is taken at the five potential  monitoring sites.   As  an-
ticipated, after a sufficient time of filter operation, i.e.,t > 105,  both
the filter and actual estimation error variances are inversely proportional
to the number of measurement locations.   At the initial stage, where  the
effects of measurement errors are more significant  than the filtering  effects.
there  appears to be no great advantage in using many monitoring stations.
However, after the initial stage, the filters with  three and five monitoring
stations show the much improved performance as compared to that with  only one
station.  In this respect, we note that  t = 105, as indicated in Figure 17,
gives  a critical time for the filter operation.  After this initiation time,
the accuracy of the filter is, roughly speaking, inversely proportional to
the number of measurement locations.

E.   APPLICATION OF THE PRESENT METHOD TO MONITOR SITING PROBLEMS

     There are a number of methods that  have been suggested for the design of
a monitoring network (Appendix D).  A useful  discussion of many criteria  for
siting stations can be found in Liu et al.  (1977).   In general, the approaches
focus on the situation in which one has  an  air quality model  available and
then uses the predicted concentration fields under  varying meteorological
conditions to select monitoring sites.   These approaches are ideally suited
for the case in which there are no current  stations (or air quality data) in
the region.

     In most cases, however*  several monitoring stations are already in
existence, and it is desirable to use the information available from the
existing stations as well  as  the information that can be obtained from the
basic air quality model  to locate new stations (or  move old ones), for more
effective surveillance of pollutant concentrations.   The present filtering
method provides the pollutant concentration distribution over an urban region
by making proper use of both  the limited  observed data and the basic atmo-
spheric diffusion equation.   Therefore,  the nresent method can provide the
monitoring network designer with an effective computational  technique  to
evaluate different criteria  for monitor  siting that require knowledge  of  the
full  concentration distribution over the  region.  For example, one of  the


                                     48

-------
siting criteria often suggested is  to locate stations at local  points of maxi-
mum concentration.   (For other criteria we refer the reader to Liu et al.
(1977).)  The evaluation of the concentration distribution can be effectively
performed by making use of the present filtering algorithm.  We may note, as
we have already observed in Section III.B, that the filter variance is com-
pletely independent of the source emissions.  This implies that the perform-
ance of the filter as a whole is quite sensitive to the large sources in the
region while the filter variance, or the filter gain, is determined only by
the diffusion model, i.e., wind velocity components and diffusion factors,
e.g., eddy diffusivities.  Thus, it is easy to take into account the occur-
rence of various wind field patterns and diffusion factors to reflect an im-
pact of these factors on a potential monitor as well as a source-monitor
relationship.

     A problem in the application of the present method to the actual air
pollution situations is the identification of the noise statistics, i.e., Q
and R.  As we have seen in the previous sections, the filter variance func-
tion P essentially carries all the information necessary to resolve completely
the filtering problem.  We have observed that P is independent of the actual
data and depends only on Q and R if the initial condition is specified.
Therefore, the proper identification of Q and R is very important for the
present algorithm.

     In general, Q determines the steady-state (t-*») level of the filter
variance.  The filter variance equation with Q = 0, i.e., no dynamic noise,
is notoriously unstable; it frequently gives negative diagonal entries in P.
On the other hand, in situations with high dynamic noise level, the filter
algorithm shows better numerical stability, but the results cannot ofen be
regarded as accurate or reliable.  Hence, it is clear that there are some
lower and upper bounds in the choice of Q to limit the correlations within a
reasonable value, e.g.,

          0 < Q  .  < Q < 0
              xmin - v — vmax

The identification of Q is based on our judgment as to the essential adequacy
of the atmospheric diffusion model for the given situation.  Thus the choice
of Q is problem  (model) dependent and appropriate values are generally deter-
mined from prior simulation studies.

     The choice of R depends on  how the discrepancies between data and pre-
diction are expected to vary with location.  The identification of R can be
carried out through a statistical analysis  of past model validation studies.
Let the predicted and observed  concentrations at the monitoring time t^  and
the monitoring  location x. be  given by c(x. ,t.) and c(x. ,t.), respectively.
For a series of  N model funs,  we can  form   R at each tj" as
where  the  subscript n denotes  the  realization at  n-th model  run.  Thus, these
identification  problems  involve the off-line statistical data analysis based
on  the air quality model.


                                     49

-------
     At present the resources required for the computer time and storage re-
quirements for a realistic situation would be prohibitive for most air pollu-
tion control agencies.  In the test situation considered here, large computer
costs and storage were required when only a 13x13 mesh was used to represent
the region.  Increases in both would occur as the mesh points increase in
number.

     Techniques for estimating the mean concentration field over a region
given sparse measurements have also been developed using more simplified
interpolation procedures such as the inverse of some power of distance and
the calculus of variations (see Appendix D).  It is desirable to compare
results from the relatively sophisticated scheme developed in this work and
simpler, less expensive, techniques such as those discussed in Appendix D in
relation to the improvement achieved per unit of resource expended.
                                    50

-------
                                 SECTION IV

                                 CONCLUSIONS

     This study is an investigation of the application of estimation (filter-
ing) theory to air pollution.   In particular, we have explored the use of
filtering algorithms to estimate the distribution of air pollutant concentra-
tions.  The estimated concentration distribution is then used for the siting
of additional  monitoring stations.

     The specific conclusions  of this study are:

     1.   The filter is able to produce estimates that follow the changes in
          pollutant concentrations as a function of time and location.

     2.   The numerical stability of the filter is not sensitive to the choice
          of a priori filter statistics and low noise levels.

     3.   The estimate uncertainty is, roughly speaking, inversely propor-
          tional to the number of measurement stations.

     4.   The time of filter operations and the number of measurement points
          are the most significant factors that determine the filter
          performance.

     The filter is shown to produce very effectively the real-time estimation
of the fine structure of the pollutant distribution over a region on the basis
of sparse measurement data.  The results reported herein, thus, can be "effec-
tively used in an air monitor siting study.
                                     51

-------
                                 REFERENCES

Bierman, G. J., Factorization Methods for Discrete Sequential Estimation.
    Academic Press, New York (1977).

Buell, C. E., "Objective Procedures for Optimum Location of Air Pollutant
    Observation Stations," Environmental Protection Agency Report EPA-
    650/4-75-005, June 1975.

Chang, T. Y. and B. Weinstock, "Urban CO Concentrations and Vehicle Emis-
    sions," J.  Air Poll. Control Assoc.. 23_, 691 (1973).

Darby, W. P., P. J. Ossenbruggen, and C. J. Gregory, "Optimization of Urban
    Air Monitoring Networks," J. Env. Eng. Div. ASCE, 100, 577 (1974).

Environmental Protection Agency, "Guidance for Air Quality Monitoring Net-
    work Design and Instrument Siting (Revised)," OAQPS Number 1.2-012,
    Research Triangle Park, North Carolina (1975).

Fromm, J. E., "A Method for Reducing Dispersion in Convective Difference
    Schemes," J. Computational Physics, ^, 176 (1968).

Gustafson, S.-A. and K. 0. Kortanek, "Determining Sampling Equipment Loca-
    tions by Optimal Experimental Design with Applications to Environmental
    Protection and Acoustics," Mathematics in the Social Sciences, Paper
    90-725, 1973.                                               ~

Gustafson, S.-8. and K. 0. Kortanek, "Numerical Optimization Techniques- in
    Air Quality Modeling; Objective Interpolation Formulae for the Spatial
    Distribution of Pollutant Concentration," Report EPA-600/4-76-058,
    Environmental Protection Agency, Research Triangle  Park, North Carolina
    (1976), also J. Applied Meteorology. J(5, 1243 (1977).

Harrison, P. R., "Considerations for Siting Air Quality Monitors in Urban
    Areas," 65th Annual Meeting APCA, Miami Beach,  June 18-22, 1972.

Hougland, E. S.  and N.  T.  Stephens, "Air Pollution  Monitor Siting by
    Analytical  Techniques," J.  Air Poll. Control  Assoc.. 2£, 51  (1976a).

Hougland, E. S.  and N.  T.  Stephens, "Air Pollution  Monitor Siting by Ana-
    lytical  Techniques  II:   Multiple Pollutants," 69th  Annual  Meeting of
    the Air Pollution Control  Assoc., Paper 76-39.1, Portland, June,  1976b.

Hwang, M., J.  H. Seinfeld, and G. R. Gavalas, "Optimal  Least Square Filter-
    ing and Interpolation in Distributed Parameter  Systems," J.  Math.  Anal.
    Appl.. 39,  49 (1972).                                    	

                                     52

-------
Jazwinski, A.  J., Stochastic Processes and Filtering Theory.  Academic Press,
    New York (19701:

Kaminski, P. G., A. E. Bryson, a.-.d S.  F.  Schmidt, "Discrete Square Root
    Filtering:  A Survey of Current Techniques," IEEE Trans.  Automatic Con-
    trol. AC-16. 727 (1971).

Koda, M., "Dynamic Estimation of Diffusion Systems:  Application to Predic-
    tion of Air Pollution," Ph.D. Thesis, University of Tokyo (1976).

Kumar, S., R.  G. Lamb, and J. H. Seinfeld, "Prediction of Long-Term Average
    Pollutant Concentrations in an Urban Atmosphere," Atmospheric Environ-
    ment. 1_0, 707  (1976).

Lamb, R.G. and J.H. Seinfeld, "Mathematical Modeling of Urban Air Pollution-
    General Theory," Environmental Sci. Technol.. 7_, 253 (1973).

Liu, M.K., J. Meyer, R. Pollack, P.M. Roth, J.H. Seinfeld, J.V. Behar, L.M.
    Dunn, J.L. McElroy, P.N. Lem, A.M. Pitchford, and N.T. Fisher, "Develop-
    ment of a Methodology  for Designing Carbon  Monoxide Monitoring Networks,"
    U.S. Environmental Protection Agency,  EPA-600/4-77-019, March 1977.

Ludwig,  F.  L.  and  J. H. S.  Keoloha, "Selecting  Sites for Carbon Monoxide
    Monitoring," Stanford  Research  Institute, Menlo  Park,  CA  (1975).

McElroy, J. L.,  J. V.  Behar, L.  M.  Dunn,  P. N.  Lem,  A. M.  Pitchford,  N. T.
    Fisher, M.  K.  Liu, T.  N. Jerskey, J.  P. Meyer, J. Ames and  G. Lundberg,
    "Carbon Monoxide Monitoring  Network  Design-Application in  the Las
    Vegas  Valley." U.S. Environmental Protection Agency, 1978  (in press).

McGarty, T. P.,  Stochastic Systems  and State  Estimation. Wiley-Interscience,
    New York  (1974^

Ott, W.,  "An  Urban Survey  Technique for  Measuring  the Spatial  Variation of
    Carbon  Monoxide  Concentrates in Cities,"  Ph.D.  Thesis, Stanford  Univer-
    sity (1971).

Ott, W.  and R.  Eliassen,  "A Survey  Technique  for Determining  the  Repre-
    sentativeness  of Urban Air  Monitoring Stations  with  Respect to Carbon
    Monoxide,"  J.  Air Poll.  Control Assoc., £3, 685  (1973).

Ott, W.,  "Development of  Criteria  for Siting  Air Monitoring  Stations,"
    68th Annual  Meeting  of the  Air  Pollution  Control  Assoc.,  Paper 75-
     14.2,  Boston,  June 15-20,  1975.

Sage,  A.  P. and J. L.  Melsa, Estimation  Theory with Applications  to  Communi-
     cations and Control,  McGraw-Hill, New York (1971).
                                      53

-------
Sakawa, Y., "Optimal Filtering in Linear Distributed-Parameter Systems,"
    Int.  J. Control. J6., 115 (1972).

Seinfeld, J. H., "Optimal Location of Pollutant Monitoring Stations in an
    Airshed," Atmos. Environ., i, 847 (1972).


Tzafestas, S. 6. and J. M. Nightingale, "Optimal Filtering, Smoothing and
    Prediction in Linear Distributed-Parameter Systems," Proc. I.E.E..
    115. 1207 (1968a).

Tzafestas, S. G. and J. M. Nightingale, "Concerning Optimal Filtering Theory
    of Linear Distributed-Parameter Systems," Proc. I.E.E., 115, 1737
    (1968b).

Tzafestas, S. G., "On the Distributed Parameter Least-Squares State Esti-
    mation Theory," Int. J.  Systems Sci., 4., 833 (1973).

Vukovich, F. M., "A Description of a Technique for Developing an Optimum
    Air Pollution and Meteorological Sampling Network in Urban Regions,"
    IEEE Annals No. 75CH 1004-1 8-4, 1976.
                                     54

-------
                                  APPENDIX

A.  Distributed Parameter Filtering Theory

    The urban atmosphere can be considered as a stochastic, distributed
parameter system the state of which is the (ensemble mean) pollutant concen-
trations as a function of location and time.   In this section we review the
essential results from distributee-parameter filtering theory that are rele-
vant to the air pollution estimation problem.

    The system model consists of equation (5) with the measurement process
given by equation (8).  As we have noted, (5) is not the exact equation for
the ensemble mean because of eddy diffusivities Ku and KV representation of
the turbulent diffusion process.  We propose to add an artificial random
forcing term to the right hand side of (5) to account for discrepancies be-
tween c. as  predicted by (5) and the true (but unknown) ensemble mean
concentration.

    The system is abstractly described by

        3c(x,t)
          9t
                 = Lc(x,t) + £(x,t)                                   (A.I)
on a connected open domain ft of an m-dimensional Euclidean space Em, with
boundary 3ft.  The m-dimensional spatial coordinate vector is denoted by x.
The state c(x,t) is a scalar ensemble mean concentration, and Lx is a spatial
elliptic differential operator associated with  (5).  The process noise
£(x,t) is a scalar distributed white Gaussian process with properties:

         E(£(x,t)} = 0
                                                                       (A.2)
                      )} = Q(x,y,t)6(t-T)
where Q(x,y,t) is symmetric with respect to the spatial variables x and y,
and positive semi-definite variance function. The initial state c(x,t ) Ts a
white Gaussian process  and only its mean c (x) and variance P0U,y), i.e.,

         Etc(x,t0)} - c0(x)            ^                               (A>J)

         E{[c(x,t ) - c (x)][c(y,t) - c (y))> = P (x,y)
              —  u     u—     ~u     u—       u-w —

are assumed to be known.  The  boundary conditions have been assumed to be
inhomogeneous,

         Lbc(x,t) = <(>(x,t)      , x  € an                                (A. 4)


                                    55

-------
where  Lb  is a  linear boundary differential operator and 4>(x,t) is a known
deterministic  function.  It is clear that  (5) falls within the class  (A.I),
and there is no loss of generality  in assuming,  in this discussion, that
the state and  other functions are scalar.

    The measurement process is represented as

        W = c(*rV + W      '   *=1>2»---»M              (A. 5)

where n0(tt) is a white Gaussian process with properties:
       JC  K

         E^.UJ) = 0

         E{n1(tm)nj(tn)}=R..(tn)6mn                                  (A.6)

where R,-j(O  is a positive variance.   It is important to note that (A. 5)
represents the most important class of measurement situations in the air
pollution estimation problem, i.e., discrete-time measurements at M-discrete
points in space.

    The filtering problem is as follows:

         Given a realization (wji(t|(), I = 1,2,..., M; k = 1,2,... },
    find the estimate of the state c(x,t) that maximizes the conditional
    probability density functional of~the state.

This optimal estimation problem can be solved most effectively by using the
maximum likelihood approach based on the theory of the semigroup of linear
operators in Hilbert space.  We now discuss briefly the optimal solution  of
the linear distributed filtering problem.

    First we consider the time update equations, i.e., equations which de-
scribe the time evolution of the filter state between measurements.   Since
there is no additional  information between the measurements,  the maximization
of the conditional  probability is achieved by simply taking the expectation
of the state.   Hence,  from (A.I)  and/v(A.4), the following time update equa-
tions hold for the optimal  estimate c(x,t),
                     -Lc(x,t)         t(x,t)    ,  x e 8fi                             (A.8)


The estimation error


              c(x,t) = c(x,t)  - c(x,t)                                  (A.9)



                                      56

-------
has the variance defined by

         P(x,y,t) = E{c(x,t)c(y,t)}                                    (A.10)

By definition, the variance function P(x,y,t)  is  symmetric with respect to  x
and y and positive semi-definite.

    Subtracting (A.7) and (A.8)  from (A.I)  and (A.4)  respectively,  we can
obtain the following equations for the estimation error c(x,t).
                     =  L  c(x,t)  +  £(x,t)      t.<  t  <  t.xl              (A.11)
             3c(x,t)

             ~ii       vvc:

              Lbc(x,t)  =  0      ,             x ean                     (A.12)

It is important to note that the  estimation  error satisfies  the  homogeneous
boundary condition (A.12) while the optimal  estimate  satisfies the  inhomo-
geneous boundary condition  (A.8).   The general  solution  of (A.ll) with  homo-
geneous boundary condition  (A.12)  can be expressed in the  form:


              c(x,t)  =  f  G(x,t;y,t')c(y,t')  dy
                   JG(x,t;y,T)£(y,T)  dydr
                t1  n

where t1 is a suitable initial time and G(x,t;y,t) is the Green's function
associated with (A.ll) and (A. 12).  Note tfiat G(x,t;y,T) satisfies the
equation,


             4G(x,tiy,T) = L G(x,t;y,T)                              iA.14)
             O L   ~   ~       A  "*   ^*

with the terminal condition

             G(x,t';y,t') = 6(x - y)                                   (A. 15)

and boundary condition

             LbG(x,t;y ,•!•),= 0 ,        x e 8fi                          (A. 16)

Using (A.13) with (A. 10) we obtain the estimation error variance function as


              P(x,y,t) =  ff G(x,t;r,t')P(r,?,t')G(y,t;s,t') drds
                —      JJ~~      —      ~~      ~~
                                      57

-------
          +   f   if  G(x,t;r,T)Q(r,s,T)G(y,t;s,T) drdsdr                (A.17
              jjj    ~   ~     ,, ~     „   ~     —
              t1 ffl

Formal differentiation of (A.17) with respect to t together with the use of
(A.14) - (A.16) gives the following differential equations for the estimation
error variance:
            3P(x,y,t)
            —===	  = LP(x,y,t) + P(x,y,t)L'  + Q(x,y,t)            (A.18)
               ot         x~~        ~~y      ~~


              L.P(x,y,t) = 0     ,       x e an                         (A.19)
               D  ~ ~                   ~


              P(x,y,t) 4 = 0,        y-e an                         (A.20)

where the formal operator L' is defined by the relation P(x,y,t)L' =
LvP(x,y,t), and is an operator to the left.  The same definition is utilized
for L/7  Thus it has been shown that the variance function P(x,y,t) has the
homogeneous boundary conditions (A.19) and (A.20).

    It is important to note that the predicted (a priori) information at the
start of the measurement interval coincides with the filtered results before
any data have been included.  This explains the initialization of the time
update equations:  initial conditions of (A.7) and (A.18) should be filtered
(a posteriori) estimate and variance, respectively.

    At each measurement instant, the solution to (A.7) with boundary condi-
tion (A.8), and the solution to (A.18) with boundary conditions (A.19) and
(A.20) constitute the current estimate and variance.  These become a priori
(predicted) estimates and are combined with the new data at t=t.  , to update
the estimate and variance.  The equation defining the optimal estimate at
measurement instant time t.+, is
                                 I    I      ..
                                i~ i j~ i                         **
where the superscripts - and + denote the time instants immediately prior to
and immediately subsequent to the measuring instant, respectively.  Note that
c(x.»tjV,) 1s a solution to (A. 7) and (A. 8) at tk.-,.  The variance function
P(x,y,t) obeys the following equation at the measurement instant t.
                                                                  . +,
                                     58

-------
         MM                 -,
         r1    T  f\ /   _   _»_"  \ r »™" I / j.   NT   n/—  ,.i   \                  /AOO\

        i=l  j=l
Note that P(x,y,tjJ+1) is a solution to (A. 18), (A. 19), and (A. 20).  The

matrix A(tk+1) in (A. 21) and (A. 22) is defined by its element as
 It is  important to note that  (A. 22) does not depend on the actual measure-
 ments.  Hence, we can quite routinely generate variance functions correspond-
 ing to candidate air pollution models and measurement strategies.   In this
 way estimation accuracy might be determined prior to the occurrence of the
 actual event.

 B.  Finite Difference Approximation and Square Root Implementation  of Dis-
    tributed Parameter Filter

    The major impediment to the application of distributed parameter filter-
 ing is the spatial dimensionality of the filter.  If m=3 (most general air
 pollution problems have three spatial variables), then the filter variance
 P(x,y,t) is a function of six spatial variables.  Numerical solutions of par-
 tiaTdifferential and/or difference equations having more than three spatial
 dimensions are rarely attempted, particularly for equations as complex as the
 variance Riccati equation of  the distributed parameter filter.  The key prob-
 lem, therefore, in the application of distributed parameter filtering to air
 pollution analysis, is the development of efficient methods for solving the
 variance equations of the filter.

 B.I  Finite Difference Approximation of Distributed Parameter Filter

    Approximation of the filter is, of course, required at some point since
 distributed parameter systems span an infinite-dimensional space and it is
 only possible numerically to  obtain solutions in a finite-dimensional sub-
 space.  In order to work with the solutions of the filter, we develop a finite
 difference approximation.

    In discussing the finite  difference approximation it is useful  to consider
 a specific form of the atmospheric diffusion equation.  If turbulent diffusion
 in the horizontal directions  and vertical wind velocity component can be ne-
 glected, the atmospheric diffusion equation, with the addition of a dynamic
 error, becomes
                                      59

-------
           u(z)    +  v(z)    -         z)     + ?(x.y.z.t)              (B.I)
      9t
 where  x  elO,Xmax], ye [0, Ymax] , and ze [0, Z|naj{].  The boundary condi-

 tions  are  given by,

               _K  ac(x.y.O.t) .
               \o     3z

               9c(x,y, Z    ft)                                         (B>2)
               _  max  y - Q
                     9z

 where  S(x,y,t) is the  surface flux of the pollutant.  Other boundary condi-
 tions  are  given at the upwind boundaries:

         c(0,y,z,t)  =  c.
                       b                                               (B.3)
         c(x,0,z,t)  =  cb

 where  c.  can be a function of time.

    Let Ax = WV Ay = Ymax/Ny' and Az = Zmax/Nz' where Nx' Ny> and Nz
 are the numbers of mesh points in the x,y,z directions, respectively, and
 let At be a suitable finite difference time step.  Let us denote x. = iAx,
 y. = jAy, z,  = kAz,  and t  = aAt.  We shall use the following notation:
 J         K             U

where 6 e [0,1] and let 6.  denote the forward difference operator:
                                                                       (B-5)
     Based on the present model (B.I), we can construct a family of finite
difference approximations associated with parameter 6 which represent solu-
tions to the state time update equation (A. 7):
                                    60

-------
        ?lk  -  - MO L  A<;>^ - L2(k, *  fl<^e
 where
                         N
            1 = 1...., N   ; j = 1,..., N : k = 1,..., N
                       *               y              Z
                       u(zk)
                        'v~k'                                           (B.7)
                                3K.,(z.)
               L3(k) =  Kv(zk) + -gX-JL-


 The  superscripts  (1), (2), and (3) on the coefficient matrix A  refer
 to  the  x, y,    and    z coordinate directions, respectively." If 6 = 0,
 (B.6) yields the  so-called Crank-Nicolson method; for 6 = 1, (B.6) is a
 simple forward difference method.  The fully discrete equation (B.6) requires
 the  solution of a system of N  x N  x N, linear algebraic equations at each
 time level.                  x    y    z

      In order to  ascertain the stability and accuracy of the approximation,
 we apply the time-splitting technqiue in the actual  implementatior of (B.6).
 The  technique consists  of applying each 1-dimensional finite difference method
 independently and in succession, with no significance attached to the result
 of the former calculation step.   In this framework the original  finite dif-
 ference approximation (B.6) might be decomposed into the 2-dimensional  "advec-
 tion  part," and into the 1-dimensional "diffusion part."

      (a)  Advection part:
                                    N
               6V3>H/3 _     ,.}V
               \  cijk   - - L2(k) 5,  "jv "ivk
     (b)  Diffusion part:
                ut  cijk   - U3VN'  fa  "kv ^ijv


where 6t   is a difference operator defined by

                                    61

-------
                                                                       (B-n>

This decomposition is consistent and also improves the numerical stability
and accuracy.  In other words, the time-splitting approach decomposes the
original 3-dimensional finite difference equation into a simpler and better
conditioned 1-dimensional finite difference equation.

     The standard Crank-Nicolson second-order method is applied to the diffu-
sion part (B.10).  This implies 0 = 0 in (B.10), and the method is uncondi-
tionally stable.  Fromm's (1968) second-order, zero-average phase error method
is applied to the advection part (B.8) and (B.9).  Then (B.8) and (B.9) can
be represented as:


                     1  ak(ci-l,j,k - C?+l,j,k + c?-2,j,k  - Cijk>
     -0+2/3 _ :a+l/3   1    ^a+1/3     >H/3   + ~        _
     cijk   ' cijk   + 4  6k(ci,j-1,k ' ci,j+l,k + ci,j-2,k    ijk
where a.  = u(z. )At/Ax and Bk = v(z. )At/Ay.  The method is stable for a.  + $.<1 ,

and improves the phase error properties considerably.   We note that the bound-
ary conditions (B.2) and (B.3) can be easily incorporated into these finite
difference equations.

     The numerical  procedure (B.8) - (B.10) can be formally combined so that
the time update equation for the optimal estimate can  be written as follows:

               c(a + 1) = *(a + l.a)c(a)                               (B.14)

where c(a) is the N  x N  x N -dimensional state vector with elements c?..,
and (5 + 1 ,a) is tne corresponding state transition matrix associated  J
with~the finite difference routine (B.8) - VB.10).  It is important to note
that (B.14) is a completely formal expression, and that, in the actual cal-
culation, neither computation nor storage of $(o + 1 ,a) is required.


                                     62

-------
     Using similar notation as in (B.14), the measurement process (A.5) can
be represented in vector form as:

               w(o) = H(a)c(a) + n(a)                                  (B.15)

where w(a) is the M-dimensional measurement vector, H(a) is the suitably de-
fined M x N N N  measurement matrix, and g(a) is the~M-dimensional white
Gaussian sequence with the covariance matrix R(a) whose elements are defined
by R1j(t(J) in (A.6).

B.2  Square Root Implementation of Distributed Parameter Filter

     The finite difference method may be also effectively applied to the time
update equation for the estimation error variance.  The basic problem in this
approach lies in the dimensionality of the filter variance equation (A.18),
for which extensive numerical computations are necessary.

     Usually large-dimensioned systems are overly sensitive to numerical
errors and the effects of numerical errors are generally manifested in vari-
ous type of numerical difficulties.  Difficulties relating to computer round-
off appeared in even the very early applications of Kalman filtering proce-
dure.  Numerics is the dominant error source in the Kalman algorithms, and
they completely obscured the important effects of mismodeling and lead to
misleading estimates of the filter accuracy.

     The measurement update equation of the filter variance (A.22) is sensi-
tive to the effects of computer roundoff and is susceptible to an accuracy
degradation due to the differencing of positive terms.  This numerical accu-
racy degradation is often accompanied by a computed variance that ioses its
positive definiteness.  An alternative and more consistently reliable solu-
tion to the numerical instability problem is to perform some of the computa-
tions using  an algorithm that is numerically better conditioned.  As such
an alternative we utilize the square root implementation of the filter vari-
ance equations.

     The direct application of the finite difference routine to the vari-
ance equation (A.18) results in the following expression (using the same
notation of Section B.I.):

                         Nx         Q            Nx      ^ Q

     t ijk^mn ~ " Lrk^  ^, Aiv Pvjk£mn " Lrn^ "-, £v  ijkvmn
                         V~* I                     v i
                • L2<
                   2     
-------
         i , H = 1 , . . . , N  ; j ,m = 1 , . . . , N ; k , n = 1 , . . . , N
                         A                 y


where the variance function is discretized by
Note that (B.16) involves the computation of P^^o.-p which has six subscripts
for six spatial coordinate directions.  However]     it is not difficult to
relate the 6-dimensional subscript with the corresponding 2-dimensional matrix
subscript by the transformation


      (i,j,k,JUm,n)

      * (i+N (j-1) + N N (k-1), £ + N(m-l) + N N  (n-1))                (B.18)
            /\         /^ y            /\         /\ JF

It is important to note that this matrix subscripting can be  incorporated
into efficient FORTRAN  implementation.  In this way, using the state transi-
tion matrix $(a+l ,a) of (B.14), it is possible to represent (B.16) in  the
formal matrix form:

               P(a+l) = $(a+l,a)P(a) + P(a)$T(a+l ,a) + Q(a)             (B.19)

where P(a) and Q(a) are suitably defined N N N   x N N N  matrices.  Note that
P(a) is a symmetric positive semidefinite matrix.     y

     In order to include the effects of numerical errors in the filter design
requirements, we utilize the covariance square root implementation (Kaminski
et al . (1971)).  The covariance square root filter is a data  processing algo-
rithm based on a triangular square root factorization of the  estimation error
covariance matrix and is reputed to be more accurate and stable than the con-
ventional non-square root algorithms.  The use of square root matrices  im-
plicitly preserves symmetry and assures nonnegative eigenvalues for the com-
puted variance.  The square root algorithms achieves accuracy that is  compati-
ble with a Kalman filter that uses twice their numerical precision.

     Factorization of a symmetric matrix with nonnegative eigenvalues  is al-
ways possible.   As one might expect, the covariance square root is not
uniquely determined, however, a unique square root can be defined by a  tri-
angular square root matrix.  There are three basic algorithms for obtaining
the triangular form of the square root matrix.  The first, and generally the
fastest of the algorithms, employs the Cholesky decomposition.  The second
algorithm is known as the Householder orthogonal izati on transformation.  This
method usually yields more accurate results than the Cholesky decomposition
procedure, however, it is considerably slower.  The third method is the


                                     64

-------
modified Gram-Schmidt procedure.  The method is comparable to the Householder
algorithm with respect to the computation time and accuracy.


     The covariance square root matrix S(k) is defined by


               S(k)ST(k) = P(k)                                        (B.20)


Then the filter time update equations (B.14) and (B.19) can be summarized
in the following covariance square root form:


           c  (k+1) = $(k+l,k)c.(k)                                    (B.21)
           ~~—         -*       «*T


                 = $(k-H,k)Sjk)                                       (B.22)
                   •>*       *vT


                         = S(k+l)ST(k+l) + Q(k)                        (B.23)
where the subscripts - and + denote the value of the estimate, or covariance
square root, at the time immediately prior to and immediately subsequent to
the measuring instant, respectively.  In the actual computation of (B.22),
we apply the same time-splitting technique as (B.8) - (B.10).


     (a)  Advection part:


                                                                        (B-24)
          J/3sk+l/3 = .   (k)   y   (2) k+l/3,6                        (B.25)
           t   pq         2V         jv  vq
     (b)  Diffusion part:



                                                                        (B.26)
where the subscripts are defined by
          P =  i + Nx(j-l) + NxNy(k-l)

          q =  i + Nx(m-l) + NxNy(n-l)

         v-, =  v + Nx(j-l) + NxNy(k-l)                                   (B.27)

         v2 =  i + Nx(v-l) + NxNy(k-l)

         v3 =  i + Nx(j-l) + NxNy(v-l)
                                      65

-------
 We  may  note  that  it  is  not  necessary  to  compute  and  store  the  state  transi-
 tion  matrix  $(k+l,k)  in the procedure (B.24)-(B.26).   As indicated by  the
 subscripts of  S,  only the matrix multiplication  from the left  is  involved  in
 the computation of the  time update  factors  of  S.  Thus, different from (B.19),
 we  can  arrange a  simple and efficient numericaT  implementation for the time
 update  of covariance  square root.

      We use  a  modified  Gram-Schmidt orthogonal ization  procedure in the fac-
 torization of  (B.23). (For  detail see Appendix C.)   The modified  Gram-
 Schmidt algorithm is  reputed to have  accuracy  that is  comparable  with  the
 Householder  algorithm.   Unlike the  classical procedure, the modified algo-
 rithm produces almost orthogonal vectors, and  reorthogonal ization and  pivot
 strategies are unnecessary.  From this viewpoint, it  is possible  to  interpret
 the square root process  as  a Gram-Schmidt orthogonalization of an appropri-
 ately defined  state vector.

      Using the measurement  equation (B.15), define the Cholesky square root
 matrices V(k)  and G(k)  by
         *W        ~*


          V(k) =  R1/2(k)
                                                                        (B.28)
          G(k) =  [H(k)S_(k)sJ(k)HT(k) +  R(k)]1/2


where the superscript 1/2 is reserved for the Cholesky square root (see Appen-
dix C).   Then the filter measurement update equations (A. 21) and  (\.22) can
be summarized in the following covariance square root form:


                     + S_(k+l)S^(k+l)HT(k+l)G'T(k+l)G"1(k+l)
                                 _                                      (B.29)



  S.(k-H) = S (k+1)
  •*. 1         «s* ™

     - S_(k+l)S(k+l)HT(k+l)G~T(k+l)[G(k+l)  + V(k+l )]"1H(k+l)S_(k+l)     (B.30)
The filter measurement update equations (B.29) and (B.30) involve Cholesky
decomposition of two MxM matrices as depicted in  (B.28) and inversion of two
triangular Cholesky square roots.  Since matrix inversion of the triangular
matrix preserves triangularity, this procedure is a costly operation in terms
of computer execution time, especially when M « N N N .
                                                  x y z
     In general, algorithms involving square root matrices reduce the dynamic
range of numbers entering into the computations.  This factor, together with
the greater accuracy that square root algorithms guarantee, directly affects
computer word length requirements.  A rule of thumb is that square root algo-
rithms can use half the word length required by conventional non-square root


                                     66

-------
algorithms.   The modified Gram-Schmidt triangularization, Cholesky decompo-
sition and its inverse algorithm are described in Appendix C.

C.   Square Root Matrix

     Factorization of a symmetric matrix with nonnegative eigenvalues (a
positive semidefinite matrix) is always possible.  Since nxn matrices that
are symmetric can be completely described by n(n+l)/2 of their elements,
their square roots are not uniquely determined in general.  They are, how-
ever, related to one another by orthogonal transformations.

     If S, and S9 are square roots of a positive semidefinite matrix P, that
is,     ~]     "Z

               P = S,SJ = S9s]                                         (C.I)
               ~   ~ | ~ |   ~C~C.

then there exists a matrix T such that

               S2 = S^ and TTT = TTT = I                              (C.2)

in other words, T is orthogonal.  It is important to note that the use of
different square root matrices does not alter the results.

     Using an upper or lower triangular matrix, a unique square root can be
determined since symmetric nxn matrix and triangular nxn matrices are both
characterized by n(n+l)/2 of their elements.  By computing only the upper or
lower triangular nonredundant entries, symmetry and positive semidefiniteness
of the covariance matrix are preserved.  Upper and lower triangula • factori-
zation involves basically the same techniques and computation timr.  Here
we summarize the lower triangular factorization of the covariance matrix.

C.I  Lower triangular Cholesky decomposition

     Any  symmetric positive semidefinite matrix P has a  lower triangular
factorization, P = SST.  S is computed from the following recursive
algorithms:


     For i = 1,.... n
                           1-1   o
                           Y,  s?,                                     (c.3)
                                              (j  <  i)

                                                                        (C.4)
 where n = dim P.

                                      67

-------
 C.2  Inversion  of lower  triangular matrix
      Matrix inversion  of a  lower triangular matrix preserves lower triangu-
 larity.   S~  is computed from the following recursive algorithms:
           For  i  =  1,..., n
           s-1  =  i/s..
                                                                 (C.5)
                  0
                                       (j  <  i)
                                                                      (C.6)
                           E
                           k=i
                              .-1
                              >ki'
 C.3   Modified Gram-Schmidt orthogonalization
      Let  f,,..., f  be linear independent N vectors with N>n.   Consider the
 recursions:
      For  i =  1,..., n
S.. -VfJ
                                                                      (C.7)
                           f\
                            (1)
                                     (j < i)
                                (j =  1+1,..., n)
                                                                      (C.8)
Then
                [fr...,  fj = SS'
                                                                      (C.9)
                                                                (C.10)
                                    68

-------
where S is lower triangular.
      <\*

     Then the factorization (B.23) can be handled by constructing a suitable
orthogonal matrix T such that
               ST(k)"
               WT(k)
          V(k)"
                                                                        (C.ll)
where
          W(k) = Q1/2(k),
                                                        (C.12)
 Using  the  modified  Gram-Schmidt  procedure  (C.7)-(C.9),  the  transformation
 (C.ll)  can be  performed without  explicit storage  or  computation  of T.

     For the technical details and  a more  complete discussion  of the matrix
 factorization  algorithms we  refer the  reader  to Bierman (1977).

 D.   Discussion  of  Approaches to the Design of a  Monitoring System

     In this Appendix we outline prior approaches that  have been proposed  for
 the monitoring system design problem.   The object of this Appendix is  to pro-
 vide the reader  with some  perspective  on other approaches,  their advantages
 and shortcomings, and their  comparison with the approach taken here.

     Seinfeld  (1972) posed the optimal  monitoring location  problem from the
 point  of view  of source surveillance.   Given  an atmospheric diffusion  model
 he  proposed that monitors  be located at points where the pollutant concentra-
 tions  are  most sensitive to  changes in source emissions.  He developed an
 optimization routine that  would  automatically find those locations with maxi-
 mum sensitivity. The results of that  work can be viewed as complimentary  to
 those  of the present work  since  we  have not considered  the  criterion of
 source surveillance here.

     Gustafson and  Kortanek  (1973,  1976) have essentially employed a regres-
 sion approach  to the optimal monitor siting problem.  In their 1973 work they
 propose to determine an optimal  regression model  by  minimizing the differences
 between observed and calculated  concentrations.   Let f(x) represent the con-
 centration of  a  pollutant  at location  x.   The observations  can then be repre-
 sented by  f(x) + n(x) where  n(x) is white, Gaussian.noise.   The authors pro-
 pose that  f(x) be represented By
f(x) =

                                   + n(x)
(D.I)
                                      69

-------
where {<|>.}  is an appropriate set of functions, and {a.} are unknown constants.
The optimal location problem is posed as determining a set of measurement
locations,  x,, Xo,..., XM, to minimize
            -~ I  ~c.      ~|\|
                        ^fx.) - ffXj                                 (D.2,
where p. are an appropriate set of weighting coefficients.
       J
     In more recent work, Gustafson and Kortanek (1976) proposed a scheme
based on a least squares fit of sparse measurement data to an analytical dis-
persion formula.  The concentration of a pollutant at location x is expressed
as

                       M
               c(x) =  I  q,v.(x)                                       (D.3)
                 ~    ,• _ I  j j ~

where q. is the strength of source j, and v. is the concentration produced at
x by a source of unit strength at source j.  Clearly, v. is a function only
of meteorology.  If concentrations are measured at N stations at locations,
x-|,X2»..., XN> yielding measurements c,, C2»..., c.,, then the authors propose
that the M pseudo-source strengths q., j = 1,2,..., M can be determined from
                                    J
                M
                I  q,v.(x.) = c.         i = 1,2,..., N                 (D.4)
               •j = l  j j ~p      '

The solution to (D.4) for the q. can be obtained by least squares.  Once the
q, have been determined, then tfley can be used to predict concentrations at
other locations for the purpose of choosing new monitoring stations.   The key
to this approach is that the source strengths are treated as unknown in order
to determine the model that is to be used subsequently for siting.  This ap-
proach is not as desirable as one based on a good source inventory, such as
that of Liu et al. (1977).

     Hougland and Stephens  (1976ab) presented a location model which defines
a measure of relationship between an emission source and a potential  monitor
site.  The measures, called coverage factors, are defined for each combina-
tion of source, potential monitor, and wind direction.  For a tentative
assignment of samplers to potential monitor site, a "source oriented sum
(SOS)"  is calculated.  Then a heuristic nonlinear programming technique is
used to search for those combinations of sampler assignments that minimize
the source oriented sum for the number of monitors allowed on assignment.  This
model is expressed as follows:
                                     70

-------
               Ns   Nw
Maximize SOS =  I   I
                         Max   {A...  X.}
                  i=l k=l  Kj
-------
             TABLE D.I  SUMMARY OF  PREVIOUS APPROACHES TO OPTIMAL AIR POLLUTANT MONITOR SITING
  Investigators
   Siting Criterion
                                                                       Comments
  Seinfeld (1972)     Locate monitors at  points  of maximum
                    sensitivity to changes in  emission
                    rates from major sources,  the "source-
                    surveillance" siting criterion.
                                   Approach based on atmospheric diffusion equa-
                                   tion air quality model.   Need source inventory
                                   and meteorological data  for region.  Method
                                   applicable to inert and  reactive pollutants.
  Gustafson and       Locate monitors so as to determine  an
  Kortanek (1973)      optimal regression fit to the concen-
                    tration field.
                                   Approach based on statistical optimal experi-
                                   mental design.  Regression fit of limited
                                   usefulness.  Sites sensitive to particular
                                   regression expression chosen.
  Gustafson and       Construct  a model  by determining  best
  Kortanek (1976)      source strengths to match model to
                    observed data.
                                   Treats source strengths as unknown and deter-
                                   mines the  best linear model (inert species)
                                   to fit the observed concentration data.  Once
                                   model is formed,  additional monitor sites can
                                   be selected.  Of  limited usefulness if actual
                                   source inventory  is available.
  Vukovich  (1976)      Fits wind  field by objective analysis
                    and then fits a regression model  to
                    concentration field.
                                   A more sophisticated variant of Gustafson and
                                   Kortanek (1973).  Optimal sites chosen on basis
                                   of regression model for concentration field.
                                   Non-reactive pollutants only.
  Hougland and
  Stephens  (1976ab)
Defines a coverage  factor that depends  Applicable to non-reactive pollutants only.
on source strength,  frequency distri-
bution of wind speed and direction.
  Liu et al. (1977)
Exercise air quality diffusion model    Does not require prior monitoring data.  Appli-
given source inventory and meteorology  cable to inert and reactive pollutants.  Best
to produce concentration field.  Locate way to approach siting in a region where no
monitors on basis of characteristics    stations currently exist.
of field.
  Buell (1975)        Locate additional monitors at points
                    at which the mean square error of
                    estimates is largest.
                                  A starting network of monitoring stations is
                                  assumed.  Regression fit of concer  rations and
                                  interpolation over a region is performed.
                                  Optimization is  carried out based on  a heuristic
                                  one-step-at-a-time method.  Similar to Gustafson
                                  and Kortanek.
of pollutant concentrations over  a region, the location  of  the  point yielding
the maximum error  of estimation is found.   This  point is then chosen as  the
best location  for  a new measurement  station.   This point is  added to the
existing observation points.   The process  is  then repeated;  the location of
the point  yielding the maximum mean  square error is  found,  using the new
observation station.   A second new measurement station is located,  etc.   In
this work,  special  consideration  is  given  to  the calculation of the residual
variances  or the effects of limited  range  of  influence.   The factor analysis
method  is  utilized to  specify  the residual variances which  enter into the
diagonal elements  of the covariance  (or  correlation  coefficient) matrix.

E.     Documentation of Program
                                               72

-------
E.I   Function

     The program accepts a set of monitoring station locations and pollutant
concentration data at these stations and produces a grid of estimates of the
pollutant concentration.  Data cards describing the grid of surface pollu-
tant fluxes, eddy diffusivities, and wind vectors over the region of inter-
est are also required.

E.2  Program Flowsheet and Subroutines

     The flowsheet of the program is given in Figure E.I.  Subroutines re-
quired there are as follows:
COVAC


MGS

CHOLKY

INVTRY


FROMM


TRIG
For the time-update computation of the filter square root
variance

For the modified Gram-Schmidt orthogonalization procedure

For the Cholesky decomposition procedure

For the inverse matrix generation of a triangular square
root matrix

For the Fromm's second-order, zero average phase error
finite difference method

For the solution of an algebraic equation with a tria :gular
coefficient matrix.
                                     73

-------
             INPUT DATA
                       MEASUREMENTS
       Main Program
     FROMM
COVAC
                 Filter Variance
                      MGS
                      CHOLKY
                      INVTRY
   Prediction
     Filter Gain   j-
Filtering Estimate
            Updated Variance
           Figure E.I  Flowsheet of the Program
                          74

-------
E.3   Input Variables
Definition

Filter
states




Mesh
conditions



Time condi-
tions
Measurement
conditions
Physical Variable
c(xry..,zk,t)
s. (t)
Pij(t)
Ax
Ay
Az
At
Nx
N
y
Nz
t
tk+l"tk
M
h
Program Variable
C(I,J,K)
D f T 1\
r V 1 • U I
PA(I,J)
DX
DY
DZ
DT
LL
MM
NN
T
TD
NUMBER
EH IT
Comment
Estimate
Square root variance
Filter variance

Mesh spacings


Mesh numbers



Number of stations
Effective meas.
height
                                      75

-------
Physical Variable
Program Variable
                       Input Needed
cQ(x,y,z)
Po(x,y,z,x',y',z')
s(x,y ;
u(x,y)
v(x,y)
NVO
Q(z)
Ri
C(I,J,K)

IX(L)
IY(L)
OBS(L,K)
SOURCE (I,J)
ED(K)
DK
Q(K)
R(D
C(I,J,K),  I  =  1,...,  LL; 0=1,..., MM; K=l,..., NN
P(I,J), I, J = 1,...,  LL*MM*NN
IX(L) , L  =  1,...,  NUMBER
IY(L), L = 1,..., NUMBER
OBS(L,K),  L =  1,...,  NUMBER,  K  =  1,...
SOURCE (1,0),  I = 1,..., LL;  0  =  1,..., MM
U(I,0), I  = 1,...,  LL; J =  1,..., MM
V(I,0), I  = 1,...,  LL; 0 =  1,..., MM
ED(K), K = 1,..., NN


Q(K), K = 1, ..., NN
R(I), I = 1,..., NUMBER

-------
E.4   Output Variables
Physical Variable
c(x,y,z,t)
P(x,y,z,x,y,z,t)
Program Variable
C(I,J,K)
PA(I,I)
Type of Output
The computed grid maps are
written and/or 3-dimensional
perspectives are drawn.
E.5   Error Messages

      A warning message  is printed  by  the  program when:
1.    Negative concentration  is  calculated.  (If C(I,J,K)  < - 10.)

2.    Too  large a concentration is calculated.   (OVERFLOW if C(I,J,K) > 103.)

3.    Too  large diagonal  entries of the filter variance are calculated

      (COVARIANCE  OVERFLOW  if PA(I,I)  > 1010.)

4.    Too  large correlations  of the filter variance are calculated.

      (If  PA(I,Jf> PA(I,I)*PA(J,J).)
                                    £ U.S. GOVERNMENT PRINTING OFFICE! 197«-78 8-97* I 2SO B.I
                                       77

-------
                                   TECHNICAL REPORT DATA
                            (Please read Instructions on the reverse before completing)
   EPA-600/4-78-036
                                                           |3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
    AIR  MONITOR SITING BY OBJECTIVE
                                                           IB. REPORT DATE
                                                             June 1978
                                                            i. PERFORMING ORGANIZATION CODE"
7. AUTHOR(S)             "       "	

   Masato Koda and John H. Seinfeld
                                                           |8. PERFORMING ORGANIZATION REPORT NO.
                                                            10. PROGRAM ELEMENT NO.
   Department of Chemical  Engineering
   California Institute of Technology
   Pasadena, California  91125
                                                              1 HP fi?n
                                                              <~r»KITOA^»-l-/^
                                                                   VCT/GRANT NO.


                                                              68-03-2441
                                                                                         ED I
12. SPONSORING AGENCY NAME AND ADDRESS
   U.S. Environmental Protection Agency—Las  Vegas,NV
   Office of  Research and Development
   Environmental Monitoring and Support Laboratory
   Las Vegas,  Nevada  89114
                                                            13. TYPE OF REPORT AND PERIOD COVERED
                                                            14. SPONSORING AGENCY CODE
                                                              EPA/600/07
15. SUPPLEMENTARY NOTES
   For  further information,  contact James L. McElroy,  (702)736-2969,  extension 241,
   in Las Vegas, Nevada
16. ABSTRACT
   A method is developed whereby measured pollutant concentrations can be used in
   conjunction with a mathematical air quality model  to  estimate the full spatial and
   temporal concentration  distributions of the pollutants  over a given region.  The
   method is based on the  application of estimation theory to systems described by
   partial differential equations, such as the atmospheric diffusion equation.  A
   computer code has been  developed that can process  monitoring data to produce
   concentration distribution estimates.  The code has been tested extensively on a
   hypothetical airshed, designed to illustrate the key  features of the method.  Once
   concentration distributions have been estimated, new  monitoring stations can be
   located based on several  siting criteria.
 7.
                                KEY WORDS AND DOCUMENT ANALYSIS
                  DESCRIPTORS
   Air  quality network design
   Air  quality monitoring
   Air  quality modeling
   Air  quality monitoring  criteria
18. DISTRIBU I ION STATEMENT*

   RELEASE  TO PUBLIC
EPA Form 2220-1 (R.v. 4.77)
                       P«V,OU, EomoN,s OBSOLETE"
                                               b. IDENTIFIERS/OPEN ENDED TERMS
                                                Air quality monitoring
                                                estimation theory
                                              19 SECURITY CLASS (Thli Report)
                                                 UNCLASSIFIED
                                                 SECURITY CLASS (This page)

                                                 UNCLASSIFIED
                                                                            COSATi Held/Group
 04B
 09B.D
21 NO. OF PAGES

       88
                                                                          22 PRICE

-------