Urban Air Pollution Modeling Without Computers


  EPA-600/4-76-055
  November 1976

-------
                RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into five series. These five broad
categories were established to facilitate further development and application of
environmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The five series are:

     1.    Environmental Health Effects Research
     2.    Environmental Protection Technology
     3.    Ecological Research
     4.    Environmental Monitoring
     5.    Socioeconomic Environmental Studies

This report has been assigned to the ENVIRONMENTAL MONITORING series.
This series describes research conducted to develop new or improved methods
and instrumentation for the identification and quantification of environmental
pollutants at the lowest conceivably significant concentrations. It also includes
studies to determine the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical
Information Service, Springfield, Virginia 22161.

-------
                                             EPA-600/4-76-055
                                             November 1976
  URBAN AIR POLLUTION MODELING WITHOUT COMPUTERS
                        by
                Michael  M.  Benarie
Institut National  de Recherche Chimique  Appliquee,
        B.P.I  - 91710 Vert-le-Petit,  France

                Visiting Scientist
        Meteorology and  Assessment Division
    Environmental  Sciences  Research Laboratory
   Research Triangle Park,  North  Carolina   27711
       U.S.  ENVIRONMENTAL  PROTECTION  AGENCY
        OFFICE  OF  RESEARCH AND  DEVELOPMENT
    ENVIRONMENTAL  SCIENCES RESEARCH LABORATORY
   RESEARCH  TRIANGLE  PARK, NORTH  CAROLINA  27711

-------
                                 DISCLAIMER
     This report has been reviewed by the Environmental Sciences Research
Laboratory, U.S. Environmental Protection Agency, and approved for
publication.  Approval does not signify that the contents necessarily
reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names or commercial products constitute
endorsement or recommendation for use.

-------
                                  FOREWORD

     The modeling of urban air pollution  by mathematically oriented  techniques
has for some time provided the primary quantitative  basis  for the  development
of air quality management concepts and strategies.   As  such,  the topic  of  air
quality modeling has high priority in  the research program of the  EPA.

     The material of the present report formed  the basis for  a series of
three lectures given by Dr. M. Benarie, Chief of the Atmospheric Pollution
Service, Institut National de Recherche Chimique Appliquee, Vert le  Petit,
France.  The lectures were first given on 15-17 Sept. 1976 in Raleigh,  N.C.,
as part of the "Continuing Seminar on  Air Quality Research",  which is a
joint activity of the Meteorology and  Assessment Division, EPA and the  North
Carolina State University.  They were  also repeated  on  20-22  Sept. at
the Pennsylvania State University, University Park,  Pa., under support  of  an
EPA grant with the Select Research Group  on Air Pollution  Meteorology at
P.S.U.  The publication of this material  as an  EPA report  is  made  in the
interests of wide dissemination of the information to air  quality  modelers.
                                  Kenneth  L.  Calder
                                  Chief Scientist
                                  Meteorology and  Assessment Division
                                  Environmental  Sciences  Research  Laboratory

-------
                                   PREFACE

     The lecture material that is consolidated in this report represents an
abridged version of some selected chapters of a monograph entitled Urban Air
Pollution Calculation, by the same author.  As the title indicates the
selection was oriented towards simple but nevertheless efficient methods.
The present discussion stresses the reasons why these methods are useful,
which are their recommended fields of application, and when they are to be
preferred to other kinds of calculation.  As compared with the monograph
this abridgement differs primarily in the completeness of the coverage.
Here only about one tenth of the references are included and discussed.
The primary aim is to stress principles and only to present examples that
are really necessary.  In contrast the monograph tends to be as complete as
possible, by providing comprehensive current information on references,
formulas, applications and validations.

     The author would like to express his thanks to Mr. K. L. Calder who
kindly provided editorial assistance on the first draft of this report.

-------
                                 ABSTRACT

     This report was the basis for a series of three lectures by the author
on urban air pollution modeling, and represents a condensed version of se-
lected topics from a recent monograph by him.   The emphasis is on simple
but efficient models, that can often be used without necessitating a high-
speed computer.   It is indicated that there will  be many circumstances under
which such simple models will  be preferable to more complex ones.  Some
specific topics  included in the discussion are the limits set by atmospheric
predictability,  forecasting pollution concentrations in real  time as for
pollution episodes, the simple box model for pollution concentrations, the
frequency distribution of concentration values including the log-normal dis-
tribution and averaging-time analysis, the relationships between wind speed
and concentration, and lastly the critical question of model  validation and
the need to consider several indices of goodness-of-fit if pitfalls are to
be avoided.

-------
                                 CONTENTS

Foreword
Preface
Abstract
Figures
Tables

    1.  Introduction
    2.  Limits Set by Atmospheric Predictability
    3.  Forecasting Pollution
    4.  The Box Model
            Short-term averages
            Simple box model for long-term averages
    5.  Correlation with Demographic Parameters
    6.  Concentration Frequency Distribution
            The log-normal representation
            Averaging-time analysis
    7.  Wind and Concentration Relationships
    8.  Validation - Or the Ways to Delude Oneself
    9.  Conclusions

References
Addendum

-------
                                   FIGURES

Number

  1   Sulfur dioxide concentration as a function of the logarithm
       of population density, in and around Paris, France.  After
       Pelletier (1967), with the permission of Laboratoire de
       l'Hygiene de la Ville de Paris

  2   Smoke concentration as a function of the logarithm of population
       density, in and around Paris, France.  After Pelletier
       (1967), with the permission of Laboratoire de l'Hygiene
       de la Ville de Paris

  3   Observed and experimentally predicted annual hourly CO
       concentration distribution for San Francisco

  4   Observed hourly wind-speed distribution for San Francisco
       Federal Building, July 10-11, 1968

  5   Observed and predicted hourly CO concentration distribution
       for San Francisco, July 10-11, 1968

  6   Observed and model-predicted annual hourly CO concentration
       distribution for San Francisco

-------
                                   TABLES

Number

  1   Classification of urban air pollution concentrations, following
       their purpose and accuracy requirements

  2   Systematics of urban air pollution models based on the input
       parameters

  3   Tabulation prediction technique

  4   Contingency table of the ozone prediction results on the training
       set by Ruff (1974)

  5   Breakdown of the episode forecasts for Rouen, France, after Benarie
       (1971) and Benarie and Menard (1972)

  6   Contingency table of forecast results obtained by Benarie (1971)
       and Benarie and Menard (1972), with apparent meteorological
       forecast error removed

  7   Contingency table for the previous forecast results, overall results
       for 549 cases

  8   Multiplying factors to be applied to the 30-day running average to
       obtain winter episode concentrations in Rouen, France

  9   Comparison of observed, calculated and forecast data for seven
       sampling stations in Rouen, France, in winter 1968/69

 10   Predictions of hourly values of CO concentration in ppm, in the
       Los Angeles Basin, on September 29, 1969, by Hanna (1973)

 11   Validation of the simple box model, according to Hanna (1971) and
       Gifford (1972, 1973)

 12   Data related to particle and SO2 pollution for U.S. cities

 13   Distribution of selected cities by population class and particle
       concentration, 1957 to 1967

 14   Computed CO concentrations (17-hour day-averages, ppm) compared
       with observed values for Sept 23, 1966 for the Los Angeles Basin

-------
                                 SECTION 1
                                INTRODUCTION
     It is not in the role of some latter-day follower of the machine-
smashers of the 1840's that I chose the title.   At that time tailors  fearing
unemployment destroyed the first sewing machines.   I  would like to speak
about "computer-less" methods and about "unsophisticated methods of modeling."
However, this will certainly not be done with any sense of glee that  "see,
here the abacus outperformed the computer."  The abacus will never do that.
My point of view is strictly that of the engineer.  I consider the test of
the engineer to be the attainment of a given practical goal  by the most
economical of the means available.  The scientist and research worker, on the
other hand, seek to improve knowledge without regard to cost, effort
and time, while the inventor labors to increase the available means.  The
good engineer will happily use the output of the research worker or that of
the inventor, but his purpose is practical: to build a bridge,
house, highway or gadget.  He has to deliver the bridge, etc., on time and
meet specifications, while avoiding any suspicion of a gamble.  By contrast,
all research projects contain some element of a gamble.  They verify,
validate, or prove some hypothesis; they compare or attempt to test; or they
search for something as yet undisclosed by nature.  At the outset, we always
hope for a positive answer, but a hope is only a gamble.  An engineer
who "hopes" is truly a bad engineer.  An engineer must deliver
his product just as a manufacturer must.
     In our case, the product to be delivered mostly takes the form of
information.  It is the result of a calculation or of some process akin to
calculation.  These calculations are usually undertaken in order to avoid the
unrealistically high cost of full-scale experiments.  In the same way, it
is cheaper to calculate the strength of a beam than to shatter a real one;
it is also less expensive  to calculate the  impact of a power plant on an
urban area than first to build one and then see what happens.


-------
     Before undertaking any urban  air  pollution  calculations,  the  following
questions should be answered:
          1.  who needs the information  and
          2.  what purpose has  to  be attained.
Table 1  presents a schematic outline of  the  answers  to  these questions.
     The user has first to define  his  operational  needs:   If an  annual arith-
metic mean is requested simply  to  check  conformity with an air quality stand-
ard expressed as an annual mean,  it would  obviously  be  foolish to  obtain  it
from 8760 hourly estimates, when more  direct and cheaper methods are  avail-
able.  High resolution—here as everywhere—costs  money.   This money  and
labor are spent first in gathering the high-resolution  input data, and then
in working out the fine details of the output.   High resolution  data  do not
necessarily mean accurate or true data.  Thus the first consideration we must
face before adopting some computationally sophisticated method (a choice that
rather often coincides with that between computer and computer-less methods) is
whether the quality and the quantity of the available input do in fact
justify the computational burden.  On the other hand, some computationally
simple methods need high-grade input information.  An obvious example is the
"persistence model" which, without any calculations, gives a remarkably good
fit when projected a short time forward, because it already contains
a tremendous amount of accurate information.  Needless to say, extrapolation
for a few years, or even a few days ahead, would be pointless.
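
     The behaviour of the persistence model can be illustrated with a short
sketch.  It assumes that "persistence" means forecasting the value at time
t + lag by the value observed at time t.  The hourly series is synthetic (a
first-order autoregressive process with an arbitrary coefficient), generated
only for illustration; it is not measured data, and the function names are
hypothetical.

```python
import math
import random


def persistence_rmse(series, lag):
    """RMSE when series[t + lag] is forecast by the value observed at time t."""
    errors = [series[t + lag] - series[t] for t in range(len(series) - lag)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def synthetic_series(n, phi=0.9, seed=0):
    """Synthetic autocorrelated series (AR(1)); illustrative, not measured data."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


hourly = synthetic_series(2000)
rmse_by_lag = {lag: persistence_rmse(hourly, lag) for lag in (1, 6, 24)}
for lag, rmse in rmse_by_lag.items():
    print(f"lag {lag:2d} h: persistence RMSE = {rmse:.2f}")
```

     Under these assumptions the error is small one step ahead and grows as
the lag lengthens, which is the sense in which the short-range fit owes
everything to the information already contained in the latest observation.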
     The juxtaposition of "simple" box models,  requiring little  computing
effort, with mathematically and physically "sophisticated" models, that proba-
bly necessitate the use of large  digital computers,  does not necessarily  mean
that one is better than the other. First  of all it  should be  emphasized, as
Gifford (1973) pointed out, that  "simple"  is not the opposite  of "sophisti-
cated," but of "complex."  The  antonym of  "sophisticated"  is "naive." Simple
urban air pollution models are  not always  naive, and may in fact be quite
sophisticated.  Conversely, complex models can be quite naive, or  can contain
naive assumptions.
     Furthermore, if a complex urban pollution model cannot estimate more
accurately than a simple one, such as persistence, then its

-------
development is not profitable in applied studies (Gifford and Hanna, 1975).
This does not mean that the theoretical importance of a complex model is
diminished if it provides a better understanding of the underlying phenomena.
     This leads directly to another question:  Is there some fundamental
limit to the accuracy of the model computations?  If there is such a limit,
then it is clearly pointless to use computational methods of much greater
precision as these only contribute to the proliferation of non-significant
figures in the estimate.  Much of the discussion of the following section
will be concerned with the search for such limits.
     Table 1 is an attempt to systematize computations on the basis of their
purpose.  Another classification can be based on the amount and nature of
external - mostly meteorological - information, needed as input for the model.
     Meteorological parameters have an overwhelming influence on the behaviour
of pollutants in the urban air.  Among them, wind parameters (direction,
velocity and turbulence), and thermal properties (stability) are the most im-
portant.  A classification of the models can be based on the method in which
this kind of input is generated.  The following discussion will use "wind
field" as shorthand for all  the dynamical and thermal properties associated
with the wind.
     In some models, the wind field is assumed to be known or has to be fed
in by forecasting techniques.  Mahoney (1971) has coined the word "driven"
for this kind of input.
     In a second category of computations, a consistent wind pattern (in
the vicinity of the urban area), is either calculated from a full set of
meteorological  model equations, or an actually observed wind field is used
as input.
     Finally, there are representations - also called models - that provide
statistical  information on the occurrence of pollutant concentrations, and
which do not make use of wind or other meteorological parameters as input.
     A further distinguishing feature, different from any of those discussed
above, is whether the model is source- or
receptor-oriented.  The distribution and the emission rate of pollutant

-------
sources are assumed to be known in "source-oriented models."   Pollutant con-
centrations are then calculated from this  source  distribution  over  the  entire
region of the model.
     The opposite is true for the "receptor-oriented  models":   In their pure
form no assumptions are made about emissions,  and only  ambient concentration
is monitored at a number of receptor sites.  Statistical  or other inferences,
which may or may not be linked to meteorological  information,  are then
drawn - and possibly extrapolated - from the observed data.
     Source-oriented models tend mostly to be  explanatory and  involve causal
relationships between the pollutant emissions  and concentrations.   Only ex-
planatory models can provide the necessary means  to control the system  and
produce desired changes in performance.  Receptor-oriented models are gener-
ally descriptive and less directed toward establishing  cause and effect
relationships.
     Table 1 also distinguishes between short- and long-term objectives, i.e.
whether the result of the calculation is needed in the  next few hours,  or
in a few years.  This aspect generally coincides  with the distinction made
between computation of short-time concentration values  or forecasts, and the
request for long-time averages.  A typical, but rather  arbitrary, separation
between these two classes would be 24 hours.  In any case, the basic ideas behind
a 30-minute averaging-time computation and a seasonal or yearly
average differ enough to warrant two quite distinct classes
for short- and long-time mean calculations.
     The foregoing principles enable us to classify the main types  of  urban
air pollution "models," as shown in Table 2.
     Examination of Table 2 shows that beyond  the main  classification  cri-
teria  (e.g. of short- or long-term averages; source-  or receptor-oriented)
four kinds of models may be distinguished in terms of the type of  information
they provide.  This distinction may be termed  "model  character" and is  de-
noted  by letters from A to D in Table 2.
     A.  This letter denotes models which use  either assumed  or actually
observed values for the meteorological parameters.

-------
     With assumed parameters, the plume and volume-element models  give  nu-
merical results, i.e., concentration values as  a  function  of the space  coor-
dinates.  It is beyond the scope of the model  to  consider  whether  or  when
the assumed set of meteorological descriptors will  materialize.  The  results
are only as good as the input data.  This  category  of  models provides am-
bient concentration from inputs, and is analogous to the situation in chemical
engineering where content of a reactor is  computed  from reactants, stirring,
temperature, etc.  The output is primarily a numerical value assigned to a
space and a time coordinate.  Usually, this kind  of calculation justifies the
use of a computer, mainly when a larger set of  computations is being  made.
This is often the case in long-term calculations  (plume combined with fre-
quency classes).
     On the other hand, the "box model," which  can  conceptually be derived
from the principle of mass conservation, is an  example of  a model  where the
use of a computer can often be avoided.
     B.  In forecasting pollution, the output from  the calculation is ex-
pressed sometimes numerically, but more often by  categories, by probabilities
or in some other convenient way as used in meteorology.  The quality  of the
forecast is limited by the atmospheric predictability.
     C.  Statistical  description, in either of  its  forms,  is a summary  of
data already on record.  Valuable for predicting  trends or cycles, it is
of little use for a true forecast (i.e., today's estimate of tomorrow's
pollution).  For data management, compilation and computation, the computer
is almost a necessity; for search and exploitation, it is  only an  advantage.
     D.  Finally, we have the description  (which  may also  be termed "statis-
tical") or summarization of data already on record, mostly in form of graphs
or tables, and intended mostly for long-term inferences.   The output  is in
terms of a frequency not assigned to any time  coordinate.   Although the pre-
liminary tabulation is facilitated by use  of a  computer, the use of the re-
sulting tables or graphs is mostly computer-less.
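
     The mass-conservation argument behind the "box model" mentioned under A
can be written out in a few lines.  The sketch below assumes a steady state
and a well-mixed box of along-wind length L and mixing height H, with a
uniform area emission rate q and a background concentration entering upwind.
Per unit crosswind width the balance is u H c = u H c_b + q L, so that
c = c_b + q L / (u H).  The symbols, the function name and the sample numbers
are illustrative assumptions, not values taken from this report.

```python
def box_model_concentration(q, length, wind_speed, mixing_height, background=0.0):
    """Steady-state concentration in a well-mixed box (mass balance).

    q             -- area emission rate, e.g. ug m^-2 s^-1 (assumed uniform)
    length        -- along-wind length of the urban area, m
    wind_speed    -- mean wind speed through the box, m s^-1
    mixing_height -- depth of the mixed layer, m
    background    -- concentration of the air entering upwind, ug m^-3
    """
    if wind_speed <= 0 or mixing_height <= 0:
        raise ValueError("wind_speed and mixing_height must be positive")
    return background + q * length / (wind_speed * mixing_height)


# Hypothetical city: 20 km long, 500 m mixing height, q = 1 ug m^-2 s^-1.
for u in (8.0, 4.0, 2.0):
    c = box_model_concentration(q=1.0, length=20000.0, wind_speed=u,
                                mixing_height=500.0)
    print(f"u = {u:.0f} m/s -> c = {c:.1f} ug/m^3")
```

     A calculation of this kind needs one division and can be done by hand,
which is the point made in the text: the computational effort is negligible
compared with the effort of obtaining trustworthy values for q, u and H.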
     The use of computer-less methods may  be advisable, mainly in  the follow-
ing situations:

-------
     1.   When relatively low resolution  information  (e.g.,  just  an  annual
mean) is sought.
     2.   When the available input information  does not  justify a complex
algorithm;  when the range of error of a  high precision  calculation  would  still
be large because of observational  inaccuracies.   This case  is rather more
frequent in air pollution than  generally assumed.
     3.   When the predictability of the  atmospheric  motions sets an upper
limit to the predictions of the models.

-------
                                SECTION 2
                THE LIMITS SET BY ATMOSPHERIC PREDICTABILITY
     We shall distinguish between the conservation of the identity of air
parcels and our ability to simulate or compute the trajectory of these air
parcels.
     Let us suppose that at a certain instant of time volume elements of air
can be marked by tracers, which are "ideal" balloons that are able to follow
every motion of the surrounding air, and the tracks of which can be observed.
Thus, each mesh cube is determined by eight balloons in the atmosphere.   These
'mesh particles' will undergo a rapid change of shape during the following
days: long bands will be stretched out, and finally the development will
proceed to a chaotic state where the 'particles' have lost their identity.
     All particles have the shape of a cube, i.e., are bounded by squares, at
t = 0.  A particle will be said to have ceased to exist if one of the corner
points of the quadrilateral  crosses one or both opposite sides during the
course of time.
     Robinson (1967) found that a particle with mesh size equal  to 300 km
should cease to exist  within  the period 12h < t < 75h.   Egger (1973)  using  the
data of KAP (1968, 1969) on large-scale dispersion of clusters of particles
in the atmosphere, suggests 45h < t < 72h while using the data of EOLE
(Morel, 1972; Larcheveque, 1972) he arrives at t ≈ 45h.
     This is an estimate of an upper limit for atmospheric predictability.
No numerical forecast model, however it is designed, can do better than this.
Our ability to predict is further limited by the following factors:
     One of these arises from the finite representation of the atmospheric
fields in the models,  which makes it impossible to describe scales of motion
below grid scale.  Due to the nonlinearity of the hydrodynamic equations,
parts of the turbulent energy which is contained in the subgrid range, will

-------
appear under an alias in the larger scales,  thus  limiting  the  predictability
of these scales.   "It is this last type of uncertainty that  is generally felt
to be responsible for the limit of predictability of  various scales" (Fleming,
1971).  Another factor is the insufficient knowledge  of the  initial  condi-
tions, such as errors in the raw data.
     For planning purposes, we want to  be able  to calculate  the influence of
different sources at specific sites, on specific locations of the urban area.
By using source-oriented models we attempt to establish a cause-to-effect
chain between the emissions of a number of sources and the ambient
concentration at given locations.   The  main  links of  this  chain are the
following:
     1.  Knowledge of the source strength.
     2.  Adequate definition of the meteorological  parameters.
     3.  A reliable method for the calculation  of the dispersion from
inputs 1. and 2.
     4.  Adequate knowledge of the pollutant losses (or formation) by
chemical or photochemical reactions.
     Almost all these requirements can  be subdivided  into  many parts.  There-
fore in passing from source to ambient  concentration  a total of ten to twenty
elementary processes have to be estimated.  Only  a few of  these can be cal-
culated free of error.  Many can only be estimated roughly so  that each
estimate may be tainted by large instrumental or  theoretical uncertainties.
Almost all  of these errors increase with decreasing wind velocity.  A brief
summary of the facts is as follows:
     1.  Source strength: wind velocity has no influence on this factor.
     2.  The main meteorological parameters—wind velocity and direction—
are not monitored by currently available instruments  when  the  wind velocity
sinks below 1 or 2 m s^-1.  However, this is not only an instrumental
difficulty that could be remedied in the future.  For the literature on turbulent
motion in the atmosphere is unusually scarce on the topic  of the directional
variance of very light winds.  This arises from the fact that  the stability
of high building structures and the safety of aircraft are not affected by
such winds, and specialists in these areas of research have  more immediate


-------
problems at the upper end of the scale.   Theoreticians  are embarrassed by the
lack of an exact approach and prefer to  pass  on to other topics.   In contrast
the synoptic meteorologists are fully aware that light  winds  most frequently
are variable and with poorly-defined directions.   The common  observation  of a
weathervane or of sailboats under such conditions, makes it unnecessary to
cite in detail the few available tetroon-flight experiments.   These  only  add
very little to the already plentiful  evidence.   In the  monograph  of  Lumley
and Panofsky  (1964, p 151) which contains extensive information about
atmospheric turbulence, one only finds the following  brief statement on the
subject of the standard deviation of the wind azimuth:   "The  unexpected fea-
ture is the tremendously large scatter and the  frequently considerable values
of standard deviations in stable air.  Further  analysis of the observations
of inversions indicates that the largest standard deviations  of azimuth occur
in light-wind conditions ... gradual azimuth drifts with periods of the
order of 20 minutes were observed in light-wind inversions.   The  origin of
these drifts is unknown.  Their occurrence adds two difficulties  to  the esti-
mation of the lateral  diffusion:  first, they make lapse rate and wind speed
poor indicators of lateral wind fluctuations; secondly, even  if the  standard
deviation of azimuth is known to be large, it is  not  known whether these
large, but more or less local, standard  deviations produce rapid  spreading of
air pollution."
     3.  All known plume dispersion equations have a  singularity  near zero
wind velocity, and therefore their use at very  low velocities becomes suspect.
     4.  The incomplete knowledge available about pollutant transformations
and sinks, is certainly not made any less important in  the case of light  winds.
Optimistically, we can only hope that these deficiencies in knowledge will
not increase the error under calm conditions.
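
     The singularity mentioned in item 3 can be made concrete with one
common textbook form of the Gaussian plume equation, the ground-level
centerline concentration downwind of a ground-level point source,
C = Q / (pi u sigma_y sigma_z).  This particular formula and the numerical
values below are assumptions chosen for illustration.  The spread parameters
sigma_y and sigma_z are hypothetical constants here, whereas in practice they
depend on downwind distance and stability.

```python
import math


def centerline_ground_concentration(q, wind_speed, sigma_y, sigma_z):
    """Ground-level centerline concentration for a ground-level point source.

    q          -- source strength, g s^-1
    wind_speed -- mean wind speed u, m s^-1 (must be positive)
    sigma_y    -- lateral spread at the receptor distance, m
    sigma_z    -- vertical spread at the receptor distance, m
    """
    if wind_speed <= 0:
        raise ValueError("the plume formula is undefined for u <= 0")
    return q / (math.pi * wind_speed * sigma_y * sigma_z)


# Hypothetical source and spread values, held fixed while u decreases.
for u in (5.0, 2.0, 1.0, 0.5, 0.1):
    c = centerline_ground_concentration(q=100.0, wind_speed=u,
                                        sigma_y=80.0, sigma_z=40.0)
    print(f"u = {u:4.1f} m/s -> C = {c * 1e6:9.1f} ug/m^3")
```

     Because the computed concentration varies as 1/u, it grows without
bound as u approaches zero, and any error in a small measured wind speed is
carried over to the result at the same relative size.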
     Thus, even if low wind velocity did not influence items 1 and
4 above, its effect through items 2 and 3 would be so overwhelming
that source-oriented models would break down completely during light winds.
It can be conjectured from evidence concerning  urban  airflow  and  urban heat
islands that during conditions favorable to the formation of  an urban heat
island, the source-oriented models will  be of no use.  In numerical  terms,
this limit might be expected when the geostrophic wind  diminishes to less than


-------
3 m s^-1.  Very probably, this is not a rigid limit, but varies with the city
size.
      It should be emphasized that these arguments go beyond questioning the
usefulness of calculations based on plume-dispersion formulae at very low
wind speeds.  The whole model concept, that of a causal
chain between pollutant source and ambient concentration, becomes meaningless
when the wind velocity falls below a certain value.
      To express these considerations in the terminology of operational  re-
search, one would say that we are dealing with a  multi-nodal  chain.   At  each
node, along with some information, we introduce more or less  random  noise.
Yet, just such a multi-nodal  chain with noisy input  could  be  used  to simulate
the outcome of a throw at roulette.  For suppose  the torque applied  to the
roulette wheel could be electronically  monitored, and assume  the same for
the velocity and the angle of the roulette ball.   Then  apply  the known accu-
rate equations of the mechanics of rigid bodies.   Do a  few more steps of
computations and you have the final definitive system to beat Las  Vegas.
      Obviously, you will never be able to do that.   But by the same logic,
multi-nodal models with the introduction of random noise at every  step
will not indicate with accuracy tomorrow's pollutant concentration.   On  the
contrary, the more steps (nodes)  that are used, the  less accurate  will be
the forecast of the outcome of any one  individual occasion (calculation).
Sophistication may be a way to improve  the precision of averages,  of find-
ings about categories or to observe trends, but it seems of no  use for
improving the accuracy of forecasts.
      This strong statement should not be interpreted as saying that all
sophistication is definitively to be rejected.  As the body of this paper will
show, some very simple one- or two-step schemes show an honorable, if not
outstanding performance mainly in forecasting. On the other hand, if very
sophisticated, long-chain arguments must be bad,  then there is  some  inter-
mediate length of operational chain which might give optimum results.  Re-
search should be oriented towards methods which are  intermediate between
utmost simplicity and noisy sophistication.
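
      The argument about multi-nodal chains can be illustrated numerically.
The sketch below assumes, purely for illustration, that the result is a
product of n factors, one per node, and that each factor carries an
independent Gaussian relative error of the same size.  Under that assumption
the relative spread of the result is sqrt((1 + s^2)^n - 1), which the Monte
Carlo routine reproduces approximately.  The 20 percent error per node, the
trial count and the function names are hypothetical.

```python
import math
import random
import statistics


def chain_spread_exact(n_nodes, node_rel_error):
    """Relative std of a product of n independent factors with mean 1."""
    return math.sqrt((1.0 + node_rel_error ** 2) ** n_nodes - 1.0)


def chain_spread_simulated(n_nodes, node_rel_error, trials=5000, seed=1):
    """Monte Carlo estimate of the same relative spread."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        value = 1.0
        for _ in range(n_nodes):
            # Each node multiplies the running estimate by a noisy factor.
            value *= 1.0 + rng.gauss(0.0, node_rel_error)
        results.append(value)
    return statistics.pstdev(results) / statistics.fmean(results)


for n in (1, 2, 5, 10, 20):
    exact = chain_spread_exact(n, 0.2)
    simulated = chain_spread_simulated(n, 0.2)
    print(f"{n:2d} nodes: exact {exact:.2f}, simulated {simulated:.2f}")
```

      With these assumed numbers the spread of an individual result exceeds
its mean well before twenty nodes are reached.  The average over many trials
stays close to 1, which is consistent with the remark that a long chain may
still serve for averages while failing for single forecasts.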
      The roulette wheel is an example of a mechanical  system beyond the

-------
reach of mechanical cause-to-effect calculations.   But we shall  try to develop
this concept gradually, by considering a heavy beam supported by an axis of
low friction situated near to its center of gravity.  If the latter is below
the axis, the device becomes a sensitive balance.   Any perturbation of the
balance can be described analytically, in terms of oscillation and equilibrium
positions.  If, however, the center of gravity and rotation axis are made to
coincide (they never actually do), the angular position at which the beam will
stop, can no longer be predicted analytically and  the problem becomes one of
probability.  Somewhere in the process of approaching the axis to the center
of gravity, the chain of causality has broken down and has been  replaced by a
probability situation.  Of course, I do not wish to discuss the  fundamentals,
as these are well-known from the probability calculus; my purpose is only to
emphasize that a similar situation occurs in urban air pollution as for the
example of the beam.  When the chain of governing equations between cause and
effect becomes too long, and at each step rather unknown perturbations are
introduced, then the use of calculus should be abandoned and a new
probabilistic approach should be attempted.
      This is what occurs in urban air pollution,  when the wind  velocity sinks
below approximately 3 m s^-1; above this lower limit, atmospheric aerodynamics
is a powerful tool, but below or close to it, hydrodynamical equations are of
as much use as classical mechanics would be for calculating the face on which
a die will fall.   There are two distinct regimes in urban air pollution:
one for strong to moderate winds and another for light winds during calm
conditions.
      The difference between urban air flow conditions with moderate and
strong winds and those during light winds, and also the fact that street
ventilation changes character when rooftop wind speeds fall between 2 and
5 m s^-1,
is already well-stressed in the  literature.
      Insofar as source-oriented models rely on classical analytical equations
and on a cause-to-effect chain, they will behave very poorly in  warning sys-
tems or in episode control  strategies, because generalized and protracted
pollution episodes occur mostly during moderate and light winds.  On the con-
trary, plume-concepts can be quite useful to localize pollution  effects due
 to point sources or groups, when the winds are above 3 m s^-1.
      By the same argument, source-oriented models, when used as a basis for
 long-term averages, may be useful if treated with circumspection and provided
 that  light winds and calm periods only happen infrequently.  However, when
 meteorological tables of the urban area of interest indicate that even only
 5 to 10% of the winds are below 2 m s^-1, then the validity of the
 concentration distribution as computed by a source-oriented plume model should be
 questioned.  Numerically, these concentrations will be in gross error at the
 higher levels, which—even if they occur with low frequencies—are the most
 important ones, as regards effects.
      Receptor oriented models, sometimes with some empirical keying to the
 source inventory, can be used for warning systems, provided that meteorolo-
 gical parameters are correctly forecast.  The vital question is, what can
 be  reasonably expected from this kind of forecast.
      Though nowhere clearly stated, a widespread belief prevails in air pol-
 lution circles.  It seems to say that for any two time intervals, character-
 ized  by unchanged emission rate and by approximately two score meteorological
 parameters  (such as wind direction and intensity, thermal gradient, cloudi-
 ness, the situation of a given air parcel relative to a front, etc., etc.),
 if  all these parameters were equal then pollutant concentrations would also
 be  the same, for both time intervals.
      By the same logic, it could be expected that if two or three score ap-
 propriate parameters were identical, then the same form of cumulus cloud
 would hover over the same quarter of the city.  Of course, nobody would dare
 to  assert this as fact.  Continuing in this vein, we should not expect that
 pollution concentration forecasts will be fully accurate, all the time.
      The  following example may also emphasize what can be reasonably ex-
 pected, as  regards the accuracy of air pollution concentration computation.
 The average deviation from scheduled arrival times at Paris airport, due to
 weather  conditions, was only  six minutes, during 1973  (personal communica-
 tion).   Flights  cancelled before departure, as well as delays due to techni-
cal  or commercial  reasons, are not  figured in this  statistic.  Considering
that the  average  flight  times  were  about  three hours, this  means that the
"estimation" was done with 4% error.  Now, these aircraft are driven by
thousands of horsepower, guided by exceptionally skilled crew, assisted on the
ground by other most  competent people and the most  powerful  computers ever
built.  If all  this complex system results  in a 4%  relative error, then how
can we expect that the calculation of an  air parcel's trajectory, driven by
its own buoyancy and  some turbulent airflow only  -  instead of  a  jet engine  -
should perform any better?
                                  SECTION  3
                            FORECASTING POLLUTION
     Perhaps the least computerized branch of air  pollution  engineering  re-
lates to the forecasting of pollution.   We have  to distinguish  quite  clearly
between forecast and calculation.   The  latter term is  used to denote  the
operation of taking some formula - e.g.  plume, statistical time series or
anything else - and then substituting  into this formula  some assumed (e.g.
for the next winter season...)  or meteorologically forecast  parameters.   The
forecast, on the other hand, is a process which uses knowledge that is
available today (e.g. past statistical record, to-day's pollution concentration,
to-day's meteorological  forecast,  etc.) to predict (a) a  time  (e.g.,  tomorrow;
or even a given hour...) and (b) a pollutant concentration for  that time.
The upper limit of the time span is that for which a forecasting skill  can
be demonstrated, and might be for a few hours or a few days in advance.   We
exclude from "forecasting" the  climatological estimate of long-term averages,
although it is implied that they are often taken into  account by the  fore-
caster.  In order to be termed  a "pollution forecast," the pollution  concen-
tration estimate must refer to  a specific day or hour  and not  to a probabili-
ty of occurrence within a given time span.
     Almost synonymous with "air pollution forecasting" are:  "episode
forecasting" and "alert announcements."  By "episode"  and "alert," everybody
means a spell with above-average pollutant concentrations.   However,
considerable confidence must exist as to the likely duration of the spell and how
high the concentration may rise before calling an "alert."   If by "episode"
we understand a relatively high, and not too-frequent  pollution level,  and
not just the fact of exceeding  some hygienic or  legal  limit, then by  defini-
tion an episode is a statistically rare event or an extreme  occurrence.
Former experience about such events is  generally scarce,  and difficulties
increase when the episode level is set  very high.   If the level is set  un-
reasonably low, so that it is often attained through stochastic fluctuations,
then all predictive qualities will  be suppressed by the noise.   In  terms  of
the cumulative concentration frequency,  it would not be sound practice to
attempt to forecast the upper 0.1  percentile of the concentrations.   On
the other hand, concentration forecasts of either below or above the 50th
percentile would scarcely be considered as episode forecasts, but rather as
being air pollution concentration forecasts limited to two classes.  Thus
the success or the skill in episode forecasting depends in very large measure
on the  definition of an episode.  Often instead of using  the concentration of
a single pollutant to set the episode criterion, an "air pollution  index" is
defined.  This may be considered as a scheme that transforms the weighted
concentrations of several individual pollutants into a single number.  OTT
and THOM (1976) found that 35 U.S. metropolitan air pollution agencies use
some form of air pollution index and no two indices are exactly the same.
Thus, an index value of 100 reported in Washington, D.C. means something
entirely different from a value of 100 reported in Cleveland, Ohio.
     The goal of forecasting a numerical value for the concentration in air
pollution episodes is statistically more elusive than that of the day-to-day
prevision of the "episode."  With a stringent definition of the episode
intensity, 90% or more days are non-episode days even in winter.  Hence, a random
guess,  based on the average frequency of the non-episode days, will yield 80%
or more correct forecasts.  If instead of a random guess, one decides to
forecast "non-episode" for each day, then 90% or more of the days will be
"correctly" forecast.  As during non-episode days the concentration is rela-
tively  low, a constant-value estimation near the ensemble average will
minimize the RMS error, without demonstrating any real skill of the method.
If concentration episodes are predicted exclusively, such spurious  effects
cannot  perturb the judgement on the method.
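
The arithmetic behind these baseline figures can be checked with a short Python sketch. The function names are illustrative only; the 90% non-episode frequency is the figure quoted in the text, and a random guess is assumed to pick "non-episode" with that same probability, independently of what actually happens.

```python
def random_guess_accuracy(p_non_episode: float) -> float:
    """Expected hit rate when 'non-episode' is guessed at random with a
    probability equal to its climatological frequency."""
    p = p_non_episode
    # Correct when guess and outcome agree: both non-episode or both episode.
    return p * p + (1.0 - p) * (1.0 - p)


def constant_forecast_accuracy(p_non_episode: float) -> float:
    """Hit rate when 'non-episode' is forecast for every single day."""
    return p_non_episode


if __name__ == "__main__":
    p = 0.90  # fraction of non-episode days quoted in the text
    print(f"random guess:         {random_guess_accuracy(p):.2f}")
    print(f"always 'non-episode': {constant_forecast_accuracy(p):.2f}")
```

Neither baseline uses any information about the day in question, which is why a raw percentage of correct forecasts says little about the skill of a method.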
     The difference between air pollution potential forecasting and episode
forecasting is that the first does not take into account pollutant emissions,
while the second is concerned with the emission-dispersion interaction.  The
first is a weather forecast that is oriented towards ventilation-forecasting,
while the second is an air pollution forecast.
     The difference is also one of scale.  Air pollution potential  forecast
concerns a large area on a regional scale, covering perhaps 200,000 km^2 or
more, and extending in time over 24 or 48 hours.  These scales are related
to the controlling synoptic event,  the warm core stagnating anticyclone
which, with the associated light winds and subsidence,  reduces  atmospheric
dispersion and enhances the accumulation of pollutants.   Thus  this  forecast
requirement is primarily dependent  on  meteorological  variables  (W.M.O.  1972).
     Urban episode forecast has  a space-time scale where local  circulations
(e.g. land-sea breeze, drainage  winds, heat island effects, etc.)  become very
important, especially during large-scale stagnation situations.   The length
scales are from 10 to 100 km,  the latter for large megalopolis  areas.   The
time scales of the prediction  requirements range from a few hours  to about
two days.
     All forecasting represents  a correlative evaluation of the post hoc -
ergo propter hoc type.  It was observed that calm winds and restricted verti-
cal exchange are conducive to  high  pollution episodes.   How many times was it
observed?  Perhaps nobody cared to  count and to make up a contingency table.
It is not necessary to be a scientist, nor even a grown up human in order to
link together two simultaneous or subsequent events into a predictive corre-
lation.  Animals build up conditioned reflexes in the same way.
     Air pollution levels expressed in such terms as "insignificant", "near
average" or "high" can be surprisingly well predicted by a skilled observer
who is familiar with the air pollution record of a site, has access to the
current weather forecast, and who is able to look out through the window.
Such "no cost, no computer" forecasts provided from 75% to 85% correct an-
swers (depending on the forecaster and the season) for the Rouen and Stras-
bourg (France) urban areas.  Such performance equals or even surpasses that
of much more elaborate schemes described in the literature.
     The opponent of the sophisticated  would now exclaim with relish, that
simplicity has finally triumphed!  This however, is not really the case.
The skilled observer here performed with a sophistication not equaled by any
computer program available at the present time:  he recognized a pattern.
We do not have a computer able to identify a handwritten numeral, a signa-
ture or a face.  Almost any human being can do these things.  A computer
cannot even recognize  the form "circle", unless pre-coded  in color, contrast,
dimension, perspective, etc.  The human is able to do these things automati-
cally.  Thus  the human outperforms the  computer in a specific skill called
pattern  or gestalt recognition.  We do  not necessarily have to relinquish
the computer.  We must simply choose the pattern elements,  or in other words
the predictors, and the series or the program which will  enable the computer
to perform objectively the same forecast which was  made subjectively and
unconsciously by the skilled observer.
     The pattern elements or prescriptors must be chosen  in an economic
fashion.  For not only  is their observation expensive, but when the number
of prescriptor class combinations becomes large, the number of cases included
in each category may be undesirably small, even for the longest stable re-
cords that are available.
     When the prescriptors are of qualitative or discontinuous nature, they
can easily be divided into classes, and contingency tables become very useful.
The number of classes or groups should be small, and usually less than five
for each variable.  Contingency  tables can strictly only  be used if it can
be assumed that the data are independent of each other, which is rarely the
case in air pollution climatology.  The best way to test  stability is to
generate a new multiple contingency table of similar size and then compare
the relative frequencies in the two tables.  From such a  comparison it
should be possible to determine how the system will work  when used in actual
forecasting.  When using a contingency table in forecasting, the actual pre-
dictor combination is determined first, and then the forecast will be the
predictand group that occurred with the highest frequency in the past.  The
probability of the forecast can also be estimated.
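
The look-up procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class labels, the sample history and the function names are hypothetical and are not taken from any of the tables in this report.

```python
from collections import Counter, defaultdict


def build_table(records):
    """Build a contingency table from past cases.

    records: iterable of (predictor_classes, predictand_class), where
    predictor_classes is a hashable tuple such as ("calm", "inversion").
    """
    table = defaultdict(Counter)
    for predictors, predictand in records:
        table[predictors][predictand] += 1
    return table


def forecast(table, predictors):
    """Return (most frequent predictand class, its relative frequency),
    or (None, 0.0) if this predictor combination never occurred."""
    counts = table.get(predictors)
    if not counts:
        return None, 0.0
    predictand, n = counts.most_common(1)[0]
    return predictand, n / sum(counts.values())


# Hypothetical past record: (wind class, stability class) -> pollution class.
history = [
    (("calm", "inversion"), "high"),
    (("calm", "inversion"), "high"),
    (("calm", "inversion"), "average"),
    (("windy", "no inversion"), "low"),
    (("windy", "no inversion"), "low"),
]
table = build_table(history)
print(forecast(table, ("calm", "inversion")))
```

The relative frequency returned with the forecast class serves as the estimate of the forecast probability; a combination with very few past cases gives an unreliable estimate, which is the reason for keeping the number of classes small.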
     MOSES (1969, 1970) has described the implementation  of this method at
the Argonne National Laboratory.  Called "the Tabulation  Prediction Scheme,"
the places of the prescriptand values and the frequencies were reversed.
The column headings in the Table are the minimum, the 25, 50, 75, 90, 98, 99
cumulative percentiles and the maximum.  The entries for each column then
give the respective pollutant concentrations:  this can be seen in Table 3.
Also shown are the inter-quartile range, the difference in SO2 concentrations
between 75 and 95 percentiles, the mean, the standard deviation and the num-
ber of cases for each entry.  A computer is not an absolute necessity for
this  compilation, although it certainly makes it easier.
     As is the case for any other empirical statistical model, the tables of
the tabulation prediction scheme must be continuously updated.  The use of
this scheme in an air pollution incident control test in Chicago, Ill. was
demonstrated by CROKE and BOORAS  (1970).
     Although the tabulation  prediction  scheme  is  easy  to  use  -  it  is  possi-
ble to look up any set of meteorological  conditions just as  one  would  look  up
a word in a dictionary - a considerable amount of insight is necessary in order
to develop an effective set of tables.   For fuller details on  their selection
of variables, and construction of the  tabulations  and application,  see CROKE
and ROBERTS (1971, p. 169-184).
     This selection may also  be  performed by a  computer.   By choosing  a  set
of predictors, which may initially contain useless or redundant  ones,  an
adaptive pattern classifier,  which is  a  device  whose  actions are influenced
by its past experiences, can  be  used to  assign  each pattern  to a category
that has been a priori characterized by  a set of parameters.  First the  pat-
tern is digitalized by a "preprocessor."   If any of the dependent variables
prove to be misleading or irrelevant,  then a technique  must  be devised for
their deletion.
     RUFF (1974) used the adaptive pattern classification  for the forecast  of
ozone levels above 0.1 ppm at San Jose,  California.
     The following list of trial inputs was used as ozone predictors for San
Jose:
     1.  NO2 - Selected because it reacts with sunlight to ultimately form
O3.
     2.  CO - While there may be significant photochemical reactions involv-
ing CO, evidence indicates that CO is  a  good indicator  for automotive  exhaust
pollutants, which are known to play an important role in  photochemical smog
formation.
     3.  The time-rate of change of CO in the atmosphere  as  a measure  of the
degree to which the primary pollutants are being dispersed.
     4.  O3 - The concentration of ozone in the early morning hours is
indicative of the amount of photochemical activity.
     5.  Percent sunshine, used along  with temperature, to represent radia-
tion intensity.
     6.  Ventilation index computed at 0400 and 1600  hours daily.  Low index
values imply that the temperature inversions exist with low  wind speeds
that inhibit dispersion.  High values  indicate  that the condition of the
atmosphere is more conducive to thorough dispersion.
     7.  The daily average surface wind.
     The time of prediction is another variable that must be considered.  If
an accurate prediction is made in the early morning hours, such as at 4 a.m.,
then abatement action can be taken if necessary to reduce the amount of emis-
sions.  On the other hand, a later prediction, such as one at 9 a.m. is not
as effective in curtailing the sources but can serve to warn the general
public to modify their physical activity.  The approach will be to optimize
the predictor over a time period ranging from 2 a.m. to 10 a.m. in one hour
increments.  Therefore, the model is predicting from twelve to three hours
in advance of the normal  daily ozone peak.
     Since the degree of photochemical smog is strongly dependent upon time
of year, the specific model is optimized over a limited period of three
months.  Implied in this  approach, is the fact that the time of year is it-
self one of the variables.  In an attempt to hold this variable constant,
the training set consists of August to October data.  The model is then
subsequently evaluated for September data.  One further restriction is that
only week days are considered in the analysis.  The rationale is that week
days are generally characterized by similar source emission patterns.
     A special program was developed so that various combinations of input
variables could be tried  in rapid succession.  Results of the application of
this program showed that  no single parameter exhibited a pronounced effect
on the classification accuracy.
     The next step involved eliminating groups of variables.  The best results
correspond to the case where all NO2 inputs were deleted.  The prediction
distribution for 9 a.m. with all NO2 inputs deleted is shown in Table
4.
     It should be emphasized that the results of Table 4 were obtained on a
development sample, the accuracy of the meteorological variables being 100%.
When used on independent  data (September), prediction accuracies between 65%
and 95% were obtained, depending on the hour of prediction.  The 7 a.m. pre-
diction, perturbed by the high pollutant peaks during the morning rush hour,
exhibits the lowest accuracy.  The total number of cases - 7 days with >= 0.1
ppm O3, 11 days below this limit - does not allow a significant statistical
analysis of the results.
     If this is what the utmost sophistication and resort to a computer can
accomplish, one should perhaps look at the other end of the scale for a set
of predictors as simple as possible (Benarie, 1971).
     As an episode severity criterion, any 24-hour concentration larger by a
factor of at least 2.0 than the running average over the 30 previous days,
was selected.  The 30-day running average was  considered as  a fair approxima-
tion of the actual seasonal average.
     An essential condition in defining the predictors was their general
availability through broadcast, i.e., not limited to the users connected by
teletype to the National Meteorological Service.  This condition limits not only
the number and kind of predictors, but also the time  of their availability.
The latter, if determined by the radio or the press, is several hours behind
the information dispatched to the teletype user.  Finally, the predictors used
to establish an air pollution forecast should also be meaningful to the
non-meteorologist.
     The rationale behind the choice of the main predictor was  that 24-hour
calm inversion conditions are seldom conducive to pollution episodes.
According to the theory of BOUMAN and SCHMIDT (1961), confirmed by the Dec.
1952 episode of London, England and the Dec. 1959 episode of Rotterdam,
Netherlands, and by LAWRENCE'S (1967) analysis of several  London episodes
and KOLAR's (1967) discussion of several others, concentrations increase
proportionally with time at the beginning of an episode, and grow  proportion-
ally to the square root of time afterwards.  During the first 24-hour period
of calm winds, twice the value of the running seasonal  (previous 30-days)
mean is seldom attained, although it occurs with high probability  if calm
conditions persist for a second day or more.  In this way, the predictor set
becomes of utmost simplicity:
     1.  The observation of an elapsed period of 24-hours during which the
mean ground level wind velocity was less than 3.0 m s^-1.
     2.  A forecast of a similar situation  for  tomorrow.
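
The episode criterion and the two-item predictor set can be written down as a minimal Python sketch. The function and constant names are illustrative assumptions; the 3.0 m s^-1 limit, the factor of 2.0 and the 30-day window are the values given in the text.

```python
from statistics import mean

CALM_LIMIT = 3.0      # m/s, 24-hour mean ground-level wind speed (from the text)
EPISODE_FACTOR = 2.0  # multiple of the 30-day running average (from the text)


def is_episode(conc_today, previous_concs, window=30, factor=EPISODE_FACTOR):
    """True if today's 24-hour concentration is at least `factor` times the
    running average over the `window` previous days."""
    recent = previous_concs[-window:]
    return conc_today >= factor * mean(recent)


def forecast_episode(wind_last_24h, wind_forecast_tomorrow, limit=CALM_LIMIT):
    """Forecast an episode for tomorrow when the elapsed 24 hours were calm
    and a similar situation is forecast for tomorrow."""
    return wind_last_24h < limit and wind_forecast_tomorrow < limit


# Hypothetical example: a calm day followed by a forecast of continued calm.
print(forecast_episode(wind_last_24h=2.1, wind_forecast_tomorrow=1.8))
```

The rule needs no concentration measurement at forecast time; the running average enters only when the forecast is verified afterwards.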
     This simple  rule was  checked at Rouen, France, during two additional win-
ters (BENARIE  and MENARD,  1972) from October to March,  i.e.  540 forecasts,
and is shown in Table  5.   It  appears from  this  Table that of the  average  13
episode-days per winter, on the average  8  are correct  even when the occasions
of incorrect meteorological  forecasts  are included.   If the  latter  are  ex-
cluded in order to reflect more closely the real  merits of the method,  then
out of 28 episode-days only 3 were incorrect,  that is around 10%.
     Table 5 shows a total of 18 false alerts, of which roughly  one half are
due to incorrect meteorological forecasts, while  the other half  are due to
the prediction method itself.  Research is in  progress to determine predic-
tors which will optimize the ratio between the two kinds of  errors.   Obvious-
ly a "broad predictor" would hit all  the real  episode-days,  although  addi-
tionally set off a great number of false alerts;  a "sharp-predictor"  would
avoid false alerts, but miss a number of real  episode-days.   But as things
now stand, meteorological error is the cause of 17 incorrect episode  fore-
casts, while methodological ("choice  of predictor")  error is responsible for
only 14 failures.  Until meteorology  becomes a much  more accurate science,
the search for more accurate predictors would quickly become one of steadily
diminishing returns.
     This situation appears even more clearly, if Table 5 is re-interpreted
as a contingency table, that contains not only the (necessarily  restricted)
number of episode-days, but the whole forecast period.  Table 6  is  such a
display for 523 days when no meteorological error occurred,  indicating  the
skill score of the predictor choice.   Table 7 contains the whole period of
540 days and provides the skill score for the effectiveness  of the  forecast-
ing program when the meteorological error is included.
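
The formula of the skill score is not spelled out in this passage. One common choice for a 2 x 2 forecast contingency table is the Heidke skill score, (C - E)/(T - E), where C is the number of correct forecasts, T the total number of forecasts and E the number expected to be correct by chance from the marginal totals. The Python sketch below assumes that definition; the function name and the sample counts are hypothetical and are not the entries of Tables 6 or 7.

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Heidke skill score for a 2 x 2 table of episode forecasts.

    hits              : episode forecast, episode observed
    false_alarms      : episode forecast, no episode observed
    misses            : no episode forecast, episode observed
    correct_negatives : no episode forecast, no episode observed
    """
    total = hits + false_alarms + misses + correct_negatives
    correct = hits + correct_negatives
    # Correct forecasts expected by chance, from the row and column totals.
    expected = (
        (hits + false_alarms) * (hits + misses)
        + (misses + correct_negatives) * (false_alarms + correct_negatives)
    ) / total
    return (correct - expected) / (total - expected)


# Hypothetical counts for one winter season of daily forecasts.
print(heidke_skill_score(hits=8, false_alarms=5, misses=5, correct_negatives=162))
```

With this definition a forecaster who announces "non-episode" every day scores zero, however high the percentage of correct forecasts.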
     The forecast method was further tested during the 1970-71  and  1971-72
winters, in Strasbourg, France (BENARIE, unpublished) and produced  the  same
skill score.  Its application to other sites seems possible, since  the  fore-
cast criterion is not the absolute value of wind velocity or some other
locally influenced value,  but the duration of the stagnation spell.
     Furthermore, a semi-quantitative relation can be found  between the
duration of the calm and the concentration increase.  For the winter season,
at Rouen, France, the following concentration factors were obtained by  re-
gression analysis of the  1968/69 data  ( = development  sample):  Table 8.
In each case these factors multiply the 30-day running average concentration
recorded at the  same station.
     The factors  relate  to  surface wind and its duration and thus may be
suitable for general application:  they were foreseen in the theory of BOUMAN
and SCHMIDT (1961)  and confirmed by  the  Rouen  data;  they  also  check well with
observations from other cities  (LAWRENCE,  1967;  KOLAR,  1969).  BRINGFELT (1971)
found in Stockholm, Sweden,  that during  stagnation periods  lasting  3-5  days,
the average SO2-level becomes 2.3 times the winter mean.  The factors figuring
in Table 8 were further checked and found adequate for two more winter
seasons in Rouen and Strasbourg, France.   The  temperature factor is  less
general, as it is an emission  factor that  is related to climate  and space
heating habits.  These factors  which depend  on wind  direction  are completely
local and linked to the source-receptor configuration (Rouen).
     The factors provided by Table 8 were  further utilized  to  obtain the  re-
sults shown in Table 9 for the 12 forecast episode-days of  the test-set
represented by the 1968/69 winter season.   Four of the 12 forecast episode
days did not materialize.
     Meteorological forecasts not being sufficiently detailed to justify the
breakdown into factors on the day before the episode forecast, an average
multiplying factor of 2.0 was  applied in each case.
     The RMS error of the values using the actual meteorological data observed
during the episode-days is 82 µg m^-3; the mean for 7 stations and 12 forecast
episode-days is 187 µg m^-3, i.e. a relative RMSE of 0.43.  For the true
forecast values (third line in Table 9), the RMS error is 87 µg m^-3, corresponding
to a relative RMSE of 0.46.  These figures are comparable and rather below
those generally  attained by other mathematical models in the more favorable
case of day-to-day pollution calculations.  It should be kept  in mind,  that
most models break  down just in  the stagnation-episode conditions where this
simple  method  seems applicable.
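
The relative RMSE quoted above is the RMS error divided by the mean observed concentration. A minimal Python sketch of that calculation follows; the function names and the sample values are illustrative assumptions and are not the entries of Table 9.

```python
from math import sqrt


def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    if len(forecast) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = ((f - o) ** 2 for f, o in zip(forecast, observed))
    return sqrt(sum(squared) / len(observed))


def relative_rmse(forecast, observed):
    """RMS error expressed as a fraction of the mean observed value."""
    return rmse(forecast, observed) / (sum(observed) / len(observed))


# Hypothetical concentrations in micrograms per cubic metre.
observed = [150.0, 220.0, 180.0, 200.0]
forecast = [170.0, 160.0, 210.0, 230.0]
print(relative_rmse(forecast, observed))
```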
     Continuing  to test two more winters (BENARIE and MENARD,  1972) which
were outside the development sample, the relative RMSE obtained was 0.66,
somewhat  larger  than  previously, but still below that for some elaborate
forecasting systems.
                                  SECTION 4
                                THE BOX MODEL

SHORT-TERM AVERAGES
     This concept is much too well known to require discussion  of the  details
which will already be familiar to most modelers.   Only a few general  ideas
and some results will be mentioned.  A very thorough discussion of the box
model was given by LETTAU (1970).
     1.  Firstly we remember that the box model  may be considered as  derived
from the idea of the continuity of mass of a volume element, as used  in the
advective transport equation.  If the volume element becomes large enough so
as to include the whole urban area, or at least  a major part of it, and if
diffusion can be neglected, then we are concerned with the so-called  box
model represented by the simple formula:
$$\chi = c\,Q_A/u \tag{1}$$
where QA is the source strength per area unit and u the local wind speed.
     This approach almost coincides with the intuitive idea which is to
assume that pollution coming from an area source is completely mixed within
a box, which has its base at the ground, and its top at the limit of vertical
mixing L; in this case L^-1 = c (SMITH, 1961).  Here we have the basic form
of the box model.  The product uL is the ventilation rate, i.e., the  flush-
ing rate per unit width of the box.  In fact, the general idea of using a
simple proportionality between emissions and concentrations goes back at
least to the Leicester survey (UNITED KINGDOM, 1945).  SHELEIKOVSKII  (1949,
p. 97) also derived an equation of the form of Eq. (1).  He took as the
source strength for particulates emitted by domestic space heating, the pro-
duct of  the population density by an emission factor.  Thus his concept
belongs  to the  long-term box models.
     2.  Another way of considering the problem was developed by GIFFORD
(1970, 1972, 1973), GIFFORD and HANNA (1970, 1973), and HANNA (1971, 1973a).
It involves the integration in the upwind direction  of  a  cross-wind  infinite
line-source diffusion formula.  GIFFORD and HANNA  used  the  Gaussian  version,
but other formulations, such as those  based on  Lagrangian similarity theory,
could be used just as well.  The results tend to be  not very  sensitive  to  the
particular diffusion model employed,  since according to the simplifications
that are made, only vertical diffusion is involved.   The  usual  simple power
law
$$\sigma_z = B\,x^{b} \tag{2}$$

where σz is the standard deviation, x is the downwind distance, and B and b
are constants, is used to represent the standard deviation of the concentration
distribution in the vertical.  The receptor point  is assumed  to be  located
at the center of a source square.  The lateral  dispersion is  neglected, so
that the concentration at a receptor at any time  can only be  influenced by
the (sum) of the upwind sources (GIFFORD 1959,  CALDER 1969:  "narrow plume
hypothesis").
     Based on this line of reasoning,  GIFFORD and HANNA concluded that  the
area-source component of a stable, non-reacting pollutant species would be
adequately described by the following formula:
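$$\chi = \sqrt{\frac{2}{\pi}}\;\frac{(\Delta x/2)^{1-b}}{B\,(1-b)\,u}
  \left\{ Q_{A0} + \sum_{i=1}^{N} Q_{Ai}
  \left[ (2i+1)^{1-b} - (2i-1)^{1-b} \right] \right\} \tag{3}$$
where χ is the pollutant concentration at ground level, u is the mean wind
speed, Δx is the source inventory grid spacing, and QAi the source strengths
in the (N + 1) upwind source boxes, i = 0, 1, ..., N.   The total ambient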
air quality then follows by combining the contributions  from  Eq.  (3), toge-
ther with the point-source contribution and Q0 , the background  concentra-
tion.  Eq. (3) is closely related to several  area source formulas  based  on
the Gaussian model, particularly the study by  CLARKE (1964).
     In fact, Eq. (3) actually takes into account N advective steps,  and so
by itself does not correspond to the simple box, but rather to the multiple
box category.  At present we are concerned with the application of Eq.  (3)
to a series of upwind boxes.  However, if the  wind direction  changes,  1...N
may also be interpreted as the contribution of the 1...Nth direction
multiplied by its class frequency.
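
A minimal Python sketch of the multiple-box sum discussed above. It assumes the Gifford-Hanna form in which the receptor box contributes Q_0 and the i-th upwind box contributes Q_i [(2i+1)^(1-b) - (2i-1)^(1-b)], all multiplied by sqrt(2/pi) (Δx/2)^(1-b) / [B (1-b) u]. The function name, the argument names and the sample values of B, b, Δx and u are illustrative assumptions and are not taken from this report.

```python
from math import pi, sqrt


def area_source_concentration(q, dx, u, B, b):
    """Ground-level concentration at the centre of the receptor box.

    q    : source strengths [q_0, q_1, ..., q_N] (mass / area / time);
           q[0] is the receptor box, q[i] the i-th box upwind
    dx   : source inventory grid spacing (m)
    u    : mean wind speed (m/s)
    B, b : constants of the power law sigma_z = B * x**b
    """
    e = 1.0 - b
    bracket = q[0] + sum(
        q_i * ((2 * i + 1) ** e - (2 * i - 1) ** e)
        for i, q_i in enumerate(q[1:], start=1)
    )
    return sqrt(2.0 / pi) * (dx / 2.0) ** e / (B * e * u) * bracket


# Hypothetical case: five boxes of equal source strength, 5 km grid.
Q_UNIFORM = [1.0] * 5
chi = area_source_concentration(Q_UNIFORM, dx=5000.0, u=4.0, B=0.15, b=0.75)
print(chi)
```

With equal source strengths the bracket telescopes to (2N + 1)^(1-b), so the result depends only on the distance from the receptor to the upwind edge of the source area.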
     On a statistical basis and without considering what happens  upwind
(here we depart  from the source oriented, deductive argument) HANNA (1971)
concluded that Eq. (1) is a good approximation to Eq.  (3)  with:

$$c = \left(\frac{2}{\pi}\right)^{1/2} \left[(2N+1)\,\Delta x/2\right]^{1-b} \left[B\,(1-b)\right]^{-1} \tag{4}$$
Equation (1) may be considered as a box model whose lid height increases
downwind, according to Eq. (4).  Since the quantity (1 - b) is quite small
and the product B(1 - b) only varies slowly in the stability range ordinarily
encountered over cities, the assumption that c = constant for a given
stability condition is quite reasonable.  According to GIFFORD and HANNA, c
can be assigned the approximate values 50, 200 and 600 for unstable, neutral
and stable conditions.  Since σz = B x^b in Eq. (2), it follows that
     c \approx x / \sigma_z          (5)
     It should also be pointed out that by combining Eqs.  (1) and (5) we
obtain
     \chi / Q_A = (1/\sigma_z)(x/u)          (6)
The quantity u/x is essentially what LETTAU (1970) terms the "flushing fre-
quency" in his exposition of the role of the box model of urban diffusion.*
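     As a rough illustration of the simple box estimate, the sketch below
evaluates Eq. (1), taken here in the form χ = c Q_A / u, with the approximate
c values quoted above, together with the flushing frequency u/x.  Function
names, units and the sample numbers are assumptions made for the example.

```python
# Approximate values of c quoted from GIFFORD and HANNA.
C_BY_STABILITY = {"unstable": 50.0, "neutral": 200.0, "stable": 600.0}


def box_concentration(q_area, u, stability="neutral"):
    """Eq. (1): chi = c * Q_A / u (consistent units assumed)."""
    return C_BY_STABILITY[stability] * q_area / u


def flushing_frequency(u, x):
    """u / x: how often the air over a city of along-wind size x is renewed."""
    return u / x


# Hypothetical city: Q_A = 1e-6 g m^-2 s^-1, u = 4 m/s, x = 20 km.
chi_neutral = box_concentration(1.0e-6, 4.0)               # g m^-3
renewals_per_hour = flushing_frequency(4.0, 20000.0) * 3600.0
```

With c fixed by stability class, the estimate needs only the area source
strength and the wind speed, which is the whole attraction of the method.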
     3.  The statistical argument put forward by GIFFORD and HANNA - a third
way to come to the box model - changes the nature of Eq. (1) from source-
oriented into receptor-oriented, and at the same time the formula becomes
a description rather than a deduction.  Nevertheless, it is included in this
section so as not to disrupt the discussion of the box model.
     4.  MILLER and HOLZWORTH (1967) and HOLZWORTH (1972) also treated the city
source as a continuous series of infinitely long cross-wind sources.  Vertical
concentrations follow a Gaussian distribution and average unstable conditions
are assumed.  The normalized concentration χ/Q_A in these circumstances is
given by HOLZWORTH as
     \chi / Q_A = 4.0 \, (x/u)^{0.115}          (7)
for x/u ≤ 0.47 L^1.13 (L = mixing height) and if no pollutants achieve uniform
vertical distribution.  However,
     \chi / Q_A = 3.61 \, L^{0.13} + \frac{x}{2 L u} - 0.088 \, \frac{u L^{1.26}}{x}          (8)
for x/u > 0.47 L^1.13 and if some pollutants achieve uniform vertical
distribution.  In most cases, the term with the 0.088 coefficient is very small
and can be neglected.
*See addendum
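     HOLZWORTH's two branches can be sketched as a single function.  The
second branch is taken here as χ/Q_A = 3.61 L^0.13 + x/(2Lu) - 0.088 u L^1.26/x;
metres and seconds are assumed for x, u and L (so that χ/Q_A is in s/m), and
the function name is hypothetical.

```python
def holzworth_chi_over_q(x, u, L):
    """Normalized concentration chi/Q_A after HOLZWORTH.

    x : distance the wind travels across the city (m, assumed)
    u : wind speed (m/s, assumed)
    L : mixing height (m, assumed)
    """
    travel_time = x / u
    if travel_time <= 0.47 * L ** 1.13:
        # No pollutant has yet reached the top of the mixing layer.
        return 4.0 * travel_time ** 0.115
    # Part of the pollutant is uniformly mixed up to the height L.
    return 3.61 * L ** 0.13 + x / (2.0 * L * u) - 0.088 * u * L ** 1.26 / x
```

With the rounded coefficients used here the two branches agree to within
about one percent at the switching point x/u = 0.47 L^1.13.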
     HOLZWORTH (1972) presents tabulated values of χ/Q_A as a function of L,
u and x (= the distance the wind travels across the city), for the two city
sizes of 10 and 100 km.  The smaller the values of L and u, and the larger the
value of x, the smaller is the relative difference between the χ/Q_A values
from HOLZWORTH's model  and those from the box model.   Thus  HOLZWORTH's
approach may be considered as a fourth  way to converge towards the box model.
The approximation depends here also (as with  GIFFORD  and HANNA's deduction)
on the vertical  mixing.
     5.  Finally, we may note that a test of  the theoretical  approaches will
depend on the empirical determination of
                a.  the vertical  pollutant profiles and
                b.  the horizontal mass balance of the box.
These were  experimentally checked by HALPERN  et al (1971);  sulfur dioxide
concentrations were obtained from helicopter  soundings in the New York City
area and vertical wind profiles by using pilot balloon observations.  The
data of HALPERN et al confirm that the mass balance expressed by Eq. 1 between
emissions and ambient concentrations is a fair approximation in urban
areas.
     Also,  we may note the calculation  of CO-concentrations by HANNA (1973a)
and their comparison with the observations at eight stations  in  the Los Angeles
basin, for  the period 5 a.m. - 4 p.m.,  on Sept 29, 1969.  These  are presented
in Table 10.  The concentrations are set initially equal to the  5 a.m. ob-
served concentrations.
     The correlations at each station are of the same order as for other more
sophisticated models.  The overall correlation coefficient from  6 a.m. to
4 p.m. for  all eight stations, is 0.43.  Obviously, in the computation of
correlation coefficient the 5 a.m. figures -- where identity  of  the calculated
and observed concentrations was assumed -- were omitted.  The correlation at
any given hour over the geographical extent of the Los Angeles basin is
uncertain.  It seems that in this case, with eight "boxes" centered on the eight
stations, the theoretical basis of the box model has been stretched too far.
In the first place, the single box model is not meant to provide spatial
resolution and in the second, when a multi-box is used, as here, advection
must also play a role.  It seems that we have here an illustration of the
principle that lengthening the deductive chain by introducing more
sophistication (here, in the quest for spatial resolution) does not necessarily
improve the results, because at the same time we introduce an increased amount
of noise.  When hourly means are taken at each hour over the entire basin
(last column of Table 10), it is seen that the calculated concentrations
are mostly too high, by a factor of two or even greater.  The correlation
in this column is 0.74, significant at the 1% level.
     Eq. 1 has been validated by HANNA (1971), and GIFFORD (1972,  1973) on
short-time concentration values for several urban areas.  Table 11  is a sum-
mary of these validation results.  For the comparison of the box model  per-
formance with other models, the references should be consulted.  It is  a
personal opinion of the author's, that only models with similar amounts of
detail, as regards the input and the output, should be compared (BENARIE,
1975).
     The discussion, as so often happens between protagonists of (simple) box
models and others pleading for more elaborate ones yielding a finer space-
time resolution, somehow misses its point.  An analogy with road maps may
perhaps help to clarify the situation.  A general map, say of 1 : 1,000,000
scale, will give the distances among various points almost as exactly as the
ten sheets of 1 : 100,000 maps pasted together.  This is because the higher
precision of the latter will probably be destroyed by matching errors.  Thus
the low resolution map may win, because it achieves the desired purpose
(= distance measurement) by far more economical means.  But when it comes
to the intermediate resolution, the 1 : 1,000,000 map has nothing comparable
to offer.  In the same way, it is not fair to compare calculations  by simple
box (or other space or time average) models with average values given by
higher resolution models.  If the user only needs averages then it would be a
waste of money to search for high-resolution input and compute expensively
detailed estimates, only then to lump them together.  On the other  hand, if
the user asks for high-resolution data, even the best quality average estimates
will not satisfy his needs.
     Notwithstanding the attempts to mathematicize it on a  Gaussian or other
basis, the box model is really rooted in the statistical validations advanced
by HANNA and GIFFORD.  Only further testing and comparison  with  observed data
can show just how "good" the concepts are.
     HANNA (1973 b,c) proposed the extension of the simple  box model to chemi-
cally reactive pollutants.
     The model seems fairly successful  in the prediction of hourly variations
of CO, hydrocarbons and NO.  However, it is inconsistent in its prediction of
NO2.
SIMPLE BOX MODEL FOR LONG-TERM AVERAGES
Integral Application of the Box Concept to an Entire Urban  Area
     It was pointed out by GIFFORD and HANNA (1973, where further references
may be found) that if Eq. 1 is applied to yearly or seasonal,  i.e., long-
term averages, the estimates for the pollutant concentrations  compare favor-
ably with those obtained from other models.  Writing Eq. 1  as

     \chi = c \, \frac{Q_{TOT}}{A u}          (1.a)
where Q_TOT is the total yearly, seasonal . . . pollutant emission of the
source,
     A is the area within which the pollutant is being emitted, and
     u is the yearly, seasonal . . . average wind velocity; and using
published average urban pollutant concentrations, GIFFORD and HANNA obtained
Table 12.
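     The arithmetic behind Eq. 1.a is a one-line calculation once the units
are made consistent.  The sketch below is only an illustration: the city
figures are hypothetical, and c = 202 is the particle average quoted in the
text.

```python
SECONDS_PER_YEAR = 365.0 * 24.0 * 3600.0


def long_term_concentration(c, q_tot, area, u):
    """Eq. 1.a: chi = c * Q_TOT / (A * u), in consistent units."""
    return c * q_tot / (area * u)


def box_constant(chi, q_tot, area, u):
    """Eq. 1.a solved for c from an observed long-term average chi."""
    return chi * area * u / q_tot


# Hypothetical city: 50,000 t/yr of particles over 500 km^2, mean wind 4 m/s.
q_tot = 50000.0 * 1.0e12 / SECONDS_PER_YEAR      # micrograms per second
area = 500.0 * 1.0e6                             # square metres
chi = long_term_concentration(202.0, q_tot, area, 4.0)   # micrograms per m^3
```

Used in the other direction, box_constant() is how a value of c would be
backed out of published emission, area, wind and concentration figures.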
     The average value of c from the particle data is 202, and that for SO2
is 50.  The authors believe that a large part of this difference is caused
by the fact that Q_TOT for SO2 generally contains a much larger fraction of
emissions from tall-stack "point" sources, such as steam-electric power
plants, than does Q_TOT for particles.  Eq. 1.a is designed to account for
point-sources only to the extent that they are low enough to be considered
part of the distributed area-source component.  If all the contributions from
strong, elevated point-sources were removed, the two c-values would probably
be closer together.  However, the necessary source data to make this correc-
tion is not included in the published reports.
     While this reason, as advanced by GIFFORD and HANNA, certainly contributes
to the difference between the c-values, there may also be others.
Thus the SO2 averages used do not seem to be representative.  For example, it is
not likely that the average SO2 concentration in Detroit would be 16 µg m^-3.
Even the averages listed in Table 12 for Denver, Buffalo, Kansas City and
Milwaukee seem below the usual rural averages.  Another reason for the
discrepancy may be the transformation of the SO2 into sulfates, thus reducing
the monitored SO2 concentrations.  It is possible that if all these input
errors could be removed, Eq. 1.a would perform even more generally.
     GIFFORD and HANNA (l.c.) tested Eq. 1.a on the particulate pollution
of 15 additional U.S. cities, obtaining the same range of c values as
previously.  BENARIE (1975) used Eq. 1.a in a reverse way, calculating from
it the yearly average mixing height, since c can be interpreted as the ratio
of the transport distance from the city's edge to the average mixing height.
Very plausible values were obtained for SO2, for the French cities of Rouen,
Paris, Strasbourg, Lyon, Bordeaux and Marseille, as well as for the Japanese
cities of Tokyo and Osaka.*
     The relative success of the simple box model for long-time averages must
be attributed at least partially to the fact that all box or cell concepts
are based on the idea  of the infinite vertical diffusivity inside the cell.
Taking long-term averages, the diffusion time may be neglected when compared
to the averaging time.  Thus at least this requirement is satisfied.
Multiple Application of the Box Concept to an Urban Area
     It was pointed out earlier that the box concept is the extension of a
single volume element  over an entire urban area.  As such, it is not meant
to provide resolution, and its specific advantages are linked to the
properties of low resolution, when this is all that is required.  GIFFORD and HANNA
(1970) applied the box concept to a source pattern in the form of a 1 by 1
km grid.  This is obviously a hybrid case lying between the multi-cell and
box concept, and with  the inherent difficulty of allowing for the influence
of the upwind and neighboring boxes.  GIFFORD and HANNA (1970) adopted a
simple step-wise scheme for wind directions other than the cardinal ones,
while THUILLIER (1973) in a pragmatic way made a subjective estimation of
the relative contributions of the upwind boxes for each county-size box
and each of the possible patterns.  These relative contributions were then
multiplied by the annual recurrence frequencies of the patterns to obtain a
set of annual average contribution weighting factors.
*See Addendum
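     The weighting step described for THUILLIER's scheme amounts to a
frequency-weighted sum.  The sketch below is only a guess at the bookkeeping
involved; the pattern names, box labels and numbers are hypothetical and are
not taken from THUILLIER (1973).

```python
def annual_weighting_factors(contributions, frequencies):
    """Frequency-weighted contribution of each upwind box to one receptor box.

    contributions : {pattern: {box: relative contribution under that pattern}}
    frequencies   : {pattern: annual recurrence frequency (fractions sum to 1)}
    """
    weights = {}
    for pattern, shares in contributions.items():
        for box, share in shares.items():
            weights[box] = weights.get(box, 0.0) + frequencies[pattern] * share
    return weights


# Hypothetical example with two patterns and two upwind boxes.
factors = annual_weighting_factors(
    {"westerly": {"A": 0.7, "B": 0.3}, "southerly": {"A": 0.2, "B": 0.8}},
    {"westerly": 0.6, "southerly": 0.4},
)
```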
     A comparative (model-to-model) validation of this multi-box model (with
some modifications) and some plume models, was made in a very relevant paper
by TURNER, ZIMMERMAN and BUSSE (1972).  A subsequent model-to-model and
model-to-observed comparison was made by STROTT (1974) for Frankfurt, Germany.
As with the previous model-to-model comparisons, the computed concentration
isopleths exactly reflect the emission inventory for the area sources -- a
rather obvious result, for which no modeling should be needed.  Quantita-
tively, the multi-box model strongly overestimated the concentrations.  It
seems that the box model should not be overstretched to yield simplicity and
resolution at the same time.
                                 SECTION 5
                 CORRELATION WITH DEMOGRAPHIC PARAMETERS
     The basic box model Eq. 1 requires proportionality between pollutant
concentration and the source strength per area unit.  If the wind speed is
averaged over a whole year or even a longer period, this mean value is sub-
ject to only relatively slight changes.  Hence Eq. 1 may be written:

     \chi = \frac{c}{\bar{u}} \, Q_A = c' \, Q_A          (1.b)

     where Q_A is the source strength per area unit
           ū is a long-term average of the wind speed.
     Relation 1.b was found graphically by PEMBERTON et al (1959) to hold
between the smoke* in Sheffield, England, and the number of electors per acre
in the district surrounding the  sampling site.  The authors considered as an
index of population density the  number of electors per acre.  On the  other
hand, they assumed  that domestic  heating accounts for the major part  of smoke
pollution and hence, the source  strength per area unit should be represented
by the "elector per acre" index.   It should be noted that although  PEMBERTON
et al did not express the graphical  regression algebraically, the "box model"
was proven in this  way two years  before its first mention by SMITH, in 1961.
In the analogous graph for sulfur dioxide,  PEMBERTON et al  did not  find a
correlation.  This  may be explained  by the fact that sulfur dioxide showed
higher average concentration near heavily industrialized areas with low or
moderate population density.  Consequently, the population density  in Shef-
field was not a good index for sulfur dioxide source strength per area unit.
     The situation  is somewhat different in Paris, France,  where the  contri-
bution of industrial sources to  area source strength, is relatively smaller
than for Sheffield.  As PELLETIER (1967) pointed out (Figs. 1 and 2),
both smoke and sulfur dioxide show a strong correlation with population
density.   It may be noted that population density, as obtained from census
figures, is just a first approximation  to source  strength,  but  it works  out
well within reasonable limits,  inside comparable  demographic  entities.
     Thus, PEMBERTON's and PELLETIER's  linear regressions are not transpos-
able directly to differently structured areas.  The  per  capita  pollutant
emission may vary between wide  limits,  depending  on  the  fuel  use, nature of
heating devices and other factors.   Population  density may  have a different
meaning in different countries, because census  survey  is done within  admini-
strative limits that are not well  defined emissive entities at  the  time  of
origin.
     PEMBERTON's and PELLETIER's approaches  are regression  methods.   Instead
of meteorological parameters an index of source strength per  area unit was
considered as an independent variable,  linked to  population density.
     Another good predictor of  pollutant concentrations, is the total popula-
tion of the urban area.  In the United  States,  concentrations of suspended
particulate matter were associated with urban population and  this is  pre-
sented in Table 13.  Although this table does not fit  a  linear  regression,
portions of it may be approximated linearly, as for  example:
     \chi \; (\mu g \, m^{-3}) = 45 \log P - 145          (9)

     where log P is the common  logarithm of the population.
This seems to represent fairly  well the important 50,000 to 500,000 popula-
tion range.  Thus we may have an extremely cheap  way to  estimate, within a
range of 50%, the yearly average of the particulate  concentration of an  urban
area, provided that climatology, fuel use, traffic  conditions and population
density are about the same as in the Continental  U.S.  BOLIN et  al  (1971)
reported a similar relationship to Eq.  9 for Swedish cities.
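     Eq. 9 can be evaluated directly.  The sketch below adds a check on the
50,000 to 500,000 population range for which the linear fit is said to hold;
the function names are illustrative.

```python
import math


def particulate_from_population(population):
    """Eq. 9: yearly mean suspended particulate concentration, micrograms per m^3."""
    return 45.0 * math.log10(population) - 145.0


def in_fitted_range(population):
    """True inside the 50,000 to 500,000 range the approximation was fitted to."""
    return 50000 <= population <= 500000


# A city of 100,000 inhabitants: log P = 5, so chi = 45 * 5 - 145 = 80.
estimate = particulate_from_population(100000)
```

Outside the fitted range, and outside comparable conditions of climate,
fuel use and traffic, the number should be treated with caution.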
     Statistical correlations for 23 cities, ranging in  population  from
100,000 to 2 million, were run  by CARTER (1973) and  by CARTER and NELSON
(1973).  Using the correlations between the pollutant emissions and the
demographic factors of population, number of passenger vehicles registered
by  county, and the percentage of the work force employed in manufacturing, a
linear modeling technique describes the future air pollution  emissions of a
city by size and the growth of the emissions.

                                 SECTION 6
                   CONCENTRATION FREQUENCY DISTRIBUTION
THE LOG-NORMAL REPRESENTATION OF CONCENTRATIONS
     The histogram of urban air pollutant concentrations  sampled over any
given time span (1 minute, 1  hour, 24 hours and so on)  is quite skew.   There
are only a few near-zero values, but afterwards the frequency increases
sharply, only to decrease again gradually towards  the higher concentrations.
A large number of skew distribution functions known in  statistics can be
fitted to such data:  Poisson (WIPPERMAN, 1966), negative  binomial (PRINZ and
STRATMANN, 1966), Weibull (BARLOW, 1971), exponential (BARRY, 1971,  SCRIVEN,
1971), gamma (= Pearson III), beta (= Pearson I) and Pearson IV (LYNN, 1972).
None of these has enjoyed the practical  success and the wide acceptance of
the lognormal distribution.  POLLACK (1973, 1975)  demonstrated that  there is
a fundamental similarity among these distributions utilized to fit air qual-
ity data.
     As early as 1958, it was empirically shown that cumulative frequency
distributions of suspended particulates  at CAMP (urban) sites fit remarkably
well a straight line  when plotted on log-normal paper (U.S.D.H.E.W., 1958).
Pronounced tendency towards log-normalcy of particulate concentrations was
also observed by ZIMMER et al (1959) and by GOULD (1961).  LARSEN (1961)
extended this representation to carbon monoxide and ZIMMER and LARSEN (1965)
to carbon monoxide, hydrocarbons, nitric oxide, nitrogen  dioxide, oxidant
and sulfur dioxide and to the main urban areas of the U.S.A.  From this point
on, lognormal plotting gained almost exclusive use among the graphical
and functional representations of air pollution concentrations, and the
number of papers and reports that make use of it is in the hundreds.
     Considerable theoretical (GIFFORD, 1972; KNOX and POLLACK, 1972; KAHN,
1973) and empirical (BENARIE, 1970) support exists for the lognormal
distribution, as the most appropriate for characterizing both reactive and inert
pollutant concentrations for a wide range of averaging  times.   These argu-
ments were systematized and publicized by POLLACK (1973,  1975), although
overwhelming acceptance of the log-normal representation  was  a  fact long be-
fore theoretical  proofs became available.  Some of the  reasons  for this
acceptance are:
     1.  The lognormal  distribution is a relatively simple two-parameter dis-
tribution.  Both  parameters have easy-to-grasp physical meaning.
     2.  Convenient plotting paper and methods are available;  the user does
not have to resort to lengthy numerical calculations.  The two  parameters are
easily read off the graphs.
     3.  The lognormal function has some mathematical properties (see
AITCHISON and BROWN, 1969) which make its use very easy.  Standard statistical
tests, mostly requiring a normal distribution of the  population,  may readily
be applied after  the logarithmic transformation which is  automatically pro-
vided by the plotting.
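     The two parameters that are read off log-normal paper, the geometric
mean and the standard geometric deviation, can equally be obtained from the
logarithms of the data.  The following sketch is a numerical stand-in for
the graphical method, not a procedure taken from the cited papers; the sample
values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_parameters(concentrations):
    """Geometric mean and standard geometric deviation of positive data."""
    logs = [math.log(c) for c in concentrations]
    return math.exp(mean(logs)), math.exp(stdev(logs))


def concentration_at(geometric_mean, geometric_sd, fraction_below):
    """Concentration not exceeded for the given fraction of the time."""
    z = NormalDist().inv_cdf(fraction_below)
    return geometric_mean * geometric_sd ** z


# Hypothetical sample: three readings, each twice the previous one.
gm, gsd = lognormal_parameters([10.0, 20.0, 40.0])
median = concentration_at(gm, gsd, 0.5)
```

On the plot, the geometric mean is the 50 percent point and the standard
geometric deviation is the ratio of the 84 percent to the 50 percent point.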
     Inhomogeneous source distribution around the measuring site  may lead to
deviations from the lognormal behavior (BENARIE, 1970).  JOST et al (1974)
have attributed to this cause the occasional departures from lognormalcy
observed in Frankfurt, Germany.
     Objections of a theoretical nature may be raised against  the log-normal
representation of pollutant concentrations (BARLOW, 1971; MILOKAY, 1972;
MARCUS, 1972).  These arguments mostly consider the extreme values, like zero
values of the pollutant concentrations.  It should be realized that such
(theoretically important) concentrations are ordinarily below the sensitivity
threshold of the  measuring instruments, which therefore introduces a thres-
hold parameter.  The practical advantages of the log-normal representation
are full justification for its wide-spread use in air pollution engineering.
AVERAGING-TIME ANALYSIS
     LARSEN (1964), ZIMMER and LARSEN (1965), LARSEN et al (1967), LARSEN
(1969, 1973,  1974) plotted by computer -- the first paper for a period of
one year, and the last for up to a seven-year period -- the concentration
frequencies as a function of averaging time for:  carbon monoxide,
hydrocarbons, nitric oxide, nitrogen dioxide, nitrogen oxides (NO + NO2), oxidants
and sulfur dioxide for the CAMP sites in downtown Chicago, Cincinnati, Denver,
Los Angeles, Philadelphia, St. Louis, San Francisco and Washington.  These
plots have been called "arrowhead diagrams" by STERN (1969).  They may be
characterized by the following properties:
     1.  Concentrations are lognormally distributed for all averaging times.
     2.  Plotted on a (log averaging time) - (log concentration) diagram, the
points representing a given percentile (frequency) are aligned almost on a
straight line.*   Hence,  at constant frequency, concentration is proportional
to averaging time raised  to a constant power.
     3.  The 30 percentile is close to the arithmetic mean concentration.
The exponent (see above)  is only a little different from zero, so that the
30 percentile and the arithmetic mean only vary slightly for all averaging
times.
     4.  For the longest  averaging time calculated (usually one year), the
arithmetic mean, geometric mean, maximum concentration, and minimum concen-
tration are all  equal (and thus plot as a single point).
     5.  For averaging times of less than one  month, maximum concentration is
approximately inversely proportional to averaging time raised to an exponent.
The maximum concentration is that corresponding to the 1/n frequency point
(n = number of samples, e.g., 8760 hourly samples per year) on the linearly
extrapolated cumulative diagram.
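     Properties 2 and 5 lend themselves to a short numerical sketch.  It
assumes a lognormal distribution described by its geometric mean and standard
geometric deviation, takes the maximum as the 1/n frequency point, and uses a
simple power law in averaging time; names and numbers are illustrative and
are not LARSEN's tabulated values.

```python
import math
from statistics import NormalDist


def expected_maximum(geometric_mean, geometric_sd, n_samples):
    """Concentration at the 1/n frequency point of a lognormal distribution."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / n_samples)
    return geometric_mean * geometric_sd ** z


def averaging_time_exponent(t1, c1, t2, c2):
    """Exponent q in c_max proportional to t**(-q), from two plotted points."""
    return -math.log(c2 / c1) / math.log(t2 / t1)


def scale_maximum(c_ref, t_ref, t, q):
    """Maximum for averaging time t, given the maximum c_ref at time t_ref."""
    return c_ref * (t / t_ref) ** (-q)


# Hypothetical hourly data: geometric mean 10, standard geometric deviation 2.
hourly_max = expected_maximum(10.0, 2.0, 8760)
daily_max = scale_maximum(hourly_max, 1.0, 24.0, 0.3)   # q = 0.3 is assumed
```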
     Potential  reasons for characteristic 1  above were cited by BENARIE
(1970), GIFFORD (1972), KNOX and POLLACK (1972) and KAHN (1973).  Properties
2 and 3 are the most important experimental  findings based on the analysis
of the  CAMP-results mentioned above.  Property 4 is a necessary consequence
of the  averaging process.   SINGAPURWALLA (1972) has cited possible reasons
for the property 5.  McGUIRE and NOLL (1971) verified the relationship be-
tween maximum concentration and averaging time, for five different air pollu-
tants at 17 California sites in Los Angeles  and San Francisco.  The exponents
are within the range observed by LARSEN.
*The f and the 1-f frequency loci  are in fact asymptotes of parabolae.
The vertices of these parabolae are  located at the one-year arithmetic
average point.  The nearer to this point, the greater the deviation from a
straight line (Personal  communication of Dr. LARSEN).
     The main consequence of this concept is to interrelate short- and long-
averaging times in a descriptive, statistical,  and receptor-oriented way.
Therefore, in the systematics embodied by Table 2, the concentration fre-
quency distribution model belongs under both the headings  short-  and long-
term (however, receptor-oriented).   But there may exist a  possible source-
oriented extension, as noted by STERN (1969):  If we were  able to separate
the source factors subject to human control, from the weather factors beyond
such control, we would be able to synthetize the distributions of air quality
data that would result from the application  of  specific control  strategies.
We would also be able to compare them with air  quality objectives, expressed
in like format, to determine which strategy  comes closest  to effecting a
match.  The concentration versus averaging time and frequency diagram  might
have as its components the weather factors and  the source  factors.  The
analysis of the source factor arrowhead chart for its individual  components
would be the converse of the emission inventory approach,  in that the latter
seeks to arrive at the same result through synthesis, whereas the approach
just outlined seeks to arrive at it through  analysis.  The two approaches
should tend to check and reinforce each other,  and thus improve our chances
of determining the relative influence of various source categories across
the averaging-time spectrum.  This should give  us useful leads to control
strategies.
     In the papers cited (mainly those of 1969, 1973 and 1974), LARSEN pro-
vides examples for interrelating air pollutant effects, air quality standards,
air quality monitoring, diffusion calculations, source reduction calculations,
and emission standards.
     In the same papers, LARSEN published extensive tables -
     1.  interrelating the ratio of expected annual maximum pollutant concen-
trations to arithmetic mean concentrations for various averaging times and
standard geometric deviations, and
     2.  the slope of the annual maximum concentration line for various stand-
ard geometric deviations of the one-hour frequency plot.
     With the experimental arrowhead diagram at hand, the expected annual
maximum, or  the  slope of the  line linking it to various averaging times, or
to any other parameter of interest,  can  be read  off at least  as  easily as
would be their readout from tables or numerical  calculation.
                                SECTION  7

                    WIND AND CONCENTRATION RELATIONSHIPS
     It is evident that pollutant concentration  distributions  are only the
footprints of the windfield.  At the same  time,  it  has  been  observed that
the logarithmic normal  function is a convenient  empirical  representation of
the wind velocity distribution (BENARIE,  1969).   The  fact  that we are con-
cerned at this point with a rough approximation  appropriate  to the argument
that follows below -- without pretending  to describe  general physical  pro-
perties of the wind -- was stressed in  the Appendix and a  subsequent discuss-
ion of a paper by BENARIE (1972).
     Using numerical simulation, for area  sources represented  by n point
sources, BENARIE (1971) obtained fair approximations to the log-normal
distribution, provided that n ≥ 10 and that the geometric means and standard
geometric deviations of the component log-normal  functions were randomly dis-
tributed.  The conditions under which the  sum of n  lognormal variates is
approximately lognormal for a limited number of  variates  in  the sum, have been
previously formulated by MITCHELL (1968).   BENARIE's  (1971)  paper supports
the empirical observations that pollutant  concentration for  all cities and
for all averaging times is approximately lognormally  distributed (Section 6)
as a consequence of the (approximately)  lognormal windfield, but the paper
does not quantitatively link the lognormally distributed  windspeeds to urban
pollutant concentrations.
     This link was accomplished in an elegant way by  KNOX and  LANGE (1974),
who noted that the basic box model Eq. 1 suggests that the frequency distribution
of the wind speed determines the frequency distribution of the normalized
concentration χ/Q_A, where χ is the surface air concentration of the pollutant
and Q_A is the unit area source strength, provided that c (the proportionality
constant) is nearly independent of frequency.  In principle, for the cases of
good frequency correlation between χ and U^-1 at a given sampling station, the
constant c' = cQ_A can be found through graphic superposition of the observed
and predicted distributions.  In this case, no direct knowledge of the
source strength Q_A is required.
     The normalization constant c' for the 50 percentile point is obtained
from the superposition of c'/U distributions on the corresponding observed
χ distribution curves.  For the CO-values observed at the building of the
San Francisco, California Bay Area Air Pollution Control District during the
year 1966, c' was found to equal 7.5 ppm m s^-1 and with 1970 data, 7.4
ppm m s^-1.  The average value c' = 7.45 was used as an experimental
normalization factor in Fig. 3.
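     Matching the two distributions at the 50 percentile point reduces to
multiplying two medians.  The sketch below is a numerical stand-in for the
graphic superposition, assuming the normalization constant (written c' here)
is the product of the median concentration and the median wind speed; the
sample series are hypothetical.

```python
from statistics import median


def normalization_constant(concentrations, wind_speeds):
    """c' matched at the 50 percentile point: chi(50%) * U(50%)."""
    return median(concentrations) * median(wind_speeds)


def predicted_concentrations(c_prime, wind_speeds):
    """Sorted c'/U values, to be compared with the observed distribution."""
    return sorted(c_prime / u for u in wind_speeds)


# Hypothetical series with medians 1.3 ppm and 5.7 m/s.
c_prime = normalization_constant([1.0, 1.3, 2.0], [4.0, 5.7, 8.0])
predicted = predicted_concentrations(c_prime, [4.0, 5.7, 8.0])
```

No emission inventory enters the calculation; only the observed concentration
and wind speed distributions are needed.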
     Instead of using observed χ values, c' may also be estimated from the
values calculated by means of some model.  For this purpose,
KNOX and LANGE used McCRACKEN et al's (1971) multi-box model.  Since charac-
teristic times in this model are of the order of one hour (box dimension/
wind speed), the concentration values can be interpreted as hourly averaged
values.
     This model was used to predict the CO concentration for a 48-hour period,
July 10-11, 1968.  Fig. 4 shows the observed wind speed frequency distribu-
tion for this period as obtained by the U.S. Weather Bureau on top of the
11-story San Francisco Federal Office Building.  This frequency distribution
is quite closely lognormal.  Fig. 5 shows the observed and the model pre-
dicted CO concentration frequency distributions.  The predicted concentration
distribution χ_PR has the same slope as the observed distribution χ_OB, and
a geometric mean 20% above that observed.  The normalization factor c',
as derived from this model with the data from Figs. 4 and 5, is c' = χ_PR(50%)
U(50%) = 1.6 x 5.7 = 9.1 ppm m/sec.  Fig. 6 shows how the distribution
curves shift when we use this Bay Area model derived normalization factor
c' = 9.1 on the data of Fig. 3.
     It is of interest to note that an experimental value of c' could also
be computed from the observed wind speeds and the observed concentrations
χ_OB used as a basis for the 48-hour period in the Lawrence Livermore
Laboratory air pollution model study.  This factor obtained from Figs. 4 and 5 is
c' = χ_OB(50%) U(50%) = 1.3 x 5.7 = 7.4 ppm m/sec, which agrees well with the
experimental c' = 7.45 discussed previously for the 1966 and 1970 annual
distributions.  The fact that the mean annual normalization constant can be
determined so well by considering only a two-day period indicates that the 48-hour
period may be sufficiently long to give  a good average of  the CO source varia-
tion in the city.  This is not surprising, if one remembers that the main
source of CO is the daily automobile traffic.  In other words, to find a
regional normalizing factor c' for an annual  mean concentration frequency
distribution, a model  need only cover the longest basic time period of any
time-dependent sources or sinks involved, just so long as  this period  is
typical.
                                SECTION 8
                 VALIDATION - OR THE WAYS TO DELUDE ONESELF
     It is the proper function of the statistician (and I am not one) to pro-
nounce on the merits of chi square, skill score, correlation, RMS and abso-
lute error, and a host of other measures of goodness-of-fit.  I  am sure that
everybody knows all about the computation of these indices and so, in prin-
ciple, nobody needs my advice about the fact that any one goodness-of-fit
index may be misleading.  Nevertheless, I cannot resist the temptation to
illustrate this point by just one example.
     Table 14, in its second column, shows the results of a model calculation
(LAMB, 1968) based on the concept of mass transport balance and taking
chemical reactions into consideration, although the nature of the model and
the method of calculation do not concern us here.  This model was taken as an
example because it is rather often quoted as a reference.  Along with the
calculated values, a "random" estimate (Column 4) and a "constant" (average)
estimate (Column 5) are presented in Table 14.  To obtain the random
estimate, monotonically increasing values from 0 to 17 ppm were assigned to
the stations in alphabetical order.  For the last column, values of 14 and
13 were assigned alternately (to avoid fractional values, as the true mean is
13.5 ppm).  Incidentally, 13.5 ppm is not only the average of the first
column but also a very likely average figure for many urban areas with
automobile traffic anywhere in the world.
     The entries that give the root mean square error (RMSE) are a caution
against validation by just one statistical criterion.  The model shows a
higher RMSE than either the (almost) random or the constant value guesses.
The correlation coefficient entry rectifies this situation.  The constant
estimate, a line parallel to the abscissa, shows, as expected, no correlation
with the observed values.  The model's correlation attains the 5%
significance level for 11 degrees of freedom.  However, even the random
guess presents a correlation which could not be entirely rejected.  It
should be noted that this "guess" is not completely random but rather
an educated guess, since the lowest and highest values are linked to some
knowledge about the concentrations which might actually be observed.
     This simple and somewhat superficial example can be generalized and
provides a warning against some of the pitfalls.  The lack of
representativeness of any single goodness-of-fit index has already been
mentioned.  A second point is that pure chance can frequently produce a fit
which is not too bad, provided that the series to be fitted is short and the
span of the estimation limited.  A third point, also linked to a limited span
of possible values, is that, judged by the RMS error, the mean is often a
very good bet, better than most calculations.  Finally, no validation should
be presented without a comparison with the random estimate (the skill score
does just this).
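
The first and third points can be made concrete with a small sketch. The numbers below are invented purely for illustration and are not the Table 14 entries; the helper names are likewise hypothetical. The "model" here ranks every station correctly but carries a constant bias, so it scores a perfect correlation and still loses to the plain mean on RMSE.

```python
import math


def rmse(observed, estimate):
    """Root mean square error between two equally long series."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimate)) / n)


def pearson_r(x, y):
    """Pearson correlation; returns None when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant estimate
    return sxy / math.sqrt(sxx * syy)


# Hypothetical concentrations in ppm (illustration only).
observed = [7, 9, 12, 13, 15, 16, 18, 22]
model = [o + 6 for o in observed]                      # right ranking, biased high
constant = [sum(observed) / len(observed)] * len(observed)

for name, est in (("model", model), ("constant", constant)):
    print(name, round(rmse(observed, est), 2), pearson_r(observed, est))
```

Judged by RMSE alone the constant estimate wins; judged by correlation alone the biased model looks perfect. Neither index on its own tells the whole story.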
                                SECTION 9
                                CONCLUSION
     The guiding idea in the present paper is to recommend the use of models
which correspond with the utmost parsimony to the end result which has to
be attained.  Thus, high resolution, sophisticated models should only be used
when high resolution output information is really needed.  For long-term,
low-resolution purposes, we frequently have fully adequate, low-cost models.
     It is not a safe procedure to obtain long-term, low-resolution
information by integration from short-term, high-resolution estimates.  As the
chain of reasoning lengthens, unavoidable noise is introduced at each step.
The end result often is that the long-term estimate obtained in this fashion
is less reliable than one obtained by some "computer-less" shortcut.  Also,
it has been shown that, for forecasting purposes, models that involve a very
large number of modeling steps must perform less well than those involving
simpler chains.
     Finally, a brief warning was given against validations based on a
restricted number of narrow-span values and the use of a single
goodness-of-fit index.
                                REFERENCES*

Aitchison, J. and J.A.C. Brown, 1969.  The lognormal distribution.  Cambridge
    University Press, 176 pp.

Barlow, R.E., 1971.  Average time and maxima for air pollution  concentration.
    Univ.  of California, Berkeley, Cal., Operations  Res.  Center Rept.  ORC-
    71-17, NTIS AD-729-413.

Barry, P.J., 1971.  Use of argon-41  to  study the dispersion  of  stack efflu-
    ents.   Proc. of Symp. on Nuclear Techniques in Environmental  Pollution,
    Internat. Atomic Energy Agency, Vienna, Austria, p. 241-253.

Benarie, M., 1969.  Le calcul  de la  dose et de la nuisance du pollutant emis
    par une source ponctuelle (Text in French).  Atm. Env. 3, p. 467-473.

Benarie, M., 1970.  About the validity of log-normal distribution of pollutant
    concentrations (Text in French).  Second Clean Air Congress, Dec.
    6-11, Washington, D.C., Ed. H.M. Englund and W.T. Beery, Academic
    Press, New York 1971, p. 68-70.

Benarie, M., 1971.  About the validity of the log-normal  distribution  of pol-
    lutant concentrations (Text in French).  Proc. of the 2nd Internat. Clean
    Air Congress, Dec. 6-11, 1971, Washington, D.C., Academic Press, Ed.
    H.M. Englund and W.T. Beery, p.  68-70.

Benarie, M., 1971.  Essai de prevision synoptique de la pollution par
    l'acidite forte dans la region rouennaise (Text in French).  Atm. Env.
    5, p. 313-326.

Benarie, M., 1972.  The use of the relationship between  wind velocity and
    ambient pollutant concentration  distributions for the estimation of
    average concentrations from gross meteorological data.  Proc. of the
    Symp. on Statistical Aspects of Air Quality Data, Nov. 9-10, Chapel Hill,
    N.C.,  U.S.E.P.A., Research Triangle Park, N.C., EPA-650/4-74-038,  p.  5-1
    to 5-17.

Benarie, M., 1975.  Modelling urban air pollution.  Atm. Env. 9, p. 552-553,
    discussion to a paper of S. Hameed, Atm. Env. 8, 1974, p. 555-561.

Benarie, M., 1975.  Calculation of the mean yearly mixing height over urban
    areas, from air pollution data.  Sci. Tot. Env. 3, p. 253-265.

*See addendum
Benarie, M., and T. Menard, 1972.  Verification, pour les hivers 1969-1970
    et 1970-1971, de la prevision de la pollution par l'acidite forte dans
    la region rouennaise (Text in French).  Atm. Env. 6, p. 65-67.

Bolin, B. et al, 1971.  Sweden's case study for the United Nations  conference
    on the human environment.  Royal  Ministry for Foreign Affairs - Royal
    Ministry of Agriculture, 96 pp.,  see p. 21  and ref.  4 by Brosset C.

Bouman, D.J.  and F.H. Schmidt, 1961.   On the growth of ground concentration
    of atmospheric pollution in cities during stable atmospheric conditions.
    Beitr. Phys. Atm. 33, p. 215-224.

Bringfelt, B., 1971.  Important factors for the sulfur dioxide concentration
    in Central Stockholm.  Atm. Env. 5, p. 949-972.

Calder, K.L., 1969.  A narrow plume simplification for multiple urban source
    models.  Unpublished, ref. No.  20 in Gifford  and Hanna (1970).

Carter, J.W., Jr., 1973.  An urban  air pollution  prediction  model based  on
    demographic parameters.  Thesis,  Oklahoma Univ., 171 pp.

Carter, J.W., Jr. and R.Y. Nelson,  1973.  An urban air pollution prediction
    model based on demographic parameters.   Preprint, 66th Annual Meeting,
    Air Poll. Contr. Assoc., Chicago, Ill., June 24-28, 20 pp.

Clarke, J.F., 1964.  A simple diffusion model for calculating point concen-
    trations from multiple sources.  J. Air Poll. Contr. Assoc. 14, p. 347-
    352.

Croke, E.J. and S.G. Booras, 1970.  Design of an air pollution incident
    control plan.  J. Air Poll. Contr. Assoc. 20, p. 129-138.

Croke, E.J. and J.J. Roberts, 1971.  Chicago air  pollution systems  analysis
    program - Final Report.  Argonne Nat. Lab., Argonne, Ill.,
    ANL/ES-CC-009, 393 pp.

Egger, J., 1973.  On the determination of an upper limit of atmospheric
    predictability.  Tellus 25, p. 435-443.

Fortak, H.G., 1970.  Numerical simulation of the temporal and spatial
    distribution of air pollution concentrations.  Proc. Symp. Multiple-Source
    Urban Diffusion Models, Ed. A. Stern, U.S.E.P.A. AP-86, p. 9-1 to 9-33.

Gifford, F.,  1959.  Computation of pollution from several sources.   Int.  J.
    Air Poll. 2, p. 109.

Gifford, F.A., Jr., 1970.  Atmospheric diffusion  in an urban area.   Paper
    presented at the 2nd IRPA Conf.,  Brighton,  Engl., May 5, 5 pp.
Gifford, F.A. and S.R.  Hanna,  1971.   Urban Air pollution modelling.   Second
    Clean Air Congress, Dec.  6-11,  Washington, D.C.,  Ed.  H.M.  Englund and
    W.T. Beery, Academic Press,  N.Y., p.  1146-1151.

Gifford, F.A., 1972.   The form of the frequency distribution of air  pollu-
    tion concentrations.  Proc.  of the Symp.  on Statistical  Aspects  of Air
    Quality Data, Nov.  9-10,  Chapel  Hill, N.C., U.S.E.P.A.,  Research Tri-
    angle Park, N.C.,  EPA-650/4-74-038, p. 3-1 to 3-7.

Gifford, F.A., 1972.   Applications of a simple urban  pollution model.  Proc.
    Conf. on Urban Environment and Second Conf. on Biometeorology, Philadel-
    phia, Pa., Oct.  31-Nov.  2, p.  62-63.

Gifford, F.A., 1973.  The simple ATDL urban air pollution model.  Paper
    presented at the 4th Meeting of NATO/CCMS Panel on Modeling, Oberursel,
    Germany, May 28-30, p. XVI-1 to XVI-18.

Gifford, F.A. and S.R. Hanna, 1973.  Modeling urban air pollution.  Atm.
    Env. 7, p. 131-136.

Gifford, F.A. and S.R. Hanna, 1975.  Modeling urban air pollution.  Atm.
    Env. 9, p. 267-275, discussion to Hameed S. (1974).  Atm. Env. 8, p.
    555-561.

Gould, G., 1961.  The  statistical  analysis and interpretation of dustfall
    data.  Preprint.   Proc.  54th Annual Meeting Air Pollution Contr. Assoc.,
    New York, N.Y.

Halpern, P., C. Simon  and L.  Randall, 1971.  Source emission and the verti-
    cally integrated mass flux of sulfur dioxide across New York City Area.
    J. Appl. Meteor. 10, p. 715-724.

Hanna, S.R., 1971.  A simple method of calculating dispersion from urban
    area sources.  J.  Air Poll.  Contr. Assoc. 21, p.  774-777.

Hanna, S.R., 1973a.   Urban air pollution models--why?  Paper presented at
    the Nordic Symp.  on Urban Air Pollution Modeling, Oct. 3-5, Vedbaek,
    Denmark, 19 pp.

Hanna, S.R., 1973b.   Application of a simple model of photochemical  smog.
    Proc. of the 3rd Clean Air Congr. Dusseldorf, Germany, Oct. 8-12,
    VDI-Verlag, Dusseldorf, p. B72 to B74.

Hanna, S.R., 1973c.  A simple dispersion model for the analysis of
    chemically reactive pollutants.  Atm. Env. 7, p. 803-817.

Holzworth, G.C., 1972.  Mixing heights, wind speeds,  and potential for urban
    air pollution throughout the contiguous United States.  U.S.E.P.A.,
    AP-101, 118 pp.
Jost, D., R. Kaller, H. Markush and W. Rudolf, 1974.  Analysis of six years
    continuous air pollution surveillance.   In:  Automatic Air Quality Moni-
    toring Systems, Ed. T. Schneider, Elsevier,  Amsterdam, p. 251-260.

Kahn, H.D., 1973.  Note on the distribution of air pollutants.  J. Air
    Pollut. Contr. Assoc. 23, p. 973.

Kao, S.K. and A. Al-Gain, 1968.  Large-scale dispersion of clusters of
     particles in the atmosphere.  J. Atm. Sc. 25, p. 214-221.

Kao, S.K. and D. Powell, 1969.  Large-scale dispersion of clusters of
     particles in the atmosphere.  II.  Stratosphere.  J. Atm. Sc. 26, p. 734-740.

Kolar, J., 1969.  The increase in the SO2-concentration during long-term
     weather situations with poor diffusion.  Staub (Engl.) 29, No. 12,
     p. 32-35.

Knox, J.B. and R. Lange, 1974.  Surface air pollutant concentration fre-
     quency distributions: implications for urban modeling.  J. Air Pollut.
     Control Assoc. 24, p. 48-53.

Knox, J.B. and R.I. Pollack, 1972.  An  investigation  of the frequency dis-
     tributions of surface air-pollutant concentrations.  Proc. of the Symp.
     on Statistical Aspects of Air Quality Data, Nov. 9-10, Chapel Hill,
     N.C., U.S.E.P.A., Research Triangle Park, N.C.  EPA-650/4-74-038, p.
     9-10 to 9-17.

Lamb, R.G., 1968.  An air pollution model  for Los Angeles.  M.S.  Thesis.
     Univ. of California, Los Angeles,  Cal.

Larcheveque, 1972.  Turbulent dispersion--EOLE experiment.  COSPAR XV,
     Madrid.

Larsen, R.I., 1961.  A method for determining source  reduction required to
     meet air quality standards.  J. Air Poll. Contr. Assoc.  11, p. 71-76.

Larsen, R.I., 1964.  United States Air Quality.  Arch. Env. Health 8, p.
     325-333.

Larsen, R.I., 1969.  A new mathematical model of air  pollutant concentra-
     tion, averaging time and frequency.  J. Air Pollut. Contr. Assoc. 19,
     p. 24-30.

Larsen, R.I., 1973.  An air quality data analysis system for interrelating
     effects, standards and needed source  reduction.   J. Air Pollut.  Contr.
     Assoc. 23, p. 933-940.

Larsen, R.I., 1974.  An air quality data analysis system for interrelating
     effects, standards and needed source  reductions  -  Part 2.   J. Air
     Pollut. Contr. Assoc. 24, p. 551-558.


Larsen, R.I., C.E. Zimmer, D.A. Lynn and K.G. Blemel, 1967.  Analyzing air
     pollutant concentration and dosage data.  J. Air Poll. Contr. Assoc.
     17, p. 85-93.

Lawrence, E.N., 1967.   Atmospheric pollution during spells  of low-level  air
     temperature inversion.  Atm. Env. 1, p. 561-576.

Lettau, H.H., 1970.  Physical  and meteorological  basis for mathematical
     models of urban diffusion processes.  Proc.  Symp. on Multiple-Source
     Urban Diffusion Models, Ed. A.C.  Stern, U.S.E.P.A.   AP-86, p. 2-1 to
     2-26.

Lynn, D.A., 1972.  Fitting curves to urban suspended particulate data.
    Proc. of the Symp.  on Statistical  Aspects  of  Air Quality Data, Nov.
    9-10, Chapel  Hill,  N.C., U.S.E.P.A., Research  Triangle  Park, N.C.,
    EPA-650/4-74-038,  p.  13-1  to 13-28.


McCracken, M.C., T.V.  Crawford,  K.R. Peterson  and B. Knox,  1971.  Develop-
    ment of a multi-box air pollution  model  and initial  verification for the
    San Francisco Bay  Area.   Lawrence  Livermore Lab. - Univ. of California,
    UCRL-73348, 96 pp.

McGuire, T., and K.E.  Noll, 1971.  Relationship between  concentrations of
    atmospheric pollutants and averaging time.  Atm. Env. 5, p. 291-298.

Mahoney, J.R. and B.A.  Egan, 1971.  A mesoscale numerical model of atmos-
    pheric transport phenomena in urban areas.  Second Internat. Clean Air
    Congress, Dec. 6-11,  Washington, D.C., Ed. H.M. Englund and W.T. Beery,
    Academic Press, p.  1152-1157.

Marcus, A.M., 1972.  A stochastic model for estimating pollutant exposure
    by means of air quality data.  Proc. of the Symp.  on Statistical Aspects
    of Air Quality Data,  Nov.  9-10, Chapel Hill,  N.C., U.S.E.P.A., Research
    Triangle Park, N.C.,  EPA-650/4-74-038, p.  7-1  to 7-15.

Miller, M.E. and G.C.  Holzworth, 1967.  An atmospheric diffusion model for
    metropolitan areas.  J. Air Poll.  Contr. Assoc. 17,  p.  46-50.

Milokay, P.G., 1972.  Environmental applications  of the Weibull distribution
    function: oil pollution.  Science 176, p. 1019-1021.

Mitchell, R.L., 1968.   Permanence of the lognormal distribution.  J. Opt.
    Soc. Am. 58, p. 1267-1272.

Morel, P., 1970.  Satellite techniques for automatic platform location and
    data relay.  COSPAR XV, Madrid.

Moses, H., 1969.  Mathematical urban air pollution models.   Argonne Nat.
    Lab., Argonne, Ill., ANL/ES/RPY-001, 69 pp.
Moses, H., 1970.  Tabulation techniques.   Proc.  of the Symp.  on Multiple-
    Source Urban Diffusion Models, Research Triangle Park, N.C., Ed.  A.
    Stern, U.S.E.P.A. - A.P.C.O.  Publication No.  AP-86, p.  14-13 to  14-15.


N.A.P.C.A., Air Pollution Control Administration 1969: Air quality criteria
     for particulate matter.  Publication No. AP-49, 211 pp.

Ott, W.R. and G.C. Thom, 1976.  Air Pollution index systems in the United
     States and Canada.  J. Air Poll. Contr. Assoc. 26, p. 460-470.

Pelletier, J., 1967.  Enquetes de pollution atmospherique dans l'environnement.
     (Text in French) Poll. atm. 36, p. 240-252.


Pemberton, J., M. Clifton, O.K. Donoghue, D. Kerridge and W.  Moulds,  1959.
     The spatial distribution of air pollution  in  Sheffield,  1957-1959.
     Int. J.  Air Poll.  2, p. 175-187.

Pollack, R.I., 1973.  Studies of pollutant concentration frequency distribu-
     tions.  Thesis, Univ. of California, Livermore, Cal., 82 pp.

Pollack, R.I., 1975.  Studies of pollutant concentration frequency distribu-
     tions.  U.S.E.P.A., Research Triangle Park, N.C., Report EPA-650/4-75-
     004, 82  pp.  This  paper is a reprint of the previous reference.

Prinz, B. and H. Stratman, 1966.  The statistics of propagation conditions
     in the light of continuous concentration measurements of gaseous
     pollutants.  Staub (Engl.) 26, p. 4-12.

Ruff, R.E., 1974.  Application of adaptive pattern classification  to  the
     derivation of relationship between  air quality data.  In: "Automatic
     Air Monitoring Systems", Ed. T. Schneider,  Elsevier Amsterdam, p. 145-
     166.

Scriven, R.A., 1971.  Use of argon-41 to study  the dispersion of stack ef-
     fluents.  Proc. of the Symp. on Nuclear Techniques in Environmental
     Pollution.  Internat. Atomic Energy Agency, Vienna, Austria,  p.  254-
     255.

Sheleikhovskii, G.V., 1949.  Smoke pollution of towns.  Translation by
     Israel Program for Scientific Trans. for U.S. NSF and U.S. DOC (1961),
     203 pp.

Singapurwalla, N.D., 1972.  Extreme values from a lognormal  law with  appli-
     cations  to air pollution problems.   Technometrics 14, p. 703.


Smith, M.E.,  1961.   The concentration and residence time of pollutants in
     the atmosphere.  Intern. Symp.  for  Chem. Reactions of the Lower  and
     Upper Atmosphere.   San Francisco, Stanford  Res. Inst. Advance Papers,
     p.  273-286.


Stern, A.C., 1969.  The systems approach to air pollution control.   Proc. of
     the Clean Air Conf. on the Clean Air Soc.  of Australia and New Zealand,
     Vol. 2; p. 2.4.1  to 2.4.22.

Strott, J.K., 1974.  Application of the AQDM model  and the ATDL model  and
     the comparison of the results.   Fifth Meeting NATO/CCMS Expert Panel
     on Air Pollution Modeling, 4-6 June, Roskilde, Denmark, p. 12-1 to
     12-19.

Thuillier, R.H., 1973.  A regional  air pollution modeling system for practi-
     cal application in land use planning studies.   Preprint,  Bay Area Air
     Poll. Control District, San Francisco, Cal., 25 pp.

Turner, D.B., J.R. Zimmerman, A.D.  Busse, 1972.  An evaluation of some clima-
     tological dispersion models.  Proc. 3rd Meeting of the Expert Panel  on
     Air Poll. Modeling, NATO/CCMS, Paris, France, Oct. 2-3, p. VIII-1 to
     VIII-25.

United Kingdom, 1945.   Atmospheric  pollution in Leicester.  Dept. Sci. and
     Industr. Res., Techn. Paper No. 1, 161 pp.

U.S.D.H.E.W., 1958.  Air pollution measurements of the National Air Sampling
     Network - Analyses of suspended particulates,  1953-57.  PHS Publication
     No. 637, p. 245.

Wipperman, F., 1966.  On the distribution of concentration fluctuations of
     a harmful gas propagating in the atmosphere (unpublished MS) 17 pp.

Zimmer, C.E. and R.I. Larsen, 1965.   Calculating air quality and its control.
     J. Air Poll. Contr. Assoc. 15, p. 565-572.

Zimmer, C.E., E.C. Tabor and A.C. Stern, 1959.   Particulate pollutants in
     the air of the United States.  J. Air Poll. Contr. Assoc. 9, p. 136-140.

Figure 3.  Observed and experimentally predicted annual hourly CO
concentration distribution for San Francisco.
[Plot: CO concentration χ (ppm) against percentage of time χ or c'/u is
exceeded; observed annual hourly CO concentrations for 1965, 1966, 1967 and
1970; experimentally predicted curve with c' = 7.45 ppm m/sec.]

Figure 5.  Observed and predicted hourly CO concentration distributions
for San Francisco, July 10-11, 1968.
[Plot: CO concentration (ppm) against percentage of time χ was exceeded;
curves for the Bay Area Model predicted concentration and for the observed
concentration.]

Figure 6.  Observed and model-predicted annual hourly CO concentration
distribution for San Francisco.
[Plot: CO concentration χ (ppm) against percentage of time χ or c'/u is
exceeded; observed annual hourly CO concentrations for 1965, 1966, 1967 and
1970; predicted curve with c' = 9.1 ppm m/sec.]

[Table: percentile values of SO2 (ppm) concentrations; entries not recoverable]

                             Table  4
           Contingency table of the ozone prediction
           results on the training set by RUFF (1974)
                           Predicted O3          Total
Measured O3          l.p.p.m                 1 p.p.m


   I p.p.m             25                       4          29


   1 p.p.m              1                      10          11


            Total      26                      14          40


               Skill score with the training set,
                      NO2 deleted : 0.75

                                 Table 5
                 Breakdown of the episode forecasts for 540
                 forecasts (3 winters) for Rouen, France, after
                 BENARIE (1971) and BENARIE and MENARD (1972)

          Number of   Out of these forecasts               An episode was forecast,
          episode                                          but did not materialize
Winter    days        Correct  Incorrect  Incorrect,                   due to
                                          due to incorrect             incorrect
                                          meteorological               meteorological
                                          forecast                     forecast
68/69         8          6         2            0              1            3
69/70        11          7         1            3              3            3
70/71        19         12         0            7              7            1
Total        38         25         3           10             11            7

                              Table 6
           Contingency table of the forecast results obtained
           by BENARIE (1971) and BENARIE and MENARD (1972) with
           apparent meteorological forecast error removed:
           523 cases out of the total of 540.

                        No episode      Episode      Total       Per cent
                             Predicted                           correct
No episode   Observed        484          11          495          97.5

Episode                        3          25           28          ~ 90

          Total              487          36          523

                 Skill score : 0.76

                              Table 7
           Contingency table of the forecast results obtained
           by BENARIE (1971) and BENARIE and MENARD (1972),
           overall results for 540 cases

                        No episode      Episode      Total       Per cent
                             Predicted                           correct
No episode   Observed        484          18          502          94.5

Episode                       13          25           38          67

          Total              497          43          540

                 Skill score : 0.60
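
The skill scores of Tables 6 and 7 can be checked from the four cell counts. The sketch below assumes the Heidke form of the skill score (correct forecasts compared with the number expected by chance from the marginal totals); that choice and the function name are assumptions of this example, since the formula is not spelled out next to the tables. Under that assumption the results come out near the printed 0.76 and 0.60, though not identical to them.

```python
def heidke_skill_score(a, b, c, d):
    """Skill score of a 2 x 2 contingency table (assumed Heidke form).

    a: no episode observed, no episode predicted
    b: no episode observed, episode predicted
    c: episode observed, no episode predicted
    d: episode observed, episode predicted
    """
    n = a + b + c + d
    correct = a + d
    # Correct forecasts expected by chance, from row and column totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    return (correct - expected) / (n - expected)


table_6 = heidke_skill_score(484, 11, 3, 25)   # meteorological errors removed
table_7 = heidke_skill_score(484, 18, 13, 25)  # overall results, 540 cases

print(round(table_6, 2), round(table_7, 2))  # 0.77 0.59
```

The high percentage of correct "no episode" forecasts contributes little to the score, because most of those hits would also be obtained by chance when episodes are rare.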
                             Table 8
             Multiplying factors to be applied to the 30-day running
             average to obtain winter episode concentration in ROUEN,
             France.

       Condition                                               Factor
Mean surface wind  3 m s^-1    for 24 hours                      1.5
      - " -        1 m s^-1       - " -                          2.0
      - " -        0.5 m s^-1     - " -                          2.5
      - " -        0.5 m s^-1  second 24 hour                    3
      - " -        0.5 m s^-1  third 24 hour                     3.5
Mean temperature below -3°C                                      1.5
Wind blowing from 00° to 80° direction
     Factor for downtown                                         1.3
     Factor for "Petit Couronne" station                         0.7
Wind blowing from 260° - 280° direction,
     at 4-7 m s^-1, only for the                                 2.0
     "Petit Couronne" station.
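
A minimal sketch of how these factors might be applied, assuming that all factors applicable on a given day are simply multiplied together and then applied to the 30-day running average. The function name is hypothetical; the example inputs (a running average of 102 with factors 2, 1.5 and 0.7, and the same average with the single factor 1.5) were chosen to mirror "calculated a posteriori" lines of the winter 1968/69 comparison table.

```python
import math


def episode_concentration(running_mean, factors):
    """30-day running average times the product of the applicable factors."""
    return running_mean * math.prod(factors)


# Wind-speed, temperature and direction factors applying together.
print(round(episode_concentration(102, [2, 1.5, 0.7])))  # 214

# A single factor applying on its own.
print(round(episode_concentration(102, [1.5])))  # 153
```

With no applicable factor the product is 1 and the estimate falls back to the running average itself.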
                                 Table 9
   Comparison of observed data (first of each group of three lines), data
   calculated from the meteorological conditions that actually occurred
   (second lines) and forecast data, using forecast meteorological data
   (third lines), for seven sampling stations in ROUEN, France,
   winter 1968/69
   (Factors: factors for wind speed, temperature and direction, as listed)

Date      Line                Factors         SS 8  SS 15  Pref.  Sott.   Fac.   Mar. Pt-Cour.

Mean 9 Nov-8 Dec                               102    102     52    101     86     82     66
9 Dec     Observed                             103    137     56    133    106    131     97
          Calc. a posteriori  1.5, 0.5          76     76     39     76     65     62     50
          Forecast                             102    102     52    101     86     82     66
10 Dec    Observed                             119    137     42    111     52     80     82
          Calc. a posteriori  1.5, 0.5          76     76     39     76     65     62     50
          Forecast                             204    204    104    202    172    164    132
11 Dec    Observed                             212    242    193    377    127    136    255
          Calc. a posteriori  1.5              153    153     78    151    128    123     99
          Forecast                             204    204    104    202    172    164    132
12 Dec    Observed                             261    274    126    218    286    323    148
          Calc. a posteriori  2, 1.5, 0.7      214    214    109    212    180    172    138
          Forecast                             204    204    104    202    172    164    132

Mean 30 Nov-29 Dec                             100     99     46    121     75     92     96
30 Dec    Observed                             169    143    102    157    205    175     86
          Calc. a posteriori  2.5, 0.7         175    173     81    212    131    161    167
          Forecast                             200    198     92    242    150    184    192
31 Dec    Observed                             367    352    271  (360)  (214)  (236)    256
          Calc. a posteriori  2, 1.5           300    297    138    362    235    276    288
          Forecast                             200    198     92    242    150    184    192
1 Jan     Observed                             346    358    296  (360)  (214)  (236)    188
          Calc. a posteriori  2                200    198     92    242    150    184    192
          Forecast                             200    198     92    242    150    184    192

Mean 24 Dec-22 Jan                             117    110     67    131     30    103    105
23 Jan    Observed                             211    202    104    225    150     94    128
          Calc. a posteriori  2                234    220    134    262    160    206    210
          Forecast                             234    220    134    262    160    206    210
24 Jan    Observed                             145    148     42    101     65     66    284
          Calc. a posteriori  2                234    220    134    262    160    206    210
          Forecast                             234    220    134    262    160    206    210

Mean 5 Jan-3 Feb                                97     88     54    102     73     84    161
4 Feb     Observed                             207    226    192    306    110    190    214
          Calc. a posteriori  2                194    176    108    204    146    168    322
          Forecast                             194    176    108    204    146    168    322
5 Feb     Observed                             171    208    149    261    112    130    170
          Calc. a posteriori  2                194    176    108    204    156    168    322
          Forecast                             194    176    108    204    146    168    322

Mean 18 Jan-16 Feb                             130    126     78    145     82     95    186
17 Feb    Observed                             198    180    109  (298)  (108)  (115)    226
          Calc. a posteriori  2                260    252    166    290    164    170    372
          Forecast                             260    252    166    290    164    170    372

Note. The bracketed observed values correspond to 3 days' sampling and were
     not taken into consideration for the computation of the RMS error.
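
The rule in the note can be sketched as follows: pairs whose observed value is bracketed are dropped before the RMS error is formed. The observed and "calculated a posteriori" series below are transcribed from the Sott. column of the table above, with bracketed observations entered as `None`. The function name is hypothetical, and whether the original RMS error was computed per station or over all stations together is not stated here, so the per-station figure is an assumption of this example.

```python
import math

# Sott. station, winter 1968/69; None marks a bracketed (3-day) observation.
observed = [133, 111, 377, 218, 157, None, None, 225, 101, 306, 261, None]
calculated = [76, 76, 151, 212, 212, 362, 242, 262, 262, 204, 204, 290]


def rms_error(obs, est):
    """RMS error over the pairs that have a usable observed value."""
    pairs = [(o, e) for o, e in zip(obs, est) if o is not None]
    return math.sqrt(sum((o - e) ** 2 for o, e in pairs) / len(pairs))


print(round(rms_error(observed, calculated)))  # 105
```

Nine of the twelve days remain once the bracketed values are excluded.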

                 Table 11

Validation of the simple box model,
according to HANNA (1971) and GIFFORD
(1972, 1973).

City            Concentration    Period             Pollutant   Correlation
                value modelled                                  coefficient

CHICAGO         Point            24 hours           SO2         .67
                                 (for 18 days)
CHICAGO         Point            6 hours            SO2         .66
                                 (for 4 days)
LOS ANGELES     Point            1 hour             CO          .89
                                 (for 17 hours)
SAN FRANCISCO   Point            1 hour             CO          .74
                                 (for 48 hours)
LONDON          Area average     24 hours average   SO2         .76
                   67
-------
           TABLE 12. DATA RELATED TO PARTICLE AND SO2 POLLUTION FOR U.S. CITIES*


City                      (1)     (2)    (3)     (4)    (5)    (6)    (7)    (8)

Washington                247      35    7.5     775     72    412     90     58
New York                 1795     243    8.2    2330     98    265    346    128
Chicago                  1780     586    7.3    2590    139    154    221     81
Philadelphia             1168     231    7.8    4400    124    634    217    218
Denver                     29      29    6.3     260    117    227     18     35
Los Angeles               187     101    5.1    1035    114    205      -      -
St. Louis                 662     176    6.5     595    161    122    132     27
Boston                    428      74    8.0     775     83    240      -      -
Cincinnati                349      73    6.2     905    122    323     44     24
San Francisco             174     102    5.4    1035     73    138      -      -
Cleveland                 818     304    7.4     650    106     57     78     16
Pittsburgh                934     387    7.1    4815    140    426     93    117
Buffalo                   410     140    7.6     620    116    135     25     10
Kansas City               125      60    7.3     390     97    158     12     10
Detroit                   786     240    7.3    1035    143    155     16      5
Baltimore                 255     104    7.5     200    133     67    107     22
Hartford                  337      56    8.1     465     81    188     62     24
Indianapolis              164      78    6.8     415    146    182     54     32
Minneapolis-St. Paul      215      46    7.5     775     88    384     44     41
Milwaukee                 242     100    7.5     775     95    191     28     23
Providence                118      23    8.3     155    113    218    125     47
Seattle-Tacoma            225      33    5.5     260     79    117     35      7
Louisville                303     128    6.6     620    121    133      -      -
Dayton                    186     174    7.0    1035    116    166     49     66
Houston                   144     158    6.6     620     67     60      -      -
Dallas-Ft. Worth           16      52    7.0     570     96    253      -      -
San Antonio                 2     129    6.5     630     68     75      -      -
Birmingham                 33     205    5.9     520    128     66      -      -
Steubenville              638     155    6.1     520    173    121      -      -

Average                   441     146    7.0    1027    111    202     90     50

* Column legends:
(1) QTOT, SO2, 10^3 tons yr^-1, FENSTERSTOCK et al. (1969); 10^3 tons yr^-1 = 28.73 g s^-1;
(2) QTOT, particles, 10^3 tons yr^-1, FENSTERSTOCK et al. (1969);
(3) u, m s^-1, FENSTERSTOCK et al. (1969);
(4) Approximate area, km^2, enclosed by 0.1 tons day^-1 mi^-2 urban particulate source, estimated
   from Reports on Consultation, U.S. DHEW (1968-1969);
(5) Observed average concentration, X, of particles, µg m^-3, FENSTERSTOCK et al. (1969);
(6) c (dimensionless) for particles, using equation (2) and cols. (2, 3, 4 and 5);
(7) Observed average concentration, X, of SO2, µg m^-3, from U.S. DHEW (1968);
(8) c (dimensionless) for SO2, using equation (2) and cols. (1, 3, 4 and 7).
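
The way columns (6) and (8) follow from the other columns can be sketched in a few lines. Equation (2) is not reproduced in this part of the report, so the sketch below assumes it is the simple box model X = c Q / u, with Q the area source strength obtained by dividing QTOT by the area of column (4). The function name and the unit-conversion constant name are illustrative; the factor 28.73 g s^-1 per 10^3 tons yr^-1 is the one read from legend (1).

```python
# Assumption: equation (2) is the simple box model X = c * Q / u,
# where Q = QTOT / area is the emission per unit area and time.
G_PER_S_PER_KILOTON_YR = 28.73  # conversion read from legend (1)


def box_model_c(q_tot_kilotons_yr, wind_m_s, area_km2, conc_ug_m3):
    """Return the dimensionless c = X * u / Q for one city."""
    q_total_ug_s = q_tot_kilotons_yr * G_PER_S_PER_KILOTON_YR * 1e6  # g/s -> ug/s
    area_m2 = area_km2 * 1e6                                         # km^2 -> m^2
    q_area = q_total_ug_s / area_m2                                  # ug m^-2 s^-1
    return conc_ug_m3 * wind_m_s / q_area


# New York row: cols (2, 3, 4, 5) for particles and cols (1, 3, 4, 7) for SO2.
c_particles = box_model_c(243, 8.2, 2330, 98)
c_so2 = box_model_c(1795, 8.2, 2330, 346)
print(round(c_particles), round(c_so2))
```

Under these assumptions the New York row comes out close to the tabulated 265 (particles) and 128 (SO2); small differences are to be expected from rounding in the printed columns.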
                                           68
-------











69
-------
                       Table 14

Computed CO-concentrations (17 hour day-averages, ppm)
compared with the observed values for Sept. 23, 1966, of
the Los Angeles Basin.

      1                2               3            4          5
   Stations         Observed        Computed      Random    Mean 13.5
                  concentration   (LAMB, 1968)

Downtown L.A.         16              22             9         14
Azusa                 13               3             7         13
Pasadena              17              15            15         14
Burbank               16               7             8         13
East L.A.             12               5            10         14
West L.A.             16              13            17         13
Long Beach            14               8            14         14
Hollywood             17               7            11         13
Pomona                13               3            16         14
Lennox                13              11            13         13
Anaheim                9               7             6         14
La Habra               6               3            12         13

RMSE                                   6.8           4.7        3.2
Correlation
Coefficient                            0.55          0.25       0.00
a                                      0.32          0.23       0.00
b                                     10.7          10.8       13.5

  a and b refer to the coefficients of the regression equation
  Computed conc. = a  (Observed conc.) + b
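
The goodness-of-fit indices in the last rows of Table 14 can be recomputed by hand or with a short script. The sketch below uses the rounded station values of columns 2 and 3 and assumes the RMS error divides by the number of stations n; the helper names are illustrative and not taken from the report. Because the printed concentrations are rounded to whole ppm, the results only approximate the tabulated figures.

```python
import math

# Columns 2 and 3 of Table 14 (rounded values as printed).
observed = [16, 13, 17, 16, 12, 16, 14, 17, 13, 13, 9, 6]
computed = [22, 3, 15, 7, 5, 13, 8, 7, 3, 11, 7, 3]


def rmse(obs, comp):
    """Root-mean-square error, dividing by n (an assumption)."""
    return math.sqrt(sum((c - o) ** 2 for o, c in zip(obs, comp)) / len(obs))


def centered_sums(x, y):
    """Return Sxx, Syy, Sxy about the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def pearson_r(x, y):
    sxx, syy, sxy = centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Least-squares slope a and intercept b of y = a * x + b."""
    sxx, _, sxy = centered_sums(x, y)
    a = sxy / sxx
    b = sum(y) / len(y) - a * sum(x) / len(x)
    return a, b


error = rmse(observed, computed)
r = pearson_r(observed, computed)
a, b = linear_fit(computed, observed)  # observed regressed on computed
print(round(error, 2), round(r, 2), round(a, 2), round(b, 1))
```

With these rounded inputs the correlation coefficient is about 0.55 and the RMS error about 6.7, against the tabulated 6.8. The printed a = 0.32 and b = 10.7 are recovered when the observed values are regressed on the computed ones, as in the `linear_fit(computed, observed)` call; regressing computed on observed gives a different slope and intercept.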
                        70
-------
                                 ADDENDUM
P 25   After Lettau reference add:

       Instead of integrating the Gaussian infinite line source,
       GOUMANS and CLARENBURG (1975) considered a large number of randomly
       distributed point sources over the area and a Sutton-like plume
       formula.  Their computational formula is equivalent to that proposed
       in 2. above (GIFFORD and HANNA, 1976).

P 29   After line 19 add:

       Calculations of seasonal means by GOUMANS and CLARENBURG (1975) for
       The Hague and Amsterdam (Netherlands) show a very good fit with
       the observed values.

P 46   Add to References:

       Goumans, H.H.J.M. and L. A. Clarenburg, 1975:  A simple model to
       calculate the SO2-concentrations in urban regions.  Atmos. Env. 9,
       pp 1071-1077.

       Gifford, F. A. and S. R. Hanna, 1976:  Discussion to the paper of
       Goumans and Clarenburg.  Atmos. Env. 10, p 564.
                                     71
-------
                                   TECHNICAL REPORT DATA
                           (Please read Instructions on the reverse before completing)
1. REPORT NO.
   EPA-600/4-76-055
3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
   URBAN AIR POLLUTION MODELING WITHOUT COMPUTERS
5. REPORT DATE
   November 1976
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
   Michael M. Benarie
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
   Environmental Sciences Research Laboratory
   Office of Research and Development
   U.S. Environmental Protection Agency
   Research Triangle Park, North Carolina 27711
10. PROGRAM ELEMENT NO.
   1AA009
11. CONTRACT/GRANT NO.
12. SPONSORING AGENCY NAME AND ADDRESS
   Environmental Sciences Research Laboratory
   Office of Research and Development
   U.S. Environmental Protection Agency
   Research Triangle Park, North Carolina 27711
13. TYPE OF REPORT AND PERIOD COVERED
   Inhouse
14. SPONSORING AGENCY CODE
   EPA-ORD
15. SUPPLEMENTARY NOTES
   Prepared by Visiting Scientist
16. ABSTRACT
       This report was the basis for a series of three lectures by the author
   on urban air pollution modeling, and represents a condensed version of selected
   topics from a recent monograph by him.  The emphasis is on simple but efficient
   models that often can be used without resorting to high-speed computers.  It is
   indicated that there will be many circumstances under which such simple models
   will be preferable to more complex ones.  Some specific topics included in the
   discussion are the limits set by atmospheric predictability, forecasting
   pollution concentrations in real time as for pollution episodes, the simple box
   model for pollution concentrations, the frequency distribution of concentration
   values including the log-normal distribution and averaging-time analysis, the
   relationships between wind speed and concentration, and lastly the critical
   question of model validation and the need to consider several indices of
   goodness-of-fit if pitfalls are to be avoided.
17. KEY WORDS AND DOCUMENT ANALYSIS
   a. DESCRIPTORS
      * Air pollution
      * Meteorological data
      * Mathematical modeling
      * Model tests
   b. IDENTIFIERS/OPEN ENDED TERMS
   c. COSATI Field/Group
      13B
      04B
      12A
      14B
18. DISTRIBUTION STATEMENT
   RELEASE TO PUBLIC
19. SECURITY CLASS (This Report)
   UNCLASSIFIED
20. SECUR
21. NO. OF PAGES
   82
22. PRICE
EPA form 2220-1 (9-73)
                                            72
-------