EPA-650/4-75-005
OBJECTIVE PROCEDURES FOR OPTIMUM LOCATION
    OF AIR POLLUTION  OBSERVATION STATIONS
                              by

                         C. Eugene Buell

                     Kaman Sciences Corporation
                     1500 Garden of the Gods Road
                          P. O. Box 7463
                   Colorado Springs, Colorado 80933

                       Contract No. 68-02-0699


                 EPA Project Officer: Kenneth L. Calder

                  Meteorology and Assessment Division
               Environmental Sciences Research Laboratory
               Research Triangle Park, North Carolina 27711
                           Prepared for

              U.S. ENVIRONMENTAL PROTECTION AGENCY
                  Office of Research and Development
               Environmental Sciences Research Laboratory
              Research Triangle Park, North Carolina 27711

                            June 1975

-------
                              DISCLAIMER
     This report has been reviewed by the Environmental Sciences
Research Laboratory, U.S. Environmental Protection Agency, and approved
for publication.  Approval does not signify that the contents necessarily
reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names or commercial products constitute
endorsement or recommendation for use.

-------
                    ACKNOWLEDGMENTS

     We express our appreciation for the assistance of
Mrs. Barbara Byerly for basic programming and especially
to Mr. Edwin Bathke for invaluable assistance in solving
our major problems and preparing this report.

-------
                       ABSTRACT

     This document is concerned with the development of
linear regression techniques for interpolation of air pollutant
concentrations over an area and, using these techniques, the
construction of a computer program for determining the
optimum location of air pollution observing stations.  The
general interpolation problem is surveyed in the first
chapter and the advantages of using linear regression formulas
as interpolation formulas are discussed.  Special emphasis is
placed on the case in which the observations contain errors
of observation or effects of limited range of influence.
Since the use of linear regression methods depends on know-
ledge of the two-point correlation function for pollutant
concentration measures, the construction of correlation co-
efficient functions from synthetic data is taken up, together
with methods for interpolation of this information.  Con-
siderable attention is given to the estimation of residual
variances or the effects of limited range of influence.  It
is pointed out that certain aspects of Factor Analysis can
be used for this purpose.  These methods are extended to
a continuous formulation of the problem in integral equation
form and it is shown that the lack of accuracy in the strictly
mathematical process of solution of the integral equation tends
to be more important than the statistical significance of
the data unless the residual variances are removed.  If this
is done, then the tests for accuracy and statistical significance
are reconciled.

     The computer program depends heavily on this last point.
It appears to work well when the residual variances are care-
fully handled.  Many of the difficulties encountered in this
program were traced to this source so that users of this program
should be aware of this in constructing the input materials.
The reader's attention is directed to Chapter IV where this
is discussed in detail.
                           111

-------
                        PREFACE

     The spatial distribution of air quality over an urban
area represents a statistically random field of pollutant
concentration since, even in principle, it is not possible
to specify its structure in a deterministic fashion, but
only in terms of various statistical properties.  The space-
and time-variability of air quality thus present difficult
questions that are very inadequately and simplistically
analyzed at the present time, and of which a fundamental
understanding is still lacking.  The accuracy of any analysis
that utilizes air quality data evidentally depends on the
intrinsic accuracy of the data and the density of the sampling
network at which the data are available, since based on
the latter values, it will normally be necessary to interpolate
values for intermediate locations.  An analysis of the criteria
for objective selection, i.e., that does not involve personal
or subjective judgment, of the optimum sampling network does
not exist at the present time and is urgently needed.  An
"optimum network" is here meant in the sense of a network
that is free from redundant observations, namely, data that
could be derived with "sufficient accuracy" by some specified
interpolation procedures from the given sampling network.
Similar decisions are required in establishing meteorological
observation networks and in this case have received a great
deal of attention that is reported in an extensive modern
literature.  It is now required to develop and extend the
appropriate statistical methodology so that it will be directly
applicable to the selection of air quality sampling networks,
and having due regard to both the cost and informational
content of the network.  At the time the present study was
initiated this appeared to be of particular importance in
view of the then forthcoming EPA Regional Air Pollution
Study which required the establishment of both a large air
quality sampling network and also an extensive meteorological
network.
                               iv

-------
     Very early in the research described in this report
the exceptional subtleties and sophisticated difficulties of
the optimization problem'became apparent, and an unexpectedly
large number of unforseen statistical, mathematical and com-
putational problems were uncovered.  Several different approaches,
even involving variations of the logical structure of the
problem, were explored by the contractor in a highly innovative
fashion.  Because of time limitations it was finally decided
to specialize the problem and to study in some detail a one-
step-at-a-time add-on method of locating sampling stations.  For
this it is assumed to start with that there are a few existing
observation points, or at least a few points at which obser-
vations will be made based on prior considerations.  On the
basis of this starting network of observation points, a pro-
cedure is then developed to determine the location where the
statistical error of estimate, using a simple linear-type
interpolation formula in terms of the observed network con-
centration values, would be largest.  This point would then
be accepted as a best location for a new observation point,
and the process repeated in an iterative fashion until the
required number of station locations had been determined.

     Unfortunately, a fully operational and purely objective
computerized program was not achieved within the scope of
the present contract.  However, in support of further study
of the problem, it is considered very desirable to make
readily available a complete account of the present research.
This is now offered in the hopes of it becoming a major
contribution towards future resolution of an exceptionally
difficult and subtle problem that continues to be of major
importance in defining the spatial distribution of air quality.
Research Triangle Park,
North Carolina
December 1975
Kenneth L. Calder
Chief Scientist
Meteorology & Assessment Division
Environmental Sciences Research Lab,
                           v

-------
                   TABLE OF CONTENTS
                                                          Page
ABSTRACT                                                  i:Li
PREFACE                                                    iv
TABLE OF CONTENTS                                          vi
CHAPTER I.  INTERPOLATION OF POLLUTANT CONCENTRATIONS       1
     A.   Interpolation Between Observation Points          1
     B.   Linear Regression as an Interpolation Formula     6
          1.   Derivation of the Basic Relations            6
          2.   Interpretation of Results                    9
               a.   The point P at a point of P.            9
               b.   Continuous Correlation Coefficient     10
               c.   Extremely Discontinuous Correlation
                    Coefficient                            H
               d.   Discontinuous Correlation Coefficient  12
               e.   Range of Influence                     13
               f.   Small Scale Effects                    14
               g.   Summary                                15
          3.   Error of Interpolated Values                16
               a.   Implications for Measurement Locations 17

CHAPTER II.  DETAILED FORMULATION OF THE PROBLEM           i9
     A.   Area Covered                                     19
     B.   Observation Collection                           20
     C.   Source Locations, Types, Rates                   21
     D.   Meteorological Conditions                        21
     E.   Type of Pollution Measurements                   22
     F.   Interpolation/Extrapolation Methods              22
     G.   Optimization Method                              28
                           VI

-------
TABLE OF  CONTENTS (Cont'd)
                                           Pac
CHAPTER

A.
B.
C.
D.

CHAPTER
A.
B.






C.


D.
CHAPTER
A.
B.
C.
D.
E.
III. THE CORRELATION OF POLLUTANT CON-
CENTRATION BETWEEN POINTS
The Model Used to Generate Synthetic Data
Examples of the Correlation Coefficients
Interpolation of the Correlation Coefficients
Generation of Synthetic Pollution Correlation
Coefficients
1. Main Program
2. Subroutines
IV. THE ANALYSIS OF THE COVARIANCE MATRICES
Evaluation of the Effect of Errors of
Measurement or of Small Scale Phenomena
Determination of the Residual Variances
1. Representation of a Matrix in Terms
of Proper Functions/Proper Values
2. Factor Analysis Methods
3. Integral Equation Methods
a. The Jump Discontinuity
b. The Quadrature Factors
c. The Product Integral Technique
d. Test for Accuracy of Solutions
Analysis of St. Louis SO- Data
1. The Basic Data
2. Quadrature Factors and Techniques
3. Factor Analysis Methods
Summary and Conclusions
V. PROGRAM BAST AND ITS SUBROUTINES
Program BAST
Subroutine CIRCUM
Subroutine FMFP
Subroutine FUNB
Subroutine FUNCT

30
30
30
36
42
42
43
56
56
61
62
64
70
72
74
85
88
91
91
98
111
123
125
126
148
151
158
161
              Vll

-------
              TABLE OF CONTENTS  (Cont'd)
                                                       Page
     F.   Subroutine CORFUN                            167
     G.   Subroutine INT2D                             173
     H.   Subroutine MATINV                            175
     I.   Subroutine ADOPT(XO,YO)                      178
     J.   Subroutine TRIFIX                            184
     K.   Subroutine PORDR                             188
     L.   Subroutine AR2                               190
     M.   Subroutine TRITST                            192
REFERENCES                                             194
APPENDIX A:  Conditions on an Empirical Formula
             for a Correlation Coefficient             A-l
APPENDIX B:  Note on the Rank of a Covariance Matrix   B-l
                          Vlll

-------
                       CHAPTER I
       INTERPOLATION OF POLLUTANT CONCENTRATIONS

     The relationships between the correlation function of
pollutant concentrations at two points and the spacing of
observation points are discussed in the first two sections.
The first section is devoted to a simple analysis of the error
of interpolation using standard interpolation methods.  This
serves to introduce some of the basic ideas involved.  The
next section is devoted to the use of a linear regression as
an optimum interpolation formula.  Several important aspects
of the problem are covered, particularly the role of errors
and small scale effects.  The use of the linear regression as
an interpolation formula is intimately connected with the idea
of a Wiener Filter (Wiener, 1949) for both smoothing and
interpolation.

A.   Interpolation Between Observation Points
     The pollutant concentration is observed at a network of
points P. and has values x. at these points.  The total field
of pollutant concentration is depicted by drawing contours of
equal concentration values which are determined by the observed
values.  Usually, the values are sketched "by eye".  To make
the procedure quantitative for analysis purposes, it is neces-
sary to specify a particular procedure used to determing the
interpolated pollution field.  This is done by specifying an
interpolation formula that is used to describe the process.
Examples of such formulas are given in Abramowitz & Stegun,
1964, p. 882.  Linear interpolation between two points:
     f(xo+Ph)  = (l-F)f0f0 + Pflf0

-------
Three point formula (plane fitted to the data)
           f (xo+ph,yo+qk)  = (l-p-q)f    + pf
Four point formula (hyperbolic paraboloid fitted to the data)
                             l-p)(l-q)f    + p(l-q)f,  n
                                       LJ j vJ          JL j O
                           pqf
           f (xo+ph,yo+qk)  =
                              lfl

Six point formula (general quadratic fit)
                                         _  + [p(p-l) /2 ]f
                   pq - p2 - q2)f    + [p(p - 2q +
                                 v/ j \J                     A. j \J
              [q(q - 2p + l)/2]fQ 1
     In the above formulas  f  ,   represents  the  concentration
                            a ,D
of the pollutant at  the point with  coordinates  x=a,y=b,
with the point  (x ,y ) taken as  (0,0).   The points  (x,y)
are on a rectangular grid spaced h  and  k units  apart in  the
x and y directions respectively.  The parameters  (p/q) are
usually confined to  the range  (0,1), but not necessarily
so depending on the  formula.
        The efficiency of the interpolation  formula  may be
 evaluated by estimating the error  that would  occur.  We use
 the simple linear  interpolation  formula  as  an example to
 keep the arithmetic within bounds  and because it  illustrates
 the essential features of the problem.   The mean  square
 difference between the actual value  and  the interpolated
 value may be written  as
        E  =  (y-(l
 where the over-bar  indicates  a  suitable  average  value

-------
 Expanding this, and assuming that the mean field has  been
 removed so that XQ, x  , and y are departures  from  the mean,
 then,
                - 2(l-p)x7y" - 2p l^J +  (1-p)2"^  +
 To further simplify the situation, assume that the concentration
 variances at P, where y is measured (the interpolated coordinate)
 and at PQ where XQ is measured and P   where x.. is measured  are
 all the same,,  We denote this common variance by the symbol
 a2 .  Then note that

        xTy = a2r(y,x0 )
        Xjy = CT2r(y,xi)
        x0X! = asr(x0 ,Xj. )

 where r(a,b) is the ordinary correlation coefficient relating
 concentrations at the points A and B,  where a and b are measured,
 respectively.  Then

 F"=2a2[l - p + P2 - (l-p)r(y,x0) - pr(y,Xl ) + p(l-p)r(x0>Xi ) ]

       The correlation coefficients not only relate the measured
concentrations, but are also functions  of the relative locations
at which the concentrations are measured.  Thus, we write
       r(y,xQ) = r[ph]
       r(y,X;L) = r[(l - p)h]
       r(xQ,x1) = r[h]

where  h is the appropriate scale factor, the distance between
the points P  and P .  (p is a dimensionless parameter that  is
zero at P  and 1 at P-,.)  Then the mean square error of the
interpolated value is E2".  it is readily seen that at p = 0,
r[ph] = 1, and E2 = 0 and that at p = 1, r[(l - p)h] = 1 so

-------
that again E^ = 0 ;  that is,  the interpolation error is zero
at the two data points, which is as it should be.

      The important feature of  the above relation  is  that  the
mean square error of interpolation depends on the  correlation
coefficients as  functions of the spacing between data points,
r[h], and of the distance of the interpolated point from the
data points, r[ph] and r[(l-p)h].

      Consider now a particular example of a correlation
coefficient which is 1 at zero  distance and  reduces linearly to
0  at a distance  4, and  we assume that  I is larger than h, the
distance  between P  and P, .  Then
                  o      1
      r[h]  =  1-hA
      r[ph] =  l-ph/4,
 The mean square error of the interpolated value may then
 be written as

        & = 2a2p(l-p)hA

 It is readily seen that the error of the interpolated value
 is a maximum at p = | and has the value there of E2 = ash/2l.
 Under these conditions, we have an explicit expression that
 can be used to determine the spacing of the observation points.
 Thus, if we specify the maximum allowable mean square error
 of interpolation, I2", then the distance between data points
 may not exceed the value h = 2£(P/o-2).

        The value of I may be thought of as a "range of  influence"
 of the correlation coefficient.  The larger the value of I,
 the farther apart the observation points may be spaced.  The
 spacing also depends on the inherent variability of the data
 through the term a2 .  The more highly variable the data the

-------
closer the observation points to achieve the same maximum mean
square error of interpolation.

       Other analytical expressions may be used for the
correlation coefficient and the location of the point of
maximum error may be found.  It is readily shown that the
point of maximum mean square error is at p = | and that the
mean square error of interpolation will be given by

      "i*  = 2a2   jl -  (l/4)[l-r(h)]  -  nh/2) j .

This expression makes it possible to compute the spacing
between observation points that must not be exceeded when a
mean square error of interpolation is specified and the
correlation coefficient function is known.

       The more complicated interpolation formulas for a two
 dimensional array of points lead to vastly more complicated
 arithmetic, but do not change the essential ideas  brought
 out by the above elementary analysis.  The main idea is that
 the mean square error of interpolation depends on the structure
 of the correlation coefficient as a function of the distance
 separating the points at which the pollutant concentration is
 measured.  When this structure is known, the spacing of the
 observation point to achieve a given mean square interpolation
 error may be specified.

-------
 B.      Linear Regression As An Interpolation Formula

        This section is devoted to an elementary derivation of
 the linear regression estimate of pollutant concentration at a
 point P based on observed pollutant concentrations at a network
 of  points P., i =1,  --- ,  n.  It is initially assumed that the
 values of pollutant concentration at P are observed.  The point
 of  view is then reversed and the regression equation is considered
 from the point of view of an interpolation formula which is used
 to  estimate the pollutant concentration at P when it is not
 observed there .

 1 .      Derivation of  the Basic Relations

        Let Y be the pollutant concentration at P and let X. be
 the pollutant concentrations at points P. , where the P. are a
 network of n observation points, i = 1, --- , n.  To simplify
 the situation we consider the standardized variables (departure
 from the mean divided by the standard deviation)

        y = (Y-Y)/dY , x± = (Xi-Xi)/ax  , i = 1, — , n

 and we consider the relation

     y = bX+ ... + b
where y is the estimate of y given the values KI, --- x .

The least squares procedure for determining the coefficients b.
leads to the set of equations
                                                             (2)
        (yxn) =
where the bar over the symbol indicates a mean value and
where y is assumed to be a measured value at P.  Since
the variables are normalized, these are correlation coefficients

-------
 It will be convenient to write these equations also in the
 standard form
bla1
                            - g
                                                           (2a)
                      Vnn
 and we note that the matrix of coefficients is symmetric,
        The solution for the b.Ts may be written  in the  long
 form using Cramer's rule
 b.  =
        |(xf)  ,  (x1xs),	,
        \(xsxl ),  (xf~)  ,	,  (yxj),	, (x3x
                                            n'
                          ,  --- , (IF)
          nXl  '
        (*nXl ) ,  (xnx2 ) ,  --- ,
                              ^ith column.
 and also in the form (Kenney and Keeping, 1951)
         .  a
                                                           (3)
                                             ,
                                                          (3a)
                                     is the element from the
inverse of the matrix of coefficient  (&.;-;)•  This  follows
immediately from (3) by expanding the determinant  in the
numerator in terms of the sum of the products of the elements
in the i'th column and the cofactors of this column.  The
ratio of the cofactor of the element of the i'th row of the
j'th column to the value of the determinant yields the element
a1-1 of the inverse  (Turnbull, 1960) .  The solution for
finding y may then be written as
                                                          (4)

-------
        The  important  point  to  be  considered  now is  that  the
 correlation coefficients  (y x. ) = g.^  (and  also  the  correlation
 coefficients (x.x.; = a..)  are functions of  the locations of
 the point P and the point P.  (or  of the points  P.  and P.,).
                            •*•                            J
 Since the points P^ are fixed, we focus our attention on
 the point P and write
 Then (4) may be expressed as

                     g (P,P )ald] =  I §,(?,?.)[£ x.aij]  (4a)
Thus, (4a) may be considered as an interpolation formula since
                                     s\
the location of the point P at which y is estimated appears
explicitly in the correlation coefficient g.(P,P.).  If we
know the functional form of the correlation coefficients as
dependent on coordinates, then (4a) may be looked on as deter-
       /v
mining y from a linear combination of the observed concen-
tration x. with coefficients which depend on the location of P,
or it may be considered as a linear combination of correlation
coefficients, functions of the location P where the concen-
tration y is estimated, weighted by factors that depend on
the observed concentrations at the points P . .

-------
2.      Interpretation of Results

        The results of the simple linear regression for pollutant
              /v
concentration y at a variable point P in terms of observed
pollutant concentrations x. at a fixed network of points P.
are discussed below.  The point of view in this discussion
is that the linear regression is a particular realization of
a "Wiener filter" for the interpolation and smoothing of ob-
served information on pollutant concentrations (Wiener, 1949).
The elaborate mathematical apparatus of N. Wiener's original
treatment is abandoned in favor of a more general point of
view so that more general results are obtained (at least in
a limited sense).  We discuss several particular situations
that illustrate the kind of results that can be obtained.

       a)   The  point  P at  a  point of  P.

       If  the point P at which  concentrations  y  are measured
coincides  with  a point of  the observing network, say P^, and
if  the data on  y is identical with  the values  of x. at  P. ,  it
is  readily seen that  yx~. = x. x. and that,  from (3) b. = 1,
                       1     K 1                    K
and b. = 0 if j  ^ k.  (In  the first case  the kth column of  the
     tj
numerator  in  (3) is exactly  that of the denominator; in the
second the kth  column of the numerator is  exactly  the same  as
the jth column  and  hence the determinant  has the value  zero.)   In
this case  y = x. is the result  of the use  of the regression,
which is precisely  what it should be.

-------
        b)   Continuous  Correlation  Coefficient
        Consider  the  case  in  which  the  correlation coefficients
  g.j(P,P.) are  continuous  functions of  P and  for which g.  -* I for
   J    J                                               J
  P -» PJ.   A one  dimensional  schematic  of this situation is shown
  in Fig.  I-la.   The  use of the  interpolation
       0
         0
(P - P.)
Fig. I-la.  Schematic illustration of the Correlation Coefficient
g. (P,P.)  as a function of (P-P.).

  formula (4) or (4a) leads to a smooth interpolation of the data
  at the points which lie between the data points ?i and the
  interpolated values lie on a surface that passes through the
  data  values x..^.   This  is  shown schematically in the Fig.  1-b for
  a one dimensional situation.
 Fig. I-lb.  Schematic illustration of the interpolation between
 data points for correlation coefficients that are continuous.
                             10

-------
       c)  Extremely Discontinuous Correlation Coefficient


       The correlation coefficient as a function of the location

 of P with respect to P. as in g.(P,P.) must be a continuous
                       J        J    J  	
 function of P-P^, except that it may have a jump discontinuity
                «J
 at the origin P-P  =0.  An extreme case is that in which
                  J
 g.(P,P.) = 1 when P = P. and is equal to zero if P £ P..  This is
  J    J                ,"J                              J
 illustrated in Fig.  I-2a.   The  resulting  interpolation  for
        0
          0
P-P.
     J
Fig. I-2a.  Schematic illustration of the extreme discontinuity

possible for a correlation coefficient in which it has the

value 1 at P-P.=0 and has the value 0 at P-P. ^ 0.


this kind of a correlation coefficient is  illustrated  in  Fig.  I-2b,

 The interpolated values are zero between the data points, but

 at the data point the observed data values are obtained.
             x
                                            X
                                                   X
Fig. I-2b.  Schematic of the interpolation between data points

when the correlation coefficient is that illustrated in Fig.  I-2a.
                               11

-------
         d)
Discontinuous Correlation Coefficient
         When measured values are subject to independent, random
  errors, the correlation coefficient has a, (small) jump discon-
  tinuity at the P = P. which is dependent on the relative values
                      
-------
       e)   Range of Influence

      The "range of influence" of a correlation coefficient
is of importance in the discussion of observation network density,
This is loosely defined as the distance P - P. over which the
                                             J
correlation coefficient is significantly different from zero.
A correlation coefficient with limited range of influence is
illustrated in Fig. 4a.  When the data points are spread out
Fig. I-4a.  Schematic of a correlation coefficient with a limited
"range of influence".

so that no one is within the "range of influence" of another,
the interpolation formula (4) and (4a) leads to results illustrated
in Fig. I-4b.  It is immediately apparent that for an adequate
Fig. I-4b.  Schematic of interpolation when data points are so
sparsely located that one is not within the "range of influence"
of another.

network of observation points, the distance between points
must be small enough that one or more data points must be within
the "range of influence" of some other data point.
                           13

-------
        f)   Small  Scale  Effects

       In  dealing  with atmospheric problems,  the influence
 of small  scale effects  must  be  adequately accounted for (or
 adequately smoothed out).  These show up as  a  small hump on
 the correlation coefficient  peaking it sharply upward at
 P - P. = 0 as illustrated in Fig. I-5a.
      0
       0
P - P.
Fig. I-5a.  Schematic illustration of a correlation coefficient
showing both large and small scale effects.

 The effect on the interpolation formula (4) or (4a) is shown
 in Fig. I-5b.
Fig. I-5b.  Schematic illustration showing the results of small
scale effects on the interpolation of data.
                              14

-------
       g)   Summary

      The implications of the above illustrative examples for
the determination of observation network density are immediate.
The first result is that the network of observation points
should be sufficiently dense that each data point is within the
"range of influence" of another data point.  The second important
result is that the effects of errors and small scale effects
must be carefully determined and the interpolation formula used
in such a way that there is adequate filtering of these effects
in interpolating the results of observations.
                             15

-------
3.      Error of Interpolated Values

       The mean square error of the interpolated values may be
written at once from the results of the least squares
estimation (interpolation) formulas (1), (2), (2a), (3), and
(3a).  The expression for the mean square error is

       E~2 = y2 - (blgl + — + bngn)                     (5)

       E2" = y2 - E bigi                                  (5a)
                 i
and using (3a) for the value of the b.'s, one obtains
                        lj
       E2 = y2 - S Z g,ag                               (5b)
                 i j       J
       The expression (5b) gives the mean square error E3 in
terms of the mean square deviation of pollutant concentration
at the point P, y2 ,  the correlation coefficient involving con-
centration at the point P and the measurement points P. ,
g.(P,P.), and constants involving the geometry of the measure-
ment points alj .  The value of the mean square deviation, y2 ,
is readily estimated from the field of observation point values
x2. .   Since we have used the normalized form, the values of x2.
are all 1 and the value of ys would be taken as 1 also.

       The expression (5b) may be written to show the dependence
of E2 on the location of the point P, by displaying this
dependence in the correlation coefficients g., g.
                  - Z Z g.(P,P.)aiJg (P,P )               (5c)
                    i j             J    J
                           16

-------
     The mean square error of interpolation for pollutant
concentration is illustrated schematically for a one-dimensional
example in Fig. 1-6 (assuming a continuous correlation coefficient
function as in Fib. I-lb or Fig. I-5b).
     0
 Fig. 1-6.  Schematic illustration of the mean square error of
 interpolated values for a one-dimensional case.

        a)  Implications  for Measurement  Locations

        The  implications  of the  above for determining the location
  of  additional  measurement points is reasonably obvious.  We
  assume that  it is  desired to find the  location of a few more
  observation  points to  improve  the observation  network.   On the
  basis  of (5c)  one  may  easily compute the field of values of E2
  as  a function  of the location  of an interpolation point P.  The
  point  where  E2(P)  is a  maximum is the  point  where there is the
  greatest error in  interpolated values.   This could then be a
  point  where  an observation of  pollutant concentrations  could
  contribute the most information to the  augmented  network systems.

        There  are many practical considerations  that must be
  taken  into account in  the location of observation points.   These
  may be readily accounted for in the computation procedures.   One
  may, for example,  prescribe in advance  the location of  many
  "feasible" observation  points  and compute the  mean square  errors
                              17

-------
of  interpolation  at these  "feasible" points.  The  "feasible"  points
at  which  the mean square error of  interpolation  is  largest
would  then be candidates for the augmented set of  additional
observation points.

       Note that the selection of additional observation  points
is  a step-wise procedure.  One locates  the first additional
observation point where the mean square error of interpolation
is  maximum (a "feasible" point).   This  point is  then  added to
the network so that there  are now  n+1 points considered  and the
computation of E2  is redone on the basis of the  augmented network.
The maximum of E2  is found for this augmented case  and a second
(feasible) observation point located.   The network now contains
n+2 points and ~&  is computed on this basis, etc.,  etc.

        (Note:   This kind  of  iterative  procedure  can be  shown
to lead to a  solution that is  not necessarily  optimum in the
general case,  but it is a  very practical approach and the
results are usually not far from an optimum.)
                             18

-------
                      CHAPTER II
          DETAILED FORMULATION OF THE PROBLEM

     The problem of determining an optimum network of air
pollution observation sites consists, first of all, of
determining what particular characteristics of the pollution
field are of primary interest, how the data are to be used
to obtain an estimate of these characteristics, and then to
specify what it is that is to be optimized and with what
restrictions this optimum is to be obtained.  An extensive
monograph might be written in the exploration of the various
aspects of the problems involved in the items that have been
enumerated above.  Rather than go into all of this detail,
we will short-cut these considerations and set down some
rather simplified ground rules that will be followed.

A.  Area Covered
     In order to formulate the problem of specifying the
optimum location of air pollution observation stations it
is necessary to start by defining the specific area over
which the pollution is to be observed.  This is sufficiently
obvious that it scarcely needs any explanation.  If one is
to optimize the location of air pollution stations in the
area of Dirty City, they need to be located in the vicinity
of Dirty City and not in Clean County, because Clean County
is not the area with which we are concerned.  In fact, one
must be even more specific and define exactly what constitutes
"in the area of Dirty City".  This involves specifying a size,
shape, and location.  It may be, for example, that Dirty
City is a compact area of more or less uniform diameter in
any direction with this area centered approximately at the
City Hall.  In such a case, an adequate specification might
well consist of the statement that the area to be covered
is a circular area centered at the Dirty City City Hall
                          19

-------
with a radius of 23 km (say).   This implies that one is
to optimize the location of air pollution observation stations
throughout this particular circular area.  It may well be
that one is also interested in air pollution measurements
at distances farther than 23 km from City Hall, but these
locations will not be a part of this particular air pollution
station location problem.

B.  Observation Collection
     It is assumed that the data collected from the air
pollution measuring sites will be measured simultaneously
at all locations.  The term simultaneous is used somewhat
loosely and depends to some extent on the nature of the
measurements to be used.  If,  for example, one is concerned
with one-hour integrated air pollutant concentrations, the
averaging should be done over the same clock hour at all sites;
but a difference of five minutes or so between stations for
the beginning and ending of this one hour integration period
is relatively unimportant.  In other words, the differences
in time of beginning and ending of the totalizing interval
are a relatively small fraction of this interval itself.  If
the averaging is considered on a 24 hour basis, then the
exact beginning and ending of the averaging interval at each
location can be correspondingly relaxed.  On the other hand,
if observation data at the different locations are taken at
different clock hours, then it may well be that a comparison
between stations is meaningless.
                             20

-------
C.  Source Locations, Types, Rates
     It is assumed that the locations, types and emission
rates of the pollution sources are known or at least can be
estimated with reasonable accuracy.  The basic reason for this
is the fact that it is completely impossible to even approach
the problem of the intelligent location of pollution obser-
vation sites without this kind of information.  If one were
interested in pollution at only one point, then obviously
one locates a measuring site at this point.  When the problem
is to locate a network of measuring points, then the question
immediately arises concerning the relation of pollution
measurements at the several points with each other.  If two
(or more)  pollution measurements essentially duplicate each
other, then one is unnecessary (or at least of much less
value).  On the other hand, if measurements at two locations
are completely unrelated to each other, then obviously
additional measurements are desirable between them.  The
locations, types and emission rates of the sources involved
are quantities which determine how the measurements at related
points are going to be related, or unrelated, to each other.
A more exact specification of the relationship between the
pollution measurements at different locations is discussed
at some length in Section G below.

D.  Meteorological Conditions
     The meteorological conditions that prevail over the
area concerned are responsible for determining how the
pollutants are carried from their source locations to other
points and whether they tend to be concentrated near the
ground or carried high into the atmosphere.  The principal
factors concerned here are the winds in the lower atmosphere
and the stability (or unstability) of the lower layers of the
atmosphere.  These conditions need to be carefully specified
                           21

-------
and should adequately describe the conditions over the area
concerned.  If in the area concerned the wind has a strongly
predominant direction, it would seem reasonable to locate
pollution observation points down-wind from the more important
pollution sources.  Since stable conditions tend to hold
pollutants near the ground, it would seem reasonable to locate
observation points in the down-wind direction from the important
sources where this wind direction is that which prevails
when the air is stable.  Consequently, the meteorological
conditions need to be specified as the frequency of occurrence
of various classes dependent on wind direction, wind speed,
and stability.

E.  Type of Pollution Measurements
     The type of pollution measurements to be made is another
factor contributing to the optimum location of pollution
measuring locations.  If hourly average concentrations are
of primary interest, it would appear necessary to have more
closely located observation points than if, say, 24 hour
average concentrations are considered the more important.

F.  Interpolation/Extrapolation Methods
     The method that is used to interpolate data between
observation points is critical for the formulation of the
optimization technique for site locations.  There is an
almost limitless number of techniques that may be used.  The
method of interpolation used in this study consists of using
a linear estimate of the pollution concentration based on
the concentrations at the points where the concentrations
are observed and for which the correlation coefficient
function or covariance function is known  (or at least can
be approximated with reasonable accuracy).  Various aspects
of this process were discussed in Chapter I with illustrations
                            22

-------
in the case of a one-dimensional field  (to simplify the
figures).  This provides a highly flexible procedure that can
be adapted to a wide variety of different situations.

     The interpolation/extrapolation method then consists
of estimating a measure of pollutant concentration at a point
                                       /\
(£,n) in the region to be covered, say Y=Y(£,n), from a linear
combination of observations, X. = X(£.,n.) made at observation
points P. with coordinates  (£.,n-).  The formula for this
estimate is

     * = bo + blXl = — + bnXn               (1)

(where it assumed that there are n observation points).  The
coefficients b., i=0/l, 	, n, will be determined later.
The numbers that describe the pollution concentration vary
over several orders of magnitude and are by no means normally
distributed, the logarithm of the pollutant concentration is
actually used as the measure of concentration.  The logarithm
of concentrations is more nearly normally distributed (i.e.,
the pollution concentrations tend to be distributed in a
log-normal way).

     It is assumed that reasonably accurate estimates of
the mean concentration measure are known.  These will be
denoted by Y, X-, ,	_,_ X  so that one may use the departures
               i   ~ ~  n       _                _
from the mean, y = Y-Y, x, = X,-X,, 	, x  = X -X  with
the result that
     y = b,x, +	+ b x                     (2)
          11          n n
                           23

-------
The coefficient b  that appeared in (1) is determined from

     Y = bo + b^ + — + bnXn              (3)

after the coefficients b, ,	, b  have been determined.
                        1        n
(Actually, the problem at hand does not require that any
of the coefficients be computed explicitly; the technique
used is more clearly described if they are included at this
point.)  If one had an observation point at the point P,
                         /\     /\
coordinates (E,,r}), where Y (or y) is being estimated, the
observed value would be Y (or y).  If we assume that
actual values y, x, , 	, x  are observed in an ensemble
of situations that have been defined by the conditions of
Sections C and D above, then the coefficients b,, 	, b
would be determined from the normal equations
                               bn(xnxnj
where (x.x.) is the covariance of the concentration measure
at the points P. and P. and  (x.y) is that for the points
P. and P.  The normal equations  (4) are obtained by requiring
that the square of difference between the concentration
measure and the estimate from equation  (2) summed over
the ensemble of situations, be a minimum in terms of the
coefficients b,, 	, b .   (See standard texts on this
subject, for example Kenney and Keeping, 1951.)
                           24

-------
     If we let a. . =  (x.x.) and g. =  (x.y) , then equations
 (4) may be written as
                                                   (5)
These equations may be solved for the coefficients b, ,  --- ,
b , the formal solution being expressed as

     b^ = 'Ea^g-,            i/j=l/ --- , n        (6)
           j
     The mean square error of estimate is given by the
expression
      2        2
     e  = (-r
where y is the expression  (2).  This may be expressed  in
the form
     e2 = (y2) -  2>igi                           (7)
                  i
(Kenney and Keeping, 1951, or other suitable text).  When
the expression (6) for the b.'s is substituted into  (7),
the result is
     e2 = (y2) -  Z£g,aijg..                     (8)
                          2
In (7) and (8) the term  (y ) is the variance of the pollution
concentration measure at the point P.

      (Digression.  The quantities y, x. above have been
expressed in terms of departures from the mean as per the
expressions in the text between equations  (1) and  (2).
                              25

-------
The quantities (y )  and (x.x.)  are the variances of the con-
centration measures at the various points__concerned.  In
                                         2  2  	   2
terms of standard deviations, a and a., y =a ,  (x.x.)=a. .
These may be used to convert the concentration measures to
standardized form (Y-Y)/a, (X.-X.)/a., and the argument
remains unchanged with the exception that in (7) and (8)
                                 2
the value 1 is substituted for (y )  and the error of estimate
 2          22
e  becomes e /a .)
     The role of the coordinates of the points concerned in
the preceding relations is important, but is concealed in
the notation.  The covariances that appear in the normal
equations (4) and subsequently are functions of the locations
of the points for which the index number appears as a sub-
script with the exception of the point P for which no sub-
script appears.  Thus a., = x.x..  = a..(P.,P.) or
= ai;. (£>i,ni;Cj/nj) while g± = x^y = g^P-j^P) or = g± (^ ,1^; 5 / n)
The elements of the inverse matrix, a1-1 , are exceptions to
this.  In the process of matrix inversion, all of the elements
of the matrix are involved so that each element of the  inverse,
a1-1, involves all of the points P,, --- , P  , but not the
point P at which the concentration measure is being estimated.
This point appears only in the terms g. = g. (P. ,P) .  If one
substitutes the expression (6) for b. into the equation for
the estimated concentration measure,  (2), the result is
     y =
         i     j
where in the second line the location of the point P at
      /\
which y is estimated is explicitly shown.  If the covariance
                              26

-------
function gi(P.,P), as a function of the points P. and P,
is known or can be estimated reasonably well, it may be
                                            /^
interpreted as an interpolation formula for y based on
a linear combination of the observed concentration measures,
x., at the points P..  The expression in parenthesis gives
the "weight" of each observed measure, x., in terms of the
location of the point P with respect to the other points,
P., i=l, 	, n.  This is quite analogous to the standard
two-dimensional interpolation formulas as for example those
shown in Section A of Chapter I.

     One may also note that if in  (9) the point P is the same
as the point P, , then g.(P.,P) becomes g.(P.,P,).  Now
              K        D  J             D  J  K
by virtue of its definition [preceding equation  (5)] this
is only another way of writing a.,  , that is
The summation in parenthesis in equation  (9) then reduces
to
      Z>ajk = 6ik

where S.,=0ifi^k and = 1 if i = k.  This means that
       1JC
the expression (9), when summed on index i, ignores all of
the observed values at other points and assigns to the point
                     A
P (=Pk)  the estimate y=x,, i.e., the value observed.  Note,
however, that the covariance function 9],(pv'p) is not
necessarily continuous at the point P, , so that as P approaches
             /\                       JC
P,  the value y need not approach x, .  The details of this
situation were discussed in Chapter I.
                              27

-------
G.  Optimization Method
     The expression for the mean square error of estimate
given by equation (8)  forms the basis of the optimization
method that will be used.   The development of this relation
was based on the assumption that the covariance of the con-
centration measurements between any two points, P. and P.,
was known, a. . -x.x.,  so that the elements of the inverse
         in
matrix, a  ,  could be found.  It was also assumed that the
covariance function for concentration measure between an
observation point, P., and any other point, P, in the area
concerned was also known or at least could be reasonably well
estimated.  (The details of how these assumptions are realized
are discussed in Chapter III.)  The equation  (8) then permits
one to compute the error of estimate using (9) as an inter-
polation formula for any arbitrary location of the point P.

     One may say that the selection of the n observation
points P, ,	, P  has been chosen in an optimal^ manner
        in                             „
if the largest mean square error of estimate, e , at any
point P not an observation point has been reduced to a satis-
factory level.   There are many ways in which such an optimi-
zation procedure can be carried out.  The one adopted here
is a one-step-at-a-time add-on method.  It is presumed to
start with that there are a few existing observation points
or at least a few points at which observations will be made
based on prior considerations.  On the basis of this given
starting network of observation points, the equation  (8)
is used to locate the point at which the error of estimate
is largest.  This point is then accepted as a best location
for a new observation point.  This point is then added to
the list of observation points that are used to determine the
error of estimate from equation  (8).  The process is then
                           28

-------
repeated; the location of the point of maximum mean square
error is found, a new observation point is located, etc.
The iterations of this process are terminated when the largest
mean square error of estimate has been reduced to an acceptable
level.
                              29

-------
                      CHAPTER III
THE CORRELATION OF POLLUTANT CONCENTRATION BETWEEN POINTS

     In order to implement equation (8)  of Chapter II for
use in estimating the mean square estimate, it is necessary
to know the correlation coefficient for pollutant concen-
tration measures between observation points and between an
observation point and an arbitrary point in the area of
interest.  The use of actual data from past measurements of
pollution concentrations is desirable, but, in dealing with
an area over which observations have not been made, it is
not possible to follow such a procedure.  Consequently,
the correlation coefficients for pollution concentrations
were determined by using a model to generate synthetic data
and the correlation coefficients were computed from the
synthetic data.

A.  The Model Used to Generate Synthetic Data
     The model that was used to generate the synthetic
data on which the correlation coefficients were based was
a simple Gaussian Plume model described by equation  (3.2) ,
page 6, of Turner (1970) as follows

     X(x,y,0,H) = (Q/Trayazu)exp[l/2{(y/ay)2+(H/az)2]    (1)

where  (x,y) are coordinates of the point at which the con-
centration X(x,y,0,H) is calculated (x is measured down-wind
from the source, y is measured cross-wind from the down-wind
axis), Q is the source strength, H is the source  (stack)
height, u is the wind speed, a , a  are dispersion coefficients
The dispersion coefficients, 0,0, were computed from  the
formulas developed by Eimutis and Konicek  (1972).
                            30

-------
     The wind conditions were computed on the basis of a
double circularly normal density function.  This is expressed
as

  f(u,9)udud9 =  [k1f1(u,6;w1,1,a1)+k2f2 (u,6;w  ,  2,a2) ]udud9

where k,,k_ represent the proportion of the time the wind
is in the state described by f,(u,0;	) and f2(u,6;	)
respectively and k,+k2=l.  The functions f,(u,8;	),
f_(u,9;	), are the circularly normal bivariate density
functions with parameters as shown after the semicolon, but
expressed in terms of wind speed, u, and direction, 9, instead
of in terms of rectangular wind components.  Thus

  f (u,9;w,c|),a) = (7ra2)~1exp{-a~2 [u2+w2-2uwcos (9-) ] }

where u is the wind speed, 9 is the wind direction, w is the
mean resultant wind speed,  is the mean resultant wind
direction, and a is the vector standard deviation.  (See
Brooks, et al, 1946, or Brooks, et al, 1950.)  A single
circularly normal density function for wind at St. Louis
is inadequate since there are two distinct modes of the
over-all wind density.  One of these is at w.=7.90, $,=198.5°,
a1=6.5, k1=0.4 while the other is at w2=9.00, <}>2=3250,
a2=6.50, kp=0.6.   Stability class 4 was used throughout
(Turner, 1970).

     The use of the probability density function for wind
enables one to use far fewer parameters to describe the
wind direction and velocity frequency tables.  For 16
direction categories and 9 speed categories, a total of 144
                             31

-------
entries are required.   In this case of a bimodal density
function, only 8 parameters are needed while the speed and
direction categories may be divided into arbitrarily small
intervals.

     There  were 21 pollutant sources used to compute the
correlation coefficients.  Their locations, strength and
stack height are listed in Table III-l.  Correlation coefficients
between points were computed for a 9X9 point grid of locations
centered at the "arch" in St.  Louis.  The spacing between
points was  10 km.  Thus an 81X81 matrix of correlation co-
efficients  was obtained.  (See Section D below for further detail
of the program.)

     Since  the logarithm of pollutant concentration rather
than concentration itself was to be used in computing the
correlation coefficients, a very small background pollutant
concentration was added in each case to avoid the difficulty
presented by taking the logarithm of zero.

B.  Examples of the Correlation Coefficients
     Some of the correlation coefficients obtained in the
way described in Section A are illustrated in Figures III-l
and III-2.   The points of the grid were numbered serially
starting in the lower left corner with point no. 1 and num-
bering upward in each column.   Thus the bottom row of points
are those numbered 1,  10, 19,  28, 37, 46, 55, 64, 73.  The
top right corner is point 81.   The arch is at location 41.

     Actual correlation coefficients for observed 24 hr SO-
concentrations were also computed using information from the
St. Louis 1964-1965 sulphur dioxide field data  (Ruff, 1973a).
                           32

-------
                      TABLE III-l
    SDURCc STRENGTH 3flR
                                         y$         HS
                                       168.       183.
                                       16*.        48.
                                       I5t.        71.
                                       1U8.        65.
                                       1U7.        66.
                                       1*5.        1,3.
                                       ii,?.        71.
                                       ii,w.         9.
                                       i4(,.        ?i.
                                       130.       108.
                                       120.       1C9.
                                       11*.       IQD.
                                       150.        8<».
                                       15^.        ia.
                                       li»9.        10.
                                       1!»9.        10.
                                       1*9.        10.
                                       1*9.        10.
                                       139.        10.
                                       139.        10.
                                       129.        10.

   MINIMUM  CONCENTRATION  =  l.OCOE-20
(1)   The data were provided by EPA from the 1964-1965  St.
     Louis Air Pollution Study.

(2)   Source strength and meteorology were assumed to be
     independent.

(3)   XS, YS are the coordinates of the source locations  in
     grid units.   HS is the source height in meters.   Source
     strength is in units of grams per second.
POINT
i
2
5
6
7
8
9
10
11
12
13
1<*
15
16
17
13
19
20.
23
STRENGTH
• 36oo£O03
.1730t+03
• 2- <+ Jc fO 3
.5595EO4
. 2i»3^L *0 i#
.^903E<-32
.5900^f 32
. 120Jc*33
. 120 JEi-33
• 120 Of.*!)*
. 1 2 0 0 r. O "»
.1203L+-03
. 12GOt O3
.120Q£f03
.1200^03
Xb
89.
ICQ.
9i!
95.
9n .
93.
90.
67.
8*.
80.
75.
106.
7b.
bo .
93.
1 Ofr.
«fa.
^b .
115.
                         33

-------
lORR.COtFF.
   -.576^
-.5fa2
             cOIATION fO
          623.1  -.56^  4 -.375
                   V
                              .194     .176
 .097
 4
                                                                    063
^ -if    -or    -+4    ij
     FIGURE III-l.   Unsmoothed Contours of Correlation
              Coefficient  Centered at  Location  40
                           34

-------
;apR.co£FF. MATRIX  LO;
                                                                     510
   -.044
                                                                   -.751
                                                                   -.7*6
         FIGURE III-2.   Unsmoothed Contours  of  Correlation
                 Coefficient  Centered at Location 42
                              35

-------
     The correlation coefficient contours from actual SO-
measurements are shown in Figures III- 3 and III-4 for
locations No. 7 and 12 respectively.   Location No. 7 has
coordinates X=18.2, Y=22.50 and is described as "Power pole
with transformer in front of 1914 Obear St., between 20th
and Blair".  Location No. 12 has coordinates X=22.48,
Y=18.64 and is described as "Pole South of East Side Health
District, 628 N. 20th St., East St. Louis, supplies power
to building approximately 100"  east of street".  The scale
is shown on these figures in the lower right hand corner
since it is quite different from that of Figures III-l and
III-2.  On these charts, the entire area corresponds to
an area approximately 3  squares wide and 2  1/2 squares high
on the preceding charts.  The station numbers and locations
for Figures III-3 and III-4 are shown in Figure IV-5, p. 92.

     Figures III-l and III-3 together with Figures III-2
and III-4 illustrate the large changes that occur in the
correlation coefficient contours when the location of the
point with which all other locations are being correlated
crosses the central part of the area in which the pollution
sources lie.

C.  Interpolation of the Correlation Coefficients
     The use of equation  (8), Chapter II, to obtain the
mean square error of estimate of pollutant concentration
requires that the correlation coefficients be those between
selected observation points or between observation points
and an arbitrary point.  The correlation coefficients
(synthetic) calculated are those for points on a 9X9 grid
with 10 km separation between rows/columns of points.  To  go
from the latter to the former requires that some kind of
                         36

-------
ST.LOUIS SO? POLLUTION STATISTICS
STATION  r   CORRELATION C.OFFF 1C TF-NT
     FIGURE  III-3.  Observed Correlation  Coefficient of  24  hr.  SO,
          with that at Station No. 7.  Similar to the  Simulated
              Correlation  Coefficients in  Figure III-l.
                                37

-------
ST.LOUIS SOS POLLUTION STATISTICS
STATION 12    COOPELATTON COEFFICIENT MATRIX
        •-.051,
                                                                           ."00
       FIGURE III-4.  Observed Correlation Coefficient  of  24 hr. SO,
            with that at Station No.  12.   Similar to  the Simulated
               Correlation Coefficients  in Figure III-2
                                    38

-------
interpolation procedure be used.  The method finally adopted
is based on the proper function/proper value representation
of the correlation coefficient matrix.

     To illustrate what is involved in the interpolation
process, it is desired to obtain the correlation coefficient
r(x.,y.;x., y.)  where  (x.,y.)  are the coordinates of one
observation site, P., and (x.,y.) are the coordinates of
another observation site, P..  We have available correlation
coefficients between points on a 9X9 grid.  Let these points
have coordinates (£ ,r\ ), (£ ,n ) so that the correlation
coefficients are functions r(£ ,n '^n'^)'  One ^s tnen faced
with an interpolation procedure that involves four separate
coordinates.  This in itself is a reasonably formidable task.
In view of the great irregularity of the correlation field,
as illustrated in Figures III-l and III-2, the task is even
further complicated.
     The proper function/proper value representation of the
correlation coefficient matrix reduces the problem to that
of performing two interpolations, each in a two-dimensional
field, in succession.  It also has the advantage that a
certain amount of smoothing of the correlation coefficient
field is being done at the same time.  The details of this
process are discussed in Chapter IV, so that only a brief
extract of the essentials is given here.  The essence of
the situation is that the correlation coefficient matrix
with elements cmn=xmxn=r^m'nm;^n'nn^ may be rePresented in
the form
                            Xk*k (?m' V *k (^n' V     (2)
                           k
where, on the right hand side the summation over the index
k involves the proper values X..,X2,	, A, ,	, X ,
                           39

-------
all positive numbers and in decreasing order, ^-\>^j --- >^  /
and the proper functions ,(£  ,r\ ), one for each of the
                          K.  HI  in
proper values.   (These are also referred to as eigenf unctions
and eigenvalues, principal values and principal components,
empirical orthogonal functions, etc.)  These are computed
at the points of the 9X9 grid over which the matrix of
synthetic correlation coefficients was obtained  (81x81) .   Each
of these is dependent on only the two coordinates of a grid
point.  Then to obtain the synthetic correlation coefficient
between observation sites a. . =r (x . ,y . ;x . ,y . ) one simply
interpolates among the grid points to obtain each v(x.,y.)
                                                   K  1  1
and each ,  (x.,y.) and substitutes back in equation  (2).
          K  J  D
To obtain the factors g., g. that appear in  (8), Chapter II,
one notes that these are also correlation coefficients, but
involve an observation site  (x.,y.) and an arbitrary point
(x,y) .  The coordinates of the arbitrary point are simply
used to get an interpolated value of 
-------
            (1,0)      (1,1)
                      -•
                      N»(o,D
                                              ( B  )
          ( C )
                      FIGURE III-5


THE  SIX POINT  ARRAY FOR QUADRATIC  INTERPOLATION (A)  AND

THE  THREE 90° ROTATIONS ABOUT  (0,0).  THE AREA OF BEST

INTERPOLATION IS INDICATED IN  EACH.
                          41

-------
The arrangement of the points is shown in (a)  of Figure III-5.
This formula gives the best results over the triangle bounded
by  (0,0), (1,0), and  (0,1) (shaded in the figure).  In order
to maintain this degree of interpolation accuracy, corres-
ponding formulae were written for the three 90° rotations of
this unsymmetrical array of points as shown in  (b),  (c),
and (d) of the figure.  To carry out an interpolation, the
point  (0,0)  on the grid was located so that (x,y) fell within
or on the boundary of one of these triangles and the corres-
ponding interpolation formula was applied.

D.  Generation of Synthetic Pollution Correlation Coefficients
     The synthetic correlation coefficients for two-point
pollutant concentrations were generated by program PSCOV
which is reproduced on the pages following this discussion.
The pages following the program contain a print-out of the
input parameters that were used.

1.  Main Program
     The input parameters are read into the program in
lines 15 through 87.  The wind selector, JW, was used  (line  20)
to provide for wind information from either a frequency
function (JW=1) or from frequency tables  (JW=2) and provided
a termination of the computation  (JW=0).  The subdivision
of wind directions and speeds, stability classes, and
inversion heights were read in on lines 29 through 35.  The
parameters of the wind frequency function are read on lines
45 through 53  (option JW=7, that actually used).  The provision
for the wind frequency tables is contained in lines 55
through 62.   The ground coordinates of the locations at which
the correlation coefficients are to be computed are read
                               42

-------
 on line 67.  Pollution sources, strength, effective stack
 height, and coordinates are read on line 82.  A minimum
 pollutant concentration was read on line 86.  The purpose
 of this was to provide a very small, but non-zero pollutant
 concentration at all points in all situations so as to avoid
 the possibility of an impossible logarithm of pollutant con-
 centration which might occur in subsequent calculations.

      The accumulation of data is initialized on line 90.
 The data accumulation is carried out in a sequence of
 nested DO-loops beginning on line 96 and ending on line 170.
Subroutines FREQF  (line 112) and SIGMAS  (line 152) are
called  in this  section and are discussed  subsequently.

     The statistical parameters are  accumulated and reduced
in the  section  from lines 174 through line 190.  The  section
from line 190 to the end consists of output  instructions.
Subroutine MAPI is used to printout  the correlation co-
efficients, covariances, and other parameters in a "map"
format  so that  contours of equal correlation coefficient  (for
example) could  be rapidly drawn.

     2.  Subroutines
     a)  Subroutine FREQF
     The wind frequency function (probability density) is a
combination of circularly normal bivariate distributions

     f(v,6) = ^f (v,e;w1,F1,a1) + ... + *nf (vf 9;wn, W

in which k  + ... + kR = 1, the ^'s being the fraction of
the total wind  population in the ith category and the
parameters w. ,  6". ,a. are the speed and direction of the
mean wind vectors and the vector standard deviation of wind
speed in the  ith category.  The probability density functions
                            43

-------
on the right differ only in the values assigned to the
pa rame te r s.  Thus

   f (v,0;w,e", ) = (ira2)~1exp{-a~2[v2+w2-2vwcos(e-'e) ] }

The common multiplier for all the density functions, vdvdG,
is omitted in the above.  It has been found that the most
complex wind distributions can be represented succinctly
and with reasonable accuracy in the above form.   This saves
storage of extensive tables of observed frequency distributions

     b)  Subroutine SIGMAS
     This subroutine provides for the computation of the
"sigmas" that appear in the pollution concentration formula.
The data were taken from E.G. Eimutis and M.G. Konicek,
Derivations of Continuous Functions for the Lateral and
Vertical Atmospheric Dispersion Coefficients, Atmospheric
Environment,  Pergamon Press, 1972, Vol. 6, pp. 859-863.

     c)  Subroutine MAP
     This subroutine prints out on the standard page printer
the data input to it at the proper coordinates  (as nearly
as the printer will allow).  Since we are primarily interested
in correlation coefficients, the decimal point serves not
only to fix the magnitude of the quantity concerned, but
also as a  "fix" for the point location.
                           44

-------
  PROGRAM     PSCOV

                  PROGRAM PSCOVM INPUT, OUTPUT,TAPE2l
            C     COMPUTES MEANS, STO.DEV. ,COV. ,ANO COR.  OVER  POINTS  FROM
            C     UP TO 30 POINT SOURCE  INPUTS.
            C
 5                COMMON                  F(18,10, 6) , X(100) , Y(tOO) ,QS(3Q) ,XS(30) ,
                 *       YS{30l,HS<30),CaAR(10ai , CCC V 15050 1, CH ITL ( 100 1,CSIG{ 1001,
                 *       CCOR(50501
                  COMrtON/FFF/IH(61, W«(6,4I ,H3AR( 6 ,<*) ,T8AR ( 6, 41 , HSIG(6,4I
                 *          ,OIR(18l,l/£L(101 ,IO,IV
10                COM*ON/STA9/AL(6) ,ISTAB(6I
                  DIMENSION KNTC10) , ARRAY! 10 01 , I TYPE (4)
                  DIMENSION XK100) , YK100)
                  PI=3. 1415926535
                  RAO=0. 0174533
15          C
            C     INPUT QUANTITIES
            C     JH=HINO DATA SELECTOR*  JW=0  FOR STOP,JW=1 FOR  FREQ.  FUNCTION,
            C     JW=2 FOR TABLES
            C
20             10 REAO 1005, JH
             1005 FORMATU5)
                  IFUW.EQ.OI STOP
            C
            C     IQ.NO.OF WIND DIRECTIONS*  OIRC INDIRECTION  TABLE,  MAX*13
25          C     IV*NO.OF WIND SPEEDS,  VELfI)*SPEEO TABLE, MAX=10
            C     IST»NOt OF STABILITY TYPES,  ISTABCIlsSTABILITY TABLE,
            C     AL(I)*INVERSION HEIGHT,  MAX  =  6, 0.  =  NO  INVERSION
            C
                  REAO 1010,10, (OIR(I) ,1=1,10)
30                REAO 1010,1V, UEL(I> ,1=1, IV)
             1010 FORMAT(I5/(8F8.2) I
                  RE A 3 1015, 1ST
             1015 FORMATII5I
                  REAO 1116, < 1ST ABC II ,AL< II, 1=1, 1ST I
35           1116 FORMAT! 5CI5,F10.2l)
                  GO TO <15,25),JH
            C
            C     READS FREQ. FUNCTION DATA
            C     IW
-------
   PROGRAM     PSCCW

             C     F(I,J,K)=FREQUENCY TABLE ENTRIES
             C
                25 DO 30 1=1,1ST
                   CO 30 J=1,IO
 fcO                REAO 1025,(F(J,K,I),K=1,I\M
                30 CONTINUE
              1025 FORMATC9F10.0)
             C
             C     REAOS GROUND COORDINATES
 65          C     IG=HO. OF GROUND POINTS,X(I),Y(I)=COORCINATES
             C
                35 REA3 1030,IG,(XI
              i030 FORMAT(I5/(2F10.0))
                   11 = 0
 70                00 36 1=1,IG,9
                   00 36 J=1,IG,9
                   11=11+1
                   X(III=XICI)
                36 Y(II)=YKJ)
 75          C
             C     REAOS SOURCE POINTS AND PARAMETERS
             C     IS=NO. OF SOURCE POINTS
             C     QS
                         46

-------
   PROGRAM     PSCUi/

                   lie  TO  (55,60)  JW
                55 CALL F*EQF< 11,12, 13, FF)
                   GO  TO  65
                60 FF=F(I2,I3,I1)
115             65 IFCFF.tQ.O.)  GO TO  115
                   SUHF=SUMF+FF
             C
             C     LOOP ON COORDINATES
             C
                   INO£X=0
                   uO  110 I*t=i,IG
                   YY=Y(I<»»
             C
125          C     ROTATES TO  WINu  COORDINATES
             C
                   X1 = XX*COYY»SS
                   Y1=-XX*SS*YY*CC
             C
130          C     LOO? ON SOURCES
             C
                   SUM=CHIMIN
                   00 100 15=1, IS
                   XX3=XS
-------
               PSCOtf

             C
             C     £NO LOOP ON COORDINATES
               110 CONTINUE
             C     ENO LOOP ON SPEED, DIR., AND STA3ILITY
170            115 CONTINUE
             C
             C     PUTS STATISTICS IN STANDARD FORM  ANO  UNITS
             C
                   IDX = 0
175                DO 120 1=1,IG
                   C6A<(I)=CBAR(I)/SUMF
                   00 120 J=1,I
                   IOX=IOX*1
               120 CCOV(OX)=CCOV , M3AR (I , J ) ,MSlG (I, J)
               205 CONTINUE
              20<*5 FORMAT(I10,<»F10.2)
2?0                GO TO 220
                               48

-------
   PROGRAM     PSCOV

               210 PRINT 2050
              2050 FOR*AT<»OHINOS FROM FREQUENCY TABLES*)
                   00 215 1=1,1ST
                   PRINT 2055,ISTA3(I)
225           2055 FORMAT<*OSTABILITY CLASS',15)
                   PRINT 2060,CVELCJ),J=l,Itf)
              2060 FORNAT(8X,10F8.2)
                   00 215 K=1,ID
                   PRINT 2065,OIR
-------
   PROGRAM
 PSCOtf
                   00 250 J=1,I,8
                   J1 = J
                   J2=J*7
                   IFU2.GT.I) J2 =
280
285
290
295
300
305
310
315
320
325
330
     00 2i»6 JJ=J1,J2
     IOX=IDX*1
     *=<«•!
     KNT(K)=JJ
 2<»8 ARRAVIK)=CCOVUOX)
     PRINT 2119, 
-------
   PROGRAM     PSCOV

                   INO=IOX
                   INC = I
                   11=1*1
                   00 320 L=I1,IG
335                ARRAYIK)*CCOV(IND+INC)
                   INC=INC+1
               320 K=K*1
                   ENCOOEl2flf3000,ITYPE(3ll I
              3000 FORHAT(*LOCATION*f I3.9X)
                   CALL MAP1(IG,X,Y,ARRAY,2,ITYPE»
               330 CONTINUE
               335 ITYPEI1)»10HCORR.COEFF
                   ITYPE(2)=10H. MATRIX
                   IOX = 0
                   00 360 1=1, IG
                   00 3<*0 J=l,I
350                ARRAY IK) «CCOR(IOX)
               3^0 K=KH
                   IND=IOX
                   INC=I
                   11=1*1
355                00 350 L=LltIG
                   ARRAY (K)=CCOR(INO*INC)
                   INO*INO*INC
                   INC=INC»1
               350 K=K*1
360                tNCODE(20,3000,ITYPE(3)J  I
                   CALL MAP1(IG,X,Y,ARRAY,2,ITYPE1
               360 CONTINUE
                   GO TO 10
                   END
                              51

-------
  SUBROUTINE  FREQF
                  SUBROUTINE FREQF(I1,I2,I3,FF)
                  COMi10N/FFF/IW<6» , UK »
                  GO TO 20
20             18 OV=(VEL(I3*l)-»/ELCI3-U)/2.
               20 J2=IW(I1»
                  00 22 J=1,J2
                  SIGSQ= MSIG(I1,J)»»2
25                W1=W8AR(I1, J»
                  F=CVl**3+Hl**2-2.*tfl*Wl*COS(RAD*OIR(I21-RAD*TBAR(Il, J)))/SIGSO
                  F=W
-------
  SUBROUTINE  SIGHAS

                  SU3ROUTINE SIGMAS(IltISTa,X,SIGY,SIGZfH)
                  DIMENSION A 161 ,A1 (18) ,81 (18) ,C1 (18 )
                  COMNON/STA8/AL (61 * ISTA8 <6I
                  DATA (Af I) , 1*1,6) /. 365 8, 0.2751,0. 2 089, 0.1471,0. 1046, 0.0722/
 5                DATA (AKI), 1=1, 18) /O. 000 2<*,0 .0 55, 0. 113 ,1.26,6. 73, 18. 05 ,
                 *                      0.0015,0.028,0.113,0.222,0,211,8.086,
                 *                      0.192, 0.156, 0.116, 0.079,0. 063, 0.053/
                  DATA Cai(I) ,I»1,18) /2.09i», 1.098, 0.911, 0.516, 0.315, 0.180,
                 *                      1.9<»1,1. 1*9, 0.911, 0.725, 0.671, 0.740,
10               *                      0.936, 0.922, 0.905, 0.111,0. 171, 0.81W
                  DATA 
-------
  SUBROUTINE  MAP
               10
               15
10
15
20
25
30
35
C
C
C
               20
     SUBROUTINE MAP   MATRIX(I)
      FORMAT(5X, 1H*,F6.3,8X)
      GO TO 200
      £NCODE(20,1016,ARRAY(N,L)»  MATRIX(I)
      FORMAT(6X,1H»,F6.3,7X)
                             54

-------
   SUBROUTINE  MAP

                   GO TO 200
               127 £NCOO£<20,1017,ARRAY(N,U) MATRIX(I)
              1017 FORMAT<7X,1H*F6.3,6X)
                   GO TO 200
 60            128 £NCOD£(20,1018,ARRAY,N=1
 70            210 CONTINUE
              2004 FORMATCX,13A10>
                   RETURN
                   ENTRY MAPI
                   11*0
 75                IS=9
                   00 330 N~1«NP,9
                   I=IS
                   00 320 NN*1»9
                   11*11*1
 80                X(II)»MATRlxm
               320 1=1*9
                   IS=IS-1
               330 CONTINUE
                   PRINT 2002,KIND
 85                PRINT 2006, IX(I>,I = 1,NP)
              2006 FORMATOF10.3/////1
                   RETURN
                   END
(Input data on point source statistics is listed in Table  III-l,
p. 33)
                             55

-------
                      CHAPTER IV
        THE ANALYSIS OF THE COVARIANCE MATRICES

     In the use of the covariance function or correlation
coefficient function it is important to evaluate the part
that may be due to small scale effects and to random errors.
The problem is analogous to the problem in communication
theory in which a signal is observed in a noisy background.
In order to determine the signal, it is also required that
a considerable amount of information be available concerning
the nature of the noise.

     The situation in general was described in qualitative
terms in Chapter I where the effect of a jump discontinuity
at zero distance or a part with small "range of influence"
on the statistical interpolation were discussed.  Up to this
point the method of finding the magnitude of this discon-
tinuity at zero distance or the magnitude of the effects
with small "range of influence" have not been discussed.
They are the specific subject of this chapter.

A.   Evaluation of the Effect of Errors of Measurement or of
     Small Scale Phenomena
     To evaluate the effect of errors of measurement or
of small scale phenomena, we carry further the procedures
that led to the mean square estimate of Chapter II, equation
 (8) .  The expression for the estimate of pollution concentration,
^
y, at P was given in  (9), Chapter II, as
where the x.'s are the observed concentration measures on
a particular occasion, g.=(x.y) is the covariance of the
                                     i~i
concentration measures at P. and P, a J is an element  (row
i, column j) of the inverse of the covariance matrix {a. .},
                         56

-------
a. .= (x . x . ) , where a. . is the covariance of concentration
measures at P. and P..  It was pointed out following
equation (9) of Chapter II that if we let P^Pi,/ an ob-
servation point, and assume that g.+a.,  at the same time,
     XV                            ]  JK
then y+x, ,  i.e., the estimate at P approaches the observed
value at P,  when P approaches P,.

     We consider now the situation illustrated in Figures
I-3a, I-3b and Figures I-5a, I-5b.  To express the ideas shown
there qualitatively we need an explicit formulation for the
covariances g.=g. (P.,P).  Thus, let

          (1)    (2)
     q .  = g .  + q .
     yD    y:    yn
      (1)
where g.  is the part of the covariance g. which describes
the overall variation of g .  as a function of the location
                          D     (2)
of P with respect to P .  while g .   is the part that represents
the amount of discontinuity in g . at P .  on the one hand or
the part of g. that has a limited "range of influence".  The
expression "limited" also needs definition (or at least
needs to be made explicit) .   For our purposes, "limited range
of influence" will be taken to mean that at distances from
P . to P  that are of the order of magnitude of the spacing
between  prospective observation sites the magnitude of
(2)                                       (2)
g.  is negligible  (i.e., the presence of g.  cannot be
distinguished from sampling variations) .  With this speci-
fication of "limited range of influence", the situation
                     (2)
is treated as though g .   were simply the magnitude of the
jump discontinuity if g (P.,P) for P-»-P . .  Then we may write
                          3            3
                             57

-------
                    = g.j(1) (Pj'Pj)  + 9j(2)
or
          g..(2) (PjfP)  = 0 if

          g.(2)(P.,P)  = g. (2) (P. ,P.) if p=p.
           D     D       J     D  D        D
Equation (1)  then becomes
where P,  may be any one of the points P . .
     Consider now the element of the covariance matrix {a. .},
a. . = (x.x.).  Note that these covariances are essentially
the same as those of g .  = (x.y) except for the fact that the
wandering point P involved in g .  = g.(P.,P) is now restricted
to one of the observation sites.   As long as the points P.
and P. at which the covariance a. .  = (x.x.) is computed
are distinct, the values concerned are just those of
g.    (P., P.).  The part of the covariance that had "limited
range of influence" is not involved.  But throughout the
principal diagonal of {a. .} one has i=j .  For these elements
of the matrix {a. .} both parts are involved.  These elements
of the matrix will be written as
     a.. =a..(1) + e.2
      11    11       i

where a..     is what we have called g.    (P.,P.) above and
                           58

-------
  2                          (2)
e.  is what we have called g. v  (P.,P.).   (Note:  the
 J-         «                1     11
quantity e.  is not_to be confused with the mean square
                    2
error of estimate, e , without subscript/ used in equation
                               2
(8), Chapter II.)  The terms e.  will be referred to as the
"residual variances".  They are the quantities that must
                                              (2\
be determined in order to separate the part g.    from g..

     The method used to determine the residual variances
is an important part of Factor Analysis.   It is unfortunate
that in the physical sciences the importance of these residual
variances has been, until recently, neglected to a large
extent.  At least a part of this is due to the fact that
physical scientists also tend to be instrument designers and
have felt that to recognize "errors of measurement" was to
cast doubt on the quality of their instruments.  Factor
Analysis is traceable to the psychologist Carl Spearman
(1904)  and the recognition of the importance of the residual
variances originates with him also.  As used by psychologists,
the covariances  (or correlation coefficients) that make up
the matrix elements a..  are those between "test i" and
"test j" when given to a group of subjects.   The psychologists,
like the physical scientists, were hesitant to admit that
these tests involved "errors of measurement".  They did,
however, recognize that each test had something about it
that was unique  (belonged to it alone).  Consequently what
we here call the residual variances were called by the
psychologists the "uniquenesses".

     Factor Analysis, as such,  is devoted to things quite
different from the determination of the residual variances.
These are rather incidental, but have an important bearing
on the prime objective in that subject.  We confine our
attention to finding the residual variances alone and ignore
                           59

-------
all other aspects of Factor Analysis.  The best treatment
that we have found is that of Lawley and Maxwell (1963)  and
(1971) who treat the subject from the point of view of a
statistician and include discussions of the applicable
tests of significance.  The ordinary texts on Factor Analysis
tend to emphasize the computational and interpretive aspects
of the subject and neglect the significance tests.   For
example, Horst (1965) is a 730 page compendium of computing
programs and the associated mathematical manipulations which
does not mention a single test for significance.  On the other
hand, adequate tests of significance only go back as far as
Bartlett (1951) and Lawley (1956) .  The methods used here
depend chiefly on Joreskog (1962) and may also be found in
Lawley and Maxwell (1971)  [but not in  (1963)].  As far as
we have been able to determine, the treatment of the method
of finding the residual variances using the formulation of
the problem as an integral equation rather than as a matrix
equation has not appeared elsewhere except for a brief note
by the writer  (Buell, 1972) which covers only a small part
of the work reported here.
                      60

-------
B.  Determination of the Residual Variances
     The discussion of Section A preceding leads us now
                                    2
to consider the matrix A = {a.. + e. 6..}, where 6..=0
if i^j and =1 if i=j so that we may write
A =
a22+e2
                         n2
                                            2n
                	,  a  +e
                      nn  n
and in which a..=a.., i.e., the matrix A is symmetric.
The matrix A is also positive definite.  To keep the notation
from becoming excessivly complex, the diagonal terms shown
as a..  here were denoted by a..     in the previous section.
We note now that one may write the matrix A as a sum of two
matrices, A=C+D, where C is the matrix {a..} and D is the
                   2~                      -1
diagonal matrix {e.  6..}.  The problem at hand consists of
finding the elements of D and the diagonal elements of C
                                                     2
(i.e.,  the values of a..) so that their sum, a.. + e.  will
have the known value that appears on the diagonal of the
given (known) matrix A and such that all of the diagonal
elements of D are positive (or perhaps zero) and such that
the resulting matrix C will be positive definite.
                        61

-------
1.  Representation of a Matrix in Terms of Proper Functions/
    Proper Values
     In this section some background material is intro-
duced which forms the basis on which the following sections
depend.  We consider a symmetric positive definite matrix
B={b..}.  It is well known that such a matrix may be written
in the form

     B=$A$',        $•= transpose of $            (2)

in which the matrix A is a diagonal matrix such that the
elements on the principal diagonal, X., are all positive
(or at least non-negative) real numbers.  It is also further
specified that these be written in decreasing order of
magnitude:

     A,>A0>A_>	>A >0
      1— &— j—   — n—

These are the proper values belonging to the matrix B.
(Also called eigenvalues, latent roots, characteristic roots,
etc.)  The matrix B is assumed to have n rows and columns.
There are then n proper values, not necessarily all distinct.
The proper values are the solutions of the determinantal
equation

     |B-AI| = o

where  |•|  stands for the determinant and I is the nXn unit
matrix.  This equation is a polynomial of degree n in A.

     With each of the proper values, there is associated a
proper function  (or proper vector) which appears in the
column of $ that corresponds to the position of the associated
                           62

-------
proper value  in A.  Thus to X . , there corresponds the
column vector col{<}>, . ,4>~ . , --- ,cj> .} which stands in the
                   .1. -L  £ J_       II J_
i'th column of  .   (These are also called eigenvectors,  latent
vectors, characteristic  vectors, etc.)  These are the solutions
of the homogeneous equations
       k
(It is also said that the matrices A, $, are the solutions
of the matrix equation B$=$A.)

     The proper vectors or proper functions have the property
of being an orthonormal set of vectors.  That is to say, they
are normalized,
       k
and they are orthogonal to each other
       k
     The original statement that the matrix B may be ex-
pressed in terms of its proper values and proper functions
is written in explicit summation notation as
            k
     The decomposition of a matrix into its proper values/
proper sectors is an important aspect of matrix analysis,
of which we have presented only the bare essentials for a
particular case.  See any standard text such as Bellman  (1960) ,
Gantmacher (1960a) , MacDuffee  (1949) , Perils  (1952) ,
Turnbull (1960), etc. for a general treatment.  Not only
                          63

-------
is it important for the subject at hand, but has many
applications, particularly to oscillatory systems from simple
strings and mechanical systems (Gantmacher, 1960b) to nuclear
spectra (Mehta, 1967).

     In the above, the terms proper vectors and proper
functions have been used interchangeably.  This usage
is intentional and is to emphasize the fact that what is
termed a "vector" with n discrete "components" in this dis-
cussion will turn out to be a function on a continuum and
it just happens that our computing procedure is limited to
the direct computation of values of this function at only
n points.

2.  Factor Analysis Methods
     Modern factor analysis techniques provide methods for
                                   2
finding the individual values of e.  , i=l, 	, n.  Most
of the methods for finding these quantities (i.e., those that
are statistically sound)  require the solution of the matrix
equation

     (A-D)$ - $A                                  (7)
                                               2
where D is the diagonal matrix with elements e.  on the
principal diagonal and O's elsewhere.  All of D,A,$ are to
be found while only A is given.  Nearly all of the methods
concerned are iterative, and have the characteristic feature
that (with only a few exceptions) they converge very slowly
(and in some cases, some methods fail to converge at all)
(Lawley and Maxwell, 1963).

     The method used here is straight-forward and has given
results that appear to be satisfactory.  It is that developed
         it
by K.G. Joreskog  (1962).  The technique is as follows.  It
                           64

-------
is assumed that the covariance matrix, A, may be written in
the form
     A = $A$'+D                                    (8)
where $' is the transpose of $ which in turn is the matrix
of proper functions of A-D, A is the diagonal matrix of proper
                   2
values, and D = {e. a. .} is the diagonal matrix of residual
variances.  The proper functions (or vectors) corresponding
to the proper values are ordered in the same way as the
associated proper values [col (4>, . ,_., --- ,<)> .) corresponding
to A.].  The values of A., i=l, --- , n, are the roots of the
determinant equation

      A-D-Al  = 0
where I is the n x n unit matrix.  Now consider the matrix
A-D instead of the matrix A and assume that all of the diagonal
                            2    2
terms of D are identical, e.  = e , i=l, --- ,n.  It is rather
easily shown that (a) the matrix A_has proper values which
                                   2
are those of A-D but increased by e  and  (b) the proper
functions of A are exactly those of A-D.  Thus, if y . ,
i=l, --- ,n, are the proper values of A, and A., i=l, --- ,n,
are those of A-D, then

          y  = A  + e2 .
(Anderson, 1958, p. 287, problem 7)

If, now, we were to find the proper values of A as
y, , ...,y  and if we had some way of determining that the
last n-k of these proper values were not significantly
different from each other, it would then seem reasonable that
we could lump these last n-k values into one batch with
                           65

-------
average value (y, , , , ..., +y )/(n-k), take this value as the
            ~T                                     ~~2        ~T
estimate of e ,  and then consider the values of y,-e , ...,y,-e
                                                 _L         K
as proper values of the matrix A-D, which will then be of
rank k( .)/ i=l,...,k, and the last n-k proper values,
A., would be zeros.   It just so happens that just such a
test as required above is available.  Its specification is
deferred until later since it is used in a somewhat more general
sense than is required by the heuristic argument above.
                                     H
     Return now to the technique of Joreskog (1962) in which
                     2
the elements of D={e. 8. .} may differ from each other.  It is
assumed that the matrix D is given by the expression D =
0[diag(A~ )]" ,  where 6 is a scalar constant to be determined,
D is proportional to the diagonal matrix which is the inverse
of the diagonal of the inverse of the covariance matrix that
we started with.  We reduce this notation to the form D=9A  ,
where A=diag (A~ ) and (8) then becomes

     A =  (*A$')  + 0A"1.                            (9)
Since A is a covariance matrix, the elements of A will
all be positive, and if we take the positive square roots of
                                                      1/2
the elements of A and denote this diagonal matrix by A    ,
it then follows that (9) may be written as

               = (A1/2**1/2)^1/2**1/2)' + 01
or
     A1/2AA1/2 = CC1 +01
where
          C =

            1/2  1/2
The matrix A   AA    is then to be expressed in the form
described in the heuristic argument preceding where the
                            66

-------
single scalar coefficient, 6, corresponds to the common
                    2
residual variance, e , used there.  To find the value of 6,
one finds the proper functions and values of the matrix
A^AA1/2, or

     (A1//2AA1//2)F* = F*A*

where A* is the required diagonal matrix of proper values
(properly ordered) and F* is the matrix with proper functions
appearing in the columns in the same order as the proper
values.

     The test for the last n-k proper values being different
from each other is to the effect that the quantity Q is
                              2
approximately distributed as x / where

     Q = N'{-loge(A*+1- 	A*) + (n-k)log[(A*+1+	+A*)/(n-k)]

and
     N1 = N-k+[2(n-k)+l+2/(n-k)]

                                          2
and the number of degrees of freedom for x  is

     d.f. = (n-k+2)/(n-k-l)/2.

(N+l = number of observations on which A is based.)  The
criterion operates in an inverse sense.  If the value of
 2
X  for a given value of k (the number of proper values
accepted as being different from each other) is exceeded
then there is at least one more significant proper value that
should be included.  If it is found that at a given significance
level, the number of significantly different proper values, k,
is adequate for the representation of the matrix A / AA    ,
                           67

-------
then the value of 6 is taken as
     9 = (Xk+l + ~~~ + V/(n~k) '

the average of the last (n-k)  proper values.

     We have now succeeded in representing the matrix
A1/2AA1//2 in the form
                  * *  *
                   A(F) '  + 61
where A, ,  F,  consist of only the first k proper values and
       k   k
proper functions.  The matrix A may then be written as
     A= (A~1/2F*)A*(A~1/2F*) '  + 9A'1, D=6A~1={ei26ij}   (10)

                       -1/2 *
     Since the matrix A  '  F,  has columns not orthogonal to
each other, these are no longer proper functions of any matrix,
To obtain proper functions and values, it is necessary to
recompute values of the k proper values/functions from
scratch, only using as the matrix concerned the matrix
  -1/2 *  *  -1/2 *
(A    F, )A,(A  '  F, ) '.  There are, of course, only k non-zero
proper values and proper functions, A, , $, , which are usually
                                     K   K *   *
quite close to those already obtained for A,, F,  (except
for a scale factor for the proper values).

     To recapitulate, it was pointed out that the use of
equation (9) of Chapter II as an interpolation formula for
estimating the measure of pollutant concentration involves
two important items that are frequently overlooked.  The
first is that the variance of the small scale effects, the
effects of limited range of influence, must be specified
                             68

-------
quantitatively.  The Joreskog  (1963) representation of
the covariance matrix does this.  The required values are
exactly those specified by (10).  The second item is that
a method of interpolation was required that would preserve
the character of the matrix A as a covariance matrix.  This
is provided when the first matrix expression on the right
of (10) is recomputed in terms of its proper functions, $,,
and corresponding proper values, A,  so that
               i
     A = $, A, $.   + D .                             (11)
          k k k

In (11) note that the subscript k is not a summation index.
It is used to indicate that only the first k proper values
and functions are used.  The matrices concerned have dimensions
as follows:  $,  is (QXk) , A,  is (kXk) , $,  is  (kXn) , so that
       i       K.            K.            JC
$, A, $,   is (nXn)  as are A and D.

     It appears that the solution to the problems concerned
is at hand, but this is not necessarily the case.  The Factor
                                         2
Analysis method for finding the values (e ) is applicable
to quite general covariance matrices.  We are concerned with
covariance matrices of a stochastic process on a continuum.
The Factor Analysis method is strongly dependent on the
value used for the number of statistically significant proper
values, k.  The problem will next be considered using a con-
tinuum formulation.  The effect on the evaluation of k is
found to be important.

     In Sub-section 3, Integral Equation Methods, immediately
following, the problems involved in the solution of the
integral equation corresponding to  (7)  are considered.  A
method for testing whether the solutions of this integral
                           69

-------
equation obtained by two (or more)  only slightly different
procedures are consistent with each other is developed.   The
section is strictly theoretical.   The practical application
of the techniques of Sub-section  3 is made in Sub-section 4
to actual S0? measurements.

3.  Integral Equation Methods
     The factor analysis method for determining the values
of the residual variances fails to take into account the
fact that there are strong geometric relations connecting
the pollution concentration covariances at the various
points of the observing network.   To bring this into the
picture, the problem is re-stated in terms of a continuum of
values rather than in a totally discrete form that ignores
this situation.  To illustrate, the values x., x. that enter
into the covariances (x.x.)  can be (in the factor analysis
case frequently are) test scores  from the i'th and the j'th
tests given to a batch of subjects.  The i'th test might be
on arithmetic ability and the j'th test on manual dexterity.
If a third test is considered, say x,  for the score on the
k'th test, which is on social adaptiveness, it is silly to
place this in a strictly geometrical relation with respect
to the other two.  On the other hand, when x., x., x,  are
                                            i   3   K
pollution concentration at points P., P., P,, we know all
                                   i   j   K
about the strictly geometrical relations between these
points, i.e., we can say that P,  is such and  such a distance
                               x
from P. and also from P. and that P. and P. are so far from
each other and that the line joining them has a certain
direction.  The natural generalization of the proper value/
function analysis of the covariance matrix is the integral
equation formulation for proper values and functions in a
continuum.  Thus

           =  fKUrX'HU1) dx1                         (12)
              R
                          70

-------
where X is a proper value, <}> (x) is the corresponding proper
function, and K{x,x') is the kernel,- the exact analogue of
the covariance matrix.  The kernel, K(x,x') is precisely the
covariance of concentrations at the points x and x'.  Note
that x in the above may be a multidimensional variable in
which case dx1 is multidimensional and R is a region of the
same dimensions.  We are concerned here with the two-dimen-
sional field of pollution concentrations.  The kernel function,
K(x,x')/ is now a positive definite symmetrical  [K(x,x') =
K(x',x)] function of the location coordinates.  The fact
that we are now working on a continuum is also reflected in
the fact that (12) has a denumerable infinity of solutions,
X ,  (x) , of proper values and functions.  The proper
values are considered as ordered
     X.. >X^ > --- >X  > --- > 0
      1 - 2 -     -n-     —

and the proper functions  (x) are taken in the same order.

     The standard method of solving (12) is to reduce the
problem to a matrix problem where the values x,x" are
specified on a network of points and the integral is replaced
by a quadrature formula.  This approximation may be written
as
                                                   (13)
where the factors A. correspond to an element of area about
the point x .  such that ZA .  = R = area of the region of
integration (for a two dimensional network of points) .  If
the kernel function were given by an explicit formula, then
one could evaluate it at any number of point pairs x. and x . ,
i,j=l, ..., n, and the number of points could be taken as
very large.  The larger the number of points selected, the
                           71

-------
more accurate would be the estimate of the solution for a
proper value/function of some specific order, n.  On the other
hand, one is faced with the fact that one can only obtain a
finite number of solutions when it is known in advance that
there are a denumerable infinity of such.  Thus, for a fixed
number of points, x., i=l, ..., n, n fixed, there must always
be some point, say n*, beyond which the solutions depart
more and more from the theoretical or exact solution.  When
one applies the technique to a kernel function that is known
only at point pairs from a fixed number of points and for
which the value of the kernel function is known only to within
a certain sampling error, other considerations limit the
accuracy of the solutions.  One of these is the accuracy of
the values of the kernel function itself and another is the
choice of the quadrature factors, A., and still a third is
the fact that K(x,x') may have a "jump discontinuity" along
x=x' (or at least an effective "jump discontinuity" as far
as the observation station spacing is concerned).  These
several points will be considered separately in the following
paragraphs.   None of  these points is  trivial.   Considered
from the integral equation point of view, the proper values/
functions obtained from the matrix equations by Factor Analysis
methods are simply approximations to a correct solution in
which several of the factors concerned have been overlooked
due to the over-simplification of the problem.

     a)  The Jump Discontinuity
     The fact that there may be a jump discontinuity along
x=x' may be handled by the following integration technique.
Subtract from the equation (12) the identity
          R
                           72

-------
to obtain

     4>(x) [A-yK^x'Jdx1]  =  y*K(x,x') [(x)]dx'         (14)
             R               R
The discontinuity of  the  kernel  function in the integrand on
the right of  (14) is  no longer effective since the factor
[ (x1) - (x) ]  is  zero  on x=x'.  It is still present in the
integral term on the  left side.   In this integral we use
quadrature factors that are  dependent on x in addition to
                         (x)
x1 in such a  way that B.   =0  if x=x..   Quadrature formulas
using such factors are said  to be of "open type".  See
Abramowitz and Stegun (1964).  If these quadrature factors
                   (x)
are indicated by B *   while  those on the right are A., then
the matrix equivalent of  (14) may be written as
     fK i v ^ T A ^  » Tf i v v  ^ *Q   i  1  ^~  » V / *v  'V i i A f v \ ^ ^f\ i ^ i 1 ^     I I ^ I
     T\^^/ I ^  y ^ J\VA*XI/C*  i  J  —  £J^- \X • f x • ^ |_tpvx./tpvx.^j/\.    v J--J /
               j                  j
Rearranging the terms  of  (15), one finds that

     X(fi(xi)= J^K(xifx.)  (x. )A.+ EKfx.^/x .) ((> (x^) [B. i -A.]    (16)

where neither summation contains  the  term i=j.  This term
is missing  from the first  summation because  regardless of A.,
it would be canceled by the  corresponding term of the second
summation;  it is missing from  the  second to  also account for
               (x )
the fact that B. i  =  0.   The  second  term on the right is the
equivalent of substituting for K(x.,x.)A^ the expression
                    K(xi,xj) [B.-Aj].                      (16a)

This is equivalent to using  an  interpolation formula to
obtain K(x.,x.) from values  of  K(x.,x.)  that make no use
                                      J     /   \
of x- = x. explicitly.  One  must  have  I. B.i =R and
                          73

-------
 £  A.=R-A. so that  Z [B !xij -A . ] =A. , i.e., the sum of the
J*L  D    X         J^i  D     D
weights given to each of the terms K(x.,x.) of  (16a) is  just  exactly
the weight ascribed to K(x. ,x.) itself.

     In order to obtain an explicit solution using the symmetric
positive definite matrix algorithms (which are much faster
than those for an unsymmetric matrix) , this form may be  made
symmetrical using the device described in the following
section.

                                            *      *
     When the proper functions and values, <(>,  (x. ) /^,, have  been
found from (16), then a new kernel K*(x.,x.) can be constructed,
thus
and the amount of the jump discontinuity along x.:=x. may
be determined from
     J± » J(x±) = K(xi,xi) - K*(xi,xi).

     b)  The Quadrature Factors
     The unsymmetrical formulation for the matrix  approximation
to the integral equation using a quadrature  formula,  (13) ,
may be made completely symmetric by multiplying  both  sides
by i/AT.  Thus  (13) becomes
     X[/A7(xi)] =    {/ATk(xi,xj)/A7}[  /A7(Xj)]             (17)

If we let /A~7(x.) =  9(x.), then the values  of  6(x.),  X are
proper functions/values corresponding to  the weighted
                        74

-------
covariance matrix {/A7~K(x. ,x . ) v/AT} .  Note that  if we have
proper functions 0,  (x. ) and 8,(x.) corresponding to proper
                  JC  1       XX
values A, ,A,, these functions are orthonormal ,  i.e.,


         k6l = 6kl         6kl =    0  if
If now we substitute the expressions for  0v(x.)/  6, (x.)  in  terms
                                          JC   1     XX
of 4>Jc(xi)f ct>1(xi)/ one obtains
which corresponds to the orthonormality condition  in  integral
form which is satisfied by the proper functions belonging
to the integral equation (12),

                  ^ (x)dx = 6kl
          R
provided that the same quadrature factors are used to evaluate
this integral as were used to reduce the integral  equation  (12)
to matrix form as in (13) or  (17).

     The quadrature factors that are to be used in going
from (12) to (13) or (17) need to be very carefully considered.
First,  even in the case of a single variable x, the choice
of quadrature factors is by no means unique.  The  standard
handbooks of mathematical formulae  list the  quadrature  factors
for the Trapezoid rule and Simpson's rule.  The more complete
work, Abramowitz and Stegun  (1964), lists eight additional
sets of quadrature factors, not to mention the open-type
formulae which would be used to obtain quadrature  factors
          (x )
like the B.  i'  of the preceding section.  In the case of
these formulae an error term is listed which generally  is
proportional to some derivative of the integrand evaluated
                           75

-------
in the interval of integration.   This,  at least,  gives the
impression that if the integrand is sufficiently  "smooth",
the higher the order of this derivative,  the more exact the
quadrature formula.  All of these formulae are for equally-
spaced abscissae, x. ,-x.=constant.  When the abscissae are
not equally spaced, one must be prepared to go to the basic
relations from which such quadrature formulae are usually
derived to obtain adequate expressions.

     The situation when the parameter of integration is
multidimensional is worse.  Even the compendious  Abramowitz
and Stegun (1964) list only one quadrature formula for
points located on a square grid and which are also on the
boundary of the area over which one is integrating.  It is
usually mentioned that in dealing with a rectangular grid of
points, one may make a multiple application of the one
dimensional quadrature formulae.  None of these cases describe
the situation at hand, which we now explore to some extent.

     To obtain the experimental values of the kernel K(x.,x.)
we have a fixed number of points, P., with coordinates  (£.,n-)
that are arbitrarily located.  We cannot increase the
number of points and we can do nothing about their location.
The domain of integration, R, is determined by this network
of points to a large extent, but not exactly.  We may as
well simplify the problem as much as possible by specifying
that the boundary of R is a polygon obtained by connecting
the points on the perimeter of this area.  Note that the
network of data points does not really define such a boundary
uniquely.  We need to specify something like a "convex"
boundary before even the area of integration becomes uniquely
defined.  We do not consider all of the ramifications of  the
boundary selection any further since it would only make a
difficult situation more difficult.
                          76

-------
     The ambiguity of the relation between the points, P., and
the domain of integration is illustrated by the simple
example of a re-entrant quadrilateral as shown in Figure IV-1.
In A, the four points are considered to be bounded by a simple
triangular region with the fourth point as an interior point
and the region divided into three non-overlapping triangles.
In B, the fourth point is considered to lie on the boundary
of a re-entrant quadrilateral which is subdivided into two
non-overlapping triangles.

     Inside and on the boundary one has the points P..  Now
subdivide the region into elementary triangles with a point
P. at a vertex of each triangle.  It is readily proved that
if P = total number of points, B = number of boundary points,
S = number of triangle sides, T = number of triangles, then

     S = 3(P-1)-B
     T = 2(P-1)-B

The fact that the boundary is not uniquely defined is reflected
in the fact that B (which is included in the total number
of points, P) is also a parameter that needs specification.

     Consider now a single triangle that has for vertices
the points 1, 2, 3.  In order to carry out the integration
of f(£,n) over this triangle when only the values f,,
f~, f.- at the vertices are known, it is reasonable to
consider a generalization (slight) of the trapezoid rule.
Let the function be approximated by a plane over this triangle.
It is easily shown that the integral over the triangle is
given by

          (£,n)d£dn  = (f1+f2+f3)•(A/3)                 (17)
      A
where A is the area of the triangle.
                           77

-------
             ( A )
             ( B )
                   FIGURE IV-1


THE AMBIGUITY OF THE RELATION FOR THE NUMBER
OF TRIANGLES AND SIDES DEPENDING ON BOUNDARY
POINT ASSIGNMENT
                       78

-------
     Next consider the array of points that is subdivided
into non-overlapping triangles that completely cover the
region of integration.  With each point, P., there will be
one or more triangles with this point as a common vertex.
The integral of f(£,n) over the entire region R may then be
approximated by the quadrature formula
       R              i
where i is the point index and A. is given by

     Ai = f Z\(i)]/3

where T, (i)  represents the area of the k'th triangle that
       *                          *
has the point P. as a vertex and Z  is the sum of all such.
               1                 k
     The situation is amenable to a simple geometric con-
struction to visualize the areas thus associated with the
points P..  In an individual triangle, the lines joining
the vertices with the mid-point of the opposite side divide
the triangle into three quadrilaterals each with the same
area.  Thus, for the triangle 123, Figure IV-2, the points
A,B,C are the midpoints of the sides and lines Al, B2, C3
all meet at the point Q, the "center of gravity" of the
triangle.   The quadrilaterals QB1C, QC2A, QA3B each have
an area equal to 1/3 the area of the triangle 123.

     When several triangles are put together to form a region
subdivided into triangles, the area associated with each point
is illustrated in Figure IV-3.
                           79

-------
                  FIGURE IV-2
THE THREE EQUAL AREA QUADRILATERALS, 1CQB ,
2AQC , AND 3BQA , RESULTING  FROM JOINING THE
VERTICES WITH THE MID - POINTS OF THE OPPOSITE
SIDES OF THE  TRIANGLE.
                      80

-------
                  FIGURE IV-3
  AN AREA SUBDIVIDED INTO TRIANGLES DETERMINED
BY THE  POINT  LOCATIONS AND THE AREAS ASSIGNED TO
EACH POINT ON THE BASES OF THE QUADRILATERALS IN
EACH TRIANGLE ILLUSTRATED IN FIGUREIV-2.
                         81

-------
     The situation seems to be well under control at this
point, but now consider how we divide a very simple convex
quadrilateral into triangles.  There are two choices of the
way in which this is done, as shown in Figure IV-4.  It
is readily seen that the quadrature factors assigned to the
points 1,2,3,4 in these two cases are vastly different from
each other.  When a region covered by many data points is
considered, the number of ways that it may be divided into
non-overlapping triangles becomes large, and each of these
methods of subdivision will be associated with a different
assignment of quadrature factors to the points.  The problem
then resolves itself into the question of what is the best
way of subdividing the area covered by given points P. into
triangles so that the resulting quadrature formula will best
approximate the integral concerned.

     As an example of the wide variety of quadrature factors,
consider a square with data points at  (±h,±h),  (±h,0), (0,±h),
and (0,0).  There are four sub-squares and each can be divided
into two triangles in two ways.  There are thus 16 possible
ways of dividing the area into non-overlapping triangles.
Each of these will give a different system of quadrature
factors based on the integration of a plane approximation over
the triangles.  These are listed in the following table.
The quadrature factors in each case add to 24 so they are to
                  2
be multiplied by h /6 where h is the spacing between points
to get the correct units.
                        82

-------
                  FIGURE iv-4
THE TWO WAYS THAT A CONVEX QUADRILATERAL  MAY
BE DIVIDED INTO TWO TRIANGLES AND THE RESULTING
DIFFERENCE IN THE AREAS ASSIGNED TO EACH POINT.
                          83

-------
    Quadrature Factors for the 9 Points Covering a Square
Number

Coordinates
  (0,+h)
  (0,0)
  (0,-h)
                                               1       1
                                             (Trap.)  (Simp.)
1
4
1
4
4
4
1
4
1
2
2
2
2
8
2
2
2
2
1
3
2
3
6
3
2
3
1
1
4
1
3
6
3
2
2
2
1
3
2
3
7
2
2
2
2
1
4
1
3
5
4
2
3
1
3/2
3
3/2
3
6
3
3/2
3
3/2
2/3
8/3
2/3
8/3
32/3
8/3
2/3
8/3
2/3
     The "number" row indicates the possible number of cases
of each type:  1 is associated with a symmetrical arrangement,
2 indicates that a 90° rotation gives another arrangement but
that a second rotation of 90° reproduces the original
arrangement; 4 indicates that the original is not reproduced
until the 4'th rotation through 90°.

     The last two columns are symmetrical quadrature factors
for this point array that are derived from the double
application of the trapezoid rule and Simpson's rule respectively,
The quadrature factors for the trapezoid rule  (applied
twice) are the average of those shown in the first two
columns.

     In this example, there appears to be no good over-all
criteria for prefering one assignment of quadrature factors
over another.  In the case of each square, the two possible
triangle subdivisions are symmetrical.  One might prefer a
symmetrical arrangement as in the first two columns, but this
seems to be justified more on the basis of taste rather than
                            84

-------
a good substantiation of the better accuracy of the quadrature
formula.  When the points are not regularly arranged on a
grid, but are more or less at random, one might prefer a sub-
division into triangles that would favor a tendency toward
equilateral triangles over long, skinny triangles, but again
this appears to be justified more on the basis of taste than
mathematics.

     The standard hyperbolic paraboloid interpolation scheme
(sometimes called double linear interpolation or iterated
linear interpolation) over convex quadrilaterals might be
used, but the same ambiguity on the resulting quadrature
factors still exists because of the many ways that the
observation net can be divided into convex quadrilaterals.
In this case, one may be stuck with a few triangles since
the quadrilateral subdivision need not "come out even".

     c)  The Product Integral T_ech_ni._qu_e
     If instead of interpolating the whole integrand on a
triangle to obtain the quadrature factors one interpolates
separately the two terms of the product, one obtains the
relation

     IA =
         A          1 1-q
               = 2KJ yK(x,y;p,q)4>(p,q)dpdq
where              0  0

     2~*i) + q(3-<|)1)

     K(x,y;p,q)  = K  + p(K-K) + qdC--K-
                      85

-------
and in which cj>, , 2, 3 and K, , K2, K3 are  the  values of the
proper function and the kernel function at  the  triangle

corners and A is the triangle area.  The  results of this

integration lead to


     IA =  (A/12) [(|)1(2K1+K2+K3)+cj)2(K1+2K2+K3)+(j)3(K1+K2+2K3)]


which may be expressed in matrix/vector form  as

                             (2,  1,  1)   /*!)
     I. =  (A/12){K ,K~,K }   {1,  2,  1)   h }
                             111   71   II
                             !-L,  X,  n   (^ J
                                          *J


where now one has a matrix of quadrature  factors.  Sum over

all of the triangles that cover  the region  of integration to

obtain the quadrature form for the  integral equation as
                                 ik        *  ]                (18)
                      JJ    k      D

where  = KA
where  is the column vector  of  proper values (x.).  Now
assume for the moment that A  is  a positive definite matrix.
(It is symmetric.)   It will then have a square root which we
                1/2                              1/2
will denote as A '  .  Multiply on each side by A '   to obtain
                         86

-------
     ,,Al/2,.    ,A1/2...,. 1/2, ,.1/2, >
     A (A ' $) =  (A ' KA '  ) (A   ) .

                 1/2
If, now, we let A ' =6, we have the  relation
    xe  =
which is the analogue of  the  relation (17)  obtained when the
quadrature factors were simple  scalars associated with each
point.

     The solutions of the above matrix problem,  8]c(xi^'^k/
are such that, as before,
Then in terms of the values  v(x.)f  one  has
     Vxi> -Z
               m
so that
                            m             n
          = I  24^) !<*„> (2
            m   n              i
                        1/2                               1/2
where, in the above,  {A. .  }  stands  for the elements of A '
the square root matrix of the matrix A (not the  square
roots of the elements of A) .  Since A1/2  is symmetric
(as was A) the  summation on  i  in  the above simply gives the
elements of A.  Thus
                        m n
                          87

-------
This is the same result that would have been obtained had
we evaluated the "product integral" expression
          R
using the same individual interpolation formula for the
factors in the integrand that was used to evaluate the
product integral that appears in the initial integral equation.

     The square root matrix of the matrix A is readily
obtained by recourse to the proper functions and values
of the matrix A.  If the proper values/functions of A are
y,, ty.,  so that M is the diagonal matrix of the yk's and
¥ the matrix of which fy.,  are the column vectors, then
A = yMV.  Since for a symmetric positive definite matrix
the proper values are real and positive, then the proper values
have square roots (use the positive sign).  Then the
matrix YM ' Y1, where M '   is the diagonal matrix of elements
 1/2
y,    on the principal diagonal and zeros elsewhere, is the
square root matrix of A,  i.e., A1//2 = ¥M1//2<{" .  Note that
 1/2 1/2 _    1/2 ,    , 1/2 ,   _   1/2  1/2  , _   1/2 1/2  , _


     d)   Test for the Accuracy of Solutions
     In view of the fact that the quadrature factors for
the numerical solution of an homogeneous Fredholm integral
equation of the second kind are not well-defined quantities,
a test of the validity of solutions seems to be in order.
The usual mathematical error estimates are unsatisfactory
for this since they all require more information than is
available from experimental data on the kernel function.  A
test is available, however; namely a simple comparison of
the proper functions arising from two equally valid choices
                         88

-------
for the quadrature factors.  Thus, let there be two choices
of quadrature factors, A.  and B..   T
algebraic equations to be solved are
of quadrature factors, A. and B..  The corresponding
                   £  [/A7K(xi,xj)/A7]
and let

     e#(x.) =

     e*(x.) =

                                                  #   *
be the proper functions at the points x. and let A£, A,
be the corresponding proper values.  Then consider the sum
of the squares of the differences of these solutions.  If
the solutions are reasonably similar, this quantity should
be very close to zero.  On expanding the square and summing
over the points x. , one obtains
                 i
The first and last term on the right are each unity so that

       I[e^(xi)-e*(xi)]2 = 2 [i-s ej^x^e*^)].

The second term in brackets on the right is simply the
"correlation coefficient" for the point by point comparison
of the two solutions.  (Each has unit second moment about
zero, but need not have a first moment of zero, although
                          89

-------
for proper functions of order greater than a small number
this is essentially the case.  We compute the sum of products
as indicated ignoring the fact that the first moment may
not be zero.  It is an immediate consequence of the Schwartz
inequality that the value of this sum of products lies between
+1 and -1 so it "looks like" a correlation coefficient
anyway . )

     The quantity
serves as a test parameter for the "self consistency" of the
               *          #
two solutions 0, (x.) and 67 (x.).  If C,  is close to 1, it
               JC  1       K.  3-        K.
may be said that the solutions are self consistent.  If it
departs from 1 by a significant amount, then the solutions
are not self consistent.  A better condition for self con-
sistency is much stronger than this.  We have a sequence of
values C., k=l, . . . , n.  It is expected that for k small,
the values of the test parameter will be close to one.  If
                         *
at some value of k, say k , the value of C,* has decreased
abruptly compared to the previous values, then we can say
that not more than k*-l solutions are self consistent.  This
is regardless of whether or not the subsequent values of
C,, k>k*, may be close to 1.  This is because in each
sequence of solutions every solution is orthogonal to all
previous solutions of the sequence.

     One cannot say which of the solutions is incorrect if
C,<1 when the quadrature factors are equally valid.  One
can only say that k*-l solutions appear to be equally valid
and that any further solutions in the sequence are suspect.
                         90

-------
     From the preceding, it is concluded that by formulating
the problem of matrix reduction  (outlined in Sub-section 1
and treated from the Factor Analysis point of view in Sub-
section 2 as an Integral Equation) there are ambiguities of
technique that have an important bearing on the accuracy of
determining the proper functions.  These ambiguities may be
used to test for self-consistent solutions.  If the solutions
are self-consistent one can say nothing about their accuracy.
Accurate solutions must be self-consistent, but inaccurate
ones may be also.  On the other hand, if a pair of solutions
are inconsistent, then one or the other or both must be
inaccurate.  This has an important bearing on the value of k
determined from the Factor Analysis method (k = number of
statistically significant solutions to the matrix problem)
as outlined in Section 2.  It seems reasonable that if there
are not more than k* solutions of the integral equation
formulation that can be reasonably accurate from a mathematical
point of view, then using more than k* solutions to the
matrix problem appears to be of doubtful value, even if the
appropriate statistical tests indicate that they are "statis-
tically" significant (k*
-------












00
CM
4






in
CM





CM
•








mm
W
MM
O
J
e
en
u.
o
tn
z
o
p
§
LU
V,
5
liJ
a:


1 1
n oo iv
>J CM CM

CO
~~~
U)
CO
^ 	

CM
CO
• CM
CM
• oo _
O CO
co _
o 2
CM TJ
•

co to
ID f ™ °° O —
*• ** CM *4 ^
^ • •
• 8
•«




0)
1
H

cn
o
TORIN
O
Z
o
CM
8

« s •
—
in • J
co
CM



10
CM




CM


CM
CM



O
CM


00


— ^
jn n ^ 0H
en A «y W
^ • * _
CM CT)
A °° —

 - "T «*>
« . .
"^

—
CO
A in
~  ^ —
co 0
1 1 1 1 1 1 ^ 1 1 1 1 1 1 1
ID



—


CM



O

C\ICMCMC\JCMCMOJ«-«-^-«— ^-^-^-


















T
u
a:
3

iZ

















92

-------
covariance matrix would be of rank 27 instead of a more
desirable rank of 40  (see Appendix B).  Inspection of the
data for missing days by stations revealed that there were a
large number of days on which data for only one or two
stations were missing.  It was felt that it would be desirable
to "interpolate" data for the missing stations under such
circumstances.  This would increase the validity of the data
as a whole without seriously degrading that for the single
station at which the interpolated data was inserted.  The
result of this synthetic increase in the data base resulted
in an increase to 59 effective data days, which in turn
assured a full rank of 40 for the covariance matrix.

     The area covered by the stations was divided into two
systems of non-overlapping triangles covering a region with
common exterior boundary in each case.  The station coordinates
and the triangle assignments are listed in Tables IV-1 and
IV-2.  It will be noted in Table IV-1 that though the areas
assigned to points are correlated, there are also some large
variations between the areas assigned to a given point, the
ratio in some instances being larger than 2:1.  The arrange-
ments of the triangles are also shown in Figures IV-6 and
IV-1.  The station coordinates are in grid units as in Ruff
(1973a).  The areas assigned to the points in Table IV-1
are in square grid units and represent one third of the area
of the triangles which have the point concerned as a common
vertex as on page 79.

     The triangle assignments were made in order to obtain
quadrature factors as outlined in Sections Ib and Ic.  Only
the points shown in Figure IV-5 with coordinates given in
Table IV-1 at which covariance data are available are given
a priori.  To obtain quadrature factors which are associated
with each of these points (that is, the areas that have been
tabulated in Table IV-1 under the columns headed 1 and 2
                          93

-------
Table IV-1.   Station Coordinate and Area Assigments
Station
Index
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
Coordinates
X
19.42
20.16
18.66
20.24
17.72
21.18
18.12
20.76
16.22
20.48
16.52
22.48
18.04
22.32
16.32
20.42
14.14
21.02
14.16
22.44
14.96
24.14
16.88
20.24
18.50
18.88
15.42
22.30
13.78
23.26
11.14
24.76
10.34
27.10
10.68
26.04
13.96
23.54
14.74
16.64
y
20.86
20.06
19.16
22.36
20.14
21.20
22.50
19.20
21.16
17.88
18.92
18.64
17.76
16.64
24.12
25.20
21.46
23.84
20.26
20.64
18.04
19.78
15.88
15.80
28.42
14.92
27.88
26.86
25.06
25.04
23.64
23.22
20.34
19.98
17.18
17.06
16.06
14.52
14.32
13.04
Area
j 1
1.77
2.00
1.50
3.53
3.35
2.48
4.70
3.89
3.65
1.53
3.22
1.59
5.68
8.63
11.14
9.87
9.19
3o88
4.97
6.50
4.80
3.62
4.48
3.42
4.20
5.90
6.01
0.90
2.47
3.20
5.08
5.71
4o01
5.96
5.10
3.14
3.51
4.70
3.75
2.71
2
3.20
1.15
2.89 .-
3.50
1.76
3.16
5.04
2.46
4.96
3.60
3.40
4.52
4.48
4.87
10.85
7.77
7.29
3.94
6.39
2.83
6.20
5.24
6.34
4.77
6.10
3.88
3.46
2.38
4.59
2.44
3.61
5.43
4.71
3.56
4.97
5.15
3.09
4.70
1.77
3.57
                               94

-------
Table iv-2.   Triangle Assignments
Triangle
Index
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30



No. 2
1
26
24
14
14
14
14
20
20
18
16
16
16
25
16
15
15
15
15
17
17
17
19
19
21
21
23
13
26
13
13



2
38
26
24
36
34
22
22
32
20
18
30
25
28
25
16
27
29
17
31
33
19
35
21
37
23
26
23
39
24
14



3
40
38
38
38
36
34
34
34
32
32
32
30
30
27
27
29
31
31
33
35
35
37
37
39
39
39
26
40
26
24



No. 1
1
26
24
14
14
12
12
22
22
20
6
4
4
18
16
16
16
15
15
15
15
17
17
17
19
19
21
21
23
23
23



2
38
26
24
36
14
22
34
32
22
20
6
18
30
18
28
25
16
25
27
17
29
31
19
33
21
35
23
37
39
26



3
40
38
38
38
36
36
36
34
34
32
32
32
32
30
30
28
25
27
29
29
31
33
33
35
35
37
37
39
40
40



Triangle
Index
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
No. 2
1
10
8
8
12
8
8
2
2
6
4
4
4
4
7
7
9
9
9
11
11
5
3
3
8
2
2
1
1
1
4
1
5
5
2
13
10
12
14
12
20
8
6
18
6
16
15
7
15
9
17
19
11
21
13
11
5
8
10
3
3
2
2
6
6
5
7
9
3
14
14
14
22
22
22
20
20
20
18
18
16
15
17
17
19
21
21
23
23
13
13
13
13
8
5
5
6
7
7
7
9
11
No. 1
1
23
13
10
10
10
8
8
12
6
2
1
1
1
4
7
7
7
9
9
9
11
11
13
3
3
3
2
1
1
3
5
1
1
2
24
23
13
14
12
10
12
20
8
6
2
4
4
7
16
15
9
15
17
11
19
13
21
11
10
8
3
2
3
5
9
5
7
3
26
24
24
24
14
12
20
22
20
8
6
6
7
18
18
16
15
17
19
19
21
21
23
13
13
10
8
3
5
11
11
9
9
                               95

-------
96

-------
97

-------
since we are to compare two different ways of obtaining
the quadrature factors or areas)  the entire region is covered
by non-overlapping triangles and to each point is assigned
the area equal to one third of the area of all triangles with
a vertex at this point=  This covering of the region by
triangles is not unique, so we say that we make triangle
"assignment" or "assignments" since there is some freedom
of choice here.  For the purpose at hand,, only two such
"assignments" are made„  Many others could have been made.

     The areas, or quadrature factors, for each of the two
assignments of the triangle coverage  (Table IV-1) that are
associated with each point are "correlated" in the sense that
if for one assignment of triangle coverage the quadrature
factor (or area) associated with a given point is small (large)
then for another assignment of triangle coverage it will
also tend to be small (or large).  In other words, if one were
to compute the product moment correlation coefficient for
the areas (quadrature factors) shown in columns 1 and 2
under Area in Table IV-1, this correlation coefficient would
be positive and significantly different from zero.  This is
simply due to the fact that, as shown in Figure IV-5, in some
areas of the region the points are more dense than in others.
The statement that the areas  (or quadrature factors) are
correlated would be true for any two ways of covering the
area with non-overlapping triangles^ not just the two that
happen to have been selected here-

     2.  Quadrature Factors and Techniques
     In order to obtain a comparison between the proper
values/functions for the different area assignments  (triangle
coverages) these were computed and compared using the "trap-
ezoid method" and the "product integral technique".  The
"trapezoid method" refers to the assignment of quadrature
                          98

-------
factors on the basis of fitting the entire integrand as a
plane over each triangle.  The "product integral technique"
refers to fitting each factor in the integrand separately as
a plane over each triangle.

     With reference to the quadrature factors used in the
"trapezoid method", a search was made of the literature on
the subject.  It was found that the texts Davis and Rabinowitz
(1967) and Stroud  (1971) were devoted to more mathematical
aspects of the quadrature problem in several dimensions.
This was true also of papers found in the recent literature
such as Ewing (1941), Tyler  (1953), Synge (1953), Mises (1954),
Thacher (1957),  Hammer and Wymore  (1957), Hammer and Stroud
(1958), Albrecht and Collatz (1958), Stroud  (1960a and 1960b),
Ceschino and Letin (1972) to mention only a few.  Even
Dixon (1973), a survey paper, did not list multidimensional
quadrature formulas that could be applied.  Almost invariably
mathematical interest has been confined to quadrature formulas
wherein the function values are given at arrays of points
prescribed in advance.  These are, of course, useless when
the data points are given in an essentially random manner.  It
was found that Mises  (1936) did give a formula that could be
applied to an arbitrarily given triangle.  It is for this
reason that in Section Ib the derivation of the quadrature
factors for the "trapezoid method" was developed in some
detail from "first principles" and the implications of the
use of such a formula (especially the ambiguity of the triangle
selections)  were pointed out.  We have not seen this kind of
treatment of the problem in any published work.  (On the
other hand,  it is so elementary that we feel sure that it
must have been treated before somewhere and that our search
has not been sufficiently exhaustive to find it.)
                           99

-------
     With respect to the "product integral technique",  it
was found after the method of Section Ic had been developed
that a somewhat similar procedure had been developed by
Boland and Duris (1969 and 1971)  and Boland (1972)  but,
again, only for the case in which the function is evaluated
at a prescribed regular array of points in one dimension.
See also Beard (1947)  for an earlier treatment, also in one
dimension.  We have not found anything treating the case at
hand, arbitrarily preassigned points in two dimensions.

     The proper values so obtained are shown in Figure IV-8
for the trapezoid method and Figure IV-9 for the product
integral method.  Each figure shows the proper values for
each triangle assignment for index values 20-40, one by a
triangle and the other by a dot which appear one above the
other.  For the index values 1-19 the differences were so
small that the points are scarcely distinguishable and
consequently only the dots are shown.  The differences in
the assignments of the areas to each data point apparently
make little difference in the proper values obtained.  On
the other hand, comparison between figures shows at once
that for index 5 through 40 the proper values obtained by
using the product integral method of evaluating the integral
yields proper values that are distinctly less than those
obtained by the trapezoid method.  The first ten proper
values are compared in more detail in Table IV-3 in which
the ratio decreases somewhat irregularly from 0.9 at index 1
to near 0.5 at index 5 and higher.  This is discussed in
more detail on p. 104.

     The "knee" in the curve of log proper value vs index at
index 5 or 6 represents an interesting phenomena which has
been noted by Craddock and Flintoff  (1970) and investigated
                           100

-------
UJ
Ul
0.
O
o:
a.
     10.0
      1.0
      0. 1
     o.ot
    0.001
                                  I
I
         0       10      20       30      40       50       60       70
                             PROPER VALUE INDEX

                                  FIGURE IV-8


          PROPER VALUES FOR DIFFERENT TRIANGLE ASSIGNMENTS USING
          THE TRAPEZOID METHOD.
                                   101

-------
     10.0
      1.0
      0. 1
u
D
DC
LJ
Q-
O
ft
0_
0.01
    0.001
                                    I
                 10
                     20       30       40

                      PROPER VALUE INDEX
50
60
70
                              FIGURE 1V-9


        PROPER VALUES FOR THE DIFFERENT TRIANGLE ASSIGNMENTS

               USING THE PRODUCT INTEGRAL METHOD
                                 102

-------
Table IV-3.   Comparison of the First Ten Proper Values by
             the Different Computing Methods
   Index    Trapezoid
      Proper Values
                  Product
                  Integral
                  Ratio
     1
     2
     3
     4
     5

     6
     7
     8
     9
    10
6.28810
3.02629
2.23924
0.98689
0.57566

0.35671
0.32275
0.29378
0.26159
0.23855
5.71721
2.49737
1.88175
0.68170
0.24709

0.18841
0.16377
0.16263
0.11669
0.11485
0.90921
0.82522
0.84035
0.69075
0.42923

0.52819
0.50742
0.55358
0.44608
0.48145
                            103

-------
synthetically by Farmer (1971) .  Craddock and Flintoff
remark to the effect that this marks the point beyond which
the proper functions appear to be more or less random, but
they give no firm criteria on which they base their measure
of randomness.  We will be able to provide a criterion that
seems to fit the situation rather well.

     In order to compare the proper functions obtained by
the two triangle subdivisions and the two quadrature techniques
the "self consistency" test parameter, C, , i.e., the corre-
lation of the proper functions for the methods being compared,
was computed.  The values of |ck| as a function of index
number are illustrated in Figures IV-10 and IV-11.  The
absolute value is used since there is always a basic ambiguity
in the sign of the proper functions.*  In Figure IV-10 it is
to be noted that the first significant drop in the self
consistency test parameter occurs after the 7'th index where
the quadrature by the trapezoid method is used and after the
6'th index where quadrature is by the product integral
method.  These are one index higher than the knee in the
curves of log proper value against index.  Although some of
the test parameter values are reasonably large for higher
values of the index, it does not seem prudent to admit
validity to the corresponding proper functions.  Each proper
function is orthogonal to those of lower index and low
values of the self consistency criterion precede these
larger self consistency criterion values.
     The proper functions are the solutions of the integral
     equation  (12), page 70  (or of its algebraic counterpart
     when formulated in discrete terms).  This integral
     equation  is "homogeneous".  It is readily seen that
     if  (x)  is a  solution,  then -4>(x)  is also a solution.  When
     solved in discrete terms,  a standard eigenvalue/eigenfunction
     computation routine may be used.   Whether one obtains  (x)  or
     - (x)  from such a computation routine is more or less a matter
     of chance.  Either one  is  valid.

                             104

-------
                                                                      u
                                                                          en

                                                                          z
                                                                          LU
                                                                          tn
                                                                          en
                                                                          15

                                                                          <

                                                                          2
                                                                          i-

                                                                          i-

                                                                          UJ
                                                                          a:
                                                                          LU
                                                                          u.
                                                                          LL
                                                                          §
                                                                          o
                                                                          o;
                                                                          LU
                                                                          a.
                                                                          o
                                                                          a:
                                                                          a.


                                                                          LU
                                                                          LU


                                                                          LU
                                                                          m


                                                                          o
                                                                          P
                                                                          LU
                                                                          £
                                                                          o;
                                                                          o
                                                                          o
SNOl-LONnj  H3dOdd N33MJ.38
                                 105

-------
     The self consistency criterion between the different
quadrature methods for the two triangle subdivisions is
illustrated in Figure IV-11.  The situation seems to be
similar to that of the comparison between triangle subdivisions
of the preceding figure except that the first "break" in the
test parameter appears to occur following the fourth index.
It is to be noted that this is the point at which the ratio
of the proper values obtained took a very abrupt drop to
below 0.5 (and remained in the neighborhood of 0.5 thereafter).

     The reasons for both the abrupt drop in the ratio of
proper values for the two different computing procedures
(trapezoid vs product-integral methods) and the drop in the
correlation between proper functions for these two computing
procedures lies in the fact that the product-integral technique
is in effect a procedure which "smooths" the kernel function
as compared with the trapezoid method.  Thus, from  (13),
page 71, for the trapezoid method we have (slight obvious
change in notation)
     A<|> .  =  Z K. .A. .
      Vi    j  ID IT J

while from  (18), page 86, for the product-integral technique

     X.  =  £   (ZK. A  )
       i    j   k IK K:  j

where A.  is defined on page 79 and A, . on page 86.  In order
       J                              K3
that the relations above be equivalent, it needs to be shown
that
     Z K., A,  . = K ..A.
     k  ik kj     rj 3
                                            *
where A. and A,  . are defined as above and K  ..  is the required
       D      KD                             iO
smoothed value.  In other words, it will be  necessary to
show that
                           106

-------
SNOIlONHd
o                    o



N33MJ.38 NOIJ.V-13ddOO
                                                   I



                                                  UJ
                                                  o;

                                                  o
                                                       j

                                                       3
                                                     Ul C?


                                                     |s

                                                     *£
                                                     2§
                                                     < DC


                                                     §fc
                                                     Z<
                                                       UJ
                                                     o o;
                                                     ZUJ

                                                  - Sit
                                   D =

                                   C0°
                                   zo
                                   O 3E


                                   yuj


                                   If
                                   IV U
                                   rr a
                                                     o a z
                                                     K UJ UJ
                                                     Q-F-2
                                                       z z
                                                     i- D.
                                                         en
                                                     z  .g
                                                     o tn 5
                                                     UJ Z 0
                                                     a i2
                                                     a: o J
                                                     OLJ <
                                                     OH >
                        107

-------
     Z A
     k
           = A.
so that K. . will be the weighted average of  the values
K., .  Rather than go through the details of  a  formal
 1 1C
demonstration in the abstract, we consider only a  concrete
example which illustrates what is involved.  In the diagram
(no number) consider the point j and for the example  we
take j=6.  Let the triangle selection be such  that the
triangles with 6 as the common vertex be (2,3,6),  (3,8,6)
(8,9,6),  (9,2,6).
Since A, .=0 unless k=j=6 or the edge
                                           is  involved,  there
are only five values that are not  zero; An,,  A-,,  A,,,  Aoc,
                                          <£Q    JO    DO   OO
Ag, and these have the values as defined  on page  86:
     *36
     ^66
     ^86
            [(9,2,6) +  (2,3,6)]/12
            [(2,3,6) -I-  (3,8f6)]/12
            2[(2,3,6) +  (3,8,6) +  (8,9,6)  +  (9,2,6)1/12
            [(3,8,6) +  (8,9,6)112
            [(8,9,6) +  (9,2,6)1/12
where  (i,j,k) indicates the area of  the  triangle with
vertices P., P., P, and  (i,j,k) must be  a  legitimate triplet
          1   ]    K
from the specific triangle assignment  concerned.
these up, their total is
                                                  Adding
A26+A36+A66+A86+A96 =  [ (2 ' 3 ' 6)
                                       / 8 ' 6) + (8 ' 9/ 6)
                                                         ' 6) ] /3
                           108

-------
which is exactly the area (quadrature factor) A, as specified
on page 79.  This demonstrates the assertion in this particular
example.  The general case is not difficult to handle.

                           *
     The ordered ensemble A.  will start with reasonably large
        *   *              1
values A , X., 	 but these will decrease rapidly.  On the
other hand, the ordered ensemble of proper values, y., will
                                              *     -1-
generally not be nearly as large as those of A .  to start,
but will decrease rather slowly (the nature of the decrease
will depend on the ratio of the number of observations to
the order of the matrix concerned).  As a consequence, the
ordered proper values of the total matrix K.., A., will
                           *               ^O   a
approximate the values of A.  for small i, but will be
dominated by the proper values y.  for larger values of i.
The "change-over" point will be in the neighborhood of the
"knee" of the curve of log A.  vs i as illustrated in Figures
IV-8 and IV-9 but cannot be exactly located there.  The
effect of the "smoothing" property of the product-integral
technique is to drastically reduce the effect of the proper
values of the matrix of departure from the true covariances,
k. .  (i.e., all of the proper values y. are reduced in size
 ij                                   i             *
by smoothing)  while the effect on the true values A. is much
smaller.  The net effect is then to decrease the size of the
larger computed total proper values, A., when the product-
integral technique is used.

     Now that the smoothing of the covariance kernel in
the product-integral technique is established, we proceed
to consider the effect that this would have on the proper
values and proper functions.   Since we are dealing with an
empirically determined function, an important consideration
is the fact that every covariance value K..  is affected by
sampling variations and the values K..  (i.e., when i=j)
contain the residual variances in addition to the variance
                          109

-------
 of  the  true  values.   The  next  section  is  devoted  to  the
 method  for accounting for residual variances, but they have
 not been  accounted for at this point.  Thus, the  matrix of
 observed  variances and covariances may be written as the  sum
 of  two  matrices Kij=K.ii+k.ji where 1^ .  are the variances
 and covariances of the "true values" and  k.. are  the departures
 from the  true values  (which on the diagonal i=j may  be quite
 large).   Now the proper values of the  matrix K..,  say A.,
 are  associated with the proper values  of  the matrix  K.^ . ,  say \^,
and the proper values of the matrix k..,  say y., but we cannot
           *               *         13       1
write A.= A.+k.  where  A.,  A.,   y.  are all three ordered.   We
can only write A.  = A  + y,  where A.  is ordered, A,>A_>A-	A
                ia]D        i              j-^jn
and the subscripts a and b fall in some order to suit the
situation.  It will be generally true that the effect of
the residual variances and sampling variation on the proper
functions for the trapezoid method as compared with  the
product-integral technique is  of a somewhat different
character.  These are  dependent in a large measure on the
nature of the sampling variation displayed by k.. at the
off-diagonal points,  i^j.   In  the product-integral technique
these values are strongly smoothed while  in the trapezoid
method they are not.    As a consequence, one would expect
that there would be (after a. certain undetermined index
value)  a  larger difference (smaller correlation) between
proper functions computed by the different quadrature
methods for the same triangle  assignments than for different
triangle assignments for the same quadrature method.  This
is, of course, what is being illustrated  in Figures  IV-10
and  IV-11.

     The conclusion to be drawn from the  above comparison of
two quadrature methods  for two apparently equally valid
triangle  assignments  is that there are certainly no  more
                          110

-------
than 7  (or possibly 6) proper functions that are adequately
self-consistent and that if one wishes to take a more con-
servative attitude, this can be reduced to only 4.

     3.   Factor Analysis Methods
     In the preceding section it was shown that both the
assignment of the triangle coverage of the region .concerned
and the quadrature method used to evaluate the integral
had a strong effect on the self-consistency of the proper
functions computed.  It was also pointed out there that at
least one reason for this was due to the effect of sampling
variation and the fact that the residual variances had not
been removed.  Of these two, one may at least approximate
the residual variances and remove their effect by using
Factor Analysis methods.  This approach is considered in this
section.

     In order to obtain an estimate of the residual variances
contained in the St. Louis SO- data, resort was made to the
                           ii ~
abbreviated method due to Joreskog  (1962).  The method of
 ii
Joreskog was modified in that, in order to keep the advantages
of the integral equation formulation of the problem, it was
applied after the equations were formulated in a symmetrical
matrix form.  (That is, the covariance matrix was modified
by the proper quadrature factors for the trapezoid method;
the straight principal component formulation based on "equal
quadrature factors for all points" was not used.)

     The comparison of proper functions for the two methods
is shown in Figure IV-12.  It is to be noted in this figure
that the covariance of proper functions for the two triangle
subdivisions is reasonably high up through the 8'th proper
function, after which it takes an abrupt dip.  On the basis
of the argument that once this dip occurs, the proper
functions are essentially irrelevant due to the mathematical
                          111

-------
                                                 CM

                                                 T
                                                 - o
                                                 UJ


                                                 I
                                                 h.
                                                     f-U-
                                                     HU.
                                                     oo
                                                     31-
                                                   LJ «
                                                   j>0
                                                   CQ I Q
                                                   u o;
                                                   o u
SI -r "y \J
B J £ X
.  z2l-

z°!^
ui >.*52
Stultn


t3§§

aSo*j

   Z Ld
f- z a. o
  oo"1


£0a-
,. H H tn
52   z D
                                                       u o
                                                   
-------
 (not statistical) ambiguities of the quadrature formula,
we can assert that there are not more than 8 significant
proper functions.  This is indicated by the line A in Figure
IV-12.
                                      ii
     On the basis of the analysis of Joreskog's method using
the 5% level, there are at least 13 significantly different
proper values.  This is indicated by the line B in Figure IV-12,
It is to be noted that the drop in the correlation of the
values of the proper functions between 8 and 13 is apparently
transient in that the correlation between functions 12 and
13 is as high as that for those of index less than 8.  It
would appear that the statistical test for significantly
different proper values is far less restrictive than the
test based on the ambiguity of the mathematical problem.
After the 13'th proper function has been passed, the cor-
responding proper functions are very poorly correlated,
uniformly until the last one is reached.

     The correlation of proper functions shown in Figure IV-12
corresponds to that of Figure IV-10 (the points indicated
by the dots) except for the fact that in the case of Figure
IV-12 the proper functions correspond to the weighted co-
variance matrix modified by multiplying ahead and behind by
the square root of the diagonal of the inverse of the weighted
covariance matrix as described on page 66.  As shown there
 (page 66 and following) this has the effect of making the
resulting "residual variances" (that appear on the diagonal
after modification) all equal.  It is well known that if a
positive quantity is added to the diagonal of a symmetric
matrix, all of the proper values are increased by this amount
while the proper functions remain unchanged (Anderson, 1958).
                                          H
Figure IV-12 then implies that after the Joreskog modification
 (as described on page 66) the effect of the two different
                            113

-------
triangle assignments is to effectively make all corresponding
(same index) proper functions after the 13'th essentially
uncorrelated, while those with index 13 or less are highly-
correlated  (the three exceptions are discussed later).
This, of course, is primarily due to sampling variations alone
since the effect of the residual variances has been made
uniform after the matrix modification.  One is thence led
to conclude that the early "cut off" of the correlation
between corresponding proper functions shown in Figure IV-10
was due to the fact that the point to point variation of the
residual variances was a dominant factor.

     To summarize the situation in different terms, the
computation of the proper functions and proper values without
removing the residual variances  (more exactly, without
modifying the covariance matrix weighted by the quadrature
factors to give a uniform equivalent of the residual variances)
results in proper functions that become inconsistent (low
correlation of corresponding proper functions for different
weight factors of equal validity) at a very low index and for
which the consistency check  (correlation) behaves in a very
irregular way  (sometimes high, sometimes low) for the higher
                                           ii
index numbers.  On the other hand, if the Joreskog method is
used (wherein the weighted covariance matrix is modified to
obtain a uniform equivalent of the residual variances)  then
the consistency check  (correlation of proper functions for
different weight choices) clearly divides the proper functions
into two distinct groups; those with index lower than a certain
value (here, 13) that are rather uniformly self-consistent
(highly correlated for different weight choices), and those
with higher index value  (here, 14 and above) that are uniformly
inconsistent  (low correlation for different weight factor
choices).  This is precisely the result that we had hoped
to achieve.
                            114

-------
     The low values of correlation between proper functions at
index values 9, 10, 11 of Figure IV-12 are due to the fact
that the matrix method does not give self-consistent results
for the solution of the integral equation type problem when
different quadrature factors are used.  In other words,
the modification of the original matrix by the quadrature
factors has introduced considerations that are not accounted
for by the sole question of statistical significance.
                                 II
     It is to be noted that the Joreskog test is based on
proper values alone while that of the correlation test is
based on the proper functions alone.  On the other hand,
applications are all essentially based on the proper functions
themselves.  As a consequence, it is necessary to consider
further the question of the application of the proper functions
9-13.

     To see more clearly what is taking place in the con-
sistency check for different quadrature factors (weights)
the elements near the principal diagonal of the matrix

     cij =  I e*i(xk)e?(V

are tabulated in Table IV-4.  In this table the central column
headed "same" contains the values C..  on the principal
diagonal while the column on the left headed "order" is the
value of i.  The tabulated values are thence the correlation
of the proper functions of the same index but with different
but equally valid quadrature factors  (weights).  The diagonals
of the matrix C-.  that border the principal diagonal appear
in the adjoining columns of Table IV-4.  The values of C- ^+,
are listed under the columns headed by the values k=-3,-2,-l,
+l,+2,+3.  The absolute values of the numbers in the column
"same" are those shown in Figure IV-12.
                            115

-------
Table IV-4.  Correlation Between Proper Functions for Different
             Area Assignments.   The central column indicates
             proper functions of same order. Adjacent columns
             are the correlations between proper functions of
             different orders.
Order
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
-3
___
	
	
.013
.003
.001
.045
-.189
.070
.177
.200
.044
-.054
-.011
-.038
-.021
.055
-.011
.114
.075
-2
___
	
-.050
.062
-.027
-.025
.001
.226
.050
-.103
-.585*
-.231
.039
.015
.095
-.002
-.529
-.118
-.061
.198
-1
	
.002
.007
.006
-.134
.082
-.176
.308
-.017
-.297
-.575*
-.043
.074
-.002
-.824
-.040
.007
-.037
-.445
-.367
Same
.982
.974
.974
.967
.935
.928
.858
.862
-.671
-.193
-.391
-.929
.965
-.098
.072
-.292
-.338
.163
-.004
.397
+1
-.005
.010
-.007
.133
-.016
.087
-.212
-.091
.674*
.880*
.184
.060
.057
.099
-.011
-.199
.057
-.027
.341
.445
+2
+ .049
-.061
+ .031
.014
.024
-.242
.254
-.128
-.092
.028
.024
.009
.025
-.207
-.103
.398
-.037
-.162
-.018
.024
+ 3
-.013
.015
-.009
-.072
.229
.139
.223
-.041
-.191
-.018
-.026
.030
.025
.534
.068
.047
.062
.190
.115
-.017
*  See text for a discussion of these values
                            116

-------
     In Table IV-4 the large off-diagonal correlations for
proper functions 9, 10 and 11 are marked with an asterisk, *.
It is indicated that proper function 9 is highly correlated
with itself and 10, 10 with 11, and 11 with 10 and 9.  In
other words, as the weights (quadrature factors) are changed
the proper functions 9, 10 and 11 shift about among themselves.
                    *      *   4      #
In more detail, if 0.(x,)=0., 9v(x,)=0. are the i'th proper
                    1  K.   1   1  K   1
functions for two different but equally valid quadrature
factor assignments, we have the approximate relations

     6* «* -0.6718* + 0.6740*0

     6*  ~                    + 0.8806*
      10                              11

     9^ «* -0.5859* - 0.5750*Q

(The relations would be exact if all 40 proper functions
were listed on the right with appropriate coefficients from
the full table.)  Thus 0*^ shifts to 0*Q while 0* and 0*Q
go into 0* and 0*  (via a rotation of about 135°).   (All
of this is, of course, dependent on the relations between the
two quadrature factor selections and would be invalid for
any other pair of such quadrature factors.)  Further information
on this phenomena is provided by the proper values of the
modified weighted covariance matrix which are illustrated
in Figure IV-13.  It is seen there that proper values 9
and 10 are very nearly equal.

     When two (or more) proper values of a matrix (symmetric,
positive definite)  are equal, the proper functions cor-
responding to these are not both uniquely defined (Anderson,
1958).  Apparently this pair of proper values are sufficiently
close to each other that the change in the quadrature factors
used was sufficient to show up this "near indeterminancy".
                          117

-------
     Table IV-4 for order numbers 14-20 shows the small
correlations for C..  of Figure IV-12 in the column headed
"same", generally small correlations in the off-diagonal
positions, but a few scattered larger values but no apparent
pattern.

     The residual variances depend strongly on the point
at which the number of significant proper values/functions
is terminated.  These are shown in Table IV-5 for a termination
of 8 or 13 such.  Only the values for one subdivision of
the area into non-overlapping triangles is shown since to
the number of digits shown in this table, the results of the
two subdivisions into triangles were identical.  It is to be
noted that, since S0~ concentrations are approximately log-
normally distributed, the variances are those of the logarithm
of the SO- concentration.  The station-to-station variation
of the fraction of the residual variance compared with the
total residual variance is to be noted.  These are particularly
large at stations 2,  9, 10, and 30.  It is presumed that
this is due to station instrumentation or instrument exposure
at these locations since these locations are not obviously
related to each other  (i.e., other stations with relatively
small residual variances lie between each pair  (see Figures
IV-5, 6, 7).

     The residual variance computed for station 9 clearly
indicates that all 13 of the proper functions are significant
in the case at hand.   If only the first 8 proper functions
are used, the computed residual variance is larger than the
total variance, which is quite impossible.

     It was pointed out above that the residual variance
at locations 2, 9, 10, 30 appear to be quite large.  A remark
on the significance of the ratio of residual variance to
                             118

-------
Table IV-5.
Total Variances, Residual Variance and the
Fraction of the Total Variance for St. Louis
S02 Data at 40 Stations.  The abbreviation P.V./F.
stands for proper values/functions.  The number
preceding this abbreviation indicates the number
of P.V./F.'s used in computing the residual variances
shown.  See text, p. 118.
Station
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
Total
Variance
.0497
.0829
.0374
.0583
.0269
.0982
.0738
.0784
.0945
.0376
.0295
.0758
.0352
.1118
.0976
.1938
,0269
.0532
.1155
.0567
.0778
.0684
.0854
.1468
.1527
.1183
.1352
.1765
.1195
.1067
.1100
.0743
.1299
.0976
.0861
.1051
.1114
.0570
.1016
.0428
8 P.V.
Residual
Variance
.0049
.0351
.0072
.0121
.0029
.0151
.0086
.0048
.1050
.0208
.0026
.0028
.0049
.0229
.0114
.0184
.0060
.0051
.0341
.0090
.0040
.0074
.0087
.0159
.0139
.0071
.0070
.0378
.0047
.0752
.0111
.0129
.0186
.0126
.0222
.0141
.0059
.0126
.0126
.0125
/F.
Fraction
.10
.42
.19
.21
.11
.15
.12
.06
1.11
.55
.09
.04
.14
.20
.12
.09
.22
.10
.30
.16
.05
.11
.10
.11
.09
.06
.05
.21
.04
.70
.10
.17
.14
.13
.26
.13
.05
.22
.12
.29
13 P.V.
Residual
Variance
.0026
.0183
.0037
.0063
.0015
.0079
.0045
.0025
.0548
.0109
.0014
.0014
.0026
.0120
.0059
.0096
.0031
.0027
.0178
.0047
.0021
.0038
.0045
.0083
.0073
.0037
.0036
.0197
.0024
.0392
.0058
.0067
.0097
.0066
.0116
.0074
.0031
.0066
.0066
.0065
/F.
Fraction
.05
.22
.10
.11
.06
.08
.06
.03
.58
.29
.05
.02
.07
.11
.06
.05
.12
.05
.15
.08
.03
.06
.05
.06
.05
.03
.03
.11
.02
.36
.05
.09
.07
.07
.13
.07
.03
.12
.06
.15
                             119

-------
total variance may be appropriate at this point.   The ratios
in the column headed fraction, F, are the values

          7   ?    o
     F = a /(a  + a )
          r/ \      T>

       2                               2
where a  is the residual variance and a  is the "true"
variance.  The ratio of the residual to "true" variance is
given by

     aj/a2 = F/(1-F) .

The ratio of the standard deviation of the residuals to the
"true" standard deviation is the square root of the above.
Some values are tabulated in Table IV-6.
Table IV-6.  Ratio of Standard Deviation of Residuals (a )
             to the Standard Deviation of "True" Values
             (a) as a Function of the Ratio of Residual
             Variance to Total Variance (F)
F
0.00
0.05
0.10
0.15
0.20
tfr/a
0.00
0.23
0.33
0.42
0.50
F
0.30
0.40
0.50
0.60
1.00
or/a
0.66
0.82
1.00
1.22
CO
It is to be noted that in the case of the four locations
noted above all have a value of F in excess of 0.20 which
means that the standard deviation of the residuals is more
than 50% of the standard deviation of "true" values.
Table IV-5 lists 13 locations with F values of 0.10 or more
for which the standard deviation of residuals is more than
33% of the standard deviation of "true" values.  These seem
to be rather large.
                       120

-------
     The significance of the residuals should also be noted
here.  In the type of analysis that has been made, the term
residual includes not only the errors of observation, but
also the ability of the station network to resolve small
scale effects.  Thus, the size of the residuals is dependent
on the density of the pollutant concentration measurement
points.

     It was pointed out earlier that the proper functions
were not comparable with each other for various triangle
subdivisions or quadrature techniques after a certain index
had been reached and that this index seemed to be associated
with the index at which a "knee" appeared in the plot of
log-proper value against index number.  The proper values of
the matrix after being modified by multiplying ahead and
behind by the square root of the diagonal of the inverse
                                     it
matrix, an essential feature of the Joreskog technique, are
shown in Figure IV-13.   The  numerical values of the proper
values have been radically changed, but the shape of the
curve of the logarithm of the proper value against its index
remains about the same with the exception that the "knee"
has been removed and is replaced by a gradual change of
slope between index numbers 5 and 10.

     In view of the preceding analysis, we would then conclude
that the criterion of Craddock and Flintoff (1970) and
Farmer  (1971) is not really a measure of the significant
number of proper values/functions but is a phenomenon associated
with the fact that the residual variances have not been
adequately treated by a principal component analysis.  The
Factor Analysis procedures used here give different results
that seem to be self consistent and which in addition provide
a quantitative estimate of the residual variances that were
hitherto ignored.
                            121

-------
   1000
                          LOGARITHMS OF  PROPER  VALUE PLOTTED
                     AGAINST ITS INDEX NUMBER FOR ST. LOUIS  SO
                     DATA   THESE VALUES APPLY  AFTER THE CO-
                     VARIANCE MATRIX  HAS  BEEN MULTIPLIED AHEAD
                     AND BEHIND  B:  THE SQUARE ROOT OF  THE DIA-
                     GONAL OF THi. INV£r
-------
     Note that in Figures IV-10, IV-11 and IV-12 the
correlation of the 40'th proper function for different
quadrature methods (Figure IV-11)  and for different quadrature
factors  (Figures IV-10 and IV-12)  is extremely large and is
an isolated point in Figure IV-12.  This is a mathematical
phenomena and has nothing to do with the fields involved.
As a matter of practical experience, the larger the index,
the more oscillatory the proper functions.  This is strictly
true of what has been called an "oscillatory" matrix (Gant-
macher, 1960b).   We have not found a proof for any other type
of matrices and one can in fact construct symmetric positive
definite matrices for which the opposite is true.  We strongly
suspect that "oscillatory" matrices do not exhaust the class
of matrices with this property, but characterize a particular
class of matrices for which this property could be proved.
With the fact that experience indicates a highly oscillatory
behavior for the last proper function, we also suspect that
there is a tendency for these oscillations to either be in
or out of phase for matrices that are similar to each other,
as is the case for the matrices concerned here.  This would
account for the high correlation observed for proper functions
No. 40.

D.   Summary and Conclusions
     In this chapter, the subject of the analysis of co-
variance matrices has been treated in some detail, since it
is of great importance in the analysis of data fields of
all kinds and in particular to the fields of pollution con-
centration data, especially in connection with the problem
of optimum observation station locations to which it is
applied in the computer program that is discussed in detail
in the final chapter.  Section A was devoted to a general
look at the problem, Section B to a detailed discussion
of how to determine the residual variances and the number
                          123

-------
of significant proper values and proper functions required
to describe the statistical properties of the pollutant con-
centration field using Factor Analysis methods but modified
by the fact that one is dealing with a two dimensional field
and should use a basic integral equation formulation to
account for variable data density.   The techniques developed
in Section B were applied to St. Louis S0~ concentration data
in Section C.

     As a result of the analysis in Section C, it was found
that the Factor Analysis technique as modified by the integral
equation formulation gave (1) a quantitative evaluation
of the number of significant proper values and proper functions
required to describe the pollution field, (2) a quantitative
evaluation of the residual variances in such detail that
unique values were assigned each observation location
separately (much more than a general estimate),  (3) that the
method was consistent in spite of the fact that there are
mathematical ambiguities in the quadrature process  (evaluation
of the integrals involved), and (4) that the self-consistency
of the solution from the mathematical point of view cor-
responded very closely to the tests for significance from
the statistical point of view.  It was further concluded
that the Factor Analysis approach is a prime requisite for
the success of the analysis since when this approach is
disregarded the results tests for mathematical self-consistency
and for statistical significance are not completely compatible
with each other; and it was shown that this results from
ignoring the importance of the residual variances.
                             124

-------
                          CHAPTER V
              PROGRAM BAST AND ITS SUBROUTINES
     The program for an optimal location of pollution observation
points (BAST) and its twelve associated subroutines are described
in some detail in the following sections.  The interrelations
between these subroutines and the main program is shown in the
following table.  The order in which they are discussed follows

                           TABLE 1
             THE RELATIONS BETWEEN PROGRAM BAST
                 AND ITS SEVERAL SUBROUTINES
SUBROUTINES
CALLED
CIRCUM (B)
FMFP (C)
FUNB (D)
FUNCT (E)
CORFUN (F)
INT2D (G)
MATINV (H)
ADgPT (I)
TRIP IX (J)
PORDR (K)
AR2 (L)
TRITST (M)
PROGRAM
BAST (A)
/
/
/
/



/




CALLING SUBROUTINE
CIRCUM
=
/





/




FMFP3

=
/
/








FUNB


=
/








FUNCT5



=
/
/
/





ADOPT6







=
/
/
/
/
TRIFIX7








=

/

PORDR8









=
/

that indicated by the letters in parentheses in the subroutines
called column, BAST itself being considered first.
                              125

-------
     As shown in the table,  the subroutines are divided into
two categories:  those involved in computation of the station
location [(B) through (H)],  and those involved with book-
keeping required to preserve,  at all times, a satisfactory
subdivision of the area concerned into completely covering
but non-overlapping triangles [(I) through (M)].

A.  PROGRAM BAST
     This program is set up to accept NSTN coordinates
XS(I), YS(I), 1=1, NSTN, of already located observation stations
of which NBDY of these form the convex boundary of the area
covered by these station locations  (line 38).  The index
numbers of the boundary stations are entered as IBDY(I), 1=1,
NBDY, and should be in counterclockwise order around the
boundary but with any initial starting point  (line 41).  The
boundary of the area covered by the initial stations is a
convex polygon with the boundary stations at the vertices.
The area is convex in the sense that it must not have any
re-entrant corners.  Put in a different way,  if one traverses
the boundary in the counterclockwise direction, then at each
vertex, the line segment that forms the boundary is rotated
in a counterclockwise direction.   (The mathematical statement
is that the area is convex if the line joining any pair of
points in the area lies entirely within the area.)

     The area covered by the existing station location is
subdivided into non-overlapping triangles with stations
at the vertices (line 52).  The boundary segments are  sides
of some of these triangles.  There are NTR such triangles.
The index numbers of their vertices are ITR(I,1), ITR(I,2),
ITR(I,3), 1=1, NTR, and these must be in counterclockwise
order about each triangle.
                           12G

-------
     The entire region is interior to a circle with center
at  (XC,YC) and of radius R  (line 56).  Some of the stations
listed previously may lie on this circle.  If this is the
case, their azimuth must be specified.  The number of such
is NCIR and the point azimuth AZ(I), 1=1, NCIR is measured
counterclockwise (degrees) from East.

     Additional information to be specified  (line 81) are:
the number of additional stations to be located  (NADD), the
percent reduction required between successive maximum errors
of estimate (PMIN), the expected absolute error of the
minimization subroutine  (EPS), the maximum number of iterations
allowed for this subroutine  (LIMIT), and a control for print-
out of details (IPRNT).

     Since the program calls for computation of correlation
coefficients via  their proper value/function representation
at a grid of points covering a square 80 km on a side, the
coordinates of the grid points (lines 87,88), the proper
values (line 93)  proper functions  (line 96), and standard
deviations for weight parameters (line 101) are required
inputs.  The program, as it stands, is set up for 15 proper
values/functions on a 9x9 grid with 10 km spacing between
points.  The weight parameter  (line 101) corresponds to the
real statistical situation where expected standard deviations
of pollutant concentrations are used, but the weight parameter
may be arbitrary.  The program assumes that it may be
differentiated using finite  (1 km)  differences with sufficient
accuracy.

     If less than two existing stations are listed on the
circle that defines the region being considered, then two
such points are located on the circle (lines 114-135).
                            127

-------
     The criterion for station location is to the effect
that a "best" location is at a point where the error of
estimate using a linear least squares regression on sample
values at existing stations would be a maximum.  Since such
errors of estimate are bounded between zero at an existing
station and the variance of pollution concentration, it is
reasonable to look for such maxima at locations distant from
existing stations.  Consequently, candidate locations are
taken to be (a) at the center of gravity of the station
triangles or  (b) midway on the circle between stations
located on the circular boundary  (lines 135-173).  Of the
errors of estimate found at these locations, the largest
three are selected and the neighborhood of each is searched
to find the value of the local maximum in its neighborhood.
These local maxima are then compared and the new station
location assigned at that point which has the largest local
maximum  (lines 174-252).  A bit of manipulation is required
(lines 253-275) to insure that new points on the circular
boundary do not wander outside the circle.

     The total number of stations located (NSTN) is then
checked against the total number required (NSTOP)  (line 287).
If this number has not been reached the process is repeated
by returning to line 135.  If the required number has been
reached the remainder of the program  (lines 290 through 322)
prints out the results and terminates.  The final lines 323
through 345 are devoted to various diagnostics.

     Note:  The subroutine FMFP is a standard subroutine
that locates the minimum of a function of several variables.
The procedure described above involves locating a maximum.
To "trick" FMFP into thinking a maximum is a minimum, the
mean square error of estimate is given (internally) a negative
sign.  Whenever the term minimum appears in subsequent des-
criptions it is to be understood in this sense.

                          128

-------
                 I READ INPUTS |
       I ADD BOUNDARY POINTS VIA CIRCUM |
           LOCATE C.G.  S OF  TRIANGLES
                      I
       GET FUNCTION VALUES VIA FUNCT.
                      I
           [SELECT BOUNDARY  POINTS  ]
                       r
        GET FUNCTION VALUES  VIA  FUNB
                      i
       J
    FIND THREE SMALLEST FUNCTION VALUES
                      I
 I GET LOCAL MIN.  FOR EACH OF THESE VIA FMFP \
                      I
       [PICK SMALLEST AS  THE GLOBAL  MINT}
                      I
      ADD THIS TO STATION LIST  AS  A

           BEST LOCATION VIA ADOPT
            YES
NO
                 FIGURE V-l

ABBREVIATED FLOW  CHART OF MAIN PROGRAM BAST
                     129

-------
            'DECK
20
35
            C
            c
            C
            c
            c
      PROGRAM 3AiT(  INPUT,  OUTPUT  )
      GtTERMlNES LOCATION  OF  »OINTS  QF LARGEST EK*GR OF ESTIMATE  FRO
      EXISTING N-T  AND  UCATc^  N'£W STATION THERE.  PSOCE-JURE  ITERATES
      UNTIL EITHER  TOTAL NO.  OF STATIONS REACHES PRESCRIBED SIZE  OR
      TILL THE PERCENT  REDUCTION OF  TH£ MAX EOFE IS dELOW PRESCRIBED
      AMOUNT
      CGMiIGN / 3LK1  / NSTN, N6JY,  NTR, XS(^0», YS(30), ITR(pO,3)
      , , XC, YC, R,  I3DY(20)
      GC.IMON / BLK?  / XD(9),  YD(9),  PH(15,9,9), ALAM(15), WO,9)
      CCMMON / BLK3  / NCIRt AZ(20)
      CLMrtON / PLK
-------
               3AST
 tO
 65
 70
 75
 85
 90
 95
100
105
110
C     NCI* = NUM3EF OF  POINTS  ON  THE  CIRCLE
      READ 1058,  XC, YC,  R,  NCIR
1058  FORMAT <3F10.0,I5)
C     AZIMUTH OF  POINTS ON THE CIRCLE
C     AZ(I», 1*1, NCIR, AZIMUTH  IN  DEGREES CC  FROM EAST
C     THESE MUST  BE BOUNDARY POINTS APPEARING IN THE LISTS XS(II, YSU).
C     NONE OF THE BOUNDARY POINTS  MAY LIE  OUTSIOE OF THE CIRCLE.
C     SOME OR ALL MAY LIE INSUE.  IF ALL  LIE INSIDE THEN NCIR = 0.
C     THEY MUST APPEAR  IN ORDER OF INCREASING AZIMUTH,  0 TO 360 uEGREtS.
      IF ( NCIR .LE. 0  )  GO  TO 15
      READ 1060,  ( AZ(I>. 1=1,  NCIR )
1060  FORMAT ( 9F8.0)
      DC 1  1=1, NCIR
1     AZ(I) = AZU) * .017<»53292
C     MISCELLANEOUS OTHER INPUT PARAMETERS
C     NAOO « NUMBER OF  POINTS  TO  3t AJOEO.
C     IF NCIR = 0, 1, THEN 2 OR 1  POINTS HILL BE ADOEO  ON THE CIRCLE
C     REGARDLESS  OF NAOO.  IF  THIS IS NOT  THE STOPPING  CRITERION, NAUO=0
C     PMlH * PERCENT REDUCTION 3ETHEEN SUCCESSIVE MAXIMUM EPPORS OF
C     ESTIMATE.   NO POINTS HILL BE ADDED AFTER PERCENT  REDUCTION GOES
C     BELOH THIS  VALUE.
C     EPS = EXPECTED ABSOLUTE  ERROR FOR SU8RPOUTINE  FMFP
C     LIMIT = MAX. NO.  OF ITERATIONS  FOR SUBROUTINE  FMFP
C     SUGGESTED i/ALUES  FOR EPS =  .01,  AND  LIMIT  = 5
C     IPRNT = REQUEST FOR PRINTOUT OF DETAILS OF THt INPUT OTHER
C     THAN COORDINATES  OF INITIAL  STATIONS
15    REAO 1062,  NAOO,  PMlN, EPS,  LIMIT, IPRUT
1062  FORMAT (15 ,F10 . 0,£10.1,215)
      TPI = 6.2831853
      NSTOP = NSTN * NAOO
C     COORDINATES FOR THE CORRELATION COEFFICIENT
C     PARAMETERS  XOU), YD1I>, I  = 1,  9
      REAJ 1056,  ( XO(I), I  =  1,  9 )
      REAJ 1056,  ( YOU) , I  =  1,  9 )
C
C     CORRELATION COEFFICIENT  DATA INPUT IN  TERMS OF
C     PROPER VALUES AND PROPER FUNCTIONS
C
      REAO 1055,  ( ALAM(I),  I  = 1, 15  I
1055  FORMAT (8F10.0)
      00 2 I * 1, 15
2     READ 1057,  ( ( PH(I,J,K>, K  = 1,  9), J = 1, 9  )
1057  FORMAT ( 16X, *»£16. 5/5E16. 8/ <<»E16. 8/5E16. 8 11
1056  FORMAT (9F8.0)
C     HEIGHT FUNCTION TABLE  FOR VALUES AT  POINTS XO(I),  YD(Jl
      00 3  J = 1, 9
      REAO 1056,  ( H(I,J), I = 1,  9 )
3     CONTINUE
C     OUTPUT OF THE INPUT STATIONS
2000  FORMAT (1H1)
2002  FORMAT 
      PRINT 2000
      PRINT 2010
2010  FORMAT (5X,»A. COORDINATES OF INITIAL  STATIONS*/5X,6HSERIAL,2X,
     lllHCOOROINATES/5Xt6HNUM8£R,5X,lHX,5X,1HYJ
                                           131

-------
               3AST
i?o
12?
130
13
1<*0
150
155
160
165
             13
                   PRINT  2012,  <  I,  XS(I),  YSd), 1 = 1, NSTN  )
             2312  FORMAT  (I10,F7.1,F6.1)
             C     Me. A3 ING FOR  RESULTS
                   Fn.EF =  1.
                   IF  ( NCIR  .GT.  1  )  GO TO 3
             C     ACJS ONE CR  TWO POINTS  ON TH£ CIRCLE
                   10  = 1  ? 1C  =  2
                   CALL CIRCUMJ  XC,  YC,  R,  NCIR 1
                   PRINT  2018
             2018  FORMAT  (5x,*e.  COORDINATES OF CIRCLE FCINTS  AJCEO*)
             C     FRISTS  COORDINATES  ANb  SERIAL NUMBERS OF  POINTS  ADDED ON CIRCLE
                   OC  6 I  = 1,  NCIR
                   II  = NSTN  -  NCIR  *  I
                   PRINT  2012,  lit XS(I1),  YSdl)
             6     CONTINUE
                   PRINT  201 » AZdl) ) /



2
LT. AZ(I) > ANG
 IF (  AZ(I1)
 CALL  FUN6( 1, ANG, F, G  >
 II =  NTR *• I
 XT (ID  = ANG
 SAMP(I1> = F
 NTOT  =  NTR * NCIR
 PRINT 13,  ( I, (  ITRd.JI,
,1=1, NTR )
 FORMAT  UI5t2E15.6tl5X,£15
 NTR1  =  NTR * 1
 DO 304  I = NTR1,  NTOT
 X(l)  =  XC » R * COS( XT(II
                                                   ANG
                                                          3. li»15926E<»
                                               J =  1,  3J,  XT(I),  YT(I), SAMP(I)
                                               6l
                                            132

-------
               3AST
170
175
180
185
190
195
200
205
210
215
220
      X(2> = XC * R » SIN( XT(I) )
304   PRINT 11, I, X(l», X(2), XTU), SAMP(I)
11    FORMAT (I5il5X,4£15.6l
c     PIC
-------
2?5
230
235
2 *.0

503

£02


50*
C
C




26

28



30

32

3k
3b


38

C



C


kO


IF 1 YT1PU) .EQ. 99.9
YTMP(J) = YC +• * » SIN
XTMD = XC * R * COS
CONTINUE
PRINT 50<*, ( XTMF(J) ,
PRINT bOi», ( XMAX(J>,
) 503, 502
( XTMP(J) 1
( XTMPCJ1 )

YT*P, STAMP(J), J = 1, 3 )
YMAX ( J), FMAV( J) , J = 1, 3 )
FORMAT (3(2F10.5, 2X,E12. 5) )
END TcMP CARDS
NOW ORDER THESE
DC 30 K = 1, 3
SML = 1.
00 28 J = 1, 3
IF f FMAX(J» .LT, SML
SML = FMAXtJ)
Jl = J
CONTINUE
JTRY (M = Jl
FM2(K» = SML
FMAX(Jl) = 1.
CONTINUE
IF ( F-3EF .GT. 0. ) 32
PER = 999.9
GO TO 36
PER = ( FBEF - FM2UJ
DO 38 J = 1, 3
XAU) = XMAXf JTRV(J»
YA(J) = YMAXC JTRYUI
FM2(J» = -FM2(J)
FBcF = -FM2<1)
ADDS NEW POINT TO LIST
XO = XA(1)
YO = YA(1)
CALL AOOPT( XO, YO )
ADDS TO CIRCLE LIST IF
oisr = SORT ( ( xo - xc
IF { OIST .Gt. ( R - 0
ANG = ATAN2( YO - YC,
IF ( ANG .LT. 0. ) ANG
DO kk I = 1, NCIR





) 26, 28







, 3 <»


) / FBEF

>
»






NECESSARY
)**2 » < YO - YC )**2 )
.1 ) ) <+0, 50
XO - XC )
= TPI <• ANG

11 = I *• 1 $ IF ( I ,£Q. NCIR > 11 = 1


IF ( AZ(I1) .LT. AZdl
IF 1 ANG .LT. AZ(I1) .
) GO TO <»2
ANO. ANG .GT. AZ(I» ) <<6, kk
k2 IF ( ANG .GT. Azdi .AND. ANG .LE. TPI » GO TO <*&

kk

2017

<+6





IF ( ANG .GE. 0. .AND.
CONTINUE
PRINT 2017, ANG
ANG .LT. AZ(Il) ) GO TO U6


FORMAT (* FAILED TO FINO NEW CIRCLE POINT FOR ANG =
CALL EXIT
IA = 11
12 = NCIR - 11 * 1
NCI* = NCIR + 1
00 *fl I = 1, 12
11 = NCIR t 1 - I
IM = 11 - 1







250
255
260
265
                    PRINT  2017,  ANG
                                                                            F10.5I
                    CALL  EXIT
             V6     IA  =  II
270




275          k&     AZ (ID  = AZ(IM)



                                        134

-------
   PROGRAM     3AST

                   AZfIA) = ANG
             50    CONTINUE
             C     PRINTS OUT THE RESULTS OF THIS PASS
                   PRINT 2020, NSTN, XA(1), YA(1), FM2C1),  PER,  (  XA(I),  YA(I),
2£0               , FH2(I) , I = 2,  3 )
             2020  FORMAT (I10,Fa.l,F6.1,tll.3,F8.2,F8.1,F6.1,E11.3,Fr.l,F6.1,£11.3)
                   ISPACE = ISPACE  * 1
                   IF (  ISPACE .EQ. 5  ) 52, 5*
             52    ISPACE = 0
265                PRINT 2002
             5*    CONTINUE
             C     HAS THE NUMBER ASKED FOR BEEN REACHED
                   IF (  NSTN .GE. NSTOP ) GO TO 900
             C     TRY AGAIN
290                GO TO 8
             C     PRINTS REST OF INPUTS
             900   IF (  IPRNT .NE.  1 ) GO TO 999
                   PRINT 2000
                   PRINT 202*
295                PRINT 2026, ( J, ITR(J,1), ITRU.2),  ITR(J,3),  J  =  1,  NTR )
             202*  FORtlAT (8X,*lNOEX NUMBERS OF TRIANGLE  VERTICES')
             2026  FORHAT <8(IX,12,1HI,13,1H,,I 3,1H,,13,1H/))
                   PRINT 2000
                   PRINT 2028
300          2028  FORMAT (5X,'PARAMETERS OF THE EMPIRICAL  CORRELATION COEFFICENTSV
                  /dX,*COOROINATES  OF  THE DATA GIRO'»
                   PRINT 2030, ( XD(I), I = 1,9 )
             2030  FORMAT (* X = *,9{F8.2,1H,M
                   PRINT 203?, ( YO(I), I = 1, 9 )
305          2032  FORMAT (* X = *,9(F8.2,1H,I)
             2038  FORMAT (1P9E12.*)
                   PRINT 20*0
             20*0  FORMAT ('WEIGHT  FUNCTION ARRAY IN  FORM OF  COORDINATE GRID*)
                   00 91* J = 1, 9
310                Jl =  10 - J
                   PRINT 2038, ( W(I,J1), I = 1, 9 )
             91*   CONTINUE
                   PRINT 2000
                   PRINT 20*2, XC,  YC, R
315          20*2  FORHATC* CUTER CIRCLE HAS CENTER  AT X  =  *,F7.3,2Xf» Y  = *,F7.3,2X,
                  ,' AND RADIUS = *,F«.3)
                   PRINT 2002
                   PRINT 20**, NADO, PMIN, EPS, LIMIT
             20**  FORMAT (* NUMBER OF POINTS TO BE  ADOtD = *,I5/» PERCENT CHANGE OF
320               'MAXIMUM ERROR OF ESTIMATE BETWEEN  ITERATIONS  =  »,F7.3/
                  »* EXPECTED ABSOLUTE ERROR FOR FMFP =  ',El0.3/»  MAXIMUM NUM3ER OF I
                  'TERATIONS OF FMFP = *,I5)
                   GO TO 999
             C     DIAGNOSTICS FROM FMFP
325          950   Nl = NSTN * 1
                   IF (  IER ,£Q. 1  ) 952, 95*
             952   PRINT 2050, Nl,  LIMIT
             2050  FCRMATC NO CONVERGENCE FOR POINT  NO. » ,I3,2X ,*IN» , 13, 2X,
                  "ITERATIONS')
330                GO TO 960
                                              135

-------
33?
3<+0
  3AST

95-*   IF ( IER .£0.  2  )  956,  959
95o   PRINT 2052, Nl,  LIMIT
20^2  FORMAT  (* NO  MAXIMUM,  POINT  NO.  = *,i3,2x,» ITERATIONS  =  *,I3»
      GO TO 960
958   PRINT 205<», Nl,  LIMIT
205<+  FORMAT(* ERRORS  IN GRADIENT.   POINT NC. = *,I3,2X»*  ITERATIONS  = *
     ,,I3»
960   IF ( II ,GT.  NTR )  GO  TO  9b2
      PRINT 2056, XT(T1),  YTdl),  STAMP(J), X(l», X(2),  F,  G<1),  G(2)
2Q56  FORMAT  (» INITIAL  VALUES*/2F10.5,E15.5/» FINAL  VALUES  ANC GRADIENT*
     **/2F10.5,3E15.5l
      Gu TO 23
962   PRINT 2058, XTflll,  STAMP(J),  X(l), F, G(l)
2058  FGRMAT(» INITIAL VALUES*/F10.5,E15.5/» FINAL  VALUES  AND GRADIENT*/
     /F10.5,?E15.5)
      Go TO 25
999   STOP
      ENO
                                             136

-------
BAST Code Sample Problem
     The input required for the BAST code is defined on the
following page.  Reference to the main subprogram and to
the text is helpful for obtaining further detail.  A sample
input case follows.  The case consists of eight fixed stations,
the last four of which are boundary points, i.e., their
coordinates define the convex region containing the eight
stations (data card groups 1-3).  A total of 10 triangles
cover the region defined by the eight stations  (card
groups 4 and 5).

     The region for which correlation coefficient data is
input, and to which additional stations are to be optimally
added, is defined by a circle centered at  (0,0), with a
radius of 30 km (card 6).  No input points on the circle
are indicated by card 6, so card 7 is not required.  In
this event, the code will automatically define two points
on the perimeter of the circle.

     Input card 8 specifies the number of stations to be
added, 17,  plus program controls for the minimization routine
(FMFP) and printing option.  Cards 9 and 10 contain the
standard set of coordinates for this problem, a 9 X 9 grid
with 10 km spacing in both the X and Y directions.  Card
group 11 lists the proper values of the correlation coefficient
matrix, in monotonic decreasing order.  The desired signif-
icance is obtained with the first 15 values.  Card group 12
contains the proper function coefficients, at each point in
the 9X9 grid,  corresponding to the 15 proper values.
Last, card group 13 is a 9 X 9 array of weight parameters
for the correlation coefficients.  The normal situation of
equal weighting,  all weights being 1.0, is input in this
case.
                          137

-------
     A sample output follows the input.   The fixed stations,
the stations added on the circle, and then the remaining
stations to be added are listed.  For each added station
the three prospective coordinate pairs of maximum estimate
of error (E of E)  are listed.  The optimization procedures,
if successful, varies these coordinates in obtaining the
location for which the greatest reduction in estimate of
error can be achieved.  Optionally, extensive mathematical
detail can also be output.
                              133

-------
CA3D GROUP
  NUMBER
                                                            FORMAT
                 OF
                                       MAXIMA OF 3o
    9
   10
          XS, YS



          N8DY
          I°DY

          NTR

          ITR CCW FKOM EAST* OF POINTS LYING ON    9E8.0
         CIRCLE,  TOTAL Qf NClR POINTS,  SKIP T^IS CARD  IF
              = 0.
         NUMBE» OF STATION^ TO HE fiDOED            IS,FlO . 0,E10. 1
         PF«CEMT PEOUCTIOM BtTwEEM MAXIMUM E-RRORS  OF
         ESTIMATE,  uo POINTS ADOEI AFTER ^RECENT  -?EDUCTIO\J
         GOES ?ELOd HMIN
         EXPECTED ABSOLUTE E^^oRt SU^ROUTINJP rMF=
         M/sXIMJM NJMaER Or ITERATIONS IN SUBROUTINE FMFP
         PPINT OPTION FLAG.  o» OUTPUT STATION COORDINATES
         ONLY,  i, DLTAILED JIAGMOSTICS PRINTFO.
                       FOR CORRELATION C°£FFICIEVT  INPUT   9F8.0
         Y_CORPlDNATtS FOR CORRELATION COEFFICIENT  INPUT   9F8.0
         DATA.  COOE ASSUMES 9 VALJES 0^" XO AND YD, FQR A
         9X9 GRID rflTH lo KM SoACI\|fi BETWEEN
   II



   12





   13
                                                           8F10.0
ALAM     PROPER VALUtS FOR CORRELATION COEFFICIENT DATA,
         1* VALUES.

PH(I,J,K) COEFFICIENTS OF 1= =>R03E3 FUNCTIONS, I = 1, 15,
         AT GRID POINTS SPEC^IED ^Y (XD»YD), ORDERED FOR  SElb.9/
         XD VALUES. KOR EACH YD VALUE < I.E..              UEI^.B,
         (( PH(J,J,K), K = i,9 ), J = 1,9 )  >.             5E16.S)
                FUNCTION TaPLE FOR POINTS (XO,YD),
                                                           9F8.0
                                      139

-------
                              t   1NJDJT
     C A '7 n =5 1. T  J «
o.        :..
-.5       u.b
-4.75     -2.5
-.5       -<+.
t.5       -8.
4.5       -2.5
3.5       9.
-22.      7.5
                                                                         tt » # »
»o
p
5
I
3
1
2
1
2
2
•»»««
0.
U4HH1
17
•!>»«•«•
-40.
-40.
»»•»«•
CAPO SET
5
4
6
4
6
6
2
7
8
CA^D 6
0.
CARD H
.01
-3C.
-30.
CA3D cFT

3
3
1
i
1
2
7
3
8
3
»»4«tf»»«»»«»»»«»«»»««ftft»««ft««»
30. 0
«««*»»»»«*«»*»«»»»»»»»»*»*»««»
i ^ •
• * !
-20. -10. 0. 10. 20. 30. 40.
-20. -10. 0. 10. 20. 30. 4o.
*i»«*»»»*o***»«»»»»<»»«*»#***»»**
28.909829930.4315128
  .73^2799  .5133393
.3574174
                                         2.6616002 2.196S837 1.220i544  1.1634856
                                 331^036   .2512566  .2123830  .1081351
PROPER FJNC.  1
  7.8273254PE-02
  4.91204939E-03
  9.71762U28E-02
 -1.13289246E-02
  9.54481463E-02
9. 72^90823^-^2
1 . H649P 5^4r_o2
1 . ! lSOi4^o^-'J 1
1 . 3oB771 76---03
1 . 31 071 3
2.°1
1 . 0*
3
1
2
                                             t -02
                                           72--0 1
                                             - -0 1
-5.52873886E-02 -5.2485044l--.Gc;  -
                            4.523>^0567E-02
                            1 . 0^6lH454p-0 1
                            b « 792^98 53r-02
                            1 . 152C>31 37r-01
                            6. 20^60851 c--02
                            1 . 41
                            2.
                                                                  &.57938454r-0
-------
   2.3Q534203t-02   1.
 -1.12859460E-01  -1.
 -1.53931315E-01
 -1.19492211E-01
 -1.63982476E-01
 -1.11995777E-01
 -1.66174576E-01
 -9.32753081E-02
 -1.57249352E-01
 -B.98516291E-02
PROPER FUNC.  2
 1.&8599575F-01
 2.58123666=:-|J3
 i.o937«»<»l7t-01
                                   P.P7039Q67---02
             01   1.58256^97^-01
             02
 1.57131973E-01   1.595157l7r-0l
             01
             01   i.42279!05r-01
             01
             01   1.2225b099;.--0l
-3.«»5392130P-02
 3.51570809£-C
-------
                                                   9. 59*) -^450^-02
   9.63918369E-OJ   1.: 91 7*690 ~—jtL  -5.1
  D.4336^953t-02
 -1.09d276b7E-o:
 -2.54435856E-03
 -2.31335639E-Q1
PROPER FJNC.  5
  2.9470556GE-02
 -4.96574543E-02
  5.71023314E-02
  5.2938?061E-02
  2.462669Q9E-02
  1.04562477E-01  9
 -2.09007517E-01  5
  6.88108760E-02  3
 -9.8l58579o£-0^  2
 -3.79426S54E-02 -4
  2.53804551E-01  3
 -5.16735579E-02 -4
  1.07424413E-OI  n
 -6.49296552E-02 -6
 -5.39149076E-03 -1
 -1.24685118E-01 -i
  2.25283490E-02
PROPER FUNC.  6
  2.32072364t-01
 -1.41854046E-OI
  1.34712190E-01
 -1.99453596E-01
                             -- '1
                   6.7179?61«£-03  -»
                                             --01
                                                   1.0461S8&4F-01
  8.67563695E-Q2  >.
 -3.5569862CE-02
PROPER FUNC.  7
  1.02962321E-01
 -1.15002120E-01
  f.55958905E-02
                  7.28336007"-U«!
1.74^91039
«.H3
2.00
                                               01
                                               02
                          :lo--n  -;
                                             - -01
                                  -9,99^89954--(
                                   l.!6°2o330-'-01
                                   3.4iS03l7?5--02
                                   4.79'7^3fl8r-02
                                   5.7B0998Q9E-02
                                   1.11^74066^-02
                                   «. 0735^-707^-02
                1 .b200°794£-01
                1.31951755^-01
                                 1.36663303^-01

                                 A.26642007--01
                                 1.72210495^-01
                               -i.09465773^-01

                               -1.3618B963--01

                               -d.B3940150---02

                               -J.172934Q5--02

                                3.42138412---J3

                                J.07191213--02

                                1.27163154^-01
                                2.30392728r-ol
                               -4.93870213=--02
                       1130F-02 -t
               -1.03363937F-01
               -7.«?994430E.-02 -1.01727667--01
                3.324H062E-02
  7.23906777E-02 -4
  1.05074735E-01  1.53376502--')2
              02 -5.1507"4955-02
              02  2.06959993--02
                1.33120038^-01
                9.03230076F-02
                2.Q5791661F-02
               -b.39lPR916E-02
 . 08357975-1-01
 .49J54654- -02
               6.2S410204F.-02
              -2.61305459F-02
              -1.4704Q130F-02
               1.39613657E-03
               1.21864921^-01
              -7.98054349K-02
               3.05759671F-02
.9«o76201r-02 -6.67310735F-02
              -1.2n5?6417F-01
               4.00501930F-0?
               2.24656696F-01
2.74285861E-02
3
                                                                   (».543b4bl8r_o<::

                                                                   4.84856476F-02

                                                                   t>.7544S»bo7--02

                                                                   ».04872955r-02

                                                                   0.85158455^-02
                                                                   1.46079962 — 01
                                                                 -B. 81331683^-02

                                                                 -4.95471852^-02

                                                                   t. 58478857^-02
                                 142

-------
  •2.82799542E-OJ  l.u
  •3.73925194E-0> -2.1
  3.12^83205E-01 -5.<*
                                                  -1 .641'-* i74B r-01 ~J
  1.29770559E-01 -<
  1.97011792E-0;  5
  1.08734877E-01 -I
 -1.91314695E-02  5
  1.8339613EE-02
PROPER FUNC. 11
  1.33987406E-oi
  1.26697376E-01
  2.37635991E-01
  1.4857974RE-01
  5.084Q9717E-02
  3.18742356E-02
  2.47013442E-02 -5
  1.4265127GE-01
 -4.72893111E-02
  2.27701383E-01
 -1.28835119E-01
  2.12842445E-Q1
  2.47074943E-02
 -2.28151<*84E-02
  3.45244999E-02
  3.7066Q780E-02
PROPER FUNC. It
  3.24804198E-02
  5.1061Q381E-02
  1.2425Q744E-02
   .10554399E-OI
   .03066976E-Q1
    16872594E-02
   .53692248E-02
  S.1672Q501E-02
  2.20869430E-01
  8.45785730E-03
  2.35151781E-01
 -1.0277P290E-03
 -2.25835953E-01
 -7.42316933E-02
  4.23024553E-02
  5.39579118E-02
  2.58440917E-02
PROPEP FUNC. 13
 -7.75100209E-02
 -1
  1
  3
  8
 -1.81153422E-01
 -7.33847355E-02
 -6.6077182<*E-02
                   3.
                   2.n997537lt-0l
                  1.13905516-1-01 -]
                 -1 ,58924964F-i>3 -'
1.58939523F-ol
2.15871R39--01
3.uQ329735f.-Jc:
2. ?
                  1.02347831-:-ul
                  2.37b396l2E-Oc!
                  1. 11951373?. «jl
                  l.R0641«44E:-01
                  i.l705hfe49r:-01
                  3.76b84478£-C2
                                            -01   l.36424767--0ci
                                -1.33664454F-01
                                -1.6435^495^-01  -3
                 4.(t7-»72112"-02  -8.761«9945F-02   i
               -1.35o39o8«E-Ol  -!•4^6^442^^-01  -i
                                -1.33352487E-02
                          -»» i 94E-02  --
                 3.cH
-------
  -2.33113562E-01   7.4'
   3.63910562E-02  -z.t'  ... .    -   .-._.._.
  -2.63571796E-CI   2.'+579222c; -- • 1  « .61 775769-- 02 -"S
  -1.09626299t-0i  -2 . ^79^1 83b- -(, 2  S . 4Q 1 ^1 0 1 1-'-02 -<•
  -1.88591191E-11  -3.55l3Q74fU--jl -2 .<-4dl 33B2~-03   3 . 7*7qa7snr-«Q 37F-02  i . 950 1 70 73- -OtJ
  -2.44341837E-02   >.oQ334029-:-v2  !. 1 4*29427^-0 1   1.4421 GO 41r-01
   3.4l58«728E-02  -!.'>7dt>«063F-ol  1 .1 6CJG5036'-0 1   1. 5H,f T0824E-01
   1.6l45o89oE-02   5.15112172F-02  1 . o^**76fep^---0 1   b . 686^*)839F-02
  -1.4440R245E-01  -2.15220237F-U1  <*.24oSl544r-02   1.n^6^«fl43E-01
 PpOPEp  FjMC.   ^   -2.'+103i:>7:>6E-01 -2.15 J39b3«r-01 -7.5A27/+555F-02  1.50342204^-01
  -1.01529704E-0*  -:.'
   7.63922850E-02   5.610E041 o-T-03 -3.7=)t»781 46--02 -b.8^5^3554^-02  -7. 92641030'-02
 -1.76022413E-01  -1. O^OOl '*58F-v I -7.41219842F-02 -7. 00 7?0379F-02  -i
   3.7339Q343E-02   3. C305b805^- Jti -2 .0^«935^9;-02 -1. lb9S?558r-0 1
 -2.6547755«E-oi  -2.707l = o3=---0 1  2. ?<57l I 95«£-02   1.1 ^244430E-01   7.5b327438r-n2
 -6.57879989E-02  -3.05731488E-02 -1.P49Q9231F-0 1 -1.15494860^-01
   1.86053399E-Q1  -1. 10037346 —oc -1 .oo^74fca7---0 1   a.3^50^599^-02   '<
 -2.92134157E-02   5.4503596Q--C2 -7. 1 ?«:9QiJ7h--02 -9.8714^237^-03
   5.30&31914E-02  -1 .900382B6--0 1 -2.4 1 aSl 05^^-02 -•* . ^7904733.F-02  -:
   3.32083674E-Q2   5.0225]035F-02 -4.41b2o220--02 -1.079^^474^-01
 -1.81341067E-01  -2.T0244037E--a -1 .3«*19223?~-0l -8.5Q9ST1 71 E-02  -:
   1.09953670E-01   9.3237044?=-'^ -3.31'*67204F-02 -1 . 570 n20«--01
 -4.30207760E-02  -1 . l4897422'-'> 1 *\ 729F-02 -b. 895a?951 F.-02
 -2.61299804E-02   5.c>0876378E-uc'  9.8«3327l 1 "-02   9.52506936F-02   7.95o45234r_03
 -1.14254261E-01  -1.28714919^-,1 -1 . 74b733«5---0 1 -b. 7.3760548E-02
   1.60890183E-01   2.77820621 c- J1* -1 .2?8374b9--ol -1 . Pfc 1
  1.36130580E-01 -9. q3326779E-C2  -^. = 92e:5470--02 -1.03637936F-01 -1.29696080---01
  1.20347300E-OI  1.52708981F.-01   9.4042738H--02 -8.QO l=062bc-03
 -1.53732209E-01  4. 86043964F-02   1 .89^17113E-01  8. U9Q7791 E-02 -'....,	
PROPER FjNC. 10   5.19245337E-J2   5.25<»45873E-02 -».5334«695E-03 -b.10131986--02
 -7.86148557E-02 -5.685l678oE-02   ]
              02  2.ol7970l4t-02  -£
 -6.06995400^-03 -7.7o83£»937E-'>3  -5.7643Q355E-02 -8
              02  7.35752000F-OC  -2.02°l4430~-03 -3.7543Q098F-02 -'
              02  7.72921194F-02   1,0?'*77750E-01  4.6796141 Or-02
                                     144

-------
  -4.05590052E-02'
  -5.30618510^-03
  -6.07213021E-02
   1.2018R588E-OI
 --1.50699222E-0!
  -1.32679503E-0;
  -3.07bl5636E-02
   5.27634533E-02
   1.63238222E-02
  -2.08352659E-02
   3.1R360227E-02
   4.67046101E-02
  2.00<:132<»2E-02
  5.28M7312E-02
 -7.388124HOE-02
  5.81050621E-03
 -5.48997783E-02
 -1.41158513E-02
  3.336b8940E-02
 -1.64419620E-01
  3.30006154E-02
 -5.46608587E-02
 -3.14438469E-02
  7.03747368E-02
 -1.13225561E-01
  9.43947797E-03
  1.56216338E-02
 -2.55055303E-02
 -2.37234579E-02
PROPER FJNC. 15
 -I.75163969E-02
  2.14161623E-OI
 -6.28205997E-02
  1.63966818E-01
  3.35785569E-02
  1.72608575E-OI
 -5.705«»7759E-02
  i.l 310010 !-~-Cl
 •2.75b30009---0cr
                -1.53302877--02
                -b.ec.i«?4!f^c_o2 -i
                  9.PJ 075947;-02
                                 -2.4f04a420F-01  1.27b57b39c--ol
  1. 722120351-01  -?.74d77127--01 -<; . 3^5?a471 r_o^
                                             -02 -d.35750^66^-02
 2.12«23579E-yl
-2.56435664K--01
 -1.59906235E-OI
 -5.33173438E-02
  1.05015452E-02
  3.16619165E-02
 -2.14736746E-03
  1.1470<»035E-02
  7.58568479E-02
 1.2030P773r-Ol
 8.54459538--03
 1.70372901r-rjl
 2. 22630236^-01
 5.ei24807lr-02
 b.l3856509r-02
 4.1fe477383f--"»
 4. i'
                   1.17521337^-01
                  7.^4Hl0b70r-02  l.S02?1424F-01 -i.:
                 •1.257blb83r-01 -1.5^6«,a924F-01
                  Q.4R-+17994--03  1.7o2^^337F-01 - 1. 88647091 r-02
                  1.23<^b2483-:-01 -2.9fl9^4997F-01 ~d . 321 6i>a49^-n 1
                  1.7SU90830---01 -9.2430»8S8r-02 - I . 9030 7b 1 Or-0 1
 1.23551373F-01
 6.
            -01  -9.0775J901p-02 -1.7125674U-01
            -02  -7.20710232F-03
            -01  -2.726777a5r_02 -d.815b4734r-02
 5.26062167--02  -*.'
 1.42981904^-01
 i.4bOb299Sr-Jl
 7.45787R42-.Jt;
 2.53569499r-i)2
 7.75720922r-0<;
            "
 1.0905079^^-01  -<
                -1.35044622E-01
                                  1.7097b227c-02

                                 -i.!0639738r-01
                -3.998^4300^-02
 •». 15892431^-02
 4.57189754--U2
 1.28881170--02
2.78^95679^-02
P. 07730036^-02
1.55241502E-02
2.02^*64082^-02
1.463^3069^-01
9.86^02888^-02
                  1.03856222E-01
                 -H.84568572E-02
                  3.2239QQ35F-02
                  1.320333S2F-02
                 -3.11376437E-01   1
                  5.34787910^-02
                 -2.1345o843r-01   1.30392220r-01
                  l.lH5n
-------
1.0
1.0
1.0
CA=>Q
  1.
  i
  * .
  1.
  \
  *• •
  1.
  1.
  i
  * *
  1.
  1.
^ET : ?* *
     1.0
     1.0
     1.0
     1.0
     1.0
     1.0
     1.0
     1.0
     1.0
l.o
l.o
1.0
1.0
1.0
1.0
1.0
1.0
l.o
1.0
l.o
l.o
1.0
1.0
1.0
1.0
» » t
 l.o
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
                                                1.0
                                                1.0
..'J
 .0
  <\
. • *
 .0
1.0
1 .0
'.0
1.0
'.0
1.0
' .0
1.0
1.0
1.0
1.0
1.0
1.0
l.u
1.0
l.J
1.0
i.J
1.0
                                       146

-------
                 ^c IMITIfi;. STATKNi.
;.- I AL   C IG^Jlr.wTiS
1
1
H
3
e
*
3 . L 3 - -
9
10
C. (,00r
S - - x A L '
iNJ'' "Jtr
11
12
1"?
!•*
15
1*
17
IB
19
20
21
22
23
2 +
25
X Y
L • J U . 'J
- . •} H . 5
- . 5 - - . 0
-,.<= -3.1
-.." -2.5
3.3 9.0
-'2.1 1 . 5
'DILATES uF CIRCLE POINTi
2-1. H -7.0
-29. C 7.6
:OG°JJ
<
-7.0
-'5.9
7.6
13.1,
-18.1
-12. *
-9.1
7.1
-11.9
«.o
-15.1
-1.3
-6.6
-'9.7
a. 6
[f 5TES
Y
-?9. 0
-15.1
29. 0
10. 1
-23.9
-15. D
3.2
5.5
2.7
3.2
25.9
21.3
12. b
-t.l
-It. 9
-. OF F
3EFORE
.C58E+CO
.2J2E+03
,'*U2£-01
. t81E-01
.t53E-01
.Hl9£-01
. ullE-01
. t05E-01
. ^05£-01
.399E-01
.39PE-01
.395E-01
.39*.E-01
.337E-01
.1^05-01
AND SJPPL



PF^CENT ALFiRNATIyES REji
=?ELUCT.
999.90
.53
.79
.00
.05
.38
.02
. Ul
.OJ
.31
.00
.Gl
. 00
.02
• ok
X
7.6
7.6
15,1
-18.1
-12.3
-9.1
7. 1
-11,9
-15.1
-15,1
-29.7
-13. fa
-29.7
8.6
-1.5
Y
?9 . u
29.0
-25.9
-23.9
-1-5.0
3.2
5.5
2. 7
25.9
23.9
-t.l
It. 2
-t.l
-It. 9
It . 3
£ OF E
. 55rtc*
.232E+
. U3 2£-
. <*81t-
,i*53E-
,«19c-
. *11£-
, *»05£-
.t05E-
.399£-
.398t-
.395E-
.39«*t-
.387E-
. mot-




CTEJ

0 j
00
01
01
01
01
01
Oi
01
Jl
01
31
01
01
01
X
-9.3
15.1
-18.1
12.3
7.1
7.1
-15.1
7.5
8.0
-29.7
15.6
-29,7
8.6
15.6
-<». 1
Y
-1.0
-25.9
-23.9
-.<*
5.5
5.5
25.9
-.1
3.2
-4.1
-1.8
-t.l
-14.9
-1.3
29.7
£ OF E
. 558£*uO
.232E+00
,t82t-01
.t8 It- 01
.453E-01
.419E-01
.tllt-01
.405L-01
.U05E-01
.399E-01
.393E-01
.395E-01
.394E-01
.397E-01
.14 OE.-0 1
                                          147

-------
B.   Subroutine CIRCUM
     Subroutine CIRCUM is used to locate points on the
circular boundary if not more than one is already located
there.  If one point is already on the boundary the sub-
routine locates the second point on the boundary (line 35)
at the point of maximum residual error.  The search is
started at a point opposite the point already located on the
boundary.  If no point has been located on the boundary, the
"center of gravity" of the already located interior stations
is found.  The first point is then located tentatively at
the point in which the line through the "center of gravity"
of already located points and the center of the circle meets
the circumference of the circular boundary.  A minimum of
the residual variances is then sought from that starting
point.
                            148

-------
       LOCATES C.G. OF ALL POINTS AVAILABLE
                        I
    PICKS POINT ON BOUNDARY OPPOSITE THE C.G.
         LOCATES LOCAL MINIMUM ON THE
          BOUNDARY VIA FMFP (  FUNB )
             ADDS  TO LIST  VIA  ADOPT
          LOCATES POINT ON BOUNDARY
             OPPOSITE FIRST MINIMUM
              LOCATES LOCAL MINIMUM
         ON THE BOUNDARY VIA FMFP (FUNB)
            | ADDS TO LIST VIA ADOPT  |
                     RETURN
                  FIGURE V-2
ABBREVIATED FLOW CHART FOR SUBROUTINE CIRCUM.
                        149

-------
10
15
20
25
30
35
            c
            20
            •DECK CIRCUM
                  SU3*JUTIN£  CIRCUMC  XC,  YC,  3,  I )
            C     LOCATES POINTS  ON  THE  CIRCUMFERENCE IF ^OT MORE  THAN  ONE 15
            C     ALREADY THERE.   (  XC,  YC )  IS  THE CIRCLE CENTER,  <  =  RA
                  COMMON /  BLKi  /  iJSTN,  NdJY ,  NTC, XS(30>, YS(30>,  113(50,"?)
                  CoMMON /  8LK3  /  NCIR,  AZ(20)
                  CCMMGN /  8LK<*  /  EST,  EPS,  LIMIT. ItR, F
                  CGMHuN /  LINKS  / ExRl,  XO(fcO», OIFF(60)
                  EXTERNAL  FUNS
                  JATA PI,  T»I /  1.1-^15927,  6.231155"5 /
                  IF  ( I ,tO.  1  )  GO  TO  20
            C     ^IN£ CG OF  POINTS  AVAILABLE
                  ANJ = NSTN
                  LU^l = SUM2  r  0.
                  JO  10 I = 1, NSTN
                  5U11 = SUM1  *  XS(I»
            10    SUM2 = SUM2  *  YS(D
                  XCG = SUMl  / ANO
                  tCG = SUM2  / ANO
                  JIST = SQPT (  <  XCG  -  XC )**2 *  ( YCG - YC  )*»2  )
            C     SELECTS PCINT  OPPOSITE  CG  OR WEST
                  IF  ( OIST ,LE.  1.  )  12, 1*
            12    X = XC -  R
                  Y = YC
                  Gu  TO 16
            14    X = XC -  (  XCG  - XC  )  * R  /  OIST
                  Y = YC -  (  YCG  - YC  )  * R  /  OIST
            16    ANG = ATAN2( Y-YC,  X-XC )
            18    CALL FMFP(  FUNS, 1,  ANG, F,  G, EST, EPS, LIMIT,  IER,  M )
                  XO  = XC * R  »  COS(  ANG »
                  YO  = YC «• R  *  SIN(  ANG  »
                  CALL AOOPK  XO,  YO  )
                  IF  ( ANG  ,LT.  0.  )  ANG = ANG * TPI
                  NCI*? = 1
AZ(1) = ANG
PUTS ANOTHER POINT
ANG = AZC1I * PI
CALL FMFP( FUNB,  i,
IF ( ANG .GT. TPI  >
AZ(2) = ANG
NCIR = 2
XO = XC «• R • COS<
YO = YC » R » SIN(
CALL ADOPT < XO, YO
RETURN
tNO
                                      ON CIRCLE
                                       ANG, F, G, EST,
                                       ANG = ANG - TFI
                                                       EPS,  LIMIT,  IER,  H >
                                      ANG
                                      ANG
                                      )
                                      150

-------
C.   Subroutine FMFP
     Subroutine FMFP is a "standard" method of finding the
minimum of a function of several independent variables and
was extracted from the IBM System/360 Scientific Subroutine
Package (360-CM-03X) Version III, Programmer's Manual.  The
details of the minimization procedure are described in the
above and also in R. Fletcher and M.J.D. Powell, "A Rapidly
Convergent Descent Method of Minimization", Computer Journal,
Vol. 6, No. 2, 1963, pp. 163-168.  The subroutine as
listed in the first reference above has been modified at
lines 195-200.  In our application the standard FMFP
iterative procedure did not meet the convergence criterion.
A loop was added to check the magnitude of t-ah./x. and if
it was found to be less than e for all i, then a branchout
of the iteration was made by transferring to statement 36.
                            151

-------
                    B ROUTINE FMFH(Fut,CT,N,X,F,G,EST»t.PS,LIMIT,
10
30
35
f 0
C
c
C
c
c
c
c
c
c
c
c
c
c
c
C
c
c
c
c
c
C
C
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
c
                     FUNCT  -
   TO FINu A LOCAL MINIMUM OF A FUNCTION  OF  SEVERAL  VARIABLES
   3Y THE METHOD OF FLETCHER ANO POWELL


USAGE
   CALL FMFP(FUNCT,N,X,F,G,EST,EPS,LIMIT,IER,H)


            CF PARAMETERS
            USER-WRITTEN SUBROUTINE  CONCERNING THE FUNCTION TO
            3F MINIMIZED. IT MUST  tic uF  THE  FORM

            SUBROUTINE FUNCT(N,ARG,VAL,GRADJ
            ANO MULT SERVE THE FOLLOWING  PURPOSE
            FOR EACH N-3IMENSIONAL  ARGUMENT  VECTOR ARG,
            FUNCTION VALUE ANO GRADIENT  VECTOR MUST  3E COMPUTED
            AND, ON RETURN,  STORED  IN VAL  ANO GRAD RESPECTIVELY
            NUMBER OF VARIABLES
            VECTOR OF DIMENSION N  CONTAINING  THE INITIAL
            ARGUMENT WHERE THE ITERATION  STARTS. ON  RETURN,
            X HOLDS THt ARGUMENT CORRESPONDING TO THE
            COMPUTED MINIMUM FUNCTION VALUE
            SINGLE VARIABLE  CONTAINING THE MINIMUM FUNCTION
            VALUE ON RETURN,, I.E.  F=F{X).
            VECTOR OF DIMENSION N  CONTAINING  THE GRADIENT
            VECTOR CORRESPONDING TO  THE  MINIMUM ON RETURN,
            I.E. G=G(X».
            I£ AN ESTIMATE OF THE  MINIMUM FUNCTION VALUE.
            TESTVALUE REPRESENTING THE EXPECTED ABSOLUTE ERROR.
            A REASONABLE CHOICE IS  10**(-6>,  I.E.
            SOMEWHAT GREATER THAN  10»*(-0>,  WHERE D  IS THE
            NUMBER OF SIGNFICANT DIGITS  IN FLOATING  POINT
            REPRESENTATION.
            MAXIMUM NUMBER OF ITERATIONS.
            ERROR PARAMETER
            IE"? = 0 MEANS CONVERGENCE WAS  OBTAINED
            IER = 1 MEANS NO CONVERGENCE  IN  LlPlT ITERATIONS
            IER =-1 MEANS ERROffS IN GRADIENT  CALCULATION
            IER = ?. MEANS LINEAR SEARCH  TECHNIQUE INDICATES
            IT IS LIKELY THAT THERE EXISTS NO MINIMUM.
            WORKING STORAGE  OF DIMENSION  N*(N*7)/2.
                     f


                     G
   EST
   EPS
   LIMIT  -
                  IER
                     H
KtMARKS
    I)  THE SUPROUNTINE NAME REPLACING  THE  DUMMY ARGUMENT FUNCT
       MUST 9E DECLARED AS EXTERNAL  IN THE CALLING PROGRAM.
   II)  IER IS SET TO 2 IF t STEPPING IN ONE OF THE COMPUTED
       DIRECTIONS, THE FUNCTION  WILL NEVER INCREASE WITHIN
       A TOLERABLE RANGE OF ARGUMENT.
       IER = 2 MAY OCCUR ALSO  IF  THE INTERVAL WHERE F
       INCREASES Ib SMALL ANO  THE INITIAL  ARGUMENT WAS
       RELATIVELY FAR AWAY FROM THE  MINIMUM SUCH THAT THE
       MINIMUM WAS OVE»LEAP£0. THIS  IS DUE TO THE SEARCH
       TECHNIQUE WHICH DOUBLES THE STEPSIZE UNTIL A POINT
       IS FOUND WHERE THE FUNCTION INCREASES.
                                         152

-------
   SU3KQUTINE  FMFP ^

             c
             C     SUBROUTINES AND FUNCTION SU3PROGRAMS REQUIRED
             c        FUNCT
             c
 tO          C     METHOD
             C     THE METHOD IS GESCRI9EC IN THE FOLLOWING  ARTICLE
             C        R. FLETCHER AND M.J.O. POWELL, A RAPIC DESCENT  METHOD  FOR
             C        MINIMIZATION,
             C        COMPUTER JOURNAL VOL.6, ISS. 2, 1963,  PP. 161-168.
 65          C
             C
             C
                   COMMON / LINKS / ERR1, XOI60), OIFF(60»
             C
 70          C        DIMENSIONED DUMMY VARIABLES
                   DIMENSION H(1),X(1),G(1)
             C
             C        COMPUTE FUNCTION VALUE ANO GRADIENT VECTOR FOR  INITIAL  ARGUMENT
                   CALL FUNCT
100                K=K»N
                   H(K)=XIJ>
             C
             C        DETERMINE DIRECTION VECTOR H
105                T=0.
                   00 3  L=1,N
                   T=T-G(L)*H(K)
                   IF(L-J)6,7,7
                 6 K=<*N-L
110                GO TO 8
                                   153

-------
   LUJROUTINE  F-lFP

                 7 < = <«•!
                 3 CONTINUE
                 9 H( J» =T
             C
115          C        CHtCX WHETH£R FUNCTION WILL 0£Cx£ASt  STEPPlN  ALONG H.
1?0          C        CALCULATE DIRECTIONAL DERIVATIVE  ANO  TESTVALU^S FOR DIRECTION
             C        VECTOR H AND GRADIENT VECTOR  G.
                   CO 10 J=1»N
                   GNRM=GNRM*ABS
                10 OY = DY4-H(J)*G(J)
             C
             C        REPEAT SEARCH IN DIRECTION  OF  STEEPcST  DESCENT  IF DIRECTIONAL
             C        DERIVATIVE APPEARS TO 3E POSITIVE  OR  ZERO.
                   IF(QYH1,51,51
130          C
             C        REPEAT SEARCH lU DIRECTION  OF  STEEPEST  DESCENT  IF DIRECTION
             C        VECTOR H IS SMALL COMPARED  TO  GRADIENT  VECTOR G.
                11 IF51tpl,12
             C
135          C        SEARCH MINIMUM ALONG DIRECTION H
             C
             C        SEARCH ALONG H FOR POSITIVE  OIRECTICNAL DERIVATIVE
                12 FY=F
                   ALFA=2.*(£ST-F)/DY
                   AM13A=1.
             C
             C        USE ESTlMATt FOR STEPSIZE ONLY IF  IT  IS POSITIVE.  AND LESS THAN
             C        1. OTHERWISE TAKE 1. AS STEPSIZE
                   IFCALFAJ 15, 15, 13
                13 IF(ALFA-AMBDA) 14,15,15
                1* AMBOA=ALFA
                15 ALFA=0.
             C
             C        SAVE FUNCTION AND DERIVATIVE VALUES FOR OLD ARGUMENT
150             16 FX=FY
                   DX = 3Y
             C
             C        STEP ARGUMENT ALONG H
                   00 17 1=1, N
155             17 X(I»=X(I)*AMBOA»H(I|
             C
             c        COMPUTE FUNCTION VALUE  AND  GRADIENT FOR NEW ARGUMENT
                   CALL FUNCT(N,X,F,G)
                   FY=F
160          C
             C        COMPUTE DIRECTIONAL DERIVATIVE DY  FOR NEW ARGUMENT. TERMINATE
             C        SEARCH, IF OY IS POSITIVE,  IF  OY  IS ZERO THE MINIMUM IS FOUND
                   OY = 0.
                   DO 18 1=1, N
165             13 OY=OY»G(I)*H{I)
                                    154

-------
               FMFP

                   IF(DY)19,36,22
             C
             C        TERMINATE SEARCH ALSO IF THE  FUNCTION  VALUE  INDICATES THAT
             C        A MINIMUM HAS 3tEN PASSED
170             19 IF(FY-FX)20,22,22
             C
             C        *EPtAT SEARCH AND DOU3LE STtPSIZE  FOR  FURTHER SEARCHES
                20 AM80A=AM6DA*ALFA
   0               ALFA=AM8DA
175          C        £NO OF SEARCH LOOP
             C
             C        TERMINATE IF T Ht. CHANGE IN ARGUMENT  GETS  VERY LARGE
                   IF(HNRM»AM9DA-1.£10)16,16,21
             C
180          C        LINEAR SEARCH TECHNIQUE INDICATES  THAT  NO  MINIMUM  EXISTS
                21 I£* = 2
                   RETURN
             C
             C        INTERPOLATE CU3ICALLV IN THE  INTERVAL  DEFINED 3Y THE  SEARCH
165          c        ABOVE AND COMPUTE THE ARGUMENT  x FOR WHICH THE INTERPOLATION
             C        POLYNOMIAL IS MINIMIZED
                22 T=0.
                23 IFUM8DA)2<*,3fc,2<»
                2«« Z=3.*(FX-FY)^AM30A+OXfDY
190                ALFA=AMAX1(A3S,A8S(OX),A3S(0Y ) )
                   OALFA=Z/ALFA
                   OALFA=OALFA*GALFA-OX/ALFA*OV/ALFA
                   IF(OALFA)51,25,25
                25 W=ALFA»SQRT(DALFA)
195                ALFA=
                   IF(OALFA)30,33,33
                30 IF(F-FX)32,31,33
                31 IF(OX-DALFA)32,36,32
220             32 FX=F


                                   155

-------
   SUBROUTINE  FMFP,

                   OX=OALFA
                   T = Al_FA
                   AMSOAsALFA
                   GO TO 23
225             33 IF(Fr-F)35, 3^,35
                3<« IF 51,38,38
             C
             C        TtST  LENGTH  OF ARGUMENT DIFFERENCE  VECTOR  AND  DIRECTION VECTOR
             C        IF  AT  LEAST  N  ITERATIONS HAVE BEEN  EXECUTED. TERMINATE, IF
2*5          C        30TH  ARE  LtSS  THAN EPS
                36 IER=0
                   IF(KOUNT-N»<*2,39,39
                39 T=0.
                   Z=0.
250                DO +0  J=1,N
                   T=T*A8S(H(KI I
255             i»0 Z=Z«-H»H(K)
                Wl IF(T-EPS)56,56,<»3,50,50
             C
             C        PREPARE  UPDATING OF MATRIX H
                «»3 ALFA = 0.
                   DO k7  J=1,N
265                *=J«-N3
                   H = Q.
                   DC -»b  L = 1,N
                   KL=N*L
270                IF(L-J)<*<», 45,
                <»(» K=<»N-L
                   GO TO  «*6
                kf> CONTINUE
                   K=N*J
                              156

-------
   SUBROUTINE  FMFP

                   ALFA=ALFA+H*H
-------
D.   Subroutine FUNB
     When a prospective minimum point reaches the boundary,
it is constrained to remain on the boundary.   This requires
that the (x,y)  coordinates be exchanged for an angle
coordinate, 8 (azimuth of the point concerned), and the
derivative with respect to 0 replaces the derivatives with
respect to x and y.  This is done by FUNB which calls FUNCT
and modifies the derivatives from it so that they are directed
tangentially to the boundary.
                          158

-------
       CHANGES  POLAR COORDINATES
         TO CARTESIAN COORDINATES
GETS FUNCTION  AND GRADIENT  FROM  FUNCT
     CHANGES GRADIENT IN CARTESIAN
  COORDINATES TO GRADIENT  IN AZIMUTH
             FIGURE  V-3
   ABBREVIATED FLOW CHART FOR FUNB
                   159

-------
            »OEO. FUN?
                  SUBROUTINE  FUNBf  I,  ANG,  F,  G )
            C     TO CONVERT  FuNCTC  N,  X,  F,  G >  INTO A ONE DIMENSIONAL
            c     FUNCTION on  THE CIRCULAR  BOUNDARY
 5                COMMON / BLK1  / NSTNt  NBJY,  NTR, XS<30), YS(30),  ITR(50,3>
                 , , 
                  CALL FUNCT(  2, X,  F,  G?  )
                  G(ll = -G2
-------
E.   Subroutine FUNCT
     Subroutine FUNCT(N,X,F,G) is used by FMFP to get the
function value and its partial derivatives at the point X(I),
1=1, N, where N is the number of independent variables X(I)
that are the arguments of the function, F.  The function
value is scalar and is returned in F.  Its partial derivatives
with respect to X(I) are returned in the gradient vector
G(I) , 1=1, N.

     The function value and its partial derivatives are
evaluated using the subroutine CORFUN  (q.v.) from the proper
values/functions of the correlation coefficient at discrete
points and from the field of pollution concentration
variance W(I,J), I,J=1, 9, which is available in INT2D.
via COMMON/BLK2/.

     The computation process is carried out on the following
basis.  Let a..  be the elements of the correlation coefficient
matrix for station locations i, j; let g. be the correlation
coefficient of pollution concentration between stations i
and the point (x , y ), the starting point for the minimization
procedure (or some point during the search for the minimum);
     2
let a  be the variance of pollution concentration at (xo'yo)•
The least square error of estimate of pollution concentration
at  (x ,y ) from linear regression already located stations i,
at  (x.,y.)f is given by

     F =  e2 = a2(l- I I g^1^)                      (1)
                     i J
which is the function value required.  Its partial derivatives
are given by
                           161

-------
          = 9e2/9x = 2a(9a/9x)(l - S Sg.^1^.)
                                  i j
               - 2a2 £ £ gialj Ogj/gx)                  (2)
     G(2) =  9e2/9y = 2a(9a/9y)(l - I Sg.a1
                                    i J
               - 2a2 I Igialj(9g./9y)                  (3)
                     i J         J
where a1-' are the elements of the inverse of the matrix
     The correlation coefficient function obtained from
CORFUN is such that if c(x,y;£,n) represents the correlation
between pollution concentrations at  (x,y) and  (£,ri), then
the limit for £ -> x, n + y is not 1, but a value somewhat
less than 1  (a jump discontinuity at £=x, n=y) .  This means
that the correlation function is represented in the  form

     C(x,y;£,n) = C*(x,y;£,n) + A (x,y) 6 (x,y ; C , n)        (4)

where C*(x,y;C/n) is the continuous part of the correlation
coefficient function, A(x,y) is the amount of  the jump
discontinuity, 6(x/y;£,n) is the two-dimensional Dirac function
which is zero if x 7^ £, y ^ n and is 1 at x =  £/ y = n.  Thus,
the limit for ? -»• x,n ^ y of C*(x,y;5fH) exists and  is
               *                                     *
C*(x,y;x,y) = C .  In the limit sense C(x,y;£,rt) -»• C but
the actual value of C(x,y;x,y) is C  + A(x,y).  This means
that in  (1) the terms of g. have this jump discontinuity
                          1    *
so that g..^ = g (xi,yi;xQ,yo) -> Co(xi,yi) in the limit sense,
but should actually take the value C  (x.y.) +  A(x.y.) when
x  = x^, y  = y. .  Thus, F (x ,y ) in the limit sense does
not approach zero when  (x  ,y ) approaches the  location of
an already located pollution concentration observation
station.
                           162

-------
     To force the function to zero in such circumstances
the following modified form of the error of estimate  (1)
                2
was used.  Let F  be the value computed from

      F« 	 1   ^ —-* __  J- 1
      o ~     i 2. ^^a  g-i •
              i j
Then

     F = a2(FQ - A/F.^2

will be zero when the point (x ,y ) approaches an already
located station.
     In lines 17-38 the matrix {a..} is set up and inverted.
                                 1^        2
The terms g. are computed in lines 39-45, a  is obtained on
line 48, the error variance on line 55 and the value of F
on line 62.  If only the function value is required, 1C = 1,
and the rest of the subroutine is skipped.  When derivatives
are required, the derivative of the variance of pollution
concentration is computed in lines 69-76, the correlation
coefficient derivatives are then obtained (77-83) and the
remaining terms of the gradient vector are added (78-90).

     When diagnostics have been returned by any of the sub-
routines, their sense is printed  (lines 91-106).  Since the
coordinates of already located stations remain fixed, pro-
vision is made in IQ to hold the values of matrix {a -1} (IQ = 2)
during a given minimization, but to renew them  (IQ = 1) when
a new minimization is started.
                           163

-------
GETS CORRELATION COEFFICIENTS FOR STATION PAIRS
  TO FORM MATRIX OF  COEFFICIENTS VIA CORFI
                       1
            GETS  INVERSE OF MATRIX
                       I
      GETS CORRELATION COEFFICIENTS FOR
           NON-HOMOGENEOUS TERMS
                       I
  FINDS ERROR OF ESTIMATE (  VALUE OF FUNCT )|
     NO
   NEED
DERIVATIVES
                 FIGURE  v-4
                                        YES
                                        GETS CORRELATION
                                        COEFFICIENT
                                        DERIVATIVES VIA CORFUN
                                                 I
                                        COMPUTES FUNCTION
                                        DERIVATIVES
                                               RETURN
       ABBREVIATED  FLOW CHART OF  FUNCT.
                    164

-------
GO
00
XM
YM
00
XI
Yl
TO ( 30,
61*1,
= XSIII
= VSCI)
<» J * 1,
= XSU)
= YS(J)
32 » ,
NSTN
I
            •DECK FUNCT
                  SUBROUTINE FUNCT( N, X, F, G )
            C     TO COMPUTE THE FUNCTION TO BE MINIMIZED
            C     N = NO. OF VARIA3LES IN ARGUMENT! N =  2
 r^          C     XCI), I = If N, ARGUMENT
            C     F = FUNCTION VALUE  ( SCALAR )
            C     G(I)( I = It N, GRADIENT OF FUNCTION
            c     ic, CONTROL; i = GET F ONLY, 2 « GET F AND cm
            C     10 = CONTROL, 1 = 00 MHOLE THING, 2 =  USE OLD  APAM,  AA,  AINV.
10          C     SINCE BASIC STATION LOCATIONS AR£ NOT  CHANGED.
            C     NG PENALTY FUNCTION INVOLVED
            C
                  COMMON / 8LK1 / NSTN, N90Y, NTR, XS130), YSC30),  ITRf5C,3»
                 , , XC, YC, R, IBDYC20)
15                COMMON / Q / 1C, IQ
                  DIMENSION G(2), AlNV(30,30), GGC30), OGOX(30>,  DGOY(30),  AA(30,30)
                  DIMENSION X(1)
                  xo - xdi
                  VO = X(21
20          C     GETS CORRELATION COEFFICIENT ARRAYS
                                    IQ
            30


25


                  CALL CORFUN( XM, YM, XI, Yl, COR, DCOX, OCOY,  10  )
                  IF ( 10 ,NE. 0 i GO TO 102
30                AA(I,J) - COR
                  IF ( I . EQ. J ) AA(I,J) z 1.
            k     CONTINUE
            6     CONTINUE
            C     FILLS OUT CORCOEF MATRIX AND GETS INVERSE, ETC.
35                oo 9 i - i, NSTN
                  Ji = I * i
                  oo a j = ji, NSTN
            8     AA(I,J) = AA(J,I)
                  CALL MATINVC AA, AINV, NSTN, 40 )
<+0          C     GET NONHOMOGENtOUS TERMS
            32    00 36 I > 1, NSTN
                  XM = XSCII
                  VM * vsm
                  CALL CORFUN< XM, YM, XO, YO, COR, OCOX, OCOY,  10  )
45                IF ( ID .Nt. 0 ) GO TO 102
            36    GG(I) s COR
                  CALL INT2D( XO, YO, VAL, 10 )
                  IF ( 10 .N£. 0 ) GO TO 10<*
                  VSQ = VAL * VAL
50          C     COMPUTES ERROR OF ESTIMATE AND ASSIGNS NEGATIVE SIGN SINCE
            C     SUBROUTINE FMFP GOES FOR MIN AND WE WANT MAX
                  SUM = 0.
                  00 10 I * It NSTN
                  00 10 J = 1, NSTN
55          10    SUM = SUM + GG(I1 * AlNVd,JI  * GG(JI
                                  165

-------
~u
               FUNCT
             11
             500
             C
             C
 75
             22
 •90
ic;
IPS
110
             C
             200
             C
             102
             1002
          10*
          1QO<*
          106

          1006



          300
 t>  =  &(?)  *2. »
                                 0  )
                                  YM, XO, YO,
                                  GO TO 102
                                                 COR,  OCDX,  DCDY,  ID )
                                        *
                                        *
                                       AINV(I,J)
                                       »INV(I,J)
                                    ViQ * SUMl
                                    VSO * SUM2
                                    OGOX(J)
                                    OGOYCJI
 THESE GRADIENT  CO^IPNENTS HAVE SIGN RiiVERSEC  TO  CORRESPOND TO F.
 RETURN
 OIAGNOSTICS PRINT  OUTS
 PRINT 1002, ID
 FOR1AT  (* COORDINATE *,I3,» OUT OF RANGE  IN  CG^F'JN FROf SUBROUTINE
* FJNCT*)
 GO TO 300
 PRINT lOOi*, ID
 FORMAT  (* COORDINATE »,  13, * OUT OF RANGE IN  INT20 FROM SU3ROUTIN
*E FJNCT*>
 GO TO 300
 CONTINUE
 PRINT 1006, ERR,  XO, YQ
 FORMAT  (»
* »,F10.5,<
 £RR = 0.
 GO TO 11
 CALL EXIT
 END
            INTERPOLATION ERR
              Y  =  *,F10.3l
                              *,E10.3,« FROM SU3ROUTINE  FUNCT AT X =
                                    166

-------
F.  Subroutine CORFUN
     Subroutine CORFUN is used to determine the correlation
coefficient relating pollution concentration at two arbitrary
points neither of which need be a point of the 9x9 grid.
The technique used is that of a quadratic interpolation pro-
cedure based on the description of the correlation function
in terms of its proper functions and values.  If the proper
values are A., i=l, ..., k,  (A,>A_>...>A,) and the proper
            1                 J_  ^      K.
functions corresponding thereto are .(x,y), then the correlation
function may be expressed in the form
                   k
     K(x,y;£,n) =  £ Ai4>i(x,7)^(5,n)                   (D
                  i=l
where (x,y) and (?,n)  are the points between which pollution
concentration is being correlated.  The value of k is that
of the statistically significant proper values/functions.
The proper functions are known only at the points of a 9 x 9
square grid of points (10 km separation).  The values of
(£,n)  are found by means of a simple bivariate
quadratic interpolation procedure.

     Had the correlation coefficient function itself been
used, K (x,y; C/i~l) a four variable interpolation method would
be required and additional constraints imposed to preserve
the basic character of the correlation coefficient function
(such as having a value not exceeding +1 and a "horizontal"
tangent plane at any point x = £, y = n).  The use of the
proper functions greatly simplifies the interpolation pro-
cedure and guarantees that the characteristics of the
correlation function are preserved.
                            167

-------
     The interpolation algorithm is standard and is given by
M. Abramowitz and i.A. Stegun, Handbook of Mathematical Functions,
U.S. Government Printing Office, Washington, D.C., June 1964,
p. 882, formula 25.2.67.  This requires six points for a
rectangular grid in the arrangement of Figure V-5.  The
formula is
     f (X0+ph,y0+qk) =
          4- (l+pq-p2-q2) f Q/ 0+P (p-2q+l) f l ^ Q/2+q (q-2p+l) f Q
which has been rearranged into the form
     f(x0+Ph,y0+qk) =
In the above,  (h,k) are the lengths of the sides of the
rectangular grid, and f.  . are function values at the grid
                       1 / D
points with units indicated by the subscript and scale factors
h for index i, k for index j.  The values of p and q are
determined from p=(x-x,.)/h, q=(y-y«)/k.

     The arrangement of points in Figure V-5 provides for
values of p and q in the ranges 0
-------
                                      •-	'
                                     (0,0)
                                FIGURE V-5

         THE ARRANGEMENT OF DATA POINTS FOR  6-POINT QUADRATIC
         INTERPOLATION.  THE  AREA OF MOST ACCEPTABLE VALUES OF
         P AND Q  ( 0 '                          (41,+D                       (0.41)
    •    "---4        •        •    ,—-f        •        •        j—,     •
(0,0)         (+1,0)      (+2,0)     (0.41)  | (B) I        (4-1,42)   (-1.41)     |    |      (+1.4-1)
                                     I	I                            I	I
            (•H.-1)              (0,0)     (*1,0)                      (0,0)     (tl.O)

         A )                       ( B )                                 (  C )

                              FIGURE V-6
         EQUIVALENT ARRANGEMENTS OF DATA POINTS TO ACCOMMODATE
         OTHER VALUES OF P, Q LYING CLOSE TO THE POINT (  XO,YO  )
                                      169

-------
     The partial derivatives of the correlation coefficient
function are obtained by differentiating (1) ,
                      k
                    = 2  A .  . (x,y) [3 . (£, n)/H]
                     i=l
                      k
                    = 2 A .  . (x,y) [3
-------
10
15
20
30
35
50
55
*0£CK CORFUN
      SUBROUTINE CORFUN( XH, YM, XI, Yl, COR, OCOX,  OCOY,  10  )
c     Gtis CORRELATION FUNCTIONS ANO DERIVATIVES wo  xi,  YI  FROM  TABLES
C     OF PROPER FUNCTIONS / VALUES.
C     ID = 0 FOR COORDINATES IN RANGE,  10  .NE.  0, OUT  OF RANGE
      OIMENSION PHK151, PHHU5), OPH1DX(15), DPHlDYdS)
      COMMON / 8LK2 / X0(9), YD(9), PH(15,9,9), ALAM(15I,  H(9,9)
      COMMON / Q / 1C, IQ
C     PH
C     ALAM(L2) * L2TH PROPER VALUE
6
C
6
            10
            11

            12


            13
            16
            C
            ia
            20
            22
2k



26



28


30
                            15 /
                           , YH .EQ. Yl
                           $ OCOY = 0.
                                                   ) <»,
DATA L2 /
10 = 0
IF ( XH .EQ. XI .ANO
COR = 1. S DCOX = 0.
GO TO 100
ENTRY CORFUN1
XA = XM $ YA * YM f K * 1
LOCATE LOWER LEFT CORNER OF SQUARE CONTAINING  XA, YA
IF I XA .LT. XOI1) ) GO TO 11
                                             12
00 10 I * 2,
IF ( XA .Lt.
CONTINUE
10 * i * 2 *
GO TO 100
IF ( YA .LT.
DO 13 J = 2,
IF ( VA ,LE.
CONTINUE
10 = 2 * 2 *
GO TO 100
12 * I " 1 $
9
xom

i < -

YOU)
9
YOU)

( K -

JZ *

) GO

1 )

) GO

) GO

1 )

J - 1

TO



TO

TO




                                 16
INTERPOLATION ON PROPER FUNCTIONS
P0= .1 » I XA - XO(IZ) )
Q0= .1 * ( YA - YOCJZI I
00 3k L = 1, L2
IF ( PO.LE. .5 ) 18,
IF { QO.LE. .5 ) 22,
IF ( QO.LE. .5 I 26,
ICASE * 1 t P * PO t
A * PHU,IZ,JZ) 18 =
0 > PH(L«IZ,JZ+1) S E
GO TO 30
ICASE »3SP=POtQ=l. -QO
A a PH(L,IZ,JZ»1) S B = PH(L,IZ*1,JZ+1)
                           20
                           2<»
                           28
                           Q x QO
                            PH(L,IZ*1,JZ) S C *
                              PHU,IZ,JZ-1) t F
                                                            PH(L,IZ-1,JZ)
                                                            « PH(L,IZ*l,JZ+i)
                                                          $ C = PH(L,IZ-1,JZ*1)
                                  i E = PH(L,IZ,JZ*2) IF* PH(L,IZ*1,JZ)
i) * PH(L,IZ,JZ)
GO TO 30
ICASE * 2 t P = 1. - PO
A * PH(L,IZ+1,JZ) $ 9 *
0 = PH(L,IZ*1,JZ*1) t E
GO TO 30
ICASE = k S P - 1. - POl Q * 1. - QO
A > PH(L,IZ+1,JZ»1) SB* PH(L,IZ,JZ+1) $ C
0 = PHIL,IZ*1,JZ) S E = PH(L,IZ*1,JZ*2) S F
IF { K .NE. 1 ) PHKCLI = PHKL)
t Q * QO
PH(L,IZ,JZ) f C * PHCL,IZ+2,JZ)
* PH(L,IZ*i,JZ-U $ F a PH(L,IZ,JZ+1)
                                                                PHIL,IZ+2,JZ*1)
                                                                FH(L,IZ,JZ>
                                     171

-------
  SU1RJJTINE   CCRFUN
                                      (P*(=5-C)  *•  Q » < D - =  >  f  ?  * F * < •?
                  *  C - ?t * A  )  +  Q  * Q » lC*E-2.  *A)H-P*GI*  (A *F-3-3
                   IF « K .EQ.  1  )  GO TO 3<»
                   GO TO ( 3**,  32  ),  1C
             32     bf.NP = 1. 5  SGNQ = 1.
                   IF ( ICASE . £Q.  3  .OP. ICASE .EQ.  i,  )  SGNQ = -1.
                   IF ( ICASE .£0.  2  .0*. ICAS£ .£Q.  %  J  SGN*3 = -1.
                   JPHIQX(L) =  SGNP »«.5*(l-C>**=*<3*C-2.  *A| +
                  4-Q* ( A * F -  3 - 0 ) )
                             =  SGfjQ *(.5*(0-£)+Q»{D»e-2.  *A) *
             3*»     CONTINUE
                   GO TO ( 36,  38  ) ,  N
             36     < = 2
71                 Xfl = Xl $ rA  =  Yl
                   GO TO 3
             38     C03 = 0. f OCOX =  0.  S OCOY =  0.
                   00 %2 L = 1,  L2
                   COR = COR «•  PHM(LI  *  ALAHCLI * PHKD
7?                 GO TO C *»2,  40  I,  1C
             •+0     JCOX = OCDX  + PHN(L>  * ALAM(L> *  OPH10XIL)
                   JCOV = OCOY  * PHfl(U)  * ALAM(L) *
             t?     CONTINUE
             100    RETURN
e"                 EHO
                                     172

-------
G.  Subroutine INT2D
     This is a two-dimensional interpolation subroutine
using the standard formula
     f = f,
Four points f_ n, fn ,, f, n, f, , at the corners of a
             U / U   U f -L   X / U   X / -L
rectangle are used.  The values p and q are given by
p = (xa-x1)/(x2-x-L) , q = (ya-y1)/(y2-y-L) where  (*a»ya)  is  the
point at which the value of f is required and  (x,, y,),
(x2,y,), (x,,y2),  (x2,y2) are the coordinates at the corners
of the rectangle.  The surface fitted to the data is a
hyperbolic paraboloid.

     In the form INT2D(XA,YA,VAL,IDIOG) the coordinates
(x ,y )  are input as XA, YA, the result is output in
  cL  Si
VAL.  IDIOG=0 for successful interpolation, =1  for XA
outside the table range, =2 for YA outside the  table range.
In this case the table, W(I,J), I,J=1,9, appears in the
COMMON statement.  The quantity PH(15,9,9) of the COMMON
statement is not used in this subroutine.
                           173

-------
            *OEO INT?D
                  SUBROUTINE  INT 20
                  Co^ON /  BLK2  /   X(9),   Y(9), PH(15,9,9»,  ALAM{15),  W(9,9J
            G     INTERPOLATES  \/ALUE OF w  AT XA,YA,  FROP VALUES  GI^EN  AT X,Y
 5          C     N=9,  IOIOG  =  DIAGNOSTIC, 0=GK, 1=X OUT OF  RANGt,  2=  Y OUT  OF RANGE
                  IOIJG=0
                  30 10 1=1,9
                  IF(XA.LT.Xd) »  GO  TO  1?
               10 CONTINUE
10                ICIOG-1
                  GU TO 18
               1? UO It J=l,9
                  IF(YA.LT.Y-X(Il))
20                Q=(YA-Y(J1» »/(Y(J2)-Y(JD)
                  HI=W(II  Jl)
                  H2=H(I2,Jl)
                  W3=W(I1,J2>
                  M4=W(I2,J2)
25                VAL
               16 RETURN
                  END
                              174

-------
H.  Subroutine MATINV
     This is a standard subroutine for matrix inversion.
In MATINV(A,B,N,M)  the matrix to be inverted is contained
in A and N gives the number of rows/columns of A.  The inverse
of A is returned in B.  Both A and B are listed in full
form.
                             175

-------
            •OECK
                  j'J3^0UTINt  MATINiHA, i,N, M)
               1C
15
?0
50
eo
30
35
 65
 70
            C
            C
            C
50
 95

100

105
110
115
                  INITIALIZATION

                  GO IE  I=ltN
                  30 1C  J=1,N
                  a (I, J)=A (I, J>
   Jf-IVUT0
   00 50 J = 1,N
   IF(JFIVOT.GT. A8S(8(I,J) ) )  GO TO 50
   PIVOT = 8(1, Jl
                  JCOL=J
                  CONTINUE
                  CONTINUE
   JFIrfoT ( JCOLJ-IROW

           PIVOT  COLUMN WITH ROM MULTIPLIERS
   CO 70 1=1, N
   *=-* (I, JCOL)
   IFIIROW.NE. I)  GO  TO  70
   X=l.
   3  GO  TO  115
   00 110 1=1, N
   IF(I*OW.EQ.I)  GO  TO  105
   d(I,J) =  B(I, J)+3(I, JCOLt *C( J)
   GO TO 110
   d
-------
  Sim OUTING  1ATINV

                  L =  IPIVOTIM                                                       MATI  5*0
              120 CIO  =  3(1,L)                                                       MATI  550
                  00 130  K=1,N                                                       MATI  5oQ
              139 3(I,K)  =  C(K)                                                       MATI  570
cQ                OG 150  1=1,N                                                       MATI  530
                  00 1*4.0  K = 1,N                                                       MATI  590
                  L =  JPItfOKK)                                                       MATI  603
              1*0 C«<)  =  3(L,I)                                                       MATI  610
                  OC 150  K=1,N                                                       MATI  620
£5            150 S(K,I)  =  CfiO                                                       MATI  b30
                  •
-------
I.   Subroutine ADOPT(XO,YO)
     This subroutine adds the station location with co-
ordinates XO,YO to the station list and rearranges the list
so that it is in canonical form, i.e., the triangle assignments
include the new point and vertices of all triangles are in
counterclockwise order', if it is a new boundary point it is
listed as such and the number of boundary points incremented
and the counterclockwise ordering of the boundary points is
maintained.

     The logic is somewhat involved and many exceptional
cases occur.  The subroutine print-out contains many ex-
planatory comments on what is taking place.  These should
be sufficient to unravel the situation.  Extensive use is
made of subroutines AR2,  TRITST, PORDR, and TRIFIX.

     One of the basic ideas is to the effect that if XO,YO
is the point P and if II, 12 and neighboring boundary points
listed in counterclockwise order, then the "area" of the
triangle formed by the ordered points P, 11, 12 will be
"positive"  (i.e., P, II,  12 are in counterclockwise order
around a triangle) for all boundary point pairs II, 12
(neighboring) if P lies inside the boundary, and will be
"negative" for at least some pair II, 12 if P is outside
the boundary, and will be "zero" if P lies on 11, 12
(interior or exterior).  If the point is clearly interior
to the boundary of the points already located, it is
required then to find the triangle of already located
points within which it lies  (or on a side of which it lies).
One then subdivides this triangle.  If the new point lies
outside the boundary of already located points, then additional
triangles are to be formed.  The many exceptional cases
require careful handling since experience indicates that
they occur with distressing frequency.
                             178

-------
              TESTS POINT COORDINATES AGAINST  EACH BOUNDARY  (P)
              SEGMENT  VIA AR2 TO FIND IF INSIDE OR OUTSIDE THE
              BOUNDARY  (P)

INSIDE

OUTSIDE

                     I
 IDENTIFY WHICH TRIANGLE CONTAINS
 THE NEW POINT VIA  TRITST
 INSIDE
ON  SIDE
 ADD  NEW POINT AND  TWO
 NEW  TRIANGLES TO THE  LISTS
®	»i
 CHECK  POINT ARRANGEMENTS
 VIA PORDR
         I
 IMPROVE TRIANGLES
 VIA TRIFIX
      RETURN
              INFERIOR
/is
V
                           SIDE INFERIOR OR
                        ON BOUNDARY (P)
BOUNDARY
         FIND OTHER TRIANGLE
         WITH THIS SIDE COMMON
                I
         MAKE FOUR TRIANGLES
         OUT OF THESE TWO
i
               LOCATE ADJACENT
               BOUNDARY (P) POINTS
                        I
               REASSIGNS BOUNDARY (P)
               POINTS AND ADDS ONE
                                                                       i
                                      MAKES TWO TRIANGLES
                                      OUT OF ONE AND  ADDS
                                      NEW POINT
         FINDS LEFT MOST AND RIGHT MOST
         BOUNDARY (P) POINTS "SEEN" FROM
         NEW POINT, THERE ARE  K SUCH
                      I
         FORMS K-1 NEW TRIANGLES

         ADDS BOUNDARY  POINT
                                 FIGURE V-7

                   ABBREVIATED FLOW CHART FOR ADOPT
                                        179

-------
            *OcC* A03PT
                  SUBROUTINE ADOPTI xo, YO >
            C     ADDS POINT TO AN ARRAY IN STANDARD FORM SO THAT  THE  STANDARD  FORM
            C     IS PRESERVED AND CHECKS TRIANGLE ARRANGEMENTS
 5          C     CHECKS TRIANGLE ARRANGEMENT SO IT IS ,»GOQQ,«.
                  COMMON / 8LK1 / NSTN, N8QY, NTR, XSJ30), YS<30!,  ITRC5Q.3)
                 ,  , XC, YC, R, IBOY(2fl>
                  INTEGER P
                  DIMENSION ISIG(2fll, PI2), IHI20)
10          C     PUTS ( XO, YO ) IN XS, YS(NSTN«1)
                  Nl = NSTN * 1
                  XSCN1J * XO
                  YS(N1» x YO
            c     CHECKS WHETHER ( xo, YO » is INSIDE OR OUTSIOE THE BOUNDARY
15                00 10 I * It N80Y
                  II = I «• 1 J IF  ( I .EO. N8DY I II * 1
                  CALL AR2< leOYlII, I30Y(I1I, Nl» A, 10 )
                  IF ( ID .Lt. 0 )  GO TO 54
            10    CONTINUE
20          C     ALL VALUES OF ID ARE * SO IS INSIDE.  FINO WHICH TRIANGLE
                  00 12 I « 1, NTR
                  Jl = ITRCI.l) I J2 * ITR * IA 1 ITRII4,3» *  M
            C     J03 DONE.  TWO TRIANGLES, ONE POINT, NO 3DRY  PTS AOOED
                  GO TO 12C
            C      NOW TAKE CARE OF EXCEPTIONAL CASES
50          20    IF ( ID .LT. 0 ) GO TO 22
            C     HA/E A DEGENERATE TRIANGLE
                  PRINT 1870, I, Jl, J2, J3
            1070  FORMAT (» TRIANGLE *,I5,» VERTICES*,315,* IS  A DEGENERATE*)
                  GC TO 126
55          C     THE POINT ( XO,  Y8 1 IS ON A SIDE OF THE TRIANGLE
                                         180

-------
               ACC^T
C
22

2*
2*
32
36
<.o
•»2
C
C
<*<»





FINJ
10
GO
IA
IA
IA
IA
IA
IA
=
TO
-
=
=
=
£
S
POINT
TH
00
IF
00
Jl
IF
IF
IS
•+«
(
<*6
=
(
(
THE
-10
(
Jl
J3
J2
Jl
J3
J2
(
OTHER TKIANGLL

2<»,
« IB
J IB
S 18
* IB
S 18
« IB
XO,

28, 3
= J3
= Jl
- Jl
= J2
= J?
- J3
YO )

^,
S
B
S
S
t
S

36,
1C
1C
1C
1C
1C
1C
WITH T

HO,
= J2
= J2
= J3
= J3
= Jl
= Jl

<*2
R
s
s
B
S

LIES ON SIDE
HIS SlDc ANu MAOc t OF 2.

)
GO
GO
GO
GO
GO

IA

, ID
TC *»<*
TO t+b
TO <*<*
TO <»<«
TO *«<»

, IB. FIND ANOTHER TRIANGLE WITH
SIDE
I
I .
J
J «•
IA
13
= If
EQ.
= 1,
1 S
.£0.
.EQ.
NT*?
IT )
3
IF (
ITR<
ITR(

GO

J
I,
If

TO

.EQ
J) .
J) .

1*3

. 3 )
AND.
ANO.






Jl
19
IA
•
•



= 1
EQ. ITR(IfJl) ) GO TO 50
EQ. ITRd.Jl) 1 GO TO 50
 65
 70


             <*6    CONTINUE
             w8    CONTINUE
 7b          C     THE TRIANGLE  MUST  HA = Nl
                   ITRdFA.U  =  IA  t  ITR(ITA,2I  = I£ S ITR  IM  = NBOY
                   IP = I * 1  S  IF  (  I .EQ. NBOY )  IP = 1
                   IF ( { ISIG(IM)  .£Q.  1  > .ANO. ( ISIGJIP) .EQ. 1 ) ) GQ  TO  90
                   GO TO 60
             58    CONTINUE
105                GO TO 62
             60    ISIGU1 = 1
             C     CHECKS OUT  LEFT  AND RIGHT  POINTS ,,SE£N,, FRCM NEW POINT
             62    K = 0
                   00 66 I = 1*  N30Y
110                II = I * 1  f  IF  (  I .EQ. NBOY I  II = 1
                                      181

-------
   SUBROUTINE  ADOPT

                   IF { I3IGCI) .£0.  ISIGtll)  )  66,  c*
             t><»    K = K * 1
                   P(*» = II
             66    CONTINUE
115                IF « ISIG(P(1)  )  .EQ. -1  )  68,  70
             63    IL = P(l»
                   IR = P<2>
                   GO TO 72
             70    IL = P(2)
120                IR = P(1I
             C      IL = FIRST POINT  TO LEFT  ,,SEEN,,  FROM NEW POINT
             C     IR = LAST POINT TO RIGHT  ,,SEEN,,  FROM NEW POINT
             C     K = NUMBER OF POINTS , .SEEN,,  FRCH  NEW POINT
             72    K = IR - IL + 1
125                IF ( K .LT. 0 > K.  = NBOV  «•  K
                   IF ( K .LE. 1 ) 7t, 76
             7<*    II - IBDY(IL) { 12 = ISOYCIR)
                   PRINT 1075, II, 12, XO,  YO
             1075  FORHAT {» LEFT  AND RIGHT  BOUNDARY  POINTS*/2I5/»ARE SEEN FROM*/
130               .2F10.5)
                   GO TO 126
             C     THtRt ARE K - 1 NEW TRIANGLES
             76    12 = K- 1 5 N2  =  NTR
                   DO 78 I = 1, 12
135                N2 = N2 + 1
                   Jl = IL * I - 1 $  IF (  Jl  .GT.  N3DY >  Jl = Jl - NBOY
                   J2 = Jl * 1 $ IF  ( J2 .GT.  N30Y )  J2 = J2 - N8CY
                   ITRJN2,!) = IBDY(J2)
                   ITR(N2,2) = IEDY = Nl
                   12 = NBOY * 3 - K
                   DO 3% I = 2, 12
                   II - IR » I - 2 I  IF {  II  .GT.  N9CY »  II = II - NBOY
150          8<»    IBOY
-------
   SUBROUTINE  ADOPT

             90    K2 =  I8QY(IP)
                   II =  IP  -  1  $  IF  1  II .EQ.  0 )  II = 1
                   N! =  IBOYdl)
                   DC 92 I  -  1, NTR
170                00 92 J  s  l, 3
                   Jl =  J »• 1  S IF  <  J ,£Q.  3  ) Jt = 1
                   IF (  Kl  .£Q. ITRd.J) .ANu.  K2  .tQ, ITR(I.Jl)  ) GO TO  9<+
                   IF <  K2  .EQ. ITRd.J) .AND.  Kl  ,EQ. ITR(I,J1>  ) GO TO  9*
             92    CONTINUE
175                PRINT 1085. Kl, K2
             1035  FORMATC FAILS TO  LOCATE  TRIANGLE HITH SEGMENT'/?I5/»  ON
                  »30UNDA«r»)
                   GO TO 126
             94    J2 =  J » 2  $ IF  <  J2 .GT.  -T  )  J2 = J2 - 3
180                IT =  I
                   IA -  ITR(ITtJ) «  Id = ITR(IT.Jl)  $ 1C = ITR(IT,J2>
             96    IT^(IT,1I  =  1C f  ITRdT,?)  = IA $ IT«
-------
J.  Subroutine TRIFIX
     This subroutine readjusts the triangle assignments
to maintain a network of non-overlapping triangles with
observation points at their vertices to prevent the occurrence
of triangles of unnecessarily small area.  It finds triangle
pairs with a common side.  When such a pair is located, they
are combined into a quadrilateral.  This quadrilateral may
be divided into two ways.  The triangle subdivision is
prefered that has the shorter of the two diagonals of the
quadrilateral as the common side  (Figure V-8).

     The quadrilateral formed by two triangles with a common
side may be re-entrant.  In this case the second diagonal
lies "outside" the quadrilateral and the subdivision is
accepted as is (line 36), see Figure V-9.  The case is
identified by the fact that when the outside diagonal is
used, one of the triangle areas resulting is larger than
the sum of the areas of the original triangle pairs.

     When the re-entrant quadrilateral test fails but still
the area of the quadrilateral obtained by the second sub-
division exceeds that obtained by the first subdivision by
more than 0.01% a diagnostic is printed.  This test is
required to account for the effect of roundoff errors in the
computer.
                             184

-------
                    FIGURE V-8

THE TWO POSSIBLE SUBDIVISIONS OF QUADRILATERAL ABCD INTO
TWO TRIANGLES.  ON  THE LEFT, THE SUBDIVISION  ABC,  ACD RESULTS
IN TRIANGLES WITH A  LARGER  COMMON  SIDE AC WHILE THE
SUBDIVISION ABD.BCD  ON THE RIGHT RESULTS IN A SMALLER
COMMON SIDE BD AND IS THEREFORE PREFERRED.
                    FIGURE V-9

THE  REENTRANT QUADRILATERAL CASE.  THE ORIGINAL TRIANGLES
ARE  ABC AND ACD.  THE SECOND DIAGONAL  BD LIES " OUTSIDE "
THE  QUADRILATERAL  AND IS REJECTED.
                         185

-------
            *OECK TRIFIX
                  SU3-40UTINE TRIFIX
            C
            C
            C
            C
10
15
20
25
30
35
            12
            1002
            1001
 CHiCKS  THE  TRIANGLE SUBDIVISION ANO TfiYS TO IMPROVE  THE
 TRIANGLE ARRANGEMENTS.  FINOS TRIANGLE PAIRS  WITH  COMMON  SIDE  ANO
 TESTS THE JIAGONALS OF THE QUADRILATERAL TO SEE  IF MORE NEARLY
 EQUAL AREA  SUBDIVISION IS POSSIBLE
 COMMON  / BLK1 / NSTN, NBOY, NTR, XS<30), YS130),  ITR(50,3)
,  ,  XC,  YC,  R, I80Y(20>
 DIMENSION XU3),  X2(3), Yl(1), Y2{3)
 12  =  NTR -  1
 00  20 I * 1, 12









2
H
6
a




10
11




C
C
<1 = I » 1
00 20 K = Kit NTR
DC 16 J = It 3
Jl = JtJ2 = J*-l$IF(J .EQ.
JA = ITRCI,J1) S J3 = ITR(I,J2>
00 16 L * It 3
Ll=LfL2=L+i$IFCL .£Q.
LA - ITRCKtLl) I Ld = ITR(K,L2)
IF ( ( JA .EQ. LA ) .AND. ( JB .EQ
ISM - 1 $ GO 10 A
IF ( ( JA .EQ. LB » .ANO. < JB .EC
ISH = 2
J3 = J2 * 1 $ IF ( J3 .GT. 3 1 J3
L3 = L2 * 1 $ IF I L3 .GT. 3 > L3
JA = ITR(ItJl) « JB = ITR(I,J2I t
LC = ITR«k,L3)
GO TO { lit 10 ), ISM
LA = ITR(K,L2) $ LB = ITRIK.L1
CALL AR2( JA, JB, JC, Al, 101
CALL AR2( LA, LB, LC, A2, 102
CALL AR2C JB, JC, LC, A3, 103
CALL AR2( JC, JA, LC, A<* , 104
AT = Al + A2
TESTS IF QUADRILATERAL REENTRANT.
OTHERWISE SIMPLY DIVIDE



3 ) J2 = 1


3 > L2 = 1

. LB > 1 2, H

. LA 1 ) 6, 16

= 1
= 1
JC = ITR(I,J3»








IF IS CANNOT

                                                 .GE.  AT  I  1  GO TC 20
                  BT
                  IF
                  DO
    = A3 *
    ( A8S(
    12 M =
AT I  .LT,
0001 * AT )  GO TO
                                        $ YKMI  =  YSUTRCItMH
                                        $ Y2(M)  =  YSCITRCKtMM
                                                            M = 1, 3 )
           Ak
           BT -
           It 3
       = XS(ITR
-------
SUBROUTINE  TRIFIX
          16    CONTINUE
          20    CONTINUE
                RETURN
                END
                        187

-------
K.  Subroutine PORDR
     This subroutine checks the arrangement of the vertex
points for counterclockwise ordering and rearranges the
ordering if a clockwise ordering is found.
                            188

-------
            *OEO PGROR
                  SUBROUTINE PQR3R
            C     CHECKS THE ARRANGEMENT  OF  POINTS ARCUNO THE TRIANGLES TO
            C     ASSURE CC ORDERING
 5                COMMON / 3LK1  / NSTN, M30Y,  NTfc, XS(30), YS(3QI, ITR(5Q,3)
                  UO •* I - 1, NTR
                  Jl = ITRCI.ll  t J2  =  ITR(I,2»  S J3 = ITR(I,3)
                  CALL AR2( Jl,  J?t J3, A, IGIOG  )
                  IF ( IOIOG .EQ. 1 ) GO  TO  k
10                ITRd.ll = J2  S ITR(I,?| = Jl
            k     CONTINUE
                  RETURN
                  END
                                189

-------
L.  Subroutine AR2
     This subroutine computes twice the area of the triangle
with vertices at the points with indices I,J,K and outputs
this as A with a diagnostic IDIOG which takes the value of
+1 if I,J,K in counterclockwise order about the triangle,
-1 if in clockwise order, and 0 if A=0.
                              190

-------
            •DECK AR?
                  SUBROUTINE  AP2I  I,  J,  is,  A,  IDIOG )
            C     COMPUTES TWICE THE  AREA  OF  THE TRIANGLE WITH STATIONS  I,  J,  K  AS
            c     VERTICES.   RETURNS  ABSOLUTE  VALUE AT A ANO SIGN IN I^ICG  AS  <•!  o'
 5          C     -1.  IOIOG  =  +1  MEANS  I,  J,  < IN CC ORDER.
                  CCMNON / BLK1 /  NSTNf  N90Y,  NTR, XS(30), YS(30),  ITR(50,3»
                 , , XC, YC,  R, I8DY(20)
                  XI = XS(I)  8  X2  =  XS(J>  J X3 = XS(K»
                  Yl = YSII)  S  Y2  =  YSU)  J Y3 = YS(K)
10                A = ( X2 -  XI )  *  (  Y3 -  Yl  I - { X3 - XI I *  ( Y2 - Yl  )
                  IDIOG = SIGNt 1.,  A  >
                  IF < A .EQ. 0. )  IOIOG =  0.
                  A - A3S( A  )
                  RETURN
15                END
                                                  191

-------
M.  Subroutine TRITST
     This subroutine determines ;_he location of the point
input, P, with coordinates X, Y with respect to a triangle
with vertices at the points Jl, J2, J3 where these are the
index numbers for the coordinates of an already located
observation point.  The output consists of a diagnostic
(IDIOG), and twice the areas of the following triangles:  A for
J1,J2,J3, P for J1,P,J3, Q for J1,J2,P.  The diagnostic
indicates as follows:

IDIOG = 2      , J1,J2,J3 in counterclockwise order,  (X,Y) inside
        1      ,  '	'  clockwise
        0      ,  "   "  " in straight line
        -l,-2  , P lies on Jl,J3
        -3,-4  , "   "   " J1,J2
        -5,-6  , '	J2,J3
        -7,-8  , " is outside opposite J2
        -9,-10 , "  "    "       "     J3
        -11,-12, '	       "     Jl
                             192

-------
            *D£CK TRUST
                  SUBROUTINE  TRITSTl  X,  V,  Jl,  J2, Jl, IOIGG, A, P, Q  >
            C     TESTS WHERE F  $  Y2 =  YSCJ2) t  Y3 = VS(J3>
15                Al = X2  - XI «  A2 = X3 -  XI  « A3 = X - XI
                  81 = Y2  - Yl f  B2 s Y3 -  VI  $ 83 = Y - VI
                  A = Al * B2 -  A2 *  81
                  P = A3 * B2 -  B3 •  A2
                  Q = Al * 83 -  31 *  A3
20                IF ( A I 2,  12,  <»
            2     K s 1
                  A = -A $ P  = -P f Q =  -Q
            H     IF { P I 20, 14, 6
            6     IF ( Q ) 22, 16, 8
25          8     IF { A - P  - Q  I 2<*, 18,  10
            10    IDIOG =  2 - K
                  GO TO 26
            12    IOIOG =  0
                  GO TO 26
30          1%    IDIOG =  -2  * K
                  IF ( 0 .LT.  0.  1 GO TO 20
                  IF C Q .GT.  A  »  GO  TO  2<*
                  GO TO 26
            16    IDIOG =  -<*  * K
35                IF I P .GT.  A  )  GO  TO  22
                  GO TO 26
            18    IOIOG =  -6  «• K
                  GO TO 26
            20    IOIOG =  -8  * K
*»0                GO TO 26
            22    IOIOG =  -10 *  K
                  GO TO 26
            2
-------
                       REFERENCES
 1.    Abramowitz,  M.  and Stegun,  I.A.  (1964),  Handbook of
      Mathematical Functions,  N.B.S.  Applied Mathematics
      Series No. 55,  Superintendent  of Documents,  U.S.G.P.O.,
      Washington,  B.C.,  xiv +  1046 pp.

 2.    Albrecht,  J. and L.  Collatz (1958),  Zur numerischen
      Auswertung mehrdimensionaler Integrals;  Zeitschrift
      fur Angewandte  Mathematik und  Mechanik,  Vol.  38, pp.  1-15.

 3.    Anderson,  T.W.  (1958), An Introduction t£ Multivariate
      Statistical Analyses, John Wiley & Sons, Inc.,  New
      York,  xii  + 374 pp.

 4.    Bartlett,  M.S.  (1951),  "The effect of standardization
      on an  approximation in factor  analysis", Biometrika,
      Vol. 38,  pp. 337-334.

 5.    Beard, R.E.  (1947),  Some Notes on Approximate Product-
      Integration, Institute of Actuaries,  London,  Vol. 73,
      pp. 356-416.

 6.    Bellman,  Richard (1960), Introduction to_ Matrix Analysis,
      McGraw-Hill Book Co., Inc., New York, xx + 328  pp.

 7.    Boland, W. Robert and C.S.  Duris (1969), Product
      Type Quadrature Formulas, Mathematical Report 69-9,
      Department of Mathematics,  Drexel Institute of  Technology,
      Philadelphia, PA 19104,  48 pp.

 8.    Boland, W. Robert and C.S.  Duris (1971), Product
      Type Quadrature Formulas, B.I.T., Vol. 11, pp.  139-158.

 9.    Boland, W. Robert (1972), The  Convergence of Product-type
      Quadrature Formulas, SIAM J. Numerical Analysis, Vol. 9,
      pp. 6-13.

10.    Brooks, C.E.P., Durst,  C.S., Carruthers, N.,  and Dewar,  D.
      (1950a),  "Upper Winds over the World", Geophysical
      Memoirs,  No. 85, Meteorological Office, Air Ministry,
      MO499e, London, 57 pp.

11.    Brooks, C.E.P., Durst,  C.S., and Carruthers,  N.  (1946),
      "Upper Winds over the World, Part 1,  The frequency dis-
      tribution of winds at a  point  in the free air", Quarterly
      Journal of the Royal Meteorological Society,  London,
      Vol. 72,  pp. 55-73.
                          194

-------
12.  Buell, C. Eugene (1972) ,  "Integral Equation Representation
     for 'Factor Analysis'",  Journal of A tmo spheric Science,
     Vol. 28, No. 8, pp. 1502-1505".      ~     '"    ~     "~

13.  Ceschino, F. and M. Letin (1972), Generalisation de la
     methode de Havie au calcul des integrales multiples,
     C.R. Acad. Sc. Paris, Series A, Vol. 275, pp. 505-508

14.  Craddock, J.M. and Flintoff, S.  (1970), "Eigenvector
     Representation of Northern Hemisphere Fields", Quarterly
     Journal of the Royal Met e or olog i c a 1 Society, Vol. 96,
     pp. 124-129.

15.  Davis, Philip J. and Philip Rabinowitz (1967), Numerical
     Integration, Blaisdell Pub.  Co., Waltham, Mass, 225 pp.

16.  Dixon, Valerie A.  (1973), Numerical Quadrature, A
     Survey of Available Algorithms, National Physics
     Laboratory, Division of Numerical Analysis and Computing,
     Oxford, NPL Report NAC36, 28 pp.

17.  Eimutis, E.G., and Konicek,  M.G. (1972), "Derivations
     of continuous functions for the lateral and vertical
     atmospheric dispersion coefficients", Atmospheric
     Environment, Vol. 6, pp.  859-863.

18.  Ewing, G.M. (1941), On Approximate Cubature, Amer.
     Math.  Monthly, Vol. 48,  pp.  134-136.

19.  Farmer, S.A.  (1971), "An Investigation of the Results
     of Principal Component Analysis of Data Derived from
     Random Numbers", The Statistician, Vol. 20, pp. 63-74.

20.  Gantmacher, F.R. (1960a) , The. Theory of Matrices, I_,
     Chelsea Publishing Co.,  New York, x + 374 pp.

21.  Gantmacher, F.R. (1960b) , The Theory o_f Matrices, II,
     Chelsea Publishing Co.,  New York, ix + 276 pp.

22.  Hammer, P.C. and A.W. Wymore  (1957), Numerical Evaluation
     of Multiple Integrals, I., Mathematics of Computation,
     Vol. 11, pp. 59-67.

23.  Hammer, P.C. and A.H. Stroud  (1958), Numerical Evaluation
     of Multiple Integrals, II, Mathematics of Computation,
     Vol. 12, pp. 272-280.

24.  Horst, Paul (1965), Factor Analysis of Data Matrices,
     Holt,  Rinehart and Winston,  Inc., New York, xix + 730 pp.
      ir
25.  Joreskog, K.G.  (1962), "On the statistical treatment
     of residuals in factor analysis", Psychometrika, Vol. 27,
     pp. 335-354.

                          195

-------
26.   Kenney,  J.F.  and Keeping,  E.S.  (1951),  Mathematics of
     Statistics,  Part Two,  D. Van Nostrand  Co.,  Inc.,  New
     York,  xii +  429 pp.

27.   Lawley,  D.N.  (1956),  "Tests of  significance for the
     latent roots of covariance and  correlation  matrices,
     Biometrika,  Vol. 43,  pp.  128-136.

28.   Lawley,  D.N.  and Maxwell,  A.E.  (1963),  Factor Analysis
     as a. Statistical Method, Butterworths,  London, vii + 113 pp.

29.   Lawley,  D.N.  and Maxwell,  A.E.  (1971),  Factor Analysis
     as a_ Statistical Method, Butterworths,  London, viii + 153 pp.

30.   Levy,  Paul (1965),  Processes Stochastique et Movement
     Brownien, Gauthier-Villars et Co.,  Paris, vi + 438 pp.

31.   MacDuffee, C.C. (1949), Vectors and Matrices, The
     Mathematical Association of America, Providence,  R.I.,
     xi + 203 pp.

32.   Mehta, M.L.  (1967),  Random Matrices and the Statistical
     Theory of Energy Levels, Academic  Press, New York,
     x + 259~pp~i

33.   Mises, R. von(1936),  Formules de cubature,  Revere
     Mathematique del'Union Interbalkanique, Vol. 1, pp. 2-27.

34.   Mises, R. von  (1954), Numerische Berechnung mehrdimensionaler
     Integrals, Zeitschrift fur Angewemdte  Math. Und Mech.,
     Vol. 34, pp.  201-210.

35.   Panchev, S.  (1971),  Random Functions and Turbulence,
     Pergamon Press, Oxford, xiii +  444 pp.

36.   Perils,  S. (1952),  Theory  of Matrices,  Addison-Wesley
     Press, Inc.,  Cambridge, Mass.,  xiv + 237 pp.

37.   Ruff,  Ronald E  (1973a), Memorandum of  January 10 with
     attachments.

38.   Ruff,  Ronald E  (1973b), Memorandum of  23 January with
     attachments.

39.   Slepian, David  (1962), "The one sided  barrier problem
     for Gaussian Noise",  The Bell System Technical Journal,
     Vol. 41, March 1962,  pp.  463-501.
                           196

-------
40.  Stroud, A.H.  (1971), Approximate Calculation of Multiple
     Integrals, Prentice-Hall, Inc., Englewood Cliffs, N.J.,
     418 pp.

41.  Stroud, A.H.  (1960a), Quadrature Methods for Functions
     of More than  one Variable, Annals of_ the New York
     Academy of Sciences, Vol. 86, pp. 776-791.

42.  Stroud, A.H.  (1960b), Numerical Integration Formulas
     of Degree Two,  Mathematics of Computation, Vol. 14,
     pp. 21-26.

43.  Spearman, Carl  (1904), "General intelligence objectively
     determined and  measured", Amer. J. Psychol., Vol. 15,
     pp. 201-293.

44.  Synge, J.L. (1953), Simple bounding formulas for integrals,
     Canadian Journal of_ Mathematics, Vol. 5, pp. 46-52.

45.  Thacher, H.C.  (1957), Optimum Quadrature Formulas in
     8 Dimensions, Mathematical Tables and Other Aids to_
     Computation,  Vol. 11, pp. 189-194.

46.  Turnbull, H.W.  (1960), The Theory of Determinants,
     Matrices and  Invariants,  Dover Publications, Inc., New
     York,  xviii + 374 pp.  (From the second edition of
     the original,  1945).

47.  Turner, D. Bruce (1970),  Workbook of Atmospheric
     Dispersion Estimates, Air Resources Field Research
     Office, Environmental Science Services Administration,
     U.S. Department of Health, Education and Welfare,
     National Air  Pollution Central Administration, Cin-
     cinnati, Ohio,  vii + 84 pp.

48.  Tyler, G.W. (1953), Numerical Integration of Functions
     of Several Variables, Canadian Journal of Mathematics,
     Vol. 5, pp. 393-412.

49.  Wiener, N. (1949),  Extrapolation, Interpolation, and
     Smoothing of  Stationary Time Series, The Technology
     Press and John  Wiley & Sons, New York, ix + 163 pp.
                            197

-------
                      APPENDIX A
 Conditions on an Empirical Formula fo r_a Correlation
                      Coefficient
     Before adopting the use of the proper function/proper
value expansion of the correlation coefficients as described
in Chapter III, it was felt that an empirical formula could
be developed that would represent the correlation coefficients
involved.  This effort was unsuccessful.  At least a part
of the reasons for this lie in the following analysis of the
structure of correlation coefficients or covariances .

     An empirical formula to represent a correlation co-
efficient field may not be chosen in a perfectly arbitrary
way even though the expression leads to values that are con-
fined to the range (+!,-!)  and has a maximum of +1 at zero
separation between the points concerned.  There are several side
conditions that the functions must satisfy.  They are stated
here in terms of only one independent variable to keep the
notation reasonably simple, but may be extended to two
or three independent variables with no difficulty.  We
consider the correlation of a field property, p, at two
points, x, and x_ , or at x and x+£>  Thus the correlation
coefficient concerned will be written as r(x2,x,), r(x+C,x),
or as r(£;x) all of which will be considered as equivalent.

     The usual conditions that must be satisfied are:

     1.  The correlation coefficient takes on the value +1
when Xi=x2 or at £=0; i.e., r(x, ,x,)=+l, r(x+0,x)=+l,
r(0;x)=+l.  On the other hand it is not necessary that the
correlation coefficient be continuous at this point.  It
may have a jump discontinuity here.  In other words, it may
be that the limit, for £>0, may be less than +1,
lim r(£;x)=a,
                          A-l

-------
     2.  The value of the correlation coefficient must be
within the range  (+!,-!).

     3.  It must  satisfy the symmetry condition r(x2,x,)=
r(x, ,x2).  This may also be expressed as r(x+£,x)=r(x,x+£)
and  also as r(£;x)=r(-£;x+£).  This condition  is occasionally
overlooked.

     4.  It must  be a positive definite  (or at least a non-
negative definite) function.  In the case of an homogeneous
field, this implies that its Fourier transform be positive
 (or  non-negative).

     Elementary statistics texts do not usually discuss  these
points in detail.  The reader is referred to Cramer and  Lead-
better  (1967), Panchev  (1971), and others.  Slepian  (1962)
is particularly interesting though it deals principally  with
another subject.  Levy  (1965) is especially good.

     In constructing an empirical formula for  a correlation
coefficient, one  is tempted to use an expression in the  form
r (£;x) =r (£ ;a (x) , 3 (x) ,	) where a(x), 3(x) are parameters
which in turn are functions of the location of the point x.
     To understand what goes on, consider a simple series
expansion of the correlation coefficient function
                          A-2

-------
where P-j^ has coordinate x and P   is at  x+£.   The coefficient
A2(x) is negative.  If we reverse the roles  of  P,  and P2/
then PI will have coordinate  (x+£)-£ and  P2  will be  x+£
so that one has r(£,x)=r(-£,x+£;)  for the  equality r(P2,P,)=
r(P-L,P2).  Then one must have equality  of the two series

             9        ^                   ?         3
     1+A  (v\ F +A  (v\ F +      = 14-A (•x+F)F — A  (v+ri? +       .
     J-Tra^ \X-f £3   O * •"•' ^   ...  — X~r\~ \ A < C, / c,   O V •" S / S ~ * * *
         ^*       O                 ^          -J

If the coefficients also  have valid power series expansion
about x  (or 5=0), we  expand  coefficients on the right and
collect  terms in  ascending powers of  £  to obtain
Equating coefficients of  like  powers  of £ on each side, then
                       i
          A3 = ~A3 + A
                      i     ii
                         A2/2!
                         - A3/2!
                      I     ii        ii I        T\r
                       + A/2!  -  A /3!  + A
     It is readily seen  that  these lead to a system of
equations for the coefficients  of  the odd order terms in
terms of the derivatives of the even ordered terms, and that
                            A-3

-------
they give no information on the even order terms.  If we
write

     A     = a A'     + a A'"    +     + * A(2n-V
      2n-l    1 2n-2    2 2n-4   '"    n 2

The coefficients a.  are given by the system of equations


      2al                                     = 1/1!

      a1/2!+2a2                               = 1/3!

      a1/4!+a2/2!+2a3                         = 1/5!

      a1/6!+a2/4!+a3/2!+2a4                    = 1/7!
a1/(2n-2) !+a2(2n-4) !+ ... +an_1/2I+a
                                     2n
 which are easily solved recursively,  the first few solutions
 are a-L=l/2!/  a2=-l/4!,  a3=3/6!,  a4=-17/8!,  a5=155/10!, etc.
 The general expression  for the solution is  not immediately
 obvious.
                           A-4

-------
     A stochastic field of property is said to be homogeneous
 (in the extended sense) if its statistical parameters are
independent of the location of the point P,  (here, x) with
which the points P2 (here, x+C) are correlated.  This means
that the correlation coefficient r(£;x) is a function of £
alone, r(£), and (of course) the standard deviations are
also constant.

     The fact that the correlation coefficient for a non-
homogeneous process contains odd order terms in its series
expansion does not imply that a function form with odd order
terms necessarily represents a non-homogeneous process.
For example, nearly all simple empirical expressions for a
correlation coefficient are applicable only to an homogeneous
process.  The very popular expression r=exp(-|£j/L), L =
scale parameter, can only apply to an homogeneous process.
Thus, if we assume the contrary and let r(x2,x,)=exp(-|x2~x,|/L(x,)),
then r(x,,x2)=exp(-|x,-x2|/L(x2)), and since these correlation
coefficients are equal, it follows at once that L(x,)-L(x~)
which means that L is a constant and hence that the process
is homogeneous.

     The same kind of analysis may be made for the two point
covariance functions.   It is not necessary that the covariance
function have a maximum at the point P1=P2*  In tiie case
of a one-dimensional function one may use

     cov(P2,P1) = cov(£;x)  = cov(x)+C1(x)5+C2(x)52 + 	

and
                          A-5

-------
    cov(P1,P2) = cov(-C;x+C) = CQ (x+£) -GI (x+€) £+C2 (x+C)



                      9      HIT      JIT  4

                                      V
Now rearranging in terms of powers  of  £ ,
cov(P1,P2)  = C0+(CQ-C1)?+(CQ/2!-C2)C+(CQ/3!-C/2!+C2-C3)



               TV     " '      "     '      4
           +  (CjV/4! -(:.,_  /3i+C2/2!-C3+C4)?4+ ...




Equating  coefficients in the first and last of these expressions

since cov(P.,,P, )  = cov(P, ,P-) one obtains the system of

relations
      f  = C
      L0    o
      Cl  = -C1+C0
         = -C3+C2-C1/2!+C() /3!
               I   it     n I     -rw

      C4 = C4~C3+C2/2I~C1 /3!+co
                         A-6

-------
which lead to expressions for the odd order terms as
                      I      T\T     W
                       /3!-C:Jv/4!+Cg/5!
This system of equations is the same as those obtained for
the terms in the expansion of the correlation coefficient
with the exception that one starts here with C, while before
one started with A~ and we have rather general values for
GO and C, while in the previous example A =0 and for any
correlation coefficient AQ=1.

     The point of these exercises is to emphasize the fact
that care must be exercised in selecting an empirical formula
to represent a correlation coefficient function.  Otherwise
one may have an expression that cannot represent a correlation
coefficient function.  This is particarly the case when the
field concerned is not homogeneous.
                            A-7

-------
                      APPENDIX B
        NOTE ON THE RANK OF A COVARIANCE MATRIX

     Let A be the covariance matrix and let X be a data
matrix in which the element x. .  is the departure from the
mean of the pollutant concentration at station i on day j .
Let there be n stations (i=l, --- , n)  and d reporting days
(j=l, --- , d) .  Then the covariance matrix may be written
in the form

     A = XX' /d,     X1 = transpose of X
so that the element a. .  is the mean sum of products
and is the covariance of pollutant concentration at stations
i and j.  The matrix A is nxn while X is nxd and X1 is dxn.

     It is a well known theorem that the rank of a matrix
cannot exceed the smaller of the number of rows and the
number of columns.  Thus, the rank of X (and of X1) is the
smaller of d and n.  Also, the rank of a matrix product
is no greater than that of either factor (Perils (1952) ,
Ex. 7, p. 58).  (The division by the scalar, d, to obtain A
does not affect the rank.)

     As a result, the rank of the covariance matrix A, which
is always nxn, cannot exceed the smaller of n and d.  Thus,
if there are 40 locations, but only one day of observations,
the rank of A would be 1.  If there are 27 days of observations,
the rank of A would not exceed 27.  If there are 59 days
of observations, the rank of A would not exceed 40.  If
there were 1241 days of observations, the rank of A would
not exceed 40.
                           B-l

-------
                                   TECHNICAL REPORT DATA
                            (Please read Instructions on the reverse before completing)
 1 REPORT NO.

   EPA-650/4-75-005
                                                          3. RECIPIENT'S ACCESSIOONO.
 4. TITLE AND SUBTITLE
   OBJECTIVE PROCEDURES FOR OPTIMUM LOCATION  OF  AIR
   POLLUTION OBSERVATION STATIONS
                                                          5. REPORT DATE
                                                            June 1975
             6. PERFORMING ORGANIZATION CODE
 7. AUTHOR(S)
                                                          8. PERFORMING ORGANIZATION REPORT NO.
   C.  Eugene Buell
9. PERFORMING ORGANIZATION NAME AND ADDRESS

   Kaman  Sciences Corporation
   P.  0.  Box 7463
   Colorado Springs, Colorado 80933
                                                           10. PROGRAM ELEMENT NO.
               1AA009
             11. CONTRACT/GRANT NO.

               EPA-68-02-0699
 12. SPONSORING AGENCY NAME AND ADDRESS
                                                           13. TYPE OF REPORT AND PERIOD COVERED
   Environmental  Sciences Research Laboratory
   Office  of Research and Development
   U.S.  Environmental Protection Agency
   Research  Triangle Park, North Carolina  27711
               Final
             14. SPONSORING AGENCY CODE

               EPA-ORD
15. SUPPLEMENTARY NOTES
16. ABSTRACT
        This document is concerned with developing  linear regression techniques  for
   interpolation of air pollutant concentrations  over an area and, using these
   techniques in a computer program for determining the optimum location of air
   pollution observing stations.  The general  interpolation problem is surveyed  and
   the  advantages of using linear regression  formulas as interpolation formulas  are
   discussed.  The case of observations containing  errors of observation or effects
   of limited range of influence is emphasized.   Since the use of linear regression
   methods  depends on knowledge of the two-point  correlation function for  pollutant
   concentration measures, the construction of correlation coefficients from
   synthetic data is taken up.  Attention  is  given  to the estimation of residual
   variances or the effects of limited range  of influence, using Factor Analysis.
   In extending these methods to a continuous  formulation in integral equation form,
   the  lack of accuracy in the integral equation  solution is more important than the
   statistical significance of the data unless the  residual variances are  removed.
   If this  is done, then the tests for accuracy and statistical significance  are
   reconciled.  If the user carefully handles  the residual variances in constructing
   program  input, difficulties encountered  in  code  development are avoidable.
                               KEY WORDS AND DOCUMENT ANALYSIS
                  DESCRIPTORS
b.IDENTIFIERS/OPEN ENDED TERMS  C.  COSATI Field/Group
  Air  pollution
  Linear  regression
  Interpolation
  Factor  analysis
  Optimization
                                 13B
                                 12A
18. DISTRIBUTION STATEMENT
         RELEASE TO PUBLIC
                                             19. SECURITY CLASS (This Report)
                                                    UNCLASSIFIED
                           21. NO. OF PAGES

                               215
                                             20. SECURITY CLASS (Thispage)
                                                    UNCLASSIFIED
                                                                        22. PRICE
EPA Form 2220-1 (9-73)

-------
                                                      INSTRUCTIONS

1.   REPORT NUMBER
     Insert the EPA report number as it appears on the cover of the publication.

2.   LEAVE BLANK

3.   RECIPIENTS ACCESSION NUMBER
     Reserved for use by each report recipient.

4.   TITLE AND SUBTITLE
     Title should indicate clearly and briefly the subject coverage of the report, and be displayed prominently.  Set subtitle, if used, in smaller
     type or otherwise subordinate it to main  title.  When a report is prepared in more than one volume, repeat the primary title, add volume
     number and include subtitle for the specific title.

5.   REPORT DATE
     Each report shall carry a date indicating at least month and year. Indicate the basis on which it was selected (e.g., date of issue, date of
     approval, date of preparation, etc.).

6.   PERFORMING ORGANIZATION CODE
     Leave blank.

7.   AUTHOR(S)
     Give name(s) in conventional order (John R. Doe, J. Robert Doe, etc.).  List author's affiliation if it differs from the performing organi-
     zation.

8.   PERFORMING ORGANIZATION REPORT NUMBER
     Insert if performing organization wishes to assign this number.

9.   PERFORMING ORGANIZATION NAME AND ADDRESS
     Give name, street, city, state, and ZIP code.  List no more than two levels of an organizational hirearchy.

10.  PROGRAM ELEMENT NUMBER
     Use the program element number under which the report was prepared. Subordinate numbers may be included in parentheses.

11.  CONTRACT/GRANT NUMBER
     Insert contract or grant  number under which report was prepared.

12.  SPONSORING AGENCY NAME AND ADDRESS
     Include ZIP code.

13.  TYPE OF REPORT AND PERIOD COVERED
     Indicate interim final, etc.,  and if applicable, dates covered.

14.  SPONSORING AGENCY CODE
     Leave blank.

15.  SUPPLEMENTARY NOTES
     Enter information not included elsewhere but useful, such as: Prepared in cooperation with, Translation of, Presented at conference of,
     To be published in, Supersedes, Supplements, etc.

16.  ABSTRACT
     Include a brief (200 words or less) factual summary of the most significant information contained in the report. If the report contains a
     significant bibliography or literature survey, mention it here.

17.  KEY WORDS AND DOCUMENT  ANALYSIS
     (a) DESCRIPTORS - Select from the Thesaurus of Engineering and Scientific Terms the proper authorized terms that identify the major
     concept of  the research and are sufficiently specific and precise to be used as index entries for cataloging.

     (b) IDENTIFIERS  AND OPEN-ENDED TERMS - Use identifiers for project names, code names, equipment designators, etc.  Use open-
     ended terms written in descriptor form for those subjects for which no descriptor exists.

     (c) COSATI FIELD GROUP - Field and group assignments are to be taken from the 1965 COSATI Subject Category List.  Since the ma-
     jority of documents are multidisciplinary in nature, the Primary Field/Group assignment(s) will be specific discipline, area of human
     endeavor, or type of physical object. The application(s) will be cross-referenced with secondary Field/Group assignments that will follow
     the primary posting(s).

18.  DISTRIBUTION STATEMENT
     Denote releasability to the  public or limitation for reasons other than security for example "Release Unlimited."  Cite any  availability to
     the public,  with address and price.

19. &20. SECURITY CLASSIFICATION
     DO NOT submit classified reports to the  National Technical Information service.

21.  NUMBER OF PAGES
     Insert the total number of pages, including this one and unnumbered pages, but exclude distribution list, if any.

22.  PRICE
     Insert the price set by the National Technical Information Service or the Government Printing Office, if known.
    EPA Form 2220-1 (9-73) (Reverse)

-------