Growth Effects of Major Land Use Projects: Volume I - Specification and Causal Analysis of Model


EPA-450/3-76-012-a
May 1976

       GROWTH EFFECTS OF MAJOR LAND USE PROJECTS:
  VOLUME I - SPECIFICATION AND CAUSAL ANALYSIS OF MODEL
  U.S. ENVIRONMENTAL PROTECTION AGENCY
      Office of Air and Waste Management
   Office of Air Quality Planning and Standards
   Research Triangle Park, North Carolina 27711

-------

                   by

  Frank Benesh, Peter Guldberg, and Ralph D'Agostino

        Walden Research Division of Abcor
              201 Vassar Street
         Cambridge, Massachusetts 02139

            Contract No. 68-02-2076


       EPA Project Officer: Thomas McCurdy


               Prepared for

     U.S. ENVIRONMENTAL PROTECTION AGENCY
        Office of Air and Waste Management
     Office of Air Quality Planning and Standards
     Research Triangle Park, North Carolina 27711

                May 1976

-------
This report is issued by the Environmental Protection Agency to report
technical data of interest to a limited number of readers.  Copies are
available free of charge to Federal employees, current contractors and
grantees, and nonprofit organizations - as supplies permit - from the
Air Pollution Technical Information Center, Environmental Protection
Agency,  Research Triangle Park,  North Carolina 27711; or, for a fee,
from the National Technical Information Service, 5285 Port Royal Road,
Springfield, Virginia 22161.
This report was furnished to the Environmental Protection Agency by
Walden Research Division of Abcor, Cambridge, Massachusetts 02139,
in fulfillment of Contract No.  68-02-2076.  The contents of this report
are reproduced herein as received from Walden Research Division of
Abcor.  The opinions, findings, and conclusions expressed are those
of the author and not necessarily those of the Environmental Protection
Agency.  Mention of company  or product names is not to be considered
as an endorsement by the Environmental Protection Agency.
              Publication No. EPA-450/3-76-012-a

-------
                           ACKNOWLEDGEMENTS

     Special appreciation goes to Mr. Thomas McCurdy, the U.S. Environmental
Protection Agency project officer for this study, whose extensive assistance
and advice were indispensable in the execution of this study.

     In addition, we wish to express our appreciation for the cooperation of
the more than one hundred individuals who were contacted and provided
information during the data collection phase of this project.

                               STAFFING

     Mr. Frank Benesh was the project manager of this study at Walden Research.
Dr. Ralph D'Agostino contributed to the overall study design and analysis.
Mr. Mahesh Shah assisted in data collection and Mr. Peter Guldberg conducted
much of the causal analysis.  The data collection was coordinated by
Mrs. Allison B. Goodsell and the manuscript was prepared under the direction
of Ms. Gail Kelleher.

     Metcalf and Eddy of Boston were subcontractors to Walden Research,
assisting in model specification, sample selection, and data collection.
Mr. Richard Ball, Mrs. Elizabeth Levin, Mr. Stephen Koop, Mrs. Nancy
Lundgren, and Mr. Gerald Takano contributed to Metcalf and Eddy's effort.

-------
                             TABLE OF CONTENTS


Chapter                            Title

   I     INTRODUCTION, APPROACH AND SUMMARY OF RESULTS

         A.   Introduction
         B.   Organization of the Report
         C.   Approach
              1.  Theory of Induced Development
              2.  Selection of General Approach
              3.  Statement of Fundamental Model

  II     SUMMARY OF STATISTICAL TECHNIQUES (PATH ANALYSIS)

         A.   Aim of Path Analysis
         B.   Path Diagrams
         C.   Approach (Structural Model)
         D.   Connection with Regression
         E.   Assumptions and Deviations
              1.  Linearity
              2.  Recursive
              3.  Causal Priorities
              4.  Uncorrelated Residuals of Endogenous Variables
              5.  Stability of the Estimates
              6.  Multicollinearity not Present
              7.  Scale
              8.  Suppressor Variable Problem
         F.   Other Assumptions
         G.   Theory Trimming

 III     INITIAL LAND USE MODEL SPECIFICATION

         A.   Objective
         B.   Definition of Major Project
         C.   Area of Influence
              1.  Should the area of influence vary with the size
                  or density of the major project?
              2.  What is the appropriate size of the study area
                  (area of influence) for each major project size?
         D.   Identification
         E.   Methodology
         F.   Model Description
              1.  Endogenous Variables
              2.  Exogenous Variables
              3.  Simultaneous Block
              4.  Recursive Block
         G.   Theoretical and Empirical Basis for the Model
              1.  Equation 1
              2.  Equation 2
              3.  Equation 3
              4.  Equation 4
              5.  Equation 5
              6.  Equation 6
              7.  Equation 7
              8.  Equation 8
              9.  Equation 9
             10.  Equation 10
             11.  Equation 11
             12.  Equation 12

  IV     SAMPLE SELECTION

         A.   Purpose and Initial Criteria
         B.   Methodology
         C.   Revised Criteria
         D.   Selection Process

   V     DATA COLLECTION

         A.   White Westinghouse and Western Electric, Columbus, Ohio
         B.   Montclaire - Starmont, Charlotte, North Carolina
              1.  Photograph Scale
              2.  Data Aggregation

  VI     CAUSAL ANALYSIS

         A.   General Approach
         B.   Data Transformations
         C.   Problems in Path Analysis
              1.  Multicollinearity
              2.  Suppressor Variables
              3.  Multiple Variable Definitions
              4.  Instrumental Variables
         D.   Path Diagrams
         E.   Theory Trimming
              1.  Approach
              2.  Discussion of Individual Equations
         F.   Comparison of Model Forms
         G.   Stability of Path Coefficients
              1.  Extreme Value Index
              2.  Coefficient Variation Index
         H.   Conclusions

 VII     REFERENCES

VIII     APPENDICES

         A.   Bibliography of Land Use Models and Land Use Development
         B.   Plots of Regression Coefficients for the Twenty
              Jackknifing Runs

-------
                            LIST OF TABLES

Table     Title
 3-1      INITIAL LIST OF ENDOGENOUS VARIABLES
 3-2      MODEL VARIABLES AND DEFINITIONS
 3-3      FIVE EQUATION SIMULTANEOUS BLOCK IN THE RESIDENTIAL MODEL
 3-4      FIVE EQUATION SIMULTANEOUS BLOCK IN THE INDUSTRIAL-OFFICE MODEL
 3-5      SEVEN EQUATION RECURSIVE BLOCK IN THE RESIDENTIAL MODEL
 3-6      SEVEN EQUATION RECURSIVE BLOCK IN THE INDUSTRIAL-OFFICE MODEL
 3-7      ORIGINAL SPECIFICATION OF THE RESIDENTIAL MODEL
 3-8      ORIGINAL SPECIFICATION OF THE INDUSTRIAL-OFFICE MODEL
 5-1      LIST OF DATA COLLECTION ITEMS
 5-2      WESTERN ELECTRIC/WHITE WESTINGHOUSE
 5-3      COMPARISON OF BUILDING AREA DATA (10^3 SQ. FT.)
 5-4      CUMULATIVE PERCENTAGE OF CASE STUDY DATA COLLECTION
          OBTAINED FROM PHOTOGRAPHS OF CERTAIN SCALE OR LARGER
 6-1      PATH ANALYSIS MODEL DATA TRANSFORMATIONS
 6-2      RESIDENTIAL MODEL SIMPLE CORRELATIONS SIGNIFICANT AT p < .10
 6-3      INDUSTRIAL-OFFICE MODEL SIMPLE CORRELATIONS SIGNIFICANT
          AT p < .10
 6-4      ANALYSIS OF SIMPLE CORRELATIONS BETWEEN VARIABLES IN THE
          RESIDENTIAL PATH MODEL
 6-5      ANALYSIS OF SIMPLE CORRELATIONS BETWEEN VARIABLES IN THE
          INDUSTRIAL-OFFICE MODEL
 6-6      INSTRUMENTAL VARIABLES USED IN THE FIRST PATH ANALYSIS
 6-7      REDEFINITION OF PATH ANALYSIS MODEL DATA TRANSFORMATIONS
 6-8      GROSS PRIVATE DOMESTIC INVESTMENT IN THE UNITED STATES,
          1955-1973, IN BILLIONS OF 1958 DOLLARS
 6-9      COMPARISON OF PATH REGRESSION COEFFICIENTS FROM THE FINAL
          PATH ANALYSIS AND THE JACKKNIFING STABILITY ANALYSIS FOR THE
          SIMULTANEOUS BLOCK OF EQUATIONS
 6-10     EXTREME VALUE INDEX OF STABILITY FOR ALL PATH REGRESSION
          COEFFICIENTS IN THE SIMULTANEOUS BLOCKS OF EQUATIONS
 6-11     STABILITY INDEX FOR ALL PATH REGRESSION COEFFICIENTS IN
          THE SIMULTANEOUS BLOCKS OF EQUATIONS BASED ON THE EFFECT
          OF VARIATION IN THE COEFFICIENTS ON THE DEPENDENT VARIABLE

-------
                             LIST OF FIGURES

Figure    Title
 2-1      ILLUSTRATIVE PATH DIAGRAMS
 2-2      LEAD - LAG RELATIONSHIPS
 2-3      ILLUSTRATION OF SUPPRESSOR VARIABLE
 3-1      ILLUSTRATION OF EFFECT OF DIFFERING MAJOR PROJECT SIZES
 3-2      RELATIONSHIPS OF LAND USE TO DISTANCE FROM THE AIRPORT
 3-3      ZONES USED FOR LAND USE/DISTANCE ANALYSIS
 3-4      FIVE EQUATION SIMULTANEOUS BLOCK IN THE RESIDENTIAL MODEL
 3-5      FIVE EQUATION SIMULTANEOUS BLOCK IN THE INDUSTRIAL-OFFICE MODEL
 3-6      SEVEN EQUATION RECURSIVE BLOCK IN THE RESIDENTIAL MODEL
 3-7      SEVEN EQUATION RECURSIVE BLOCK IN THE INDUSTRIAL-OFFICE MODEL
 3-8      ORIGINAL SPECIFICATION OF THE RESIDENTIAL MODEL
 3-9      ORIGINAL SPECIFICATION OF THE INDUSTRIAL-OFFICE MODEL
 5-1      WESTERN ELECTRIC CASE STUDY AREA AND MORPC TRANSPORTATION ZONES
 5-2      WHITE-WESTINGHOUSE CASE STUDY AREA AND MORPC TRANSPORTATION
          ZONES
 6-1      ORIGINAL RESIDENTIAL PATH DIAGRAM
 6-2      ORIGINAL INDUSTRIAL-OFFICE PATH DIAGRAM
 6-3      INITIAL RESIDENTIAL PATH DIAGRAM
 6-4      INITIAL INDUSTRIAL-OFFICE PATH DIAGRAM
 6-5      INITIAL PATH ANALYSIS FOR THE RESIDENTIAL MODEL
 6-6      INITIAL PATH ANALYSIS FOR THE INDUSTRIAL-OFFICE MODEL
 6-7      FINAL PATH ANALYSIS FOR THE RESIDENTIAL MODEL
 6-8      FINAL PATH ANALYSIS FOR THE INDUSTRIAL-OFFICE MODEL
 6-9      PATH ANALYSIS FOR THE RESIDENTIAL MODEL IN MULTIPLICATIVE FORM
 6-10     PATH ANALYSIS FOR THE INDUSTRIAL-OFFICE MODEL IN
          MULTIPLICATIVE FORM
 6-11     HYPOTHETICAL LINEAR REGRESSION
 B-1      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE RES EQUATION IN THE RESIDENTIAL MODEL
 B-2      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE COMM EQUATION IN THE RESIDENTIAL MODEL
 B-3      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE OFFICE EQUATION IN THE RESIDENTIAL MODEL
 B-4      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE MANF EQUATION IN THE RESIDENTIAL MODEL
 B-5      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE HWLMNX EQUATION IN THE RESIDENTIAL MODEL
 B-6      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE RES EQUATION IN THE INDUSTRIAL/OFFICE MODEL
 B-7      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE COMM EQUATION IN THE INDUSTRIAL/OFFICE MODEL
 B-8      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE OFFICE EQUATION IN THE INDUSTRIAL/OFFICE MODEL
 B-9      PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE MANF EQUATION IN THE INDUSTRIAL/OFFICE MODEL
 B-10     PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF
          THE HWLMNX EQUATION IN THE INDUSTRIAL/OFFICE MODEL

-------
                           LIST OF EXHIBITS

Exhibit     Title
 3-1        INDUCED RESIDENTIAL EXAMPLE
 3-2        INDUCED COMMERCIAL DEVELOPMENT
 4-1        EPA LETTER TO REGIONAL PLANNING AGENCIES
 5-1A       ON SITE VISIT DATA COLLECTION FORMS
 5-1B       CENSUS OF POPULATION AND HOUSING DATA COLLECTION FORMS

-------
I.   INTRODUCTION, APPROACH AND SUMMARY OF RESULTS

     A.  INTRODUCTION

         This report documents the results of the first three phases of a
study of the growth effects of major land use projects (GEMLUP) undertaken
by Walden Research.  The principal objectives of the GEMLUP study are to
formulate a methodology to predict air pollutant emissions from:

         •  two types of major land use developments:  large concentrations
            of employment such as office or industrial parks, and large
            residential developments
         •  secondary land development that is induced by the two types of
            major land use development projects
         •  motor vehicular traffic associated with both the major project
            and secondary development.

         GEMLUP relates to a number of EPA programs, including air quality
maintenance plan (AQMP) development [1], environmental impact statement (EIS)
review [2], the indefinitely suspended portions of indirect source review
[3], and the prevention of significant air quality deterioration, or
non-degradation [4].  Explicit or implicit in these programs is an evaluation
of air quality impacts of land use plans or project developments.  GEMLUP
is designed to formulate and test a method of evaluating land use impacts
at the project scale, and, in the process, develop a set of land  use based
emission factors potentially useful at the regional scale.

         The study was divided into six phases:

            Phase 1 - specification of a preliminary model and generation
                      of a list of data requirements
            Phase 2 - data collection
            Phase 3 - causal analysis of the land use model using path
                      analysis
            Phase 4 - development of predictive equations for the land
                      use model and development of a traffic model
            Phase 5 - development of indices of fuel consumption
            Phase 6 - translation of fuel consumption indices into land use
                      based emission factors.

         The ability to accurately predict the secondary development of a
major land use development alone is insufficient; it is also necessary to
understand the complex interrelationships inherent in a predictive model.
Therefore, after the formulation of a theory of secondary development, two
distinct phases of analysis followed:  (Phase 3) the testing of the theory
using path analysis, and (Phase 4) the calibration of the theoretical
relationships as predictive equations.  This report concerns itself with the
former analysis.  The latter analysis (Phase 4) will be included in Volume
III of the final report.

         In addition, a separate technical report (Volume II of the final
report) will be prepared documenting the land use based emission factors
(Phases 5 and 6).  Volume III of the final report will document the
application of the land use model, traffic model, and land use based emission
factors in predicting air pollutant emissions from major development projects
and their secondary effects.

         Two additional appendices to this report (Volume I) have been
prepared separately [42].  Appendix C documents the land use data and simple
(zero order) correlations among those data.  Appendix D presents the
statistical output of the final path analytic model.

     B.  ORGANIZATION OF THE REPORT

         The remainder of this introductory chapter discusses the general
approach of the study and briefly summarizes the findings.   Following this
chapter, the statistical  techniques employed in this study are reviewed in
Chapter II.   Chapter III contains the specification of the model, i.e., the
translation of the general theory outlined above into explicit relationships.
The next two chapters, IV and V, summarize, respectively, the sample
selection and data collection processes.  Chapter VI contains the results
of the path analysis, i.e.,  the testing of the causal relationships explicit
in the model  and discusses  the results of the causal analysis.

-------
     C.  APPROACH

         1.  Theory of Induced Development

             Taking the industrial/office  major land use project type as
the more general case of the two types investigated, we adopted the fol-
lowing theory of induced development.*

             Constructing a large source of employment like an industrial/
office complex generates jobs which result in the nearby construction of
dwelling units; these induce retail development to locate near them and
generate demand for community, cultural, and religious facilities (schools,
recreation areas, libraries, churches, theaters, fire and police stations,
etc.).  All of this requires the construction of streets and highways that
then improve accessibility to the area.  Better access fosters continued
urban development, particularly highway-oriented commercial and office land
uses.  Additional sources of employment come into the area as secondary
(and tertiary) industry or services locate near the original major project,
spurring on another round of residential development, and so forth.

         2.  Selection of General Approach

             The selection of an approach for testing this theory was tem-
pered by programmatic considerations.  The limitations of time and resources
precluded highly detailed deterministic approaches patterned after Lowry
[5] or others.**  They were also inappropriate because of their concern with
highly aggregated regional development and/or long planning horizons.  Be-
cause of this and the problems associated with deterministically modeling
a social  system, it was decided to use a stochastic approach.
*  The theory is not entirely new.  Most of the urban development models
   referenced on the same page are based upon the same general  relationships
   posited in our theory.  They are usually less explicit than  ours, however.
** Forrester [6], Hill [7], Seidman [8], and Center for Real  Estate and
   Urban Economics [9].  In addition to these general  or comprehensive models
   there are many single sector models that have been developed since the
   early 1960's.  See references [10] and [11] for a review of urban
   development models.

-------
             Resource constraints also made a dynamic modeling approach
infeasible because of the effort involved in obtaining longitudinal data
to incorporate time into the system.  Similarly, a difference approach,
modeling the change in development patterns over a time period, was
infeasible.  Consequently, a static approach using the path analytic
technique based on regression analysis seemed appropriate to test the theory
of induced development.

             The static approach to testing the inherently change-oriented
theory is justified by three factors:  (1) our theoretical  assumption that
induced development follows a single basic causal structure for all cases
or observations, (2) the use of a cross-sectional method of obtaining data
for variables observed in a static state and the assumption that input
variables were initialized at time zero and held constant long enough so
that all the causal consequences in the system were realized, and (3) the
use of certain time-lagged exogenous variables in the system.  The conceptual
usefulness of these factors in testing causal  theory is well  described in
Heise [12] and Blalock [13].

             There was not enough existing information on project-level in-
duced development to be able to define the form of the relationships in the
system.  The form, therefore, was assumed to be linear.*  This is not a
bad a priori approximation in most social science applications, and it
allows the use of well developed statistical techniques.  There is accumu-
lating evidence that many social systems can be approximated by a linear
function as long as operating conditions remain fairly stable [12].  Even
complex nonlinear relationships can often be approximated by a constant
relation in discrete subregions of the relationship [14].  Also, if a
relationship is thought to be nonlinear on theoretical grounds (e.g.,
multiplicative, exponential), the data can be transformed prior to entering
them in the linear analysis [15].
* As part of the testing of the model, a multiplicative relationship was
  tested with a log transformation.  See Chapter VI.
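
The transformation idea can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the data are simulated, and the constant 3, the exponents 0.7 and -0.4, and the function name fit_two_predictors are all hypothetical. A multiplicative relationship becomes additive after taking logarithms, so ordinary least squares on the logged data should recover the exponents approximately.

```python
import math
import random

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b1*x1 + b2*x2; returns (a, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]          # centered predictors and response
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 * s12        # determinant of the 2x2 normal equations
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

# Hypothetical multiplicative relationship: y = 3 * x1**0.7 * x2**-0.4 * error
rng = random.Random(1976)
x1 = [rng.uniform(1.0, 10.0) for _ in range(2000)]
x2 = [rng.uniform(1.0, 10.0) for _ in range(2000)]
y = [3.0 * a**0.7 * b**-0.4 * math.exp(rng.gauss(0.0, 0.1))
     for a, b in zip(x1, x2)]

# Taking logarithms turns the product into a sum that least squares can fit.
log_a, b1, b2 = fit_two_predictors(
    [math.log(v) for v in x1],
    [math.log(v) for v in x2],
    [math.log(v) for v in y],
)
print("constant:", round(math.exp(log_a), 2))
print("exponents:", round(b1, 2), round(b2, 2))
```

The fitted slopes estimate the exponents of the original multiplicative form, and the exponential of the fitted intercept estimates its constant.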

-------
             The selection of an appropriate time period was resolved  in
a somewhat arbitrary manner.  Because of the disparate EPA programs  GEMLUP
focused on and because there was little guidance in the literature,  the
period of analysis was defined to be ten years.

             In summary, the approach consisted of a cross-sectional model
that predicts the total land use in the vicinity of a major land use project
ten years after its development.

         3.  Statement of Fundamental Model

             The basic theory of induced development may be restated as
follows:  the amount of induced land use is some function of the size of the
major development project and certain other variables, viz.,

             induced land use = f (major project, other variables)

As indicated previously, the approach used in this study limits  one to the
use of endogenous variables that measure the total land use at the end of the
ten-year time period.  Conceptually, one can disaggregate the total land
use in the area of influence into three components:  land use existing prior
to the development of the major project, new land use induced by the  major
project and certain other variables, and new land use not induced by  the
major project (that is, attributable to some other phenomena such as  general
regional growth).  This may be expressed as:

             total land use = prior land use +
                              project induced land use change +
                              non project induced land use  change

             In predicting the total land use in  an area of influence, one
can identify two types of exogenous variables:

             Type I - those used for predicting the induced land use
                      component, such as,

             •  the size of the major project

             •  the induced component of the endogenous variables of other
                land uses

             •  other independent variables influencing the effect of the
                major project (e.g., housing vacancy at t+0)


             Type II - those used for predicting the prior or non-induced
                       land use component, such as,

             •  the prior and non-induced component of the endogenous land
                use variables

             •  those used for predicting the prior land use component (e.g.,
                1960 housing density)

             •  those used for predicting the non-induced component (e.g.,
                regional population growth)


             Accordingly, the fundamental model  that we have assumed is:


             $$
             \text{land use}_{t+10} = \text{prior}_{t+0}
                 + \text{induced}_{t+0 \to t+10}
                 + \text{non-induced}_{t+0 \to t+10}
                 = f(\text{Type I variables}, \text{Type II variables})
             $$


             We note that both the distinctions between the three land use
components and the two variable types are unavailable to this study.  The
three land use components are not measurable; also, several of our
independent variables are possibly of both types.

-------
II.  SUMMARY OF STATISTICAL TECHNIQUES (PATH ANALYSIS)

     In this chapter we review briefly the concepts, assumptions, and prob-
lems associated with the application of path analysis.   With regard to the
characteristic problems involved in path analysis, we shall  focus on those
aspects which are most relevant to the project under consideration.

     A.  AIM OF PATH ANALYSIS

         Path analysis was developed by Sewall  Wright [16,17,18,19] as a
technique for studying and dealing with observed interrelated variables for
which it can be assumed that there are several "ultimate" or exogenous
variables that completely determine or cause them [20].  It is a method for
studying the direct and indirect effects of variables taken as causes of
other variables considered as effects [21].  It is not capable of deducing
causal relations from available quantitative information (viz., correlation
coefficients), but rather it is intended to combine this quantitative
information with qualitative interpretation [16].  It is a technique useful
in testing theories rather than in generating them, and it can be used to
study the
logical consequences of various hypotheses involving causal  relations.  In
order to implement the technique, the researcher must make explicit a theo-
retical framework or model.

     B.  PATH DIAGRAMS

         Figure 2-1 presents various path diagrams which display graphically
typical patterns of causal relations among sets of variables.  Figure 2-1a
shows the relation between seven variables, X1, X2, X3, X4, U, V, and W,
where X1, X2, X3, and X4 are measurable variables of the system.
The variables U, V, and W are unmeasured and outside the system, but do
affect variables in the system.  These latter variables are often referred
to as disturbances or residuals.  Usually they are not represented in a
path diagram, and Figure 2-1a and Figure 2-1b can be considered the same
path diagram.  The residuals are "understood" to exist.  The arrows in all
the figures indicate the causal flow.  Variable X1 in Figures 2-1a and 2-1b
is called an exogenous variable because its variability is assumed to be
determined by causes outside the system.  Variables X2, X3, and X4 of
Figures 2-1a and 2-1b are called endogenous.  Their variability is explained
by exogenous or endogenous variables in the system.  Variables U, V, and W
of Figure 2-1a can be considered exogenous.

[FIGURE 2-1    ILLUSTRATIVE PATH DIAGRAMS, panels (a) through (d)]

         Sometimes there is more than one exogenous variable in the system
(see Figure 2-1c and variables X1 and X2).  In this case, the correlation
between the two is represented by a curved line.  The relation between the
exogenous variables is usually not analyzed in path analysis.

         The path diagram shown in Figure 2-1d represents an interaction
between two endogenous variables X2 and X3.  Analysis of this type of system
presents some serious technical problems which we will discuss later on.*
This interaction is also termed a reciprocal causation or a feedback loop.

     C.  APPROACH (STRUCTURAL MODEL)

         In order to analyze the system of variables under investigation
the path diagram must be formalized mathematically.  A structural model
must be developed which defines a causal structure specifying the network
of causal paths that exist between variables and identifying the parameters
of causation.  For Figure 2-la, the mathematical model would be:
         $$
         \begin{aligned}
         X_2 &= b_{21}X_1 + U \\
         X_3 &= b_{31}X_1 + b_{32}X_2 + V \\
         X_4 &= b_{41}X_1 + b_{42}X_2 + b_{43}X_3 + W
         \end{aligned}
         \tag{2-1}
         $$
These equations are called structural equations.  If the variables are stand-
ardized (i.e., have mean zero and standard deviation unity), then the b's
are called path coefficients.  If they are not standardized, the b's are
called path regression coefficients or structural coefficients (there would
then also be an intercept in each equation of (2-1)).  The b's are the
parameters quantifying the causal effects.
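
As an illustration only, the following Python sketch estimates path coefficients for a four-variable recursive system of the kind in (2-1). It rests on assumptions that are not from this study: the data are simulated, every numeric coefficient is a hypothetical value, X3 is taken to depend on X1 and X2, and the helper names (solve, standardize, path_coefficients) are invented for the example. Each endogenous variable is standardized and regressed on its standardized causes by least squares.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def standardize(values):
    """Rescale to mean zero and standard deviation one."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]

def path_coefficients(causes, effect):
    """Least-squares regression of a standardized effect on standardized causes."""
    zs = [standardize(c) for c in causes]
    zy = standardize(effect)
    xtx = [[sum(p * q for p, q in zip(a, b)) for b in zs] for a in zs]
    xty = [sum(p * q for p, q in zip(a, zy)) for a in zs]
    return solve(xtx, xty)

# Simulated recursive system; all numeric coefficients are hypothetical.
rng = random.Random(0)
n = 5000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.6 * a + rng.gauss(0, 1) for a in x1]
x3 = [0.3 * a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
x4 = [0.2 * a + 0.4 * b + 0.5 * c + rng.gauss(0, 1)
      for a, b, c in zip(x1, x2, x3)]

(p21,) = path_coefficients([x1], x2)
p31, p32 = path_coefficients([x1, x2], x3)
p41, p42, p43 = path_coefficients([x1, x2, x3], x4)
print("p21:", round(p21, 3))
print("p31, p32:", round(p31, 3), round(p32, 3))
print("p41, p42, p43:", round(p41, 3), round(p42, 3), round(p43, 3))
```

Because the variables are standardized, no intercept is needed, and p21 is simply the correlation between X1 and X2. Running the same regressions on unstandardized data (with an intercept) would give path regression coefficients instead.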
* See discussion of identification in Section III.D.

-------
          Given a structural model, one has an explicit and quantitative
statement of theory and so it can be used to explain why variables vary
together as they do.  It is possible with such a model  to calculate how a
change in any one variable in the system will  affect the values of the
other variables.  Also, it is possible to analyze how changing the struc-
ture of the system would affect the character of the system [22].
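
One way to make this calculation concrete is sketched below in Python. The coefficient values and the names PATHS and total_effect are hypothetical and serve only to illustrate the arithmetic for a recursive system. The total effect of one variable on another is the sum, over every directed path between them, of the products of the coefficients along the path; the indirect effect is the total effect minus the direct coefficient.

```python
# Hypothetical direct effects: PATHS[effect][cause] is the coefficient b(effect, cause).
PATHS = {
    "X2": {"X1": 0.6},
    "X3": {"X1": 0.3, "X2": 0.5},
    "X4": {"X1": 0.2, "X2": 0.4, "X3": 0.5},
}

def total_effect(cause, effect):
    """Sum over all directed paths from cause to effect of the coefficient products."""
    if cause == effect:
        return 1.0
    # Step back from the effect to each of its immediate causes; the recursion
    # ends because a recursive (one-directional) system has no feedback loops.
    return sum(b * total_effect(cause, mid)
               for mid, b in PATHS.get(effect, {}).items())

direct = PATHS["X4"]["X1"]
total = total_effect("X1", "X4")
print("direct effect of X1 on X4:  ", direct)
print("total effect of X1 on X4:   ", round(total, 4))
print("indirect effect of X1 on X4:", round(total - direct, 4))
```

With these illustrative values, a unit change in X1 reaches X4 directly and through X2 and X3, and removing a path from PATHS shows how a change in the structure of the system alters the result.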

     D.  CONNECTION WITH REGRESSION

         It should be emphasized that the three equations of system (2-1)
each separately represent a multiple regression equation.  The estimates
of the b's are obtained using standard least squares (i.e., the standard
regression technique).  Up to this point, there is no difference between
path analysis and standard multiple regression:  the b's are beta weights
if the X's are standardized; otherwise they are partial regression
coefficients.
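
The relation between the two kinds of coefficient can be checked numerically. The Python sketch below uses simulated, hypothetical data (the variable scales and coefficient values are assumptions, as are the helper names) and fits the same two-predictor regression twice: once on the raw data and once on standardized data. Each beta weight should equal the corresponding partial regression coefficient multiplied by the ratio of the standard deviation of the predictor to that of the dependent variable.

```python
import random

def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sd(values):
    return (sum(v * v for v in center(values)) / len(values)) ** 0.5

def standardize(values):
    s = sd(values)
    return [v / s for v in center(values)]

def slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept absorbed by centering)."""
    c1, c2, cy = center(x1), center(x2), center(y)
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical predictors measured on very different scales.
rng = random.Random(42)
x1 = [rng.uniform(0, 500) for _ in range(200)]
x2 = [rng.uniform(10_000, 60_000) for _ in range(200)]
y = [40 + 0.8 * a + 0.002 * b + rng.gauss(0, 25) for a, b in zip(x1, x2)]

b1, b2 = slopes(x1, x2, y)                       # partial regression coefficients
beta1, beta2 = slopes(standardize(x1), standardize(x2), standardize(y))  # beta weights

print("partial regression coefficients:", round(b1, 4), round(b2, 6))
print("beta weights:                   ", round(beta1, 4), round(beta2, 4))
print("b1 * sd(x1) / sd(y):            ", round(b1 * sd(x1) / sd(y), 4))
```

The raw coefficients depend on the units of measurement, while the beta weights are unit-free and can be compared with one another within an equation.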

     E.  ASSUMPTIONS AND DEVIATIONS

         The transition from the path diagram of Figure 2-1a to the system
of structural equations (2-1) and the appropriateness of standard  least
squares (i.e., multiple regression)  for estimating the b's  of (2-1) involve
a number of very important assumptions.  We now consider these.

         1.    Linearity

              The change in one variable is assumed to always occur as a
linear function of changes in other variables.   If it is believed  or known
that the relation is nonlinear, for example, say multiplicative:

              $$ X_3 = e^{(b_{31}X_1 + b_{32}X_2 + V)} \tag{2-2} $$

then the data must be transformed before the analysis.   If  there is non-
linear "causality" the usual  path analysis will not detect  it.
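
A small constructed example, sketched here in Python, shows how a linear analysis can miss a nonlinear dependence entirely. The data are hypothetical: y is set equal to the square of x, so y is completely determined by x, yet the simple correlation between them is zero because x is symmetric about zero. After x is transformed, the relation becomes linear and is detected.

```python
def correlation(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [float(v) for v in range(-5, 6)]
y = [v ** 2 for v in x]        # y is fully determined by x, but not linearly

r_raw = correlation(x, y)                            # linear association of x with y
r_transformed = correlation([v ** 2 for v in x], y)  # association after transforming x

print("correlation of x with y:   ", r_raw)
print("correlation of x**2 with y:", r_transformed)
```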

-------
         2.  Recursive

             In order for the above discussion to be correct, the system
must not contain reciprocal causation or feedback loops (e.g., as in Figure
2-1d with the variables X2 and X3).  The causal flow must be unidirectional.
At a given point in time, a variable cannot be both a cause and an effect
of another variable.  Such a system is called recursive.

             It is possible to solve a system of structural equations if
there are feedback loops by use of instrumental variables.  Instrumental
variables are exogenous variables introduced into the system for this specific
purpose.  They are used routinely in economic analysis [23,24].  Basically
what they do is to put enough constraints on the structural system so that
by use of estimation techniques (either two-stage least squares or limited
information least-generalized residual-variance solution) [23] the path
coefficients can be estimated.  A point often not emphasized in the "tradi-
tional" sources of path analysis is that little is known about the resulting
estimates of the path coefficients.  The most one can say is that they are
consistent estimates of the true b's.  This means that they converge in
probability to the true b's or, in other words, as the sample size increases
they get "closer" to the true b's.  Basically, nothing is known about the
estimates for small samples (such as 20 in our present problem).  The
"nice" properties of the multiple regression estimates (i.e., minimum
variance, unbiased estimates) do not carry over to the new ones.
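
             The two-stage idea described above can be illustrated with a
small simulation.  The sketch below is not the estimation procedure used in
this project.  It assumes a hypothetical feedback system in which X2 and X3
cause each other and a single instrumental variable z enters only the X2
equation; the coefficient values, the sample size and the function names
are all illustrative assumptions.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x for zero-mean data (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate(n, b_true=0.4, feedback=0.5, seed=0):
    """Draw n observations from an assumed system with a feedback loop:

        X2 = feedback * X3 + z + u
        X3 = b_true   * X2 + v

    z is the instrument: it affects X2 directly but X3 only through X2.
    """
    rng = random.Random(seed)
    z, x2, x3 = [], [], []
    for _ in range(n):
        zi, u, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced form obtained by substituting the X3 equation into the X2 equation.
        x2i = (zi + u + feedback * v) / (1 - feedback * b_true)
        x3i = b_true * x2i + v
        z.append(zi)
        x2.append(x2i)
        x3.append(x3i)
    return z, x2, x3


z, x2, x3 = simulate(5000)

# Ordinary least squares: biased here because X2 is correlated with v.
b_ols = slope(x2, x3)

# Stage 1: regress the endogenous X2 on the instrument and keep the fitted values.
first_stage = slope(z, x2)
x2_hat = [first_stage * zi for zi in z]

# Stage 2: regress X3 on the fitted values from stage 1.
b_2sls = slope(x2_hat, x3)

print("assumed true b:", 0.4)
print("ordinary least squares:", round(b_ols, 3))
print("two-stage least squares:", round(b_2sls, 3))
```

With a large simulated sample the ordinary least squares slope is pulled
away from the assumed value by the feedback, while the two-stage estimate
lies close to it.  This closeness is a large-sample (consistency) property
only; with 20 observations it cannot be counted on, which is the caution
raised in the text.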

             In the present project we had to make use of the above
techniques (i.e., instrumental variables and two-stage least squares) to
solve for the b's.  To judge the usefulness and appropriateness of the
estimated b's we used four approaches simultaneously.  The first three
involved the approximate t tests (to judge whether the b's are statistically
significantly different from zero), the beta weights (to judge whether, in
the standardized scale, the magnitude of the effect is "large") and the
signs of the b's (to verify that the signs are what theory would predict).
The fourth approach involved the use of "Jackknifing" to verify that the
b's are stable (i.e., will not change radically with the inclusion or
exclusion of just one observation).*  The actual implementation of the four
approaches is discussed in detail in Chapter VI on analysis.

FIGURE 2-2     LEAD - LAG RELATIONSHIPS: (a) AND (b)

             Another way of solving the system if feedback loops exist is
to use lead-lag relations.  This might be possible with longitudinal data
[19].  Consider the path diagrams of Figure 2-2.  In Figure 2-2a, we have
an interaction between variables X2 and X3.  However, say X2 and X3
represent the final states after a time period (10 years, for example).
Further, say it is possible to break the 10 years into two time periods
(time periods 1 and 2) (see Figure 2-2b).  In time period 1, we may have
variable X2 (called X21) as a cause of X3 (called X31) but not X31 as a
cause of X21.  Then in time period 2 we have X3 as a cause of X2 (i.e.,
X31 -> X22).  The system shown in Figure 2-2b is recursive and, given
sufficient data, it is solvable.  The dynamics of the system over the full
time period are needed.  Unfortunately, for the present study sufficient
data were not available to implement this lead-lag device.

         3.  Causal Priorities

             The causal priorities  among the variables must be  "correctly"
established outside of the analysis.  A major drawback of path  analysis
is that it cannot establish causal  relationships using data alone.   One
needs as inputs both data and theory.  The ordering of the variables (that
is, the direction of the arrows in  the path diagrams) must be established
by the investigator.  "There is no  error-check mechanism in path analysis
to reject an incorrect ordering" [22].  It is common that the data can fit
well two very different orderings.   The selection of the appropriate one
depends on correct theory.  When one is dealing with variables  ordered in
time it is often possible to know the causal priorities.  Path analysis
seems well suited for this type of problem.  For the present study, the
ordering in time was usually not available, so a substantial effort
involving substantive considerations was needed to aid in establishing the
causal priorities.  Also, by use of the instrumental variables and two
stage least squares for estimating the b's in feedback loops, the causal
priorities of some questionable situations were established (e.g., where
it is not clear that a causal priority exists, an estimated b of zero for
a direction, say X1 to X2, indicates that there is no causal priority of
X1 to X2).

* The concept of "Jackknifing" is discussed further in Part 5 of this
  Chapter.

         4.  Uncorrelated Residuals of Endogenous Variables

             In the actual analysis the observed correlations among the
observed variables are used to estimate or identify the path coefficients.
If the residuals (i.e., the disturbances) are uncorrelated, then the usual
least squares technique (i.e., multiple regression) can be used for the
estimation.  If, however, the residuals are correlated, then the path
coefficients cannot be identified by the ordinary least squares technique,
for there are then too many unknowns.  An implication of this assumption of
uncorrelated residuals is that all inputs must be entered explicitly into
the analysis.  (It is possible to ignore a variable that affects two or
more inputs as long as it does not affect the residuals of any endogenous
variables.)  This restriction of uncorrelated residuals is very important;
meaningless and confusing results may come out of the analysis if it is
not met.

             Because of its importance to the project under consideration,
we shall develop a bit further here the consequences of not including all
inputs explicitly in the model.  Say we have three variables X1, X2, and
X3 which are causal factors of a dependent variable Y.  The appropriate
equation is:

             Y = b_{y1}X_1 + b_{y2}X_2 + b_{y3}X_3 + e               (2-3)

Here, e is the residual.  Now say we do not include X3 in our system and
we "believe" the correct equation is:

             Y = b_{y1}X_1 + b_{y2}X_2 + e'                          (2-4)

e' in (2-4) is a new residual.  It is actually e + b_{y3}X_3.  It can be
shown [23] that the estimate we will obtain for b_{y1} will actually be an
estimate not of b_{y1} alone, but rather of

             b_{y1} + b_{y3}(B_{31.2})                               (2-5)

where b_{y3} is the real path coefficient given in (2-3) and B_{31.2} is
the regression coefficient of X3 on X1 after allowing for the effects of
X2.  The important point here is that if we leave out an input it will
reappear in our path coefficients.  We will have no way of knowing how much
of our estimate is the real b_{y1} and how much is b_{y3}(B_{31.2}).  It is
possible that b_{y1} is positive yet the value of (2-5) is negative (a
complete change in sign).  This point is not made very clear in much of the
literature on path analysis.
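
             The relation in (2-5) can be checked numerically.  The sketch
below uses made-up data for 20 observations (the coefficient values and the
helper names solve and ols are assumptions for illustration, not values
from this study) and compares the coefficient of X1 from the regression
that omits X3 with the quantity b_{y1} + b_{y3}(B_{31.2}) computed from the
full regression and from the auxiliary regression of X3 on X1 and X2.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (no intercept)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: X3 depends on X1 and X2, and Y depends on all three inputs.
rng = random.Random(0)
n = 20
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [0.7 * a - 0.3 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]
y = [1.0 * a + 0.5 * b - 2.0 * c + rng.gauss(0, 0.2) for a, b, c in zip(x1, x2, x3)]

full = ols([x1, x2, x3], y)     # estimates of b_y1, b_y2, b_y3 as in (2-3)
short = ols([x1, x2], y)        # X3 left out, as in (2-4)
aux = ols([x1, x2], x3)         # aux[0] plays the role of B_31.2

predicted = full[0] + full[2] * aux[0]   # the quantity in (2-5)
print("coefficient of X1 with X3 omitted:", short[0])
print("b_y1 + b_y3 * B_31.2            :", predicted)
```

The two printed values agree up to rounding error, because for least
squares estimates the relation holds exactly within any one sample.  In
this made-up example the assumed b_{y1} is positive while the assumed
b_{y3}(B_{31.2}) is large and negative, the kind of situation in which the
sign reversal described above can occur.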

             Another implication of this fourth assumption is that all resi-
duals should be uncorrelated with all other exogenous variables that precede
them in the system.  If one suspects that the residuals are correlated,
then the technique of two stage least squares can be used to remove this
problem.  However, in doing so, it introduces the problem of having esti-
mates whose statistical  properties are not well known (see part 2 above).
In the present project, for those equations not requiring two stage least
squares estimation techniques, we assumed that the models contain a suffi-
cient number of relevant variables to minimize this problem of correlated
residuals.

         5.  Stability of the Estimates

             In order for path analysis to be of use the resulting
estimates should be stable.  They will be of little use if they change
radically depending on the sample observations.  In most applications of
path analysis this is achieved by using large samples and reliable
measuring instruments.  With regard to this point, Tukey [26] and Turner
and Stevens [27] suggested using path regression coefficients rather than
the usual path coefficients (i.e., beta weights).  Path regression
coefficients tend to be more stable as one applies the same structural
model to different populations.

             For the problem of this project we have only 20 observations
to do all the required estimation.  This is a very small sample, and there
will be a real problem of obtaining stable estimates.  One way of judging
the stability of the estimates is to use the statistical technique of
"Jackknife".  Instead of running one regression using 20 observations, 20
regressions each containing 19 observations are run, each omitting a
different observation.  The path coefficients can then be examined for
stability.  This technique guards against the path analysis capitalizing
on chance elements in the data.  This Jackknifing was done for a selected
number of equations in the present project.
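
             The leave-one-out procedure just described can be sketched as
follows.  This is only an illustration on made-up data for 20 observations;
the helper names (solve, ols, jackknife) and the data-generating values are
assumptions, and the actual equations and stability criteria used in the
project are those described in Chapter VI.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients of y on the given columns (no intercept)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


def jackknife(columns, y):
    """Refit the regression n times, each time omitting one observation."""
    estimates = []
    for drop in range(len(y)):
        kept_cols = [[v for i, v in enumerate(col) if i != drop] for col in columns]
        kept_y = [v for i, v in enumerate(y) if i != drop]
        estimates.append(ols(kept_cols, kept_y))
    return estimates


# Made-up sample of 20 observations on two predictors.
rng = random.Random(1)
n = 20
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
y = [0.6 * a - 0.3 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]

full = ols([x1, x2], y)
loo = jackknife([x1, x2], y)

# Range of each coefficient over the 20 fits on 19 observations.
spread = [max(b[j] for b in loo) - min(b[j] for b in loo) for j in range(2)]
print("full-sample b's:", full)
print("range of each b over the leave-one-out fits:", spread)
```

A coefficient whose leave-one-out range is small relative to its size, and
whose sign never changes, would be regarded as stable in the sense used
here; a coefficient that swings widely when a single observation is dropped
would not.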

         6.  Multicollinearity not Present

             If the correlations between independent variables in a multiple
regression are extremely large in absolute value, we have what is called
the problem of multicollinearity.  If it is present, it is usually
impossible to separate the influences of the variables involved.  The
effect on the path analysis will be to produce unstable path coefficients
(they will have large standard errors).  Beta weights greater than unity are
even possible.  Obviously, the purposes of path analysis can be defeated if
collinear variables are present.  Notice that these collinear variables can
be either exogenous or endogenous, as long as they appear somewhere as
independent variables.

             To handle the problem of multicollinearity we examined the zero
order correlations for large correlations and paid careful attention to the
output of individual regressions looking for  unusually large (in absolute
value) beta weights and standard errors of the regression coefficients.  See
Chapter VI on analysis for further details.
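
             The remark that beta weights can exceed unity under
multicollinearity can be illustrated with two standardized predictors.  The
correlations below are hypothetical values chosen for illustration, not
data from this study; the formulas are the usual two-predictor solution of
the normal equations in correlation form.

```python
def beta_weights(r12, r1y, r2y):
    """Beta weights for two standardized predictors, from zero order correlations."""
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r12 * r2y) / denom
    beta2 = (r2y - r12 * r1y) / denom
    return beta1, beta2


# Hypothetical case: the two predictors correlate .95 with each other.
b1, b2 = beta_weights(r12=0.95, r1y=0.60, r2y=0.40)
r_squared = b1 * 0.60 + b2 * 0.40   # must stay below 1 for a valid correlation matrix

print("beta weights:", round(b1, 3), round(b2, 3))
print("R squared:", round(r_squared, 3))

# The same correlations with Y, but predictors that correlate only .20.
c1, c2 = beta_weights(r12=0.20, r1y=0.60, r2y=0.40)
print("beta weights with weak collinearity:", round(c1, 3), round(c2, 3))
```

In the collinear case both beta weights exceed unity in absolute value and
have opposite signs, even though each predictor has a moderate positive
zero order correlation with Y.  Large beta weights of this kind are among
the warning signs looked for in the screening described above.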

-------
         7.  Scale

             At one time it was considered important that the variables be
measured on an interval scale.  Now, however, this is not so important if
properly handled [28].  The data for the present proposal does appear to
be appropriate for path analysis.

         8.  Suppressor Variable Problem

             It is possible for a path analysis system to contain too many
input variables.  For example, say X1 refers to the size of a new highway
and X4 is the commercial land use.  Further, say X2 and X3 are two other
variables.  The structural model is given by the equations:

             X_2 = b_{21}X_1 + b_{2u}U

             X_3 = b_{31}X_1 + b_{3v}V                              (2-6)

             X_4 = b_{41}X_1 + b_{42}X_2 + b_{43}X_3 + b_{4w}W

and say the correlation matrix is:

                    X1       X2       X3       X4
             X1   1.000     .800     .600     .000
             X2            1.000     .480     .288
             X3                     1.000     .384
             X4                              1.000

 Given this it can be shown that the solution for X4 is:

             X_4 = -X_1 + .8X_2 + .6X_3 + .736W                     (2-7)

 The size of the highway appears to have a negative impact on commercial
 land use (path coefficient is -1).

-------
             The actual situation is that a model was developed to produce
the correlation matrix and it was:

             X_4 = .48U + .48V + .736W                              (2-8)

U, V and W are the residuals of X2, X3 and X4.  X4 and X1 are actually
uncorrelated.  The "strange" result shown in (2-7) arises because of the
following:  If we use X2 and X3 to predict X4 we will have a useful
result, but X2 and X3 contain X1, which is irrelevant.  In (2-7) the term
in X1 is used to "suppress" the irrelevant components in X2 and X3.  That
is, X2 and X3 do contain a portion of X1 in them and this portion is
subtracted again in the equation by adding the negative term with X1.
Figure 2-3 illustrates the problem.  The system of equations (2-6) assumes
Figure 2-3a while the true situation is Figure 2-3b.
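
             The path coefficients in (2-7) follow from the correlation
matrix given above by solving the normal equations in correlation form.
The sketch below does this with a small elimination routine; the helper
name solve is an assumption for illustration, and the correlations are the
ones printed in the matrix above.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


# Correlations among the predictors X1, X2, X3 of the X4 equation.
r_xx = [[1.000, 0.800, 0.600],
        [0.800, 1.000, 0.480],
        [0.600, 0.480, 1.000]]
# Correlations of X1, X2, X3 with X4.
r_xy = [0.000, 0.288, 0.384]

b41, b42, b43 = solve(r_xx, r_xy)

# Share of the variance of X4 explained, and the resulting residual path.
r_squared = b41 * r_xy[0] + b42 * r_xy[1] + b43 * r_xy[2]
residual_path = math.sqrt(1.0 - r_squared)

print("b41, b42, b43:", round(b41, 3), round(b42, 3), round(b43, 3))
print("residual path:", round(residual_path, 3))
```

The solution reproduces the coefficients -1, .8 and .6 of (2-7) even though
the zero order correlation between X1 and X4 is zero.  The residual path
computed this way is about .734, slightly different from the .736 printed
in (2-7) and (2-8); the difference is presumably rounding in the original.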

             The importance of this illustration is that one cannot just
include anything into a structural system and believe all will work out
well.

             To handle this problem we examined the zero order correlations
between all endogenous variables (i.e., those that are dependent variables
in some regressions) with all other variables.  Those cases where zero or
near zero correlations appeared were noted.  Later if the partial relations
(when included in multiple regression or two stage least squares) proved
to be significant, careful  substantive consideration was applied to see if
a suppressor variable problem may exist.  Also the intercorrelations of the
regression coefficients or beta weights (i.e., the b's) were examined to
detect if the suppressor variable problem existed.  See Chapter VI on
analysis for further details.

     F.   OTHER ASSUMPTIONS

         There are other assumptions that deal solely with the fact that path
analysis is a regression technique (e.g., independence of observations).  We
will not enumerate them for they are standard [25].

-------
FIGURE 2-3   ILLUSTRATION OF SUPPRESSOR VARIABLE: (a) AND (b)

-------
     G.  THEORY TRIMMING

         To end this review of path analysis we will now discuss the ques-
tion of theory trimming.  One approach to path analysis is to develop the
most elaborate system and then after estimating the involved path coeffi-
cients attempt to refine or trim the system by dropping those paths that
have coefficients "close to zero".  Those paths that are "close to zero"
imply no or only a weak causal linkage between the involved variables.
The possibility for refining or trimming a theory and thus making a theory
more parsimonious is of considerable significance.

         There are a number of ways of deciding which path coefficients can
be dropped.  First, there is the criterion of statistical significance
(e.g., drop all path coefficients which do not differ significantly, in the
statistical sense (0.05 level of significance), from zero).  With large
samples, this could imply retaining very small coefficients and with small
samples it could imply dropping very large coefficients.  Second, there is
the possibility of dropping all coefficients whose values are less than some
preassigned fixed value (e.g., 0.05 or 0.03) [29].  There is also the
dependence
analysis procedure of Boudon [30].  In this technique, one usually assumes
that some path coefficients of the structural equations are zero.   To test
this procedure, one sees how well the non-zero paths reproduce the original
correlation matrix.  In general,  for any trimming procedure it is  important
to see how deletion of some path coefficients affects the ability to repro-
duce the original observed correlations (an important property of path  analy-
sis itself is the ability of reproducing the original observed correlations
by means of the path coefficients).

         The approach we took involved retaining a variable if (1) its t
value exceeded unity in absolute value (this guarantees that the adjusted
R^2 is larger with its inclusion than with its exclusion for ordinary
multiple regression), (2) its beta weight exceeded .1 in absolute value
(this was judged to reflect a substantively meaningful relation) and (3) it
was deemed a priori to be of substantive importance and its sign (i.e., the
sign of the coefficient b) was in the right direction.  See Chapter VI on
analysis for further detail.
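
         The link between a t value above unity and a larger adjusted R^2
holds exactly for ordinary least squares, and it can be checked on
simulated data.  The sketch below is illustrative only: the data are made
up, the sample size of 20 mirrors this study, and the helper names (solve,
fit, adjusted_r2, compare) are assumptions rather than routines used in the
project.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(columns, y):
    """OLS via the normal equations; returns (coefficients, residual sum of squares, X'X)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * columns[j][i] for j in range(k)) for i in range(len(y))]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, rss, xtx


def adjusted_r2(rss, y, k):
    """Adjusted R^2 for a model with k fitted coefficients (intercept included)."""
    n = len(y)
    mean = sum(y) / n
    tss = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - (rss / (n - k)) / (tss / (n - 1))


def compare(seed, n=20):
    """Return (t value of the added variable, adjusted R^2 without it, with it)."""
    rng = random.Random(seed)
    ones = [1.0] * n
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [1.0 + 0.8 * a + 0.2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    _, rss_reduced, _ = fit([ones, x1], y)
    beta, rss_full, xtx = fit([ones, x1, x2], y)
    # Variance of the x2 coefficient: s^2 times the matching diagonal entry of (X'X)^-1.
    inverse_column = solve(xtx, [0.0, 0.0, 1.0])
    s2 = rss_full / (n - 3)
    t_value = beta[2] / math.sqrt(s2 * inverse_column[2])
    return t_value, adjusted_r2(rss_reduced, y, 2), adjusted_r2(rss_full, y, 3)


for seed in range(5):
    t_value, without_x2, with_x2 = compare(seed)
    print(round(t_value, 2), round(without_x2, 3), round(with_x2, 3))
```

In each simulated sample the adjusted R^2 rises when the second variable is
added exactly when its t value exceeds unity in absolute value, which is
the property relied on in criterion (1).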

-------
III. INITIAL LAND USE MODEL SPECIFICATION

     A.  OBJECTIVE

         The objective of this phase of the project was to specify an  initial
land use model  explaining induced or associated land use ten years after
the construction and operation of a major project.  The basic theory
underlying the development of the model was that major projects have certain
associated or induced land uses and that these land uses can be predicted
based on the characteristics of the major project and the area in which it
locates.
However, because of the approach to testing this model, it is necessary to
include all  land uses in the vicinity of the major  project, whether they
were induced or not induced, or even existing prior to the construction of
the major project.

         Two types of major projects were to be considered in the formula-
tion of the model, residential projects and office/industrial projects.
Because of the differing land uses associated or induced by these types of
projects, a separate model was constructed for each.   Thus, two models were
developed, one explaining induced or associated land uses ten years after
the construction and operation of a major residential  project; the other
explaining induced or associated land uses ten years after the construction
and operation of a major industrial/office project.

     B.  DEFINITION OF MAJOR PROJECT

         For purposes of the model specification, a major residential  pro-
ject was defined as housing facilities, planned unit developments or new
towns containing a minimum population of 4,500; a major industrial/office
project was defined as an office or industrial park or a research and
development complex with a minimum employment of 2,250.  Both types of
projects were initially assumed to reach nearly 80  percent occupancy within
two years of operation.  However, during case study selection, the defini-
tion of major project was somewhat modified to permit phased projects
(See page 4-3).

-------
         In addition, for purposes of calibrating the model, the case studies
to be analyzed were required to be projects built between 1954 and 1964.  The
induced or associated land uses were those as of the year 1970, i.e., the
year by which it was assumed that the land use impacts of the project had
stabilized.

     C.  AREA OF INFLUENCE

         The size and definition of the geographical area for which the causal
model and predictive equations are specified and calibrated is critical to
their success and applicability.  If the area selected is too large,  the model
will accurately predict the end state  land use, but the influence of the
major project will not be seen.  This is due to the noninduced land use over-
whelming the induced land use.  On the other hand, if the area selected is
too small, the model will not capture all of the induced uses; this also
leads to an underestimate of the major project's influence.  Consequently,
two issues concerning the area of influence required resolution:

         1.  Should the area of influence vary with the size or density of
             the major project?

             The statistical analysis to be employed requires that the
major projects show substantial variation in size.  This assumes  that a larger
project will  have a greater amount of induced and secondary effects.

             As a major project increases in size or decreases in density
it will occupy a relatively larger amount of land.   The more land area in
the major project, the less area there will be in the remaining circle.
Figure 3-1(a) illustrates this concept.  With a larger amount of land area
in the major project, a circle of radius r' would have to be drawn so that
the area inside the circle, but not in the major project, remains constant.
Few of the major projects considered as case studies were much more than
one square mile.  This is a relatively small amount of land in a  circle of
radius two miles (3.22 km) and area of 12.56 square miles (32.53  sq.  km)
or a circle of radius 3 miles (4.83 km) and area of 28.27 square  miles
(73.22 sq. km).  Therefore, it is felt that this concern can be ignored.

-------
FIGURE 3-1  ILLUSTRATION OF EFFECT OF DIFFERING MAJOR PROJECT SIZES:
            (a), (b) AND (c)

-------
             Even with major projects of identical area, a higher density
project is assumed to have a greater inducing effect (due to more employees
or residents).  Figures 3-1(b) and 3-1(c) illustrate this concept.  For a
project of size 4, a circle of radius r is drawn, and the amount of induced
land use Q is plotted against the distance d from the major project.  At
distance r from the major project, the induced land use falls to zero.   The
size of the area under the curve is the total induced land use for a project
of size 4.

             To the right in Figures 3-1(b) and 3-1(c), a project of size 8
is shown.  It is assumed that the project has a larger induced land use, so
the area under the curve must be greater.  Depending on where this
additional induced land use occurs, a circle of radius r or r' would be
appropriate.  If the curve represented by distance r' is the true case and
a circle of radius r is used, some of the induced land use will not be
captured, and the statistical analysis will underestimate the impact of the
larger major project.

             In reality, the curve for a larger project is somewhere between
the extremes shown in Figure 3-1(c).   Although it is believed that a larger
major project will  have some influence at a greater distance than a smaller
project, we have assumed that the additional  induced land use between r and
r' is negligible.   This assumption was made in the  interest of simplifying
the model.  If a varying study area (area of influence) were allowed, it
would create two difficult problems beyond the scope of this study:  1) the
development of a systematic definition of the area  of influence as a function
of major project size and 2)  the definition of the  induced land uses as a
rate, i.e., square feet of induced land use per acre of study area.

             In conclusion, the size  of the area of influence will  be con-
stant, regardless  of major project size.

-------
         2.  What is the appropriate size of the study area (area of
             influence) for each major project type?

             The induced growth from a major development project would be
distributed throughout an urban area; however, in selecting the size of the
area of influence, we have assumed that the absolute amount of induced
development is some declining function of the distance from the major
project.  The absolute amount of induced development will, of course, be
influenced by other variables (the determination of that influence is the
goal of this project).

             For example, the construction of a major residential development
would create a demand for retail trade facilities to serve the residents.
The residents would generally minimize travel time and would shop at shopping
centers closest to them.  Sales due to the residents of the major develop-
ment would, therefore, decline with the distance from the development.  The
sales could occur at new shopping centers or at existing ones enlarged to
handle the increased volume.

             Induced residential development from a major industrial or office
development would generally follow the same pattern.  The employees would seek
to minimize their travel time from home to work while trading off other
factors such as the cost of land, quality of the residential neighborhood,
etc.

             At a small distance from the major project, the absolute amount
of land available for development would be limited.  Accordingly, the first
assumption is modified such that the absolute amount of induced development
is at first an increasing function of distance at small distances from the
major development, viz.,

   [Sketch: absolute amount of induced development (vertical axis) plotted
   against distance (horizontal axis)]

             The general shape of this curve is confirmed by the sole item
of literature relevant to this question known to us.   The Air Pollution
Impact Methodology for Airports - Phase I, [31], in part, analyzed changes
in land use before and after the development of Chicago's O'Hare Airport
in the early 60's.  Figure 3-2, reproduced from that study, shows the rela-
tive proportion of land devoted to a particular use by distance from the
airport and time from development.  The horizontal  axis,  labeled zone,
refers to 5 concentric squares surrounding the airport (see Figure 3-3),
each one is approximately an additional mile (1.609 kilometers) from
the airport.  The vertical  axis, labeled "percent zone land use", refers to
the percent of the land in each zone that is devoted  to a particular use.

             Manufacturing and warehousing land use would seem to be the
most significant induced use from the airport development.   The percentage
of land that is industrial  in 1970 and 1960 is tabulated  on the following
page.

-------
FIGURE 3-2  RELATIONSHIPS OF LAND USE TO DISTANCE FROM THE AIRPORT
            (panels: RESIDENTIAL; MANUFACTURING AND WAREHOUSING.
            Horizontal axis: distance (zone), 1 to 5.  Vertical axis:
            percent zone land use.)

-------
FIGURE 3-3  ZONES USED FOR LAND USE/DISTANCE ANALYSIS

NOTE:  The dotted line indicates the boundaries of the airport.

-------
                      1970            1960            60-70
                   Industrial      Industrial      Percentage
         Zone      Percentage      Percentage        Growth
          1            3%              1%              200
          2           13%              4%              223
          3           10%              5%              100
          4            3%              2%               50
          5            5%              4%               25

             Both the graph (Figure 3-2) and the data above seem to support
the general shape of the curve if one assumes the change in industrial land
use around O'Hare Airport is solely attributable to the airport.

             One criterion for selecting the size of the study area would
then be to maximize the amount of the induced development that would be
included.  In the case of an airport, we conclude that three to four miles
would be sufficient; however, the land use model being developed under the
current contract will predict the total land use rather than the change in
land use patterns induced by a major development.  The model is
cross-sectional; thus, the successful identification of the impacts of a
major project will depend on observing a correlation between the frequency
of a particular land use and the size of a major project.  For example, if
the real state of the world is that airports induce industrial development,
then a study design using cross-sectional analysis and a small sample size
will have the best chance of identifying the real state of the world if the
study area is selected such that the ratio of induced (1960-1970)
industrial land uses to existing (1960) land uses is maximized.

             This ratio is unknown since  identifying induced land uses  is
the goal of the contract effort.  It is also probable that the ratio will
vary with the type of land use being predicted, the size of the major pro-
ject, and the category of the major project.  With these caveats, however,
an indication of the appropriate study area size can be obtained.

-------
             a.  Induced Residential Example

                 The national average home to work trip lengths for
industrial plants [32] provide a rough estimate of the location of the
induced residential development.  Using Middlesex County, Massachusetts, as
an example, the ratio of induced residential development to existing housing
is computed in Exhibit 3-1.

                 At any radius that is greater than two miles, the number of
induced households as a percentage of existing households falls below 10
percent.  Given the small sample size in this study, it is doubtful the analy-
sis could be successful at a greater radius.  On the other hand, a radius of
two miles or less would compromise the usefulness of the product as less than
half (800 households/2000 households =40%) of the total  induced households
would be in the study area.

             b.  Induced Commercial Example

                 Using a methodology similar to the above, the location of
induced commercial  development can be obtained.  As shown on Exhibit 3-2, the
multiplier dy/dx, measuring the increase in regional income, Y, per increase
in export, X, is roughly estimated to be 1.45.  This approach treats the
development of a major residential  project, with its 2000 household consump-
tion of goods and services,  as an export as far as the region is concerned.
The increase in regional goods and services due to 2000 household major pro-
jects is then estimated.  Finally,  the ratio of induced to prior commercial
development is estimated.  As indicated, it peaks at 2 miles; however, note
that a circle of this radius includes only 30 percent of the total  induced
commercial development.

                 In summary, the three examples indicate that a radius of
between 2 and 3 miles would  be appropriate.  We have elected to use a study
area of 10,000 acres, i.e., a circle with a radius of 2.23 miles.
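
                 The correspondence between a 10,000 acre study area and a
radius of 2.23 miles can be verified directly.  The short sketch below
assumes only the standard conversion of 640 acres per square mile; the
function names are illustrative.

```python
import math

ACRES_PER_SQUARE_MILE = 640


def radius_miles(acres):
    """Radius, in miles, of a circle whose area is the given number of acres."""
    square_miles = acres / ACRES_PER_SQUARE_MILE
    return math.sqrt(square_miles / math.pi)


def circle_acres(radius):
    """Area, in acres, of a circle with the given radius in miles."""
    return math.pi * radius ** 2 * ACRES_PER_SQUARE_MILE


study_radius = radius_miles(10_000)
print("radius of a 10,000 acre circle, miles:", round(study_radius, 2))
print("acres within 2 miles:", round(circle_acres(2.0)))
print("acres within 3 miles:", round(circle_acres(3.0)))
```

The computed radius lies between the 2 and 3 mile radii suggested by the
examples, and the 10,000 acre area falls between the areas of the 2 mile
and 3 mile circles.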

-------
                              EXHIBIT 3-1
                      INDUCED RESIDENTIAL  EXAMPLE
Objective:  Estimate of the induced residential  development as a percentage
            of prior residential.

            Assume a plane of constant household density with the subsequent
            introduction of a 2000 employee industrial  facility.  Middlesex
            County, Massachusetts, will be used as an example.

I.    DATA

            Middlesex County:

                      Households          = 461,000 [33]
                      Total Acres         = 540,564 [33]
                      Household Density   = 0.8528 households per acre

            National home to work trip length frequency of auto drivers and
            walk to work commuters [32]:

                  Distance            Cumulative Percentage of Auto Drivers
            Kilometers    Miles           and Walk to Work Commuters
               .805         .5                       5.0
              1.609        1.0                      17.5
              3.218        2.0                      40.0
              4.827        3.0                      61.5
              6.436        4.0                      68.5

-------
                               EXHIBIT 3-1  (Continued)

                              INDUCED RESIDENTIAL  EXAMPLE

     Radius               Area            Cumulative    Cumulative    Cumulative    Cumulative
                                            Prior       Proportion     Induced       Induced
  Miles     Km       Acres    Sq. Meter   Households    of Induced    Households    as Percent
                                                        Households                   of Prior
   0.5      0.8        503    2.0x10^6        429         0.05           100          23.3
   1.0      1.6       2011    8.1x10^6       1709         0.175          350          20.5
   2.0      3.2       8042    3.3x10^7       6836         0.40           800          11.7
   3.0      4.8      18096    7.3x10^7      15381         0.615         1230           8.0
   4.0      6.4      32169    1.3x10^8      27344         0.685         1370           5.0

  Area in acres             = pi * R^2 * 640 (R in miles)
  Prior households          = area in acres * 0.85
  Induced households        = cumulative proportion * 2000
  Induced as percent of prior = induced households / prior households
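
The arithmetic of Exhibit 3-1 can be reproduced with a short script. This is a minimal sketch, not part of the original study: the function and constant names are our own, the density is the rounded 0.85 households per acre used in the exhibit, and one induced household per new employee is assumed, as the exhibit implies. Small differences from the printed figures (for example in the 0.5 mile row) are rounding.

```python
import math

HOUSEHOLD_DENSITY = 0.85   # households per acre (rounded value used in the exhibit)
NEW_EMPLOYEES = 2000       # size of the hypothetical industrial facility
ACRES_PER_SQ_MILE = 640

# Cumulative share of commuters living within each radius (miles), from Exhibit 3-1.
TRIP_SHARE = {0.5: 0.05, 1.0: 0.175, 2.0: 0.40, 3.0: 0.615, 4.0: 0.685}


def induced_residential(radius_miles, share):
    """Return (acres, prior households, induced households, induced as % of prior)."""
    acres = math.pi * radius_miles ** 2 * ACRES_PER_SQ_MILE
    prior = acres * HOUSEHOLD_DENSITY
    induced = share * NEW_EMPLOYEES
    return acres, prior, induced, 100.0 * induced / prior


rows = {r: induced_residential(r, s) for r, s in TRIP_SHARE.items()}
for r, (acres, prior, induced, pct) in rows.items():
    print(f"{r:4.1f} mi {acres:8.0f} acres {prior:8.0f} prior {induced:6.0f} induced {pct:5.1f}%")
```

The percentage falls steadily with radius because the prior housing stock grows with the square of the radius while the induced households are capped at 2000.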

-------
                              EXHIBIT 3-2
                    INDUCED COMMERCIAL DEVELOPMENT

Objective:  Estimate the induced retail trade and services as a percentage
            of total retail trade and services.

            Middlesex County, Massachusetts, is used as an example.

I.   Data
            Middlesex County:

                 Retail Trade, Thousand Dollars = 3,703,001 [33]
                 Services, Thousand Dollars     = 1,090,370 [33]
                                        TOTAL   = 4,793,371
                 8.867 Thousand Dollars of Commercial Sales per Acre
                10.398 Thousand Dollars of Commercial Sales per Household
                      Median Disposable Income = 13,076 Dollars [34]
                      Disposable Income = 6,782,652 Thousand Dollars

II.  Income Equation

            Conceptually, the major project can be thought of as an export.

            $Y = C_{lg} + C_{ls} - M_g - M_s + I + G + X$

     where
            $Y$      = regional income
            $C_{lg}$ = consumption of local goods
            $C_{ls}$ = consumption of local services
            $M_g$    = imports of goods
            $M_s$    = imports of services
            $I$      = investment
            $G$      = government
            $X$      = exports
                                    3-13

-------
                             EXHIBIT 3-2 (CONTINUED)
                          INDUCED COMMERCIAL DEVELOPMENT

Consumption Local Goods:      $C_{lg}$ = 3703001/6782652 = .54595     .54595
Consumption Local Services:   $C_{ls}$ = 1090370/6782652 = .16076     .16076
                                                       TOTAL     .70671
Assume 60% of goods imported
Assume 40% of services imported
           $Y = .54595Y + .16076Y - .6(.54595)Y - .4(.16076)Y + I + G + X$
           $Y = (I + G + X)/.685$
       $dY/dX = 1.45$
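
The steps between the income identity and the multiplier can be written out as follows. This is a worked restatement of the exhibit's own numbers, with $C_{lg}$ and $C_{ls}$ denoting consumption of local goods and local services; no new assumptions are introduced.

```latex
\begin{align*}
Y &= 0.54595\,Y + 0.16076\,Y - 0.6(0.54595)\,Y - 0.4(0.16076)\,Y + I + G + X \\
Y\,\bigl[\,1 - 0.4(0.54595) - 0.6(0.16076)\,\bigr] &= I + G + X \\
0.685\,Y &= I + G + X \\
\frac{dY}{dX} &= \frac{1}{0.685} \approx 1.46
\end{align*}
```

The exhibit carries this multiplier forward as 1.45, and that is the value used in the induced land use calculation.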

III.  Induced Land Use Calculation

            $\Delta X = \Delta\text{Households} \times \$13{,}076$
            $\Delta Y = \Delta X \times 1.45$
            $\Delta\text{Commercial} = \Delta Y \times 0.70671$

Assume that prior retail trade and services are evenly distributed, at 8.867 thousand
dollars per acre.  Assume a 2000 household major project.

Shopping center trip length distribution [32]:

            Distance              Cumulative
      Miles        Km           Trip Frequency
        1           1.6               5%
        2           3.2              30%
        3           4.8              50%
        4           6.4              60%
        5           8.0              75%
                                   3-14

-------
                                                EXHIBIT 3-2  (CONTINUED)
                                            INDUCED COMMERCIAL DEVELOPMENT

    Radius          Area           Prior      Cumulative  Households                        Induced     Induced
                                 Commercial     Trip      Shopping                         Commercial  Commercial /
 (miles)  (km)  (acres)  (m^2)    Activity    Frequency    in Area     Delta X   Delta Y                 Prior
                                                                                                       Commercial
    1     1.6     2011  8.1x10^6    17831        .05         100        1307.6     1896       1340        7.5%
    2     3.2     8042  3.3x10^7    71308        .30         600        7845      11376       8039       11.3%
    3     4.8    18096  7.3x10^7   160457        .50        1000       13076      18960      13399        8.3%
    4     6.4    32169  1.3x10^8   285243        .60        1200       15691      22752      16079        5.6%
    5     8.0    50265  2.0x10^8   445703        .75        1500       19614      28440      20099        4.5%

 Prior commercial activity, Delta X, Delta Y and induced commercial are in thousand dollars.
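
The induced commercial calculation of Exhibit 3-2 can be sketched the same way. The names below are our own and the script is only an illustration of the exhibit's arithmetic: it uses the multiplier of 1.45 and the local consumption share of 0.70671 as carried in the exhibit, works in thousand dollars, and takes the cumulative trip frequency at 5 miles as 75 percent, as listed in the trip length distribution.

```python
import math

SALES_PER_ACRE = 8.867         # prior commercial sales, thousand dollars per acre
INCOME_PER_HOUSEHOLD = 13.076  # disposable income, thousand dollars per household
MULTIPLIER = 1.45              # dY/dX as carried in the exhibit
LOCAL_SHARE = 0.70671          # share of income spent on local goods and services
PROJECT_HOUSEHOLDS = 2000

# Cumulative shopping trip frequency by radius (miles), from Exhibit 3-2.
TRIP_SHARE = {1: 0.05, 2: 0.30, 3: 0.50, 4: 0.60, 5: 0.75}


def induced_commercial(radius_miles, share):
    """Return (prior sales, induced sales, induced as % of prior), in thousand dollars."""
    prior = math.pi * radius_miles ** 2 * 640 * SALES_PER_ACRE
    delta_x = share * PROJECT_HOUSEHOLDS * INCOME_PER_HOUSEHOLD  # new "export" income
    delta_y = delta_x * MULTIPLIER                               # induced regional income
    induced = delta_y * LOCAL_SHARE                              # spent on local commerce
    return prior, induced, 100.0 * induced / prior


ratios = {r: induced_commercial(r, s)[2] for r, s in TRIP_SHARE.items()}
peak_radius = max(ratios, key=ratios.get)
print(peak_radius, {r: round(v, 1) for r, v in ratios.items()})
```

The ratio rises from 1 to 2 miles because the trip frequency grows faster than the area over that range, and falls thereafter; this is the peak at 2 miles referred to in the text.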

-------
     D.  IDENTIFICATION

         In this section we briefly review the limitations imposed on the
model specification due to the requirement of identification.

         When there are feedback loops or reciprocal causal paths in a path
analytic model, the residuals in the model are correlated with the endogenous
variables, and ordinary least squares yields biased and inconsistent
estimates of the structural coefficients.  For example, say $Y_1$ and $Y_2$ are
residential and commercial variables measured in appropriate units, and say
there is believed to be a feedback loop between them.  Then possible
structural equations are:

         $Y_1 = a_1 + b_{21} Y_2 + e_1$                                (3-1)
         $Y_2 = a_2 + b_{12} Y_1 + e_2$

It can be shown [35] that the residual $e_1$ is correlated with $Y_2$ and the
residual $e_2$ is correlated with $Y_1$.  Ordinary least squares applied to either
of these equations will result in biased and inconsistent estimates of the
structural parameters $a_1$, $b_{21}$, $a_2$, and $b_{12}$.  By biased and inconsistent
estimates we mean that the average value of the estimates will not equal
the values of the structural parameters no matter how large the sample size
may be.

         The approach usually followed at this point to obtain appropriate
estimates of the structural parameters is to introduce instrumental vari-
ables and apply two stage least squares.  This approach will yield consis-
tent estimates if the relevant equations are properly identified.  The term
"instrumental  variable" merely means a variable which acts as an instrument
in making the system identifiable.   Below, we review how instrumental vari-
ables are introduced, how two stage least squares is applied and how the
identifiability condition arises.

         Given the structural system of equation (3-1), one must introduce at
least one new variable which is a "cause" of $Y_1$ but not of $Y_2$.  Let us call
this variable $X_3$.  Similarly, a variable $X_4$ should be introduced for $Y_2$.  $X_3$ and $X_4$
are instrumental variables.  There is no reason why there need be only one
instrumental variable for each endogenous variable; there must be at least
one.  To allow for more generality, let us say we have a second instrumental
variable for $Y_1$, and let us call it $X_5$.  The important point is that we need
some instrumental variables which are causally related to one of the
endogenous variables, but not to the other in the system.  It is possible to have
some exogenous variables (which are still often called instrumental
variables) which are causes of more than one of the endogenous variables.  For
our example, let us say $X_6$ is such a variable.

The revised structural system is now

         $Y_1 = a_1 + b_{21} Y_2 + b_{31} X_3 + b_{51} X_5 + b_{61} X_6 + e'_1$          (3-2)
         $Y_2 = a_2 + b_{12} Y_1 + b_{42} X_4 + b_{62} X_6 + e'_2$

The $e'_1$ and $e'_2$ of (3-2) are not the same residuals as $e_1$ and $e_2$ of (3-1).

It can still be shown that $Y_2$ is correlated with $e'_1$ and $Y_1$ is correlated
with $e'_2$ (i.e., we have correlated residuals), so ordinary least
squares still cannot be applied directly to obtain consistent estimates of
the structural parameters, the b's.  However, if we manipulate the equations of
(3-2), we obtain a new system of equations which have $Y_1$ and $Y_2$ only on the left, viz.,

         $Y_1 = a'_1 + b'_{31} X_3 + b'_{41} X_4 + b'_{51} X_5 + b'_{61} X_6 + e''_1$    (3-3)
         $Y_2 = a'_2 + b'_{32} X_3 + b'_{42} X_4 + b'_{52} X_5 + b'_{62} X_6 + e''_2$

Only exogenous variables appear on the right hand side of (3-3), so ordinary
least squares can be applied to each equation to obtain estimates of $Y_1$ and
$Y_2$.  This is the first stage in the two stage least squares.
-------
Next, the estimates of $Y_1$ and $Y_2$ obtained from (3-3) can be substituted
into the right hand side of the equations in (3-2).  Using $\hat{Y}_1$ and
$\hat{Y}_2$ to represent the estimates, the system (3-2) with this adjustment becomes

         $Y_1 = a_1 + b_{21} \hat{Y}_2 + b_{31} X_3 + b_{51} X_5 + b_{61} X_6 + e'_1$    (3-4)
         $Y_2 = a_2 + b_{12} \hat{Y}_1 + b_{42} X_4 + b_{62} X_6 + e'_2$

$\hat{Y}_2$ is not correlated with $e'_1$ and $\hat{Y}_1$ is not correlated with $e'_2$, and so
ordinary least squares may now be applied to obtain consistent estimates of the
structural parameters, the b's.  This is the second stage in the two stage
least squares.
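
The two stages can be illustrated on simulated data. The sketch below is not the estimation code used in this study: it assumes NumPy is available, and the parameter values, the sample size of 5000 and the helper `ols` are hypothetical choices made only to show the contrast between direct ordinary least squares and two stage least squares on a system shaped like (3-2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
X3, X4, X5, X6 = rng.normal(size=(4, n))   # exogenous (instrumental) variables
e1, e2 = rng.normal(size=(2, n))           # structural residuals

# Hypothetical "true" structural parameters for the illustration.
a1, b21, b31, b51, b61 = 1.0, 0.5, 1.0, 1.0, 1.0
a2, b12, b42, b62 = 2.0, 0.8, 1.0, 1.0

# Solve the two structural equations jointly for Y1 and Y2 (the feedback loop).
u1 = a1 + b31 * X3 + b51 * X5 + b61 * X6 + e1
u2 = a2 + b42 * X4 + b62 * X6 + e2
det = 1.0 - b21 * b12
Y1 = (u1 + b21 * u2) / det
Y2 = (u2 + b12 * u1) / det


def ols(y, *regressors):
    """Least squares coefficients of y on an intercept plus the given regressors."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Direct OLS on the first structural equation: Y2 is correlated with the residual.
b21_ols = ols(Y1, Y2, X3, X5, X6)[1]

# Stage 1: regress Y2 on every exogenous variable in the system.
g = ols(Y2, X3, X4, X5, X6)
Y2_hat = g[0] + g[1] * X3 + g[2] * X4 + g[3] * X5 + g[4] * X6

# Stage 2: replace Y2 by its fitted value in the first structural equation.
b21_2sls = ols(Y1, Y2_hat, X3, X5, X6)[1]

print(f"true b21 = {b21}, OLS = {b21_ols:.3f}, 2SLS = {b21_2sls:.3f}")
```

With these settings the direct estimate stays well away from the true coefficient however large the sample, while the two stage estimate settles close to it.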

In order to obtain consistent estimates of the b's, it is necessary
that $\hat{Y}_2$ is not a linear function of $X_3$, $X_5$ and $X_6$ alone.  This is the
identification problem (we want to "identify" or estimate the b's), and by
bringing $X_4$ into the system we have this necessary condition satisfied.
Recall $X_4$ is the instrumental variable that is causally related to $Y_2$ but
not $Y_1$.  A similar necessary condition is needed to identify the b's of the
second equation of (3-4).

The necessary condition for identification can be formalized as
follows.  Say we want to estimate the b's of the first equation of the
system (3-2).  First, one counts the number of instrumental variables in
the full system.  Here, there are four: $X_3$, $X_5$, $X_6$ and $X_4$.  Next, one counts
the number of instrumental variables excluded from the equation under
consideration.  Let us call this number $m_0$.  Here $m_0$ is one, for variable $X_4$.
Then we count the number of endogenous variables on the right hand side of
the equation.  Let us call this number $q$.  For this problem $q$ is one, for
$Y_2$.  The necessary condition for identifying the parameters of the equation
under consideration is then that

         $m_0 \geq q - 1$                                              (3-5)

[35].  For our example $1 \geq 1 - 1 = 0$, so we have identification.  Condition
(3-5) is necessary but not sufficient (i.e., if an equation is identified
it must satisfy (3-5); however, there are equations that satisfy (3-5) but
are not identified).  The sufficient condition involves rank conditions of
various matrices [35].
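
The counting rule lends itself to a mechanical check. The helper below is a hypothetical sketch (its name and arguments are our own); it implements condition (3-5) exactly as stated above and, like that condition, it tests only what is necessary for identification, not what is sufficient.

```python
def order_condition(system_exogenous, equation_exogenous, rhs_endogenous):
    """Check the necessary condition (3-5) for one structural equation.

    system_exogenous   -- all instrumental (exogenous) variables in the full system
    equation_exogenous -- those that appear in the equation under consideration
    rhs_endogenous     -- endogenous variables on the right hand side of the equation
    Returns (m0, q, satisfied).
    """
    m0 = len(set(system_exogenous) - set(equation_exogenous))  # excluded instruments
    q = len(set(rhs_endogenous))
    return m0, q, m0 >= q - 1


# First equation of system (3-2): X4 is the only excluded instrument, Y2 the only
# right hand side endogenous variable.
m0, q, satisfied = order_condition(
    {"X3", "X4", "X5", "X6"}, {"X3", "X5", "X6"}, {"Y2"}
)
print(m0, q, satisfied)
```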

Due to the existence of feedback loops in the present project,
the problem of identification required consideration. In the model speci-
fication phase of this project, care was taken to specify exogenous variables
that could serve as instrumental variables. Additionally, the necessary
condition for identification (3-5) was continually checked for compliance.

During the analysis phase of this project (Chapter VI) an unanticipated
problem related to identification developed.  As indicated above, in the
first stage of the two stage least squares, ordinary least squares is applied
to the right hand side endogenous variables.  All the instrumental variables
in the system are used as right hand side variables.  As the model, as
specified later in this chapter, had five simultaneous equations, there were
numerous instrumental variables.  In some instances, they approached twenty,
which is the sample size.  Thus, the degrees of freedom approached zero.
This problem and its resolution are discussed in the chapter on Causal
Analysis (Chapter VI).
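
The loss of degrees of freedom can be made concrete with a small calculation. The function below is an illustrative sketch only; it assumes each first stage regression includes an intercept in addition to the instrumental variables, which is our assumption rather than a detail stated in the text.

```python
def first_stage_degrees_of_freedom(sample_size, n_instruments):
    """Residual degrees of freedom of a first stage regression with an intercept."""
    return sample_size - n_instruments - 1


SAMPLE_SIZE = 20  # number of case studies, as stated in the text
for k in (5, 10, 15, 19):
    print(k, first_stage_degrees_of_freedom(SAMPLE_SIZE, k))
```

As the number of instrumental variables approaches the sample size, the first stage regression has almost no residual degrees of freedom left.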

E. METHODOLOGY

The specification of the model was based on (1) a literature search
to identify methodologies and case studies which had been used to determine
land uses associated with major projects and (2) the prior experience of
personnel with land use planning and forecasting, land use models, impact
analyses, and large development projects. In general, the literature
search was not productive. However, this does illustrate the potential use-
fulness of the GEMLUP project.
3-19
-------
While many land use models have been developed, these have dealt
primarily with total regional growth and activity. They have thus been
more complex and at a larger scale than the model we were developing. An
example of such a "regional" model is the EMPIRIC activity allocation model,
such as that used in the Metropolitan District Commission Wastewater Study
[36]. This model allocates projected employment and population to areas
within the region based on factors such as vacant land, accessibility,
availability of utilities, population characteristics, etc. While useful in
showing what influences development, the model does not address the specific
problem we are examining, i.e., induced or associated land uses after a major
project has been built.

In addition to "full" models, such as EMPIRIC, there are also many
"partial" models. These deal with one section of the economy such as the
retail or residential sector. For the most part these models tend to be
market and/or location oriented. They are designed to answer either the
question of what is the optimum location for that particular type of activity
or what is the market for such an activity in a certain geographical area.
An example of such a "partial" model is the "Retail Market Potential Model"
by Lakshmanan and Hansen [37]. This model predicts the size and number of
retail establishments based on consumer shopping expenditures and
accessibility to that area. As in the case of the "full" models, these "partial"
models are helpful in providing insight on what influences certain land use
activities (e.g., the relationship between retail establishments and consumer
expenditures); they do not, however, establish quantitative relationships
between a major project and surrounding land uses.

The literature search also investigated methodologies which have
been used in environmental impact studies to predict secondary impacts of
major projects. The most useful literature found on this topic was a
previous study by the U.S. Environmental Protection Agency titled "Secondary
Impacts of Transportation and Wastewater Investments: Review and Bibliography"
[38]. This report contained an extensive bibliography and discussion of
methodologies and studies performed concerning impacts of transportation and
wastewater investments. Because our area of interest was residential and
office or industrial projects rather than transportation or wastewater
investments, many of the impacts and relationships described in this report
were not applicable to our project. The report was nevertheless valuable in
helping to direct our thinking. Appendix A contains a bibliography of the
sources used in the literature search.

As indicated, the literature search did not produce what we were
most interested in finding, viz., quantitative descriptions of land use
induced by or associated with a major project. Therefore, in developing
our causal hypothesis we had to rely primarily on our own past experience
and professional judgement, i.e., what we had observed with respect to
land use development, particularly when major projects were built.

F. MODEL DESCRIPTION

1. Endogenous Variables

Due to the requirements of the emission factors, the units of
the endogenous variables for both the residential model and the industrial/
office model are building floor area (except for residential and outdoor
recreation land uses and highway lane miles) in each of 12 land use
categories. These land use categories are residential, retail, office,
manufacturing, wholesale and warehousing, hotel, hospital, cultural, churches,
public education, outdoor active recreation, and highway lane miles. These
particular categories evolved from a process which balanced the following
considerations:

• What land use output was needed for estimating emissions;
• What land use output could most effectively be predicted
using a causal model; and
• What land use output would be available during data
collection to calibrate the model.

The initial list of the desired model output is shown on Table 3-1. The
list was extensive, as the intent was to arrive at the most precise estimate
possible of land use and of emissions. The final list of model output
3-21
-------
TABLE 3-1

INITIAL LIST OF ENDOGENOUS VARIABLES
Land Use
Units of Measurement
Residential

SF attached
SF detached
2 Family
MF (low rise -
4 stories or less
excluding 2 family)
MF (high rise - 4
stories or more)
Mobile Homes

Transient Lodgings
(Hotels, motels,
dorms)

Manufacturing

Transportation services
(except airport and
warehousing)

Wholesale Trade

Retail Trade (including
certain personal
services)
Units*
Units
Units
Units

Units
Units
Units
Rooms*

Total sq.ft.

Sq.ft. in buildings
<25000 ft2 (<2323 m2)
25000-50000 ft2 (2323-4645 m2)
50000-100000 ft2 (4645-9290 m2)
>100000 ft2 (>9290 m2)
* Will be converted to sq.ft. based on average size per unit or room.
3-22
-------
TABLE 3-1 (CONTINUED)

INITIAL LIST OF ENDOGENOUS VARIABLES
Land Use
Units of Measurement
Office (all office includ-
ing government, utilities
and communication offices)
Warehousing

Government (non-office)

Cultural

Churches

Hospitals
Entertainment
Indoors (except
sports assembly)

Education, Public
Education (other)
Sports Assembly

Airport
Sq.ft. in buildings
<25000 ft2 (<2323 m2)
25000-50000 ft2 (2323-4645 m2)
50000-100000 ft2 (4645-9290 m2)
>100000 ft2 (>9290 m2)

Total sq.ft.

Total sq.ft.
Sq.ft. in buildings
<25000 ft2 (<2323 m2)
25000-50000 ft2 (2323-4645 m2)
50000-100000 ft2 (4645-9290 m2)
>100000 ft2 (>9290 m2)

Total sq.ft.
Sq.ft. in buildings
<25000 ft2 (<2323 m2)
25000-50000 ft2 (2323-4645 m2)
50000-100000 ft2 (4645-9290 m2)
>100000 ft2 (>9290 m2)

Sq.ft. in buildings
<25000 ft2 (<2323 m2)
25000-50000 ft2 (2323-4645 m2)
50000-100000 ft2 (4645-9290 m2)
>100000 ft2 (>9290 m2)

Total sq.ft.

Total sq.ft.
3-23
-------
TABLE 3-1 (CONTINUED)
INITIAL LIST OF ENDOGENOUS VARIABLES
Land Use Units of Measurement
Highway right-of-way Acres
Agriculture Acres
Mining Acres
Open Space Acres
3-24
-------
TABLE 3-2

MODEL VARIABLES AND DEFINITIONS
1. RES = number of housing units in area of influence in 1970 (excluding
major project).

2. COMM = commercial land use in area of influence in 1970 in 1,000
square feet

Commercial land use includes the following land use codes (LUC)
as used by the Public Administration Service in its
1962 Land Use Classification Manual.

LUC 52-59 Retail trade
61 Personal services
63 Automobile service
64 Miscellaneous repair service
65 Indoor amusement service

3. OFFICE = office land use in area of influence (excluding major project)
in 1970 in 1,000 square feet

LUC 60 Finance, Insurance, Real Estate
62 Business services
67 Medical, Health, Legal services
68 Other professional services

4. MANF = manufacturing land use in area of influence (excluding major
project) in 1970 in 1,000 square feet

LUC 2 Nondurable goods manufacturing
3 Durable goods manufacturing

5. WHOLE = wholesale/warehouse land use in area of influence in 1970 in
1,000 square feet

LUC 50 Wholesale
46 Warehousing

6. HOTEL = hotel and motel land use in area of influence in 1970 in 1,000
square feet
LUC 07 Hotels, Motels, Tourist homes

7. HOSPTL = hospital, etc., land use in area of influence in 1970 in
1,000 square feet

LUC 77 Hospitals, sanatoria, convalescent homes and rest
homes

8. CULTUR = cultural land use in area of influence in 1970 in 1,000
square feet
LUC 76 Museums, libraries, art galleries, except churches
(764), arboreta (762), cemeteries (767).
3-25
-------
TABLE 3-2 (Continued)
MODEL VARIABLES AND DEFINITIONS
9. CHURCH = religious land use in area of influence in 1970 in 1,000
square feet
LUC 764 Churches
765 Other religious services
10. EDUC = public educational land use in area of influence in 1970 in
1,000 square feet.
LUC 74 Public Schools
11. REC = active outdoor recreational land use in area of influence
in 1970 in acres
12. (a) HWLMNK = highway lane miles in area of influence in 1970,
excluding limited access highways.
12. (b) HWLM = highway lane miles in area of influence in 1970
13. (a) MPR70 = residential land use in major project in 1970 in
dwelling units
13. (b) MPR68 = residential land use in major project in 1968 in dwelling
units
13. (c) MPRt2 = residential land use in major project in base year plus
two (t+2) in dwelling units.
14. DUACRE = dwelling units per acre in area of influence in 1960
15. VACACR = percent vacant developable acreage in area of influence
in year (t+0)
16. VACHSG = percent vacant housing in area of influence in 1960
17. HWYINT = highway interchanges in area of influence in year (t+5)
18. MINCC = median income of families and individuals in area of
influence relative to U.S. median income in 1960
19. INCMP = variable indicating the median income level of major pro-
ject compared to surrounding community in year (t+2)

HWLM without expressways may be used instead, since commercial develop-
ment may be mostly strip highway development along major arterials
Data for the year (t+2) or 1970 may be used instead
3-26
-------
TABLE 3-2 (Continued)
MODEL VARIABLES AND DEFINITIONS
20. OFFVAC = percent office buildings vacant in metropolitan area in
year (t+0)
21. OFFACR = office employment per acre in area of influence in year
(t+0)
22. DISCBD = distance from center of major project to CBD in year
(t+0)
23. ENERGY = cost factor for electricity ($/1500 KWH) for commercial
users in the metropolitan area in year (t+0) divided by
the average U.S. commercial rate in 1960
24. RRMI = railroad mileage in area of influence in year (t+0)
25. WWEA = warehouse and wholesale employment per acre in area of
influence in year (t+0)
26. EMPACR = total employment per acre in area of influence in year
(t+0)
27. NONHSE = nonhousehold population per acre in area of influence
in 1960
28. MPKIDS = school-age children per dwelling unit in major project
in year (t+2)
29. ENRACR = public school enrollment per acre in area of influence
in 1960
30. MANACR = manufacturing employment per acre in area of influence
in year (t+0)
31. DELPOP = growth factor for total regional population between
1960 and 1970 (county data)
32. DELEMP = growth factor for total regional employment between
1960 and 1970 (county data)
33. MINCR = median income of the region in year (t+0) relative to
the median U.S. income in 1960
34. (a) MPE70 = number of employees in major project in 1970
34. (b) MPE68 = number of employees in major project in 1968
34. (c) MPEt2 = number of employees in major project in base year
(t+2)
35. AUTO = automobile drivers per acre in county in 1960
3-27
-------
(the first 12 variables on Table 3-2) was a much simplified version of the
earlier list. This simplified version resulted from these three decisions:

• It was decided to aggregate categories of land use (e.g.,
residential units, total office square feet, etc.) rather
than use the more specific subcategories within these broad
categories (e.g., number of two family units or office
square feet in buildings greater than 100,000 square feet,
9.3 x 10^3 square meters). We made the decision as we felt
our causal hypothesis would be more accurate and thus more
successfully tested at this larger scale. Further
disaggregation would be performed in the model calibration phase,
after the aggregated model had been tested.

• Certain of the initial land use categories were combined
based on the availability of data and our interest in sim-
plifying the causal structure. For example, warehousing and
wholesale trade were combined. In each of these cases, it
was felt that the disaggregated categories offered no further
accuracy in predicting emissions than the aggregate category.

• Certain of the land use categories were excluded from the
model as either the categories were independent of the major
project (e.g., correctional institutions) or the categories
produced only insignificant quantities of emissions. The
categories eliminated were: government (non-office), such as
correctional institutions and military installations;
airports; sports assemblies; educational uses except public
education; passive open space; agriculture; and mining.

2. Exogenous Variables

The model consists of 23 independent variables. These variables
represent (the numbers refer to the order on Table 3-2):

• Population, housing, and employment characteristics (variables
13, 14, 16, 18, 19, 20, 21, 25, 26, 27, 28, 29, 30, 34)

• Accessibility measures (variables 17, 22, 24)

• Developability measure (variable 15)

• Regional influences (variables 23, 31, 32, 33, 35)

The independent variables for each equation were selected
because of their perceived causal relationship with the dependent variables.
Prior to selecting these variables a complete list of all possible factors
influencing the dependent variable was prepared. From this list the most
significant variables were identified.

3-28
-------
The specific format for each independent variable was developed
based on 1) availability of data; 2) consistency of data among case studies;
and 3) the appropriate time period for the data. Several of the major
decisions made in this regard were:

a. Availability of Data

For many of the independent variables (variables 14, 15, 16,
17, 18, 21, 24, 25, 26, 27, 29, and 30) the desired level of aggregation for
the data item was by area of influence (i.e., within the 10,000 acres).
Because the area of influence of the project did not coincide with boundaries
of recognizable geographical divisions, such as municipal boundaries or cen-
sus tract boundaries, obtaining information for certain independent variables
for area of influence was often very difficult. In these cases the informa-
tion was collected for the census tracts comprising the area of influence.
Thus, the level of aggregation for variables 14, 16, 18, 27, and 29 was by
census tracts; for variables 15, 17, 21, 24, 25, 26, 30 by area of influence.
The distinction between the two groups was how the information was most easily
available.

The level of aggregation for certain variables was regional.
In these cases the variable was expressed at that level either because the
regional level was the smallest unit where accurate information was readily
available (variables 20 and 23) or because the variable was expressing
regional influences on the land uses within the area of influence (variables
31, 32, 33 and 35).

b. Consistency of Data

Many of the independent variables (variables 14, 15, 16, 20,
21, 23, 25, 26, 27, 28, 29, 30, 31, 32, and 33) were expressed as density
factors or as percentages rather than in absolute numbers. The intent was
to assure that the data was consistent among case studies.
3-29
-------
c. Time Periods

The time period for the majority of
independent variables is either the base year (t+0), when the project started,
or 1960. Some variables are for the period t+0 and others for
1960 because of data constraints.

The basic objective was to have all these variables
expressed for the base year (i.e., baseline conditions prior to the start
of the project). Where it was very difficult to obtain data for the base
year, then census data for the year 1960 was used. The specific data items
for 1960 are variables 14, 16, 18, 27, 29, 33. Those for the base year are
variables 20, 21, 22, 23, 24, 25, 26, 30, and 35.

The independent variables describing the major characteristics
of the project (variables 19 and 28) are for the time period t+2.
The independent variables measuring the size of the major project
(13a, 13b, 13c, and 34a, 34b, 34c) consist of observations of major project
size in three time periods, two years after initial occupancy (t+2), 1968,
and 1970. While it was originally planned to collect only the final size
of the major project (i.e., 1970), the possible differences in the phasing
of occupancy of the major project could lead to differing amounts of in-
duced land use. Consequently, it was decided to specify three independent
variables measuring the size of the major project.

Variables 31 and 32 were designed to measure the
influence of regional growth on the local land uses. The period between
1960 and 1970 was selected as census data was readily available to determine
growth and the period coincided fairly well with the initial year of opera-
tion of the projects in our case studies and 1970 - the year in which land
use impacts were assumed to have stabilized.

The only variable not available for the period t+0,
1960, t+2, 1968, or 1970 was variable 17, number of highway interchanges.
The time period for this variable was t+5. The reason was that the building
of new highway interchanges is commonly known five years in advance.
3-30
-------
3. Simultaneous Block

The specification of the model resulted in twelve structural
equations predicting the twelve endogenous variables described above. In
both models, five of these equations are simultaneous. These five equations
are shown on Tables 3-3 and 3-4. The path diagrams representing these
equations are shown in Figures 3-4 and 3-5.

4. Recursive Block

The remaining seven equations in each model are recursive.
These equations are shown on Table 3-5 and Table 3-6; their path diagrams
are shown in Figures 3-6 and 3-7. Finally the complete model is shown on
Tables 3-7 and 3-8 and in Figures 3-8 and 3-9.
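
The split into a simultaneous block and a recursive block can be checked mechanically from the equations themselves. The sketch below is our own illustration, not a procedure from the study: it transcribes only the endogenous right hand side variables of the residential model from Tables 3-3 and 3-5 and flags every variable that depends, through some chain of equations, on itself.

```python
# Endogenous right hand side variables of each equation in the residential model
# (transcribed from Tables 3-3 and 3-5; exogenous variables are omitted).
DEPENDS_ON = {
    "RES":    {"OFFICE", "MANF", "HWLM"},
    "COMM":   {"RES", "OFFICE", "MANF", "HWLM"},
    "OFFICE": {"RES", "COMM", "MANF", "HWLM"},
    "MANF":   {"HWLM"},
    "HWLM":   {"RES", "COMM", "OFFICE", "MANF"},
    "WHOLE":  {"COMM", "MANF", "HWLM"},
    "HOTEL":  {"RES", "OFFICE", "MANF"},
    "HOSPTL": {"RES"},
    "CULTUR": {"RES"},
    "CHURCH": {"RES"},
    "EDUC":   {"RES"},
    "REC":    {"RES"},
}


def reachable(start):
    """All variables that `start` depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        for dep in DEPENDS_ON[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# A variable sits in a feedback loop if it depends, through some chain, on itself.
simultaneous = {v for v in DEPENDS_ON if v in reachable(v)}
recursive = set(DEPENDS_ON) - simultaneous

print(sorted(simultaneous))
print(sorted(recursive))
```

The five variables in feedback loops are the ones that require two stage least squares; the remaining seven depend on them but feed nothing back.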

G. THEORETICAL AND EMPIRICAL BASIS FOR THE MODEL

This section describes the theoretical or empirical base for the
residential project model and for the industrial/office project model.
Because the equations for these models are in many cases the same and in
other cases very similar, the two models are discussed jointly. The three
major project variables are treated as one variable (MPR or MPE) in the fol-
lowing discussion.

1. Equation 1

Res. Proj.:      RES = OFFICE + MANF + HWLM + DUACRE + VACACR +
                       VACHSG + DELPOP + MPR

Ind/Off. Proj.:  RES = OFFICE + MANF + HWLM + DUACRE + VACACR -
                       VACHSG + DELPOP + MPE

In these equations residential development (1970) is a function
of office development (1970); manufacturing development (1970); highway
lane miles (1970); dwelling units per acre (1960); percent vacant developable
3-31
-------
TABLE 3-3

FIVE EQUATION SIMULTANEOUS BLOCK IN THE RESIDENTIAL MODEL

RES = OFFICE + MANF + HWLM + MPR + DUACRE + VACACR + VACHSG + DELPOP

COMM = RES + OFFICE + MANF + HWLM + MPR70 + HWYINT + MINCC + INCMP +
DELPOP + MINCR

OFFICE = RES + COMM + MANF + HWLM + MPR70 + VACACR + HWYINT + OFFVAC +
OFFACR + DISCBD + DELEMP

MANF = HWLM + MPR70 + HWYINT + ENERGY + RRMI + MANACR + DELEMP

HWLM = RES + COMM + OFFICE + MANF + MPR70 + HWYINT + DISCBD + EMPACR

TABLE 3-4

FIVE EQUATION SIMULTANEOUS BLOCK IN THE INDUSTRIAL-OFFICE MODEL

RES = OFFICE + MANF + HWLM + MPE + DUACRE + VACACR + VACHSG + DELPOP

COMM = RES + OFFICE + MANF + HWLM + MPE70 + HWYINT + MINCC + DELEMP + INCMP

OFFICE= RES + RETAIL + MANF + HWLM + MPE70 + VACACR + HWYINT + OFFVAC +
OFFACR + DISCBD + EMPACR

MANF = HWLM + MPE + HWYINT + ENERGY + RRMI + MANACR + DELEMP

HWLM = RES + RETAIL + OFFICE + MANF + MPE70 + HWYINT + DISCBD + EMPACR
3-32
-------
[Path diagram not reproduced. Legible labels: Residential Major Project (1970, 1968, t+2); Residential Development.]
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL)
FOR EASE OF PRESENTATION
Figure 3-4 Five Equation Simultaneous Block In The Residential Model
-------
[Path diagram not reproduced. Legible labels: No. of Employees in Major Project (1970, 1968, t+2); Manufacturing Development.]
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL)
FOR EASE OF PRESENTATION

Figure 3-5 Five Equation Simultaneous Block In The Industrial-Office Model
-------
TABLE 3-5

SEVEN EQUATION RECURSIVE BLOCK IN THE RESIDENTIAL MODEL

WHOLE = COMM + MANF + HWLM + HWYINT + RRMI + WWEA + DELEMP

HOTEL = RES + OFFICE + MANF + MPR + DISCBD + EMPACR

HOSPTL = RES + MPR + DISCBD + NONHSE

CULTUR = RES + MPR + DISCBD

CHURCH = RES + MPR + MPKIDS

EDUC = RES + MPR + MPKIDS + ENRACR

REC = RES + MPR + MINCC + INCMP

TABLE 3-6

SEVEN EQUATION RECURSIVE BLOCK IN THE INDUSTRIAL/OFFICE MODEL

WHOLE = RETAIL + MANF + HWLM + HWYINT + RRMI + WWEA + DELEMP

HOTEL = RES + OFFICE + MANF + MPE + DISCBD + EMPACR

HOSPTL = RES + MPE + DISCBD + NONHSE

CULTUR = RES + DISCBD

CHURCH = RES

EDUC = ENRACR

REC = RES + MINCC
3-35
-------
[Path diagram not reproduced. Legible labels: Residential Major Project (1970, 1968, t+2).]
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL)
FOR EASE OF PRESENTATION
Figure 3-6 Seven Equation Recursive Block In The Residential Model
-------
[Path diagram not reproduced. Legible labels: Educational Facilities; No. of Employees in Major Project (1970, 1968, t+2); Commercial Development.]
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL)
FOR EASE OF PRESENTATION
Figure 3-7 Seven Equation Recursive Block In The Industrial-Office Model
-------
Table 3-7 Original Specification Of The Residential Model

 1. RES    = OFFICE + MANF + HWLM + MPR + DUACRE + VACACR + VACHSG + DELPOP
 2. COMM   = RES + OFFICE + MANF + HWLM + MPR + HWYINT + MINCC + INCMP + DELPOP + MINCR
 3. OFFICE = RES + COMM + MANF + HWLM + MPR + VACACR + HWYINT + OFFVAC + OFFACR + DISCBD + DELEMP
 4. MANF   = HWLM + MPR + HWYINT + ENERGY + RRMI + MANACR + DELEMP
 5. WHOLE  = COMM + MANF + HWLM + HWYINT + RRMI + WWEA + DELEMP
 6. HOTEL  = RES + OFFICE + MANF + MPR + DISCBD + EMPACR
 7. HOSPTL = RES + MPR + DISCBD + NONHSE
 8. CULTUR = RES + MPR + DISCBD
 9. CHURCH = RES + MPR + MPKIDS
10. EDUC   = RES + MPR + MPKIDS + ENRACR
11. REC    = RES + MPR + MINCC + INCMP
12. HWLM   = RES + COMM + OFFICE + MANF + MPR + HWYINT + DISCBD + EMPACR + AUTO
-------
Table 3-8 Original Specification Of The Industrial-Office Model

 1. RES    = OFFICE + MANF + HWLM + MPE + DUACRE + VACACR + VACHSG ...
 2. COMM   = RES + OFFICE + MANF + HWLM + MPE ...
 3. OFFICE = RES + COMM + MANF + HWLM + MPE ...
 4. MANF   = HWLM + MPE ...
 5. WHOLE  = COMM + ...
 6. HOTEL  = ...
 7. HOSPTL = ...
 8. CULTUR = ...
 9. CHURCH = ...
10. EDUC   = ...
11. REC    = ...
12. HWLM   = ... + AUTO

[Remaining terms are illegible in the source.]
-------
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL) FOR EASE OF PRESENTATION
Figure 3-8 Original Specification Of The Residential Model
-------
NOTE: SOME OF THE VARIABLES APPEAR TWICE (E.G., RESIDENTIAL)
FOR EASE OF PRESENTATION

Figure 3-9 Original Specification Of The Industrial-Office Model
-------
land (base year); housing vacancy rate (1960); and the number of dwel-
ling units in the major project if the project is a residential project;
or the number of employees in the major project if the project is an
industrial/office project. The reasons each of these variables were
included are as follows:

a. OFFICE and MANF

Office and manufacturing development (1970) may both pro-
vide significant employment opportunities. They thus create a demand and
pressure for residential housing located near the employment base. This
relationship between residential housing and office and manufacturing
development was shown in the use of the EMPIRIC Model by Metcalf & Eddy
for the MDC Wastewater Study [36]. The model identified a significant
correlation between the change in manufacturing and white collar employment
and households. Therefore, we can expect that, if all other variables in the
equation are kept constant, residential housing will increase as
office employment or manufacturing employment increases.*

b. HWLM

Highway lane miles were included as a causal influence on
residential development as such development (particularly higher density
development) tends to locate in areas well served by transportation net-
works. Use of the EMPIRIC Model identified a correlation between highway
miles and residential development.

c. DUACRE

It is expected that, if all other variables in the equation
are held constant, the total amount of residential development in 1970 will
be influenced by the residential density in the area in 1960. The greater
the density in 1960, the greater the residential development in 1970.
* The assumption that all variables in the equation are kept constant except
the specific variable described applies throughout this section.
3-42
-------
d. VACACR

A prime factor encouraging new residential development is
availability of vacant developable land. Therefore, in our model we can
expect a causal relationship between vacant developable land (1960) and
total residential development (1970).

e. VACHSG

Vacancy rates have traditionally been used in planning
studies as indicators of the demand for housing. Normal rates necessary
to provide an adequate measure of freedom of choice and housing opportunity
are 2 percent for owner-occupied units and 6 percent for renter-occupied
units. Low vacancy rates indicate a high demand for additional housing,
while high vacancy rates indicate a low demand. The theory in our model is
that the housing vacancy rate in 1960 is a negative causal factor on the
level of residential development in 1970.
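
The rule of thumb above can be written out as a short calculation. The sketch below is illustrative only: the function names and the unit counts in the usage line are hypothetical, and the only figures taken from the text are the 2 percent and 6 percent normal rates.

```python
# Normal vacancy rates quoted in the text, by tenure.
NORMAL_VACANCY = {"owner": 0.02, "renter": 0.06}


def vacancy_rate(vacant_units, total_units):
    """Share of housing units that are vacant."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return vacant_units / total_units


def demand_signal(rate, tenure):
    """Compare an observed vacancy rate with the normal rate for the tenure."""
    normal = NORMAL_VACANCY[tenure]
    if rate < normal:
        return "high demand"   # fewer vacancies than normal
    if rate > normal:
        return "low demand"    # more vacancies than normal
    return "normal"


# Hypothetical area: 10 vacant units out of 1,000 owner-occupied units.
print(demand_signal(vacancy_rate(10, 1000), "owner"))
```

In the model itself VACHSG enters as a single 1960 vacancy rate with a hypothesized negative effect; the classification above only restates the planning interpretation behind that sign.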

f. DELPOP

Regional population growth (1960-1970) was included in the
model to control for the influence of regional growth on local residential
development. The theory is that where regional population growth is large
there will be increased demand for residential development in the local
area of influence.

g. MPR

In the residential project model, the number of units of the
project was included as a causal influence on residential development. The
theory is that the major project will stimulate other residential housing
in the area by increasing the attractiveness of the area to developers.*
* It is possible that at a certain size the residential project may capture
the demand for new housing in the area and thus have a negative impact on
other new residential development.
3-43
-------
h. MPE

In the industrial/office project model the number of
employees of the project was included as a causal influence on residential
development. The theory is that the employees will contribute to the
demand for housing in the area and thus positively affect residential
development.

2. Equation 2

Res. Proj.       COMM = RES + OFFICE + MANF + HWLM + HWYINT +
                        MINCC + INCMP + DELEMP + MINCR + MPR

Ind/Off. Proj.   COMM = RES + OFFICE + MANF + HWLM + HWYINT +
                        MINCC + DELEMP + MINCR + MPE

In these equations commercial development (1970) is a function
of residential uses (1970), office uses (1970), manufacturing uses (1970),
highway lane miles (1970), number of highway interchanges (base year + five),
median income (1960), regional employment growth (1960-1970), median regional
income (base year), and, in the case of the residential project, the
number of units and median income (base year + two) of the major project
and, in the case of the industrial/office project, the number of employees
of the major project. The reasons each of these variables were included are
as follows:

a. RES, OFFICE, MANF

Residential, office, and manufacturing development were
included as each of these stimulates a demand for commercial services. Thus
the greater each of these uses is, the greater the commercial development.
Use of the EMPIRIC Model identified a significant correlation between the
growth of residential development and blue collar employment.*
* In the EMPIRIC Model, commercial development included both retail trade
and office uses.
3-44
-------
b. HWLM

In numerous land use studies, it is commonly observed that
commercial activity tends to locate in areas with good transportation access.
Therefore, highway lane miles, as an indication of transportation access,
was chosen as a causal factor influencing commercial development.

c. HWYINT

Number of highway interchanges is a measure of an area's
accessibility. Land use studies have identified limited access highway
interchanges as a strong influence in the location of commercial development.
Therefore, we can expect a causal relationship between the number of highway
interchanges and the amount of commercial development.

d. MINCC

Expenditures on commercial services are a function of in-
come. Thus the higher the income within the area of influence the greater
the amount of commercial services that can be supported.

e. DELEMP

Commercial development both serves and is a part of the
region's employment base. Thus we can expect that growth in regional employ-
ment (1960-1970) will have a positive impact on commercial development
within the area of influence.

f. MINCR

As indicated in (d) above, expenditures on commercial
services are a function of income. Thus the higher the regional income, the
greater the expenditures on commercial development within the local
area of influence, provided all other variables in the equation are held
constant.
3-45
-------
g. INCMP + MPR

The number of units of the residential project and the
income level of residents in the project will both add to the demand for
commercial services in the area of influence. Thus both variables were
included in the residential model as causal influences of commercial develop-
ment.

h. MPE

The number of employees of the industrial/office project
will contribute to the demand for commercial services within the area of
influence. We can expect, therefore, that the greater the number of
employees, the greater the amount of commercial services.

3. Equation 3

Res. Proj.       OFFICE = RES + COMM + MANF + HWLM + VACACR + HWYINT -
                          OFFVAC + OFFACR - DISCBD + DELEMP + MPR

Ind/Off. Proj.   OFFICE = RES + COMM + MANF + HWLM + VACACR + HWYINT -
                          OFFVAC + OFFACR - DISCBD + DELEMP + MPE

In these equations office development is a function of resi-
dential development (1970), commercial development (1970), manufacturing
development (1970), highway lane miles (1970), percent vacant land (base
year), number of highway interchanges (base year + five), regional office
vacancy rate (base year), office employees/acre (base year), distance
from the central business district (base year), regional employment growth
(1960-1970); and, in the case of the residential project, the number of
units of the major project and, in the case of the industrial/office project,
the number of employees (1970) of the major project.

The reasons each of these variables were included are as
follows:
3-46
-------
a. RES

Residential development stimulates office development by
creating a market for office services. As indicated, use of the EMPIRIC
Model identified a significant correlation between the growth of commercial
(office) development and residential growth.

b. COMM, MANF

A survey by the Fairfax County, Virginia Industrial Authority
in 1970 [43] identified proximity to commercial and industrial development
as strong factors in office building location. The commercial development
provides services to the office building employees, while the manufacturing
development creates demand for the services of people working in the
office building. Thus both types of development were included in our
model as causal influences of office development.

c. HWLM

Highway lane miles (1970) are included as office develop-
ment tends to locate in areas which are well served by transportation systems.

d. VACACR

New office development locates in areas where vacant developable
land is available. This causal relationship was identified in land use
studies using the EMPIRIC Model. Thus percent vacant developable land was
included as a causal influence on office development. Where land is available,
the office development will be greater than in areas where land is
more limited.

e. HWYINT

Office development tends to locate in areas which are
easily accessible to population and employment centers. Therefore, we can
expect a causal relationship between the number of highway interchanges in
an area and the amount of office development that occurs.
3-47
-------
f. OFFVAC

Regional office vacancy rate (base year) was included in
the equation; the theory is that office development (1970) is inversely
related to the regional office vacancy rate (base year). The lower the
vacancy rate, the greater the demand for new office development and the
greater the amount of office development in the area of influence in 1970.

g. OFFACR

It is expected that, holding all other variables in the
equation constant, the total amount of office development in 1970 will be
influenced by the density of office employees in the area in 1960. The
greater the density in 1960, the greater the office development in 1970.

h. DISCBD

As indicated, office development tends to locate in areas
accessible to population and employment centers. Thus we can expect that
as the distance to the central business district increases, the amount of
office development decreases. Use of the EMPIRIC Model identified access
to total households within 10 miles to be a significant factor in the loca-
tion of office development.

i. DELEMP

Office development is part of the region's employment base.
It can be expected, therefore, that growth in the region's employment
(1960-1970) will have a positive impact on office development within the area of
influence.

j. MPR

In the residential project model the number of units of
the residential project was included as a causal influence of office development.
The theory is that the major project will create additional demand
for office services.
3-48
-------
k. MPE

In the industrial/office model the number of employees of
the major project was included as a causal influence of office development.
The theory is that the major project will attract associated office
development.
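
Read as a regression, the residential-project version of Equation 3 can be written out with one coefficient per variable. The linear additive form and the symbols \beta_0 ... \beta_{11} and \varepsilon are assumptions of this sketch; the section itself states only the variable list and the hypothesized direction of each effect.

```latex
\begin{aligned}
\mathrm{OFFICE} ={}& \beta_0 + \beta_1\,\mathrm{RES} + \beta_2\,\mathrm{COMM}
  + \beta_3\,\mathrm{MANF} + \beta_4\,\mathrm{HWLM} + \beta_5\,\mathrm{VACACR}
  + \beta_6\,\mathrm{HWYINT} \\
 & + \beta_7\,\mathrm{OFFVAC} + \beta_8\,\mathrm{OFFACR} + \beta_9\,\mathrm{DISCBD}
  + \beta_{10}\,\mathrm{DELEMP} + \beta_{11}\,\mathrm{MPR} + \varepsilon,
\qquad \beta_7 < 0,\quad \beta_9 < 0 .
\end{aligned}
```

Under the reasoning given in items (a) through (j), every other slope coefficient is expected to be positive; the industrial/office version replaces MPR with MPE.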

4. Equation 4

Res. Proj.       MANF = HWLM + HWYINT - ENERGY + RRMI +
                        MANACR + DELEMP + MPR

Ind/Off. Proj.   MANF = HWLM + HWYINT - ENERGY + RRMI +
                        MANACR + DELEMP + MPE

In these equations manufacturing development (1970) is a
function of highway lane miles (1970); number of highway interchanges (base
year +5); the cost of energy (base year); railroad mileage in the area of
influence (base year); the density of manufacturing employees (1960);
regional employment growth (1960-1970); and, in the
case of the residential project, the number of units of the major project
and, in the case of the industrial/office project, the number of employees
of the major project. The reasons each of these variables were included
are as follows:

a. HWLM, HWYINT, RRMI

Prime location requirements for manufacturing industries
include proximity to good highways and availability of railroad access.
Therefore, highway lane miles, number of highway interchanges and rail-
road mileage were all included as causal influences of manufacturing
development on the theory that as each of these variables increases, the
desirability of the area for manufacturing development increases and,
therefore, the amount of manufacturing development increases.
3-49
-------
b. ENERGY

The MDC Wastewater Study, 1973 [36] identified the cost of
energy as an influence on the location of manufacturing industries. The
theory in the model is that the lower the cost of energy the more attractive
the area is for manufacturers.

c. MANACR

It is expected that holding all other variables in the
equation constant the total amount of manufacturing development in 1970 will
be influenced by the density of manufacturing employees in 1960. The greater
the density in 1960, the greater the manufacturing uses in 1970.

d. DELEMP

Manufacturing employment is part of the region's employment
base. We can expect, therefore, that an increase in the region's employment
will have a positive impact on manufacturing uses in the area of influence.*

e. MPR

For the residential project, the number of units of the
project should have a positive influence on manufacturing development in the
area as the project provides a new labor supply, thus encouraging firms to
locate in the area.

f. MPE

For the industrial/office project, the number of employees
of the project should have a positive impact on manufacturing uses in the
area. Associated firms tend to be attracted to areas where a major manu-
facturing use is located.
* This impact may not be felt if the region's employment as a whole
increases, but the manufacturing component declines.
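
The hypothesized directions in Equation 4 lend themselves to a simple check once coefficients have been estimated. The sketch below is illustrative only: the helper `sign_conflicts` is hypothetical and the coefficient values are invented for the example, not results from this study.

```python
# Hypothesized signs for Equation 4 (residential-project version), as stated
# in the text: only the cost of energy is expected to have a negative effect.
EXPECTED_SIGNS_MANF = {
    "HWLM": +1, "HWYINT": +1, "ENERGY": -1, "RRMI": +1,
    "MANACR": +1, "DELEMP": +1, "MPR": +1,
}


def sign_conflicts(coefficients, expected_signs):
    """List variables whose estimated coefficient contradicts the
    hypothesized direction of influence."""
    conflicts = []
    for name, sign in expected_signs.items():
        if coefficients[name] * sign < 0:
            conflicts.append(name)
    return conflicts


# Invented coefficient estimates, for illustration only.
example_estimates = {
    "HWLM": 0.8, "HWYINT": 1.5, "ENERGY": 0.2, "RRMI": 0.1,
    "MANACR": 0.4, "DELEMP": -0.3, "MPR": 0.05,
}
print(sign_conflicts(example_estimates, EXPECTED_SIGNS_MANF))
# prints ['ENERGY', 'DELEMP']
```

A conflict on DELEMP would correspond to the situation described in the footnote, where regional employment grows while its manufacturing component declines.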
3-50
-------
5. Equation 5

WHOLE = COMM + MANF + HWLM + HWYINT +
RRMI + WWEA + DELEMP

In this equation wholesale/warehouse development (1970) is a
function of commercial uses (1970); manufacturing uses (1970); highway lane
miles (1970); number of highway interchanges (base year +5); railroad mileage
in the area of influence (base year); the density of wholesale/warehousing
employees (base year); and regional employment growth. The reasons
each of these variables were included are as follows:

a. COMM + MANF

Land use studies have shown that commercial and manufactur-
ing uses tend to attract associated warehousing and wholesale uses. Therefore,
commercial and manufacturing uses were included in our model as causal in-
fluences of wholesale/warehousing uses.

b. HWLM, HWYINT, RRMI

Locational factors which attract wholesale/warehouse develop-
ment are accessibility to highways and availability of railroads. Therefore,
highway lane miles, number of highway interchanges, and railroad mileage
were all included in the model as causal influences of wholesale/warehouse
development on the theory that as each of these variables increases the desira-
bility of the area for wholesale/warehousing development increases and, there-
fore, the amount of wholesale/warehouse development increases.

c. WWEA

It is expected that, holding all other variables in the
equation constant, the total amount of wholesale/warehouse development in 1970
will be influenced by the density of wholesale/warehouse employees in 1960.
The greater the density in 1960, the greater the wholesale/warehouse development
in 1970.
3-51
-------
d. DELEMP

Wholesale/warehouse development is part of the region's
employment base and is also influenced by the commercial and manufacturing
sectors of this base. We can expect, therefore, that as the region's
employment base increases, wholesale/warehouse development in the area of
influence is positively affected.

It should be noted that in these equations the major project
is not included as a causal influence for warehousing/wholesale development.
In the case of the residential project it was felt that there was not a re-
lationship between the project and warehouse/wholesale development. For the
industrial/office project there may be some relationship, but we felt that
it was less significant than the other relationships identified.

6. Equation 6

Res. Proj.       HOTEL = RES + OFFICE + MANF - DISCBD + EMPACR + MPR

Ind/Off. Proj.   HOTEL = RES + OFFICE + MANF - DISCBD + EMPACR + MPE

In these equations hotel development (1970) is a function of
residential development (1970); office development (1970); manufacturing
development (1970); distance to metropolitan CBD; employment density in
the area of influence (base year); and, in the case of the residential
project, the size of the project and, in the case of the industrial/office project,
the number of employees of the project. The reasons each of these variables
were included are as follows:

a. RES, OFFICE, MANF

Residential development (1970), office development (1970)
and manufacturing development (1970) were included as each of these generates
overnight visitors and thereby increases the demand for hotel space. We
can expect, therefore, that an increase in any of these uses will result in
an increase in hotel development.
3-52
-------
b. DISCBD

Hotel development tends to be most concentrated near popu-
lation and employment centers. Therefore, as the distance to the CBD in-
creases, we can expect a decrease in hotel development.

c. EMPACR

As indicated in both (a) and (b) above, hotel development
is related to employment opportunities; therefore, there should be a
relationship between the employment density in the area of influence in the
base year and total hotel development in 1970.

d. MPR

In the residential project model the number of units of
the residential project will result in an increased demand for hotel develop-
ment and thereby an increase in the amount of hotel development.

e. MPE

In the industrial/office project the major project will
result in an increased demand for hotel development from visitors who have
business with the project. We can expect, therefore, that as the size of
the project increases the amount of hotel development also increases.

The variables in the above equation relate hotel development
to population and employment centers. Another factor influencing hotel
development is proximity to major highways. Therefore, a potential variable
in this equation is the number of highway interchanges. In calibrating the
model, this variable will be considered in lieu of distance from the CBD.

7. Equation 7

Res. Proj.       HOSPTL = RES - DISCBD + NONHSE + MPR

Ind/Off. Proj.   HOSPTL = RES - DISCBD + NONHSE + MPE
3-53
-------
In these equations hospital development is a function of the
residential development (1970); distance to the CBD (base year); the non-
household density (base year); in the case of the residential project, the
number of units of the major project and, in the case of the industrial/
office project, the number of employees of the major project. The reasons
each of these variables were included are as follows:

a. RES

Residential development was included as it generates a
demand for hospital services. Thus, as residential development increases
the amount of hospital development should also increase.

b. DISCBD

Hospital development serves population and employment
centers; therefore, as the distance to the CBD increases, we can expect
hospital development to decrease.

c. NONHSE

Nonhousehold population includes people in nursing homes
and mental institutions. We can expect, therefore, a positive relationship
between the nonhousehold density in 1960 and hospital development in 1970.

d. MPR

In the residential project model the number of units of
the major project will increase the demand in the area for hospital
development and thus may result in an increase in such uses.

e. MPE

In the industrial/office project the number of employees of
the major project will increase the demand in the area for hospital
development and thus may result in an increase in such uses.
3-54
-------
8. Equation 8

Res. Proj.       CULTUR = RES - DISCBD + MPR

Ind/Off. Proj.   CULTUR = RES - DISCBD

In these equations cultural facilities are a function of the
residential development (1970); distance to the CBD (base year); and, in the
case of the residential project, the number of units of the major project.
The reasons each of these variables were included are as follows:

a. RES

Residential development generates a demand for cultural
facilities. Thus, as residential development in an area increases, we can
expect cultural facilities to also increase.

b. DISCBD

Land use studies have shown that cultural facilities tend
to locate near population and employment centers. We can expect, therefore,
that as the distance to the CBD increases, the amount of cultural facilities
decreases.

c. MPR

As indicated in (a) above, residential development generates
a demand for cultural facilities. We can expect, therefore, that the major
residential project will likewise positively affect cultural facilities.

9. Equation 9

Res. Proj.       CHURCH = RES + MPR + MPKIDS

Ind/Off. Proj.   CHURCH = RES

In these equations, church facilities (1970) are a function of
residential development (1970); in the case of the major residential project,
3-55
-------
the number of units of the project and the density of school age children.
The reasons each of these variables were included are as follows:

a. RES

Residential development generates demand for church faci-
lities. Thus, as residential development increases, church facilities will
also increase.

b. MPR

As indicated in (a) above, residential development generates
a demand for church facilities. We can expect, therefore, that the major
residential project will likewise positively affect church facilities.

c. MPKIDS

The patronage of many churches is oriented towards families.
We can expect, therefore, that an increase in the number of school age chil-
dren of the major project may positively affect church facilities.

10. Equation 10

Res. Proj.       EDUC = RES + MPR + MPKIDS + ENRACR

Ind/Off. Proj.   EDUC = RES + ENRACR

In these equations public educational facilities (1970) are a
function of residential development (1970); the density of public school
enrollment (base year); and, in the case of the residential project, the
number of units of the project and the density of school age children. The
reasons each of these variables were included are as follows:

a. RES

Residential development generates a demand for educational
facilities. Thus, it was included in our model as a prime factor influ-
encing educational facilities.
3-56
-------
b. ENRACR

Public school enrollment (base year) will have an effect on
the amount of educational facilities (1970). Thus, this factor was included
in our model.

c. MPR and MPKIDS

The residential major project will generate a demand for
educational facilities. This demand will depend on the size of the pro-
ject and the number of school age children per unit.

11. Equation 11

Res. Proj.       REC = RES + MPR + MINCC + INCMP

Ind/Off. Proj.   REC = RES + MINCC

In these equations, active outdoor recreation acres (1970) are
a function of the residential development (1970); the income level in the area
of influence (base year); and, in the case of the residential project, the
number of units of the major project (1970) and the income level of the major
project (1970).

A potential substitute variable for MINCC is ENRACR and for
INCMP is MPKIDS. The alternate theory is that the greater the number of
children, the more active recreation acres the area of influence may have.

The reasons each of these variables were included in the model
are as follows:

a. RES

Residential development generates a demand for active
outdoor recreation acres. We can expect, therefore, that as the residential
development in the area increases, the amount of active outdoor recreation
area also increases.
3-57
-------
b. MINCC

Recreation and land use studies by Metcalf & Eddy have shown
that the income level in a community affects the amount (as well as the type)
of active outdoor recreation acres. Therefore, we included income level in
our model as a causal influence on recreation acres.

c. MPR + INCMP

The residential major project will increase the demand for
active recreation acres in the area. This increase in demand will depend
on the size of the project and on the income level of the project.

12. Equation 12

Res. Proj.       HWLM = RES + COMM + OFFICE + MANF + HWYINT - DISCBD +
                        EMPACR + AUTO + MPR

Ind/Off. Proj.   HWLM = RES + COMM + OFFICE + MANF + HWYINT - DISCBD +
                        EMPACR + AUTO + MPE

In these equations highway facilities (highway lane miles) (1970)
are a function of residential development (1970); commercial development
(1970); office development (1970); manufacturing development (1970); the number
of highway interchanges (base year +5); distance to the CBD; the density
of employees/acre (base year); automobile drivers/acre in the county (base
year); and, in the case of the residential project, the number of units of
the project (1970) and, in the case of the industrial/office project, the
number of employees of the project (1970). The reasons each of these variables
were included in the model are as follows:

a. RES, COMM, MANF, OFFICE

Each of these types of development may generate substantial
vehicular traffic. We can expect, therefore, that additional highway facilities
may be constructed in response to their needs.
3-58
-------
b. HWYINT

Number of highway interchanges was included in the model
as we felt that construction of a highway interchange would induce additional
highway lane miles.

c. DISCBD

Highway facilities (highway lane miles) link outlying areas
to population and employment centers. We can expect, therefore, an inverse
relationship between highway facilities and distance to the CBD.

d. EMPACR

As indicated above, highway facilities serve employment
centers. We can expect, therefore, that as the density of employees per
acre increases, highway facilities also increase.

e. AUTO

This variable was included as an instrumental variable to
break the loop between highway facilities and residential, commercial, manu-
facturing, and office development. The theory in our model is that the
density of auto drivers in the county (1960) affects facilities in the area
of influence in 1970.
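
The role of an instrumental variable can be illustrated with synthetic data. The sketch below is not the estimation procedure of this study: the data are simulated, the names `auto`, `hwlm`, and `res` are only stand-ins for AUTO, HWLM, and RES, and the true effect of 2.0 is an arbitrary choice. It shows that when a shared unobserved shock links the regressor and the outcome, ordinary least squares is biased, while an instrument that affects the outcome only through the regressor recovers the effect.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental-variable slope of y on x, using z as the instrument."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=5000, true_effect=2.0, seed=0):
    """Synthetic data: z is exogenous, u is a shock shared by x and y."""
    rng = random.Random(seed)
    auto, hwlm, res = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)      # stand-in for AUTO
        u = rng.gauss(0.0, 1.0)      # unobserved shock creating the loop
        x = z + u                    # stand-in for HWLM
        y = true_effect * x + u      # stand-in for RES
        auto.append(z)
        hwlm.append(x)
        res.append(y)
    return auto, hwlm, res


auto, hwlm, res = simulate()
ols_estimate = ols_slope(hwlm, res)
iv_estimate = iv_slope(auto, hwlm, res)
print(round(ols_estimate, 2), round(iv_estimate, 2))
```

With this construction the OLS slope is pulled toward 2.5 in large samples, while the instrumental-variable slope stays close to the assumed effect of 2.0.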

f. MPE and MPR

Each of these variables was included because the major project,
in both cases, contributes to the amount of vehicular traffic generated
in the area. This vehicular traffic needs highway facilities and thus is
a causal influence affecting such facilities.
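
The twelve residential-project equations described in this section can be transcribed and searched for feedback loops. The sketch below is an illustration, not the authors' procedure: the dictionary lists the explanatory variables named in Equations 1 through 12 (Equation 1 taken from its lettered items), signs are ignored, and the function names are hypothetical.

```python
RESIDENTIAL_SPEC = {
    "RES":    ["OFFICE", "MANF", "HWLM", "DUACRE", "VACACR", "VACHSG",
               "DELPOP", "MPR"],
    "COMM":   ["RES", "OFFICE", "MANF", "HWLM", "HWYINT", "MINCC", "INCMP",
               "DELEMP", "MINCR", "MPR"],
    "OFFICE": ["RES", "COMM", "MANF", "HWLM", "VACACR", "HWYINT", "OFFVAC",
               "OFFACR", "DISCBD", "DELEMP", "MPR"],
    "MANF":   ["HWLM", "HWYINT", "ENERGY", "RRMI", "MANACR", "DELEMP", "MPR"],
    "WHOLE":  ["COMM", "MANF", "HWLM", "HWYINT", "RRMI", "WWEA", "DELEMP"],
    "HOTEL":  ["RES", "OFFICE", "MANF", "DISCBD", "EMPACR", "MPR"],
    "HOSPTL": ["RES", "DISCBD", "NONHSE", "MPR"],
    "CULTUR": ["RES", "DISCBD", "MPR"],
    "CHURCH": ["RES", "MPR", "MPKIDS"],
    "EDUC":   ["RES", "MPR", "MPKIDS", "ENRACR"],
    "REC":    ["RES", "MPR", "MINCC", "INCMP"],
    "HWLM":   ["RES", "COMM", "OFFICE", "MANF", "HWYINT", "DISCBD",
               "EMPACR", "AUTO", "MPR"],
}


def endogenous_ancestors(spec, start):
    """Dependent variables that `start` depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for var in spec.get(node, []):
            if var in spec and var not in seen:
                seen.add(var)
                stack.append(var)
    return seen


def variables_on_feedback_loops(spec):
    """Dependent variables that depend on themselves through some chain."""
    return sorted(v for v in spec if v in endogenous_ancestors(spec, v))


simultaneous = variables_on_feedback_loops(RESIDENTIAL_SPEC)
recursive = sorted(set(RESIDENTIAL_SPEC) - set(simultaneous))
print(simultaneous)  # ['COMM', 'HWLM', 'MANF', 'OFFICE', 'RES']
print(len(recursive))  # 7
```

Under this transcription, five variables lie on feedback loops and the remaining seven equations depend only on those five and on outside variables, which is consistent with the seven equation recursive block shown earlier.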
3-59
-------
IV. SAMPLE SELECTION

A. PURPOSE AND INITIAL CRITERIA

The purpose of the sample selection process was to identify for
each type of major project a sample of case studies which could be used
in the testing and calibration of the model. The criteria for the case
studies were established jointly by the EPA contract officer and Walden
Research. These criteria were:

• There would be twenty observations in each of the two samples
(residential projects and industrial/office projects).

• No more than two case studies (observations) of each project
type would be located in the same SMSA (Standard Metropolitan
Statistical Area) and no more than three would be located in the
same Federal Region.

• The major project would be (a) a residential project, planned
unit development, or new town with a minimum population of 4,500
or 1,800 parking spaces, or (b) an office or industrial park or
research and development complex with a minimum employment of
2,250 or 1,800 parking spaces.

• The project would be constructed between 1954 and 1964 and attain at least
80 percent of the minimum size noted above within two years of
initial operation.

The rationale behind the foregoing criteria was as follows:

• The location requirements were to assure representation through-
out the United States.

• The size requirements were to assure that the project was "major."

• The timing requirements (i.e., construction dates and occupancy
period) were to assure that project impacts had stabilized by
1970, the date at which land use in the area of influence was to
be measured.
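
The initial criteria can be expressed as a small screening routine. The sketch below is one reading of the criteria, not a procedure documented in this report: the `Candidate` fields, the function names, and the example records are hypothetical, and the parking-space figure is treated as an alternative way of meeting the size minimum.

```python
from collections import Counter
from dataclasses import dataclass

# Minimum population (residential) or employment (industrial/office).
MIN_SIZE = {"residential": 4500, "industrial_office": 2250}
MIN_PARKING = 1800


@dataclass
class Candidate:
    kind: str                     # "residential" or "industrial_office"
    size: int                     # population or employment
    parking_spaces: int
    year_built: int
    size_after_two_years: int
    parking_after_two_years: int
    smsa: str
    federal_region: int


def meets_initial_criteria(c):
    """Size, construction window, and 80 percent occupancy tests."""
    min_size = MIN_SIZE[c.kind]
    big_enough = c.size >= min_size or c.parking_spaces >= MIN_PARKING
    built_in_window = 1954 <= c.year_built <= 1964
    # 80 percent test in integer arithmetic: 5 * x >= 4 * minimum.
    early_enough = (5 * c.size_after_two_years >= 4 * min_size
                    or 5 * c.parking_after_two_years >= 4 * MIN_PARKING)
    return big_enough and built_in_window and early_enough


def within_location_caps(sample):
    """For a sample of one project type: at most two case studies per SMSA
    and at most three per Federal Region."""
    smsa_counts = Counter(c.smsa for c in sample)
    region_counts = Counter(c.federal_region for c in sample)
    return (max(smsa_counts.values(), default=0) <= 2
            and max(region_counts.values(), default=0) <= 3)


# Hypothetical candidates, for illustration only.
a = Candidate("residential", 6000, 0, 1960, 4000, 0, "SMSA-A", 3)
b = Candidate("industrial_office", 3000, 2000, 1967, 2500, 1900, "SMSA-B", 5)
print(meets_initial_criteria(a), meets_initial_criteria(b))  # True False
```

Candidate `b` fails only on the construction window, which is the kind of case the later revision of the criteria was meant to admit.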

B. METHODOLOGY

To identify case studies which met the criteria, the following
methodology was employed:
4-1
-------
• Professional organizations which either may know of projects
meeting the criteria or may be aware of sources for such projects
were contacted. These organizations included the Urban Land
Institute, American Society of Planning Officials, and the National
Association of Homebuilders. The organizations were helpful but
did not have lists of potential case studies.
• Through the EPA project officer a questionnaire was sent to
approximately 360 regional planning agencies requesting their
assistance in identifying projects which met the criteria. This
questionnaire is attached as Exhibit 4-1. The responses to the
questionnaire identified approximately 75 potential projects.
Subsequent investigation indicated that many of these projects
did not meet the criteria and, thus, were eliminated from consi-
deration. Nevertheless, the questionnaire proved to be the best
source of potential case studies.

• A telephone interview effort was undertaken. The objectives of this
effort were:

• Contacting agencies who did not respond to the questionnaire for
the purpose of identifying additional projects which met our
criteria;
• Supplementing the information received to date concerning poten-
tial projects.

C. REVISED CRITERIA

After several weeks of attempting to identify qualified case studies,
it became clear that the initial criteria established were too restrictive
and that the list of qualified projects would be significantly extended by
relaxing these criteria. The particular criteria which many of the identified
projects failed to satisfy were the criteria of 80 percent occupancy of
the minimum size within two years and a construction period between 1954 and
1964. These criteria proved extremely difficult for residential projects,
where the majority of projects identified were developed and occupied over
an extended period of time. For these projects, construction and occupancy
were often both prior to and subsequent to 1965 and depended on market demand.

The changes to the initial criteria evolved from the problems
encountered in finding acceptable case studies. The changes were as follows:
4-2
-------
• Phased development projects were permitted provided they met
minimum size criteria. For residential projects the minimum
size was reduced to 1000 units within 5 years.
• Ongoing projects were considered eligible provided substantial
development occurred between 1960 and 1970.

These revised criteria permitted much greater flexibility in
identifying suitable case studies.

D. SELECTION PROCESS

Once a list of qualified case studies was prepared, the final sample
was selected. The selection process involved consideration of factors
such as availability of information, particularly aerial photographs, and
geographic location of the project. In addition, two case studies had to
be replaced when problems became evident during the data collection task.
The final list of case studies for the industrial/office and residential
samples is shown in Table 4-1.
TABLE 4-1
CASE STUDIES

Industrial
    Farmington Park, Farmington, CT
    Western Electric, North Andover, MA
    Avco, Wilmington, MA
    IBM, Kingston, NY
    Ft. Washington, Philadelphia, PA
    Keystone Park, Scranton, PA
    Crestwood Park, Wright Twp., PA
    GE, Salem, VA
    Cummings Park, Huntsville, AL
    IBM, Lexington, KY
    Collins Radio Park, Cedar Rapids, IA
    Ford, Woodhaven, MI
    Western Electric, Columbus, OH
    White-Westinghouse, Columbus, OH
    Little Rock Industrial Park, Little Rock, AR
    Chrysler, Fenton, MO
    Western Electric, Omaha, NE
    Motorola, Phoenix, AZ
    Tektronix, Washington County, OR
    Honeywell, Phoenix, AZ

Residential
    Joppatown, Hartford County, MD
    Montgomery Village, MD
    Kings Park, Fairfax County, VA
    Vienna Woods, Fairfax County, VA
    Deltona, FL
    Miami Lakes, Miami, FL
    Town'n Country, Tampa, FL
    Montclair-Starmount, Charlotte, NC
    Weathersfield, Schaumberg, IL
    Oak Park, Blaine, MN
    Cottage Grove, MN
    Clear Lake City, Harris County, TX
    Meyerland, Houston, TX
    Westwood Heights, Omaha, NE
    Northglenn, CO
    Sun City, Maricopa, AZ
    Foster City, CA
    Huntington Harbour, Orange County, CA
    Sun City, Perris, CA
    Rancho Bernado, CA
-------
EXHIBIT 4-1

UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
Office of Air Quality Planning and Standards
Research Triangle Park, North Carolina 27711

Executive Director
Regional Planning Agency
Someplace, U.S.A.

Dear Sir:

The Land Use Planning Office is presently developing a methodology to
predict induced land uses and resulting air quality impacts from major (1)
housing facilities, planned unit developments, or new towns; and (2) office
or industrial parks or research and development complexes. Twenty case studies
are needed for each type of project to develop the model, which will be
relatively easy to use and can assist practicing planners in environmental
impact analyses, community facilities impact review, and air quality
maintenance planning.

We would appreciate your assistance in helping us identify major projects
in your region which might serve as case studies. Our basic criteria are:

1. The major project must be (a) a residential project, planned unit
development, or new town with a minimum population of 4,500 or 1,800
parking spaces, or (b) an office or industrial park or research and
development complex with a minimum employment of 2,250 or 1,800
parking spaces.

2. The project must be constructed between 1954-1964 and must attain at
least 80 percent of the minimum size noted above within 2 years of
initial operation.

If there are any projects within your region which meet or almost meet
these criteria, would you please identify each project on the attached sheet
and return it in the enclosed envelope by August 14,

Our consultants for this project are Walden Research and Metcalf & Eddy
of Boston, Massachusetts. Representatives from these firms may be contacting
you later when the final selection of cases is made.

Thank you very much for your assistance.

Very truly yours,
Thomas McCurdy
Community Planner
Land Use Planning Office
EXHIBIT 4-1 (CONTINUED)

Name of Respondent: Organization:
Address: Telephone #:

Name of Project(s):
Type of Project (check one):
Residential _, P.U.D.
Office Complex Industrial Park . Research and Development
Location of Project (municipality):

Approximate Size of Project,
Resid., PUD, New Town: # DU's: Acres:
Office, Ind. Park, R&D Complex: Bldg Area ($): Acres:
Initial Year of Operation:
Approximate % Occupied within 2 Years:

Project Characteristics: Residential, PUD, New Town (Checkoff):

Single Family Attached Single Fam, Detached Multifamily
Sales Housing Rental Housing
High Income Middle Income Low Income
Predominately: Elderly Family Singles

Project Characteristics: Office, Ind. Park, and R&D (check off):
Finance Offices Insurance Offices Gov't Offices __
Other Offices (specify):
R&D Developments (specify type):
Industrial Park Development (specify type):
Available Data Sources in Area for Planning Information.
3C Planning Agency:
Detailed Land Use Inventory of Project Area:
Please attach a map to any scale of the area encompassing the project(s) mentioned
above. Feel free to make any pertinent comments concerning this project here.
If you want to talk about this project, call Tom McCurdy at 919-688-8146 X291.
Thanks for your assistance.
V. DATA COLLECTION

Following the specification of the model, a list of data items required
for the model was prepared. This was supplemented with additional items
potentially useful in the model calibration phase of this project. A list
of the data items is shown on Table 5-1.

The data collection process consisted of two simultaneous phases. The
first was an on-site visit, which consisted primarily of interviews with the
local and regional planning agencies and the developer (if available), and the
collection of locally available data. These data consisted primarily of
building floor area for various categories of land use in the area of influence,
obtained by aerial photograph interpretation. The second phase consisted of
the collection of 1960 and 1970 Census of Population and Housing data for the
area of influence. Field data collection forms for both phases are shown
in Exhibit 5-1.

The data collection effort was preceded by a test-training case. This
case, the Wilmington, Massachusetts, AVCO Corporation facility, was used to test
the feasibility of collecting the requisite data, to check the clarity of the
field data collection forms, and to train the personnel involved in the data
collection effort.

The element of the data collection process that was most subject to
potential measurement error was the aerial photograph interpretation. The
classification scheme for the various land uses did not facilitate interpre-
tation and some of the data collection personnel had little prior experience
in aerial photograph interpretation. Three of the case studies had alternate
sources of the same information which permitted an independent check on the
process.

A. WHITE WESTINGHOUSE AND WESTERN ELECTRIC, COLUMBUS, OHIO

The Mid-Ohio Regional Planning Commission (MORPC) had a transpor-
tation planning data base which includes an inventory of building floor area
TABLE 5-1
LIST OF DATA COLLECTION ITEMS

Data Item                     Definition                                      Aggregation        Time

MAJOR PROJECT:

Total Housing Units                                                           Major Project      1970
Housing Units                                                                 Major Project      T+0, T+2, T+8, 1960, 1970
  Single family attached      Single family dwelling units that are attached,
                              such as row houses or townhouses
  Single family detached      Single family dwelling units which do not have
                              common walls
  Multifamily low rise        Apartment units of 4 stories or less
  Multifamily high rise       Apartment units of more than 4 stories
Economic level of residents                                                   Major Project      T+2
  Above average               Income of residents is approximately 15% or more
                              above median for municipality
  Average                     Income of residents is no more than 15% above or
                              below median for municipality
  Below average               Income of residents is approximately 15% or more
                              below median for municipality
School children per unit      Average school age children per unit            Major Project      T+2
Number of acres               Total developed acres and undeveloped acres     Major Project      1970
                              under developer control
Type of industry              % Manu, % Office, % R & D                       Major Project      1970
Total Employment              Total employees working at the project site     Major Project      T+0, T+2, 1960, 1970
Percent Professional          Managerial and professional                     Major Project      1970

ENDOGENOUS VARIABLES:

Office                        Square feet of the following land use codes as  Area of Influence* 1970
  <50 thousand square feet    used by the Public Administration Service in
  (<4.65 x 10^3 sq. meters)   its 1962 Land Use Classification Manual:
  50-100 thousand square feet LUC 60 Finance, Insurance, Real Estate
  (4.65 x 10^3 to 9.29 x 10^3     62 Business Services
  sq. meters)                     67 Medical, Health, Legal Services
                                  68 Other Professional Services
-------
TABLE 5-1 (Continued)

LIST OF DATA COLLECTION ITEMS

Data Item                     Definition                                      Aggregation        Time

Office (continued)
  >100 thousand square feet
  (>9.29 x 10^3 sq. meters)

Commercial                    LUC 52-59 Retail Trade                          Area of Influence  1970
  <25 thousand square feet        61 Personal Services
  (<2.32 x 10^3 sq. meters)       63 Automobile Service
  25-50 thousand square feet      64 Miscellaneous Repair Service
  (2.32 x 10^3 to 4.65 x 10^3     65 Indoor Amusement Services
  sq. meters)
  50-100 thousand square feet
  (4.65 x 10^3 to 9.29 x 10^3
  sq. meters)
  >100 thousand square feet
  (>9.29 x 10^3 sq. meters)

Manufacturing                 LUC 2 Nondurable Goods Manufacturing            Area of Influence  1970
                                  3 Durable Goods Manufacturing

Wholesale/Warehousing         LUC 50 Wholesale                                Area of Influence  1970
                                  46 Warehousing

Hotels/Motels                 LUC 07 Hotels, Motels, Tourist Homes            Area of Influence  1970
  <25 thousand square feet
  (<2.32 x 10^3 sq. meters)
  25-50 thousand square feet
  (2.32 x 10^3 to 4.65 x 10^3
  sq. meters)
  50-100 thousand square feet
  (4.65 x 10^3 to 9.29 x 10^3
  sq. meters)
  >100 thousand square feet
  (>9.29 x 10^3 sq. meters)
-------
TABLE 5-1 (Continued)

LIST OF DATA COLLECTION ITEMS

Data Item                     Definition                                      Aggregation        Time

Churches                      LUC 764 Churches                                Area of Influence  1970
                                  765 Other Religious Services

Cultural                      LUC 76 Museums, Libraries, Art Galleries,       Area of Influence  1970
                              except Churches (764), Arboreta (762),
                              Cemeteries (767)

Hospitals                     LUC 77 Hospitals, Sanatoria, Convalescent Homes Area of Influence  1970
  <50 thousand square feet    and Rest Homes
  (<4.65 x 10^3 sq. meters)
  50-100 thousand square feet
  (4.65 x 10^3 to 9.29 x 10^3
  sq. meters)
  >100 thousand square feet
  (>9.29 x 10^3 sq. meters)

Public Education              LUC 74 Public Schools                           Area of Influence  1970
  <50 thousand square feet        75 Private Schools
  (<4.65 x 10^3 sq. meters)
  50-100 thousand square feet
  (4.65 x 10^3 to 9.29 x 10^3
  sq. meters)
  >100 thousand square feet
  (>9.29 x 10^3 sq. meters)

Active Outdoor Recreation     LUC 80 Outdoor Public Recreation                Area of Influence  1970
                                  81 Outdoor Water Based Recreation
-------
TABLE 5-1 (Continued)

LIST OF DATA COLLECTION ITEMS
Data Item
Definition
Aggregation
Time
EXOGENOUS VARIABLES
Vacant Land

DU/Acre

Percent vacant housing units

Highway Interchanges

Median Income of Families
-------
TABLE 5-1 (Continued)
LIST OF DATA COLLECTION ITEMS
Data Item
Definition
Aggregation
Time
Warehouse and Wholesale
Employment per acre
Total Employment per acre

Manufacturing Employment
Non-Household Population per
acre
Enrollment per acre

Regional Population

Regional Employment

Regional Median Income
Zoning Classification
% Residential
% Single Family
% Multifamily Acres
% Office
% Commercial
% Industrial
% Other
Employment with SIC 422, 50 and 51 as reported by the State Division of
Employment Security divided by total acres of the municipality or district.
Total employment as reported by the State Division of Employment Security
divided by the total acres of the municipality or district.
SIC 20-39.
Total population less population in households as presented in Table P-l of the
1960 Census Tract data or Tables 15, 20, 24, 26 or 27 of the 1960 Census Urban
places, towns and Minor Civil Divisions, depending on the size of the
municipality divided by total acres of the respective delineation.
Total public school enrollment divided by total acres of the respective
municipality or school district.
Total population for the counties as presented in the 1960 and 1970 U.S. Census
General Population Characteristics.
Total employment for the counties as reported by the State Division of
Employment Security (adjusted to represent a similar number of covered
professions).
As reported by Census.
As indicated in local zoning ordinance
Municipality

Municipality

Municipality
Municipality or
Census Tract

Municipality

Counties in Region

Counties in Region
Area of Influence
T+0

T+0

T+0
1960

T+0

1960, 1970

1960
T+5
-------
TABLE 5-1 (Continued)
LIST OF DATA COLLECTION ITEMS

Data Item                     Definition                                      Aggregation        Time

Professional Labor Force      Percent of total employed population, male and  Municipality or    1960
                              female, with occupations as professional,       Census Tract
                              technical and kindred workers plus managers,
                              officials, and proprietors as presented in
                              Table P-3 of the 1960 Census Tract Data or
                              Tables 74 or 78 in the 1960 General Social and
                              Economic Characteristics.
School-aged children per acre Total school-aged children for municipality or  Municipality or    T+0
                              census tract divided by acres of the respective Census District
                              district.
Population in Area of         As defined in Table HC 3 Block Statistics 1970  Area of Influence  1970
  Influence                   U.S. Census.
Total year round units        As defined in Table HC 3 Block Statistics 1970  Area of Influence  1970
                              U.S. Census.
One Unit Structures           As defined in Table HC 3 Block Statistics 1970  Area of Influence  1970
                              U.S. Census.
Highway Lane Miles            Lanes times miles of highways carrying over     Area of Influence  1970
                              10,000 average daily traffic.
Percent served by public      Percentage of land which is connected to the    Municipality       T+5
  sewer                       public sewer or has public sewerage available.
Commuter rail stops           Number of commuter railroad stops.              Area of Influence  T+0
Mass Transit stops            Number of fixed rail mass transit stops.        Area of Influence  T+0
Distance to airport           Highway distance to major airport (with FAA     Metropolitan Area  T+0
                              regulated tower).
Presence of                   Existence of item within 3.23 miles (5.2        Metropolitan Area  T+0
  Agricultural uses           kilometers) of center of major project.
  University
  Private school
  Airports
  5 mile square water body
  (12.95 sq. kilometers)

Note: T = Base Year
-------
EXHIBIT 5-1
PART A
ON SITE VISIT
DATA COLLECTION FORMS
EXHIBIT 5-1
Major Project Name
Location (City & State)
Location (UTM coordinates): Horizontal KM          Vertical KM
USGS quadrangle sheets for area of influence (7.5 minute series).
1.
2.
3.
4.
Definition of area of influence
Circle
Other (describe)
Are there Non-Tracted Areas in the Area of Influence?
No
Yes (Indicate on Map)
EXHIBIT 5-1 (CONTINUED)
I. MAJOR PROJECT INFORMATION
Name of Project
City of Major Project_
Exact Location
Contact Name
Contact Telephone Number_
Developed Acres
Undeveloped Acres Under Control of Developer, contiguous,1970
Year of Initial Occupancy (Base Year) j
RESIDENTIAL PROJECT:

1.
2.
3.
Number of D.U.
Initial Initial
Year Year +2 1960 1968 1970
4.
5.
6.
7.
8.
SFD
SFA Low Rise Multifamily High Rise Multifamily
9.
13.

10.
14.
n.
15.
Average sq.ft.,1970
1970 # of D.U.
ECONOMIC LEVEL OF RESIDENTS, T+2 - Median Income is
Above Surrounding Community (15%)
About the Same
Below that of Surrounding Community (15%)
Average Children Per Unit, T+2
12.
16.
9.
10.
INDUSTRIAL/OFFICE/RESEARCH PROJECT:
PCT Manuf.
PCT R & D
PCT Office. Other
22. X
23.
Initial Year +2 1968
26. H &.
lll24. *
. 1970
||2B.
Type of Industry: Percentage

Employment

II. AREA OF INFLUENCE - 10,000 Acres (15.625 square miles).
1. If no Coastline, Barriers, etc.:
   Circle of 2.23 miles radius
2. Otherwise, use circle without appropriate segment:
   15.625 = Area of circle - Area of segment
   \[ 15.625 = \pi R^2 - \tfrac{1}{2} R^2 (\theta - \sin\theta) \]
   Therefore the necessary R for 10,000 acres is
   \[ R = \sqrt{\frac{15.625}{\pi - \tfrac{1}{2}(\theta - \sin\theta)}} \]
   N.B. ($\theta$ in radians)
3. Or use other appropriate configuration dictated by situation. (Describe).
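
The radius adjustment in item 2 can be checked numerically. The following is a minimal sketch in Python, assuming the circular-segment formula given on the form; the names `required_radius` and `TARGET_AREA_SQ_MI` are illustrative and do not appear in the original exhibit.

```python
import math

# 10,000 acres at 640 acres per square mile (assumed conversion) = 15.625 sq. mi.
TARGET_AREA_SQ_MI = 10_000 / 640


def required_radius(theta: float = 0.0) -> float:
    """Radius in miles of a circle whose area, less one segment subtending
    the central angle theta (radians), equals 15.625 square miles."""
    if not 0.0 <= theta < 2.0 * math.pi:
        raise ValueError("theta must lie in [0, 2*pi)")
    # Area kept = pi*R^2 - (1/2)*R^2*(theta - sin(theta)); solve for R.
    area_per_r_squared = math.pi - 0.5 * (theta - math.sin(theta))
    return math.sqrt(TARGET_AREA_SQ_MI / area_per_r_squared)


if __name__ == "__main__":
    # No segment removed: the plain circle used when there is no coastline.
    print(round(required_radius(0.0), 2))
    # Half of the circle cut off, for example by a straight coastline
    # running through the project site.
    print(round(required_radius(math.pi), 2))
```

With no segment removed the function reproduces the 2.23 mile radius stated on the form; as the excluded segment grows, the radius increases so that the remaining land area stays at 10,000 acres.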
EXHIBIT 5-1 (CONTINUED)
Project name_
III. NON-RESIDENTIAL BUILDING FLOOR AREA IN AREA OF INFLUENCE
1970 Floor Area in Thousands of Square Feet
SOURCE: 1970 Air Photos Checked with 1970 Land Use Map and 1970 U.S.G.S. Map.

                          <25K    25-50K   50-100K   >100K    SOURCE
Commercial                29.     30.      31.       32.      SIC: 52-59,72,75,76,78,79
                                                              LUC: 52-59,61,63,64,65
Hotels & Motels, Clubs    33.     34.      35.       36.      SIC: 70
                                                              Hotel-Motel Redbook - 1970

                          <50K    50-100K  >100K              SOURCE
Office                    37.     38.      39.                SIC: 60-67,73,80,81,89,86
                                                              LUC: 60,68,62,67,78
Hospital                  40.     41.      42.                SIC: 80 (Interview with local planner & AHA Guide)
                                                              LUC: 77
Public Schools            43.     44.      45.                SIC: 82
                                                              LUC: 74,75

Manufacturing             46.                                 SIC: 20-39
                                                              LUC: 2,3
Wholesale & Warehousing   47.                                 SIC: 50,51,422
                                                              LUC: 50,46
Churches                  48.                                 SIC: 866
                                                              LUC: 764
Cultural                  49.                                 SIC: 84 (Interview with local planner)
                                                              LUC: 76 (except 764)

LAND USE IN AREA OF INFLUENCE
**Active Outdoor Recreation, terminal year: 50.          acres.
**Includes, playgrounds, playfields, tot lots, golf courses, riding stables, developed beaches,
tennis complexes and other active recreation such as ski slopes.
Does not include, passive open space portions of large reservations, arboretums, other areas
devoted to conservation, such as, wildlife refuge, national forests.

The inventory of active outdoor recreation is an optional item. Do only if you have the time
or is easily available from secondary information.

INDICATE PRESENCE OF OTHER LAND USES IN 1970
Within area of
Influence

Within 1 Mi. of
Boundary of Area
of Influence
Universities Private Schools Airports Coastline Inland Water
5
Agriculture
62.
Yes No
Yes No
Yes No Yes No
DD
Yes No
Yes No
Describe any other significant land uses or factors potentially affecting development (continue
on reverse side). (I.e., military bases, correctional institutions, etc.).
EXHIBIT 5-1 (CONTINUED)
Project name_
IV. GEOGRAPHIC DATA:
74. 1970 Highway Lane Miles with Road A.D.T. > 10,000 (from worksheet #2)
Area of Influence:
75. Highway Interchanges, t+0
76. Commuter Rail Stops, t+0
77. Mass Transit Stops, t+0
78. Railroad Line Miles, t+0
Major Project:
79. Distance to Region CBD, t+0:            highway miles
80. Distance to F.A.A. Airport, t+0:        highway miles
MUNICIPAL DATA: For Each Principal Municipality (if not base year, then you must adjust
for presence of Major Project). Obtain vacant land, zoning, and sewer
information for area of influence if possible.

Name of Municipality or Area
81. Total Area in Acres
82. Vac. Developable Acres, t+0
83. Vac. Undevelopable Acres, t+0
84. Percent sewered, t+5                         %
Percent of Area Zoned in following categories, t+5:
85. Single family residential                    %
86. Multifamily residential                      %
87. Commercial                                   %
88. Office                                       %
89. Industrial                                   %
90. Other                                        %
91. Total Employment, t+0
92. Manufacturing Emp. (SIC 20-39)
93. Office Emp. (60-67,73,80,81,86,89)
94. Wholesale & Warehouse Emp. (50,51,422)


VI. REGIONAL DATA:
County
Total Employment: 1960
1970
Area
SMSA/Metropolitan Area
Base Year: Cost of Energy, $/KWH t+0
For Each County Economically Significant to Area of Inf.
Census

95.
96.
97.
t+0
percent.
uying Income

98.
99. *
100.

EXHIBIT 5-1 (CONTINUED)
Project
name
WORK SHEET: HWLM with ADT > 10,000

Link Name     Approximate ADT     Link Length     Link Lanes     Lane Miles


EXHIBIT 5-1 (CONTINUED)
PROJECT NAME
[Worksheet for tallying building floor area by land use. Column headings:
Commercial (<25K, 25-50K, 50-100K, >100K); Manufacturing; Warehousing &
Wholesaling; Office; Hotels, Motels & Clubs (<25K, 25-50K, 50-100K, >100K);
Hospitals (<50K, 50-100K, >100K); Public Schools (<50K, 50-100K, >100K);
Churches; Cultural; Miscellaneous.]
-------
EXHIBIT 5-1
PART B
CENSUS OF POPULATION AND HOUSING
DATA COLLECTION FORMS
EXHIBIT 5-1 (CONTINUED)
Project name
1970 CENSUS HC(3) BLOCK STATISTICS, TABLE AVAILABLE FOR ALL URBANIZED AREAS.
IF NOT COVERED, HOUSING COUNTS TO BE TAKEN FROM AERIAL OR OTHER MEANS
FOR ALL BLOCKS WHOLLY OR PARTIALLY IN AREA OF INFLUENCE

CENSUS     BLOCK      % OF BLOCK     TOTAL          TOTAL YEAR      ONE UNIT
TRACT      NUMBER     IN AREA        POPULATION     ROUND UNITS     STRUCTURE

TOTAL:                               101.           102.            103.
EXHIBIT 5-1 (CONTINUED)
Project name
1970 CENSUS PHC(1) CENSUS TRACTS, TABLES H-1 & H-2
AVAILABLE FOR ALL SMSA'S AND MORE. IF NOT COVERED, INFORMATION
IS COLLECTED BY PRINCIPAL MUNICIPALITIES FOR ALL CENSUS TRACTS MORE THAN 25%
IN AREA OF INFLUENCE AND OTHER CENSUS TRACTS JUDGED TO BE RELEVANT

Census Tract No.
% of Census Tract in Area                 Total
Median Rooms                              104.
Units in Structure:
  Year Round Units                        105.
  1                                       106.
  2                                       107.
  3-4                                     108.
  5-49                                    109.
  50+                                     110.
Occupied Units                            111.

If area of influence wholly or partially in untracted area without
block statistics, enter estimate of 1970 housing units for untracted/
unblocked area.   112.        units, year round.
EXHIBIT 5-1 (CONTINUED)
Project name_
FOR ALL PRINCIPAL MUNICIPALITIES [All Municipalities Within One-Half Mile of Project]
1970 CENSUS DETAILED HOUSING CHARACTERISTICS
GENERAL HOUSING CHARACTERISTICS
ITEM
Name
% of City in Circle
Year Round Units
Elevator: Total
Units
4 Floors +
1-3 Floors
Units in Structure:
1 Detached
1 Attached
2
3-4
5 +
Mobile Home
BY TABLE NO. (one column per size class of place; the size-class headings are not legible)
Column 1: D45, D45, D45, D43, D43, D43, D43, D43, D43
Column 2: D55, D55, D55, D53, D53, D53, D53, D53, D53
Column 3: D58, D58, D58, D58, D58
Column 4: G27, G27, G27
DATA

%
TOTAL

114.
In Towns Less Than 10,000, Estimate No. of High Rise

115.
116.
In Towns 10-2 1/2 Thousand, Estimate Detached/Attached
Singles. In Towns Less Than 2,500, One Units are
Detached, Remainder 3-4 Units.

117.
118.
119.
120.
121.
122.
FILL OUT FOLLOWING SECTION FOR UNTRACTED - UNBLOCKED PRINCIPAL MUNICIPALITIES
Occupied Units
All
Median Rooms/Unit
All Year Round
G8
G9
G18
G19
G23
G23
G27
G27
Towns Less Than 10,000, Add Renter & Owner Occupied

123.
Towns Less Than 2,500 Take Weighted Average of Renter
& Owner

124.
FILL OUT FOLLOWING FOR COUNTIES IN URBANIZED AREA OF PROJECT

County Name
Total Population
Total Housing Units
G29
G29

125.

126.
EXHIBIT 5-1 (CONTINUED)
Project name_
1960 CENSUS OF POPULATION & HOUSING
Use the combination of Census Tracts and minor civil divisions that best reflect
the project area. If the same, use Census Tracts as the data is presented in a
more usable format.
Note combination to be used:
Total Acreage of Census Tracts 113.
FOR CENSUS TRACTS

ITEM                           TABLE     DATA     TOTAL
Tract No.
Total Population               P-1                127.
Pop. in Household              P-1                128.
Median Income Family & Indiv.  P-1                129.
Pop. under 14                  P-2                130.
Total Employment               P-3                131.
Professional                   P-3                132.
Managers                       P-3                133.
All Housing Units              H-1                134.
Owner Occupation               H-1                135.
Renter Occupation              H-1                136.
Available Vacant               H-1                137.
Median Rooms/Unit              H-1                138.
Median Value, Owner            H-2                139.
EXHIBIT 5-1 (CONTINUED)

Project name_
FOR URBAN PLACES, TOWNS, & MINOR CIVIL DIVISIONS

It is important that all items in each block are from a common base. Minor
Civil Division and Urban Place are not the same.

BLOCK 1 - POPULATION

ITEM                             TABLE              NAME OF TOWN     TOTAL
Total Population                 25                                  140.
Number of Households             25                                  141.
Population in Households         25                                  142.
Males under 5                    26                                  143.
Males 5-14                       26                                  144.
Females under 5                  26                                  145.
Females 5-14                     26                                  146.
Median Income Family & Indiv.    76 or 81                            147.
Employment                                                           148.
Professional Employment          74, 78, or 81                       149.
Managerial Employment                                                150.

BLOCK 2 - HOUSING

ITEM                             TABLE              NAME OF TOWN     TOTAL
Total Housing Units              12 18 22 25 27                      151.
Population in Housing Units      15 20 24 26 27                      152.
Occupied Housing Units
  Total                          12 18 24 26 -                       153.
  Owner                          27                                  154.
  Renter                         27                                  155.
Mean-Median No. of Rooms
  Total                          13 19 23 25 -                       156.
  Owner                          -  -  -  -  27                      157.
  Renter                         27                                  158.
Median Value-Owner Occupation    17 21 24 26 27                      159.

Fill out for all Counties Economically Significant to the Area of Influence

ITEM                             TABLE                               TOTAL
Total Population                 P-13                                160.
Total Housing Units              H-28                                161.
Total Workers                    G-82                                162.
Auto Drivers                     G-82                                163.
Median Income                                                        164.
by zone. As such, it provides an estimate of the accuracy of the methodology
that was employed to estimate floor area from aerial photographs.

The MORPC transportation zones for both case study areas are shown
in Figures 5-1 and 5-2. The floor area of all the zones contained in the two
case study areas was summed and compared to the figures obtained by Walden
from the aerial photographs. The floor area in zones along the boundary of
the circle was apportioned to the total by area. Table 5-2 shows the
comparison for the Western Electric and White-Westinghouse case studies,
respectively. The total floor area estimated from the aerial photographs is
remarkably close to the MORPC inventory; the difference is within two percent
for the Western Electric case and about 1.5 percent for the White-Westinghouse
case. Because of the different classification systems used by MORPC and
Walden, a comparison of the subtotals is more difficult.
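
The percentage differences quoted above can be reproduced from the totals reported in Table 5-2. The following is a minimal sketch in Python; the function name `percent_difference` is illustrative, and the MORPC inventory is assumed to be the reference value in each comparison.

```python
def percent_difference(reference: float, estimate: float) -> float:
    """Absolute difference between estimate and reference, as a percentage
    of the reference value."""
    return abs(estimate - reference) / reference * 100.0


# Total floor area in thousands of square feet, as reported in Table 5-2:
# (MORPC transportation zones, Walden aerial photograph estimate).
TOTALS = {
    "Western Electric": (3279.6, 3222.0),
    "White-Westinghouse": (12497.0, 12694.0),
}

if __name__ == "__main__":
    for case, (morpc, walden) in TOTALS.items():
        print(f"{case}: {percent_difference(morpc, walden):.2f} percent")
```

Both differences fall below two percent, consistent with the statement in the text.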

From our knowledge of the study areas and discussion with personnel,
we made the following observations:

• The office floor area was systematically underestimated. This is
most likely due in part to misclassifying small offices as
commercial floor space (e.g., bank branches in shopping centers,
real estate offices, etc.). One other possibility is that
auxiliary office space in industrial facilities is being misclassified
as industrial floor area; however, we believe that the "missing"
office area is classified as commercial floor area. We anticipate
that this will cause no difficulty with the final product, viz.,
predicted emissions.

• The amount of manufacturing floor area is overestimated by about
ten percent. This could be due to misclassifying some warehousing
as manufacturing; however, MORPC personnel consider that the
distinction between non-manufacturing industrial floor area and
manufacturing industrial floor area may not have been consistently
accurate in their inventory.

• As MORPC groups all retail and wholesale floor area together and we
grouped warehousing with wholesale floor area, it was not possible
to accurately evaluate the comparison for these subtotals.
Discussion with MORPC personnel familiar with the case study areas
indicates that our figures are reasonable.
Figure 5-1. Western Electric Case Study Area and MORPC Transportation Zones

Figure 5-2. White-Westinghouse Case Study Area and MORPC Transportation Zones
TABLE 5-2
WESTERN ELECTRIC
10^3 square feet

                                            MORPC             WALDEN
                                            Transportation    by Aerial
                                            Zones             Photographs
Office                                      76                48
Retail and Wholesale Trade and Services     597
Retail Trade and Services                                     568
Warehousing and Wholesale                                     5
Warehousing, Construction                   252
Manufacturing                               2,372.6           2,601
TOTAL                                       3,279.6           3,222

TABLE 5-2
WHITE-WESTINGHOUSE
10^3 square feet

                                            MORPC             WALDEN
                                            Transportation    by Aerial
                                            Zones             Photographs
Office                                      362               238
Retail and Wholesale Trade and Services     5,360
Retail Trade and Services                                     1,874
Warehouse and Wholesale Trade                                 6,010
Warehousing, Construction                   2,617
Manufacturing                               4,158             4,572
TOTAL                                       12,497            12,694
B. MONTCLAIRE - STARMONT, CHARLOTTE, NORTH CAROLINA

A more limited check on the aerial photograph interpretation process
was available for this case study. A land use data bank (Management Planning
Analysis Module, MPAM) was available at the Census Tract level of
disaggregation. A comparison of the aerial photograph interpretation and the
MPAM data is shown in Table 5-3. The total floor area estimated from the
aerial photographs is 78 percent of that reported in the MPAM data bank.
There are also apparently some severe misclassification errors.

In summary, we are unable to adequately explain the discrepancy in
the Charlotte, North Carolina aerial photograph interpretation. The three
aerial photograph interpretations referenced in this section were each done
by different individuals. Of the three, the individual doing the
interpretation in Charlotte was the most experienced. We have identified two
possible reasons for the conflicting results among the three sites:

1. Photograph Scale

The Columbus, Ohio aerial photographs were 1" = 400' scale;
the Charlotte, North Carolina photographs were 1" = 800' scale.

2. Data Aggregation

The MORPC data were on a smaller areal unit than the Charlotte
data. The MORPC data had about twenty zones in each circle (see Figures 5-1
and 5-2); the Charlotte data had eight. While the Charlotte MPAM data may be
accurate, the aggregation and apportionment necessary to compare them with the
totals from the aerial photograph interpretation may have caused the error.

The second reason would have no impact on the accuracy of this
project. If the error was caused in part by the smaller scale of the
Charlotte photographs, certain case studies may be more accurate than others.
The distribution of the aerial photograph scales of all case studies is
shown on Table 5-4.
TABLE 5-3

COMPARISON OF BUILDING AREA DATA (10^3 Sq. Ft.)

Item           Estimate from          MPAM 1973
               1971/1972 Aerials      Data
Offices        1,489.4                2,532
Commercial     2,606.5                2,946
Industrial     2,835.1                1,654
Remainder      925.7 (a)              1,262 (b)
Total          6,516.7                8,394

(a) Includes: Hotels, Motels, Clubs; Hospitals; Public Schools;
    Wholesale and Warehousing; Churches; and Cultural.

(b) Includes: Institutional and other.
TABLE 5-4
CUMULATIVE PERCENTAGE OF CASE STUDY DATA COLLECTION OBTAINED
FROM PHOTOGRAPHS OF CERTAIN SCALE OR LARGER

Scale            Cumulative Percentage
1" = 100'        5%
1" = 200'        12.5%
1" = 300'        17.5%
1" = 400'        35%
1" = 500'        45%
1" = 600'        57.5%
1" = 700'        62.5%
1" = 800'        75%
1" = 900'        77.5%
1" = 1000'       90%
1" = 1200'       97.5%
1" = 2000'       100%
VI. CAUSAL ANALYSIS

A. GENERAL APPROACH

The approach to path analysis in the current study involved the use
of two basic statistical techniques: two stage least squares and ordinary
least squares multiple regression. The first technique was used to
solve for path coefficients in the system of five equations connected by
feedback loops (viz., the dependent variables RES, COMM, OFFICE, MANF, and
HWLM). For a given dependent variable, the first stage of the two stage
process involved estimating the values of the other four endogenous variables
through linear combinations of so-called instrumental variables, which are
chosen for their causal relationships with the endogenous variables. Take,
for example, the Residential model equation, where RES is the dependent
variable, OFFICE, MANF, and HWLM are endogenous, and MPR70, DUACRE, and
VACACR are exogenous:

\[ \mathrm{RES} = b_1\,\mathrm{OFFICE} + b_2\,\mathrm{MANF} + b_3\,\mathrm{HWLM} + b_4\,\mathrm{MPR70} + b_5\,\mathrm{DUACRE} + b_6\,\mathrm{VACACR} + b_7 + e \]

If ordinary least squares were applied directly to this equation, the estimates
of $b_1, b_2, \ldots, b_7$ would be inconsistent because three of the model
variables (OFFICE, MANF, and HWLM) are endogenous variables linked to RES
through feedback loops and hence correlated with the error term e [35].
However, we can eliminate the dependence of the endogenous variables on e by
substituting modified regression estimates of OFFICE, MANF, and HWLM that are
linear combinations of the exogenous (i.e., independent of e) instrumental
variables. First stage regressions are therefore performed with OFFICE, MANF,
and HWLM as the dependent variables and the instrumental variables as the
independent variables to obtain values for the modified regressors,
$\widehat{\mathrm{OFFICE}}$, $\widehat{\mathrm{MANF}}$, and
$\widehat{\mathrm{HWLM}}$. The second stage then entails ordinary least squares
estimation of the following equation:

\[ \mathrm{RES} = b_1\,\widehat{\mathrm{OFFICE}} + b_2\,\widehat{\mathrm{MANF}} + b_3\,\widehat{\mathrm{HWLM}} + b_4\,\mathrm{MPR70} + b_5\,\mathrm{DUACRE} + b_6\,\mathrm{VACACR} + b_7 + e \]
Because of the interconnections among the five equations, the set of
instrumental variables used in any one two stage regression had to consist of
all of the instrumental variables in the simultaneous block. A complete
discussion of statistical techniques is given in Section II, and the
problem of identification that can occur in two stage least squares analysis
is discussed in Section III.D.
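
The two stage procedure described above can be sketched in a few lines of NumPy. This is an illustration on synthetic data, not the software or data used in the study: the function names `ols`, `two_stage_least_squares`, and `simulate`, and all numerical values in `simulate`, are assumptions made for the example.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares coefficients for y = X b."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, endog, exog, instruments):
    """Coefficients ordered as [endogenous..., exogenous..., constant].

    `instruments` should hold every exogenous variable in the simultaneous
    block, including the exogenous variables of the equation itself.
    """
    const = np.ones((len(y), 1))
    Z = np.hstack([instruments, const])
    # Stage 1: replace each endogenous regressor by its fitted value.
    endog_hat = Z @ ols(Z, endog)
    # Stage 2: ordinary least squares with the modified regressors.
    return ols(np.hstack([endog_hat, exog, const]), y)


def simulate(n, seed=0):
    """Hypothetical data: one endogenous regressor w correlated with e."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))            # instruments excluded from the equation
    x = rng.normal(size=(n, 1))            # exogenous variable in the equation
    e = rng.normal(size=n)                 # structural error term
    u = 0.8 * e + 0.6 * rng.normal(size=n)
    w = z[:, 0] + z[:, 1] + 0.5 * x[:, 0] + u
    y = 2.0 * w + 1.0 * x[:, 0] + 3.0 + e  # true coefficient on w is 2.0
    return y, w.reshape(-1, 1), x, np.hstack([z, x])


if __name__ == "__main__":
    y, w, x, Z = simulate(5000)
    direct = ols(np.hstack([w, x, np.ones((len(y), 1))]), y)
    print("direct OLS coefficient on w:", direct[0])
    print("two stage coefficient on w: ", two_stage_least_squares(y, w, x, Z)[0])
```

In this synthetic setting the direct estimate is pulled away from the true value by the correlation between the endogenous regressor and the error term, while the two stage estimate is not.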

An examination of the model equations (see Section VI.E.2) showed
there to be 20 instrumental variables. For the Residential project, for
example, these were:

MPRT2 HWYINT ENERGY
MPR68 MINCC RRMI
MPR70 INCMP EMPACR
DUACRE OFFVAC MANACR
VACACR OFFACR DELEMP
VACHSG DISCBD DELPOP
MINCR AUTO

The number of instrumental variables in the original model was a definite
problem, since meaningful results cannot be obtained in the first stage
regression if there are as many independent (instrumental) variables as
there are data samples in the current study (20). Therefore, a modified
stepwise regression analysis was performed to choose those instrumental
variables which were most appropriate, in order to allow sufficient degrees
of freedom for the error term in the first stage regression and ensure stable
coefficients for the first stage estimates. This analysis and the trimmed
equations are discussed in Section VI.C.4.

To solve for path coefficients in the other model equations that
were not interconnected, ordinary least squares regression techniques were
used.

The multiple regression techniques used in path analysis, when
applied directly to the data to be analyzed, will yield unstandardized path
regression coefficients (b). However, if applied to data that have been
standardized (i.e., the data have a mean of 0 and a standard deviation of 1),
the analysis will yield standardized path coefficients (β) or beta weights.
There are advantages to both the standardized and the unstandardized approaches.
Unstandardized path regression coefficients indicate the amount of change in
a dependent variable given a one unit change in a particular independent
variable, all other effects held constant. This approach allows one to deduce
quantitatively the resultant effects of a given change or action on the system.
On the other hand, standardized path coefficients provide a sensible way of
comparing the relative importance of various independent variables independent
of different units and scales. In the current study, both b and β values were
computed for each regression performed. The path coefficients (β) were used
extensively in judging the significance of model paths for theory trimming.
Path coefficients do not, however, provide a valid basis for comparisons of
systems operating on different data populations, since the variance of the
individual variables is included in the path coefficients and variance can
change drastically from sample to sample [26]. Therefore, path regression
coefficients (b) were utilized in the analysis of model stability. Both b
and β values are presented in this report for the final model equations.

Two different computer software packages were employed for the
statistical analysis in this study. The first of these was the Time Series
Processor (TSP) developed at MIT and Harvard University in the late 1960's
[39]. TSP is a programming language oriented towards the statistical analysis
of time series, with specific applications to econometric research. TSP was
used for all two-stage least squares analysis. Although it also provides
ordinary least squares techniques, these analyses were not performed on TSP
due to several software installation problems encountered during the project.
Therefore, the Statistical Package for the Social Sciences (SPSS) was used
instead [40]. SPSS is a comprehensive and widely used system of programs
for the statistical analysis of social science data. Although SPSS
automatically provides both b and β values for a multiple regression, TSP
does not. Thus, β values for the two-stage least squares were calculated
using the following identity:

\[
\beta_i = b_i \, s_i / s_y
\]

where β_i = path coefficient for independent variable x_i
      b_i = path regression coefficient for independent variable x_i
      s_i = standard deviation of independent variable x_i
      s_y = standard deviation of dependent variable y
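
The identity can be checked numerically. The following is a minimal sketch, assuming NumPy and sample standard deviations; the function name is hypothetical and is not part of TSP or SPSS.

```python
import numpy as np


def path_coefficients(b, X, y):
    """Convert unstandardized coefficients b into standardized ones.

    b : (k,) path regression coefficients, one per column of X
    X : (n, k) independent variables
    y : (n,) dependent variable
    Applies beta_i = b_i * s_i / s_y using sample standard deviations.
    """
    s_x = X.std(axis=0, ddof=1)
    s_y = y.std(ddof=1)
    return b * s_x / s_y
```

The result should agree with the coefficients obtained by regressing standardized y on standardized X directly, which is why the conversion can be applied after the fact to two-stage estimates.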

B. DATA TRANSFORMATIONS

The land use and demographic data collected in the field program
(and discussed in Section V) were loaded into a field data file on our com-
puter system for processing. Computations were performed on these data to
create the model variables chosen for path analysis. These data transforma-
tions are summarized in a list of variable definitions in Table 6-1, along
with the computer card format used for input. The data form numbers refer to
the numbers on the data collection forms, pages 5-9 to 5-20. Certain inter-
mediate calculations on the raw field data are described in footnotes on
Table 6-1.

The transformations were performed on data from all 40 field cases
and the results stored in two separate data files, one for the Residential
model, and one for the Industrial/Office model. Complete listings of all
data files are given in Appendix C (published in the separate Appendix to
this volume, APTIC Document #80998 [42]). These path analysis data, for both
the Residential and Industrial/Office models, were examined for errors by
checking variable means and extremes, verifying computer coding sheets and
field calculations, and examining the signs of simple correlations between
the variables. Errors were then corrected in the field data file and the
path variables recomputed.
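
As an illustration of the kind of transformation listed in Table 6-1, the sketch below computes four of the model variables from raw field values. It is a hypothetical reconstruction for clarity only: the dictionary layout and the function name are assumptions, and the original processing was done from punched-card input rather than in Python.

```python
def transform_case(raw):
    """Compute a few Table 6-1 model variables from one case's raw fields.

    `raw` maps raw field names (as used in Table 6-1) to numbers.
    """
    return {
        # dwelling units in area of influence, excluding the major project
        "RES": raw["du70t"] - raw["mpr70"],
        # raw commercial fields are in 100 sq ft; COMM is in 1,000 sq ft
        "COMM": (raw["comm1"] + raw["comm2"] + raw["comm3"] + raw["comm4"]) / 10,
        # 1960 dwelling units per acre in census tracts, excluding the project
        "DUACRE": (raw["du60c"] - raw["mpr60"]) / raw["ac60c"],
        # 1960-1970 county population growth factor
        "DELPOP": (raw["p70cty"] - raw["p60cty"]) / raw["p60cty"],
    }
```

Checking the means and extremes of such derived variables, as described above, is a quick way to catch unit errors (for example, a field entered in square feet rather than hundreds of square feet).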

Analyses were subsequently performed on model variables to test for
multicollinearity, possible suppressor variable problems, and the suitability
of instrumental variables (used in solving feedback loops). The model was
trimmed as a result of the investigations, and the first path analysis per-
formed.
TABLE 6-1

PATH ANALYSIS MODEL DATA TRANSFORMATIONS

The source of each raw field is given in parentheses as data form number,
card number, and card column numbers.

 1. RES = residential land use in area of influence in 1970 (excluding
    major project) in dwelling units

    RES = du70t - mpr70

    where: du70t* = dwelling units in area of influence in 1970
                    (card 5, columns 49-53)
           mpr70  = residential land use in major project in 1970 in
                    dwelling units (form 8, card 1, columns 36-40)

 2. COMM = commercial land use in area of influence in 1970 in 1,000
    square feet

    COMM = (comm1 + comm2 + comm3 + comm4)/10

    where: comm1 = 100 square feet commercial in area of influence in
                   1970 (<25K) (form 29, card 2, columns 24-28)
           comm2 = 100 square feet commercial in area of influence in
                   1970 (25-50K) (form 30, card 2, columns 29-33)
           comm3 = 100 square feet commercial in area of influence in
                   1970 (50-100K) (form 31, card 2, columns 34-38)
           comm4 = 100 square feet commercial in area of influence in
                   1970 (>100K) (form 32, card 2, columns 39-41)

 3. OFFICE = office land use in area of influence (excluding major
    project) in 1970 in 1,000 square feet

    OFFICE = (off1 + off2 + off3)/10

    where: off1 = 100 square feet office in area of influence (excluding
                  major project) in 1970 (<50K)
                  (form 37, card 2, columns 64-68)
           off2 = 100 square feet office in area of influence (excluding
                  major project) in 1970 (50-100K)
                  (form 38, card 2, columns 69-73)
           off3 = 100 square feet office in area of influence (excluding
                  major project) in 1970 (>100K)
                  (form 39, card 2, columns 74-78)

 4. MANF = manufacturing land use in area of influence (excluding major
    project) in 1970 in 1,000 square feet

    MANF = manf/10

    where: manf = 100 square feet manufacturing in area of influence
                  (excluding major project) in 1970
                  (form 46, card 3, columns 34-39)

 5. WHOLE = wholesale/warehouse land use in area of influence in 1970 in
    1,000 square feet

    WHOLE = whole/10

    where: whole = 100 square feet wholesale/warehouse in area of
                   influence in 1970 (form 47, card 3, columns 34-39)

 6. HOTEL = hotel and motel land use in area of influence in 1970 in
    1,000 square feet

    HOTEL = (hotel1 + hotel2 + hotel3 + hotel4)/10

    where: hotel1 = 100 square feet hotel in area of influence in 1970
                    (<25K) (form 33, card 2, columns 44-48)
           hotel2 = 100 square feet hotel in area of influence in 1970
                    (25-50K) (form 34, card 2, columns 49-53)
           hotel3 = 100 square feet hotel in area of influence in 1970
                    (50-100K) (form 35, card 2, columns 54-58)
           hotel4 = 100 square feet hotel in area of influence in 1970
                    (>100K) (form 36, card 2, columns 59-61)

 7. HOSPTL = hospital, etc. land use in area of influence in 1970 in
    1,000 square feet

    HOSPTL = (hosp1 + hosp2 + hosp3)/10

    where: hosp1 = 100 square feet hospitals in area of influence in
                   1970 (25-50K) (form 40, card 3, columns 4-8)
           hosp2 = 100 square feet hospitals in area of influence in
                   1970 (50-100K) (form 41, card 3, columns 9-13)
           hosp3 = 100 square feet hospitals in area of influence in
                   1970 (>100K) (form 42, card 3, columns 14-18)

 8. CULTUR = cultural land use in area of influence in 1970 in 1,000
    square feet

    CULTUR = cultur/10

    where: cultur = 100 square feet cultural in area of influence in
                    1970 (form 49, card 3, columns 51-55)

 9. CHURCH = religious land use in area of influence in 1970 in 1,000
    square feet

    CHURCH = church/10

    where: church = 100 square feet religious in area of influence in
                    1970 (form 48, card 3, columns 46-50)

10. ED = educational land use in area of influence in 1970 in 1,000
    square feet

    ED = (ed1 + ed2 + ed3)/10

    where: ed1 = 100 square feet educational in area of influence in
                 1970 (25-50K) (form 43, card 3, columns 19-23)
           ed2 = 100 square feet educational in area of influence in
                 1970 (25-50K) (form 44, card 3, columns 24-28)
           ed3 = 100 square feet educational in area of influence in
                 1970 (>100K) (form 45, card 3, columns 29-33)

11. REC = active outdoor recreational land use in area of influence in
    1970 in acres (form 50, card 3, columns 57-60)

12. HWLM = highway lane miles in area of influence in 1970
    (form 74, card 4, columns 4-8)

12a. HWLMNX** = highway lane miles in area of influence in 1970 without
    expressways (card 4, columns 9-13)

13. MPRT2 = residential land use in major project in year t+2 in
    dwelling units (form 5, card 1, columns 21-25)

13a. MPR68 = residential land use in major project in 1968 in dwelling
    units (form 7, card 1, columns 31-35)

13b. MPR70 = residential land use in major project in 1970 in dwelling
    units (form 8, card 1, columns 36-40)

14. DUACRE = dwelling units per acre in census tracts in 1960

    DUACRE = (du60c - mpr60)/ac60c

    where: du60c = dwelling units in census tracts
                   (form 134, card 7, columns 54-58)
           ac60c = census tract acreage in 1960
                   (form 113, card 8, columns 4-9)
           mpr60 = dwelling units in major project in 1960
                   (form 6, card 1, columns 26-30)

15. VACACR† = percent vacant developable acreage in area of influence in
    year (t+0)

    VACACR = vacdev/(10,000 - vacund)

    where: vacund = vacant undevelopable acreage in area of influence in
                    year (t+0) (form 82, card 4, columns 33-37)
           vacdev = vacant developable acreage in area of influence in
                    year (t+0) (form 83, card 4, columns 38-41)

16. VACHSG = percent vacant housing in census tracts in 1960

    VACHSG = vac60c/du60c

    where: vac60c = vacant available housing units in census tracts in
                    1960 (form 137, card 7, columns 69-73)

17. HWYINT = highway interchanges in area of influence in year (t+0)
    (form 75, card 4, columns 14-15)

18. MINCC = median income factor for families and individuals in census
    tracts relative to average U.S. income in 1960

    MINCC = mincc/$5,650

    where: mincc = median income for families and individuals
                   (form 129, card 7, columns 29-33)

19. INCMP = variable indicating the median income level of major project
    compared to surrounding community in year (t+2)

    INCMP = incmpa - incmpb

    where: incmpa = +1 if major project median income is ≥ 15% above
                    that of surrounding community
                    (form 17, card 1, column 61)
           incmpb = +1 if major project median income is ≥ 15% below
                    that of surrounding community
                    (form 19, card 1, column 63)

20. OFFVAC = percent office buildings vacant in metropolitan area in
    year (t+0) (form 99, card 5, columns 35-38)

21. OFFACR = office employment per acre in area of influence in year
    (t+0)

    OFFACR = offemp/10,000

    where: offemp‡ = office employment in area of influence in year
                     (t+0) (form 95, card 4, columns 69-73)

22. DISCBD = distance from center of major project to CBD in year (t+0)
    in miles (form 79, card 4, columns 25-28)

23. ENERGY = cost factor for electricity ($/1500 KWH) for users in the
    metropolitan area in year (t+0) relative to the average U.S.
    commercial rate in 1960

    ENERGY = energy/$51.59

    where: energy = dollars per 1500 KWH for commercial users in
                    metropolitan area in year (t+0)
                    (form 98, card 5, columns 25-29)

24. RRMI = railroad mileage in area of influence in year (t+0)
    (form 78, card 4, columns 20-24)

25. WWEA = warehouse and wholesale employment per acre in area of
    influence in year (t+0)

    WWEA = wwemp/10,000

    where: wwemp‡ = warehouse and wholesale employment in area of
                    influence in year (t+0)
                    (form 94, card 4, columns 74-78)

26. EMPACR = total employment per acre in area of influence in year
    (t+0)

    EMPACR = totemp/10,000

    where: totemp‡ = total employment in area of influence in year (t+0)
                     (form 91, card 4, columns 59-63)

27. NONHSE = nonhousehold population per acre in census tracts in 1960

    NONHSE = (p60c - hp60c)/ac60c

    where: p60c  = total population in census tracts in 1960
                   (form 127, card 7, columns 17-22)
           hp60c = household population in census tracts in 1960
                   (form 128, card 7, columns 23-38)

28. MPKIDS = school-age children per dwelling unit in major project in
    year (t+2) (form 20, card 1, columns 64-66)

29. ENRACR = public school enrollment per acre in census tracts in 1960

    ENRACR = p1460c/ac60c

    where: p1460c = population under 14 years of age in census tracts in
                    1960 (form 130, card 7, columns 34-38)

30. MANACR = manufacturing employment per acre in area of influence in
    year (t+0)

    MANACR = manemp/10,000

    where: manemp‡ = manufacturing employment in area of influence in
                     year (t+0) (form 92, card 4, columns 64-68)

31. DELPOP = growth factor for total regional population between 1960
    and 1970 (county data)

    DELPOP = (p70cty - p60cty)/p60cty

    where: p70cty = total population in county in 1970
                    (form 125, card 7, columns 4-10)
           p60cty = total population in county in 1960
                    (form 160, card 8, columns 10-16)

32. DELEMP† = growth factor for total regional employment between 1960
    and 1970 (county data)

    DELEMP = (e70cty - e60cty)/e60cty

    where: e70cty = total employment in county in 1970
                    (form 96, card 5, columns 11-17)
           e60cty = total employment in county in 1960
                    (form 95, card 5, columns 4-10)

33. MINCR = median income factor for the region in year (t+0) relative
    to the average U.S. income in 1960

    MINCR = mincr/$5,660

    where: mincr = median income for the region in the year (t+0)
                   (county data) (form 164, card 5, columns 39-43)

34. MPET2 = number of employees in major projects in year (t+2)
    (form 26, card 2, columns 9-13)

34a. MPE68 = number of employees in major projects in year 1968
    (form 27, card 2, columns 14-18)

34b. MPE70 = number of employees in major projects in year 1970
    (form 28, card 2, columns 19-23)

35. AUTO = automobile drivers per acre in county in 1960

    AUTO = au60cy/ac60cy

    where: au60cy = automobile drivers in county in 1960
                    (form 163, card 8, columns 30-36)
           ac60cy = county acreage in 1960
                    (form 97, card 5, columns 18-24)

*  The sum of dwelling units from the block data (page 5-16) plus
   dwelling units in unblocked areas (page 5-17).
** This was computed by adjusting the highway lane miles shown on the
   data collection form (page 5-12) by the amount of expressway highway
   lane miles (page 5-13).
†  Redefined later, see Table 6-5.
‡  Employment data for area of influence (if not available directly) was
   apportioned from municipal data by area (page 5-12).

C. PROBLEMS IN PATH ANALYSIS

A discussion of our approach to some common limitations of the general
multiple regression techniques used in path analysis is given in the
following sections:

1. Multicollinearity

Multicollinearity refers to the situation in which some or all of
the independent variables in a multiple regression equation are very highly
correlated. The effect of multicollinearity on path analysis is to cause
instability in the path regression coefficients and, in the extreme case, to
make it impossible to invert the correlation matrix of the independent
variables, preventing solution of the equation. Multicollinearity can exhibit
itself as a strong simple correlation between two independent variables, or as
the situation where one of the variables is a nearly perfect linear function
of a set of the other variables. Multicollinearity typically arises with a
simple correlation greater than .90 or .95. In the current study, the test for
multicollinearity began with an examination of the simple correlations between
the independent variables in all of the path analysis equations. Simple
correlations between variables in the residential and industrial models that
are significant at the ten percent level are shown in Tables 6-2 and 6-3,
respectively. A complete listing of the simple correlation matrices is shown
in Appendix C (published in the separate appendix to this volume) [42].

The results show that in general the independent variables of each
equation are only weakly correlated, with most values falling below r=0.35.
This will help improve the stability of the final path regression
coefficients. In the Residential model data, the possibility of
multicollinearity was found in two cases. In the OFFICE and WHOLE equations,
the correlation between the independent variables HWLM and COMM is 0.83, and
between HWLMNX and COMM it is 0.89, both significant at the 0.1 percent level
(t-test). In the OFFICE equation, COMM is eventually trimmed, and in the WHOLE
equation both of these variables are eventually trimmed, eliminating possible
multicollinearity problems.

[Table 6-2. Residential Model Simple Correlations Significant at p ≤ .10]

[Table 6-3. Industrial-Office Model Simple Correlations Significant at p ≤ .10]

In the Industrial/Office model data, multicollinearity was found
in only one case. In the WHOLE equation, the correlation between the
independent variables WWEA and COMM is 0.83, significant at the 0.1 percent
level. The variable COMM is eventually trimmed from this equation, eliminating
possible multicollinearity problems.

If significant simple correlations had been found between many
independent variables, the next step in testing for multicollinearity would
have been to examine the partial correlations between variables. However,
since the variables are in general only weakly correlated, it is unlikely that
a problem exists in the partial correlations, and so these were not examined.
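
The first screening step, an inspection of the simple correlations among the independent variables, can be sketched as follows. This is an illustrative helper only: the function name is hypothetical, NumPy is assumed, and the default threshold of 0.90 is taken from the rule of thumb quoted above rather than from any program used in the study.

```python
import numpy as np


def flag_collinear_pairs(data, names, threshold=0.90):
    """List variable pairs whose simple correlation is at or above threshold.

    data  : (n_cases, n_vars) array, one column per independent variable
    names : variable names, one per column
    Returns a list of (name_a, name_b, r) tuples.
    """
    r = np.corrcoef(data, rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                flagged.append((names[i], names[j], float(r[i, j])))
    return flagged
```

A lower threshold (for example 0.80) would also surface borderline pairs such as the 0.83 and 0.89 correlations discussed above.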
2. Suppressor Variables
The suppressor variable problem arises in path analysis when multiple
regressions are performed between independent and dependent variables which are
truly uncorrelated. The presence of an irrelevant variable can cause the sign
and magnitude of the path coefficients to deviate from the values one might
expect. The approach used for this problem was to examine the magnitude and sign
of the simple correlations between dependent and independent variables in the
model equations.
In order to test the hypothesized path structures, the independent
variables in each equation were grouped by the significance of their
correlation with the dependent variable. The results of this categorization
are shown in Tables 6-4 and 6-5 for the Residential and Industrial projects,
respectively. The significance level of the t-statistic gives the probability
of obtaining a sample correlation at least this large when the associated
variables are truly uncorrelated, i.e., population r=0. Variables with a
significance level above 20 percent generally had correlations below r=0.2.
Thus, it can be argued that such variables are essentially uncorrelated with
the dependent variable and their inclusion in the model may cause suppressor
variable problems. Use of this criterion would trim over half the paths from
the current model. A negative correlation coefficient associated with an
independent variable is indicated in Tables 6-4 and 6-5 by an underline. An
examination of the underlined variables reveals some significant simple
correlations that are unexpectedly negative.
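
For reference, the t-statistic of a simple correlation r based on n cases is t = r (n-2)^(1/2) / (1-r^2)^(1/2), with n-2 degrees of freedom. The short sketch below evaluates it; the function name is hypothetical, and the use of n = 20 cases per model in the usage note is taken from the sample size stated earlier in this section.

```python
import math


def correlation_t_statistic(r, n_cases):
    """Return (t, degrees of freedom) for testing population r = 0."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n_cases - 2
    # t = r * sqrt(n - 2) / sqrt(1 - r^2)
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df
```

With 20 cases, a correlation of 0.83 gives t of about 6.3 on 18 degrees of freedom, while a correlation of 0.2 gives t of about 0.87, which illustrates why correlations near 0.2 fall in the least significant group.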
Based on these results, it appears a suppressor variable problem
could exist with the path model as originally specified. However, it
TABLE 6-4

ANALYSIS OF SIMPLE CORRELATIONS BETWEEN VARIABLES IN THE RESIDENTIAL PATH MODEL*
Dependent
Variable
Significance Level of t-Statistic of Correlation with Independent Variables
≤ 10%     > 10% and ≤ 20%     > 20%
RES

COMM

OFFICE
MANF
WHOLE
HWLM, HWLMNX, MPRT2.
MPR70, DUACRE

RES, OFFICE, HWLM,
HWLMNX, MINCC
COMM, MANF, HWLM,
HWLMNX, MPR70.
OFFVAC.
DJSCB.D

MANACR
MPR68. VACACR. DELPOP OFFICE, MANF, VACHSG
MANF
MPRT2. MPR68, MPR70,
WIflT',~ITiCRP ."DTLTMP,
MINCR
MPRT2. MPR68. VACACR RES, HWYINT, DELEMP
HWLM
HWLM
DELEMP. HWLMNX, MPRT2,
MPR68. MPR70, HWYINT,
ENERGY,~RRW

COMM. MANF. HWLMNX,
Ml, WWEA.
HOTEL
HOSPTL
CULTUR
EDUC
REC
HWLM
HWLMNX
MANF
RES
RES, MPR70, MPKIDS
MRRT2, MPR68,
MPR70, IBCMP
RES, OFFICE,
DISCBD, COMM
RES, COMM,
OFFICE, DISCBD

MPR68
MPR68, MPR70, DISCBD
MPR68
ENRACR
RES. MINCC
MPR70, MANF,
HWYINT, EMPACR
EMPACR
DELEMP
RES, OFFICE, MPRT2,
MPR70, DISCBDTWACR
MPRT2, NONHSE
MPRT2, MPR70, RES,
DISCED
MPRT2, MPR68
MPRT2, MPR68, AUTO
MANF, MPRT2, MPR68,
* An underline signifies a negative correlation.
TABLE 6-5

ANALYSIS OF SIMPLE CORRELATIONS BETWEEN VARIABLES IN THE INDUSTRIAL/OFFICE MODEL*
Dependent
Variable
Significance Level of t-Statistic of Correlation with Independent Variables
≤ 10%     > 10% and ≤ 20%     > 20%
RES

COMM

OFFICE
HWLM, HWLMNX, OFFICE,
MPE68, DUACRE, VACACR.
DELPOP, MANF

OFFICE, HWLM, HWLMNX,
RES, MANF

COMM, RES, MANF
MINCC
DISCED. HWLM, HWLMNX,
VACACR
MPET2. MPE70,
VATRST3
MINCR, MPET2, MPE68,
MPE70, WlNT, DELEMP

MPET2, MPE68, MPE70.
HMYINT.WVAC. OFFACR.
MANF
WHOLE
HOTEL
HOSPTL
CULTUR
CHURCH
EDUC
REC
HWLM
HWLMNX
HWLM, HWLMNX, DELEMP
HWLM, HWLMNX, WWEA,
COMM
OFFICE, DISCED, RES,
MANF, EMPACR
RES, DISCED, NONHSE
RES
RES
MINCC
RES, COMM, MANF,
EMPACR
RES, MANF, COMM,
HWYINT, EMPACR

MPET2, ENERGY, RRMI,
MANACR
MANF, DELEMP
MPE68, MPE70
MPE68, MPE70
—
ENRACR
--..
MPE68, OFFICE, AUTO
AUTO, OFFICE
DELEMP
MPE68. MPE70. HWYINT
HWYINT, RRMI
MPET2

MPET2
RES, DISCED
—
—
RES
MPET2, MPE70, HWYINT,
DISCED
MPET2, MPE68, MPE70,
DISCED

* An underline signifies a negative correlation.
is not desirable to trim the model using the simple correlations as the only
guide. Intercorrelations between independent variables (although not large)
can create a "masking" effect whereby an independent variable appears to
have a low simple correlation with the dependent variable, but is highly
correlated when the effects of other variables are taken into account.
Therefore, all trimming that was done to remove possible suppressor variable
problems (viz., where path coefficients were insignificant or of the wrong
sign) was based on the results of the ordinary and two-stage least squares
multiple regressions. A discussion of the trimming of the model that was
performed is given in Sections VI.C.4, "Instrumental Variables," and VI.E,
"Theory Trimming."

3. Multiple Variable Definitions

There were some independent variables in the system of equations
for which we initially had two or three different possible definitions. HWLM
and HWLMNX are two representations of highway lane miles. MPRT2, MPR68, and
MPR70 are all Major Project (Residential) dwelling units, but in different
years. Likewise, MPET2, MPE68, and MPE70 are all Major Project (Industrial)
employment. Due to the limited sample size and possible multicollinearity,
it is undesirable that all three variables be included in the regression.

In the Residential and Industrial/Office models, HWLM and
HWLMNX are about equivalent in the significance of their correlations (see
Tables 6-4 and 6-5). The choice of one variable or the other in this case
must also take account of the fact that highway lane miles also serves as a
dependent variable in one equation. Logically, the endogenous variables
linked to highway lane miles by feedback loops (viz., RES, COMM, OFFICE and
MANF) should show a stronger causal relationship with non-expressway highway
lane miles (HWLMNX) because of local strip highway development along major
arterials. The dependent variable chosen must also be significantly
correlated with the instrumental variables in the feedback loop system in
order for the two-stage least squares method to be successful in estimating
path coefficients. Stepwise multiple regressions were performed between the
highway lane mile variables and the total set of 20 instrumental variables for
the feedback system. In each case the overall goodness of fit of the
regression equation was evaluated by testing the significance of the F ratio.
This test indicates whether the data sample being analyzed could have been
drawn from a population in which the multiple correlation is equal to zero, so
that any observed multiple correlation is due solely to sampling fluctuations
or noise in the data. The results show that F ratios significant at the five
percent level occurred only with the HWLMNX variable; none of the HWLM
stepwise regressions were significant. Based on all of the above results, the
decision was made to use the HWLMNX variable exclusively throughout the
model equations.
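
The overall F ratio referred to above can be computed from the multiple R^2 of a regression as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), where k is the number of independent variables and n the number of cases. The sketch below is a generic illustration of that formula; the function name is hypothetical and the numbers in the usage note are invented for the example, not taken from the study.

```python
def overall_f_ratio(r_squared, n_cases, n_predictors):
    """Return (F, df_model, df_error) for the overall regression F test."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must satisfy 0 <= R^2 < 1")
    df_model = n_predictors
    df_error = n_cases - n_predictors - 1
    if df_error <= 0:
        # With as many predictors as cases there is nothing left for error.
        raise ValueError("no degrees of freedom left for the error term")
    f = (r_squared / df_model) / ((1.0 - r_squared) / df_error)
    return f, df_model, df_error
```

For example, R^2 = 0.5 with 20 cases and 4 predictors gives F = 3.75 on (4, 15) degrees of freedom; the value would then be compared with a tabulated critical F at the chosen significance level.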

In the Residential model, the simple correlations (Table 6-4)
show MPR70 to be generally the most significant of the three possible Major
Project Residential variables. MPRT2 and MPR68 also appear to be significant
in several of the equations. However, their inclusion in the model could
cause multicollinearity problems since the simple correlations between them
and MPR70 are 0.75 and 0.81, respectively. Thus, MPR70 was used exclusively
throughout the Residential model equations.

In the Industrial/Office model, all three Major Project Employment
variables appear to be significant. An examination reveals the simple
correlations between these variables to be:

\[
r_{\mathrm{MPET2,\,MPE68}} = 0.21, \qquad
r_{\mathrm{MPET2,\,MPE70}} = 0.33, \qquad
r_{\mathrm{MPE68,\,MPE70}} = 0.83
\]

Only the combinations of MPET2 and MPE68, or MPET2 and MPE70, can be used
without causing multicollinearity problems. In order to choose between MPE68
and MPE70, stepwise regressions of all the equations not connected by
feedback loops were performed, forcing in MPE68 or MPE70, in turn, as the
first variable. A comparison of the overall F ratios for these regressions
showed that MPE70 produced more significant results. Therefore, MPE70 and
MPET2 were used throughout the Industrial model.

4. Instrumental Variables

The efficacy of instrumental variables depends first upon
causality. They must be a cause of the endogenous variable they are to
estimate but not be caused by it. Secondly, instrumental variables should
be a stronger causal influence on the endogenous variables whose equations
they appear in, than on other endogenous variables in the simultaneous
block. Addressing the second point is very difficult since any apparently
significant correlation obtained between an instrumental variable and any
"other" endogenous variable can be due solely to the feedback loop relation-
ships. Thus, testing this aspect of an instrumental variable involves exam-
ination of the initial two-stage least squares analysis.

As discussed previously, a subset of the 20 instrumental vari-
ables originally specified in the path analysis model had to be chosen to
allow sufficient degrees of freedom in the first stage regressions. Step-
wise regressions of each of the five dependent variables in the feedback
loop system (viz., RES, COMM, OFFICE, MANF, and HWLMNX) with the total set
of 20 instrumental variables were performed. In each case, six of these
instrumental variables were first forced into the regression and then the
others were allowed to enter in the order of the significance of their added
partial contributions. The six forced variables were MPR70 (or MPE70),
DUACRE, MINCC, OFFACR, DISCBD, and MANACR. It was decided that at least one
major project variable had to appear in each regression. The other five
instrumental variables were obtained by choosing the one instrumental variable
from each equation that we believed to be the most important, based upon
theoretical induced land use relationships and simple correlations between
instrumental and endogenous variables. In a two-stage least squares
regression of five interconnected equations, a minimum of five instrumental
variables is necessary to avoid the problem of underidentification (see
Section III.C).
Although all 20 instrumental variables eventually entered all
regressions, a cutoff was made after four new variables had entered (beyond
n
the original six forced). An examination of the R change for the instru-i
mental variables (i.e., the additional variance explained in the regression)
indicated that after the fourth new variable entered, insignificant amounts
of the variance were explained by entering more variables. Next, for both
the Residential and Industrial models, a tabulation was made of how many
times an instrumental variable appeared in the first four chosen. Finally,
an upper limit of 13 instrumental variables (allowing 20-1-13=6 degrees of
freedom for the first stage error term) was set for use in the path model.
It was felt that six degrees of freedom were a minimum for meaningful results
in the first path model run, whereas further trimming beyond this limit would
be premature at this point in the analysis. Beyond the six forced variables,
seven more were chosen for both the Residential and Industrial/Office models
(see Table 6-6), based upon the number of times they appeared in the first
four variables chosen in the stepwise regressions. Figures 6-1 and 6-2 show
the path models as originally specified, and Figures 6-3 and 6-4 display the
models after the above trimming was performed. The dotted lines indicate the
variables that were trimmed.
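
The tabulation and the degrees-of-freedom count described above can be
sketched as follows. The stepwise entry orders in the example are invented
placeholders, not the study's output, and reading the "20" in 20-1-13 as the
number of cases is an assumption.

```python
from collections import Counter

def first_stage_dof(n_cases, n_instruments):
    """Error degrees of freedom of a first-stage regression with a constant."""
    return n_cases - 1 - n_instruments

def tally_early_entries(entry_orders, forced, first_n=4):
    """Count how often each non-forced variable is among the first `first_n`
    variables to enter, across several stepwise regressions."""
    counts = Counter()
    for order in entry_orders:
        free = [v for v in order if v not in forced]
        counts.update(free[:first_n])
    return counts

FORCED = {"MPR70", "DUACRE", "MINCC", "OFFACR", "DISCBD", "MANACR"}

# Hypothetical entry orders, one list per endogenous variable (illustrative).
ENTRY_ORDERS = [
    ["VACACR", "RRMI", "AUTO", "HWYINT", "EMPACR"],
    ["RRMI", "OFFVAC", "VACACR", "DELEMP", "AUTO"],
    ["HWYINT", "VACACR", "RRMI", "EMPACR", "OFFVAC"],
]

counts = tally_early_entries(ENTRY_ORDERS, FORCED)
print(counts.most_common())
print(first_stage_dof(20, 13))  # 6, matching 20-1-13 in the text
```

Variables with the highest counts would be the candidates for the seven
additional instrumental variables.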

D. PATH DIAGRAMS

The results of the stepwise and two-stage least squares regressions
performed for the initial path analysis are presented in Figures 6-5 and 6-6
for the Residential and Industrial/Office Models, respectively. The results
of the final path analysis are presented similarly in Figures 6-7 and 6-8.
The path coefficients (β) are shown on each path and the R² for each regression
is displayed next to the associated dependent variable. For each equation,
the effect on the dependent variable of all residual causes can be
quantified by the path coefficient for residual causes, defined as:

$$\beta_{\text{residual}} = (1 - R^2)^{1/2}$$
-------
TABLE 6-6

INSTRUMENTAL VARIABLES USED IN THE FIRST PATH ANALYSIS

       Residential Model    Industrial/Office Model
  1.   DUACRE               DUACRE
  2.   MINCC                MINCC
  3.   OFFACR               OFFACR
  4.   DISCBD               DISCBD
  5.   MANACR               MANACR
  6.   MPR70                MPE70
  7.   VACACR               VACACR
  8.   RRMI                 RRMI
  9.   OFFVAC               OFFVAC
 10.   AUTO                 AUTO
 11.   HWYINT               HWYINT
 12.   DELEMP               DELEMP
 13.   EMPACR               MPET2
-------
[Figure 6-1. Original Residential Path Diagram]

[Figure 6-2. Original Industrial-Office Path Diagram]

[Figure 6-3. Initial Residential Path Diagram]

[Figure 6-4. Initial Industrial-Office Path Diagram]

[Figure 6-5. Initial Path Analysis for the Residential Model]

[Figure 6-6. Initial Path Analysis for the Industrial-Office Model]

[Figure 6-7. Final Path Analysis for the Residential Model]

[Figure 6-8. Final Path Analysis for the Industrial-Office Model]
For example, in the final path analysis for the Residential model (Figure 6-7),
the R² for the RES equation is 0.773. Thus, the path coefficient for all
residual causes of RES is $(1 - 0.773)^{1/2} = 0.48$.
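
The arithmetic of this example can be checked with a few lines of Python.
The function name is an illustrative choice, not something taken from the
report.

```python
import math

def residual_path_coefficient(r_squared):
    """Path coefficient for all residual causes: (1 - R^2) ** 0.5."""
    return math.sqrt(1.0 - r_squared)

# RES equation of the final Residential model, R^2 = 0.773.
res_residual = residual_path_coefficient(0.773)
print(round(res_residual, 2))  # 0.48
```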
A detailed discussion of the trimming of each model equation is
given in the next section, along with a presentation of the path regression
coefficients for the final model. The complete statistical output of the
final path analysis is given in Appendix C (published in the separate appendix
to this volume, APTIC document 80998) [42].

E. THEORY TRIMMING

1. Approach

The results of the first path analysis revealed many path
coefficients that were too low or of the wrong sign relative to theory.
In addition, the overall t-statistics of several of the model equations indi-
cated no statistical significance at the five percent level. Thus, criteria
were developed, which when applied to the numerical output of the first path
analysis, trimmed many model paths. A second path analysis was performed,
and the process repeated several times until a final path model was decided
upon. This recursive procedure was necessitated by the fact that trimming
one variable often causes significant changes in the path coefficients of
the remaining variables.

In deciding what paths to drop, we did not choose one fixed
significance level (e.g., five percent) against which to compare the model t
and F statistics. In the two stage least squares analysis, the values cal-
culated for a t-statistic can only asymptotically be interpreted as being
distributed as t, due to the first stage estimation of all feedback loop
variables. Thus, it was necessary to use a more general rule in interpreting
these statistics. In ordinary least squares analysis, an accepted guideline
is that if t or F* is greater than 1.0, the variable being tested raises
the adjusted R² of the equation, i.e., it explains a significant additional
amount of the variance of the dependent variable. Conversely, a t or F less
than 1.0 indicates that the variable in question lowers the adjusted R² of
the equation.

* F(1, N-k-1) = t²(N-k-1)

The adjusted R² is defined as:

$$R_a^2 = R^2 - \frac{k-1}{N-k}\,(1 - R^2)$$

where

k = number of independent variables
N = number of data samples.
The conventional R² statistic can yield deceptive results when the significance
of independent variables is in question. For example, simply adding
a variable to any regression equation, whether it is at all correlated with
the dependent variable or not, will raise the R² and indicate additional
variance has been explained. The adjusted R² gives a more conservative, unbiased
estimate of the amount of variance explained in the dependent variable
through the regression equation.
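
As a rough check on the adjustment, the sketch below applies one form of the
adjusted R² to three of the final Residential equations reported later in
this chapter. The specific form used, R² - (k-1)/(N-k) (1-R²), and the
sample size N = 20 are assumptions: they were chosen because they reproduce
the reported adjusted values, not because the text states them here.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 in the assumed form R^2 - (k - 1) / (N - k) * (1 - R^2).

    n: number of data samples (assumed 20 in the examples below)
    k: number of independent variables in the equation
    """
    return r2 - (k - 1) / (n - k) * (1.0 - r2)

# (equation, reported R^2, k, reported adjusted R^2)
REPORTED = [
    ("RES", 0.773, 4, 0.730),
    ("COMM", 0.860, 5, 0.823),
    ("HWLMNX", 0.742, 3, 0.712),
]

for name, r2, k, reported in REPORTED:
    print(name, round(adjusted_r2(r2, 20, k), 3), reported)
```

The penalty grows with k, so an added variable that explains little can
lower the adjusted value even though the conventional R² does not fall.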

The path coefficient (β) indicates the expected standard deviation
change in the dependent variable given a one standard deviation change
in a particular independent variable, all other effects held constant.
Comparing the β's for different variables allows one to evaluate the relative
importance of the various independent variables because of the standardization.
In general, β < 0.1 indicates a variable is not contributing importantly
to the regression equation.
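
The relation between an unstandardized regression coefficient b and the
corresponding path coefficient can be sketched for the one-variable case,
where β = b (s_x / s_y). The data and function names below are invented for
illustration only.

```python
import statistics as st

def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = st.mean(x), st.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def path_coefficient(b, x, y):
    """Standardize a regression coefficient: beta = b * s_x / s_y."""
    return b * st.stdev(x) / st.stdev(y)

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    m, s = st.mean(values), st.stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical data in arbitrary units.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.5, 5.0, 8.5, 9.0, 12.5]

b = ols_slope(x, y)
beta = path_coefficient(b, x, y)
beta_from_z = ols_slope(standardize(x), standardize(y))
print(b, beta, beta_from_z)
```

Regressing the z-scores directly gives the same β, which is why the path
coefficients do not depend on the units of the original variables.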

In trimming the path analysis model then, the following rules
were applied. A path was trimmed if:

• |t| or F < 1.0, and β < 0.1, and loss of the variable would
not cause the loss of a significant instrumental variable
in the first stage estimations; or

• the sign of the path coefficient (β) was wrong and counter to
the original path model hypothesis.
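
The two rules can be expressed as a small predicate. This is a sketch only:
the function and parameter names are hypothetical, the magnitude of β is
compared in absolute value (an assumption consistent with later references
to values "below 0.1 in absolute value"), and the judgment-based exceptions
the authors applied are not captured.

```python
def should_trim(t_stat, beta, expected_sign, needed_as_instrument=False):
    """Return True if a path would be trimmed under the two stated rules.

    t_stat: t statistic of the coefficient (|t| < 1 is equivalent to F < 1)
    beta: path coefficient
    expected_sign: +1 or -1, the sign hypothesized in the path model
    needed_as_instrument: True if dropping the variable would lose a
        significant instrumental variable in the first-stage estimates
    """
    wrong_sign = beta * expected_sign < 0
    weak = abs(t_stat) < 1.0 and abs(beta) < 0.1 and not needed_as_instrument
    return weak or wrong_sign

print(should_trim(0.5, 0.05, +1))                             # weak path
print(should_trim(0.5, 0.05, +1, needed_as_instrument=True))  # retained
print(should_trim(2.5, -0.30, +1))                            # wrong sign
```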

In addition, some paths whose statistics indicated they should
be trimmed were kept if the sign of the path coefficients (β) was correct
and the variables were very important causally in the path model. Finally,
the discovery of β > 1.0 in a few model equations indicated correlations
between the independent variables to the extent that some paths were redundant.
In these instances, simple correlations between the various path
coefficients were computed and the most highly correlated variable was
trimmed. Correlation coefficients were calculated from the variance-covariance
matrix of the estimated coefficients by dividing the covariance of the
coefficients of the two variables in question by the product of their individual
standard deviations. A detailed discussion of the trimming process
performed on each model equation is given in the next section.
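
The correlation calculation described above amounts to the following. The
2 x 2 variance-covariance matrix in the example is hypothetical and was
chosen only so that the result is easy to verify by hand.

```python
import math

def coefficient_correlation(cov, i, j):
    """Correlation between estimated coefficients i and j, computed from
    their variance-covariance matrix: cov_ij / (sd_i * sd_j)."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])

# Hypothetical variance-covariance matrix of two estimated coefficients.
cov = [
    [4.00, -2.76],
    [-2.76, 2.25],
]

r = coefficient_correlation(cov, 0, 1)
print(round(r, 2))  # -0.92
```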

2. Discussion of Individual Equations

One result of the first path analysis was that path coefficients
for the variables VACACR and DELEMP consistently appeared in the model equa-
tions with either an unexpected sign or too low a t or F statistic. There-
fore, the original specifications of these variables were examined. It was
discovered that VACACR did not exclude vacant developable acreage in the
major project, as intended. Also, examination of data for the DELEMP variable
indicated that absolute, rather than percentage, employment growth would be
a more appropriate predictor of land use development. Therefore, these
variables were redefined, as shown on Table 6-7, for the second and subse-
quent path analyses.

The instrumental variables used for first stage estimates in
all analyses of the Industrial/Office model equations are those listed on
Table 6-6. For the Residential model, however, the elimination of all feed-
back loops in the MANF and COMM equations, and theory trimming in the HWLMNX
equation, eliminated five instrumental variables (MANACR, RRMI, AUTO, EMPACR,
and MINCC) from the list of those used in the first analysis (Table 6-6).
-------
TABLE 6-7

REDEFINITION OF PATH ANALYSIS MODEL DATA TRANSFORMATIONS

Variable

15. VACACR = vacant developable acreage in area of
             influence in 1970, excluding major
             project

    VACACR = 10,000 - vacund - mpdev - mpund

    where: vacund = vacant undevelopable acreage in      4    38-41
                    area of influence in year (t*0)

           mpdev  = major project size (acreage)         1    4-8
           mpund  = undeveloped acreage in area of       1    9-12
                    influence in 1970 under control
                    of the major project developer

32. DELEMP = total regional employment growth per
             acre between 1960 and 1970 (county data)

    DELEMP = (e70cty - e60cty)/ac60cy

    where: e70cty = total employment in county           5    11-17
                    in 1970
           e60cty = total employment in county           5    4-10
                    in 1960
           ac60cy = county acreage in 1960               5    18-24

See Table 6-1 for original definitions of all variables.
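
The two redefinitions can be written as simple functions. The input values
in the example are hypothetical; only the formulas and the 10,000-acre total
come from the table.

```python
def vacacr(vacund, mpdev, mpund, total_acres=10_000.0):
    """Vacant developable acreage in the area of influence,
    excluding the major project (all arguments in acres)."""
    return total_acres - vacund - mpdev - mpund

def delemp(e70cty, e60cty, ac60cy):
    """Absolute county employment growth, 1960 to 1970, per county acre."""
    return (e70cty - e60cty) / ac60cy

# Hypothetical inputs for a single case study.
print(vacacr(vacund=2500.0, mpdev=400.0, mpund=100.0))               # 7000.0
print(delemp(e70cty=120_000.0, e60cty=100_000.0, ac60cy=400_000.0))  # 0.05
```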
-------
Note that since a path from MANF to OFFICE remains in the final model, MANF
had to be added in as an instrumental variable for the two-stage least
squares in the Residential model equations. This reduced set of instrumental
variables was used for first stage estimates in the second and subsequent
analyses of the Residential model equations.

a. Residential Model

(1) RES Equation. The initial path analysis involved a two
stage least squares with the following variables:

Endogenous    Exogenous
OFFICE        MPR70
MANF          DUACRE
HWLMNX        VACACR

Based on the second and third analyses, VACACR and MANF were trimmed (low β
and t-statistics), respectively. Although the path coefficient for OFFICE
subsequently dropped below 0.1 in absolute value, this variable was kept for
its causal importance in the model. The final equation became:

RES = -1.38 OFFICE + 168 HWLMNX - 0.808 MPR70 + 3930 DUACRE + 2730
RES = -0.09 OFFICE + 0.72 HWLMNX - 0.27 MPR70 + 0.30 DUACRE*

R² = 0.773
Adjusted R² = 0.730

* In the second equation of each pair the variables are standardized and
the associated coefficients are path coefficients (β).

The coefficients for OFFICE and MPR70 are unexpectedly
negative. It is possible that these negative signs are due to the finite
size of the area of influence. In the case of the first variable it can be
argued that office development has priority over and thus uses up land available
for residential development. Alternatively, the area-wide changes in
zoning preceding office development may preclude nearby residential development.
In the case of MPR70, it can be argued that a large residential major
project has supplied a good portion of the residential development demand
in the area of influence, using up the finite land available. It was hoped
that the variable VACACR would control for these problems by representing
developable land not in the major project, but unfortunately, it performed
poorly in the equation and was dropped.

(2) COMM Equation. The initial path analysis involved a
two-stage least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
OFFICE        HWYINT
MANF          MINCC
HWLMNX        DELEMP

Based on the first analysis, MANF was trimmed (low β and t-statistics) and
HWYINT was trimmed (wrong sign of β). Subsequently, the loss of the path
from COMM to HWLMNX (see Section VI.E.2.a.5) removed the only feedback loop
in this equation. Thus, the second analysis involved ordinary least squares
techniques. Based on the second analysis, MINCC was trimmed (low β and
t-statistics) and the final equation became:

COMM = 0.0814 RES + 0.649 OFFICE + 18.4 HWLMNX + 0.0976 MPR70
       - 692 DELEMP - 139
COMM = 0.49 RES + 0.25 OFFICE + 0.47 HWLMNX + 0.20 MPR70
       - 0.16 DELEMP

R² = 0.860
Adjusted R² = 0.823

The coefficient for DELEMP is unexpectedly negative.
However, it can be argued it is indicating that areas that experience a large
growth in employment from 1960-1970 were less developed areas (those with
less total commercial land use in 1970), while older urban areas (those with
the greatest total commercial land use in 1970) experienced the lowest growth
in employment from 1960-1970.

(3) OFFICE Equation. The initial path analysis involved a
two-stage least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
COMM          VACACR
MANF          HWYINT
HWLMNX        OFFVAC
              OFFACR
              DISCBD
              DELEMP

Based on the first analysis, RES and MPR70 were trimmed (low β and
t-statistics). Correlations between path coefficients were computed because of
the existence of β values > 1.0 for the variables COMM, HWLMNX, and OFFACR.
COMM was found to have a correlation with HWLMNX of -0.92, and with OFFACR of
0.78. Based on these results, COMM was trimmed to eliminate a seemingly
redundant path. Based on the second analysis, HWLMNX was trimmed (low β and
t-statistics) and COMM was added back into the model. Based on the third
analysis, HWYINT was trimmed (low β and t-statistics). This analysis also
showed COMM to be insignificant (low β and t-statistics); however, it was
not trimmed since to do so would have removed the OFFICE equation entirely
from the five-equation system of feedback loops. Also, it was not clear
whether COMM and HWLMNX were doing so poorly in the model because the feedback
loops with RES were lost (in the first trimming) or whether the redefinition
of the instrumental variables VACACR and DELEMP affected the first-stage
estimates. Therefore, for the fourth analysis, HWLMNX was added back
in so that all original endogenous variables, except RES, were present. It
was not possible to rerun the original OFFICE equation at this point with all
four endogenous variables since the loss of four instrumental variables in
the Residential model system (due to trimming of other equations after the
first analysis) would have caused the equation to be under-identified and
hence unsolvable by two-stage least squares techniques.

The results of the fourth analysis were disappointing,
with COMM and HWLMNX having path coefficients of β = 0.09 and -0.09,
respectively, and low t-statistics. In addition, the β value for MANF was nearly
1.0, indicating a correlation between paths. Correlations between the path
coefficients for the three endogenous variables were found to be:

r(MANF, COMM)   = 0.86
r(MANF, HWLMNX) = (value missing)
r(COMM, HWLMNX) = 0.98

These results indicate a certain degree of redundancy is present between all
three endogenous variables, and again, trimming of COMM would eliminate the
most redundancy. Thus, in succeeding analyses all combinations of the three
endogenous variables, RES, MANF and HWLMNX, were tried. The results indicate
that the presence of all three of these endogenous variables allows for the
highest R². Thus, the final equation became:

OFFICE = -0.0319 RES + 5.74 HWLMNX + 0.0572 MANF - 0.0127 VACACR
         - 5.26 OFFVAC + 690 OFFACR - 15.4 DISCBD + 765 DELEMP + 421
OFFICE = -0.47 RES + 0.36 HWLMNX + 0.21 MANF - 0.10 VACACR
         - 0.07 OFFVAC + 0.57 OFFACR - 0.28 DISCBD + 0.46 DELEMP

R² = 0.812
Adjusted R² = 0.702

Unexpected negative signs appear on the path coefficients for RES and VACACR.
In the case of RES, the same argument given for the OFFICE to RES path (see
Section VI.E.2.a.1) can be put forth. The negative sign for VACACR is
unexplained. This variable was not trimmed, however, because of its importance
as an instrumental variable. Although the β value for the variable
OFFVAC is below 0.1 in absolute value, the sign of the path coefficient is
correct. This variable was also not trimmed because of its importance as
an instrumental variable.

(4) MANF Equation. The initial path analysis involved a
two-stage least squares with the following variables:

Endogenous    Exogenous
HWLMNX        MPR70
              HWYINT
              RRMI
              MANACR
              DELEMP

Based on the first analysis, HWLMNX was trimmed (low β and t-statistics),
thus severing the only feedback loop with the simultaneous block of equations
in the Residential model. The second and subsequent path analyses,
therefore, involved ordinary least squares analysis. Based on the second
analysis, RRMI was trimmed (low β and t-statistics). The third analysis
indicated a path coefficient for DELEMP that was slightly below 0.1 in absolute
value and with an unexpected sign (β = -0.09). The R² for this equation
was very low (0.085) and so it was decided to re-analyze the original equation
(without HWLMNX) using stepwise multiple regression techniques. In
addition to the five exogenous variables shown, the variable ENERGY was
added into the equation. This variable was trimmed prior to the first path
analysis to allow for sufficient degrees of freedom in the first stage
estimates of the two-stage least squares (see Section VI.C.4). The use of
ordinary least squares techniques removed this constraint. Based on the
fourth analysis, DELEMP was trimmed (low β and F statistics) and the final
equation became:

MANF = 1470 MANACR + 316 HWYINT + 4100 ENERGY - 0.176 MPR70 + 79.8 RRMI
       - 3490
MANF = [illegible] MANACR + 0.33 HWYINT + 0.53 ENERGY - 0.25 MPR70 + 0.24 RRMI

R² = 0.407
Adjusted R² = 0.248

(5) HWLMNX Equation. The initial path analysis involved a
two-stage least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
COMM          HWYINT
OFFICE        DISCBD
MANF          EMPACR
              AUTO

Based on the first analysis, MPR70 and AUTO were trimmed (low β and
t-statistics) and EMPACR was trimmed (wrong sign for coefficient and not a
significant instrumental variable). Based on the second analysis, OFFICE and
MANF were trimmed (low β and t-statistics). Based on the third analysis, COMM
was trimmed (low β and t-statistics), and the final equation became:

HWLMNX = 0.00247 RES - 3.31 HWYINT - 1.73 DISCBD + 47.9
HWLMNX = 0.58 RES - 0.20 HWYINT - 0.49 DISCBD

R² = 0.742
Adjusted R² = 0.712

The coefficient for HWYINT is unexpectedly negative.
Since HWYINT is an indicator of expressway or highway lane miles, while
HWLMNX is defined as non-expressway highway lane miles, it can be argued that
areas with an extensive limited access highway network (greatest HWYINT) have
less non-expressway highway lane miles, and vice versa.

(6) WHOLE Equation. The initial path analysis involved an
ordinary least squares with the following variables:

Endogenous    Exogenous
COMM          DELEMP
MANF          HWYINT
HWLMNX        RRMI
              WWEA

The results of the first analysis revealed both COMM and HWLMNX with β values
> 1.0. As a result, HWLMNX was trimmed to eliminate a redundant path. Stepwise
multiple regression techniques were used on the second analysis, and
based on the results, the variables COMM, DELEMP, RRMI, and WWEA were all
trimmed (low β and F statistics). The final equation became:

WHOLE = 60.4 HWYINT + 0.0608 MANF + 26.5
WHOLE = 0.43 HWYINT + 0.41 MANF

R² = 0.421
Adjusted R² = 0.389

(7) HOTEL Equation. The initial path analysis involved an
ordinary least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
OFFICE        DISCBD
MANF          EMPACR

Based on the first analysis, EMPACR was trimmed (wrong sign of β). Stepwise
multiple regression techniques were used in the second analysis, and based
on the results, the variables OFFICE, MPR70, and DISCBD were all trimmed
(low β and t-statistics). The final equation became:

HOTEL = 0.00151 RES + 0.0140 MANF + 49.3
HOTEL = 0.15 RES + 0.33 MANF

R² = 0.144
Adjusted R² = 0.096

(8) HOSPTL Equation. The initial path analysis involved an
ordinary least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
              DISCBD
              NONHSE

Based on the first analysis, DISCBD (low β and F statistics) and NONHSE (wrong
sign of β) were trimmed. Based on the second analysis, the final equation
became:

HOSPTL = 0.0106 RES + 0.0246 MPR70 - 61.6
HOSPTL = 0.59 RES + 0.47 MPR70

R² = 0.348
Adjusted R² = 0.312

(9) CULTUR Equation. The initial path analysis involved
an ordinary least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
              DISCBD

The results of the first analysis revealed an extremely low adjusted R²
(actually below zero) indicating the equation is a very poor predictor of
cultural facilities. Stepwise multiple regression techniques failed to produce
more significant results. Although the β for the variable RES was below 0.1,
this variable was not trimmed because it was felt to be an important causal
path. The final equation is:

CULTUR = 0.00014 RES + 0.00154 MPR70 - 0.447 DISCBD + 19.5
CULTUR = 0.04 RES + 0.15 MPR70 - 0.15 DISCBD

R² = 0.050
Adjusted R² = -0.061

(10) CHURCH Equation. The initial path analysis involved
an ordinary least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
              MPKIDS

Based on the first analysis, MPKIDS was trimmed (low β and t-statistics),
and the final equation became:

CHURCH = 0.0134 RES + 0.00716 MPR70 + 13.3
CHURCH = 0.72 RES + 0.13 MPR70

R² = 0.456
Adjusted R² = 0.425

(11) EDUC Equation. The initial path analysis involved an
ordinary least squares with the following variables:

Endogenous    Exogenous
RES           MPR70
              MPKIDS
              ENRACR

Based on the first analysis, ENRACR was trimmed (wrong sign of β), and the
final equation became:

EDUC = 0.0392 RES + 209 MPKIDS + 0.0203 MPR70 - 137
EDUC = 0.57 RES + 0.37 MPKIDS + 0.10 MPR70

R² = 0.481
Adjusted R² = 0.420

(12) REC Equation. The initial path analysis involved an
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           INCMP
              MPR70
              MINCC

Based on the first analysis, RES and MPR70 were trimmed (low β and F
statistics), and the final equation became:

REC = 329 INCMP - 563 MINCC + 826
REC = 0.45 INCMP - 0.39 MINCC

R² = 0.362
Adjusted R² = 0.299

An unexpected negative sign appears on the coefficient for MINCC. It can be
argued, however, that in areas with high income, families have more expensive
recreational activities that take them out of their immediate area
(e.g., summer house, overseas travel, ski vacations in the mountains, etc.),
while in areas of moderate income, there is a greater density of nearby
recreational facilities. None of the case studies in the sample included
low income areas.

b. Industrial/Office Model

(1) RES Equation. The initial path analysis involved a two
stage least squares with the following variables:

Endogenous    Exogenous
OFFICE        MPET2
MANF          MPE70
HWLMNX        DUACRE
              VACACR

Based on the first analysis, MANF was trimmed (wrong sign of β). Based on
the second analysis, HWLMNX was trimmed (low β and t-statistics), and the
final equation became:

RES = 3.96 OFFICE - 0.859 MPET2 + 0.392 MPE70 + 4560 DUACRE
      + 1.05 VACACR - 4860
RES = 0.13 OFFICE - 0.31 MPET2 + 0.19 MPE70 + 0.57 DUACRE
      + 0.32 VACACR

R² = 0.600
Adjusted R² = 0.493

The one unexpected sign in the path coefficients is for the variable MPET2.
First it is important to note that two major project variables (MPET2 and
MPE70) appear in the equation. It can be argued that MPE70 is already controlling
for the final size of the major project in the equation, and that
the coefficient for MPET2 indicates a secondary effect on RES. It is
interesting to note that the year t+2 coincides in most cases with the 1960-
1961 recession in the U.S., see Table 6-8. Unlike most years, during the
period 1960-1961 a large initial growth rate in a major project would not
have induced as much of the other types of land use development due to the
economic stagnation accompanying a recession. This hypothesis indicates
one area where a dynamic, rather than a static, model may have aided in the
analysis.

(2) COMM Equation. The initial path analysis involved a
two stage least squares with the following variables:

Endogenous    Exogenous
RES           MPET2
OFFICE        MPE70
MANF          HWYINT
HWLMNX        MINCC
              DELEMP

Based on the first analysis, HWLMNX (low β and t-statistics) was trimmed.
Although MINCC also had low β and t-statistics, it was not trimmed because
of its efficacy as an instrumental variable. Based on the second analysis,
OFFICE and MPET2 were trimmed (low β and t-statistics). The results of the
third analysis showed the |β| value for HWYINT to be slightly below 0.1, but
this variable was not trimmed because it was felt it was important causally
in the equation. Thus, the final equation became:

COMM = 0.0785 RES + 0.413 MANF + 0.0367 MPE70 + 39.1 HWYINT
       - 2270 DELEMP - 290 MINCC - 45.2
COMM = 0.50 RES + 0.74 MANF + 0.11 MPE70 + 0.09 HWYINT
       - 0.25 DELEMP - 0.09 MINCC

R² = 0.794
Adjusted R² = 0.726
-------
TABLE 6-8

GROSS PRIVATE DOMESTIC INVESTMENT IN THE UNITED STATES,
1955-1973, IN BILLIONS OF 1958 DOLLARS*

        Residential    Non-Residential
Year    Structures     Structures         Total**
1955    25.1           16.2                69.0
1956    22.2           18.5                69.5
1957    20.2           18.2                67.6
1958    20.8           16.6                62.4
1959    24.7           16.2                68.8
1960    21.9           17.4                68.9
1961    21.6           17.4                67.0
1962    23.8           17.9                73.4
1963    24.8           17.9                76.7
1964    24.2           19.1                81.9
1965    23.8           22.3                90.1
1966    21.3           24.0                95.4
1967    20.4           22.6                93.5
1968    23.2           23.4                98.8
1969    23.7           24.3               103.8
1970    22.2           23.7                99.5
1971    29.0           22.5               105.0
1972    34.6           23.0               118.3
1973    34.0           24.8               126.6

* Source: Department of Commerce, Bureau of Economic Analysis

** Includes change in Business Inventories and Fixed Investment in Residential
Structures, Non-Residential Structures and Producers' Durable Equipment
-------
Besides the previously discussed MPET2, the only variable with an unexpected
sign on its coefficient was DELEMP. In this instance, it can be argued that
the greatest growth in employment between 1960 and 1970 occurred in largely
undeveloped areas which, even in 1970, probably have less total commercial
land use than older urban areas (a bias towards low initial levels).

(3) OFFICE Equation. The initial path analysis involved
two stage least squares with the following variables:

Endogenous    Exogenous
RES           MPET2
COMM          MPE70
MANF          VACACR
HWLMNX        HWYINT
              OFFVAC
              OFFACR
              DISCBD
              DELEMP

The results of the first analysis revealed |β| values > 1.0 for three
variables: RES, COMM, and MANF. An examination of the correlations between
the regression coefficients indicated COMM was the most highly correlated
variable. It was, therefore, trimmed to eliminate a redundant path. Based
on the second and third analyses, the variables OFFVAC and HWLMNX were
trimmed (wrong sign of β), respectively. Based on the fourth analysis,
VACACR and HWYINT were trimmed (low β and t-statistics), and the final
equation became:

OFFICE = 0.0110 RES + 0.0178 MANF - 0.0199 MPE70 + 0.0195 MPET2
         - 285 OFFACR - 8.11 DISCBD + 902 DELEMP + 174
OFFICE = 0.34 RES + 0.15 MANF - 0.30 MPE70 + 0.21 MPET2
         - 0.50 OFFACR - 0.31 DISCBD + 0.47 DELEMP

R² = 0.462
Adjusted R² = 0.214

Examination of the statistical output for the final equation reveals unexpected
signs on the coefficients for the variables OFFACR, MPET2, and MPE70.
OFFACR was not trimmed, even though it was felt its coefficient should not
be negative, because of its importance as an instrumental variable. The
signs for MPET2 and MPE70 appear to be reversed from the situation discussed
previously in the RES equation. Again we believe that MPE70 is controlling
for the final size of the major project in the equation. The negative sign
on MPE70 can be explained by the argument that where the Industrial/Office
major project is large, the presence of this development discouraged other
office development in the same area, using up most of the available land
(zoned for office) and capital resources. As before, the variable MPET2
can be thought of as controlling secondary effects. We were unable, however,
to formulate a reason why the coefficient was not negative, as in the RES
equation.

(4) MANF Equation. The initial path analysis involved two
stage least squares of the following variables:

Endogenous    Exogenous
HWLMNX        MPET2
              MPE70
              HWYINT
              RRMI
              MANACR
              DELEMP

Based on the first analysis, HWYINT (low β and t-statistics) was trimmed.
The results of the second analysis indicated low β and t-statistics for
HWLMNX, but this variable was kept because of its importance causally in the
model. Thus, the final equation became:

MANF = 4.55 HWLMNX + 0.124 MPET2 - 0.0974 MPE70 + 55.6 RRMI
       + 8690 DELEMP + 1020 MANACR - 22.5
MANF = 0.06 HWLMNX + 0.16 MPET2 - 0.16 MPE70 + 0.19 RRMI
       + 0.51 DELEMP + 0.16 MANACR

R² = 0.426
Adjusted R² = 0.221
-------
Note the unexpected signs on the coefficients for MPET2 and MPE70 indicate
the same situation is occurring as was discussed previously for the OFFICE
equation.

(5) HWLMNX Equation. The initial path analysis involved
two stage least squares of the following variables:

Endogenous    Exogenous
RES           MPET2
COMM          MPE70
OFFICE        HWYINT
MANF          DISCBD
              AUTO

Based on the first, second, and third analyses, MPET2, MANF, and MPE70 were
trimmed (low β and t-statistics), respectively. The variable DISCBD was
not trimmed as a result of low β and t-statistics in the fourth analysis
because of its importance causally in the model. Thus, the final equation
became:

HWLMNX = 0.00178 RES - 0.0209 OFFICE + 0.00980 COMM - 4.59 HWYINT
         - 0.246 DISCBD + 32.6 AUTO + 14.3
HWLMNX = 0.45 RES - 0.17 OFFICE + 0.39 COMM - 0.42 HWYINT
         - 0.08 DISCBD + 0.37 AUTO

R² = 0.682
Adjusted R² = 0.568

The coefficient for HWYINT is unexpectedly negative. Since HWYINT is an
indicator of the presence or absence of expressway or interstate highways,
while HWLMNX is defined as non-expressway highway lane miles, it can be
argued that areas with an extensive interstate highway network (greatest
HWYINT) have less non-expressway highway lane miles, and vice-versa. The
coefficient for OFFICE is also unexpectedly negative. However, a reasonable
argument for this variable could not be developed.

(6) WHOLE Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
COMM          HWYINT
MANF          RRMI
HWLMNX        WWEA
              DELEMP

Based on the first analysis, MANF was trimmed (wrong sign of β). Based
on the second, COMM and DELEMP were trimmed (low β and F-statistics) and
the final equation became:

WHOLE = 10,500 WWEA + 35.7 HWLMNX + 29.0 RRMI + 64.6 HWYINT - 1,120
WHOLE = 0.38 WWEA + 0.51 HWLMNX + 0.10 RRMI + 0.08 HWYINT

R² = 0.615
Adjusted R² = 0.543

The variable HWYINT was not trimmed, despite its low β value, because of its
causal importance in the model.

(7) HOTEL Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           MPE70
OFFICE        DISCBD
MANF          EMPACR
              MPET2

Based on the first analysis, RES (low β and F-statistics) and EMPACR (wrong
sign of β) were trimmed, and the final equation became:

HOTEL = 0.0524 MANF + 0.105 OFFICE + 0.0182 MPE70 - 0.0156 MPET2
        - 4.04 DISCBD - 26.0
HOTEL = 0.65 MANF + 0.15 OFFICE + 0.37 MPE70 - 0.24 MPET2
        - 0.21 DISCBD

R² = 0.699
Adjusted R² = 0.619
-------
The opposite signs of the coefficients for MPE70 and MPET2 indicate the situ-
ation discussed previously for the RES equation is occurring.

(8) HOSPTL Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           MPE70
              DISCBD
              NONHSE
              MPET2

Based on the first analysis, MPE70 (wrong sign of β) and MPET2 (low β and
F-statistics) were trimmed. Based on the second analysis, RES was trimmed (low
β and F-statistics) and the final equation became:

HOSPTL = 508 NONHSE - 5.66 DISCBD + 108
HOSPTL = 0.41 NONHSE - 0.19 DISCBD

R² = 0.269
Adjusted R² = 0.228

(9) CULTUR Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           DISCBD

The results of the first analysis indicate both of these variables should
be trimmed. However, to do so would totally eliminate the equation. Thus,
the final form is:

CULTUR = 0.0023 RES - 0.249 DISCBD + 13.6
CULTUR = 0.06 RES - 0.09 DISCBD

R² = 0.015
Adjusted R² = -0.040

The extremely low R² (the adjusted R² is actually below zero) indicates that,
similar to the CULTUR equation in the Residential model, this equation is a
very poor predictor of cultural facilities.

(10) CHURCH Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous
RES

The final equation is:

CHURCH = 0.00513 RES + 51.1
CHURCH = 0.34 RES

R² = 0.119
Adjusted R² = 0.119

(11) EDUC Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           ENRACR

Based on the first analysis, ENRACR was trimmed (wrong sign of β), and the
final equation became:

EDUC = 0.0405 RES + 304
EDUC = 0.48 RES

R² = 0.227
Adjusted R² = 0.227

(12) REC Equation. The initial path analysis involved
ordinary least squares of the following variables:

Endogenous    Exogenous
RES           MINCC

The final equation is:

REC = 0.0149 RES + 242 MINCC - 184
REC = 0.37 RES + 0.28 MINCC

R² = 0.239
Adjusted R² = 0.176

3. Summary of Final Model Equations in Unstandardized Form

a. English Units

(1) Residential Model

RES    = - 1.38 OFFICE + 168 HWLMNX - 0.808 MPR70 + 3930 DUACRE + 2730
COMM   = 0.0814 RES + 0.649 OFFICE + 18.4 HWLMNX + 0.0976 MPR70 - 692
         DELEMP - 139
OFFICE = - 0.0319 RES + 5.74 HWLMNX + 0.0572 MANF - 0.0127 VACACR
         - 5.26 OFFVAC + 690 OFFACR - 15.4 DISCBD + 765 DELEMP + 421
MANF   = 1470 MANACR + 316 HWYINT + 4100 ENERGY - 0.176 MPR70 + 79.8 RRMI
         - 3940
HWLMNX = 0.00247 RES - 3.31 HWYINT - 1.73 DISCBD + 47.9
WHOLE  = 60.4 HWYINT + 0.0608 MANF + 26.5
HOTEL  = 0.00151 RES + 0.0140 MANF + 49.3
HOSPTL = 0.0106 RES + 0.0246 MPR70 - 61.6
CULTUR = 0.00014 RES + 0.00154 MPR70 - 0.447 DISCBD + 19.5
CHURCH = 0.0134 RES + 0.00716 MPR70 + 13.3
EDUC   = 0.0392 RES + 209 MPKIDS + 0.0203 MPR70 - 137
REC    = 329 INCMP - 563 MINCC + 826

(2) Industrial/Office Model

RES    = 3.96 OFFICE - 0.859 MPET2 + 0.392 MPE70 + 4560 DUACRE + 1.05 VACACR
         - 4860
COMM   = 0.0785 RES + 0.413 MANF + 0.0367 MPE70 + 39.1 HWYINT - 2270 DELEMP
         - 290 MINCC - 45.2
OFFICE = 0.0110 RES + 0.0178 MANF - 0.0199 MPE70 + 0.0195 MPET2
         - 285 OFFACR - 8.11 DISCBD + 902 DELEMP + 174
MANF   = 4.55 HWLMNX + 0.124 MPET2 - 0.0974 MPE70 + 55.6 RRMI + 8690
         DELEMP + 1020 MANACR - 22.5
HWLMNX = 0.00178 RES - 0.0209 OFFICE + 0.00980 COMM - 4.59 HWYINT
         - 0.246 DISCBD + 32.6 AUTO + 14.3
WHOLE  = 10,500 WWEA + 34.7 HWLMNX + 29.0 RRMI + 64.6 HWYINT - 1,120
HOTEL  = 0.0524 MANF + 0.105 OFFICE + 0.0182 MPE70 - 0.0156 MPET2
         - 4.04 DISCBD - 26.0
HOSPTL = 508 NONHSE - 5.66 DISCBD + 108
CULTUR = 0.0023 RES - 0.249 DISCBD + 13.6
CHURCH = 0.00513 RES + 51.1
EDUC   = 0.0405 RES + 304
REC    = 0.0149 RES + 242 MINCC - 184

b. Metric Units

As can be seen in Tables 6-1 and 6-5, the model variables
were defined and the path analyses carried out in English units. Thus, the
path regression coefficients given for each equation in the previous section
reflect this fact. The path diagrams in Section VI.D, however, display the
path coefficients (β), which are independent of the units chosen.

This section summarizes the final model equations in metric units. Units of
m² are used in place of 1,000 ft² for the variables RES, COMM, OFFICE, MANF,
WHOLE, HOTEL, HOSPTL, CULTUR, CHURCH, and EDUC. The variables REC and VACACR
also use m², but in place of acres. Distances in miles in HWLMNX, DISCBD, and
RRMI are converted to km. Finally, the units of acre⁻¹ are replaced by m⁻² in
the variables DUACRE, OFFACR, WWEA, EMPACR, NONHSE, ENRACR, MANACR, DELEMP,
and AUTO. The conversion factors used were:

1,000 ft²  ->  m²     multiply by 92.90
acres      ->  m²     multiply by 4,047
miles      ->  km     multiply by 1.609
acre⁻¹     ->  m⁻²    divide by 4,047
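
The rescaling of a regression coefficient under these unit changes can be sketched as follows. This is only an illustration, assuming each coefficient carries units of (dependent variable units) per (independent variable unit); the function name `convert_coefficient` and the constant names are hypothetical and do not come from the report.

```python
# Conversion factors quoted in the text (English -> metric).
FT2K_TO_M2 = 92.90    # 1,000 ft^2 -> m^2
ACRE_TO_M2 = 4047.0   # acres -> m^2
MILE_TO_KM = 1.609    # miles -> km


def convert_coefficient(b, dep_factor, indep_factor):
    """Rescale coefficient b when the dependent variable is multiplied by
    dep_factor and the independent variable by indep_factor."""
    return b * dep_factor / indep_factor


# RES (1,000 ft^2) regressed on HWLMNX (miles): 168 becomes about 9,700.
res_hwlmnx = convert_coefficient(168, FT2K_TO_M2, MILE_TO_KM)

# OFFICE (1,000 ft^2) regressed on VACACR (acres): -0.0127 becomes about -0.000292.
office_vacacr = convert_coefficient(-0.0127, FT2K_TO_M2, ACRE_TO_M2)

# RES regressed on DUACRE (acre^-1 -> m^-2 means the variable is divided by 4,047):
# 3930 becomes about 1.48e9.
res_duacre = convert_coefficient(3930, FT2K_TO_M2, 1.0 / ACRE_TO_M2)

print(res_hwlmnx, office_vacacr, res_duacre)
```

The three sample values agree, to three significant digits, with the corresponding metric coefficients listed in the equations that follow.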
(1) Residential Model

RES = - 1.38 OFFICE + 9,700 HWLMNX - 75.1 MPR70 + 1,480,000,000 DUACRE
+ 254,000
COMM = 0.0814 RES + 0.649 OFFICE + 1,060 HWLMNX + 9.07 MPR70
- 260,000,000 DELEMP - 12,900
OFFICE = - 0.0319 RES + 331 HWLMNX + 0.0572 MANF - 0.000292 VACACR
- 489 OFFVAC + 260,000,000 OFFACR - 889 DISCBD + 288,000,000 DELEMP
+ 39,100
MANF = 553,000,000 MANACR + 29,400 HWYINT + 381,000 ENERGY - 16.4 MPR70
+ 4,610 RRMI - 366,000
HWLMNX = 0.0000428 RES - 5.33 HWYINT - 1.73 DISCBD + 77.1
WHOLE = 5,610 HWYINT + 0.0608 MANF + 2,460
HOTEL = 0.00151 RES + 0.0140 MANF + 4,580
HOSPTL = 0.0106 RES + 2.29 MPR70 - 5,720
CULTUR = 0.00014 RES + 0.143 MPR70 - 25.8 DISCBD + 1,810
CHURCH = 0.0134 RES + 0.665 MPR70 + 1,240
EDUC = 0.0392 RES + 19,400 MPKIDS +1.89 MPR70 - 12,700
REC = 1,330,000 INCMP - 2,280,000 MINCC + 3,340,000

(2) Industrial/Office Model

RES = 3.96 OFFICE - 79.8 MPET2 + 36.4 MPE70 + 1,710,000,000 DUACRE
+ 0.0241 VACACR - 451,000
COMM = 0.0785 RES + 0.413 MANF + 3.41 MPE70 + 3,630 HWYINT
- 853,000,000 DELEMP - 26,900 MINCC - 4,200
OFFICE = 0.0110 RES + 0.0178 MANF - 1.85 MPE70 + 1.81 MPET2
- 107,000,000 OFFACR - 468 DISCBD + 339,000,000 DELEMP + 16,200
MANF = 263 HWLMNX + 11.5 MPET2 - 9.05 MPE70 + 3,210 RRMI
+ 3,270,000,000 DELEMP +383,000,000 MANACR - 2,090
HWLMNX = 0.0000308 RES - 0.000362 OFFICE + 0.000170 COMM - 7.39 HWYINT
- 0.246 DISCBD + 212,000 AUTO +23.0
WHOLE = 3,948,000,000 WWEA + 2,000 HWLMNX + 1,670 RRMI + 6,000 HWYINT
- 104,000
HOTEL = 0.0524 MANF + 0.105 OFFICE + 1.69 MPE70 - 1.45 MPET2 - 233 DISCBD
- 2,420
HOSPTL = 191,000,000 NONHSE - 327 DISCBD + 10,000
CULTUR = 0.0023 RES - 14.4 DISCBD + 1,260
CHURCH = 0.00513 RES + 4,750
EDUC = 0.0405 RES + 28,200
REC = 0.649 RES + 979,000 MINCC - 745,000

F. COMPARISON OF MODEL FORMS

The analyses presented so far in this report have assumed a linear
form for the model equations, e.g.

RES = b1 OFFICE + b2 HWLMNX + b3 MPR70 + b4 DUACRE + c

This form assumes the independent effects of the model variables on RES are
additive. An alternative assumption is that the effects are multiplicative,
e.g.

RES = c * OFFICE^b1 * HWLMNX^b2 * MPR70^b3 * DUACRE^b4

In order to test this model form, the final path analysis was rerun using
the natural logarithms of all data values in the Residential and Industrial/
Office data files. Path diagrams for the multiplicative form are shown in
Figures 6-9 and 6-10. Notice that the R² for the OFFICE equation in the
Residential model is negative (Figure 6-9). This probably indicates that the
sufficient condition for two-stage least squares is not satisfied (see
Section III.D) or that some other singularity has been introduced into the
problem when the equation is analyzed in multiplicative form.
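
Taking logarithms turns the multiplicative form into a linear one, log Y = log c + b1 log X1 + b2 log X2 + ..., which can then be fitted by least squares. The sketch below illustrates this with ordinary least squares on synthetic, noise-free data; the data, variable names, and the use of NumPy are assumptions for illustration only, and the report's own analysis used two-stage least squares on the actual land use data files.

```python
import numpy as np

# Synthetic data: 20 samples, two hypothetical independent variables.
rng = np.random.default_rng(0)
n = 20
X = rng.uniform(1.0, 10.0, size=(n, 2))

# Hypothetical "true" multiplicative model: Y = c * X1^b1 * X2^b2.
c_true = 2.0
b_true = np.array([0.5, 1.5])
y = c_true * X[:, 0] ** b_true[0] * X[:, 1] ** b_true[1]

# Regress log(Y) on log(X) with an intercept column for log(c).
A = np.column_stack([np.ones(n), np.log(X)])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

c_hat = np.exp(coef[0])   # back-transform the intercept to recover c
b_hat = coef[1:]          # the exponents are the log-log slopes

print(c_hat, b_hat)
```

Because the synthetic data contain no noise, the fitted constant and exponents reproduce the assumed values up to floating-point error.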
The resultant R² values for the multiplicative form were in general
much lower than those obtained using the additive form; however, in two cases
the multiplicative form did perform significantly better. In the Residential
model, the R² for the EDUC equation rose significantly from 0.481 (additive)
to 0.737 (multiplicative). The individual F-statistics and β weights for
each independent variable rose as well. Therefore, the multiplicative form
could be used for this particular equation. The final multiplicative equation
is:

EDUC = 0.234 * MPKIDS^1.87 * RES^0.157 * MPR70^0.651 (English Units)
EDUC = 0.00513 * MPKIDS^1.87 * RES^0.157 * MPR70^0.651 (Metric Units)

R² = 0.737
R²a = 0.706

[Figure 6-9 Path Analysis for the Residential Model in Multiplicative Form]

[Figure 6-10 Path Analysis for the Industrial-Office Model in Multiplicative Form]

In the second case, a higher R² was obtained for the HWLMNX equation
in the Industrial/Office model by using the multiplicative form. The R² rose
from 0.682 (additive) to 0.845 (multiplicative). Although this change also
represents a significant improvement for a particular equation, in the current
project the existence of feedback loops between HWLMNX and other endogenous
variables implies that all equations in the simultaneous block should be in
one form or the other. If some equations in the simultaneous block were in
additive form and some in multiplicative form, then an enlarged set of
independent variables (endogenous and instrumental) would be required in the
model equations. If a given variable appears both in an additive and a
multiplicative equation, then it would have to be represented by two variables
in the simultaneous block (e.g., HWLMNX and log (HWLMNX)) because of the
interconnections between equations. The additional independent variables would
create a degrees of freedom problem in many of the model equations, preventing
correct solution of the equations. The use of the multiplicative form
exclusively in the RES, COMM, OFFICE, and MANF equations would significantly
reduce the R² values in each of these equations. Therefore, it is recommended
that the additive form be retained in this case.

G. STABILITY OF PATH COEFFICIENTS

The usefulness of any path analytic model is dependent on the stabi-
lity of the path estimates. In the current study, where only a small data
sample (20) was available for analysis, the question of stability was especi-
ally important. There are at least two approaches to evaluating the stability
of model estimates.
One approach involves confidence intervals [24]. Given a model
equation:

$$ Y = \sum_{i=1}^{n} b_i X_i $$

confidence intervals can be specified in the form:

$$ Y \pm t \sqrt{V(Y)} $$

where t is the t-statistic for the regression of the model equation and V(Y)
is the variance of Y. V(Y) can be expressed as:

$$ V(Y) = \sum_{i=1}^{n} \sum_{j=1}^{n} X_i X_j \, \mathrm{Covariance}(b_i, b_j) $$

The confidence interval is thus a function of the actual data values, X_i X_j.
This approach was not used, as it does not, for our purposes, adequately
consider individual observations. All of the necessary data for computing
confidence intervals exist in Appendices C and D (published in the separate
appendix to this volume) [42].
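
Although this approach was not used in the study, the double sum for V(Y) is simply the quadratic form x'Cx, where C is the covariance matrix of the coefficients. A minimal sketch follows, assuming a linear model Y = sum of b_i X_i; the function names and all numerical values (including the t value) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def prediction_variance(x, cov_b):
    """V(Y) = sum_i sum_j X_i X_j Cov(b_i, b_j), written as x' C x."""
    x = np.asarray(x, dtype=float)
    cov_b = np.asarray(cov_b, dtype=float)
    return float(x @ cov_b @ x)


def confidence_interval(x, b, cov_b, t):
    """Return (lower, upper) bounds Y -/+ t * sqrt(V(Y)) for data values x."""
    y_hat = float(np.dot(x, b))
    half_width = t * np.sqrt(prediction_variance(x, cov_b))
    return y_hat - half_width, y_hat + half_width


# Hypothetical two-variable example (values are illustrative only).
x = [1.0, 2.0]
b = [3.0, 0.5]
cov_b = [[0.04, 0.01],
         [0.01, 0.09]]

lower, upper = confidence_interval(x, b, cov_b, t=2.0)
print(prediction_variance(x, cov_b), lower, upper)
```

Here V(Y) = 0.04 + 2(0.02) + 0.36 = 0.44, and the interval is centered on the predicted value of 4.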

The second approach, the statistical technique chosen to investigate
stability, is jackknifing. This procedure involved 20 separate analyses of
each model equation using only 19 data samples. For each analysis, a differ-
ent sample from the total set of 20 was excluded. This approach provided 20
separate estimates of each of the model coefficients, from which mean and
standard deviations, and one-way frequency distributions were prepared. Two
separate indices of coefficient stability were then developed for interpreting
the data. The jackknifing analysis was applied only to the five simultaneous
block equations in each model. Only these equations were analyzed, for
the following reasons:

• the five equations include the majority of the total land use,
  and almost all the significant emission sources.

• the jackknife, as employed here, is a relatively expensive procedure
  (e.g., twenty regressions are run for each equation).
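
The leave-one-out procedure described above can be sketched as follows. This is a simplified illustration that uses ordinary least squares on synthetic data as a stand-in; the study itself applied the jackknife to two-stage least squares estimates, and the function name `jackknife_coefficients` and the data are assumptions.

```python
import numpy as np


def jackknife_coefficients(X, y):
    """Refit the regression n times, each time excluding one sample.

    Returns an (n, k + 1) array: one row of coefficients (k slopes followed
    by the constant term) per excluded sample.
    """
    n = len(y)
    A = np.column_stack([X, np.ones(n)])   # append the constant term
    estimates = []
    for i in range(n):
        keep = np.arange(n) != i           # drop sample i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        estimates.append(coef)
    return np.array(estimates)


# Synthetic data set with 20 samples and two independent variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=20)

est = jackknife_coefficients(X, y)
mean = est.mean(axis=0)           # jackknifing mean of each coefficient
sd = est.std(axis=0, ddof=1)      # standard deviation across the 20 fits
print(mean, sd)
```

Each row of `est` corresponds to one of the 20 analyses run on 19 samples, so the column means and standard deviations play the role of the jackknifing columns in Table 6-9.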

In the jackknifing analysis, only the path regression coefficients
(b) were used. Path coefficients (β) do not provide a valid basis for
comparisons of systems operating on different data populations, since the
variance of the independent variables is included in the path coefficients and
variance can change dramatically from sample to sample [26].

The means and standard deviations computed from the set of 20 values
for each model coefficient are compared in Table 6-9 with the coefficient values
from the final path analysis. The results show the jackknifing results and
final path analysis path regression coefficients to be practically identical,
with differences between the two sets, in general, on the order of two to three
percent. The standard deviation of the jackknifing coefficients is, on the
average, 49 percent of the coefficient mean value, indicating appreciable
variance in the coefficient values, but no extreme instability. One-way
frequency distributions of the coefficient values are graphed in Appendix B for
the ten model equations in the two simultaneous blocks. To more accurately
quantify the stability of the path regression coefficients and to allow
comparisons between equations, two separate indices of stability were developed.

1. Extreme Value Index

The first stability index that was used is a statistical test
for extreme values in a sample developed by Dixon [41]. The null hypothesis
assumes the value farthest from the mean does not differ significantly from
the range of values in the data set, i.e., is not an extreme value.
Assumptions are made that the data are independent observations and normally
distributed. The coefficient values that were tested in the current
application, however, are a dependent set. The violation of this assumption
does not invalidate our use of the test statistic as a guide for judging
stability, but the corresponding levels of significance probably need
adjustment.

TABLE 6-9

COMPARISON OF PATH REGRESSION COEFFICIENTS FROM THE FINAL
PATH ANALYSIS AND THE JACKKNIFING STABILITY ANALYSIS FOR THE
SIMULTANEOUS BLOCK OF EQUATIONS

                              Path Regression Coefficients*
Model       Independent   Final Path    Jackknifing   Jackknifing
Equation    Variable      Analysis      Mean          Standard Deviation

Residential Model
RES         OFFICE        -1.38         -1.39         0.896
            HWLMNX        168           167           16.4
            MPR70         -0.808        -0.819        0.105
            DUACRE        3930          3860          798

COMM        RES           0.0814        0.0814        0.015
            OFFICE        0.649         0.652         0.200
            HWLMNX        18.4          18.3          4.42
            MPR70         0.0976        0.097         0.017
            DELEMP        -692          -677          154

OFFICE      RES           -0.0319       -0.030        0.011
            HWLMNX        5.74          5.86          6.71
            MANF          0.0572        0.063         0.051
            VACACR        -0.0127       -0.013        0.013
            OFFVAC        -5.26         -4.91         7.06
            OFFACR        690           724           276
            DISCBD        -15.4         -14.7         13.1
            DELEMP        765           738           197

MANF        MANACR        1470          1750          1260
            HWYINT        316           331           103
            ENERGY        4100          4110          891
            MPR70         -0.176        -0.176        0.068
            RRMI          79.8          81.0          23.8

HWLMNX      RES           0.00247       0.003         0.001
            HWYINT        -3.31         -3.20         1.13
            DISCBD        -1.73         -1.72         0.165

Industrial/Office Model
RES         OFFICE        3.96          4.43          2.35
            MPET2         -0.859        0.860         0.249
            MPE70         0.392         0.397         0.090
            DUACRE        4560          4570          714
            VACACR        1.05          1.033         0.247

COMM        RES           0.0785        0.077         0.006
            MANF          0.413         0.400         0.057
            MPE70         0.0367        0.036         0.011
            HWYINT        39.1          39.7          19.5
            DELEMP        -2270         -2180         697
            MINCC         -290          -297          178

OFFICE      RES           0.011         0.011         0.002
            MANF          0.0178        0.024         0.021
            MPET2         0.0199        0.020         0.008
            MPE70         -0.0195       -0.019        0.005
            OFFACR        -285          -282          61.0
            DISCBD        -8.11         -7.68         3.35
            DELEMP        902           821           371

MANF        HWLMNX        4.55          4.26          18.9
            MPET2         0.124         0.121         0.037
            MPE70         -0.0974       -0.096        0.038
            RRMI          55.6          54.5          29.8
            DELEMP        8690          9190          4070
            MANACR        1020          1160          1050

HWLMNX      RES           0.00178       0.002         0.001
            OFFICE        -0.0209       -0.019        0.010
            COMM          0.00980       0.009         0.004
            HWYINT        -4.59         -4.55         0.531
            DISCBD        -0.246        -0.259        0.254
            AUTO          32.6          32.7          5.25

* Shown to three significant digits, wherever possible.

The 20 values for each model coefficient were ordered from lowest to highest
(X1, ..., X20) and the following ratio was computed:

$$ r_1 = \frac{X_3 - X_1}{X_{k-2} - X_1}, \qquad k = 20 $$

The critical r1 level tested for was 0.535, corresponding to the one percent
significance level. If r1 exceeds 0.535, then, given an independent sample,
there is a 99 percent probability that the deviation of the lowest value X1
from the range of values in the data set is significant and that X1 is an
extreme value. Again as a guide, if r1 exceeds 0.535 in our samples, we will
refer to the corresponding X1 as an extreme value significant at the .01
level. A separate test on the highest data value X20 was also performed. A
second ratio to test the highest data point was computed as follows:

$$ r_{20} = \frac{X_k - X_{k-2}}{X_k - X_3}, \qquad k = 20 $$

The values of r1 and r20 for each model coefficient are shown in Table 6-10.
Where the ratio was found to be significant at the one percent level, the
case number of the data sample excluded from the path analysis corresponding
to the extreme value is indicated.
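
The two ratios can be computed directly from the ordered jackknife estimates. The following is a minimal sketch of that calculation; the function name `dixon_ratios` and the sample values are hypothetical, and the 0.535 threshold is the one percent critical level quoted above for k = 20.

```python
def dixon_ratios(values):
    """Return (r_low, r_high) for a sample of k values.

    r_low  = (X3 - X1) / (X(k-2) - X1)   tests the lowest value
    r_high = (Xk - X(k-2)) / (Xk - X3)   tests the highest value
    Subscripts in the formulas are 1-based; Python indices are 0-based.
    A sample whose relevant values coincide would give a zero denominator.
    """
    x = sorted(values)
    k = len(x)
    x1, x3, xk2, xk = x[0], x[2], x[k - 3], x[k - 1]
    r_low = (x3 - x1) / (xk2 - x1)
    r_high = (xk - xk2) / (xk - x3)
    return r_low, r_high


CRITICAL_1_PERCENT = 0.535  # critical level quoted in the text for k = 20

# Hypothetical sample of 20 coefficient values with one very high value.
sample = list(range(1, 20)) + [100]
r_low, r_high = dixon_ratios(sample)

print(r_low, r_low > CRITICAL_1_PERCENT)    # lowest value is not flagged
print(r_high, r_high > CRITICAL_1_PERCENT)  # highest value is flagged
```

In this sample the lowest value gives r_low = 2/17, well under the critical level, while the single large value gives r_high = 82/97 and would be flagged as extreme.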

When an extreme value occurs in the data set of path regression
coefficient values, it indicates that the exclusion of data from a particular
case number is causing a significant change in the value of the coefficient.
Normally one would test for a change in value when certain data were included;
however, it was impractical to compute coefficients for all possible data
combinations, and thus detection of changes was accomplished by excluding
each sample, one at a time. If exclusion of one sample from a least squares
analysis causes a significant change in a coefficient's value, then that
data sample, when included, is exerting a greater effect on the analysis
results than other samples.
TABLE 6-10

EXTREME VALUE INDEX OF STABILITY FOR
ALL PATH REGRESSION COEFFICIENTS IN THE SIMULTANEOUS
BLOCKS OF EQUATIONS

Model       Independent    Extreme Value Ratio
Equation    Variable       r1         r20

Residential Model
RES         OFFICE         0.233      0.370
            HWLMNX         0.124      0.124
            MPR70          0.400      0.576
            DUACRE         0.693      0.141

COMM        RES            0.333      0.283
            OFFICE         0.803      0.762
            HWLMNX         0.761      0.711
            MPR70          0.461      0.503
            DELEMP         0.528      0.475

OFFICE      RES            0.342      0.333
            HWLMNX         0.282      0.526
            MANF           0.100      0.876
            VACACR         0.454      0.354
            OFFVAC         0.454      0.297
            OFFACR         0.744      0.893
            DISCBD         0.144      0.559
            DELEMP         0.651      0.134

MANF        MANACR         0.625      0.967
            HWYINT         0.616      0.734
            ENERGY         0.635      0.377
            MPR70          0.739      0.247
            RRMI           0.632      0.417

HWLMNX      RES            0.000      0.000
            HWYINT         0.726      0.890
            DISCBD         0.676      0.523

Industrial/Office Model
RES         OFFICE         0.263      0.373
            MPET2          0.844      0.785
            MPE70          0.388      0.405
            DUACRE         0.547      0.750
            VACACR         0.754      0.633

COMM        RES            0.297      0.301
            MANF           0.329      0.352
            MPE70          0.403      0.292
            HWYINT         0.319      0.139
            DELEMP         0.311      0.137
            MINCC          0.296      0.215

OFFICE      RES            0.197      0.496
            MANF           0.368      0.294
            MPET2          0.278      0.284
            MPE70          0.530      0.189
            OFFACR         0.434      0.335
            DISCBD         0.410      0.609
            DELEMP         0.535      0.515

MANF        HWLMNX         0.743      0.391
            MPET2          0.443      0.213
            MPE70          0.180      0.204
            RRMI           0.437      0.519
            DELEMP         0.566      0.897
            MANACR         0.182      0.813

HWLMNX      RES            1.00       1.00
            OFFICE         0.408      0.251
            COMM           0.418      0.323
            HWYINT         0.315      0.312
            DISCBD         0.500      0.573
            AUTO           0.239      0.346

Case No.* Excluded if Extreme Value** (X1, X20)

15
36

36
37
40
35
19
25

10
10

17
12

36
22

15
37

15
16

12
38

17 24
24
17 38

*See Table C-1 in Appendix C for definitions of case numbers [42].
**At the 1 percent significance level.

An example of this situation is illustrated by a hypothetical
simple regression in Figure 6-11. Here the data sample "A" is the only point
in a particular portion of the variable range, and so its position allows it
to exert a lot of leverage in the regression when it is included, causing a
significant change in the line of best fit, which would not occur with any
of the other points. This does not mean that "A" is in error; it could just
be the only data sample available in a particular portion of the variable
range. To exclude it from the regression when it is a valid sample would, in
fact, cause an error in the estimation of model coefficients. In the multiple
regressions used in the path analysis, this situation is not so easily
visualized. The anomalous sample may not be alone in a different range of
values, but just far from the line of best fit determined using all other
samples (dotted line in Figure 6-11).

In the path analysis, only 20 data samples were available for each
regression and so it is not surprising with such a small sample size that insta-
bilities occur in which particular samples exert significant changes in some of
the model coefficients. When a particular case number reappears in the same
model equation more than once, it is probably an indication that data for the
dependent variable (and not the independent variables listed) is causing the
extreme value. An example of this is the OFFICE equation in the Residential
model where case number 37 appears three times (see Table 6-8). Thus, multiple
occurrences of the same case number in a given equation is not as strong an
indication of instability as the occurrence of the same number of different
case numbers.

The existence of instability in a model coefficient may simply
be an indication of the small sample size available for analysis, or it may
indicate an exception to the general relationships developed in the final path
analysis. In the first case, additional data in the analysis would remove the
instability and probably give a mean value for the coefficients close to that
obtained in the jackknifing analysis. In the second case, however, additional
data would probably not correct the instability in the model. Since it was
not possible to ascertain which was the case for each coefficient, the
validity of this stability index was limited to assessing the overall
stability of an equation, not its individual coefficients. In this light, only
the MANF equation in the Residential model can be classified as having
appreciable instability in its model estimates. This is the only equation in
Table 6-10 having greater than 50 percent extreme values (7 out of 10). It
is interesting to note that this equation also has the lowest R² (0.407) of
any equation in the two simultaneous blocks.

[Figure 6-11. Hypothetical Linear Regression]

2. Coefficient Variation Index
A second stability index was developed to quantify the effect
of the total variation in each model coefficient on the predicted values for
the dependent variable. The variation in each path regression coefficient was
expressed as the difference between the maximum (b_max) and minimum (b_min)
values obtained in the jackknifing analysis. This difference was then
multiplied by the mean value of the associated independent variable (X̄_i) to
obtain the total effect on the dependent variable. This quantity was then
normalized by the mean value of the associated dependent variable (Ȳ) to
obtain the percentage change:

$$ p_i = \frac{(b_{\max} - b_{\min}) \, \bar{X}_i}{\bar{Y}} $$

The computed p_i values are summarized in Table 6-11.
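
As a worked illustration of this index, the sketch below computes the percentage change for one coefficient. The function name `variation_index`, the factor of 100 used to express the ratio as a percentage, and the sample numbers are all assumptions made for illustration; they are not values from the study.

```python
def variation_index(b_values, x_mean, y_mean):
    """Percentage change in the dependent variable implied by the spread
    (maximum minus minimum) of the jackknife estimates of one coefficient."""
    spread = max(b_values) - min(b_values)
    # Multiply by 100 to express the normalized effect as a percentage.
    return 100.0 * spread * x_mean / y_mean


# Hypothetical jackknife estimates of one coefficient and hypothetical means.
b_values = [0.9, 1.0, 1.2]
p = variation_index(b_values, x_mean=50.0, y_mean=200.0)
print(p)
```

With these numbers the spread of 0.3 in the coefficient, applied at the mean of the independent variable, corresponds to about 7.5 percent of the mean of the dependent variable.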

The results indicate substantial instability is present in all
model equations except the RES equation in the Residential model. This is
the only equation which does not have a p value exceeding 50 percent. The
average percentage change for all coefficients is 90.3 percent. The above
results deal solely with the stability of the individual regression
coefficients. If one were concerned with the combined effect on the dependent
variable, then it is possible that in an equation with many independent
variables these individual effects could add up to an average variation in the
dependent variable of several hundred percent. It is also possible that
some of the effects could cancel each other, reducing this variation.

TABLE 6-11

STABILITY INDEX FOR ALL PATH REGRESSION COEFFICIENTS IN THE
SIMULTANEOUS BLOCKS OF EQUATIONS BASED ON THE EFFECT
OF VARIATION IN THE COEFFICIENTS ON THE DEPENDENT VARIABLE

Model       Independent    Percentage Change in
Equation    Variable       Dependent Variable

Residential Model
RES         OFFICE         22.5
            HWLMNX         27.4
            MPR70          36.9
            DUACRE         24.9

COMM        RES            27.9
            OFFICE         33.9
            HWLMNX         50.4
            MPR70          24.4
            DELEMP         15.9

OFFICE      RES            71.9
            HWLMNX         222.0
            MANF           74.8
            VACACR         134.0
            OFFVAC         71.4
            OFFACR         75.8
            DISCBD         343.0
            DELEMP         556.0

MANF        MANACR         111.0
            HWYINT         66.5
            ENERGY         493.0
            MPR70          119.0
            RRMI           55.3

HWLMNX      RES            20.9
            HWYINT         30.3
            DISCBD         72.8

Industrial/Office Model
RES         OFFICE         19.9
            MPET2          49.1
            MPE70          30.9
            DUACRE         36.2
            VACACR         143.0

COMM        RES            18.5
            MANF           37.9
            MPE70          29.1
            HWYINT         167.0
            DELEMP         260.0
            MINCC          68.7

OFFICE      RES            42.3
            MANF           76.6
            MPET2          51.5
            MPE70          74.2
            OFFACR         29.8
            DISCBD         96.9
            DELEMP         112.0

MANF        HWLMNX         204.0
            MPET2          32.4
            MPE70          58.7
            RRMI           73.1
            DELEMP         146.0
            MANACR         93.6

HWLMNX      RES            79.1
            OFFICE         23.0
            COMM           56.5
            HWYINT         15.0
            DISCBD         41.9
            AUTO           19.5

Of the 21 different independent variables listed in Table 6-9,
17 of them (or 81%) are associated with a p value exceeding 50 percent. Thus,
the instability problem is widespread and not confined to a few variables
in certain equations. The broad extent of the problem suggests that the cause
is probably the small sample size of 20 that was available for analysis. With
a larger data set to work with, path regression coefficients should not change
as much when a particular data sample is excluded, thus reducing the spread
between b_max and b_min and the value of p.

H. CONCLUSIONS

A theoretical model of the Growth Effects of Major Land Use Projects
has been developed. This model was keyed towards representing the total land
use in a circular area of influence of 10,000 acres ten years after the con-
struction of a Major Project in that area. The model was specified in two
separate forms to represent induced land use growth associated with large
Residential developments, and large Industrial or Office parks in the follow-
ing 12 land use categories:

Residential Hotels/Motels
Commercial Hospitals
Office Cultural
Manufacturing Churches
Highways Education
Wholesale/Warehouse Recreation

The assumption of a single basic causal structure for induced development,
and the use of cross-sectional data from diverse locations throughout the
United States, allowed a static approach to the testing of the theoretical
models, using path analysis.

Path analysis is a set of statistical techniques useful in testing
theories and studying the logical consequences of various hypotheses involving
causal relations. It is not capable of deducing or generating causal
relations, only testing them. The causal analysis of induced land use
development in the current study involved the use of two basic statistical
techniques: two-stage least squares and stepwise ordinary least squares
(multiple regression). The first technique was required to produce consis-
tent estimates of the path coefficients in a system of simultaneous equations
involving feedback loops or reciprocal causation in the models. The second
technique was used to solve the remaining recursive portions of the models.
The dependent variables in these regression analyses represented the total
land use in the previously noted 12 categories. Both linear and non-linear
forms were tested and the linear form was found to produce the best fit.
Specific statistical criteria were developed to identify model paths that
were insignificant or redundant, and these criteria were used to trim unneeded
and undesirable paths from the models. A second complete path analysis was
performed, and the trimming process repeated several times until the final
path models were decided upon. The trimming process eliminated almost half of
the paths in the models as originally specified.

The final models of land use development show that strong statistical
relationships exist between the variables representing the 12 categories
of total land use and the other model variables representing induced and
non-induced land use growth processes. Only in the case of cultural land use
did the path analysis reject the hypothesized causal relationships. Excluding
this category, the R² statistic for the model equations in the simultaneous
block ranged from 0.43 to 0.81 with an average value of 0.66, and the R²
values for the model equations in the recursive block ranged from 0.12 to
0.86 with an average value of 0.41. These statistics can be interpreted as
the amount of variance in the dependent variables (total land use) of the
model equations that can be explained through the linear relationships in
the final causal model. These results indicate a good verification of the
hypothesized land use development model (and above the expectations of the
authors).

There were several problems encountered in the path analysis,
involving multicollinearity, suppressor variables, choice of instrumental
variables, available degrees of freedom, and coefficient instability. The
first two problems were eliminated through the approach used for theory trim-
ming of the models. The last three problems were caused principally by a
common element: the small number of data samples (20) available for analysis.
In the model equations, as originally specified, there were sometimes as
many independent variables as data samples. Since at least several degrees
of freedom should be reserved for the error term in any multiple regression,
some model paths had to be trimmed prior to, and in order to perform, the
first path analysis. Thus, the limited data sample did preclude the testing
of causal relationships in some instances. Also, an analysis of the stability
of the model path coefficients revealed appreciable instability in the indi-
vidual model coefficients when it was applied to different subsets of the
original land use data set.

We note that this instability does not invalidate the strong causal
relationships confirmed by the path analysis. These causal relationships
will be translated into predictive equations in the subsequent task in this
contract, the model calibration phase. This effort is reported in a
forthcoming third volume of this report.

VII. REFERENCES
40 Federal Register 40048
40 Federal Register 41941
40 Federal Register 25814
40 Federal Register 23746
40 Federal Register 18726
40 Federal Register 16343
40 Federal Register 9599
40 Federal Register 6279
(October 20, 1975),
(September 9, 1975),
(June 19, 1975),
(June 2, 1975),
(April 29, 1975),
(May 8, 1974),
(April 18, 1973), and
(March 8, 1973).
2. U.S. Environmental Protection Agency, Review of Federal Actions
Impacting the Environment. Washington, DC, EPA, 1975 (Manual TN2/3-1-75),

Office of Federal Activities, U.S. Environmental Protection Agency,
Guidelines for Review of Environmental Impact Statements; Volume I,
Highway Projects, Washington DC, EPA, 1973, (Volume II on Airports and
Volume III on Steam Channelization will be published shortly).

39 Federal Register 16186, (May 7, 1974).

Office of Air Quality Planning and Standards, U.S. Environmental
Protection Agency, Guidelines for Preparing Environmental Impact State-
ments. Research Triangle Park NC, OAQPS, May 1975.

3. 40 Federal Register 28064 (July 3, 1975),
39 Federal Register 45014 (December 30, 1974),
39 Federal RegTster 25292 (July 9, 1974),
39 Federal Register 7270 (February 24, 1974), and
38 Federal Register 15834 (June 18, 1973).

4. 40 Federal Register 25504 (June 12, 1975).
39 Federal Register 42510 (December .5, 1974"'),
39 Federal Register 31000 (August 27, 1974),
38 Federal Register 18986 (July 16, 1973), and
37 Federal Register 23836 (November 9, 1972).

5. Lowry, Ira, A Model of Metropolis. Santa Monica, California, The RAND
Corporation, 1964, (RM-4035-R6).

6. Forrester, Jay, Urban Dynamics, Cambridge, Massachusetts, The MIT Press,
1969.

7. Hill, Donald, "A Growth Allocation Model for the Boston Region"
Journal of the American Institute of Planners (1965)[3], This is the
EMPIRIC model.
7-1
-------
8. Seidman, D.R., The Construction of an Urban Growth Model, Philadelphia,
Delaware Valley Regional Planning Commission, 1969.This is the Penn-
Jersey model.

9. Center for Real Estate and Urban Economics, Jobs, People, and Land,
Bay Area Simulation Study, Berkeley, University of California, 1968,
(Special Report No. 6). This is the BASS model.

10. Ohls, James, C., and Peter Hutchinson, "Models in Urban Development"
pp. 165-200. in: Saul I. Gass and Roger L. Sisson (eds), A Guide to
Models in Governmental Planning and Operations, Potomac, Maryland,
Sauger Books, 1975.

11. Office of Air Quality Planning and Standards, Guidelines for Air Quality
Maintenance Planning and Analysis; Volume 4, Land Use and Transportation
Considerations, Research Triangle Park, NC» U.S. Environmental Protection
Agency, 1974 (EPA-450/4-74-004).

12. Heise, David, R., Causal Analysis, Chapel Hill, NC, Unpublished manu-
script, 1974.

13. Blalock, Hubert, M., Jr., Theory Construction, Englewood Cltffs, NJ,
Prentice-Hall, 1969.

14. Meisel, William, S., and David C. Collins, "Repro-Modeling: An Approach
to Efficient Model Utilization and Interpretation", IEEE Transactions
on Systems. Man and Cybernetics (1973) 3:349-358.

15. D'Agostino, Ralph, "Path Analysis", Boston, Unpublished paper, 1975.

16. Wright, S., "The Method of Path Coefficients", Annals of Mathematical
Statistics, 1934 (Vol. 5, pp. 161-215).

17. Wright, S., "The Interpretation of Multivariate Systems", Statistics
and Mathematics in Biology, Edited by O. Kempthorne, T. Bancroft,
J. Gowen, and J. Lush, Iowa State College Press, Ames, Iowa, 1954
(pp. 11-33).

18. Wright, S., "Path Coefficients and Path Regressions: Alternative or
Complementary Concepts?", Biometrics, 16, 1960 (pp. 189-202).

19. Wright, S., "The Treatment of Reciprocal Interaction, with or without
Lag, in Path Analysis", Biometrics, 16, 1960 (pp. 423-445).

20. Van de Geer, J.P., Introduction to Multivariate Analysis for the Social
Sciences, W.H. Freeman, San Francisco, California, 1971.

21. Kerlinger, F.N., and Pedhazur, E.J., Multiple Regression in Behavioral
Research, Holt, Rinehart and Winston, New York, N.Y., 1973.
-------
22. Heise, D.R., "Problems in Path Analysis and Causal Inference",
Sociological Methodology, edited by E.F. Borgatta, Jossey-Bass, San Francisco,
California, 1969 (pp. 38-72).

23. Goldberger, A.S., Econometric Theory, Wiley, New York, NY, 1964.

24. Johnston, J., Econometric Methods, McGraw-Hill, New York, NY, 1963.

25. Snedecor, G.W., and Cochran, W.G., Statistical Methods (6th Edition),
Iowa State College Press, Ames, Iowa, 1967.

26. Tukey, J.W., "Causation, Regression and Path Analysis", Statistics and
Mathematics in Biology, Edited by O. Kempthorne, T.A. Bancroft,
J.W. Gowen and J.L. Lush, Iowa State College Press, Ames, Iowa
(pp. 35-66).

27. Turner, M.E., and Stevens, C.D., "The Regression Analysis of Causal
Paths", Biometrics, 15, 1959 (pp. 236-258).

28. Heise, D.R., "Employing Nominal Variables, Induced Variables, and Block
Variables in Path Analyses", Sociological Methods and Research, 1972
(pp. 147-173).

29. Land, K.C., "Principles of Path Analysis", Sociological Methodology,
edited by E.F. Borgatta, Jossey-Bass, San Francisco, California, 1969.

30. Boudon, R., "A Method of Linear Causal Analysis: Dependence Analysis",
American Sociological Review, 30, 1965 (pp. 365-373).

31. An Air Pollution Impact Methodology for Airports - Phase I, Argonne
National Laboratory, Prepared for Office of Air and Water Programs,
U.S. EPA, Durham, NC, January, 1973.

32. Urban Travel Patterns for Airports, Shopping Centers, and Industrial
Plants, National Cooperative Highway Research Program Report, No. 24.

33. 1972 County and City Data Book, U.S. Bureau of the Census.

34. Survey of Buying Power, Sales Management Magazine, New York, NY (annual)

35. Wonnacott, Ronald, and Wonnacott, Thomas, Econometrics, John Wiley and
Sons, 1970.

36. Metcalf and Eddy, Massachusetts Metropolitan District Commission Waste-
water Study, 1973.

37. Lakshmanan and Hansen, "A Retail Market Potential Model", Journal of the
American Institute of Planners, Vol. 31, May 1965.
-------
38. Secondary Impacts of Transportation and Wastewater Improvements; Review
and Bibliography, Prepared by the Environmental Impact Center for the
Council on Environmental Quality and the U.S. EPA, 1975.

39. Infonet Times Series Processor, Information Network Division, Computer
Sciences Corp., Los Angeles, California, December 1974.

40. Nie, N.H., C.H. Hull, J.G. Jenkins, K. Steinbrenner, and D.H. Bent,
Statistical Package for the Social Sciences (second edition), McGraw-
Hill, New York, 1975.

41. Dixon, W.J., "Analysis of Extreme Values", Annals of Mathematical Statis-
tics, Vol. 21, pp. 488-506, 1950.

42. Benesh, Frank and Peter Guldberg, Growth Effects of Major Land Use
Projects, Volume I: Appendices C and D, Prepared by Walden Research
Division of Abcor, for Office of Air Quality Planning and Standards,
U.S. Environmental Protection Agency, Research Triangle Park, North
Carolina. Available from Air Pollution Technical Information Center
(MD#18), U.S. Environmental Protection Agency, Research Triangle Park,
North Carolina 27711 (APTIC Document #80998), May 1976.

43. Private Communication with Mr. Russell Hanson, Director of Fairfax County
Industrial Authority, 1972.
-------
APPENDIX A

BIBLIOGRAPHY OF LAND USE MODELS
AND LAND USE DEVELOPMENT
-------
Argonne National Laboratories, A General Methodology for the Planning and
Analysis of Air Pollution Control Strategies based on Land Use,
February, 1972.

Batty, Michael, "Recent Development in Land Use Modeling: A Review of
British Research", Urban Studies, June, 1972.

Berry, Brian J.L., "The Retail Component of the Urban Model", Journal of the
American Institute of Planners, Vol. XXXI, No. 2, May, 1965.

Central Regional Planning Commission, The Impact of New Housing on the
Suburbs, Vol. I - The Apartments of Westborough, March 1972.

Chapin, F. Stuart Jr., "A Model for Simulating Residential Development,"
Journal of the American Institute of Planners, Vol. XXXI, No. 2,
May, 1965.

Chapin, F. Stuart Jr. and Shirley F. Weiss, "Factors Influencing Land Develop-
ment", Institute for Research in Social Science, Univer. of North Carolina,
Chapel Hill, August, 1962.

Chorley, Richard J. and Peter Haggett, Socio-Economic Models in Geography,
1967.

Dickey, J.W. and P.M. Shuldiner, "Theory and Methodology Regional Planning
Case Study 2: Maximum Traffic Generation for Planned Shopping Center"
from paper presented at Annual Meeting of Highway Research Board,
January, 1966.

Goldner, William, "The Lowry Model Heritage," Journal of the American
Institute of Planners, Vol. XXXVI, No. 2, March, 1971.

Hill, Donald M. "A Growth Allocation Model for the Boston Region",
Journal of the American Institute of Planners, Vol. 31:2, May, 1965.

Isard, Walter, "Ecologic-economic Analysis for Regional Development," 1972.

Highway Research Record, No. 126, "Land Use Forecasting Concepts, 7 Reports".
1966.

Lakshmanan, T.R., and Walter G. Hansen, "A Retail Market Potential Model",
Journal of the American Institute of Planners, Vol. XXXI, May, 1965.

Lathrop, George T. and John R. Hamburg, "An Opportunity-Accessibility Model
Allocating Regional Growth". Journal of American Institute of Planners
Vol. XXXI No. 2, May, 1965.

Lee, Donald, Models and Techniques for Urban Planning, Cornell University,
New York, September, 1968.
-------
Leontief, Wassily, "Environmental Repercussions and the Economic Structure:
An Input-Output Approach", The Review of Economics and Statistics, Vol.
LII, No. 3, August, 1970.

Lowry, I.S., "A Short Course in Model Design", Journal of the American
Institute of Planners, Vol. XXXI, No. 2, May, 1965.

Metcalf & Eddy, Inc., MDC Wastewater Study, 1973.

Pack, Janet Rothenberg, "The Use of Urban Models: Report on a Survey of
Planning Organizations". Journal of American Institute of Planners,
May, 1975.

Roberts, John L. and Croke, Edward J. "A Critical Review of the Effect of
Air Pollution Control Regulations on Land Use Planning." Journal of
Air Pollution Control Association. May, 1975, Vol. 25, No. 5.

Steinitz, Carl and Peter Rogers, A System Analysis Model of Urbanization
and Change, 1970.

Seidman, David R., "The Construction of an Urban Growth Model; A descrip-
tion of the Issues Confronted in the Development of the Activities
Allocation Model", Delaware Valley Regional Planning Commission, 1970.

U.S. Environmental Protection Agency, EPA-450/4-74-004, Guidelines for Air
Quality Maintenance Planning and Analysis, Volume 4: Land Use and
Transportation and Wastewater Investments: Review and Bibliography, January,
1975.

U.S. Environmental Protection Agency, 68-61-0788. A Guide to Models in
Governmental Planning and Operations, August, 1974.

U.S. Naval Civil Engineering Laboratory, The Impact of Large Installations
on Nearby Areas - August, 1965.
-------
APPENDIX B
PLOTS OF REGRESSION COEFFICIENTS FOR THE
TWENTY JACKKNIFING RUNS
-------
It is important to note that the scale of each graph is different, so
the spread of points should not be compared visually between graphs.
The scale of each graph is determined by the minimum and maximum values
in the data set, which are always plotted along the left- and right-hand
edges of the graph, respectively. Data points are plotted using the "*"
character, except where coincident values occur, in which case a number
indicating the multiplicity is plotted, e.g., "3" for three coincident
points. Because of the differences in scale, data points whose values are
grouped quite closely may appear to be spread out, and vice versa. Thus,
to quantify the stability of the path regression coefficients more
accurately and to allow comparisons, the two separate indices of
stability discussed in the text (page 6-65) were developed.
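
The plotting convention described above can be sketched in a few lines of
Python. This is only an illustration, not the plotting routine used for the
report: the function name strip_plot, the plot width, the cap of 9 on the
multiplicity digit, and the sample values are all assumptions. It shows why
two sets of coefficients with very different spreads can produce identical
plots once each is scaled to its own minimum and maximum.

```python
from collections import Counter

def strip_plot(values, width=101):
    """Render values on one text line, scaled so min -> left edge, max -> right edge."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # Map each value to a column; a zero range puts every point in column 0.
    cols = [0 if span == 0 else round((v - lo) / span * (width - 1)) for v in values]
    line = [" "] * width
    for col, n in Counter(cols).items():
        # "*" marks a single point; coincident points show their multiplicity.
        # Capping at 9 keeps one character per column (an assumption of this sketch).
        line[col] = "*" if n == 1 else str(min(n, 9))
    return "".join(line)

# Hypothetical coefficient sets: same shape, spreads differing by a factor of 1000.
tight = [0.050, 0.051, 0.052, 0.052, 0.053]
wide = [50.0, 51.0, 52.0, 52.0, 53.0]

print(repr(strip_plot(tight, width=11)))
print(repr(strip_plot(wide, width=11)))
```

Both calls print the same line, so the visual spread carries no information
about the absolute spread; a scale-free index is needed to compare graphs.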
-------
[Plot panels, one per regression coefficient: OFFICE, HWLMNX, HPR70, DUACRE]
FIGURE B-1 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE RES EQUATION IN THE RESIDENTIAL MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: (unlabeled), OFFICE, HWLMNX, HPR70, DELEMP]
FIGURE B-2 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE COMM EQUATION IN THE RESIDENTIAL MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: RES, HWLMNX, MANF, VACACR, OFFVAC, OFFACR, DISCBD, DELEMP]
FIGURE B-3 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE OFFICE EQUATION IN THE RESIDENTIAL MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: MANACR, HWYINT, ENERGY, MPR70, RRMI]
FIGURE B-4 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE MANF EQUATION IN THE RESIDENTIAL MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: RES, HWYINT, DISCBD]
FIGURE B-5 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE HWLMNX EQUATION IN THE RESIDENTIAL MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: OFFICE, MPET2, MPE70, DUACRE, VACACR]
FIGURE B-6 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE RES EQUATION IN THE INDUSTRIAL/OFFICE MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: RES, MANF, MPE70, HWYINT, DELEMP, MINCC]
FIGURE B-7 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE COMM EQUATION IN THE INDUSTRIAL/OFFICE MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: RES, MANF, MPET2, MPE70, OFFACR, DISCBD, DELEMP]
FIGURE B-8 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE OFFICE EQUATION IN THE INDUSTRIAL/OFFICE MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: HWLMNX, MPET2, MPE70, RRMI, DELEMP, MANACR]
FIGURE B-9 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE MANF EQUATION IN THE INDUSTRIAL/OFFICE MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
[Plot panels, one per regression coefficient: RES, OFFICE, COMM, HWYINT, DISCBD, AUTO]
FIGURE B-10 PLOT OF REGRESSION COEFFICIENTS FOR THE JACKKNIFING OF THE HWLMNX EQUATION IN THE INDUSTRIAL/OFFICE MODEL
"Please see the text for an explanation of the scales of the graphs"
-------
TECHNICAL REPORT DATA
1. REPORT NO.:
2.
3. RECIPIENT'S ACCESSION NO.:
4. TITLE AND SUBTITLE: Growth Effects of Major Land Use Projects,
Vol. I: Specification and Causal Analysis of Model
5. REPORT DATE: May 1976
6. PERFORMING ORGANIZATION CODE:
7. AUTHOR(S): Frank Benesh, Peter Guldberg, Ralph D'Agostino
8. PERFORMING ORGANIZATION REPORT NO.: C-781-a
9. PERFORMING ORGANIZATION NAME AND ADDRESS:
Walden Research Division of Abcor
201 Vassar Street
Cambridge, Massachusetts 02139
10. PROGRAM ELEMENT NO.:
11. CONTRACT/GRANT NO.: 68-02-2076
12. SPONSORING AGENCY NAME AND ADDRESS:
Environmental Protection Agency
Office of Air Quality Planning and Standards
Strategies and Air Standards Division (MD-12)
Research Triangle Park, North Carolina 27711
13. TYPE OF REPORT AND PERIOD COVERED: Final
14. SPONSORING AGENCY CODE:
15. SUPPLEMENTARY NOTES:
16. ABSTRACT
Growth Effects of Major Land Use Projects is a research program whose goal is to
formulate a methodology to predict air pollutant emissions resulting from the con-
struction and operation of two types of major land use projects, large residential
projects and large concentrations of employment (i.e., office parks and industrial
parks). Emissions are quantified from the major project, from land use induced
by the major project, from secondary activity occurring off-site (i.e., generation
of electricity by utilities), and from motor vehicle traffic associated with both
the major project and its induced land uses.

This report documents the development of a model to predict the induced land use
from such major projects. The report discusses the theoretical basis of the model,
the specification of the equations that model the theory, and the causal analysis
of the equations. Sample selection and data collection are also discussed.

Subsequent reports (i.e., Volume II and Volume III) document the emission factors,
the land use model predictive equations and the traffic model.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS b. IDENTIFIERS/OPEN ENDED TERMS c. COSATI Field/Group
Land Use
Planning
Industrial Areas
Residential Areas
Path Analysis
Causal Analysis
Secondary Effects
Induced Land Use
18. DISTRIBUTION STATEMENT: Unlimited
19. SECURITY CLASS (This Report): Unclassified
20. SECURITY CLASS (This page): Unclassified
21. NO. OF PAGES: 221
22. PRICE:
EPA Form 2220-1 (9-73)
-------