EPA-600/5-77-008a
May 1977
Socioeconomic Environmental Studies Series
CLASSIFICATION OF AMERICAN CITIES FOR
CASE STUDY ANALYSIS: VOLUME I.
Summary Report
Office of Monitoring and Technical Support
Office of Research and Development
U.S. Environmental Protection Agency
Washington, D.C. 20460
-------
RESEARCH REPORTING SERIES
Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into nine series. These nine broad cate-
gories were established to facilitate further development and application of en-
vironmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The nine series are:
1. Environmental Health Effects Research
2. Environmental Protection Technology
3. Ecological Research
4. Environmental Monitoring
5. Socioeconomic Environmental Studies
6. Scientific and Technical Assessment Reports (STAR)
7. Interagency Energy-Environment Research and Development
8. "Special" Reports
9. Miscellaneous Reports
This report has been assigned to the SOCIOECONOMIC ENVIRONMENTAL
STUDIES series. This series includes research on environmental management,
economic analysis, ecological impacts, comprehensive planning and fore-
casting, and analysis methodologies. Included are tools for determining varying
impacts of alternative policies; analyses of environmental planning techniques
at the regional, state, and local levels; and approaches to measuring environ-
mental quality perceptions, as well as analysis of ecological and economic im-
pacts of environmental protection measures. Such topics as urban form, industrial
mix, growth policies, control, and organizational structure are discussed in terms
of optimal environmental performance. These interdisciplinary studies and sys-
tems analyses are presented in forms varying from quantitative relational analyses
to management and policy-oriented reports.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia 22161.
-------
EPA-600/5-77-008a
May 1977
CLASSIFICATION OF AMERICAN CITIES FOR CASE STUDY ANALYSIS
Volume I
Summary Report
by
Elizabeth Lake
Carol Blair
James Hudson
Richard Tabors
Urban Systems Research & Engineering Inc.
Cambridge, Massachusetts 02138
Contract No. 68-01-3299
Project Officer
Samuel Ratick
Office of Monitoring and Technical Support
Washington, DC 20460
OFFICE OF MONITORING AND TECHNICAL SUPPORT
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, DC 20460
-------
DISCLAIMER
This report has been reviewed by the Office of Research and Development,
U.S. Environmental Protection Agency, and approved for publication.
Approval does not signify that the contents necessarily reflect the views
and policies of the Environmental Protection Agency, nor does mention of
trade names or commercial products constitute endorsement or recommendation
for use.
-------
ABSTRACT
Attempts to analyze and evaluate the impacts of federal programs have led
to the extensive use of case studies of program impacts at selected sites.
This project has developed a methodology for the systematic selection of
representative case study sites and for generalizing the study results. The
methodology, involving two-stage factor analysis and clustering, is applied
to a specific program/policy problem, the selection of metropolitan areas
for case studies in analyzing the impact of federal policies on general
environmental quality.
The methodology begins with a data base on standard metropolitan statistical
areas, SMSAs, including variables related to environmental quality, urban form,
and household, industrial, and government activity. It analyzes these
variables through a two-stage factor analysis technique which allows heuristic
consideration of the significant characteristics. Finally, it develops city
clusters which group areas with similar attributes. Modal (or representative)
cities are selected for each group and suggested as case study sites. These
groups may be used to generalize the study results and to analyze the
transferability of results between areas. The methodology is sufficiently flexible
to consider a wide range of research hypotheses.
-------
CONTENTS
Abstract iii
Figures vi
Tables vi
Executive Summary vii
1.0 Introduction 1
1.1 Research Objectives 1
1.2 Research Methodology 3
1.3 Results 5
2.0 General Classification of Cities 8
2.1 Research Design and Data Base 8
2.2 Stage I Factor Analysis 10
2.3 Stage II Factor Analysis 15
2.4 SMSA Groupings and Representative Cities 17
3.0 Applicability to General Environmental Research Process 22
3.1 Environmental Data Base 22
3.2 Applicability to Environmental Research 23
4.0 Other Research Purposes 25
4.1 Transportation Demonstration Project 25
4.2 Potential Applications 26
APPENDIX A: Listing of Data Variables 29
-------
TABLES
Number Page
1 Major SMSA Groupings for General Environmental Purposes ix
1-1 Major SMSA Groupings for General Environmental Purposes 3
2-1 SCS2 - Stage I - Factor Analysis of 15 Variables 11
2-2 SCS4 - Stage I - Factors 12
2-3 SCS4 - Stage I - Interpretation of Factors 13
2-4 SCS5 - Stage I - Final Factors 13
2-5 SCS6 - Stage I - Final Factors 14
2-6 Stage II Factors 16
2-7 Groups of Similar SMSAs 19
4-1 Variables Selected for the Transportation Classification 25
FIGURES
Number Page
1 Causation: Environmental Quality viii
1-1 Boston SMSA 2
-------
EXECUTIVE SUMMARY
The research described in this report has developed two major products,
one a direct output, and the other a methodology for further analysis. The
first output has been a classification of Standard Metropolitan Statistical
Areas (SMSAs) based on broad measures of environmental quality and other
attributes. This classification depends on a large scale data base which in-
cludes 262 SMSAs, and has data on activity in the industrial, demographic,
and government sectors, on the attributes of the urban form and the physical
environment, and on the pollutant residuals and ambient environmental quality
resulting from these activities and attributes.
The second product is a methodology for developing alternative classifica-
tions, oriented towards specific policy or research issues and the urban char-
acteristics related to them. The methodology can be directed at issues such
as the choice of sites for case studies and demonstration projects, or the
transfer of results from one case study area to other cities. The data base
and methodology are being maintained by EPA for further applications in case
study analyses.
The results of this study are focused on analytic needs involving gen-
eral environmental quality or other subject areas. Environmental quality is
determined by actors in the urban socioeconomic system and the physical en-
vironment together. A simplification of the interrelationships is illustrated
in Figure 1. Note that, at least in this crude model, polluting residuals
and ambient environmental quality are entirely endogenous to the system. Be-
cause these two aspects of the system are closely related to the attributes of
the four other actors and because available measures of environmental quality
and residuals are less numerous and less reliable than most others, the major
classification scheme of this research was developed from data on the public
sector, households, industry, and the physical environment, and then evaluated
by comparison with data on environmental quality and residuals. Because
these four aspects are comprehensive in terms of the urban system
and are not biased toward environmental quality, they may have far-reaching
applications.
Each box in Figure 1 is represented in the data base by a group of varia-
bles called a significant characteristic set (SCS). Altogether, then, there
are six SCSs which together contain approximately 200 variables. The SCSs are:
1. Ambient Environmental Quality
2. Urban Form and the Physical Environment
3. Residuals
4. Demographic Characteristics
-------
Figure 1
CAUSATION: ENVIRONMENTAL QUALITY
(First Order Effects)
[Diagram. Boxes recovered: Public Sector (Abatement); Residuals; Urban Form and
Physical Environment; Ambient Environmental Quality]
-------
Table 1
MAJOR SMSA GROUPINGS FOR GENERAL ENVIRONMENTAL PURPOSES
(based on total SMSA sample)

Group   No. of Cities
No.     in Group        Representative City
1       36              Little Rock-North Little Rock, AR
2       46              Lake Charles, LA
3       27              Williamsport, PA
4       48              Albany-Schenectady-Troy, NY
5       18              Dallas, TX

Cities Close to Modal City (in group order):
Baton Rouge, LA; Corpus Christi, TX; Lafayette, LA; Midland, TX;
Montgomery, AL; Odessa, TX; Tyler, TX; Spartanburg, NC;
Parkersburg-Marietta, WV-OH; Davenport, IL; Evansville, IN-KY;
Lawrence, MA; Peoria, IL; Appleton-Oshkosh, WI; New Britain, CT;
Portland, OR; Charlotte, NC; Richmond, VA
-------
5. Government
6. Industry.
Correlation between variables within an SCS was anticipated as well as re-
lationships between SCSs. The methodology used to classify SMSAs takes
advantage of these correlations to reduce the vast amounts of data through
factor analysis.
The factor analysis technique was applied to the data in two stages. In
Stage I, a small number of factors was extracted from the data in each SCS.
These factors summarized the basic dimensions of the data available to describe
relevant attributes of urban areas. In the second stage, factors derived in
Stage I were treated as variables. The factors derived in Stage II, then,
reflect relationships both within the SCSs and between SCSs. The four factors*
from Stage II which explain the greatest amount of variance in the data base
were taken to characterize the SMSAs for purposes of classification. As the
factors generated are linear combinations of the original variables, it is
possible to estimate scores for each observation on each factor. These pro-
vide the location of each SMSA in the four-dimensional factor space derived
in Stage II. (The 262 SMSAs have been classified by applying a simple "nearest
neighbor" clustering technique to these factor scores.) Five major city groups
were developed, including 175 of the 262 SMSAs. (See Table 1)
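The grouping step can be illustrated with a minimal sketch. It assumes that "nearest neighbor" clustering means single-linkage grouping: two SMSAs fall in the same group whenever a chain of neighbors, each closer than some distance threshold, connects them. The function name, the threshold value, and the factor scores are all hypothetical; the report does not give the actual linkage rule or cutoff.

```python
import math

def nearest_neighbor_groups(scores, threshold):
    """Single-linkage grouping (an assumed reading of "nearest neighbor").

    scores maps an SMSA name to its tuple of Stage II factor scores.
    """
    names = list(scores)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative of n's group.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Merge every pair of SMSAs lying closer together than the threshold.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(scores[a], scores[b]) < threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    # Largest groups first; singletons at the end play the role of outliers.
    return sorted(groups.values(), key=len, reverse=True)

# Hypothetical scores on four Stage II factors (not data from the report).
scores = {
    "A": (0.1, 0.2, 0.0, -0.1),
    "B": (0.2, 0.1, 0.1, -0.2),
    "C": (0.3, 0.3, 0.0, 0.0),
    "D": (2.0, 2.1, 1.9, 2.0),
    "E": (2.1, 2.0, 2.0, 2.1),
    "F": (-3.0, 4.0, -3.5, 5.0),
}
groups = nearest_neighbor_groups(scores, threshold=0.5)
```

In this toy sample, A, B, and C form one group, D and E a second, and F remains a single-city outlier, mirroring the way 87 of the 262 SMSAs fell outside the five major groups.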
These groups were tested with respect to their ability to discriminate
between cities with different levels of environmental quality. A series of
t- and F-tests (statistical comparisons of means) were performed using a
select set of environmental quality measures. Testing revealed the groups
to be significantly different from an environmental perspective. The groups
appear to be useful for environmental research and may be tested in a similar
manner to determine their applicability to any given area of research.
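As a hedged illustration of this kind of test, the sketch below computes a one-way F statistic for a single environmental quality measure across city groups. The function name and the sample values are hypothetical; the report does not list which measures were tested or the resulting statistics.

```python
def one_way_f(groups):
    """One-way analysis-of-variance F statistic for a list of samples."""
    k = len(groups)                      # number of city groups
    n = sum(len(g) for g in groups)      # total number of cities
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of cities around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical values of one environmental quality measure in three groups.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_f(samples)
```

A large F relative to the tabulated critical value for (k - 1, n - k) degrees of freedom indicates that the group means differ by more than within-group scatter would explain.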
New classifications may be developed by the same method used here by
modifying the data base. The availability of new and relevant data will often
justify such an effort. For instance, land use is a valuable measure of a
number of influences on environmental quality, yet little data measuring
land use is available except that which comes from dispersed sources in
various forms. Should a new body of uniform data on land use in a large
number of SMSAs become available, a more enlightened classification of SMSAs
might be developed.
Other classifications might be developed to satisfy a more specific
emphasis. This research included such an effort for the Energy Research and
Development Administration, interested in potential energy savings through
changes in transportation patterns. The basis of the classification was
limited to variables related to auto use: auto ownership, per capita vehicle-
miles travelled, household size, urban density, etc. Comparison of the result-
ing groups with those previously developed indicates a great deal is common to
the two classifications. The data bank, the methodology, the classification
and modal cities will be valuable in a variety of applications related to
urban development and environmental quality. Specific classification of this
sort is applicable to a wide range of environmental and urban policy research
problems, wherever detailed case studies are performed.
*The number of second stage factors used for clustering was arbitrarily
limited to four. A larger number of factors represents more dimensions along
which cities may differ, fragmenting city groups into a large number of small
clusters.
-------
1.0 INTRODUCTION
1.1 Research Objectives
The officials of the Environmental Protection Agency and other Government
agencies are frequently faced with the task of evaluating the effects of pro-
grams or policies at the local and regional levels. For example, EPA officials
may be concerned with the effects of parking restrictions on urban air quality.
To analyze this, they may monitor air quality in every city where parking
restrictions are imposed, or they may restrict their monitoring activities to
a more limited representative sample of cities. The second alternative is
clearly more economical; however, it requires an appropriate urban classification
scheme. The objective of this research project was the development of a
flexible methodology for the classification of cities, one appropriate for
testing the effects of general environmental and other programs and for
assessing the impacts of specific environmental policies.
These typologies then group similar cities and identify modal, or representative
urban areas for each group, facilitating the generalization of case study results.
In this report the two terms, cities and SMSAs (Standard Metropolitan Statistical
Areas), are used interchangeably. Cities normally make up parts of SMSAs, as
demonstrated in Figure 1-1.
In the last few decades, a large number of city classification schemes have
been developed. It may be appropriate then to ask why yet another methodology
and typology were necessary. A brief review of past attempts to classify cities
may answer this question.
Although considerable resources have been directed toward developing com-
munity typologies, few of the resulting classifications have been applied to
further research or practical problems. One reason is that every potential use
of a community typology has specific requirements in terms of community
characteristics considered, and the universe of communities to be investigated.
No classification, then, is useful in every case. The majority of the earlier
classification schemes did not include environmental characteristics.
Recently there has been a great interest in environmental quality and
quality of life. Coughlin* performed factor analysis for 101 metropolitan
areas on sixty indicators of environmental quality and quality of life. The
*Robert E. Coughlin, Goal Attainment Levels in 101 Metropolitan Areas,
RSRI Discussion Paper Series No. 41 (Philadelphia: Regional Science Research
Institute, 1970).
-------
Figure 1-1
Boston SMSA
[Map. Legend: Outer Boundary of Boston SMSA; Urbanized Area, 1970]
-------
analysis, however, was biased toward social and economic characteristics since
data on physical conditions were sparse. The Somers study,* performed
for EPA, represents another recent example. That study relied for the most part
on 1960-61 Census data; its results are therefore somewhat obsolete.
More relevant is Berry's study,** Land Use, Urban Form and Environmental
Quality, which provides a city classification based on social, economic, and
environmental characteristics. Although there are weaknesses in both the
data base and the methodology used for this research, Berry has made a beginning
and provides inspiration, as well as a core data base, for future research.
A complete review of community classification studies may be found in
Appendix B to Volume III. Most of these studies developed groupings for
single research purposes (e.g., transportation analysis, environmental quality
analysis, etc.); in many, the groupings are limited to a subset of U.S.
metropolitan areas; and some of the data sets used are incomplete or out of
date. In contrast, the research described here did not identify a single urban
typology; rather, it developed a flexible methodology through which urban
classifications may be developed for testing a variety of research hypotheses.
Further, all 262 SMSAs are included in the analysis, which utilized an extensive
data base, with much of the information recently becoming available.
1.2 Research Methodology
As the initial research objective specified case study site selection
for environmental analysis, the data base was designed to include descriptors
of ambient environmental quality as well as its causal variables. This data
bank thus includes information on ambient air and water quality, on the types
and quantities of residuals being discharged into the environment, on socio-
economic parameters, on the activities of the local government which affect
environmental quality, and on variables describing the urban form, including
land use, density, and so on. The data sources used include STORET, SEAS, the
Bureau of the Census, the Department of Transportation, and the Department of
Agriculture. Specific policy analyses would only use the relevant portions
of the data, of course.
Theoretically, it is possible to develop city groupings directly on the
basis of the variables. However, the number of variables in the data bank
represents too many "axes" along which cities may differ, making it impossible
to develop consistent city groupings. We have therefore used an iterative,
two-stage factor analysis procedure for data reduction purposes.
Factor analysis is an arithmetic means of reducing a complex and highly
intercorrelated data set to a smaller number of underlying factors. For
example, the research design may require information on the percent of
families below the poverty level, or the prevalence of substandard housing
*John Somers, George B. Pidot, Jr., Modal Cities, prepared under
Contract EPA-600/5-74-027, for the Office of Research and Development, U.S.
Environmental Protection Agency.
**Brian J.L. Berry, et al., Land Use, Urban Form and Environmental Quality,
Chicago: Department of Geography, University of Chicago, 1974.
-------
units, and unemployment statistics: three highly correlated variables.
Factor analysis offers one method of combining such variables into a single
dimension for further statistical analysis or for grouping communities.
This procedure is described in Chapter 2, as well as in Volume III.
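The idea of collapsing correlated variables into one dimension can be sketched as follows. This is a principal-component extraction by power iteration, used here as a simple stand-in for the factor extraction step; it is not the specific factoring or rotation method applied in the study. The variable values and function names are hypothetical.

```python
import math

def standardize(col):
    """Rescale a variable to zero mean and unit (population) variance."""
    m = sum(col) / len(col)
    s = math.sqrt(sum((x - m) ** 2 for x in col) / len(col))
    return [(x - m) / s for x in col]

def correlation_matrix(cols):
    z = [standardize(c) for c in cols]
    n = len(cols[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def first_component(corr, iterations=200):
    """Power iteration: dominant eigenvector and eigenvalue of corr."""
    size = len(corr)
    v = [1.0] * size
    for _ in range(iterations):
        w = [sum(r * x for r, x in zip(row, v)) for row in corr]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * corr[i][j] * v[j] for i in range(size) for j in range(size)
    )
    return v, eigenvalue

# Hypothetical percentages for six cities (not data from the report).
poverty = [8.0, 12.0, 15.0, 9.0, 20.0, 11.0]
substandard = [5.0, 9.0, 11.0, 6.0, 16.0, 7.0]
unemployment = [4.0, 6.0, 7.5, 4.5, 9.0, 5.0]

corr = correlation_matrix([poverty, substandard, unemployment])
weights, eigenvalue = first_component(corr)
share = eigenvalue / len(corr)  # fraction of total variance on one dimension
```

Because the three variables move together, nearly all of their standardized variance loads on a single component, which can then replace them in later analysis.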
The use of factor analysis for data reduction purposes and for the
development of groupings has been widely critiqued. In a tongue-in-cheek
study of the dangers of indiscriminate use of factor analysis, J. Scott
Armstrong uses an example in which Tom Swift, the young analyst, must collect
data of significance and then analyze a sample of metal blocks. Armstrong
chose the variables in such a way that there are only five significant
variables in the grouping of eleven, the other six being only combinations
of the first five. The results are amusing with the characteristics of
the metal block identified as "intensity, shortness and compactness."* The
point made in the Armstrong article is that the investigator must have some
prior knowledge of the sample under study, first, to frame the hypotheses, and
second, to interpret the results in light of reality.
In order to frame the research hypotheses and to assist in the interpretation
of the results, an iterative, two-stage factor analysis procedure
was used. This approach also structured and facilitated the data collection
efforts. The variables on which information was to be collected were
separated into six categories, or significant characteristic sets (SCSs),
each of which describes or affects ambient environmental quality. These sets
are:
SCS 1. Ambient Environmental Quality
SCS 2. Urban Form and the Physical Environment
SCS 3. Residuals
SCS 4. Household Sector
SCS 5. Government Sector
SCS 6. Industrial Activity
Simple factor analyses were performed for each of these SCSs, with the result-
ant factors being inputs into the second stage factor analysis. City group-
ings were then developed on the basis of the second stage factors obtained.
The objective here was to minimize the within group variance, and to maximize
the between group variance in terms of the dimensions defined by the second
stage factor analysis. In other words, the objective was to form groups
of cities similar to one another, but different from the cities of other
groups.
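The grouping objective can be made concrete with a small sketch that splits the total sum of squares of the factor scores into a within-group part (to be minimized) and a between-group part (to be maximized). The helper names and the two-dimensional scores are hypothetical and serve only to illustrate the criterion.

```python
def centroid(points):
    """Mean position of a set of points in factor space."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sum_sq(points, center):
    """Sum of squared distances from each point to center."""
    return sum(sum((x - c) ** 2 for x, c in zip(p, center)) for p in points)

def within_between(groups):
    all_points = [p for g in groups for p in g]
    grand = centroid(all_points)
    # Scatter of cities around their own group centroid.
    within = sum(sum_sq(g, centroid(g)) for g in groups)
    # Scatter of group centroids around the grand centroid, weighted by size.
    between = sum(len(g) * sum_sq([centroid(g)], grand) for g in groups)
    return within, between

# Hypothetical scores on two factors for two tight, well-separated groups.
groups = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
within, between = within_between(groups)
```

The two parts always add up to the total sum of squares around the grand centroid, so any regrouping that lowers the within-group figure raises the between-group figure by the same amount.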
*J. Scott Armstrong, "Derivation of a Theory by Means of Factor Analysis
or Tom Swift and His Electric Factor Analysis Machine," The American
Statistician (December 1967): 17-21.
-------
Modal, or most representative, cities were then selected for each of the
groups by simply identifying the city in each group which lies the closest in
the multidimensional space to the geometric centroid (center) of the group.
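A minimal sketch of the modal-city rule follows: compute the centroid of a group's factor scores and pick the member nearest to it. Euclidean distance is assumed, and the city names and scores are hypothetical.

```python
import math

def modal_city(group_scores):
    """Return the member of a group closest to the group's centroid.

    group_scores maps a city name to its tuple of factor scores.
    """
    names = list(group_scores)
    dims = len(group_scores[names[0]])
    center = tuple(
        sum(group_scores[n][d] for n in names) / len(names) for d in range(dims)
    )
    return min(names, key=lambda n: math.dist(group_scores[n], center))

# Hypothetical four-factor scores for a three-city group.
group = {
    "City A": (0.0, 0.0, 0.0, 0.0),
    "City B": (1.0, 1.0, 1.0, 1.0),
    "City C": (2.0, 2.0, 2.0, 5.0),
}
modal = modal_city(group)
```

Here the centroid is (1, 1, 1, 2), and City B lies nearest to it.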
1.3 Results
This research project developed an urban typology, and identified repre-
sentative SMSAs appropriate for general environmental analysis. In addition,
it developed a flexible capability for developing similar typologies and identifying
representative SMSAs for testing alternative research hypotheses. Each of
these will be described in turn.
1.3.1 SMSA Groupings for General Environmental Research —
Five major city groups were identified for general environmental purposes;
these five groups include 175 of the 262 SMSAs considered. The remaining cities
are either outliers, single cities significantly different from the cities in
the five major groups, or they are in minor groups comprised of a smaller number
of cities, having different characteristics.
Table 1-1 describes the five major city groupings. The largest group con-
tains 48 SMSAs, with the modal city being Albany-Schenectady-Troy, New York.
Other cities in this group include Appleton-Oshkosh, WI; New Britain, CT;
Portland, OR; etc. The second largest group contains 46 SMSAs, with Lake
Charles, LA, being the modal city, while the smallest group contains 18 cities,
with Dallas, TX, being its representative.
Table 1-1
Major SMSA Groupings for General Environmental Purposes

Group   No. of
No.     Cities      Representative City
1       36          Little Rock-North Little Rock, AR
2       46          Lake Charles, LA
3       27          Williamsport, PA
4       48          Albany-Schenectady-Troy, NY
5       18          Dallas, TX

Cities Close to Modal City (in group order):
Baton Rouge, LA; Corpus Christi, TX; Lafayette, LA; Midland, TX;
Montgomery, AL; Odessa, TX; Tyler, TX; Spartanburg, NC;
Parkersburg-Marietta, WV-OH; Davenport, IL; Evansville, IN-KY;
Lawrence, MA; Peoria, IL; Appleton-Oshkosh, WI; New Britain, CT;
Portland, OR; Charlotte, NC; Richmond, VA
-------
From this classification, the single most representative city for environmental
analysis in the United States is Louisville, Kentucky. If one were
limited to a single case study or demonstration project, the results
suggest that it should be located in Louisville, KY. The study results could
then be generalized to other cities. If resources allowed for a greater number
of case studies or demonstration projects,
say five, these should be located at Little Rock, AR; Lake Charles, LA;
Williamsport, PA; Albany-Schenectady-Troy, NY; and Dallas, TX, with the results
being appropriate for the other cities in each of the five groups.
To assess the effects of city size on city characteristics, the set of
262 cities was divided into small (less than 200,000 population), medium
(between 200,000 and 500,000 population), and large (greater than 500,000
population) SMSAs. Two analyses were performed. First, the second stage factor
analysis was repeated for each of the city size groups. Second, a separate
clustering, similar to that described for the entire sample, was performed with-
in each of these strata.
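The size stratification can be expressed as a short sketch. The report does not say how SMSAs with exactly 200,000 or 500,000 population were assigned; placing both boundary values in the medium stratum is an assumption, as are the function name and the sample populations.

```python
def size_stratum(population):
    """Assign an SMSA to a size stratum by total population."""
    if population < 200_000:
        return "small"
    if population <= 500_000:   # boundary handling is an assumption
        return "medium"
    return "large"

# Hypothetical populations (not data from the report).
populations = {"City A": 150_000, "City B": 320_000, "City C": 2_400_000}

strata = {}
for name, pop in populations.items():
    strata.setdefault(size_stratum(pop), []).append(name)
```

Each stratum can then be passed separately through the same factor analysis and clustering steps used for the full sample.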
Second stage factors remained stable for the three size groups. For example,
factor 1 (largest explanatory power) from the all city analysis, indicating low
income, low expenditures for sewerage and low levels of manufacturing activity,
showed up as factor 1 in the analysis for each group of cities. These clusters
were based on a single set of stage 2 factors, the set used for the general
classification.
Separate classifications may be very useful where city size is of great
importance. The classifications appear to provide similar results. Note from
Table 2-7 that small, medium, and large SMSAs are distributed throughout the
general classification. Clusters within city size strata were found to be
similar to the general SMSA groups, and to clusters in the other size groups.
For example, Group 1 of the small SMSAs and Group 1 of the medium SMSAs show
similar characteristics, as do Group 3 of the medium size SMSAs and Group 4 of
the large SMSAs. In other words, city size did not significantly affect the
classification scheme.
1.3.2 Application to Alternative Research Hypotheses —
The data collected in this research project, and the methodology developed
may be used to assist other environmental and urban research in three major
ways: through the direct use of the data, through the identification of appro-
priate case study sites, and through facilitating the generalization of study
results. Each of these will be described in turn.
The use of the data collected during this project for other research
purposes is an obvious function. Although our data collection efforts were
limited to secondary sources, some of the information contained in the data
base was not easily accessible to the public. Information on ambient water
and air quality, obtained from STORET and from the SEAS model, are two such ex-
amples. Some of the descriptors of land use represent another case in point:
the urbanized proportions of SMSAs, and the land area devoted to outdoor
recreation were obtained from USDA and the Bureau of Outdoor Recreation,
respectively. This data collection effort should not be duplicated by other
researchers; a comprehensive description of our data collection efforts, as
well as a complete listing of data may be found in Volume III of this
series of reports.
-------
As described above, a broad data base containing some 200 variables was
collected, containing descriptors of ambient environmental quality and a diversity
of other phenomena believed to affect environmental quality. Alternative
research hypotheses may be described in terms of the variables contained in the
data base, on the basis of which representative cities appropriate for case
study/demonstration project siting may be selected. For example, a program
analyst interested in the effects of the bottle bill on resource recovery may
be interested in funding a limited number of demonstration projects. The optimal
sites for these may be identified by first specifying the variables believed
to affect the outcome, then performing factor analysis, grouping, and the
selection of representative cities as described above. If the number of
variables is limited and the research hypothesis is well defined, a simple
one-stage factor analysis may be appropriate. Additional variables of interest
available from secondary sources may also be added. In addition, the universe
of cities may be limited to fit the requirements of the particular research
project; it may be limited to cities in a certain size range, in a certain
geographical region, or to cities possessing certain attributes, such as high
unemployment.
Environmental deterioration and other problems frequently occur in SMSAs
which are "outliers," which do not fit into any of the groups. Although the
studies analyzing these effects cannot be easily generalized to other cases, a
limited extrapolation may be possible. An examination of the data for the
outlying SMSAs will identify the factor axes or variables with extreme observa-
tions, which are in fact responsible for the outlier position of the city. If
these variables are not crucial to the analysis, the city may be grouped with
others in terms of the remaining variables. The study results then can be
generalized to this group, although at a lower level of confidence.
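One way to carry out this examination is sketched below: for an outlying SMSA, flag each factor axis on which its score lies more than a chosen number of standard deviations from the all-city mean. The cutoff of 2.0, the function name, and the scores are hypothetical; the report does not specify a numerical rule for "extreme."

```python
import statistics

def extreme_axes(city_scores, all_scores, cutoff=2.0):
    """Indices of the axes on which a city's score is extreme.

    all_scores is the list of score tuples for every city in the sample.
    """
    axes = []
    for d, value in enumerate(city_scores):
        column = [s[d] for s in all_scores]
        mean = statistics.mean(column)
        sd = statistics.pstdev(column)
        if sd > 0 and abs(value - mean) / sd > cutoff:
            axes.append(d)
    return axes

# Hypothetical two-factor scores; the last city is the outlier.
all_scores = [
    (0, 0), (1, 1), (-1, -1), (0, 0), (1, 1),
    (-1, -1), (0, 0), (0, 1), (10, -1),
]
outlier_axes = extreme_axes(all_scores[-1], all_scores)
```

In this sample only the first axis is flagged, so if that factor is not crucial to the analysis the city could be grouped with others on the remaining axis.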
This capability was tested during the course of the project in connection
with siting potential demonstration projects by the Energy Research and
Development Administration (ERDA). This agency is interested in a limited
number of demonstration projects for electric cars; potential sites for these
demonstration projects were identified by USR&E. The results of this
application are described in more
detail in Chapter 4 of this volume, as well as in Volume II.
Case studies or demonstration projects may not always be performed at their
ideal sites; data limitations or other constraints may prevent this. Alternatively,
a researcher may be interested in generalizing the results of a previously
performed study. The methodology developed in this project may facilitate
this process as well. Variables are selected, factor analysis is performed,
and groupings are developed in the same manner as described above.
The cities of interest are then located within or outside the groups, indicating
the degree to which the study results may be generalized. The development of the
general city typology, the data bases, factor analytic techniques, and cluster-
ing methods used are described in Chapter 2. The applicability of the groups
and their modal cities to general environmental research is discussed in
Chapter 3; the development of alternative urban typologies is summarized in
Chapter 4.
-------
2.0 GENERAL CLASSIFICATION OF SMSAs
2.1 Research Design and Data Base
Essential to the design of any classification methodology are (1) defini-
tion of the entities to be classified, (2) identification of attributes to be
considered in the classification and the formulation of hypotheses concerning
those attributes, and (3) selection of appropriate techniques by which the
data can be used together to differentiate between observations. Each of
these will be discussed below.
2.1.1 The Set of Localities to be Classified —
For this research, the definition of entities to be classified was a simple
matter. Standard Metropolitan Statistical Areas (SMSAs) have been defined
by the Office of Management and Budget (OMB) to represent the areas in and
around one or more cities that act as a center in which the activities form an
integrated economic and social system. Although other definitions of U.S.
metropolitan areas are available, none have been utilized as extensively as
the SMSA for the collection of data. The use of SMSAs achieved the greatest
possible amount of data consistent between metropolitan areas of the U.S.
As cities are constantly changing and OMB revises the definition of an
SMSA periodically, it was necessary to perform the classification from the
perspective of a single point in time. The set of SMSAs defined as of January
7, 1972 was arbitrarily selected as the set of SMSAs to be classified and data
utilized in the classification was that which most closely represented each SMSA
at that point in time.
2.1.2 The Data Base —
Figure 1 in "Executive Summary" presents a general diagram intended to identify
the major determinants of environmental quality. Beginning at the bottom of
the diagram, ambient environmental quality is shown to be a function of both
residuals and the capacity of the environment to dilute and/or neutralize
pollutants (urban form and physical environment). Urban form also influences
residuals generation, particularly in the transportation sector. Residuals
are also the net result of pollution generated by households and industry and
abatement efforts by the public sector. The public sector also influences
consumption through investment in public facilities. However, the three
areas at the top are reduced to simple form by associating abatement with
the public sector, consumption with households, and production with industry.
Second order effects, such as the influence of environmental quality on the
three actors at the top of the causal path, are not considered here.
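The first-order structure described above can be written down as a small directed graph. The sketch below is one reading of this paragraph, not a reproduction of Figure 1: the edge list is inferred from the text, and the simplified form omits the public sector's influence on household consumption, as the text does.

```python
# Each key influences the items in its list (first-order effects only).
EDGES = {
    "Public Sector": ["Residuals"],          # abatement
    "Households": ["Residuals"],             # consumption
    "Industry": ["Residuals"],               # production
    "Urban Form and Physical Environment": [
        "Residuals",                         # e.g. transportation sector
        "Ambient Environmental Quality",     # dilution and neutralization
    ],
    "Residuals": ["Ambient Environmental Quality"],
    "Ambient Environmental Quality": [],
}

def endogenous(edges):
    """Nodes that receive at least one influence from another node."""
    return sorted({target for targets in edges.values() for target in targets})

def exogenous(edges):
    """Nodes that receive no influence within the first-order model."""
    return sorted(set(edges) - set(endogenous(edges)))

inside = endogenous(EDGES)
outside = exogenous(EDGES)
```

Under this reading, residuals and ambient environmental quality are the only endogenous elements, and the four remaining elements are the ones used to build the general classification.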
The primary areas of interest, then, are:
1. ambient environmental quality
2. urban form and the physical environment
3. residuals
4. demographic characteristics
5. government
6. industry.
-------
The variables in each of these categories comprise a significant
characteristic set (SCS). Initial sets of variables used to represent the six SCSs are
listed in Appendix A to this volume. These variables represent the data which
are presently available; many items are indicators of relevant activity or
surrogates for otherwise unquantified attributes.
For the general environmental classification of cities, only four SCSs were
utilized. These are: urban form and the physical environment; demographic
characteristics; government; and industry. These SCSs include the variables
appropriate for a general urban classification scheme. Further, these variables
describe some of the factors which determine ambient environmental quality, as
well as the sources and levels of pollutants. The resulting typology should
then be appropriate for urban research, as well as for general environmental
policy analysis.
SCS2 (urban form) contains variables describing the distribution of
activities within the SMSA, the density of the SMSA and its urbanized portions,
the assimilative attributes of the city, as well as some transportation related
measures. SCS4 is comprised of household descriptors, providing information on
the demographic characteristics of the population, on housing quality and
living conditions, on economic welfare, on population and income changes, and
on the modal split in transportation. The public sector, SCS5, contains
information on government expenditures for improving environmental quality, as
well as general community concern and involvement. SCS6 is comprised of
industry variables indicating the importance of industries critical to
environmental quality,* the importance of manufacturing, and the industry
mix in terms of 20 manufacturing categories, wholesale trade, retail trade, and
selected services.
Because previous studies have found size and regional location to dominate
groupings, the variables were "standardized" where appropriate, in that they
were expressed in per capita or normalized terms. Regional biases were also
eliminated where possible; variables that describe only regional location (e.g.,
latitude) were excluded.
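The per capita standardization described above can be sketched in a few lines of Python. This is only an illustration: the function name, the SMSA labels, and the figures are hypothetical and are not taken from the study's data base.

```python
def per_capita(totals, populations):
    """Divide each SMSA total by its population to remove the size effect."""
    return {smsa: totals[smsa] / populations[smsa] for smsa in totals}

# Hypothetical figures, for illustration only (not from the study's data base).
retail_sales = {"A": 500_000.0, "B": 2_000_000.0}
population = {"A": 100_000, "B": 400_000}

sales_per_capita = per_capita(retail_sales, population)
print(sales_per_capita)
```

In this made-up case the two SMSAs differ fourfold in total sales but are identical once size is removed, which is the effect the standardization is meant to achieve.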
2.1.3 Methodology —
As described in Chapter 1, the data set contains too many variables to
perform groupings directly, so a data reduction technique is necessary.
An iterative, two stage factor analysis was chosen in order to facilitate
framing the research hypotheses, and to aid in the interpretation of the
resulting factors. Selected factors derived from simple factor analyses of the
four individual SCSs served as the input to the second stage factor analysis;
the second stage factors then formed the variables used as the basis for the
clustering.
The basic function of factor analysis is to ascertain and measure the funda-
mental dimensions or interrelationships of a set of variates. The transforma-
tion is made possible by moderate or high correlation between variables which
compose the original data base. When two or more variables are very highly
correlated, a single "factor" may describe this related variable set.
*Industries critical to environmental quality have been defined as the
heaviest polluters, and industries where abatement is the most difficult.
-------
In geometric terms, correlations between the original variates indicate
nonorthogonal dimensions. In most cases, resolution into orthogonal
(independent) dimensions, or factors, will simplify the vector space.*
Many alternative techniques have been devised in the development of
factor analysis. Although new methods often improve on the last for some
analyses, no single method is regarded as better than all others. In fact,
none is applicable to all research efforts. The methods differ in basic
assumptions about the data base as well as in solution techniques. As a result,
the value of any given application of factor analysis rests in the ability of
the researcher to choose the method most appropriate to his research. Alter-
native methods of performing factor analysis including principal components
analysis and common factor analysis are described in Appendix A to Volume II.
For purposes of this research, principal components analysis was selected
as the preferred method, with common factor analysis being the principal
alternative considered. The results of the two techniques did not differ
substantially, and principal components analysis was the less expensive
alternative in terms of computer costs. The choice of this technique is
described in greater detail in Section 4.5.3 of Volume II. Other aspects of
refining the methodology include selecting the method of rotating the factors
and the choice of the clustering technique. These are discussed in Chapters
5 and 6 of Volume II, respectively.
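As a rough illustration of principal components analysis, the Python sketch below extracts only the first component of a small correlation matrix by power iteration. It is not the program used in the study; the data values, function names, and the choice of power iteration are all assumptions made for this example, and no factor rotation is attempted.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix for a list of observations (one row each)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(p)] for a in range(p)]

def first_component(corr, iterations=200):
    """Power iteration: largest eigenvalue of corr and its unit eigenvector."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p)) for a in range(p))
    return eigenvalue, v

# Hypothetical data: three variables, the first two strongly correlated.
data = [[1.0, 2.1, 5.0], [2.0, 3.9, 3.0], [3.0, 6.2, 4.0],
        [4.0, 7.8, 6.0], [5.0, 10.1, 2.0]]
corr = correlation_matrix(data)
eigenvalue, vector = first_component(corr)
loadings = [math.sqrt(eigenvalue) * x for x in vector]  # variable-factor correlations
share = eigenvalue / len(corr)  # proportion of total variance explained
print(loadings, share)
```

Because the first two variables move together, they load heavily on the same component, which is the sense in which a single factor can stand in for a set of highly correlated variables.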
2.2 Stage I Factor Analysis
For each SCS, the general approach to developing Stage I factors was to
perform initial statistical descriptions and then repeat factor analyses to:
1. clear the data set of unrealistic and erroneous
observations;
2. modify hypotheses and the variables in each SCS
to reflect knowledge gained from initial analysis;
3. estimate values to replace missing observations;
4. complete the Stage I factor analysis.
The sections of this chapter will describe, separately for each SCS, the
course of analysis followed and the outcome of Stage I analysis.
2.2.1 SCS2—Urban Form and the Physical Environment —
The analysis of SCS2 included a large number of factor analysis runs—
each yielding new information about the measures involved. From the 30
variables originally included in this SCS, 15 were chosen for the final Stage
I factors which best reflected the urban form concept of the original research
hypothesis.
*Note, however, that the final (preferred) solution may include non-
orthogonal factors.
-------
The remaining set of 15 variables was factor analyzed, and yielded the
five factors shown in Table 2-1, which explain 65.7 percent of the variance
in the variable set. Factor 1, for example, loads on all five central
tendency measures, and thus presents a measure of the proportion of activity
which takes place in the central city(s) of the SMSA. It is not surprising
this appears as the primary indicator of urban form.
Factor 3 has a high positive loading on the number of primary radial
facilities and a positive loading on the number of major circumferential facil-
ities (roads) in the urban area. The factor loads negatively on square miles
per person in the SMSA. Together, these indicate a heavily urbanized area.
Factor 5 loads heavily on the number of population centers in the area
(PCR) and the number of square miles per person in the central city(s) (SQMC).
This indicates a dispersed pattern of urbanization involving more than one
community in the core of the area.
Table 2-1
SCS2 - Stage I - Factor Analysis for 15 Variables
Variable
Portion of Employment in Central City
Portion of Total Population in Central City
Portion of Manufacturing Employment in the Central City
Portion of Retail Sales in Central City
Portion of Manufacturing in Central City
Portion of Land Which is Urbanized
Portion of Workers Working Outside SMSA
Portion of Land Devoted to Outdoor Recreation
Number of Radial Roadways
Number of Major Circumferential Roadways
Square Miles per Person
Total Miles of Roadway per Capita
Portion of Principal Arterials
Number of Population Centers
Square Miles per Person - Central City
[Factor loading columns not recoverable]
Factor loadings are indicated as follows: .50 to .75 (+); .75 to 1.00 (++); -.50 to -.75 (-); -.75 to
-1.00 (--). The factors are numbered in order according to the percent of variance explained.
65% of variance explained.
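The table note classes loadings by sign and magnitude (.50 to .75 and .75 to 1.00). A small Python sketch of that coding rule follows. The function name is hypothetical, the doubled symbols for the stronger class are an assumption where the printed legend is hard to read, and the treatment of a loading of exactly .75 (placed in the stronger class) is also an assumption, since the printed ranges overlap at that value.

```python
def loading_symbol(loading):
    """Map a factor loading to a table symbol by sign and magnitude."""
    magnitude = abs(loading)
    if magnitude < 0.50:
        return ""  # loadings below .50 are left blank
    sign = "+" if loading > 0 else "-"
    # Assumption: exactly .75 is placed in the stronger class.
    return sign * 2 if magnitude >= 0.75 else sign

for value in (0.62, 0.81, -0.55, -0.90, 0.20):
    print(value, repr(loading_symbol(value)))
```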
-------
2.2.2 SCS4—The Household Sector —
Analysis of SCS4 data yielded factors which describe the socioeconomic
character of an urban area. In this sense, the analysis of this SCS is
more similar than any other to previous efforts in city classification.
The variables used in this factor analysis, and their factor loadings are
presented in Table 2-2. The interpretation of the resulting seven factors
is summarized in Table 2-3.
Table 2-2
SCS4 - Stage I - Factors
Variable
Movers into the SMSA
Individuals Residing in the Same Dwelling for Five Years
Population Change - SMSA
Female/Male in 20-64 Age Group
Population Change - Central City
Employment/Population - Total
Employment/Population - Male
Employment/Population - Black
Employment/Population - Female
Income - Gini Coefficient
Median Family Income
Married Couples without Own Household
Crowded Housing Units
Per Capita Income
Work Trip - Drivers
Work Trip - Passengers
Single Family Dwellings
Households with One or More Automobiles
Owner-Occupied Housing
Units in Structure - More than Five
Portion of Population which is Urban
Units in Structure - More than Fifty
Portion of Population which is of Foreign Stock
Average Household Size
Median Age
Fertility Rate
Portion of Population which is Black
Infant Death Rate
Relative Death Rate
[Factor loading columns for Factors 1 through 7 not recoverable]
73.3% of variance explained
-------
Table 2-3
SCS4 - Stage I - Interpretation of Factors
Factor                  Positive Score Indicates Some Combination of
1. Growth               Immigration to the SMSA
                        Increasing population in SMSA and central city
2. Employment           High employment participation rates in all sectors of
                        the population
3. Standard of          Relatively low level of income as indicated by median
   Living               and per capita measures
                        Unequal distribution of income over the population
                        Crowded housing
4. Location-            High proportion of single unit and owner-occupied housing
   Density              Heavy dependence on the automobile
5. Cosmopolitan         Relatively high level of income as indicated by a
                        per capita measure
                        Dense housing
                        Urban and foreign elements of the population
                        relatively large
6. Family               Crowded housing
   Structure            Large proportion of young families
7. Health               High infant death rate
   Standard             Relatively high overall death rate given the age
                        distribution of the population
2.2.3 SCS5—The Public Sector --
As indicated in Table 2-4, eight variables have been used to describe the
public sector. These yielded four factors.
Table 2-4
SCS5 - Stage I - Final Factors
Variable
Local Government
    General Revenue*
    General Expenditures
    Employment
Sewerage Expenditures - Total
Sewerage Expenditures - Other than Capital
Expenditures on Water Supply
Expenditures on Parks & Recreation
Expenditures on Sanitation Other than Sewerage
Factors
1   Large government
2   Sewerage
3   Water and parks
4   Sanitation
[Factor loading columns not recoverable]
85.5% of variance explained
-------
2.2.4 SCS6: The Industrial Sector —
Results of the first stage factor analysis for this SCS are shown in
Table 2-5. The eight factors described here explain 59 percent of the
variance indicated by the 24 input variables. The 8 factors appear to describe
well the basic dimensions of industry mixes and could be easily titled as
follows:
1. overall level of industrial activity
2. services and trade
3. textiles and apparel
4. miscellaneous manufacturing and instruments
5. fuel and chemicals
6. paper and allied products
7. leather and leather products
8. lumber and wood products.
The factors are listed in order according to the amount of variance explained
by each; the latter factors, therefore, are the least valuable to the factor
description of SMSAs.
Table 2-5
SCS6 - Stage I - Final Factors
Factor  Variables
1       VAM   Value added in Manufacturing
        EM    Employment in Manufacturing
        S34   Value added in fabricated metal products
        S35   Value added in machinery, except electrical
2       WHOL  Wholesale Sales
        RETT  Retail Sales
        SS    Selected Services
        S27   Value added in printing and publishing
3       S22   Value added in textile mill products
        S23   Value added in apparel and other textile products
4       S39   Value added in miscellaneous manufacturing industries
        S38   Value added in instruments and related products
5       S28   Value added in chemicals and allied products
        S29   Value added in petroleum and coal products
6       S26   Value added in paper and allied products
7       S31   Value added in leather and leather products
8       S24   Value added in lumber and wood products
59% of variance explained.
NOTE: These factors did not load on the remaining 7 variables: S20, value
added in food and kindred products; S25, value added in furniture and fixtures;
S30, value added in rubber and plastic products; S32, value added in stone,
clay and glass products; S33, value added in primary metal industries; S36,
value added in electrical equipment and supplies; S37, value added in
transportation equipment.
-------
2.3 Stage II Factor Analysis
The two-stage application of the factor analytic technique was utilized
to ensure the proper evaluation of research hypotheses. Stage I analysis
involved separate factor analysis for each SCS to reveal the hypothesized
underlying dimensions within each group of variables. The output from
individual Stage I analyses was then combined, and used as the input to Stage
II. Stage II thus identifies the relationships between SCSs and the basic
underlying attributes of U.S. metropolitan areas.
For a variety of reasons, both conceptual and statistical, a limited number
of Stage I factors were chosen for input to Stage II. This set included
the first four factors from four SCSs (2, 4, 5, and 6), with the exception of
Factor 3 from SCS2.
Stage I analysis yielded a different number of factors for each of the
four SCSs to be pursued in Stage II. For SCSs 2, 4, 5, and 6, the number of
factors was 4, 7, 4, and 8, respectively, as indicated in the previous section.
To use this set of factors in Stage II would create a significant bias
toward SCS4 (the household sector) and SCS6 (the industry sector). In
addition, since Stage I factors from each SCS are mutually independent, excess
factors in SCS4 and SCS6 will lead to the formation of additional separate
factors beyond the primary factors indicated by interactions between the
four SCSs. With these considerations, the number of factors from Stage I
to be included in Stage II was limited to four per SCS.
The loss of explanatory power from the exclusion of these factors is
significant but tolerable. The loss will be 23.3 percent in SCS4 and 21.0
percent in SCS6. Percent of variance explained by the first four factors
in each SCS is as follows:
SCS2*   Urban Form         70.4 percent
SCS4    Household Sector   50.0 percent
SCS5    Public Sector      85.5 percent
SCS6    Industry Sector    38.0 percent
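The loss from dropping factors beyond the fourth is the difference between the variance explained by all Stage I factors and the variance explained by the first four. A short Python check, using the totals printed with Tables 2-2 and 2-5 (73.3 percent for SCS4 and 59 percent for SCS6) and the four-factor figures listed above; the variable names are chosen for this example only.

```python
# Percent of variance explained, as printed in this chapter.
all_factors = {"SCS4": 73.3, "SCS6": 59.0}  # all Stage I factors (Tables 2-2, 2-5)
first_four = {"SCS4": 50.0, "SCS6": 38.0}   # first four factors only

# Loss of explanatory power when only the first four factors are kept.
loss = {scs: round(all_factors[scs] - first_four[scs], 1) for scs in all_factors}
print(loss)
```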
Although several alternative combinations of Stage I factors were tested,
the even distribution of factors between SCSs yielded the most meaningful
factors; therefore, the reduced set was accepted as the best input to Stage II.
Factor analysis of the 15 Stage I factors shown in Table 2-6 yielded
6 Stage II factors which, together, explain 63.1 percent of the variance in
Stage I factors. These factors are intuitively satisfying as well as
statistically valid in that each factor represents a set of characteristics
which are likely to be encountered together in an urban area.
Factor 1 indicates a low standard of living, low government expenditures
for sewerage, and a low level of total manufacturing activity; a high score
on this factor would characterize an economically depressed area.
*Factor 3 was dropped from SCS2 because of data problems.
-------
Factor 2 indicates a low level of total manufacturing activity, heavy
growth in recent years and a high concentration of population and economic
activity in the core (central city) of the SMSA.
Factor 3 indicates manufacturing activity in the miscellaneous category,
a compact core, and heavy expenditures for sanitation other than sewerage.
Factor 4 indicates high employment and service trade activities.
Factor 5 indicates low residential densities, high auto dependence, little
manufacturing in industries such as textiles and apparel, and heavy expenditures
on water supply and recreation.
Finally, factor 6 indicates a highly urbanized SMSA with a relatively
large local government.
Table 2-6
Stage II Factors
       Stage I                                                 Stage II
SCS    Factor No.   Stage I Factor Name                        Factor     Loading
4      3            Low Standard of Living                     1          ++
5      2            Expenditure on Sewerage                    1          --
6      1            Overall Level of Manufacturing             1, 2       -, -
4      1            Growth                                     2          ++
2      1            Central Tendency                           2          +
6      4            Miscellaneous Manufacturing & Instruments  3          ++
2      4            Sprawling Core                             3          -
5      4            Expenditure on Non-Sewerage Sanitation     3          +
4      2            Employment                                 4          ++
6      2            Services & Trade                           4          ++
4      4            Location-Density (many single-family       5          +
                    homes, high auto dependence)
6      3            Textiles & Apparel                         5          -
5      3            Expenditure on Water Supply & Recreation   5          +
5      1            Large Government                           6          ++
2      2            Highly Urbanized                           6          --
-------
Before proceeding to the next stage of the analysis, which identified
groups of similar cities, a test was performed to determine the stability of
Stage II factors between different size cities. The factor analysis was
repeated for each of three groups of cities:
Group     Population
small     less than 200,000
medium    200,000-500,000
large     more than 500,000
The proportion of variance explained by primary factors is stable, varying
only from 61.7 percent to 65.4 percent.
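The three size classes used in the stability test can be written as a simple rule. This Python sketch assumes that the boundary values of 200,000 and 500,000 both fall in the medium class, as the printed ranges suggest; the function name and the sample populations are hypothetical.

```python
def size_group(population):
    """Assign an SMSA to the size class used in the stability test."""
    if population < 200_000:
        return "small"
    if population <= 500_000:
        return "medium"
    return "large"

# Hypothetical populations, for illustration only.
for pop in (150_000, 200_000, 500_000, 2_400_000):
    print(pop, size_group(pop))
```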
2.4 SMSA Groupings and Representative Cities
Groups of similar cities were identified through a simple geometrical
clustering technique in which each SMSA is initially considered a separate point.
The two groups separated by the smallest geometric distance are then located
and combined to form a new group with its centroid (center) midway between
the two points. Then the two groups with the nearest centroids in the new set
are combined and a new centroid located; the process can continue until only
one group remains. The centroids are weighted by the number of SMSAs
already in the group.
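The merging procedure described above can be sketched in Python as follows. This is a minimal illustration and not the UCLA program used in the study: the use of Euclidean distance, the function name, the stopping rule (a fixed target number of groups), and the sample points are all assumptions.

```python
import math

def centroid_clustering(points, target_groups):
    """Repeatedly merge the two groups whose centroids are nearest."""
    # Each SMSA starts as its own group; a group keeps member indices and a centroid.
    groups = [{"members": [i], "centroid": list(p)} for i, p in enumerate(points)]
    while len(groups) > target_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = math.dist(groups[a]["centroid"], groups[b]["centroid"])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        ga, gb = groups[a], groups[b]
        na, nb = len(ga["members"]), len(gb["members"])
        merged = {
            "members": ga["members"] + gb["members"],
            # Weight each centroid by the number of SMSAs already in its group.
            "centroid": [(na * x + nb * y) / (na + nb)
                         for x, y in zip(ga["centroid"], gb["centroid"])],
        }
        groups = [g for i, g in enumerate(groups) if i not in (a, b)] + [merged]
    return groups

# Hypothetical factor scores for five SMSAs on two factors.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
groups = centroid_clustering(points, target_groups=3)
print([sorted(g["members"]) for g in groups])
```

When two single points merge, the weighted centroid reduces to the midpoint between them, matching the first step in the description.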
Criteria for choosing a stopping point in the process include the size and
number of groups, and the relationship between within-group variance and between-
group variance.
Once the set of groups is selected, modal cities are identified by simply
determining which city in each group lies closest in the multidimensional space
to the geometric center of the group.
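Selecting a modal city amounts to a nearest-to-centroid search. A minimal Python sketch, assuming Euclidean distance and an unweighted group centroid; the city labels and factor scores are hypothetical.

```python
import math

def modal_city(scores):
    """Return the city whose factor scores lie closest to the group centroid."""
    names = list(scores)
    dims = len(scores[names[0]])
    centroid = [sum(scores[n][k] for n in names) / len(names) for k in range(dims)]
    return min(names, key=lambda n: math.dist(scores[n], centroid))

# Hypothetical factor scores on two Stage II factors (illustration only).
group = {"City A": (1.0, 0.0), "City B": (0.0, 1.0), "City C": (0.4, 0.6)}
print(modal_city(group))
```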
It would have been possible to develop SMSA groups directly from the
initial variables using the geometrical clustering routine. In practical terms,
however, the variables of the data base provide too many dimensions along
which cities may differ; the additional descriptive information provided by the
variables stresses the uniqueness of each city rather than underlying basic
characteristics which the cities have in common. Because the use of too many
dimensions creates an unmanageable set of groups, input to the cluster analysis
was arbitrarily limited to the first four Stage II factors.
Table 2-7 shows how the 262 SMSAs grouped together to form basic classes
of SMSAs. Five major groups of cities were identified; these include 175 of
the 262 cities. The remaining cities grouped together as follows:
36 SMSAs in groups of five or more
34 SMSAs in groups of two to five
17 SMSAs in groups of one.
*Computer program written by Howard Gilbert and Steve Chasen, Health
Sciences Computing Facility, University of California, Los Angeles, California.
Reference: R.R. Sokal and P.H.A. Sneath (1973) Numerical Taxonomy: The
Principles and Practice of Numerical Classification (San Francisco: W.H.
Freeman and Co.).
-------
In Table 2-7 the double lines indicate division between groups and the
dotted lines delineate subgroups. Subgroups within the major groups
exhibit minor dissimilarities; the major groups are dramatically different.
Group I shows a low level of manufacturing activity, low income and low
expenditures on sewerage. This is tempered by moderate loadings on Stage II
factors 2 and 4 which signify a growing economy oriented more than the average
toward services and trade. Little Rock, the modal city for this group, is very
close to the group's centroid. In 1970, Little Rock was less wealthy than
the average SMSA as is indicated by the number of families below the low income
level: 13.5 percent as compared to the national level of 8.5 percent. But the
area is growing and, in 1970, enjoyed a low rate of unemployment of 3.3 percent
(average for all SMSAs was 4.3 percent), and 34.7 percent of all housing units
were built since 1960 (average for all SMSAs was 25.5 percent). Employment in
manufacturing was below the national level: 20.1 percent as compared to 25.8
percent for all SMSAs in 1970. Other cities found near the center of this
group include Baton Rouge, LA; Corpus Christi, TX; and Montgomery, AL.
Group II includes cities which are closer to the centroid of all cities
than Group I. Factor scores indicate these cities have high unemployment and
are not active in services and trade. In addition, they generally have slightly
lower than average income levels and economic activity and they may have
experienced less than average growth. Lake Charles, LA, the modal city for this group,
is still different from a hypothetical city at the centroid of the group, loading
more heavily on Factor 1 and not at all on Factor 3. A high Factor 1 score
is the result of a lower than average standard of living. Of all families
in Lake Charles, 16.6 percent have incomes below the low income level and 4.7
percent of all housing units lack some or all plumbing facilities (national
averages for SMSAs are 8.5 and 2.9 respectively). A large negative score
on Factor 4 reflects the combined effect of slightly more than average activity
in services and trade accompanied by very high unemployment (5.7 percent as
opposed to the average 4.3 percent). Other cities near the centroid of the
group are Spartanburg, SC; and Parkersburg, WV. The group also includes the
overall modal city of Louisville, KY.
Group III has high negative scores on Factors 2 and 3. Cities in this
group, then, are expected to be small SMSAs (in area) with an industrial base
and little recent growth. Williamsport, PA, the modal city for this group,
includes only one county of moderate size; 42.6 percent of its employment is in
manufacturing and population growth in the decade ending 1970 was only 3.6
percent, as compared with the national average of 16.6 percent for SMSAs.
Other cities near the centroid of this group include Davenport-Rock Island-
Moline, IA-IL; Evansville, IN-KY; and Lawrence-Haverhill, MA-NH.
Group IV is even closer to the overall centroid than Group II. Scores are
moderately negative for Factors 1 and 2, moderately positive for Factor 3 and
almost zero for Factor 4. Thus, these cities are expected to rank about
average on the dimensions defined by the factor analysis. The modal city,
Albany-Schenectady-Troy, NY, has a more negative score than the centroid
-------
Table 2-7
Groups of Similar SMSAs*
GROUP 1
Abilene, TX
Lafayette, LA
San Angelo, TX
Midland, TX
Lubbock, TX
Tucson, AZ
Albany, GA
Knoxville, TN
West Palm Beach-Boca Raton, FL
Tampa, FL
Monroe, LA
Orlando, FL
GROUP 2
Huntington-Ashland, WV-KY-OH
Stockton, CA
Modesto, CA
. . . . . . . . . .
Augusta, GA-SC
Pueblo, CO
Fresno, CA
. . . . . . . . . .
Killeen-Temple, TX
Lewiston-Auburn, ME
Pine Bluff, AR
Spokane, WA
Owensboro, KY
Fort Myers, FL
Yakima, WA
MINOR GROUP
El Paso, TX
Tuscaloosa, AL
San Antonio, TX
Columbus, GA-AL
MINOR GROUP
Galveston-Texas City, TX
Manchester, NH
Santa Barbara, CA
Columbia, SC
Corpus Christi, TX
Shreveport, LA
Macon, GA
Texarkana, TX-AR
Wilmington, NC
Savannah, GA
Tyler, TX
Montgomery, AL
[illegible], MS
New Orleans, LA
Portland, ME
Waco, TX
Little Rock-North Little Rock, AR**
Odessa, TX
Tulsa, OK
Baton Rouge, LA
St. Joseph, MO
Sioux City, IA-NB
Billings, MT
Boise City, ID
Huntsville, AL
Springfield, MO
Amarillo, TX
Florence, AL
Santa Rosa, CA
Pensacola, FL
Riverside-San Bernardino-Ontario, CA
Provo-Orem, UT
Charleston, SC
Salem, OR
Duluth-Superior, WI-MN
. . . . . . . . . .
Altoona, PA
Gadsden, AL
Lake Charles, LA**
Mobile, AL
Bakersfield, CA
Chattanooga, TN-GA
Lakeland-Winter Haven, FL
Birmingham, AL
. . . . . . . . . .
GROUP 3
Allentown-Bethlehem-Easton, PA-NJ
Harrisburg, PA
St. Louis, MO-IL
Greenville, SC
Reading, PA
York, PA
Lancaster, PA
. . . . . . . . . .
Cedar Rapids, IA
Waterloo-Cedar Falls, IA
Fort Wayne, IN
Toledo, OH-MI
Rockford, IL
Erie, PA
Wheeling, WV-OH
Poughkeepsie, NY
Louisville, KY-IN
Richland-Kennewick, WA
Springfield, OH
Parkersburg-Marietta, WV-OH
Sacramento, CA
. . . . . . . . . .
Davenport-Moline-Rock Island, IA-IL
Williamsport, PA**
Elmira, NY
Mansfield, OH
Lima, OH
Peoria, IL
Lawrence-Haverhill, MA-NH
Gastonia, NC
Salt Lake City, UT
Petersburg-Colonial Heights-Hopewell, VA
Charleston, WV
Wilkes-Barre-Hazleton, PA
Beaumont, TX
Spartanburg, SC
New Bedford, MA
Alexandria, LA
Indianapolis, IN
Springfield, IL
Wichita, KS
Decatur, IL
Terre Haute, IN
Anderson, IN
*This table is to be used in conjunction with Table 6-5 as discussed below pertaining to Group V.
**Modal SMSA
-------
MINOR GROUP
[illegible]
Hartford, CT
Minneapolis-St. Paul, MN-WI
San Jose, CA
Milwaukee, WI
Washington, DC-MD-VA
Norwalk, CT
Rochester, NY
GROUP 4
Bristol, CT
Meriden, CT
New London-Norwich, CT-RI
Baltimore, MD
Los Angeles, CA
Melbourne-Titusville-Cocoa Beach, FL
Daytona Beach, FL
Newark, NJ
Philadelphia, PA
Appleton-Oshkosh, WI
Syracuse, NY
Racine, WI
Scranton, PA
Lorain-Elyria, OH
Worcester, MA
South Bend, IN
Binghamton, NY-PA
New Brunswick-Perth Amboy-Sayreville, NJ
Wilmington, DE-NJ-MD
Cleveland, OH
Detroit, MI
Trenton, NJ
Dayton, OH
Cincinnati, OH
Brockton, MA
Pittsfield, MA
Lowell, MA-NH
[illegible], WI
Anaheim-Santa Ana-Garden Grove, CA
New Haven-West Haven, CT
Chicago, IL
Jersey City, NJ
Seattle-Everett, WA
San Francisco-Oakland, CA
Portland, OR-WA
New Britain, CT
Oxnard-Simi Valley-Ventura, CA
Nashville, TN
Green Bay, WI
La Crosse, WI
Dubuque, IA
Ogden, UT
Santa Cruz, CA
Hamilton-Middletown, OH
Albany-Schenectady-Troy, NY**
GROUP 5
Charlotte, NC
Dallas, TX**
Oklahoma City, OK
Raleigh, NC
Lexington, KY
Tallahassee, FL
Jacksonville, FL
Durham, NC
[illegible], TN-AR-MS
Des Moines, IA
Kansas City, KS-MO
Stamford, CT
Richmond, VA
Omaha, NB
Denver, CO
Columbus, OH
Houston, TX
MINOR GROUPS
Fort Lauderdale-Hollywood, FL
Phoenix, AZ
Roanoke, VA
Sarasota, FL
Fall River, MA-RI
Albuquerque, NM
Eugene-Springfield, OR
Fargo-Moorhead, ND-MN
Lincoln, NB
Lafayette-West Lafayette, IN
Rochester, MN
Topeka, KS
Bloomington-Normal, IL
Canton, OH
Youngstown-Warren, OH
Pittsburgh, PA
Paterson-Clifton-Passaic, NJ
Utica-Rome, NY
Steubenville-Weirton, OH-WV
Johnstown, PA
Long Branch-Asbury Park, NJ
Atlantic City, NJ
Vineland-Millville-Bridgeton, NJ
Fort Smith, AR-OK
Lynchburg, VA
Asheville, NC
Las Vegas, NV
New York, NY
Madison, WI
Bryan-College Station, TX
Gainesville, FL
Columbia, MO
Austin, TX
MINOR GROUPS
Greensboro-Winston-Salem-High Point, NC
Nashville-Davidson, TN
Sioux Falls, SD
Reno, NV
Atlanta, GA
Baltimore, MD
Jackson, MI
Muskegon-Muskegon Heights, MI
Gary-Hammond, IN
Saginaw, MI
Muncie, IN
Bay City, MI
Flint, MI
Lansing-East Lansing, MI
Kalamazoo-Portage, MI
Ann Arbor, MI
Fayetteville, NC
Lawton, OK
Newport News-Hampton, VA
Norfolk-Virginia Beach-Portsmouth, VA-NC
San Diego, CA
Champaign-Urbana-Rantoul, IL
Colorado Springs, CO
[illegible], WA
Salinas-Seaside-Monterey, CA
Vallejo-Fairfield-Napa, CA
Great Falls, MT
Biloxi-Gulfport, MS
Miami, FL
Providence-Warwick-Pawtucket, RI-MA
Waterbury, CT
Springfield-Chicopee-Holyoke, MA-CT
Boston, MA
Bridgeport, CT
Akron, OH
Laredo, TX
McAllen-Pharr-Edinburg, TX
Brownsville-Harlingen-San Benito, TX
**Modal SMSA
-------
on Factor 1, perhaps the result of higher incomes and manufacturing activity.
Cities very similar to the modal city include Appleton-Oshkosh, WI, and New
Britain, CT.
Group V has high scores on Factors 2 and 4, describing large SMSAs which
are prosperous, as indicated by growth and high employment, and which are
active in services and trade rather than manufacturing. Dallas, TX, has
been designated as the modal city for this group and appears to fit the factor
description.
A great deal of caution should be exercised in dealing with the modal
city and groups of cities since the group includes a large number of cities
for which several of the values were estimated. Most of these estimated values
are for descriptors of the industry mix, which is important to the grouping
of these cities.
-------
3.0 APPLICABILITY TO GENERAL ENVIRONMENTAL RESEARCH PROBLEMS
As described in the previous chapter, the general city classification scheme
excluded the SCSs describing ambient environmental quality and the residuals
discharged into the environment. There were several reasons for this: a
classification scheme based on the other four SCSs would result in city
groupings useful for general urban research, and the data describing
environmental quality had some limitations. Further, the general city groupings
should reflect differences in the generation of residuals, and in ambient
environmental quality, if the causal relationships hypothesized in our research
design are true (see Figure 1).
In this chapter, first the data base contained in the ambient environmental
quality SCS and in the residuals SCS is described. Second, differentials
in environmental quality are analyzed between the general city groupings.
3.1 Environmental Data Base
This data base consists of eleven ambient water quality indicators, two
measures of air quality, a single subjective measure of perceived water quality,
and eleven drinking water quality variables.
The water quality variables were obtained from STORET. For each SMSA,
up to eleven longitude-latitude points were identified along the boundaries;
information was retrieved from all STORET stations within the polygon defined
by these longitude/latitude points. A simple average of the readings was then
calculated for each variable and used to indicate water quality differences
across SMSAs. These measures represented approximations at best because of
the uneven distribution of sampling over time and over space. Because the
motive for sampling varies, the parameters measured, the sampling methods and
the location of the STORET stations also vary. Missing values also represented
a significant problem.
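The retrieval step described above (keep the stations that fall inside the boundary polygon, then take a simple average) can be sketched in Python. This is an illustration only: it treats longitude and latitude as plane coordinates, uses a standard ray-casting test, and the function names, polygon, and station readings are hypothetical rather than STORET data.

```python
def inside(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a vertex list?"""
    x, y = point
    hit = False
    for i in range(len(polygon)):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit

def smsa_average(stations, polygon):
    """Simple average of readings from stations inside the boundary polygon."""
    readings = [value for location, value in stations if inside(location, polygon)]
    return sum(readings) / len(readings) if readings else None

# Hypothetical boundary (four points) and station readings of one variable.
boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
stations = [((1.0, 1.0), 8.0), ((3.0, 2.0), 6.0), ((5.0, 5.0), 2.0)]
average = smsa_average(stations, boundary)
print(average)
```

Returning None when no station falls inside the polygon mirrors the missing-value problem noted in the text.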
Suspended particulates and sulfur dioxide were the only two air quality
parameters included in our data base; for other descriptors of air quality
(oxidants, carbon monoxide, nitrogen dioxide, for example) information was
available for a limited number of SMSAs only. The SEAS data file was the source
of the air quality information.
The PDI index is a subjective measure of the prevalence, duration, and
intensity of water pollution, calculated by the Office of Water Programs in
EPA. Drinking water quality data has been obtained from the Water Supply
Division of EPA. The ten drinking water quality parameters include informa-
tion on the chemical content of the water supply, its alkalinity, hardness,
and acidity.
Information on the quantities of residual pollutants discharged into the
environment has been obtained from the SEAS data base. This data base includes
information on the quantities of residuals discharged into the air (particulates,
sulfur oxides, etcetera) and into the water (BOD, suspended solids, etcetera), and on
the generation of solid wastes. Data on residuals from the SEAS data bank is
-------
computed rather than directly measured. Industry coefficients are developed
for approximately 400 pollution producing economic sectors and subsectors.
These coefficients relate the generation of a specific pollutant by the partic-
ular industry to the output of that industry at the national level. The
coefficient times the output of the industry in the given SMSA equals the
total gross residual. A second coefficient estimating abatement by sector
is applied to the gross residual at the SMSA level to obtain the net residual.
The use of national coefficients for most sectors ignores regional differentials
in the residuals generation process.
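The two-coefficient calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the numbers are hypothetical, and we assume the abatement coefficient is the fraction of the gross residual that is removed, which the report does not state explicitly.

```python
def net_residual(coefficient, smsa_output, abatement_fraction):
    """Return (gross, net) residual for one sector in one SMSA.

    coefficient        -- national residual generated per unit of industry output
    smsa_output        -- industry output in the SMSA (same units as the coefficient)
    abatement_fraction -- ASSUMPTION: share of the gross residual removed by abatement
    """
    gross = coefficient * smsa_output           # coefficient times SMSA output
    net = gross * (1.0 - abatement_fraction)    # apply the second (abatement) coefficient
    return gross, net


# Made-up numbers, for illustration only (not taken from the SEAS data base).
gross, net = net_residual(coefficient=0.5, smsa_output=2000.0, abatement_fraction=0.25)
print(gross, net)
```

Because the generation coefficient is national, two SMSAs with the same sector output receive the same gross residual under this scheme, which is the limitation noted in the text.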
Industry output at the SMSA level is measured by total economic value of
production. The 1975 data used here is actually forecast by the SEAS model
rather than measured. The national forecast is shared out between SMSAs based
on disaggregate forecasts prepared by the Bureau of Economic Analysis (OBERS),
the Economic Information System (EIS) tapes and other appropriate sources.
3.2 Applicability to Environmental Research
The simplest method of testing the applicability of the general SMSA groups
to environmental policy analysis was to look at the variation in environmental
measures between groups of SMSAs. Three approaches to comparison have been
followed: regression analysis, factor analysis, and t- and F-tests (comparison
of means).
Regression analysis was used to test the hypothesis that there is a
significant relationship between environmental variables and Stage I factors
derived from nonenvironmental data. The statistics support this hypothesis.
For example, dissolved oxygen (DO) was found to rise with sewer expenditures
(S5F2)*, although it is negatively related to other sanitation expenditures
(S5F4). Growing cities have, on the average, lower DO than older centralized
cities (S2F1, S2F2, S4F1), and low income also is correlated with low stream DO.
A portion of these effects may be related to the fact that northern cities
naturally have higher DO because of lower temperatures. However, the general
relationship is that sewers, higher incomes, and slow growth all improve DO.
Similarly, significant relationships were found between the other environmental
quality variables and the Stage I factors. For a comprehensive
description of these results, see Chapter 6 of Volume II.
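The form of the regression test can be illustrated with a one-predictor least squares fit. This is a simplified sketch: the regressions reported in Volume II presumably used several Stage I factors at once, and the factor scores and DO values below are invented for illustration.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x (single predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: one Stage I factor score and mean DO (mg/l) for four SMSAs.
factor_scores = [-1.0, 0.0, 1.0, 2.0]
dissolved_oxygen = [6.0, 7.0, 8.0, 9.0]

slope, intercept = ols_slope_intercept(factor_scores, dissolved_oxygen)
print(slope, intercept)
```

A positive slope here would correspond to a statement such as "DO rises with sewer expenditures"; the significance of the slope would still have to be tested separately.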
The second approach to testing the relationship between environmental
characteristics and more basic urban attributes was to include Stage I factors
and variables from SCS1 and SCS3 in the Stage II factor analysis. The
second stage analysis was repeated with each of the eight variables indicating
ambient environmental quality. A close relationship between environmental and
general attributes would be expected to cause the added measures to join
Stage I factors from other SCSs to form factors similar to those which resulted
from the basic fifteen factor set. If environmental attributes were not closely
aligned with other attributes, the added measures would cause the factors to be
restructured to some extent—perhaps forming an entirely new Stage II factor.
*S5F2 indicates factor number 2 from SCS5.
23
-------
When SCS1 indicator variables were added, the result was as expected:
the indicator variables appended themselves directly to factors derived from
the basic fifteen factors. In no case was an additional factor developed.
The final approach to testing the suitability of the groups for environmental
analysis involved t- and F-tests (comparisons of means) to test whether
the groups were significantly different in terms of the environmental quality
variables.
In the simplest case, testing whether Group A and Group B have significantly
different values for a single variable, a t-test is used with the null hypothesis
indicating equal means for the two groups. For each variable, then, each
pair of groups was compared to generate t-statistics which indicate the magnitude
of any difference in the means relative to the variance of the given variable.
Significant differences between groups were found for all but one variable,
the PDI index.
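A pairwise comparison of this kind can be sketched as below. We assume the pooled-variance (Student) form of the two-sample t-statistic; the report does not say which variant was used, and the group values are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample t-statistic with pooled variance (null hypothesis: equal means)."""
    n_a, n_b = len(a), len(b)
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(a) - mean(b)) / std_err


# Hypothetical values of one environmental variable in two city groups.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]

t_stat = pooled_t_statistic(group_a, group_b)
print(round(t_stat, 3))
```

The statistic would then be compared with a t-distribution on n_a + n_b - 2 degrees of freedom to decide whether the two group means differ.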
Groups can also be compared in terms of general environmental quality by
performing F-tests between the groups using all eight indicator variables
together. The null hypothesis is that the groups are not distinct, that is,
that the eight environmental quality variables show no significant
differences between the groups. If this were true for
a pair of groups, they could be combined to form a single group in
terms of environmental quality.
In every case, however, F-statistics indicate with at least 60 percent
probability that the groups were significantly different given the eight
environmental indicators.
24
-------
4.0 OTHER RESEARCH PURPOSES
As discussed in Chapter 1, the data base and the methodology developed
during this research project may be useful for other research purposes. In
particular, the data may be used directly, city groupings and representative
cities may be selected for testing alternative research hypotheses, and the
results of studies performed in particular localities may be extrapolated
to other areas. During the project, this site selection capacity was tested
for ERDA, so that potential sites for electric car demonstration projects were
identified. The example described in the following section is followed by other
potential applications described in Section 4.2.
4.1 Transportation Demonstration Project
In the context of this project, a specific city classification scheme
was developed in response to a problem proposed by the Energy Research and
Development Administration (ERDA). ERDA is concerned with the potential for energy
conservation which may be achieved through alternative transportation policies in urban
areas, in particular through the use of electric cars. The development of an
appropriate urban classification scheme and the identification of SMSAs for
performing case studies and/or siting demonstration projects are the
objectives for this task.
Given the more limited scope of this classification, the factor analysis
was performed in a single stage. A set of thirteen variables was chosen from
the assembled data base to reflect attributes of an urban area which are
important in urban transportation analysis (see Table 4-1).
The transportation analysis is particularly interesting because it has
been performed for three city size strata (see page 6) as well as for the
full sample of 262 SMSAs.
Table 4-1
Variables Selected for the Transportation Classification
     Variable  Description
 1.  H1U       Percent of families in single-unit housing
 2.  TD        Percent of workers commuting as auto drivers
 3.  AUTO      Percent of households with one or more cars
 4.  CENE      Percent of SMSA employment in central city
 5.  PCS       Percent population change, 1960-1970
 6.  2PF       Percent of women 18 or older who are employed
 7.  Y2i       Median household income
 8.  PBL       Percent of population which is Black
 9.  PPH       Persons/housing unit
10.  SQMU      Square miles/person, urbanized area
11.  RAD       Count of radial highways
12.  CIR       Count of circumferential highways
13.  VMT       Vehicle miles travelled/capita-day
25
-------
Four separate single stage factor analyses were performed: one for the 262
SMSAs and one for each of the three city size strata. Although a total of
twelve factors were generated, three factors describing auto use, income/racial
characteristics, and highways dominate all the runs. In other words, city size
appears to have a limited effect on the factors describing urban transportation
characteristics.
Four separate clustering procedures were performed: one for all cities,
and one for each of the three size strata. The four most important factors
were used for clustering in each case; the factors selected varied somewhat
between the city size groups.
The 65 large cities formed two major groups with Providence, RI, and
Louisville, KY-IN, being their representatives. About a quarter of the large
cities are outliers, indicating the wide divergence in characteristics shown by
the large cities.
The medium size cities also formed two major groups, with Little Rock, AR,
and Tacoma, WA, being their modal SMSAs. Of the 87 cities in this group, only
9 were outliers.
Within the 107 small cities, there are five major groups, and about 10 percent
unclassified (outlier) cities. Modal cities were Sarasota, FL; Lincoln, NB;
St. Joseph, MO; Parkersburg, WV; and Spartanburg, SC.
The results of this classification may be used for a variety of purposes.
The modal cities suggest natural case study sites, or locations for demonstration
projects; the results of these may then be generalized to other cities
in their groups. Further considerations may lead to other choices; data
availability represents a case in point. The results of these studies may
also be generalized to a larger set of cities. In addition, factor scores
may be used to compare cities along the urban transportation related dimensions
defined by the factors.
Large city case studies, for example, should be located in Providence, RI,
and Louisville, KY-IN, with Providence results being applicable to Akron, OH,
Rochester, Pittsburgh, and so on, and the Louisville results being relevant for
Atlanta, Baltimore, Omaha, and so on. Medium city case studies should ideally
be located in Little Rock and in Tacoma. Should study results be available for
"outlier" cities such as Nashville, Hartford, or San Antonio, the factor
scores for these cities should be examined individually. They may indicate that
the SMSA (San Antonio, for example) is like no other city; in this case the
results cannot be generalized. For other outliers, similarities may be discovered
at least along some of the axes. For example, Nashville resembles groups 5 and 6
(of the large city groupings) but does not fall into them because of an extreme
value on Factor 1. Thus, its results have at least limited relevance to Toledo,
Norfolk, and other cities in these groups.
4.2 Potential Applications
The data base and the methodology developed during this research project
have three main uses for other research purposes:
the direct use of the data base, case study/demonstration project site
selection, and the extrapolation of results of existing case studies to other sites.
26
-------
The direct use of the data base does not merit extensive discussion.
Clearly, researchers requiring the information available in our data base
should not duplicate our data collection efforts, particularly since some of
the information has been obtained from unpublished secondary sources. A
complete list of the variables included in the data base is given in
Appendix A to this volume; a comprehensive description of the data sources,
their strengths and weaknesses, as well as a listing of all data may be found
in Volume
Case study or demonstration project sites may be selected for a variety
of research purposes through the use of the data base and methodology
developed in this project. The major constraint to developing an appropriate
classification scheme is that the research hypotheses to be tested must be
capable of being framed in terms of the variables included in the data base.
Indeed, some subset of the data base may be adequate for that purpose.
For example, if a program analyst were interested in studying the effects
of an antipoverty program, environmental quality and urban form descriptors
would be of peripheral interest for the research. Alternatively,
if air quality maintenance programs represented the focal point of the inquiry,
then water quality descriptors and some of the income variables may not be
relevant. The first step in developing appropriate city groupings is the
specification of the relevant data set to be used for factor analysis and
clustering. Secondly, the universe of SMSAs may be limited to fit the
requirements of the research project. The program being tested may apply
to a limited geographical area such as the South, or may be relevant for cities
in a certain size category only. It is possible to identify cities with certain
attributes to be excluded from or included in the analysis. Once the data set
and the universe of SMSAs are delimited, factor analysis is performed. If the
data set is composed of a limited number of variables, and if the research
hypotheses are relatively simple and well-defined, a one stage factor analysis
may be adequate. A more complicated research design may call for a two stage
factor analysis. Clustering is then performed on the basis of the first and
second stage factors obtained, and representative cities are selected for each
of the city groups. The case studies or demonstration projects should be
sited at these representative cities to best ensure that their results can
be generalized to the other SMSAs in the group.
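The last step, picking a representative city for a group, can be sketched as follows. The selection rule used here (the city whose factor scores lie closest to the group centroid) is an assumption made for illustration; this volume does not spell out the rule, and the city names and scores are hypothetical.

```python
from math import dist


def modal_city(factor_scores):
    """Return the city whose factor scores are nearest the group centroid.

    factor_scores -- dict mapping city name to a tuple of factor scores
    ASSUMPTION: "representative" means closest to the centroid (Euclidean distance).
    """
    names = list(factor_scores)
    n_axes = len(factor_scores[names[0]])
    centroid = [
        sum(factor_scores[name][k] for name in names) / len(names)
        for k in range(n_axes)
    ]
    return min(names, key=lambda name: dist(factor_scores[name], centroid))


# Hypothetical group of three cities scored on two factors.
group = {"City A": (0.0, 0.0), "City B": (1.0, 1.0), "City C": (4.0, 1.0)}
representative = modal_city(group)
print(representative)
```

In practice other considerations, such as data availability, may override the city chosen this way.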
In some cases, it may not be possible to perform a case study at an
ideal location in the appropriate representative city because of costs, data
limitations, or simply due to lack of cooperation. Alternative sites may then
be chosen. Further, the site selection process may be random, based on less
rational criteria than the ones described here. It is appropriate to ask
whether the results of such studies may be extrapolated to other sites.
The data base and methodology developed here may be used for this purpose
as well. The data base and the universe of SMSAs must be delimited first, and
second, the factor analysis and clustering are performed as described above.
The sites of the case studies/demonstration projects are identified relative
to the city groups, with the results being capable of generalization to the
other cities in the groups. If the sites do not fall into any of the groups,
then the study results may not easily be generalized to other cities. A large
number of case studies have been performed in outlier cities, such as
New York, not because of a random site selection process but because the
problem areas are frequently outliers. It is possible to analyze the data for
such SMSAs and to determine the axes or variables with extreme observations,
which are in fact responsible for the outlying position of the SMSA. If
these variables are not crucial to the analysis, the SMSA may be grouped
with other cities in terms of the remaining variables. Study results may be
generalized to this group, although at a lower level of confidence.
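The outlier procedure described above can be sketched in two steps: flag the axes on which the SMSA is extreme, then find the nearest group using only the remaining axes. The cutoff of 2.0 on standardized factor scores, the function names, and all numbers are assumptions for illustration, not values from the study.

```python
def extreme_axes(scores, threshold=2.0):
    """Indices of factors whose absolute score exceeds the threshold.

    ASSUMPTION: scores are standardized, and 2.0 is an arbitrary cutoff.
    """
    return [k for k, score in enumerate(scores) if abs(score) > threshold]


def nearest_group(scores, group_centroids, ignore_axes):
    """Name of the group whose centroid is closest on the axes not ignored."""
    keep = [k for k in range(len(scores)) if k not in ignore_axes]

    def squared_distance(centroid):
        return sum((scores[k] - centroid[k]) ** 2 for k in keep)

    return min(group_centroids, key=lambda g: squared_distance(group_centroids[g]))


# Hypothetical outlier SMSA with an extreme value on the first factor only.
outlier = (3.5, 0.2, -0.4)
centroids = {"Group 5": (0.1, 0.3, -0.5), "Group 6": (0.0, -1.0, 1.0)}

axes = extreme_axes(outlier)
group = nearest_group(outlier, centroids, axes)
print(axes, group)
```

Whether the ignored axes are truly not crucial is a judgment about the research question, so a grouping obtained this way carries the lower confidence noted in the text.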
28
-------
APPENDIX A
LISTING OF DATA VARIABLES
ASSIGNED TO
SCS CODE STATISTICAL UNITS
1 BOD Biochemical Oxygen Demand (5-day, 20° C)
1 FCOL Fecal Coliforms, measured by membrane filter method
1 N *Total Nitrogen (mg/l)
1 C *Total Organic Carbon (mg/l)
1 TDSS *Dissolved Solids (mg/l)
1 TSS *Suspended Solids (mg/l)
1 TURB *Turbidity (Jackson Candle Units)
1 AALK *Alkalinity (mg/l as CaCO3)
1 PHS *Acidity (standard units)
1 OAG *Oil and Grease, soxhlet extraction (mg/l)
1 MBAS *Methylene Blue Active Substance (mg/l)
1 SUS Suspended Particulates (micro-g/cu.m.)
1 SO2 Sulfur Dioxide (micro-g/cu.m.)
1 PDI PDI Index
1 CL +Chloride (mg/l)
1 FL +Fluoride (mg/l)
1 FE +Iron (mg/l)
1 MG +Manganese (mg/l)
1 NO3 +Nitrate (mg/l)
1 SO3 +Sulfate (mg/l)
1 ALK +Alkalinity (mg/l as CaCO3)
1 HD +Hardness (mg/l as CaCO3)
1 PH +Acidity (pH standard units)
*Ambient Water Quality +Drinking Water Quality
29
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
1 TDS +Total Dissolved Solids (mg/l)
2 SQMS Square Miles Per Person
2 SQMC Square Miles Per Person - Central City
2 SQMU Square Miles Per Person - Urban Places
2 CENP Portion of Total Population in Central City
2 CENM Portion of Manufacturing in Central City
(by value added) (percent)
2 CENS Portion of Retail Sales in Central City
(percent)
2 CENE Portion of Employment in Central City
(percent)
2 CENME Portion of Manufacturing Employment in
the Central City (percent)
2 ARC Arc of SMSA around the Center (quadrants)
2 RAD Number of Major Radial Roadways
2 CIR Number of Major Circumferential Roadways
2 PCR Number of Population Centers
2 LU Portion of Land Which is Urbanized (percent)
2 EEC Portion of Land Devoted to Outdoor Recreation
(percent of total land area)
2 LA Land in Farms (percent)
2 TCOM Portion of Workers Working Outside SMSA
(percent)
2 LAT Latitude of SMSA (degrees)
2 LONG Longitude of SMSA (degrees)
2 ALT Altitude (feet)
2 PREC Mean Annual Precipitation (inches)
30
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
2 SUN Mean Annual Possible Sunshine (percent)
2 WIND Mean Annual Wind Velocity (miles per hour)
2 HWPC Total Miles of Roadway per Capita
2 HWPR Portion of Principal Arterials (percent)
2 INV Inversions (mean annual frequency)
2 WTMP Water Temperature (ambient) (°C)
2 DO Dissolved Oxygen in Water (ambient) (mg/l)
2 HARD Hardness of Water (ambient) (mg/l as CaCO3)
2 WTR Large Water Bodies (number)
3 RPAR Particulates (tons per year per capita)
3 RSO Sulfur Oxides (tons per year per capita)
3 RNOX Nitrogen Oxides (tons per year per capita)
3 RHC Hydrocarbons (tons per year per capita)
3 RCO Carbon Monoxide (tons per year per capita)
3 RBOD Biochemical Oxygen Demand (tons per year per
capita)
3 RSS Suspended Solids (tons per year per capita)
3 RDS Dissolved Solids (tons per year per capita)
3 RNUT Nutrients (tons per year per capita)
3 RWW Wastewater (million gallons per year per
capita)
3 RNSW Noncombustible Solid Waste (tons per year
per capita)
3 RIS Industrial Sludges (tons per year per capita)
31
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
4 AGE Median Age (years)
4 PP Portion of Population which is of Foreign
Stock (percent)
4 PR Fertility Rate (children ever born, per
thousand women ever married)
4 NOH Married Couples Without Own Household
(percent)
4 PU Portion of Population Which is Urban
(percent)
4 NMOV Individuals Residing in the Same Dwelling
for Five Years (percent)
4 PPH Average Household Size (persons per household)
4 PBL Portion of Population Which is Black (percent)
4 MIGP Movers into the SMSA (percent of individuals
over five years of age)
4 FTM Female/Male in 20-64 Age Group
4 PCC Population Change - Central City (percent)
4 PCS Population Change - SMSA (percent)
4 YMC Change in Median Income (percent)
4 H1U Single Family Dwellings (percent)
4 HOCC Owner-Occupied Housing (percent)
4 H5U Units in Structure - More than Five Units
(percent)
4 H50U Units in Structure - More than Fifty Units
(percent)
4 GINI Income - Gini Coefficient
4 YM Median Family Income (dollars)
4 YP Per Capita Income (dollars)
32
-------
ASSIGNED TO
CODE STATISTICAL UNITS
GX Local Government — General Expenditures
(dollars per capita)
GSEV Local Government — General Revenues
(dollars per capita)
GEMP Local Government — Employment (full time
equivalent per capita)
EM Total Employment in Manufacturing (percent)
VAM Value Added by All Manufacturing (dollars
per capita)
IFLT Total Value of Production in Meat Animals
and other Livestock (dollars per capita)
IOIL Total Value of Production in Crude Petro-
leum, Natural Gas (dollars per capita)
IMET Total Value of Production in Meat
Products (dollars per capita)
ICTH Total Value of Production in Broad and
Narrow Fabrics (dollars per capita)
145 Total Value of Production in Household
Furniture (dollars per capita)
IPLP Total Value of Production in Pulp Mills
(dollars per capita)
IPPR Total Value of Production in Paper and
Paperboard Mills
ICHM Total Value of Production in Industrial
Chemicals (dollars per capita)
IPRT Total Value of Production in Commercial
Printing (dollars per capita)
IFRT Total Value of Production in Fertilizers
(dollars per capita)
IMCM Total Value of Production in Miscellaneous
Chemical Products (dollars per capita)
33
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
6 IPLA Total Value of Production in Plastic
Materials and Resin (dollars per capita)
6 IPNT Total Value of Production in Paints
(dollars per capita)
6 IFUL Total Value of Production in Petroleum
Refining (dollars per capita)
6 IASP Total Value of Production in Paving and
Asphalt (dollars per capita)
6 IGLS Total Value of Production in Glass
(dollars per capita)
6 ICLY Total Value of Production in Structural
Clay Products (dollars per capita)
6 ICMT Total Value of Production in Cement, Concrete,
Gypsum (dollars per capita)
6 ISTL Total Value of Production in Steel
(dollars per capita)
6 IALM Total Value of Production in Aluminum
(dollars per capita)
6 IAPL Total Value of Production in Household
Appliances (dollars per capita)
6 ICAR Total Value of Production in Motor Vehicles
(dollars per capita)
6 IELC Total Value of Production in Electric
Utilities (dollars per capita)
6 ICOL Total Value of Production in Coal Mining
(dollars per capita)
6 IVEG Total Value of Production in Canned and Frozen
Foods (dollars per capita)
6 IFBR Total Value of Production in Cellulous Fibers
(dollars per capita)
34
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
6 ITAN Total Value of Production in Leather
and Industrial Leather Products (dollars
per capita)
6 IABS Total Value of Production in Other Stone
and Clay Products (dollars per capita)
6 ICU Total Value of Production in Copper
(dollars per capita)
6 IPB Total Value of Production in Lead (dollars
per capita)
6 IZN Total Value of Production in Zinc (dollars
per capita)
6 IMTL Total Value of Production in Other
Fabricated Metal Products (dollars per
capita)
6 IWST Total Value of Production in Wholesale
Trade (dollars per capita)
6 RETT Total Retail Sales ($000 per capita)
6 WHOL Total Wholesale Sales ($000 per capita)
6 S20 Total Value Added in Food and Kindred
Products (SIC 20) ($ millions per capita)
6 S22 Total Value Added in Textile Mill Products
(SIC 22) ($ millions per capita)
6 S23 Total Value Added in Apparel and Other Textile
Mill Products (SIC 23) ($ millions per capita)
6 SS Selected Services ($000 per capita)
6 S24 Total Value Added in Lumber and Wood
Products (SIC 24) ($ millions per capita)
6 S25 Total Value Added in Furniture and Fixtures
(SIC 25) ($ millions per capita)
6 S26 Total Value Added in Paper and Allied
Products (SIC 26) ($ millions per capita)
35
-------
ASSIGNED TO
SCS CODE STATISTICAL UNITS
6 S27 Total Value Added in Printing and
Publishing (SIC 27) ($ millions per capita)
6 S28 Total Value Added in Chemicals and Allied
Products (SIC 28) ($ millions per capita)
6 S29 Total Value Added in Petroleum and Coal
Products (SIC 29) ($ millions per capita)
6 S30 Total Value Added in Rubber and Plastic
Products (SIC 30) ($ millions per capita)
6 S31 Total Value Added in Leather and Leather
Products (SIC 31) ($ millions per capita)
6 S32 Total Value Added in Stone, Clay and
Glass Products (SIC 32)($ millions per capita)
6 S33 Total Value Added in Primary Metal Industries
(SIC 33) ($ millions per capita)
6 S34 Total Value Added in Fabricated Metal Products
(SIC 34) ($ millions per capita)
6 S35 Total Value Added in Machinery, Except
Electrical (SIC 35) ($ millions per capita)
6 S36 Total Value Added in Electrical Equipment
and Supplies (SIC 36) ($ millions per capita)
6 S37 Total Value Added in Transportation Equipment
(SIC 37) ($ millions per capita)
6 S38 Total Value Added in Instruments and Related
Products (SIC 38) ($ millions per capita)
6 S39 Total Value Added in Miscellaneous Manufacturing
Industries (SIC 39) ($ millions per capita)
36
-------
CODE STATISTICAL UNITS
control data* P Population (in thousands)
control data* HTOT Total Housing Units (in thousands)
control data* TALL Total Commuters (hundreds)
control data* LS Total Land Area — SMSA (square miles)
control data* LUBB Total Land Area — Urbanized Portion
(square miles)
*Control data were used for computing normalized variables.
37
-------
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.
EPA-600/5-77-008a
3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
Classification of American Cities For Case Study
Analysis-Volume I Summary Report
5. REPORT DATE
May 1977 (issuing date)
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
Elizabeth Lake, Carol Blair, James Hudson,
Richard Tabors
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
Urban Systems Research & Engineering Inc.
1218 Massachusetts Avenue
Cambridge, Massachusetts 02138
10. PROGRAM ELEMENT NO.
1HA091
11. CONTRACT/GRANT NO.
68-01-3299
12. SPONSORING AGENCY NAME AND ADDRESS
Office of Monitoring & Technical Support - Wash., DC
Office of Research and Development
U.S. Environmental Protection Agency
Washington, D.C. 20460
13. TYPE OF REPORT AND PERIOD COVERED
Final
14. SPONSORING AGENCY CODE
EPA/600/19
15. SUPPLEMENTARY NOTES
Volumes II - Detailed Report and III - Documentation of Study are available from
National
16. ABSTRACT
Attempts to analyze and evaluate the impacts of federal programs have led
to the extensive use of case studies of program impacts at selected sites.
This project has developed a methodology for the systematic selection of
representative case study sites and for generalizing the study results. The
methodology, involving two stage factor analysis and clustering, is applied
to a specific program/policy problem, the selection of metropolitan areas
for case studies in analyzing the impact of federal policies on general
environmental quality.
The methodology begins with a data base on standard metropolitan statistical
areas, SMSAs, including variables related to environmental quality, urban form,
and household, industrial, and government activity. It analyzes these
variables through a two-stage factor analysis technique which allows heuristic
consideration of the significant characteristics. Finally, it develops city
clusters which group areas with similar attributes. Modal (or representative)
cities are selected for each group and suggested as case study sites. These
groups may be used to generalize the study results and to analyze the
transferability of results between areas. The methodology is sufficiently flexible
to consider a wide range of research hypotheses.
17.
KEY WORDS AND DOCUMENT ANALYSIS
DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
COSATI Field/Group
Economic Factors
Economic Surveys
Economic Geography
Census
Central City
Demographic Surveys
Populations
Socioeconomic Status
Urban Areas
Urban Sociology
Urban Geography
Factor Analysis
Modal Cities
City Classification
08F
05C
05K
13B
18. DISTRIBUTION STATEMENT
Unlimited
19. SECURITY CLASS (This Report)
Unlimited
21. NO. OF PAGES
48
20. SECURITY CLASS (This page)
Unlimited
22. PRICE
EPA Form 2220-1 (9-73)
38
U.S. GOVERNMENT PRINTING OFFICE: 1977-757-056/6426 Region No. 5-11
-------