                Muscle Shoals AL 35660
        United States
        Environmental Protection
        Agency
             Effri ts Rcsear
EPA GOO 1 8
March 1984
        Research and Development
Consolidation of
Baseline Information,
Development of
Methodology, and
Investigation of Thermal
Impacts on Freshwater
Shellfish, Insects, and
Other Biota

Interagency
Energy/Environment
R&D  Program Report

-------
         CONSOLIDATION OF  BASELINE  INFORMATION,
             DEVELOPMENT OF METHODOLOGY, AND
            INVESTIGATION  OF  THERMAL  IMPACTS
            ON FRESHWATER  SHELLFISH,  INSECTS,
                    AND OTHER BIOTA
                          by
       John S.  Grossman and James R. Wright, Jr.
              Office of Natural Resources
              Tennessee Valley Authority
              Knoxville, Tennessee  37902

                   Roger L. Kaesler
                 Department of Geology
                 University of Kansas
                Lawrence, Kansas  66045
         Interagency Agreement EPA-IAG-DS-E721
                Project No. E-AP-80-BDR
             Program Element No. INE-625A
                    Project Officer

                     Alfred Galli
Office of Environmental Processes and Effects Research
         U.S. Environmental Protection Agency
                 Washington, DC  20460
                     Prepared for

Office of Environmental Processes and Effects Research
          Office of Research and Development
         U.S.  Environmental Protection Agency
                 Washington, DC  20460
                           U.S. Environmental Protection Agency
                           Region 5, Library (5PL-16)
                           230 S. Dearborn Street, Room 1670
                           Chicago, IL  60604

-------
                                  DISCLAIMER
     This report was prepared by the Tennessee Valley Authority and has been
reviewed by the Office of Research and Development,  Energy and Air Division,
U.S. Environmental Protection Agency, and approved for publication.  Although
the research described in this document has been funded wholly or in part by
the United States Environmental Protection Agency through Interagency
Agreement No. EPA-IAG-82-D-X0511 with TVA, it has not been subject to Agency
policy and peer review and therefore does not necessarily reflect the views of
the agency or the Tennessee Valley Authority and no  official endorsement
should be inferred.

-------
                                  ABSTRACT
     A computerized information system was developed for storing, retrieving,
and analyzing data collected during limnological surveys.  The system
accommodates 19 variables and uses the Statistical Analysis System as the
basic data-management system.  To facilitate storage of information, a series
of hierarchical codes was developed.  These codes not only reduced storage
requirements but also helped reduce computing costs.

     When the information system had been developed, three analytical
procedures were tested, used in various forms, and evaluated as tools for
analyzing benthic macroinvertebrate data sets.  The first two of these were
cluster analysis and ordination using nonmetric multidimensional scaling
(MDS).   Tests of these two methods included, first, development of a rationale
for selecting methods of transforming the data and choosing coefficients of
similarity to express relationships between samples and, second, application
of the two methods to data from the Clinch River, Virginia, and the Cumberland
River,  Tennessee.  The resulting dendrograms from cluster analysis and
ordinations from multidimensional scaling were evaluated by preparing
analytical tables and dealing with small subsets of the total data set.
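
     The two presence-absence similarity coefficients named in this report,
Jaccard's coefficient and the simple matching coefficient, can be illustrated
with a minimal Python sketch.  The function names, the example vectors, and
the assignment of b and c to the two kinds of mismatch are assumptions made
for illustration only; the sketch does not reproduce the report's own
computations.

```python
def contingency_counts(x, y):
    """Return (a, b, c, d) for two presence-absence vectors of equal length.

    Assumed convention (hypothetical, for illustration):
      a = taxa present in both samples
      b = taxa present in the first sample only
      c = taxa present in the second sample only
      d = taxa absent from both samples
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d


def jaccard(x, y):
    """Jaccard's coefficient: a / (a + b + c); joint absences are ignored."""
    a, b, c, _ = contingency_counts(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0


def simple_matching(x, y):
    """Simple matching coefficient: (a + d) / (a + b + c + d)."""
    a, b, c, d = contingency_counts(x, y)
    total = a + b + c + d
    return (a + d) / total if total else 0.0


# Hypothetical samples: 1 = taxon present, 0 = taxon absent.
sample_1 = [1, 1, 0, 1, 0]
sample_2 = [1, 0, 0, 1, 1]
print(jaccard(sample_1, sample_2))          # a=2, b=1, c=1
print(simple_matching(sample_1, sample_2))  # a=2, d=1, five taxa in all
```

     Because Jaccard's coefficient leaves joint absences out of the
denominator, the two coefficients can rank the same pair of samples
differently when many taxa are absent from both.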

     Measurements of species diversity from information theory were the third
analytical technique considered, again using benthic macroinvertebrate data.
First,  the relative importance of diversity at the species level was compared
to components of diversity contributed by other categories in the taxonomic
hierarchy.  Results indicated that identifications to species contributed
little information about the structure of the communities that discrimination
of genera had not already contributed.  Second, the heuristic properties of
species diversity were used to evaluate two classifications stressing
functional morphology and trophic-functional relationships of benthic
invertebrates, independent of the taxonomic hierarchy.  Both methods produced
results similar to ones obtained by cluster analysis, suggesting that they
merit further investigation.
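
     Brillouin's index H, listed among this report's abbreviations, can be
sketched in a few lines of Python.  The sketch assumes the standard form
H = (1/N) ln(N! / (n_1! n_2! ... n_s!)), where n_i is the count of the ith
taxon and N is the total count.  The use of natural logarithms and the
function name are assumptions; this section of the report does not state the
logarithm base that was used.

```python
import math


def brillouin_h(counts):
    """Brillouin's diversity index for a list of per-taxon counts.

    Uses log-gamma so that factorials of large counts do not overflow:
    ln(n!) = lgamma(n + 1).  Natural logarithms are an assumption.
    """
    counts = [n for n in counts if n > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    log_total_factorial = math.lgamma(total + 1)
    log_count_factorials = sum(math.lgamma(n + 1) for n in counts)
    return (log_total_factorial - log_count_factorials) / total


# Hypothetical counts for illustration only.
print(brillouin_h([10]))          # one taxon: no diversity
print(brillouin_h([5, 5, 5, 5]))  # four equally abundant taxa
```

     The same function can be applied to counts pooled at the genus, family,
or higher level, which is one way to compare the diversity contributed at
different levels of a hierarchy.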

-------
                               CONTENTS
Abstract	    iii
List of Figures	    vii
List of Tables	    xii
List of Abbreviations and Symbols	    xvi
Disclaimer	     ii
Acknowledgments  	   xvii

1.    Introduction  	      1

2.    Conclusions and Recommendations 	      3
      2.1  Introduction  	      3
      2.2  Generic	      3
      2.3  Cluster Analysis  	      3
      2.4  Ordination	      4
      2.5  Indices of Diversity  	      5
      2.6  Final Statement 	      5

3.    Development of the Information System 	      6
      3.1  General Description of the Information System 	      6
      3.2  Preparation of Data	      6
      3.3  Editing of Data	     12

4.    Methods	     31
      4.1  Description of Analytical Techniques  	     31
           4.1.1  Cluster Analysis 	     31
           4.1.2  Ordination by Nonmetric Multidimensional Scaling .  .     34
           4.1.3  Indices of Species Diversity 	     41
      4.2  Description of Data Sets	     42
           4.2.1  Clinch River Data Set	     42
           4.2.2  Cumberland River Data Sets —1973 and 1975	     44

5.    Cluster Analysis  	     54
      5.1  General Description 	     54
      5.2  Analytical Procedures 	     54
           5.2.1  Selection of Similarity Coefficients 	     54
           5.2.2  Reducing Size of Data Matrices	     57
           5.2.3  Evaluation of Distortion	     60
      5.3  Results	     60
           5.3.1  Q-Mode Analysis	     60
                  5.3.1.1  Clinch River Data Set	     60
                           5.3.1.1.1  Presence-absence data  	     60
                           5.3.1.1.2  Quantitative data, counts
                                      of species	     62

-------
                  5.3.1.2  Cumberland River Data Set--1973 	    62
                           5.3.1.2.1  Substrate  	    62
                           5.3.1.2.2  Zoomacrobenthos  	    62
                                      5.3.1.2.2.1  Presence-absence
                                                   data	    65
                                      5.3.1.2.2.2  Quantitative data,
                                                   counts of species .    65
                           5.3.1.2.3  Summary  	    65
                  5.3.1.3  Cumberland River Data Set--1975 	    69
                           5.3.1.3.1  Substratum 	    69
                           5.3.1.3.2  Zoomacrobenthos  	    69
                                      5.3.1.3.2.1  Presence-absence
                                                   data	    69
                                      5.3.1.3.2.2  Quantitative data,
                                                   species counts  . .    70
                           5.3.1.3.4  Summary  	    70
           5.3.2  R-Mode Analysis	    73
                  5.3.2.1  Introduction  	    73
                  5.3.2.2  Clinch River Data Set	    76
                           5.3.2.2.1  Presence-absence data  	    76
                           5.3.2.2.2  Quantitative data,  species
                                      counts	    85
                           5.3.2.2.3  Summary  	    85
                  5.3.2.3  Cumberland River Data Set--1973 	    85
                           5.3.2.3.1  Presence-absence data  	    85
                           5.3.2.3.2  Quantitative data,  species
                                      counts	    87

6.   Ordination—Nonmetric Multidimensional Scaling  	   124
      6.1  General Description 	   124
      6.2  Analytical Procedures 	   124
      6.3  Results	   125
           6.3.1  Q-Mode Analysis—Clinch River Data	   125
                  6.3.1.1  Presence-Absence Data 	   125
                  6.3.1.2  Quantitative Data (Species Counts)  ....   125

7.   Diversity Indices 	   129
      7.1  General Description 	   129
      7.2  Analytical Procedures 	   130
      7.3  Diversity of Samples from the Clinch River	   134
           7.3.1  Species Diversity  	   134
           7.3.2  Hierarchical Diversity 	   136

8.  Summary and Discussion	   148
      8.1  Introduction	   148
      8.2  Nature of the Ecosystems from which Data Bases were
           Selected	   148

-------
      8.3  Methods	   149
           8.3.1  Relationships Between Methods  	   149
           8.3.2  Data	   150
                  8.3.2.1  Presence-Absence Data 	   150
                  8.3.2.2  Quantitative Data 	   151
           8.3.3  Cluster Analysis 	   151
           8.3.4  Ordination	   153
           8.3.5  Species Diversity and Hierarchical Diversity ....   153

References	   154

-------
                               LIST OF FIGURES

Number

  1       Flow diagram illustrating the computerized infor-
          mation system (CIS) used for TVA's nonfisheries
          biological data 	

  2       Coding form used for keypunching field and laboratory
          limnological data 	

  3       One-dimensional Q-mode ordination of hypothetical
          samples in Table 9 computed with nonmetric multidi-
          mensional scaling, nine iterations (stress = 0.230) ....     36

  4       Two-dimensional Q-mode ordination of hypothetical
          samples in Table 9 computed with nonmetric multi-
          dimensional scaling, one iteration (stress = 0.097) ....     37

  5       Two-dimensional Q-mode ordination of hypothetical
          samples in Table 9 computed with nonmetric multidimen-
          sional scaling, 20 iterations (stress = 0.051)  	     37

  6       Three-dimensional Q-mode ordination of hypothetical
          samples in Table 9 computed with nonmetric multidi-
          mensional scaling, 43 iterations (stress = 0.001) 	     39

  7       Dendrogram computed from Q-mode cluster analysis of a
          matrix of distance coefficients (Table 10) showing
          faunal similarities between hypothetical samples in
          Table 9	     40

  8       Map of the Clinch River in Virginia and Tennessee
          showing the locations of stations sampled during the
          1970 zoomacrobenthic survey 	     43

  9       Stream discharge of the Clinch River at the United
          States Geological Survey gauging station at Cleveland,
          Virginia,  June-August 1970  	     45

 10       Location of stations in the vicinity of a power plant
          on the Cumberland River	     46

 11       Dendrogram computed from cluster analysis of a matrix  of
          coefficients of cophenetic correlation showing similarity
          between the various correlation and similarity matrices
          in Tables 17 and 20 (r_cc = 0.914)	     56

-------

 12       Dendrogram computed from Q-mode cluster analysis of a
          matrix of Jaccard's coefficients showing faunal simi-
          larities between samples collected from the Clinch River
          in 1970; data include total insect fauna  	    58

 13       Dendrogram computed from Q-mode cluster analysis of a
          matrix of Jaccard's coefficients showing faunal simi-
          larities between 36 zoomacrobenthic samples collected
          from stations 4, 7, 8, 9, 10,  and 11 in the Clinch
          River, 1970 (r_cc = 0.93)  	    61

 14       Dendrogram computed from Q-mode cluster analysis of a
          matrix of correlation coefficients computed from propor-
          tions of each phi size in the  substrate after arcsine
          transformation; shows similarity of substrate between
          samples collected from the Cumberland River in 1973
          (r_cc = 0.760)	    63

 15       Dendrogram computed from Q-mode cluster analysis of a
          matrix of distance coefficients computed from propor-
          tions of each phi size in the  substrate after arcsine
          transformation; shows similarity of substrate between
          samples collected from the Cumberland River in 1973 ....    64

 16       Dendrogram computed from Q-mode cluster analysis of a
          matrix of Jaccard's coefficients showing faunal simi-
          larities between samples collected from the Cumberland
          River in 1973 (r_cc = 0.852)	    66

 17       Dendrogram computed from Q-mode cluster analysis of a
          matrix of distance coefficients computed from data
          transformed with the square-root transformation; shows
          faunal similarities between samples collected from the
          Cumberland River in 1973 (r_cc = 0.833)	     67

 18       Dendrogram computed from Q-mode cluster analysis of a
          matrix of correlation coefficients computed from data
          transformed with the square-root transformation; shows
          faunal similarities between samples collected from the
          Cumberland River in 1973 (r_cc = 0.979)	     68

 19       Dendrogram computed from Q-mode cluster analysis of a
          matrix of distance coefficients computed from data
          that had been transformed by the square-root transfor-
          mation and standardized by rows; shows faunal similar-
          ities between samples collected from the Cumberland
          River in 1975	     71

-------

 20       Dendrogram computed from Q-mode cluster analysis of a
          matrix of distance coefficients computed from data that
          had been transformed by the square-root transformation
          and standardized by rows; shows faunal similarities
          between samples collected from the Cumberland River
          in 1975	      72

 21       A truncated log-normal distribution fitted to a distri-
          bution of species in an aquatic ecosystem not adversely
          affected by environmental stress 	      74

 22       Response of organism to severe organic enrichment:
          changes in types of organisms present, population
          densities, and biological diversity  	      75

 23       Dendrogram computed from R-mode cluster analysis of a
          matrix of Jaccard's coefficients,  showing distributional
          similarities of taxa collected from stations on the
          Clinch River unaffected by low-pH stress that resulted
          from the 1970 spill of acid (r_cc = 0.97)	      77

 24       Dendrogram computed from R-mode cluster analysis of a
          matrix of simple matching coefficients, showing distri-
          butional similarities of taxa collected from stations on
          the Clinch River unaffected by low-pH stress that resulted
          from the 1970 spill of acid (r_cc = 0.91)	        78

 25       Dendrogram computed from R-mode cluster analysis of a
          matrix of distance coefficients computed from data
          that had been transformed by the square-root transfor-
          mation;  shows distributional similarities of taxa
          collected from stations on the Clinch River  unaffected
          by low-pH stress that resulted from the 1970 spill of
          acid (r_cc = 0.92)	     79

 26       Dendrogram computed from R-mode cluster analysis of a
          matrix of Jaccard's coefficients,  showing distributional
          similarities  of taxa collected from stations on the Clinch
          River affected by  low-pH stress that resulted from the
          1970 spill of acid (r_cc = 0.97)	     80

 27       Dendrogram computed from R-mode cluster analysis of a
          matrix of simple matching coefficients,  showing distri-
          butional similarities  of taxa  collected from stations
          on the Clinch River affected by low-pH stress that
          resulted from the 1970 spill of acid (r_cc = 0.84)  	     81

-------

 28       Dendrogram computed from R-mode cluster analysis of a
          matrix of distance coefficients computed from data
          that had been transformed by the square-root transfor-
          mation; shows distributional similarities of taxa
          collected from stations on the Clinch River affected
          by low-pH stress that resulted from the 1970 spill
          of acid (r_cc = 0.97)	     82

 29       Dendrogram computed from R-mode cluster analysis of a
          matrix of Jaccard's coefficients, showing distributional
          similarities of taxa collected from stations on the
          Clinch River both affected and unaffected by the low-pH
          stress that resulted from the 1970 spill of acid
          (r_cc = 0.95).  Only those taxa are included that comprise
          10 percent or more of the total sample	     83

 30       Dendrogram computed from R-mode cluster analysis of a
          matrix of distance coefficients computed from data that
          had been transformed by the square-root transformation;
          shows distributional similarities of taxa collected
          from stations on the Clinch River both affected and
          unaffected by the low-pH stress that resulted from the
          1970 spill of acid (r_cc = 0.97).  Only those taxa that
          comprise 10 percent or more of the total sample are
          included	     84

 31       Dendrogram computed from R-mode cluster analysis of a
          matrix of Jaccard's coefficients, showing distributional
          similarities of taxa collected from the Cumberland
          River in 1973	     86

 32       Dendrogram computed from R-mode cluster analysis of a
          matrix of simple matching coefficients, showing distri-
          butional similarities of taxa collected from the
          Cumberland River in 1973	     88

 33       Dendrogram computed from R-mode cluster analysis of a
          matrix of distance coefficients with data transformed
          with the square-root transformation and standardized
          by rows, showing distributional similarities of taxa
          collected from the Cumberland River in 1973	     89

 34       Dendrogram computed from R-mode cluster analysis of a
          matrix of distance coefficients with data transformed
          by the square-root transformation, showing distributional
          similarities of taxa collected from the Cumberland
          River in 1973	     90

-------

 35       Dendrogram computed from R-mode cluster analysis of a matrix
          of correlation coefficients with data transformed by
          the square-root transformation and standardized by rows,
          showing distributional similarities of taxa collected
          from the Cumberland River in 1973	     91

 36       Three-dimensional ordination by nonmetric multidimen-
          sional scaling computed from distance coefficients
          based on presence-absence data, showing faunal simi-
          larities between samples collected from the Clinch
          River in 1970	    126

 37       Three-dimensional ordination by nonmetric multidimen-
          sional scaling computed from distance coefficients
          based on counts of species, showing faunal similarities
          between samples collected from the Clinch River in 1970  .    127

 38       Species diversity (Brillouin's H) of 36 samples from the
          Clinch River, 1970; contour interval 0.4 	    135

 39       Species diversity (approximate index H") of 36 zoomacro-
          benthic samples from the Clinch River,  1970  	    137

-------
                               LIST OF TABLES

Number                                                                Page
  1       Numerical codes for rivers and streams in the Tennessee
          Valley, and sources of organisms for bioassays 	    14

  2.0     Codes for identifying methods of limnological sampling
          and types of gear	    15

  2.1     Codes for types of sampling equipment and sampling material
          used in artificial-substrate sampling  	    16

  2.2     Codes for types of sampling equipment and sampling material
          used in natural substrate removal and organism removal .  .    17

  2.3     Codes for types of sampling equipment used with emergence
          traps and volume samplers	    18

  3       Codes for identifying location of limnological samples
          on a river or reservoir transect	   19

  4       Alphabetic codes identifying site (or project) at which
          sample was collected	    20

  5       Codes for identifying type of sample (community,  parameter,
          test)	    21

  6       Codes used to report data units	    22

  7       Codes for identifying basic type of habitat from which
          sample was collected or hardness at which bioassay was
          performed	    29

  8       Codes for recording instar or size of organisms
          collected or used in bioassays	    30

  9       Hypothetical data showing proportions of 10 species at
          12 stations	    47

 10       Matrix of distance coefficients computed from the data
          in Table 9 after arcsine transformation  	    48

 11       Generalized trophic, functional classification of zoo-
          macrobenthic invertebrates 	    49

 12       Hierarchical classification of the trophic-functional
          role of organisms; includes only those categories that
          occurred in samples	    50


-------

13       Hierarchical classification assigned for zoomacrobenthic
         invertebrates based on functional morphology:  head
         position, body shape, and respiratory organs  	    51

14       Range and mean of physicochemical data for the Clinch
         River, June to September 1970	    52

15       Descriptions of habitats at stations 4, 7, 8, 9, 10,
         and 11 on the Clinch River, 1970	    53

16       Coefficients of correlation, distance, and similarity:
         abbreviations, equations, and upper and lower limits  ...    92

17       Contingency table (2 X 2) defining the terms a, b, c,
         and d, as used in the equations in Table 13	    94

18       Effect of the transformation log (X_ij + 1), where X_ij
         is the abundance of the ith species in the jth sample	    94

19       Effect of the transformation √(X_ij + 0.5), where X_ij
         is the abundance of the ith species in the jth sample	    94

20       Labels of correlation and distance matrices with
         various transformations  	    95

21       Matrix of coefficients of cophenetic correlation
         computed between corresponding elements of 21 correlation
         and similarity matrices  	    96

22       Coefficients of cophenetic correlation comparing
         distance matrices and selected correlation and
         similarity matrices  	    98

23       Number of taxa in each major taxonomic group in the
         Clinch River (1970) before and after relative species
         abundance was determined and rare taxa were eliminated . .    99

24       Twenty-nine taxa and their respective trophic codes
         for the reduced Clinch River data set, 1970	   100

25       Number of taxa with a relative abundance >0.01
         divided by the total number of taxa per station,
         Clinch River, 1970	   101

26       Taxa with a relative abundance >0.01 as percent of the
         total number of taxa per station, Clinch River, 1970 .  .  .   101

27       Total number of organisms per station, Clinch River,
         1970	    102


-------

 28       Twenty-nine taxa with relative abundance >0.01 as
          percent of total number of organisms per station,
          Clinch River, 1970	102

 29       Analytical technique, type of comparison, and the number
          and type of similarity coefficients used to analyze the
          reduced (1970) Clinch River data set  	  103

 30       Cophenetic correlation values (r_cc) for 24 dendrograms
          computed from the Clinch River data set, June to
          August 1970	104

 31       Results of Q-mode cluster analysis of zoomacrobenthic
          samples from the Clinch River, 1970; minimum level of
          similarity used to define clusters = 0.62; Jaccard's
          coefficient	105

 32       Results of Q-mode cluster analysis of zoomacrobenthic
          samples from the Clinch River, 1970; level of simi-
          larity used to define clusters = 0.06; correlation
          coefficient 6, √(Y + 0.5) transformation and standard-
          ization by rows	106

 33       Results of Q-mode cluster analysis of zoomacrobenthic
          samples from the Clinch River, 1970; level of simi-
          larity used to define clusters = 1.2; distance coefficient
          6, √(Y + 0.5) transformation and standardization by rows	107

 34       Results of Q-mode cluster analysis of zoomacrobenthic
          samples from the Clinch River, 1970; level of simi-
          larity used to define clusters = 4.8; distance co-
          efficient 7, √(Y + 0.5) transformation and standardi-
          zation by rows	108

 35       Summary of results of Q-mode cluster analyses,
          Cumberland River, 1973  	  109

 36       Summary of results of Q-mode cluster analyses,
          Cumberland River, 1975	111

 37       Results of R-mode cluster analysis of Jaccard's (S_J)
          and simple-matching (S_SM) coefficients, Clinch River
          data set, 1970	113

 38       Results of R-mode cluster analysis of distance
          coefficients, Clinch River data set,  1970	114

 39       Results of R-mode cluster analysis of S_SM and S_J,
          Clinch River stations unaffected by the 1970 pH stress	116

-------

 40       Trophic-functional codes for taxa clustered by R-mode
          cluster analysis of Jaccard's (S_J) and simple-matching
          (S_SM) coefficients, Clinch River data set, 1970	117

 41       Trophic-functional codes for taxa clustered by R-mode
          cluster analysis of Dist 7 coefficients, Clinch River
          data set, 1970	  118

 42       Trophic-functional codes for taxa clustered by R-mode
          cluster analysis of simple-matching (S_SM) and Jaccard's
          (S_J) coefficients, Clinch River, 1970	  119

 43       Results of R-mode cluster analysis of S_J and S_SM after
          reordering clusters according to trophic-functional
          codes, Clinch River, 1970 	  120

 44       Results of R-mode cluster analysis of Dist 7 coefficients
          after reordering clusters according to trophic-functional
          codes, Clinch River, 1970 	  122

 45       Clusters of taxa defined by R-mode cluster analysis of
          Jaccard's and the simple matching coefficients, Clinch
          River, 1970	123

 46       Species diversity (Brillouin's H)  of 36 zoomacrobenthic
          samples from the Clinch River, 1970	140

 47       Species diversity (approximate index H")  of 36 zoo-
          macrobenthic samples from the Clinch River, 1970	141

 48       Hierarchical taxonomic diversity (Brillouin's H) of
          zoomacrobenthic samples from the Clinch River, 1970 ....  142

 49       Component of species diversity (H)  at each level in the
          trophic-functional hierarchy for five samples or sub-
          samples collected immediately after the acid spill  on the
          Clinch River,  1970	145

 50       Percent of species diversity (H) contributed at each level
          in the trophic-functional hierarchy for five samples or
          subsamples collected immediately after the acid spill on
          the Clinch River,  1970	146

 51       Component of species diversity (H)  at each level in the
          head-body-respiratory functional morphology hierarchy for
          five samples or subsamples  collected immediately after the
          acid spill on  the  Clinch River,  1970	147

-------
                      LIST OF ABBREVIATIONS AND SYMBOLS
CIS                  computerized information system
Corr                 correlation coefficient; for use with species
                     counts
ΔT                   temperature change, °C
d                    Wilhm-Dorris diversity index
Dist                 distance coefficient; for use with species
                     counts
DM/IS                data management/information system
DMS                  data management system
H                    Brillouin's index
MDS                  multidimensional scaling
NT                   number of taxa
NTSYS                Numerical Taxonomy System
Q-mode comparison    pairwise comparison between columns;
                     used to determine similarity between sites
                     or stations on basis of biotic assemblages
                     present
r_cc                 coefficient of cophenetic correlation
R-mode comparison    pairwise comparison between rows; used to
                     determine similarity between species
                     assemblages on basis of distribution among
                     samples
SAS                  Statistical Analysis System
SCE                  standing crop estimates
S_J                  Jaccard's coefficient; for use with presence-
                     absence data
S_SM                 simple matching coefficient; for use with
                     presence-absence data
TF                   Trophic-Functional code
7LB&MS               Station 7, left bank and midstream
7RB                  Station 7, right bank
TSO                  time-sharing option
≥                    greater than or equal to
<                    less than

-------
                               ACKNOWLEDGMENTS
     The authors express their appreciation to Dr. Ralph H.  Brooks,
Mr. Billy G. Isom, and Dr. Harrison Hickey, Tennessee Valley Authority,
for their cooperation and support; Mr.  Clinton W.  Hall, Project Officer,
Environmental Protection Agency, for his patience  and support;  and
Ms. Rachel C. Strong for her drafting work.  Acknowledgment  is  also
given to Dr. Brian Armitage, Mr. Clay Barr, and Ms.  Sandra Emond of  the
Tennessee Valley Authority, for their help in developing many of the
coding schemes for the computerized information system.  Recognition is
also given Dr. Cornelius Weber, Environmental Protection Agency, and
Mr. Tom Toole and Dr. Ken Tennessen, Tennessee Valley Authority, for
their assistance in updating the BIO-STORET species  list.

-------
                                   SECTION  1

                                 INTRODUCTION
     An essential part of achieving national self-sufficiency in energy is
 minimizing the adverse environmental impacts that may accompany accelerated
 development and increased use of energy resources.  The Tennessee Valley
 Authority (TVA) has been studying ways to evaluate and minimize these impacts
 through the Federal Energy/Environment Research and Development Program,
 coordinated by the Office of Energy, Minerals, and Industry (OEMI) of the
 Environmental Protection Agency (EPA).  This program is designed to (1) add
 environmental objectivity and balance to the mission of the Department of
 Energy, (2) prevent delays in development of energy resources that are caused
 by inadequate environmental information, (3) develop cost-effective strategies
 for pollution control, (4) promote transfer of energy-related environmental
 information, and (5) project the impacts of future energy technologies
 (Environmental Protection Agency 1976).

     As  part of the federal interagency  agreement with EPA,  TVA's applied
 research program undertook a comprehensive evaluation  of the impacts of
 energy-related  technologies on the aquatic environment.  The research
 discussed in this milestone report summarizes the work completed during  the
 first two years of the project on Task 1 (Information  Systems Development) of
 Subagreement 10 (Consolidation of Thermal  Impacts on Freshwater Shellfish,
 Insects, and Other Biota).  The overall  objective of Task 1 was to develop the
 capability  to measure and  evaluate existing and expected environmental impacts
 of  energy-related technologies on important biotic assemblages (nonfish) in
 the aquatic environment.

     To  accomplish this objective, the first priority was to develop a
 computerized information system (CIS).   This system was designed to be
 inexpensive to  operate, user oriented, adaptable for use in both routine
 monitoring  studies and research projects, capable of performing a variety of
 analytical  procedures, and  able to interface with EPA's BIO-STORET system.

     Using  these criteria  and  the Statistical Analysis System  (SAS)
 (Barr et al., 1976) as the  basic  data management system, a CIS was developed
 that could  accommodate 19  different biotic and abiotic parameters.   Since it
 was important to conserve  space while including information for each
 parameter,  a series of numeric and alphanumeric codes was developed.   These
 codes not only  reduced storage requirements,  but saved time and resources each
 time the data were sorted,   compiled,  and analyzed.   Additional software
 packages were also adapted  or developed to interface with the CIS.   These
 routines expanded the analytical and computer-graphics capabilities of the
 system.

     After  the CIS had been developed, the next priority was to evaluate three
procedures used to analyze data from biological surveys.   The first procedure

-------
was the cluster analysis routine in the Numerical Taxonomy System (NTSYS)
written by F. J. Rohlf and associates.  Two applications of cluster analysis
were considered:  (1) Q-mode analysis, used to determine the similarity of
different samples or sampling stations on the basis of the species found in
each sample, and (2) R-mode analysis, used to identify associations of species
on the basis of their spatial and temporal distributions.

     The second analytical technique tested was nonmetric multidimensional
scaling (MDS), in which the information was presented in a scatter diagram
that was examined without first assuming that clusters were present.   MDS was
examined as a possible alternative to cluster analysis for
determining whether species form distinct biological assemblages.

     The third analytical procedure considered was hierarchical diversity
analysis.  This technique was an extension of the diversity measures  commonly
used to summarize biological data from environmental surveys.  It was used to
determine (1) the usefulness of two trophic-functional coding schemes
developed for zoomacrobenthic organisms and (2) the importance of species
diversity compared to diversities at higher taxonomic levels.

-------
                                  SECTION 2

                       CONCLUSIONS AND RECOMMENDATIONS
2.1  INTRODUCTION

     The primary purpose of "Task 1:  Information Systems Development" was to
develop a system for storing, retrieving, and analyzing data on aquatic
ecosystems.  A secondary objective was to examine and evaluate various
analytical techniques routinely used to summarize and predict the
environmental impacts of accelerated development of energy resources.  This
report summarizes the system and provides tests and demonstrations of
quantitative procedures used to analyze large data sets.  The methods tested
are cluster analysis, ordination, species diversity, and hierarchical
diversity analysis.

2.2  GENERIC

     Throughout the study the paramount importance of pertinent and
representative data in assuring sound environmental interpretations was
evident.  Although this point is intuitive, it is often a prerequisite that is
not achieved.  One controllable and important part of assuring that sound data
will be available is to make sure that the right questions are asked.
Attempts at general-purpose monitoring should be abandoned, and statistically
trained ecologists familiar with sampling theory should be brought into study
teams from the very start.  Moreover, extensive preliminary sampling should be
undertaken to discover subtleties of the ecosystem.  During actual monitoring
unessential parts of the preliminary sampling plan can be abandoned in favor
of more detailed sampling of critical areas.  Cursory study of a few adequate
samples will provide greater insight than detailed examination of many
inadequate ones.  Samples must not only be well located but also of sufficient
size that possible errors of sampling are minimized.  For example, with small
sample sizes the likelihood that rare species may be missed by chance alone
increases.  Thus, species may be present in some samples and absent from
others due to chance and not due to environmental reasons.
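
     The effect of sample size on the detection of rare species can be
illustrated with a short calculation.  The sketch below is not part of the
original study.  It assumes, purely for illustration, that individuals are
drawn independently and that a species makes up a fixed proportion p of the
community; the function name is hypothetical.

```python
def prob_species_missed(p, n):
    """Chance that a species with relative abundance p is absent from a
    random sample of n individuals (independent draws are assumed)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - p) ** n


# A species forming 1 percent of the community, at three sample sizes.
for n in (50, 200, 1000):
    print(n, round(prob_species_missed(0.01, n), 3))
```

Under these assumptions the chance of missing the species falls steadily as
the number of individuals examined increases, which is the point made above
about sample size.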

     Biological data may be recorded as presence-absence data or quantitative
data.   Presence-absence data may be obtained much more quickly and less
expensively, but information provided by differential abundances is lost.   For
most purposes of applied aquatic ecology, quantitative data are preferred, but
more work needs to be done to assess the utility of presence-absence data.
Such data are often all that is available, and quick results may be essential
in times of acute environmental crisis.

2.3  CLUSTER ANALYSIS

     Cluster analysis is a multivariate quantitative procedure that provides a
classification of samples.  The method is often useful in environmental

-------
surveys because it provides readily interpretable results in the form of a
tree-like graph called a dendrogram.   Similarities among all samples in a
study are presented simultaneously in the dendrogram, although some
distortion may be introduced during the clustering procedure.
Fortunately, this distortion is measurable, and interpretation of dendrograms
with unacceptable levels of distortion can be avoided.

     In Q-mode analyses, samples with similar faunas are grouped together.  In
aquatic surveys involving a perturbation of the environment, one expects
samples from upstream control stations to be grouped with downstream controls
in the zone of recovery.  Unfortunately, faunas at downstream stations are
sometimes inherently different from upstream ones because of changes in stream
gradient, discharge, substrate, and other factors.

     R-mode cluster analyses indicate similarities among, or associations of,
species on the basis of their distribution and on the basis of abundance if
quantitative coefficients are used.  Q-mode analyses have been used much more
frequently than R-mode in applied ecology, but R-mode analysis has promise as
a method of comparing faunal associations from stream to stream or basin to
basin.

     A further disadvantage of cluster analysis and a further source of
distortion is that it forces samples into clusters whether or not such
clusters exist in nature.  Although the total amount of distortion can be
measured, its effects cannot be determined precisely without time-consuming
study of the raw data.

     For cluster analysis of presence-absence data, we recommend the use of
Jaccard's coefficient, which bases similarity between stations on only the
mutual presence of organisms at stations and not on their absences.
Quantitative data should be analyzed with correlation or distance coefficients
after the data have been transformed with a square-root transformation to
reduce inordinate effects of highly abundant species.  Future research should
be directed toward evaluating use of presence-absence data.
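
     The difference between Jaccard's coefficient and the simple matching
coefficient can be shown with a small example.  The sketch below is an
illustration only; the station lists are invented, and the function names are
hypothetical.  Jaccard's coefficient ignores taxa absent from both stations,
whereas the simple matching coefficient counts joint absences as agreement.

```python
def jaccard(a, b):
    """Jaccard's coefficient: shared taxa / taxa present at either station."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        raise ValueError("both stations are empty")
    return len(a & b) / len(union)


def simple_matching(a, b, all_taxa):
    """Simple matching: (joint presences + joint absences) / all taxa."""
    a, b, all_taxa = set(a), set(b), set(all_taxa)
    if not all_taxa:
        raise ValueError("the list of taxa is empty")
    matches = sum((t in a) == (t in b) for t in all_taxa)
    return matches / len(all_taxa)


# Invented presence lists for two stations and a survey-wide taxon list.
station_1 = {"Chironomus", "Hexagenia", "Corbicula"}
station_2 = {"Chironomus", "Corbicula", "Cheumatopsyche"}
taxa = station_1 | station_2 | {"Argulus", "Stenonema", "Physa", "Tubifex"}

print(jaccard(station_1, station_2))                # 0.5
print(simple_matching(station_1, station_2, taxa))  # 0.75
```

In this example the four taxa absent from both stations raise the simple
matching coefficient but leave Jaccard's coefficient unchanged.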

2.4  ORDINATION

     The results of ordination are presented as a scatter diagram in which
stations are plotted as points in a space defined by axes that represent the
faunal similarities of stations to each other.  The goal of ordination is
similar to that of Q-mode cluster analysis because both are computed from
similarities among stations.  It differs in that no a priori assumptions need
be made about the presence of clusters in the data.  Although little use has
been made of ordination in applied aquatic ecology, it has potential as an
analytical tool.  Streams are linear ecosystems, and to cluster stations along
such an environmental gradient may require forcing them into unrealistic
configurations.  An ordination can let the investigator determine if clusters
exist.

     Several methods of ordination are available.  We have tested only
nonmetric multidimensional scaling, a technique that seems ideally suited to
data from aquatic environmental surveys.  Because little use has been made of
ordination, we have evaluated it only against the results of cluster analysis.
In general, it seems to give comparable results, being especially sensitive to

-------
 inadequate data,  from which it produces  uninterpretable  results.  Future
 research should  stress additional  study  of  nonmetric  multidimensional  scaling,
 especially applications to  presence-absence data.

 2.5   INDICES  OF  DIVERSITY

      An  index of species diversity is  a  single  statistic  that expresses both
 the  number of species present  and  the  evenness  of distribution of organisms
 among species.   Unlike cluster analysis  and ordination,  it does not  consider
 which species are present.   Thus,  a  sample  collected  in  a zone of recovery
 downstream from  a source of pollutional  stress  may have  a species diversity
 equal to samples  from upstream control stations, yet  the  species may be
 altogether different.   Because cluster analysis and ordination measure a
 different aspect  of  community  structure  from species  diversity, the  methods
 should be used in combination  for  best results.

      The concept  of  species  diversity  has been  misunderstood, misapplied, and
 subsequently  much maligned  in  the  ecological literature.  Too many
 investigators have overlooked  the  nonuniqueness of the relationship  between a
 particular community structure and the index of species diversity computed
 from it.   Others  have  sought global values  of species diversity to indicate
 healthy  or damaged ecosystems,  overlooking  the  dependence of all such indices
 on sample size.   Nevertheless,  species diversity is a useful tool for applied
 aquatic  ecologists.   If  used properly  and not overinterpreted, it gives a
 useful and efficient measure of community structure for communicating
 information about the  state of the ecosystem to nonbiologists.

      Species  diversity may be  partitioned according to categories in the
 taxonomic hierarchy  to give diversity  components contributed by orders,
 families,  genera, and  species.  For some purposes of applied aquatic ecology,
 discrimination of genera may provide as much environmental information as
 identification of species,  with savings of time and money.  Diversity may also
 be partitioned according to other hierarchical classifications that consider
 functional morphology and feeding strategies.  Hierarchical diversity was
 tested in all three ways in the report and was found to be useful.

      Species diversity can be  computed with a number of equations.   We
 recommend  the use of Brillouin's equation from information theory.   Other
 equations  from information theory are biased and often give misleading
 results.   Hierarchical diversity analysis should be applied in future studies
 to help reduce the cost of  aquatic environmental surveys.
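
      As an illustration, Brillouin's index can be computed from a list of
 counts per species as H = (1/N) log(N! / (n1! n2! ... ns!)).  The sketch
 below is not the program used in this study.  It uses natural logarithms,
 which is an assumption, and it shows Shannon's sample estimate only for
 comparison; the function names are hypothetical.

```python
import math


def brillouin(counts):
    """Brillouin's H = (1/N) * ln(N! / prod(n_i!)), natural logarithms."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("no organisms in the sample")
    log_factorial = lambda k: math.lgamma(k + 1)  # ln(k!)
    return (log_factorial(n) - sum(log_factorial(c) for c in counts)) / n


def shannon(counts):
    """Shannon's H' = -sum(p_i * ln p_i), estimated from sample proportions."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("no organisms in the sample")
    return -sum((c / n) * math.log(c / n) for c in counts)


sample = [10, 5, 1]  # invented counts for three species
print(round(brillouin(sample), 3), round(shannon(sample), 3))
```

 For the same counts the two values differ most when samples are small, and
 the difference shrinks as the number of organisms grows.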

2.6  FINAL STATEMENT

     Although most of the methods of applied aquatic ecology were derived from
theoretical ecology,  the goals of the two sciences  are not the same.   It is
incumbent on applied ecologists to adapt  and modify existing methods  to suit
their needs and to address  questions of urgency. At the  same time,  they must
continue  to recognize their dependence on the groundwork  laid by  more
theoretically inclined ecologists.

-------
                                  SECTION 3

                    DEVELOPMENT OF THE INFORMATION SYSTEM


3.1  GENERAL DESCRIPTION OF THE INFORMATION SYSTEM

     When TVA initiated the research program on effects of energy use and
development, Task 1 (Information Systems Development) was responsible for
developing a system through which to measure and evaluate the impact of energy
technologies on biotic assemblages.  To accomplish this task, a computerized
information system (CIS) was developed to accommodate biological data.
Criteria for design of the system were that it must interface with EPA's
BIO-STORET system, be inexpensive to operate, be user oriented, be adaptable
for use with both routine monitoring and research programs, and perform a
variety of analytical functions.

     A flow diagram illustrating the CIS and steps involved in processing the
data is presented in Figure 1.  In its present stage of development, the CIS
uses (1) the Statistical Analysis System (SAS) (Barr et al., 1976) to create
and manipulate data sets; print, sort, rank, and store data; and perform
analyses such as simple descriptive statistics, Duncan's multiple range test,
analysis of variance, correlation, probit analysis, and regression; (2) the
Numerical Taxonomy System (NTSYS) for cluster analysis and ordination with
nonmetric multidimensional scaling; (3) the MIT-SNAP programs (Hoaglin and
Welsch, 1975) for resistant regression and box plotting; (4) user-written
FORTRAN programs to calculate diversity indices; and (5) Tektronix software
for producing instantaneous hard-copy graphics.

3.2  PREPARATION OF DATA

     To store and manipulate the large volume of information available within
TVA, a series of numeric and alphanumeric codes was developed to allow for
easier storage and retrieval of biological data; to save time and resources
each time the data were sorted, compiled, and analyzed; and to centralize
storage of environmental data.

     An example of the standard coding form used to transfer field and labora-
tory data into a format for keypunching is presented in Figure 2.  The type of
program (macrobenthos, phytoplankton,  or zooplankton), account number, job
number, data originator, address and phone number of originator, and sheet
number are recorded at the top of each sheet.

     The standard coding form has 80 columns and 26 rows.   The 80 columns are
divided into 19 data fields, with 1 to 16 columns per field.  The data fields
correspond to the following variables:

-------
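Figure 1.  Flow diagram of the computerized information system (figure not
reproduced).  Legible labels:  raw data; data preparation (data conversions,
coding, keypunching); data storage (tape, disc, cards).
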
-------
Coding form (figure not reproduced).  Header entries:  Account No., Job No.,
Data Originator, Program, Address, Sheet No., and Phone (Ext).  Legible column
headings:  River, Mile, Latitude/Longitude, Date, Gear Code, Collect. Number,
Depth, Temp., Time, Conc., Toxicant, Species Code, and Number of Organ., with
column positions marked at intervals of five columns up to column 80.

Figure  2.  Coding form used for  keypunching field and laboratory limnological data.

-------
      1.   river code                  11.   sample type
      2.   river mile                  12.   replicate number
      3.   latitude and longitude      13.   reporting unit
      4.   date                        14.   habitat type
      5.   gear code                   15.   instar and size class
      6.   sample location  code        16.   toxicant
      7.   collection number           17.   concentration of toxicant
      8.   depth                       18.   species code
      9.   temperature                 19.   number of organisms
    10.   time

      The  first data field consists of a two-digit numeric code for rivers and
streams in the Tennessee Valley  (Table 1) and sources of cultures utilized in
bioassays.

      Columns 3 to 7 contain the  river-mile location of a sampling site.  This
information specifies the distance a site is located upstream from the mouth
of the river and is usually obtained from a basin navigation map.

      The  latitude and longitude  of a site are entered in columns 8 to 15.
These data are reported in degrees and minutes.  The first four columns are
for latitude, and the second four columns are for longitude.  For example, a
sample taken at latitude 36°19' and longitude 86°23' is entered as
"36198623."

      Columns 16 to 21 specify the date of sampling or bioassay.  The date is
entered numerically by year, month, and day with "730615" denoting June
15, 1973.
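
      The latitude-longitude and date fields described above can be decoded
as sketched below.  This is an illustration only, not part of the original
system.  The function names are hypothetical, and the century used to expand
the two-digit year is an assumption.

```python
import datetime


def decode_lat_lon(field):
    """Split columns 8-15 into (lat_deg, lat_min, lon_deg, lon_min)."""
    if len(field) != 8 or not field.isdigit():
        raise ValueError("expected eight digits, e.g. '36198623'")
    return tuple(int(field[i:i + 2]) for i in range(0, 8, 2))


def decode_date(field, century=1900):
    """Convert a YYMMDD field to a date; the century is an assumption."""
    if len(field) != 6 or not field.isdigit():
        raise ValueError("expected six digits, e.g. '730615'")
    year, month, day = (int(field[i:i + 2]) for i in range(0, 6, 2))
    return datetime.date(century + year, month, day)


print(decode_lat_lon("36198623"))  # (36, 19, 86, 23)
print(decode_date("730615"))       # 1973-06-15
```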

     The code identifying the type of equipment used to collect a sample is
entered in columns 22 to 26.  This hierarchical gear code has six general
categories:  undefined,  artificial substrates, natural substrate removal,
direct organism removal, emergence traps, and volume samplers (Table 2).  Five
of these  categories are divided  into subgroupings to accommodate information
such as sampler name, mesh size, or type of substratum used.  Each sampling
method or device has a five-character alphanumeric code, starting with the
letter "A" and followed by four numbers.   Generally, the first two digits
identify the type of sampling equipment,  and the next two digits describe the
specific type of sampling material used in artificial substrate samplers
(Tables 2.1-2.3).   For example, the four major categories of artificial
substrate samplers are baskets, trays,  flat surfaces,  and sterile indigenous
substrates; and one or more substrate materials can be used with each.  Thus,
the code "A1210" denotes a tray sampler with a rock substratum.  The code
"A1410" indicates that the sampler uses sterile, indigenous rocks (Table 2.1).

     Table 2.2 lists the gear codes for natural substrate removal and organism
removal.   The gear codes for natural substrate removal have five main
categories, three of which give specific  information on the name or type of
sampler used.   For example,  the code "A2200" is the general code number for
corers, but "A2250" is the code for the Benthos 2170 gravity corer.   Again,
the last two digits indicate the type of  gear used.
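
     Because the gear code is hierarchical, the more general codes above a
given code can be recovered by padding its leading characters with zeros.
The sketch below is an illustration only.  The category names follow
Table 2.0; the function name and the handling of letter-suffixed codes such
as "A13AA" are assumptions.

```python
GEAR_CATEGORIES = {
    "0": "Undefined",
    "1": "Artificial substrates",
    "2": "Natural substrate removal",
    "3": "Direct organism removal",
    "4": "Emergence traps",
    "5": "Volume samplers",
}


def gear_lineage(code):
    """Return (category name, [more general codes..., code])."""
    if len(code) != 5 or code[0] != "A" or code[1] not in GEAR_CATEGORIES:
        raise ValueError("not a recognized gear code: %r" % code)
    lineage = []
    for depth in range(2, 6):
        parent = code[:depth] + "0" * (5 - depth)
        if parent != code and not parent[1:].isdigit():
            continue  # skip padded forms of letter-suffixed codes (A13AA)
        if parent not in lineage:
            lineage.append(parent)
        if parent == code:
            break
    return GEAR_CATEGORIES[code[1]], lineage


print(gear_lineage("A1210"))
# ('Artificial substrates', ['A1000', 'A1200', 'A1210'])
```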

     Columns 27 and 28 contain a two-digit code (Table 3) that identifies the
distance a sample was collected from the  riverbank.   The codes "01" to "49"

-------
indicate distance from the right bank (facing downstream), and the codes "50"
to "98" are coded distances from the left bank.  Code "63," for example,
indicates that a sample was taken 25.1 to 30.0 m from the left bank, facing
downstream.
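
     The bank from which a distance is measured follows directly from the
range in which the code falls.  The sketch below is an illustration only and
covers only the bank; the distance intervals of Table 3 are not reproduced
here, and the function name is hypothetical.

```python
def bank_side(location_code):
    """Return 'right' or 'left' (facing downstream) for a two-digit code."""
    if len(location_code) != 2 or not location_code.isdigit():
        raise ValueError("expected a two-digit code")
    value = int(location_code)
    if 1 <= value <= 49:
        return "right"
    if 50 <= value <= 98:
        return "left"
    raise ValueError("codes 00 and 99 are not described in the text")


print(bank_side("63"))  # left
```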

     The collection number is a five-character alphanumeric code in columns 29
through 33.  The first two characters identify the location or project at
which a sample was collected.  The remaining three digits refer to
the 1st, 2nd, 3rd, .  .  .  , or 999th time a sample was collected at the site.
Table 4 lists only the alphabetic prefix of the collection code.

     The depth (in meters) a sample was collected is recorded in columns 34 to
37.  For example, samples collected 3 and 24.5 m below the surface are
recorded as "03.0" and "24.5."  The decimal point occupies a separate column.
If the depth occupies only three columns, a zero is placed on the left, as in
"03.0."  A computer printout of 3.0 is obtained by right justifying all
integers.

     The water temperature (in °C) at which the sample was collected or at
which the bioassay was performed is entered in columns 38 to 41.  A
temperature of 21.2°C is entered as "21.2."  Again, the decimal point is
entered in a separate column.  If the temperature occupies only three columns,
zeros are placed to the left to fill the space; e.g., 9.6°C is entered as "09.6."
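
     The four-column depth and temperature fields share the same layout:  one
decimal place, a decimal point in its own column, and zeros on the left.  The
sketch below is an illustration only; the function names are hypothetical.

```python
def encode_reading(value, width=4):
    """Format a depth or temperature as a zero-padded fixed-width field."""
    text = f"{value:0{width}.1f}"
    if len(text) != width:
        raise ValueError(f"{value!r} does not fit in {width} columns")
    return text


def decode_reading(field):
    """Convert a field such as '03.0' back to a number."""
    return float(field)


for reading in (3, 24.5, 21.2, 9.6):
    print(encode_reading(reading))  # 03.0, 24.5, 21.2, 09.6
```

A value of 100.0 or more does not fit in four columns, so the sketch raises
an error rather than silently widening the field.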

     The time of day (0000 to 2400) a sample was collected, or the length of
time for a bioassay,  is recorded in columns 42 through 45.   A sample collected
at 2:45 p.m. is entered as "1445."  A data entry for hour 48 of a bioassay is
entered as "0048."

     The code for sample  type is a one-character alphabetic code entered in
column 46 (Table 5).   This code identifies the type (community, parameter) of
sample or data collected  or reported, such as periphyton, phytoplankton,
chlorophyll, or bioassay.

     The replicate number is entered in columns 47 and 48.   Because replicate
sampling and testing is done for most surveys and bioassays, it is important
to know not only how many replicates were collected but also which replicate
a record represents.  This code refers to a given sample or replicate and not to the
total number collected.  Thus, "06" refers to the sixth sample collected at a
particular sample station, and "10" refers to the tenth sample taken at the
same station.

     The units in which the data are reported are coded as a two-character
alphanumeric code entered in columns 49 and 50.  Table 6 lists the units
commonly used.  These units are listed according to general categories of
area, chlorophyll pigments, percentages, productivity-respiration,
radiation-light,  rate,  flags and foul-ups, length, number,  ratio, temperature,
time, turbidity,  volume,  weight, and zooplankton.

     The code for habitat type or ecologic zone is entered in columns 51
and 52.  This two-digit code identifies the habitat (ecologic zone and
dominant substratum)  found at each site.  In the case of bioassays, the code
indicates the range of hardness (Table 7).

-------
     Columns 53 to 55 contain a three-character code for instar and size
class.  The first column refers to the stage of development or to the set of
characters used to express the stage of development of a  specimen:

                              I--instar
                              S--general size class
                              L--length class
                              P--pupal stage
                              H--head capsule
                              A--immature

The next two columns contain a numeric code that represents size intervals in
millimeters (Table 8).  Because instars are not always determined by measure-
ments of total body length, it is important to specify on the laboratory bench
sheet whether head capsule dimensions, wing pad length, or other measurements
were used to determine the instar.  Examples of common codes are:

          L48--length class:  100.0 to 125.0 mm
          S07--size class:  0.61 to 0.70 mm
          P00--pupal stage (interval code disregarded)
          A00--nauplius immature (interval code disregarded)
          H03--head capsule width:  0.21 to 0.30 mm
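
     A code of this kind can be separated into its stage letter and interval
number as sketched below.  This is an illustration only.  The stage letters
follow the list given above; the function name is hypothetical, and the size
intervals of Table 8 are not reproduced, so only the interval number is
returned.

```python
STAGE_TYPES = {
    "I": "instar",
    "S": "general size class",
    "L": "length class",
    "P": "pupal stage",
    "H": "head capsule",
    "A": "immature",
}


def parse_stage_code(code):
    """Return (stage description, interval number) for columns 53-55."""
    if len(code) != 3 or code[0] not in STAGE_TYPES or not code[1:].isdigit():
        raise ValueError("not a recognized instar or size code: %r" % code)
    return STAGE_TYPES[code[0]], int(code[1:])


print(parse_stage_code("H03"))  # ('head capsule', 3)
```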

     Additional bioassay information where appropriate is recorded in columns
56 through 59.  Symbols from the periodic table or a coded listing of
compounds are entered in columns 56 and 57, and the concentration of the
toxicant is entered in columns 58 and 59.  If data on a toxicant are recorded,
the units code entered in columns 49 and 50 gives the units of the
concentration of the toxicant.  For example, a combination of "68" in columns
49 and 50 (units; see Table 6),  "HG" in columns 56 and 57 (toxicant),  and "05"
in columns 58 and 59 (concentration of toxicant) indicates that a
concentration of 5 µg/l mercury was used in the bioassay.

     The next field refers to the 16-unit biological species code, which is
entered in columns 60 to 75.  This code identifies eight taxonomic categories.
Columns 60 and 61 identify the phylum or division; columns 62 and 63 identify
class; columns 64 and 65 identify order;  columns 66 and 67 identify family;
columns 68 to 70 identify genus; columns 71 to 73 identify species; and
columns 74 and 75 identify either the variety,  form, or authority.  The
numeric values are 01 to 99 for the two-digit fields and 001 to 999 for the
three-digit fields.

     A typical 16-unit species code is "1801190100100100," which gives the
following information:

          18                          Phylum:   Arthropoda
          1801                         Class:   Crustacea
          180119                       Order:   Branchiura
          18011901                    Family:   Thalestridae
          18011901001                  Genus:   Argulus sp.
          18011901001001             Species:   japonicus
          1801190100100100         Authority:   Thiele
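
     The breakdown shown above can be reproduced by slicing the 16-unit code
into fixed-width parts.  The sketch below is an illustration only.  The
widths (2, 2, 2, 2, 3, 3, 2) are read from the example code; the field labels
and function name are hypothetical.

```python
SPECIES_FIELDS = [
    ("phylum", 2), ("class", 2), ("order", 2), ("family", 2),
    ("genus", 3), ("species", 3), ("variety_form_authority", 2),
]


def split_species_code(code):
    """Split a 16-digit species code into its taxonomic parts."""
    if len(code) != 16 or not code.isdigit():
        raise ValueError("expected a 16-digit species code")
    parts, position = {}, 0
    for name, width in SPECIES_FIELDS:
        parts[name] = code[position:position + width]
        position += width
    return parts


print(split_species_code("1801190100100100"))
```

For the example code this gives phylum "18", class "01", order "19", family
"01", genus "001", species "001", and authority "00".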

-------
     The codes for species found in the Tennessee Valley were compiled in the
Synoptic Catalog of Algae and Aquatic Invertebrates for the Tennessee Valley
included as Appendix A of this report.

     The total number of organisms per data entry (row) is recorded in
columns 76 to 80.  For field collections this number refers to the number of
organisms collected per data entry (sample or replicate).  To indicate that 97
individuals of Chironomus tentans Fabricius were found per square meter,  "NG"
is entered in the units field (columns 49 and 50), "1802111501602000" is
entered as the species code (columns 60 to 75),  and "00097" is entered as the
number of organisms (columns 76 to 80).  For recording bioassay results,
columns 76 to 80 are utilized to indicate the number of organisms killed  and
the total number of organisms tested, respectively.  For example, to indicate
that after 48 hours of an acute bioassay, 28 of 40 specimens of C. tentans had
died, "0048" is entered in columns 42-45, "H" is entered in column 46,
"1802111501602000" is entered in columns 60-75,  "28" is entered in columns
76-77, column 78 is left blank, and "40" is entered in columns 79-80.
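
     The complete 80-column record can be read by slicing each field at the
column positions given in this section.  The sketch below is an illustration
only and is not the SAS or FORTRAN code used in the system.  The field labels
and function names are hypothetical; the column ranges are those stated in
the text.

```python
LAYOUT = [
    ("river_code", 1, 2), ("river_mile", 3, 7), ("lat_lon", 8, 15),
    ("date", 16, 21), ("gear_code", 22, 26), ("sample_location", 27, 28),
    ("collection_number", 29, 33), ("depth", 34, 37),
    ("temperature", 38, 41), ("time", 42, 45), ("sample_type", 46, 46),
    ("replicate", 47, 48), ("reporting_unit", 49, 50), ("habitat", 51, 52),
    ("instar_size", 53, 55), ("toxicant", 56, 57), ("toxicant_conc", 58, 59),
    ("species_code", 60, 75), ("organism_count", 76, 80),
]


def build_record(**fields):
    """Place the given field values on a blank 80-column card image."""
    card = [" "] * 80
    for name, start, end in LAYOUT:
        value = fields.get(name, "")
        if len(value) > end - start + 1:
            raise ValueError("%s is too wide for columns %d-%d"
                             % (name, start, end))
        card[start - 1:start - 1 + len(value)] = value
    return "".join(card)


def parse_record(card):
    """Slice a card image into a dictionary of 19 text fields."""
    if len(card) > 80:
        raise ValueError("a card image has at most 80 columns")
    card = card.ljust(80)
    return {name: card[start - 1:end] for name, start, end in LAYOUT}


def parse_bioassay_count(field):
    """Columns 76-77 hold the number killed, 79-80 the number tested."""
    return int(field[0:2]), int(field[3:5])


# The Chironomus tentans example from the text; other fields left blank.
card = build_record(reporting_unit="NG", species_code="1802111501602000",
                    organism_count="00097")
record = parse_record(card)
print(record["species_code"], int(record["organism_count"]))
print(parse_bioassay_count("28 40"))  # (28, 40)
```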

3.3  EDITING OF DATA

     After the biological data are coded and transcribed onto the coding  form,
they are keypunched onto cards of a specified color, depending on the type of
data collected.  For example, zooplankton data are punched on yellow cards,
zoomacrobenthic data on red cards, phytoplankton data on green cards,
carbon-14 data on orange cards, and chlorophyll  data on white cards.

     After the data have been punched,  they are  read into the computer where a
SAS program sorts and merges the data with another SAS data set that contains
the taxonomic name associated with each species  code.  If a data record does
not have a taxonomic name for a given code, an error message is printed.   A
list of the data is then checked by the principal investigator to ensure  that
all other requirements of the data have been met.  If a record is incomplete,
new data are added by means of a user-written FORTRAN program.  For instance,
if a zoomacrobenthic species was found in sample replicates 1, 5, 6, and  9 at
a site, the FORTRAN program would insert zeros for replicate numbers 2, 3, 4,
7, 8, and 10 if ten replicated samples had been taken at the station.  Once
this is done, SAS can be used to sort,  print, and perform statistical
analyses.  Additional data can also be merged with the test data set at this
time.
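
     The zero-filling step described above can be sketched as follows.  This
is an illustration only and is not the FORTRAN program itself.  The counts
are invented, and the function name is hypothetical.

```python
def fill_missing_replicates(counts_by_replicate, n_replicates):
    """Return counts for replicates 1..n, inserting 0 where none was found."""
    for replicate in counts_by_replicate:
        if not 1 <= replicate <= n_replicates:
            raise ValueError("replicate %r is out of range" % replicate)
    return {replicate: counts_by_replicate.get(replicate, 0)
            for replicate in range(1, n_replicates + 1)}


# A species found in replicates 1, 5, 6, and 9 of ten (invented counts).
observed = {1: 12, 5: 3, 6: 7, 9: 1}
filled = fill_missing_replicates(observed, 10)
print(filled)
```

Without the inserted zeros, means and variances computed over replicates
would be based only on the replicates in which the species occurred.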

     Once these steps have been completed, a formatted data set can be output
via a time-sharing option (TSO) or batch processed for use with other
programs.  Programs used on TSO or submitted to  batch from TSO are:

     1.   SPECLIST provides a species list.  This list is checked to ensure
          that all organisms listed were actually found in the study.

     2.   MATRIX provides a data matrix for use  in NTSYS.  NTSYS language
          control cards are inserted on TSO, and the programs are submitted to
          batch.

     3.   DIVER calculates the diversity index H'.

-------
     4.   MIT-SNAP provides median polish, the routine used from this
          software package.

     5.   NTSYS provides 7 types of clustering, 7 types of ordination, 24 data
          transformations, and 22 indices of similarity or dissimilarity.

     6.   User-written programs calculate species diversity (H) and
          hierarchical diversity.
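
     Hierarchical diversity can be illustrated with a two-level example in
which species diversity is divided into a between-genus component and a
within-genus component.  The sketch below is not one of the user-written
programs listed above.  It assumes Brillouin's index with natural logarithms
and only two taxonomic levels; the counts and function names are
hypothetical.

```python
import math


def brillouin(counts):
    """Brillouin's H = (1/N) * ln(N! / prod(n_i!)); 0.0 for an empty list."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n == 0:
        return 0.0
    return (math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)) / n


def hierarchical_diversity(counts_by_genus):
    """Return (between-genus H, weighted mean within-genus H)."""
    genus_totals = {g: sum(sp.values()) for g, sp in counts_by_genus.items()}
    n = sum(genus_totals.values())
    if n == 0:
        raise ValueError("no organisms in the sample")
    between = brillouin(genus_totals.values())
    within = sum(genus_totals[g] / n * brillouin(sp.values())
                 for g, sp in counts_by_genus.items())
    return between, within


# Invented counts: {genus: {species: number of organisms}}.
sample = {
    "Chironomus": {"tentans": 97, "sp. A": 12},
    "Argulus": {"japonicus": 5},
}
between, within = hierarchical_diversity(sample)
print(round(between, 3), round(within, 3))
```

With Brillouin's index the two components add up to the diversity computed
directly from the species counts, so the share contributed at each taxonomic
level can be read off.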

-------
TABLE 1.  NUMERICAL CODES FOR RIVERS AND STREAMS IN THE TENNESSEE VALLEY,
      AND SOURCES OF ORGANISMS FOR BIOASSAYS (UTILIZED IN COLUMNS 1
                      AND 2 OF CODING FORM--FIGURE 2)

Numerical   Alphabetical
code        code           Identification
00          TRM            Tennessee
01          ERM            Elk
02          HRM            Hiwassee
03          LTRM           Little Tennessee
04          FBRM           French Broad
05          HORM           Holston
06          HSRM           South Fork Holston
07          WRM            Watauga
08          CURM           Cumberland
09          CRM            Clinch
11          SRM            Stones
12          CFRM           Caney Fork
13          MRM            Mississippi
14          OHRM           Ohio
15          GRM            Green
16          DRM            Duck
17          OBRM           Obey
18          BRM            Buffalo
19          PRM            Powell
20          HNRM           North Fork Holston
21          HPRM           Harpeth
22          NRM            Nolichucky
23          PGRM           Pigeon
24          SQRM           Sequatchie
25          CHRM           Cheoah
26          NTRM           Nantahala
27          TGRM           Tuckasegee
28          NLRM           Nottely
29          TCRM           Toccoa
30          EMR            Emory
31          OCO            Ocoee
32          TELL           Tellico
33          CFM            Clear Fork River
34          TCM            Town Creek
35          CCM            Crooked Creek
36          YCM            Yellow Creek
37                         Research Station, Browns Ferry
38                         Rearing Ponds, NFDC, Muscle Shoals
39                         EPA Laboratory, Corvallis, OR

 Used by other data storage systems in TVA.

-------
 TABLE 2.0.   CODES FOR IDENTIFYING METHODS  OF  LIMNOLOGICAL  SAMPLING,
          AND TYPES OF GEAR (UTILIZED IN COLUMNS  22-26  OF
	CODING FORM—FIGURE  2)
                   A0000    Undefined

                   A1000    Artificial  substrates

                   A2000    Natural  substrate  removal

                   A3000    Direct organism removal

                   A4000    Emergence traps

                   A5000    Volume samplers

-------
 TABLE 2.1.  CODES FOR TYPES OF SAMPLING EQUIPMENT AND SAMPLING
          MATERIAL USED IN ARTIFICIAL-SUBSTRATE SAMPLING
A1000     Artificial substrate

A1100     Baskets
A1110       Rock
A1111         Rocks, bagged (basket collected in bag)
A1112         Rocks, unbagged (basket not collected in bag)
A1120       Leaf
A1130       Brush
A1140       Conservation webbing
A1150       Other synthetic material
A1160       Balls (porcelain or other material)
A1170       Concrete and pebble blocks
A1171       Concrete, pebble blocks, and conservation webbing

A1200     Trays
A1210       Rock
A1220       Pebbles
A1230       Sand
A1240       Silt
A1250       Clay
A1260       Mud
A1270       Conservation webbing
A1280       Other synthetic material

A1300     Flat surfaces
A1310       Multiplate samplers
A1320       Glass slides or plates
A1330       Plexiglass slides or plates
A1340       Plastic strips or sheets
A1350       Polyethylene plates
A1360       Polyurethane
A1370       Polystyrene
A1380       Iron plates
A1390       Ceramic tile or block
A13AA       Wood
A13BB       Cement tile or block

A1400     Sterile indigenous substrate
A1410       Rock
A1420       Wood
A1430       Aquatic vascular plants

-------
TABLE 2.2.  CODES FOR TYPES OF SAMPLING EQUIPMENT AND SAMPLING
MATERIAL USED IN NATURAL SUBSTRATE REMOVAL AND ORGANISM REMOVAL
              A2000  Natural substrate removal

 A2100     Dredges
 A2110       Ekman
 A2120       Petersen
 A2130       Ponar
 A2140       Franklin-Anderson
 A2150       Shipek
 A2160       Dietz-LaFond
 A2170       Orange peel dredge
 A2180       Tonolli spiraling mud burrower

 A2200     Corers
 A2210       Vertical core sampler
 A2211         Cork borer
 A2212         Dendy inverted sampler
 A2213         Phleger corer
 A2214         Ewing piston corer
 A2220       FRB multiple corer
 A2230       Peat corer
 A2240       Deep core sampler
 A2250       Benthos 2170 gravity corer
 A2260       Alpine 211 gravity corer
 A2270       PVC pipe corer
 A2280       Glass tube corer
 A2290       Kajak corer

 A2300     Area samplers
 A2310       Surber square foot sampler
 A2320       Wilding square foot sampler
 A2330       Hess circular sampler
 A2340       Neill sampler
 A2350       Dome sampler
 A2360       Diver-Actuated Sampler  -  Circular
 A2361       Diver-Actuated Sampler  -  Square

 A2400     Scrapes

 A2500     Ooze suckers
                  A3000  Organism removal
A3100     Kick nets
A3200     Hand net sweeps
A3300     Drift-nets
A3400     Grab samples
A3500     Trawls

-------
TABLE 2.3.  CODES FOR TYPES OF SAMPLING EQUIPMENT USED WITH
             EMERGENCE TRAPS AND VOLUME SAMPLERS	

                  A4000  Emergence traps

     A4100     Light traps
     A4110     Lantern-sheet method
     A4120     Incandescent
     A4130     Fluorescent
     A4140     Black light (UV)

     A4200     Submerged traps
     A4300     Floating
     A4400     Aerial net traps
     A4500     Staked box traps
     A4600     Hand net sweeps
                  A5000  Volume samplers

     A5100     Juday traps
     A5200     Kemmerer bottles
     A5300     Van Dorn bottles
     A5400     Clarke-Bumpus plankton sampler
     A5500     Nansen bottle
     A5600     Grab samples
     A5700     Undefined
     A5800     0.5-Meter tow net (#20 mesh)

-------
     TABLE  3.   CODES  FOR  IDENTIFYING LOCATION OF LIMNOLOGICAL  SAMPLES
         ON A  GIVEN  RIVER OR RESERVOIR TRANSECT  (UTILIZED  IN
             COLUMNS 27  AND 28 OF CODING FORM—FIGURE  2)

Code(a)   Interval (m)          Code(b)   Interval (m)

01           0.1 -    1.0       50           0.1 -    1.0
02           1.1 -    2.0       51           1.1 -    2.0
03           2.1 -    3.0       52           2.1 -    3.0
04           3.1 -    4.0       53           3.1 -    4.0
05           4.1 -    5.0       54           4.1 -    5.0
06           5.1 -    6.0       55           5.1 -    6.0
07           6.1 -    7.0       56           6.1 -    7.0
08           7.1 -    8.0       57           7.1 -    8.0
09           8.1 -    9.0       58           8.1 -    9.0
10           9.1 -   10.0       59           9.1 -   10.0
11          10.1 -   15.0       60          10.1 -   15.0
12          15.1 -   20.0       61          15.1 -   20.0
13          20.1 -   25.0       62          20.1 -   25.0
14          25.1 -   30.0       63          25.1 -   30.0
15          30.1 -   35.0       64          30.1 -   35.0
16          35.1 -   40.0       65          35.1 -   40.0
17          40.1 -   45.0       66          40.1 -   45.0
18          45.1 -   50.0       67          45.1 -   50.0
19          50.1 -   75.0       68          50.1 -   75.0
20          75.1 -  100.0       69          75.1 -  100.0
21         100.1 -  125.0       70         100.1 -  125.0
22         125.1 -  150.0       71         125.1 -  150.0
23         150.1 -  175.0       72         150.1 -  175.0
24         175.1 -  200.0       73         175.1 -  200.0
25         200.1 -  300.0       74         200.1 -  300.0
26         300.1 -  400.0       75         300.1 -  400.0
27         400.1 -  500.0       76         400.1 -  500.0
28         500.1 -  600.0       77         500.1 -  600.0
29         600.1 -  700.0       78         600.1 -  700.0
30         700.1 -  800.0       79         700.1 -  800.0
31         800.1 -  900.0       80         800.1 -  900.0
32         900.1 - 1000.0       81         900.1 - 1000.0
33        1000.1 - 1250.0       82        1000.1 - 1250.0
34        1250.1 - 1500.0       83        1250.1 - 1500.0
35        1500.1 - 1750.0       84        1500.1 - 1750.0
36        1750.1 - 2000.0       85        1750.1 - 2000.0
37        2000.1 - 2250.0       86        2000.1 - 2250.0
38        2250.1 - 2500.0       87        2250.1 - 2500.0
39        2500.1 - 3000.0       88        2500.1 - 3000.0
40        3000.1 - 3500.0       89        3000.1 - 3500.0
41        3500.1 - 4000.0       90        3500.1 - 4000.0
42        4000.1 - 4500.0       91        4000.1 - 4500.0
43        4500.1 - 5000.0       92        4500.1 - 5000.0
44        5000.1 - 5500.0       93        5000.1 - 5500.0
45        5500.1 - 6000.0       94        5500.1 - 6000.0
46        6000.1 - 6500.0       95        6000.1 - 6500.0
47        6500.1 - 7000.0       96        6500.1 - 7000.0
48        7000.1 - 7500.0       97        7000.1 - 7500.0
49        7500.1 - 8000.0       98        7500.1 - 8000.0

(a) Distance from right bank (facing downstream).
(b) Distance from left bank (facing downstream).

-------
        TABLE 4.  ALPHABETIC CODES IDENTIFYING SITE (OR PROJECT) AT WHICH
                       SAMPLE WAS COLLECTED (UTILIZED IN
                   COLUMNS 29 AND 30 OF CODING FORM—FIGURE 2)
Site prefix                              Site prefix
code          Site(a)                    code          Site(a)

AA            Raccoon Mountain (P)       BW            Chatuge (R)
AB            Bellefonte (N)             BX            Nottely (R)
AC            Browns Ferry (N)           BY            Parksville (R)
AD            Hartsville (N)             BZ            Blue Ridge (R)
AE            Sequoyah (N)               CA            Hales Bar (R)
AF            Watts Bar (N)              CB            Great Falls (R)
AG            Yellow Creek (N)           CC            Barkley (R)
AH            Watts Bar (FF)             CD            Cheatham (R)
AI            Johnsonville (FF)          CE            Old Hickory (R)
AJ            Widows Creek (FF)          CF            Ocoee No. 1 (H)
AK            Shawnee (FF)               CG            Wilbur (H)
AL            Kingston (FF)              CH            Ocoee No. 2 (H)
AM            Colbert (FF)               CI            Nolichucky (H)
AN            John Sevier (FF)           CJ            Great Falls (H)
AO            Gallatin (FF)              CK            Wilson (H)
AP            Allen (FF)                 CL            Blue Ridge (H)
AQ            Paradise (FF)              CM            Norris (H)
AR            Bull Run (FF)              CN            Wheeler (H)
AS            Cumberland (FF)            CO            Pickwick (H)
AT            Kentucky (R)               CP            Guntersville (H)
AU            Pickwick (R)               CQ            Chickamauga (H)
AV            Wilson (R)                 CR            Hiwassee (H)
AW            Wheeler (R)                CS            Cherokee (H)
AX            Guntersville (R)           CT            Watts Bar (H)
AY            Nickajack (R)              CU            Nottely (H)
AZ            Chickamauga (R)            CV            Chatuge (H)
BA            Watts Bar (R)              CW            Ocoee No. 3 (H)
BB            Fort Loudon (R)            CX            Appalachia (H)
BC            Melton Hill (R)            CY            Douglas (H)
BD            Norris (R)                 CZ            Fort Loudon (H)
BE            Douglas (R)                DA            Kentucky (H)
BF            Nolichucky (R)             DB            Fontana (H)
BG            Cherokee (R)               DC            Watauga (H)
BH            Fort Patrick Henry (R)     DD            South Holston (H)
BI            Boone (R)                  DE            Boone (H)
BJ            Watauga (R)                DF            Fort Patrick Henry (H)
BK            South Holston (R)          DG            Melton Hill (H)
BL            Tims Ford (R)              DH            Nickajack (H)
BM            Tellico (R)                DI            Tims Ford (H)
BN            Chilhowee (R)              DJ            Clinch R. Breeder
BO            Calderwood (R)             DK            Clinch R. Carbo Plant
BP            Cheoah (R)                 DL            Jamestown CH (S)
BQ            Fontana (R)                DM            Jamestown CM (S)
BR            Santetlah (R)              DN            Jamestown CA (S)
BS            Nantahala (R)              DO            Jamestown LY (S)
BT            Thorpe (R)                 DP            Jamestown LB (S)
BU            Appalachia (R)             DQ            Jamestown PM (S)
BV            Hiwassee (R)

(a) Abbreviations:  P = pump storage facility; N = nuclear power plants; FF = fossil
    fuel power plants; R = reservoir; H = hydro plants; and S = strip mine site.

-------
 TABLE 5.   CODES FOR IDENTIFYING TYPE OF SAMPLE (COMMUNITY, PARAMETER,
            TEST) UTILIZED IN COLUMN 46 OF CODING FORM—FIGURE 2

                Code        Sample type

                A           Zoomacrobenthos

                B           Periphyton

                C           Phytoplankton

                D           Zooplankton

                E           Macrophyton

                F           Productivity, light bottle-
                              dark bottle (oxygen)

                G           Productivity, 14C

                H           Bioassay, acute

                I           Bioassay, chronic

-------
 TABLE 6.   CODES UTILIZED TO REPORT DATA UNITS (COLUMNS 49 AND 50
                    ON CODING FORM—FIGURE 2)
                              Area

                       A0      Undefined
                       A1      Sq.  micron
                       A2      Sq.  millimeter
                       A3      Sq.  centimeter
                       A4      Sq.  meter
                       A5      Hectare
                       A6      Sq.  kilometer
                       A7      Sq.  inch
                       A8      Sq.  foot
                       A9      Sq.  yard
                       AA      Acre
                       AB      Sq.  mile
                       Chlorophyll-pigments

        C0      Undefined
        C1      Micrograms active chlorophyll A/sq. centimeter
        C2      Micrograms phaeophytin/sq. centimeter
        C3      Micrograms chlorophyll A/sq. centimeter
        C4      Micrograms chlorophyll B/sq. centimeter
        C5      Micrograms chlorophyll C/sq. centimeter
        C6      Micrograms beta-carotene/sq. centimeter
        C7      Milligrams active chlorophyll A/sq. meter
        C8      Milligrams phaeophytin/sq. meter
        C9      Milligrams chlorophyll A/sq. meter
        CA      Milligrams chlorophyll B/sq. meter
        CB      Milligrams chlorophyll C/sq. meter
        CC      Milligrams beta-carotene/sq. meter
        CD      Milligrams active chlorophyll A/cubic meter
        CE      Milligrams phaeophytin/cubic meter
        CF      Milligrams chlorophyll A/cubic meter
        CG      Milligrams chlorophyll B/cubic meter
        CH      Milligrams chlorophyll C/cubic meter
        CI      Milligrams beta-carotene/cubic meter
        CJ      Milligrams chlorophyll A/liter

-------
              TABLE 6 (continued)
                Percentages

             X0     Undefined
             X1     Percent abundance, numbers
             X2     Percent abundance, biomass
             X3     Percent efficiency
         Productivity-respiration

P0      Undefined
P1      Milligrams ATP/cubic meter/day
P2      Milligrams ATP/cubic meter/hour
P3      Milligrams ATP/sq. meter/day
P4      Milligrams ATP/sq. meter/hour
P5      Milligrams C/cubic meter/day
P6      Milligrams C/cubic meter/hour
P7      Milligrams C/sq. meter/day
P8      Milligrams C/sq. meter/hour
P9      Milligrams CO2/cubic meter/day
PA      Milligrams CO2/cubic meter/hour
PB      Milligrams O2/cubic meter/day
PC      Milligrams O2/cubic meter/hour
PD      Milligrams protein/cubic meter/day
PE      Milligrams protein/cubic meter/hour
PF      Milligrams protein/sq. meter/day
PG      Milligrams protein/sq. meter/hour
PH      Grams C/cubic meter/day
PI      Grams C/cubic meter/hour
PJ      Grams C/sq. meter/day
PK      Grams C/sq. meter/hour
PR      Milligrams O2 uptake/gram fr. wt./day
PS      Milligrams O2 uptake/gram fr. wt./hour
PT      Milligrams O2 uptake/gram dry wt./day
PU      Milligrams O2 uptake/gram dry wt./hour
              Radiation-light

        R0      Undefined
        R1      Foot-candles
        R2      Gram calories/centimeter square
        R3      Langleys
        R4      Lux
        R5      Percent of surface illumination

-------
       TABLE 6 (continued)
            Rate-(l)

10      Undefined
11      Millimeters/second
12      Centimeters/second
13      Meters/second
14      Millimeters/minute
15      Centimeters/minute
16      Meters/minute
17      Millimeters/hour
18      Centimeters/hour
19      Meters/hour
1A      Kilometers/hour
1B      Inches/second
1C      Inches/minute
1D      Inches/hour
1E      Feet/second
1F      Feet/minute
1G      Feet/hour
1H      Miles/hour
           Rate-(v)

20      Undefined
21      Cubic feet/second
22      Cubic meters/second
          Rate-(wt/a)

30      Undefined
31      Grams/sq. meter/day
32      Grams/sq. meter/hour
33      Milligrams/sq.  meter/day
34      Milligrams/sq.  meter/hour
         Rate-(wt/vol)

40       Undefined
41       Grams/cubic meter/day
42       Grams/cubic meter/hour
43       Micrograms/liter/day
44       Micrograms/liter/hour
45       Milligrams/liter/day
46       Milligrams/liter/hour

-------
           TABLE 6 (continued)
           Flags and foul-ups

F0      Undefined
F1      Sample lost during collection
F2      Sample lost during analysis
F3      Unable to access sampling station
F4      Unable to recover sample or substrate
                Length

    L0      Undefined
    L1      Micron
    L2      Millimeter
    L3      Centimeter
    L4      Meter
    L5      Kilometer
    L6      Inch
    L7      Foot
    L8      Yard
    L9      Mile
    LA      International nautical mile
                Number

       N0     Undefined
       N1     Number of colonies
       N2     Number of eggs
       N3     Number of exuviae
       N4     Number of hatches
       N5     Number of individuals
       N6     Number/acre
       N7     Number/cubic meter
       N8     Number/gram dry wt.
       N9     Number/gram ashfree  wt.
       NA     Number/hectare
       NB     Number/liter
       NC     Number/milliliter
       ND     Number/sq.  millimeter
       NE     Number/sq.  centimeter
       NF     Number/sq.  foot
       NG     Number/sq.  meter
       NH     Number/sample
       NI     Number/cubic meter X 10

-------
           TABLE 6 (continued)
              Ratio (wt/a)

50      Undefined
51      Grams/sq. meter
52      Kilograms/acre
53      Kilograms/hectare
54      Micrograms/sq. centimeter
55      Milligrams/sq. centimeter
56      Milligrams/sq. meter
57      Grams/sq. meter,  ashfree
58      Milligrams/sq. centimeter,  ashfree
59      Milligrams/sq. meter, ashfree
5A      Grams/sq. meter,  ash
5B      Milligrams/sq. centimeter,  ash
5C      Milligrams/sq. meter, ash
5D      Kilograms/channel
            Ratio (wt/vol)

       60      Undefined
       61      Grams/cubic meter
       62      Grams/liter
       63      Kilograms/cubic meter
       64      Micrograms/milliliter
       65      Milligrams/cubic meter
       66      Milligrams/liter
       67      Milligrams/milliliter
       68      Micrograms/liter
              Temperature

       D0     Undefined
       D1     Degrees Celsius
       D2     Degrees Fahrenheit
       D3     Degrees Kelvin
       D4     Degree-days
                 Time

           T0      Undefined
           T1      Day
           T2      Hour
           T3      Minute
           T4      Month
           T5      Second
           T6      Week
           T7      Year

-------
         TABLE 6 (continued)
             Turbidity

G0      Undefined
G1      Jackson turbidity units (JTU)
G2      Formazin turbidity units (FTU)
G3      Coleman nephlos units (CTU)
G4      Percentage transmittance (%T)
              Volume

         V0      Undefined
         V1      Cubic micron
         V2      Cubic millimeter
         V3      Milliliter
         V4      Liter
         V5      Cubic meter
         V6      Cubic kilometer
         V7      Cubic inch
         V8      Ounce
         V9      Cubic foot
         VA      Cubic yard
         VB      Cubic mile
               Weight

    W0      Undefined
    W1      Picogram
    W2      Microgram
    W3      Milligram
    W4      Centigram
    W5      Gram
    W6      Kilogram
    W7      Grain
    W8      Ounce
    W9      Pound
    WA      Ton
    WB      Milligram,  ashfree
    WC      Gram, ashfree
    WD      Kilogram, ashfree
    WE      Milligram,  ash
    WF      Gram, ash
    WG      Kilogram, ash

-------
               TABLE 6 (continued)
                  Zooplankton

Z0      Undefined
Z1      No. females with 0 eggs per brood chamber
Z2      No. females with 1 egg per brood chamber
Z3      No. females with 2 eggs per brood chamber
Z4      No. females with 3 eggs per brood chamber
Z5      No. females with 4 eggs per brood chamber
Z6      No. females with 5 eggs per brood chamber
Z7      No. females with 6-7 eggs per brood chamber
Z8      No. females with 8-10 eggs per brood chamber
Z9      No. females with 11-15 eggs per brood chamber
ZA      No. females with 16-20 eggs per brood chamber
ZB      No. females with 21-25 eggs per brood chamber
ZC      No. females with 26-30 eggs per brood chamber
ZD      No. females with >30 eggs per brood chamber

-------
TABLE 7.  CODES FOR IDENTIFYING BASIC TYPE OF HABITAT FROM WHICH
SAMPLE WAS COLLECTED, OR HARDNESS AT WHICH BIOASSAY WAS PERFORMED
      (USED IN COLUMNS 51 AND 52 OF CODING FORM—FIGURE 2)

                                                                     Water hardness
Code   Ecologic zone     Code   Substratum                           Code   (mg/l as CaCO3)

1      Pool              1      Bedrock                              01      10
2      Riffle            2      Boulders                             02      20
3      Profundal         3      Rubble (small rocks)                 03      30
4      Pelagic           4      Gravel                               04      40
5      Littoral          5      Sand                                 05      50
6      Sublittoral       6      Silt                                 06      60
7      Eulittoral        7      Clay                                 07      70
8      Abyssal           8      Marl                                 08      80
9      Channel           9      Organic detritus (unconsolidated)    09      90
A      Overbank          A      Fibrous peat                         10     100
B      Tail water        B      Pulpy peat                           11     110
C      Supratidal        C      Organic muck                         12     120
D      Intertidal                                                    13     130
E      Subtidal                                                      14     140
F      Splash                                                        15     150
G      Intake canal                                                  16     160
H      Discharge canal                                               17     170
                                                                     18     180
                                                                     19     190
                                                                     20     200

-------
 TABLE 8.  CODES FOR RECORDING INSTAR OR SIZE OF ORGANISMS
COLLECTED OR USED IN BIOASSAYS (UTILIZED IN COLUMNS 54 AND
               55 OF CODING FORM—FIGURE 2)

Code   Interval (mm)       Code   Interval (mm)       Code   Interval (mm)

01     0.001 -  0.10       34      24.1 -  25.0       67      575.1 -   600.0
02     0.11  -  0.20       35      25.1 -  27.0       68      600.1 -   625.0
03     0.21  -  0.30       36      27.1 -  29.0       69      625.1 -   650.0
04     0.31  -  0.40       37      29.1 -  31.0       70      650.1 -   675.0
05     0.41  -  0.50       38      31.1 -  33.0       71      675.1 -   700.0
06     0.51  -  0.60       39      33.1 -  35.0       72      700.1 -   725.0
07     0.61  -  0.70       40      35.1 -  37.0       73      725.1 -   750.0
08     0.71  -  0.80       41      37.1 -  39.0       74      750.1 -   775.0
09     0.81  -  0.90       42      39.1 -  41.0       75      775.1 -   800.0
10     0.91  -  1.00       43      41.1 -  43.0       76      800.1 -   825.0
11     1.1   -  2.0        44      43.1 -  45.0       77      825.1 -   850.0
12     2.1   -  3.0        45      45.1 -  50.0       78      850.1 -   875.0
13     3.1   -  4.0        46      50.1 -  75.0       79      875.1 -   900.0
14     4.1   -  5.0        47      75.1 - 100.0       80      900.1 -   925.0
15     5.1   -  6.0        48     100.1 - 125.0       81      925.1 -   950.0
16     6.1   -  7.0        49     125.1 - 150.0       82      950.1 -   975.0
17     7.1   -  8.0        50     150.1 - 175.0       83      975.1 -  1000.0
18     8.1   -  9.0        51     175.1 - 200.0       84     1000.1 -  1100.0
19     9.1   - 10.0        52     200.1 - 225.0       85     1100.1 -  1200.0
20     10.1  - 11.0        53     225.1 - 250.0       86     1200.1 -  1300.0
21     11.1  - 12.0        54     250.1 - 275.0       87     1300.1 -  1400.0
22     12.1  - 13.0        55     275.1 - 300.0       88     1400.1 -  1500.0
23     13.1  - 14.0        56     300.1 - 325.0       89     1500.1 -  1600.0
24     14.1  - 15.0        57     325.1 - 350.0       90     1600.1 -  1700.0
25     15.1  - 16.0        58     350.1 - 375.0       91     1700.1 -  1800.0
26     16.1  - 17.0        59     375.1 - 400.0       92     1800.1 -  1900.0
27     17.1  - 18.0        60     400.1 - 425.0       93     1900.1 -  2000.0
28     18.1  - 19.0        61     425.1 - 450.0       94     2000.1 -  3000.0
29     19.1  - 20.0        62     450.1 - 475.0       95     3000.1 -  4000.0
30     20.1  - 21.0        63     475.1 - 500.0       96     4000.1 -  5000.0
31     21.1  - 22.0        64     500.1 - 525.0       97     5000.1 -  7500.0
32     22.1  - 23.0        65     525.1 - 550.0       98     7500.1 - 10000.0
33     23.1  - 24.0        66     550.1 - 575.0       99     >10000.1
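
A lookup of the Table 8 size code from a measured length might be sketched as follows. This is an illustrative sketch only and is not part of the CIS software: the names `build_upper_bounds`, `UPPER_BOUNDS_MM`, and `size_code` are hypothetical, and the sketch assumes that a length falling between the printed limits of two adjacent intervals (for example, 0.105 mm) is assigned to the higher interval.

```python
from bisect import bisect_left

def build_upper_bounds():
    """Upper limits (mm) of Table 8 codes 01-98; code 99 is everything above."""
    uppers = [i / 10 for i in range(1, 11)]                 # codes 01-10: 0.10 ... 1.00
    uppers += [float(v) for v in range(2, 25)]              # codes 11-33: 2.0 ... 24.0
    uppers += [25.0]                                        # code 34
    uppers += [float(v) for v in range(27, 46, 2)]          # codes 35-44: 27.0 ... 45.0
    uppers += [50.0, 75.0]                                  # codes 45-46
    uppers += [float(v) for v in range(100, 1001, 25)]      # codes 47-83: 100.0 ... 1000.0
    uppers += [float(v) for v in range(1100, 2001, 100)]    # codes 84-93: 1100.0 ... 2000.0
    uppers += [3000.0, 4000.0, 5000.0, 7500.0, 10000.0]     # codes 94-98
    return uppers

UPPER_BOUNDS_MM = build_upper_bounds()

def size_code(size_mm):
    """Return the two-digit Table 8 code for a length in millimeters."""
    if size_mm <= 0:
        raise ValueError("size must be positive")
    # Index of the first upper limit that is >= size; beyond the list means code 99.
    return f"{bisect_left(UPPER_BOUNDS_MM, size_mm) + 1:02d}"

for length in (0.05, 1.5, 26.0, 600.0, 12000.0):
    print(length, size_code(length))
```

A sorted list of upper limits with a binary search keeps the irregular interval widths of the table in one place, so the table can be corrected without touching the lookup logic.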

-------
                                  SECTION 4

                                   METHODS


4.1  DESCRIPTION OF ANALYTICAL TECHNIQUES

     After the CIS had been placed in operation, emphasis was initially placed
on analysis of selected sets of data representing radically different environ-
mental situations with which the staff had had experience.  Three analytical
(exploratory) techniques were tested with these data.  The first was cluster
analysis, which was used (1) to determine the similarity of selected sites
along two rivers on the basis of the biotic assemblages found in samples from
those sites (Q-mode) and (2) to identify biological communities or
associations of species (R-mode).

     The second technique was an ordination procedure called nonmetric multi-
dimensional scaling (MDS).  This technique was used because it allows "one to
examine a scatter diagram displaying a summary of the structure of the data
without having to first assume that clusters are present" (Rohlf 1970).
Ordination was thus an alternative to cluster analysis and served as a check
to help determine if stations or species actually formed distinct groups.

     The third technique considered was an index of species diversity, which
concurrently examines the number of species and the uniformity or evenness of
distribution of individuals among species (Pielou 1969).
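
As a rough illustration of an index of this kind, the sketch below computes a Shannon-type diversity value, H' = -sum(p_i ln p_i), together with Pielou's evenness, J' = H'/ln S. It is assumed here, not stated in the text above, that the index takes this form; the use of natural logarithms, the function names, and the two sample vectors are likewise illustrative assumptions.

```python
import math

def shannon_diversity(counts):
    """Return H' = -sum(p_i * ln p_i) over species with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

def evenness(counts):
    """Return Pielou's evenness J' = H' / ln(S); taken as 0.0 when S < 2."""
    species = sum(1 for n in counts if n > 0)
    if species < 2:
        return 0.0
    return shannon_diversity(counts) / math.log(species)

# Two hypothetical samples with the same number of species (4) and
# individuals (100) but very different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, round(shannon_diversity(sample), 3), round(evenness(sample), 3))
```

Both samples contain four species, so any difference between the two values reflects evenness alone.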

4.1.1  Cluster Analysis

     Cluster analysis is a multivariate analytical technique used to consider
simultaneously all the data contained in a large data matrix.   A frequent
application of this technique is to search for patterns in data, especially
data that do not meet assumptions of rigorous statistical methods.

     When cluster analysis is used to analyze data from limnological surveys,
the data are tabulated by taxa for each station into a data matrix in which
rows are taxa and columns are stations at which samples were collected or, in
the case of multiple samples from the same station,  the samples themselves.
The data can be presented as the number of individuals of each species in each
sample,  presence or absence of species in samples, or ranked abundances  of
species.   A typical matrix for presence-absence data is shown below, in  which
1 stands for presence and 0 stands for absence.

                              Station
          Taxa      1     2     3     4     5

           A        1     1     0     0     1
           B        1     1     0     0     0
           C        1     1     1     1     0
           D        1     1     0     0     0
           E        1     0     1     1     0
           F        0     1     1     1     1

     Once a data matrix has been compiled, a similarity or distance matrix is
computed that expresses the resemblance between each pair of samples or
species in the data matrix.  Pairwise comparison between columns (samples) is
referred to as Q-mode analysis; comparison between rows (species) is called
R-mode analysis.  Any of a variety of similarity, correlation, or distance
coefficients may be used to quantify the resemblance.  One of the simplest is
Jaccard's coefficient, $S_J$ (Jaccard 1908):

                          $$S_J = \frac{a}{a + b + c},$$

where, in Q-mode analysis:

          a = number of taxa found at both stations,
          b = number of taxa found at the first station and not
              the second,
          c = number of taxa found at the second station and not
              the first.

The similarity matrix that results from Q-mode analysis of the hypothetical
data matrix on the previous page is:
               Stations     1      2      3      4

                  2        4/6
                  3        2/6    2/6
                  4        2/6    2/6    3/3
                  5        1/6    2/5    1/4    1/4
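
The Q-mode computation can be checked with a short sketch such as the following. The station contents are read from the hypothetical data matrix; the function and variable names are illustrative assumptions. Note that `Fraction` prints values in lowest terms, so 4/6 appears as 2/3 and 3/3 as 1.

```python
from fractions import Fraction

# Taxa present at each station of the hypothetical presence-absence matrix.
presence = {
    1: {"A", "B", "C", "D", "E"},
    2: {"A", "B", "C", "D", "F"},
    3: {"C", "E", "F"},
    4: {"C", "E", "F"},
    5: {"A", "F"},
}

def jaccard(first, second):
    """Jaccard's coefficient a / (a + b + c) for two sets of taxa."""
    a = len(first & second)   # taxa found at both stations
    b = len(first - second)   # taxa found at the first station only
    c = len(second - first)   # taxa found at the second station only
    # Assumes at least one taxon occurs at one of the two stations.
    return Fraction(a, a + b + c)

similarity = {
    (i, j): jaccard(presence[i], presence[j])
    for i in presence for j in presence if i < j
}
for (i, j), value in sorted(similarity.items()):
    print(f"stations {i} and {j}: {value} = {float(value):.2f}")
```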
     The next step in the procedure is the clustering itself.  One of the most
widely used techniques for clustering is the unweighted pair-group method
using arithmetic averages (UPGMA) (Sokal and Sneath, 1963).  Using this
procedure the computer first seeks mutually closest resemblances in the
similarity or distance matrix.  In our example the closest resemblances,
indicated by the highest similarity coefficients, are (1) station 1 with
station 2 (4/6 = 0.67) and (2) station 3 with station 4 (3/3 = 1.00).  Note
that station 5's highest similarity (2/5) is with station 2 but that this is
not a mutually highest similarity because station 2 is more similar to station
1 than it is to station 5.  After the mutually most similar pairs have been
found, the average similarities of these stations with all others in the
matrix is found, and the matrix is recomputed.  Mutually highest pairs in the
new similarity matrix are sought, and the process is repeated until all
stations have joined a cluster.  Continuing with our example:

Avg. sim. (1-2) with (3-4) = (1 w/ 3 + 1 w/ 4 + 2 w/ 3 + 2 w/ 4)/4

                           = (2/6 + 2/6 + 2/6 + 2/6)/4 = 0.33

Avg. sim. (1-2) with (5)   = (1 w/ 5 + 2 w/ 5)/2

                           = (1/6 + 2/5)/2 = 0.28

Avg. sim. (3-4) with (5)   = (3 w/ 5 + 4 w/ 5)/2

                           = (1/4 + 1/4)/2 = 0.25

The resulting recomputed similarity matrix, in which the mutually highest
similarity is 0.33 (cluster 1-2 with cluster 3-4), is:

               Stations    1-2     3-4

                 3-4       0.33
                  5        0.28    0.25

Continuing:

Avg. sim. (1-2-3-4) with (5) = [(1-2) w/ 5 + (3-4) w/ 5]/2

                             = (0.28 + 0.25)/2 = 0.27
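
The clustering steps worked through above can be reproduced with a minimal sketch such as the one below. It is an illustration under stated assumptions and not the NTSYS implementation: it merges one pair of clusters at a time (the pair with the highest average similarity) rather than all mutually closest pairs in a single pass, and it always averages over the original station-to-station similarities. For this example both approaches give the same joining levels. All names are hypothetical.

```python
from itertools import combinations

# Q-mode Jaccard similarities between stations 1-5 from the worked example.
sim = {
    (1, 2): 4 / 6, (1, 3): 2 / 6, (2, 3): 2 / 6, (1, 4): 2 / 6, (2, 4): 2 / 6,
    (3, 4): 3 / 3, (1, 5): 1 / 6, (2, 5): 2 / 5, (3, 5): 1 / 4, (4, 5): 1 / 4,
}

def pair_sim(i, j):
    """Similarity between two stations, regardless of argument order."""
    return sim[(min(i, j), max(i, j))]

def average_sim(cluster_a, cluster_b):
    """Mean similarity over all pairs with one member in each cluster."""
    values = [pair_sim(i, j) for i in cluster_a for j in cluster_b]
    return sum(values) / len(values)

def upgma(stations):
    """Join the most similar clusters until one remains; return the merge log."""
    clusters = [(s,) for s in stations]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda p: average_sim(*p))
        level = average_sim(a, b)
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, level))
    return merges

merges = upgma([1, 2, 3, 4, 5])
for a, b, level in merges:
    print(a, "joins", b, "at", round(level, 2))
```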

-------
The scale at the top of the figure indicates the level of average similarity
between stations and clusters of stations.   For example, station 1 clusters
with station 2 at a similarity of 0.67; they in turn cluster with stations 3
and 4 at an average similarity of 0.33, and the four stations join station 5
at an average similarity of 0.27.

     The last step in cluster analysis is to compute a coefficient of
cophenetic correlation (Sokal and Rohlf, 1962).  This procedure, described by
Roback et al. (1969),

     "is necessary because the clustering method involves averaging of simi-
     larities in order to express the multidimensional Jaccard coefficient
     matrix as a 2-dimensional,  hierarchical relationship.   Sokal and Rohlf
     (1962) have developed a method of making this comparison in which
     similarity values from the  dendrogram are expressed as a matrix of
     cophenetic values.  A correlation coefficient is computed between this
     matrix and the original matrix of coefficients of association.  This
     correlation coefficient, called the cophenetic correlation, is a measure
     of the amount of distortion introduced by the clustering method.  The
     unweighted pair-group method commonly yields a higher cophenetic correla-
     tion than other clustering  methods."

     The principal difficulty in using cluster analysis, particularly in the
study of lotic and semilotic systems, is that most agglomerative clustering
algorithms construct hyperspheroidal clusters.  In some instances, opposite
sides of a stream may be less similar to each other than either side is to
upstream or downstream areas.  In such situations, longitudinal environmental
gradients may exist rather than  true clusters, and hyperspheres may be poor
descriptors.  In essence, the use of cluster analysis superimposes hyper-
spheroidal clusters onto the natural system, whether or not such clusters
express the true similarities in the natural system (Kaesler, 1970).

     If hyperspheroidal clusters do not exist, the use of cluster analysis
will result in distortion of the similarities among samples.  Fortunately, the
overall amount of distortion introduced by clustering can be measured by the
coefficient of cophenetic correlation, r   (Sokal and Rohlf, 1962; Farris,
1969; Kaesler, 1970).  Use of the coefficient provides a means of quantifying
the uncertainties about applying cluster analysis to a particular set of data.
Values of r   larger than 0.8 usually indicate that serious distortion has not
been introduced.  The larger the matrix of similarity coefficients, the
greater the chance of introducing distortion of similarities that may go
undetected.
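
For the five-station example of Section 4.1.1, the coefficient could be computed along the following lines. This is a sketch, assuming that the cophenetic value of a pair of stations is the similarity level at which the two first join a common cluster and that an ordinary product-moment correlation is taken over the ten station pairs; the joining levels are the rounded values from the worked example, and the names (`merge_levels`, `cophenetic_value`, `r_coph`) are hypothetical.

```python
import math

# Original Jaccard similarities between stations 1-5 (worked example).
original = {
    (1, 2): 4 / 6, (1, 3): 2 / 6, (2, 3): 2 / 6, (1, 4): 2 / 6, (2, 4): 2 / 6,
    (3, 4): 3 / 3, (1, 5): 1 / 6, (2, 5): 2 / 5, (3, 5): 1 / 4, (4, 5): 1 / 4,
}

# Clusters of the dendrogram and the similarity level at which each forms.
merge_levels = [
    ({3, 4}, 1.00),
    ({1, 2}, 0.67),
    ({1, 2, 3, 4}, 0.33),
    ({1, 2, 3, 4, 5}, 0.27),
]

def cophenetic_value(i, j):
    """Level of the smallest cluster that contains both stations."""
    for members, level in sorted(merge_levels, key=lambda m: len(m[0])):
        if i in members and j in members:
            return level
    raise ValueError("stations never join a common cluster")

def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

pairs = sorted(original)
r_coph = pearson([original[p] for p in pairs],
                 [cophenetic_value(*p) for p in pairs])
print(round(r_coph, 3))
```

In this small example the largest discrepancy is for stations 2 and 5, whose original similarity (2/5) is replaced in the dendrogram by the level at which station 5 joins the other four stations.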

4.1.2  Ordination by Nonmetric Multidimensional Scaling

     Ordination is an exploratory analytical technique often used when dealing
with large sets of data.  In aquatic ecology, ordination involves plotting
either the samples or the taxa found in the samples in a two- or
three-dimensional scatter diagram.  The choice of axes on which the points are
plotted depends on the method of ordination selected.  In the most successful
ordinations, either the samples are plotted in a space where the axes are
taxa, or the taxa are plotted in a space defined by the samples.  The primary
advantage of ordination over cluster analysis is that it "allows one to
examine a scatter diagram displaying a summary of the structure of the data
without having to first assume that clusters are present" (Rohlf, 1970).   In
this sense, ordination is a powerful alternative to cluster analysis.

     Several techniques of ordination are available, including principal
component ordination, principal coordinate ordination, the Bray-Curtis polar
ordination technique (Whittaker, 1975), and nonmetric multidimensional scaling
(MDS) (Kruskal, 1964a & 1964b).  Only MDS was dealt with in this project
because it seems to be ideally suited to the kinds of data obtained from
biological surveys.  Specifically, MDS is computationally robust when data are
missing; it can accommodate quantitative, ranked, or presence-absence data;
and it can be used with any measure of correlation, similarity, or distance.

     In contrast, principal component ordination is usually inappropriate with
missing data and operates only on matrices of correlation coefficients.
Principal coordinate ordination is not hampered by missing data and operates
on a distance matrix, but it gives results that are proportional to principal
component ordination when no data are missing.  Thus, principal coordinate
ordination is likely to be redundant when used with principal component
ordination.

     An important advantage of MDS, according to Sneath and Sokal (1973), is
that "it seems better than principal component analysis in giving balance
between the large intercluster distances and the fine differences between
members of a given cluster."  That is, MDS is a reasonable compromise between
cluster analysis, which recognizes fine, intracluster distances, and principal
component ordination, which gives a better indication of large intercluster
distances than of fine intracluster distances.

     The following discussion of MDS is based on the work of Green and
Carmone (1970) and Rohlf (1972).  Principal component analysis, principal
coordinate analysis, and the ordinations developed from them operate in such a
way as to find dimensions or components that explain the greatest amount  of
variance or scatter in the data.  The first principal component explains  the
most variance.  The second principal component explains as much of the
remaining variance as possible, and  it is perpendicular to and uncorrelated
with the first component.  If any variance remains, additional components are
computed and arranged in the same manner described for the first two.   In
practice, the investigator usually settles for three principal components
because three is the greatest number that can be expressed graphically,
although Green (1979) has urged caution because much of the variation in a
data set "may be irrelevant to the purposes of the study."

     Table 9 shows data on proportional faunal composition of twelve samples
from a hypothetical ecosystem containing ten species.  Table 10 shows a
distance matrix computed from the data in Table 9 after they were transformed
by means of an arcsine transformation.  The distance matrix is in the Q-mode
and thus shows similarities among samples based on their faunal composition.
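
The text does not spell out the computation behind Table 10. The following is a minimal sketch, assuming that the arcsine transformation is arcsin(p) in radians and that the distance coefficient is the average distance (root mean squared difference over species). Under those two assumptions the sketch reproduces the Table 10 entries for the station pairs shown; the function names are illustrative only.

```python
import math

# Proportions for four of the twelve stations, copied from Table 9.
TABLE_9 = {
    "A": [0.110, 0.090, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100],
    "B": [0.200, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100, 0.050, 0.050],
    "C": [0.200, 0.200, 0.100, 0.100, 0.100, 0.100, 0.100, 0.100, 0.0, 0.0],
    "D": [0.200, 0.200, 0.100, 0.100, 0.100, 0.100, 0.050, 0.050, 0.050, 0.050],
}

def arcsine_transform(proportions):
    """Assumed form of the transformation: arcsin(p), in radians."""
    return [math.asin(p) for p in proportions]

def average_distance(x, y):
    """Assumed coefficient: root mean squared difference over species."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def q_mode_distance(station_1, station_2):
    """Distance between two stations after the arcsine transformation."""
    x = arcsine_transform(TABLE_9[station_1])
    y = arcsine_transform(TABLE_9[station_2])
    return average_distance(x, y)

for pair in [("A", "B"), ("B", "C"), ("C", "D")]:
    print(pair, round(q_mode_distance(*pair), 3))
```

The three printed values round to 0.037, 0.039, and 0.032, which agree with the corresponding entries of Table 10. The other common form of the transformation, arcsin of the square root of p, does not reproduce those entries.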

     Figure 3 shows an ordination that resulted from one-dimensional nonmetric
MDS, in which the stations are arrayed along a line.  The stress equals 0.230,
which indicates a fit that is less than fair.

     Figures 4 and 5 show the results of nonmetric MDS in two dimensions.  The
figures show ordinations, respectively, after 1 and 20 iterations.  The stress


-------
Figure 3.  One-dimensional Q-mode ordination of hypothetical samples in Table 9 computed
           with nonmetric multidimensional scaling, nine iterations (stress = 0.230).

-------
Figure 4.  Two-dimensional Q-mode ordination of hypothetical samples in
Table 9 computed with nonmetric multidimensional scaling, one iteration
(stress = 0.097).
Figure 5.  Two-dimensional Q-mode ordination of hypothetical samples in
Table 9 computed with nonmetric multidimensional scaling, 20 iterations
(stress = 0.051).

-------
of these ordinations is, respectively, 0.097 and 0.051.  In the write-up for
the NT-SYS programs, Rohlf evaluated stress as follows (modified from Kruskal,
1964a):

                  STRESS            GOODNESS OF FIT

                   0.40                  Poor
                   0.20                  Fair
                   0.10                  Good
                   0.05                  Excellent
                   0.00                  "Perfect"
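
As a small illustration of how the benchmark values above can be read, the sketch below reports which two benchmarks bracket a given stress value. The function name and the bracketing rule are assumptions made for illustration; they are not part of the NT-SYS write-up.

```python
# Benchmark stress values from the table above, best fit first.
BENCHMARKS = [
    (0.00, "Perfect"),
    (0.05, "Excellent"),
    (0.10, "Good"),
    (0.20, "Fair"),
    (0.40, "Poor"),
]

def bracket_stress(stress):
    """Return (better_label, worse_label) for the benchmarks bracketing stress.

    A value equal to a benchmark returns that label twice; a value above
    0.40 returns ("Poor", None).
    """
    if stress < 0:
        raise ValueError("stress cannot be negative")
    better = None
    for value, label in BENCHMARKS:
        if stress == value:
            return (label, label)
        if stress < value:
            return (better, label)
        better = label
    return (better, None)

for stress in (0.230, 0.097, 0.051, 0.001):
    print(stress, bracket_stress(stress))
```

For the one-dimensional ordination (stress = 0.230) the result is the pair Fair and Poor, which matches the description of that fit as less than fair.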

Because our hypothetical data set is small and well structured, the
goodness-of-fit is good to excellent.  As a result, little change in stress or
in the ordination itself is seen as the number of iterations increases.
Slight changes in the configuration of the ordination can be observed,
however, particularly within cluster ABCDEG and as station F gradually moves
away from that cluster.

     The ordination in Figure 6 is nearly perfect.  It is a three-dimensional
ordination in which the vertical axis has been greatly exaggerated.  Note that
samples A, B, C, D, E, and G are grouped together and that sample F is close
on the two horizontal axes.  The wide separation of sample F from samples A,
B, C, D, E, and G on the vertical axis results partly from the exaggerated
vertical scale and partly from its low similarity to the other samples
(Table 10).  The other five samples (H, I, J, K, and L) are arrayed linearly
more than they are grouped into distinct clusters.

     For comparison the results of a Q-mode cluster analysis of the same data
set  (Tables 9 and 10) are shown in Figure 7.  Samples A, B, C, D, E, and G
form a compact cluster, as do samples H and I and samples J, K, and L.  Sample
F is similar to cluster ABCDEG, but it is not closely similar to all members
of that cluster.  (Refer to Table 10 for actual, unaveraged similarities.)
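
The clustering method used for Figure 7 is not stated in this passage. The sketch below assumes the unweighted pair-group method with arithmetic averages (UPGMA) and applies it to the distances of Table 10. The function names are illustrative only.

```python
from itertools import combinations

STATIONS = "ABCDEFGHIJKL"

# Lower triangle of Table 10: row k holds distances from station k
# to stations 0 .. k-1.
LOWER = [
    [],
    [0.037],
    [0.064, 0.039],
    [0.055, 0.039, 0.032],
    [0.082, 0.064, 0.050, 0.039],
    [0.124, 0.108, 0.096, 0.090, 0.082],
    [0.076, 0.040, 0.046, 0.056, 0.068, 0.106],
    [0.154, 0.149, 0.169, 0.153, 0.147, 0.186, 0.152],
    [0.162, 0.147, 0.163, 0.151, 0.141, 0.178, 0.138, 0.049],
    [0.181, 0.150, 0.153, 0.149, 0.138, 0.150, 0.119, 0.154, 0.111],
    [0.244, 0.215, 0.219, 0.217, 0.206, 0.225, 0.184, 0.182, 0.134, 0.080],
    [0.337, 0.308, 0.312, 0.310, 0.297, 0.312, 0.276, 0.265, 0.219, 0.166, 0.098],
]

def distance(i, j):
    """Distance between stations i and j (indices into STATIONS)."""
    return LOWER[max(i, j)][min(i, j)]

def label(cluster):
    """Station letters of a cluster, in alphabetical order."""
    return "".join(sorted(STATIONS[i] for i in cluster))

def upgma(n_items, n_clusters):
    """Merge the pair of clusters with the smallest average distance
    until n_clusters remain.  Returns (clusters, merges)."""
    clusters = [[i] for i in range(n_items)]
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            avg = sum(distance(i, j) for i, j in pairs) / len(pairs)
            if best is None or avg < best[0]:
                best = (avg, a, b)
        avg, a, b = best
        merged = clusters[a] + clusters[b]
        merges.append((label(merged), avg))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters, merges

clusters, merges = upgma(len(STATIONS), 3)
print(sorted(label(c) for c in clusters))
print(merges)
```

Stopping at three clusters groups the stations as ABCDEFG, HI, and JKL. In the merge list, station F joins cluster ABCDEG only after that cluster is complete, which is consistent with the description of sample F above.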

     The use of principal component and principal coordinate analysis in an
ecological context generally consists of representing the samples collected in
a space of reduced dimensionality (Q-mode) where the axes of the space are
composites of species that explain as much of the sample variance as possible.
Alternatively, one may plot the species found in the study in a space where
the  axes are composites of samples (R-mode application).  Interpretation of an
ordination based on either of these methods  (principal component or principal
coordinate analysis) involves two steps:  (1) the explanation of the
configuration of points (the arrangement of samples in a reduced species-space
or of the species in a reduced station-space), and (2) reification of the
axes and relating them to some important environmental
factor, such as degree  of environmental stress, season, or stream gradient.
In general, the development of computational  methods for multivariate analysis
is far ahead of the development of means of  interpreting the results.

     Nonmetric MDS  takes an entirely different approach to ordination.
Similarities or distances between samples  are treated on the ordinal  scale;
that is,  they are ordered from smallest to  largest.  As a  result,  a
configuration is  found  in which "rank  order  of  (ratio scaled)  distances best
produces  the original  input  ranks.   One tries to  do this in the  lowest



-------
Figure 6.  Three-dimensional Q-mode ordination of hypothetical samples in Table 9 computed
           with nonmetric multidimensional scaling, 43 iterations (stress = 0.001).

-------
Figure 7.  Dendrogram computed from Q-mode cluster analysis of a matrix of
distance coefficients (Table 10) showing faunal similarities between
hypothetical samples in Table 9.

-------
dimensionality that produces a 'close enough' ordinal fit" (Green and Carmone,
1970).  The important point is that MDS operates on ranked similarities and
distances rather than on actual similarities.  Thus, MDS is perfectly
applicable to matrices of similarity and distance that are computed from
ranked data or even presence-absence data.

     For a specified number of dimensions, chosen in advance by the
investigator, the computer programs "try to find a configuration of points
whose interpoint distances are monotone--that is, have the same (or possibly
the inverse) ranks as the input data" (Green and Carmone, 1970).  The
coordinates of this new configuration are the values used in the ordination.
In practice,
perfect configurations are unusual.  The measure of departure from
monotonicity is called stress.  The higher the stress, the less nearly perfect
the degree of monotonicity.
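
The report does not reproduce the formula for stress. The sketch below assumes Kruskal's stress formula 1: the configuration distances are fitted by a monotone (nondecreasing) least-squares regression on the rank order of the input dissimilarities, and stress is the square root of the sum of squared residuals divided by the sum of squared configuration distances. Function names are illustrative, and tied input dissimilarities are simply left in their listed order.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Euclidean distances between all pairs of points, keyed by (i, j), i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

def monotone_fit(values):
    """Pool-adjacent-violators: least-squares nondecreasing fit to a sequence."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Pool neighbouring blocks while their means are out of order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def kruskal_stress(dissimilarities, points):
    """Stress of a configuration; dissimilarities is {(i, j): value}, i < j."""
    d = pairwise_distances(points)
    # Order the pairs by input dissimilarity, smallest first.
    pairs = sorted(d, key=lambda pair: dissimilarities[pair])
    config = [d[pair] for pair in pairs]
    fitted = monotone_fit(config)
    numerator = sum((a - b) ** 2 for a, b in zip(config, fitted))
    denominator = sum(a ** 2 for a in config)
    return math.sqrt(numerator / denominator)

# Hypothetical dissimilarities among three samples.
input_ranks = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 3.0}
monotone_config = [(0.0,), (1.0,), (-2.0,)]   # distances 1, 2, 3: same ranks
violating_config = [(0.0,), (1.0,), (3.0,)]   # distances 1, 3, 2: ranks differ

print(kruskal_stress(input_ranks, monotone_config))
print(round(kruskal_stress(input_ranks, violating_config), 3))
```

The first configuration preserves the rank order of the input and has zero stress. The second reverses the order of two distances, so its stress is greater than zero.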

4.1.3  Indices of Species Diversity

     Margalef (1956) proposed the use of indices derived in information theory
for analysis of multispecies communities.  Such use is appropriate where
diversity is equated with the uncertainty that exists as to what species will
be found when a single organism is selected at random from the community.  The
greater the number of species present in a community and the more even their
distribution, the greater the degree of uncertainty and, thus, the larger the
associated species diversity.  Information content is a measure of uncertainty
and is thus a reasonable measure of diversity as well (Pielou, 1977; Kaesler
and Herricks, 1977).  The three most commonly used indices of species diversity
from information theory are Shannon's index (H') (Shannon and Weaver, 1949),
the approximate index (H") (also called d by Wilhm and Dorris, 1968), and
Brillouin's (1962) index (H).
     H'  = -\sum_{i=1}^{s} p_i \log_e p_i

     H'' = -\sum_{i=1}^{s} (N_i/N) \log_e (N_i/N)

     H   = \frac{1}{N} \log_e \frac{N!}{N_1! \, N_2! \cdots N_s!}

where p_i is the probability of selecting at random a member of the ith
species, N_i is the number of individuals of the ith species in a collection, N
is the total number of individuals of all species in a collection, and s is
the number of species.  Logarithms to the base e are recommended, although
some authors have used the base 2 or the base 10 (Wilhm and Dorris, 1968).  It
is necessary to specify which logarithmic base is used because the value of
diversity is quite base-dependent.
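
The approximate index and Brillouin's index can be computed directly from species counts. The sketch below follows the definitions given above and uses the log-gamma function to avoid forming large factorials. The function names and the example counts are hypothetical.

```python
import math

def approximate_diversity(counts):
    """H'': -sum (N_i/N) log_e (N_i/N), skipping species with zero count."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def brillouin_diversity(counts):
    """H: (1/N) log_e [N! / (N_1! N_2! ... N_s!)].

    math.lgamma(k + 1) equals log_e(k!), so no factorial is formed.
    """
    n = sum(counts)
    log_multinomial = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_multinomial / n

def change_base(value_base_e, base):
    """Convert a diversity value from base e to another logarithmic base."""
    return value_base_e / math.log(base)

counts = [5, 5]  # hypothetical collection: two species, five individuals each
h_approx = approximate_diversity(counts)
h_brillouin = brillouin_diversity(counts)
print(h_approx, h_brillouin)
print(change_base(h_approx, 2), change_base(h_approx, 10))
```

For this collection H'' equals log_e 2 and H equals (1/10) log_e 252, so H'' exceeds H. The last line shows how strongly the reported value depends on the logarithmic base.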

-------
     Pielou (1966, 1969, 1975, 1977) and Kaesler and Herricks (1977), in the
context of applied aquatic biology, have stressed that Brillouin's index H is
the appropriate one for use with fully censused collections of organisms,
especially from biological surveys.  H' cannot be used because our only
knowledge of the p_i's must come from samples.  H", which uses data from
samples, is a biased estimator of H, always giving too high a value.  H, on
the other hand, gives the actual species diversity of a fully censused
collection of organisms.

     In the past, Brillouin's H has been difficult to compute because the
factorials involved usually become astronomically large, and even their
logarithms are difficult to handle.  The ready availability of high-speed
digital computers, however, obviates any perceived need to use the approximate
diversity H".

     In addition to the traditional use of Brillouin's equation for species
diversity, diversity was also partitioned into the components contributed at
several levels of three hierarchical classifications:  the taxonomic
hierarchy and two proposed classification schemes, a trophic-functional
hierarchy and a head-body-respiratory functional morphological hierarchy
(Tables 11, 12, and 13).
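
The partitioning formula is not given in this section. One additive form, assumed here purely for illustration, splits the Brillouin diversity of a collection into the diversity among the groups of a higher level plus the weighted mean diversity of species within those groups. For Brillouin's index this identity is exact. The group names and counts below are hypothetical.

```python
import math

def brillouin_diversity(counts):
    """(1/N) log_e [N! / (N_1! ... N_s!)]; zero for an empty collection."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return (math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)) / n

def partition_diversity(groups):
    """Return (among_groups, within_groups, species_level) diversity.

    groups maps a higher-level category to the list of species counts in it.
    """
    group_totals = [sum(counts) for counts in groups.values()]
    n = sum(group_totals)
    among = brillouin_diversity(group_totals)
    within = sum(
        (total / n) * brillouin_diversity(counts)
        for total, counts in zip(group_totals, groups.values())
    )
    species_level = brillouin_diversity(
        [c for counts in groups.values() for c in counts]
    )
    return among, within, species_level

# Hypothetical counts of species grouped by functional group.
sample = {
    "Shredders": [4, 6],
    "Collectors": [10, 3, 2],
    "Predators": [5],
}
among, within, species_level = partition_diversity(sample)
print(among, within, among + within, species_level)
```

The among-group and within-group components sum to the species-level diversity, so each level of a hierarchy can be credited with its own share of the total.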

4.2  DESCRIPTION OF DATA SETS

     To evaluate and delineate the variations  and the usefulness of the three
described analytical techniques, three representative sets of zoomacrobenthic
data were selected for testing.  These are referred to as (1) the Clinch River
data set (acute stress); (2) the Cumberland River data set--1973 (chronic
stress, high flow); and (3) the Cumberland River data set--1975 (chronic
stress, low flow).  Zoomacrobenthic data sets  alone were utilized in part
because Kaesler and Cairns (1972) noted a great deal of redundancy among
information derived from the different groups  of organisms that are commonly
studied as a part of biological surveys.  They concluded that the distribution
of aquatic insects often is representative of  the total biota in lotic
environments.

4.2.1  Clinch River Data Set

     The Clinch River data set includes information on the fauna collected
immediately before and after an accidental release of concentrated sulfuric
acid.  The spill resulted in an acute stress that killed an estimated 5300
fish.  A cursory examination by the Virginia State Water Control Board
indicated that stream damage was confined to a 22-km section of the stream,
starting 1.5 km downstream from the power plant and extending to St. Paul,
Virginia (Soukup, 1970).  This conclusion was later substantiated by
Crossman et al. (1973, 1974).

     Six similar riffle-pool habitats were sampled:  one upstream control
station, four stations in the affected area, and one station downstream from
St. Paul, Virginia (Figure 8).  Station 4, 2.5 km upstream from the site of
the spill, served as a control station.  Stations 7, 8, 9, and 10 were located
downstream from the power plant at 3- to 10-km intervals within the affected
area.  Station 11 was located 11 km downstream from station 10 and was used to
substantiate whether the effects of the spill  were restricted to the section
of the river from the power plant to St. Paul, Virginia.


-------
Figure 8.  Map of the Clinch River in Virginia and Tennessee showing  the
           locations of stations sampled during the 1970 zoomacrobenthic
           survey.

-------
     Stations 4, 7, 8, 9, and 10 were sampled 12 to 48 hours before the spill
as part of a survey initiated the previous year that also included numerous
other upstream and downstream stations (Crossman et al.,  1973).    Stations 7,
8, 9, 10, and 11 were sampled again immediately after notification of the
spill to determine the extent of the damage to the zoomacrobenthic community.
Sampling continued at two-week intervals for the next 56  days at stations 7,
8, 9, and 10.  Stations 4 and 11, the upstream and downstream control
stations, were sampled every four weeks.  Physical and chemical  data from the
power plant and in situ measurements taken in conjunction with biological
sampling are summarized in Table 14.  Daily measurements  of flow from a U.S.
Geological Survey (USGS) gauging station 7 km upstream from the  power plant
are presented in Figure 9.  Descriptions of habitats, including  information on
station location, substratum, width, depth, stream gradient, and riparian
vegetation, are summarized in Table 15.

     Water quality and stream discharge remained essentially unchanged during
the June-August study period (Table 14 and Figure 9).  Habitats  at each
station were also similar, as evidenced by the descriptive data  summarized in
Table 15.  The only major difference was found at station 7.  Effluents from
the power plant channeled along the right bank at this station,  resulting in
an alteration of the natural substratum.

4.2.2  Cumberland River Data Sets--1973 and 1975

     Two sets of data were utilized from the Cumberland River (a
river-reservoir environment).  The study area was the headwater  region of Old
Hickory Reservoir (mean width, 0.9 km),  with depths at the different stations
varying from 0.9 to 10 m.  The major water user within the study area was a
steam electric generating plant located on the north bank of the reservoir
(Figure 10).  Unlike the study area on the Clinch River,  this site was
affected by the discharges from the power plant's once-through cooling system,
which was essentially constant through time.  Data were collected from June to
October 1973, in January 1975, and from April to September 1975.

     Several factors complicated the collection and analysis of  the Cumberland
River data:

     1.   The size and shape of the thermal plume varied seasonally and
          yearly, at times moving upstream over areas previously designated as
          controls.

     2.   The flow in the river varied markedly from year to year.  For
          example, during the 1973 study, the median daily average flow did
          not drop below 5000 cfs for any week, whereas in 1975, there were
          nine weeks in which median weekly flows were less than 5000 cfs.

     3.   Some of the stations were located in shallow overbank areas of the
          reservoir, whereas others were located in the channel (Figure 10).

     Rather than sample all benthic habitats, sampling was limited to the
predominant silty clay substratum.  An a priori assumption was made that the
response exhibited by the macrobenthic community inhabiting the silt-clay
substratum was representative of the total macrobenthic community.

-------
Figure 9.  Stream discharge of the Clinch River at the United States Geological Survey
           gauging station at Cleveland, Virginia, June-August 1970.

-------
   Figure 10.  Location of stations in the vicinity of a power plant on
               the Cumberland River.  Station labels on the map:

               Station      CRM        Depth, ft
                  1        243.8        10-15
                  2        242.5         8
                  3        241.7        10
                  4        241.7        30
                  5        240.9        30
                  6        240.9        10
                  7        244.6        18-24

-------
          TABLE 9.  HYPOTHETICAL DATA SHOWING PROPORTIONS OF 10 SPECIES AT 12 STATIONS

Species                                        Station
number      A      B      C      D      E      F      G      H      I      J      K      L

   1      0.110  0.200  0.200  0.200  0.250  0.250  0.300  0.400  0.500  0.600  0.750  0.910
   2      0.090  0.100  0.200  0.200  0.250  0.250  0.100  0.050  0.050  0.100  0.050  0.010
   3      0.100  0.100  0.100  0.100  0.150  0.250  0.100  0.050  0.050  0.100  0      0.010
   4      0.100  0.100  0.100  0.100  0.050  0.250  0.100  0      0      0.100  0.050  0.010
   5      0.100  0.100  0.100  0.100  0.050  0      0.100  0      0      0.050  0      0.010
   6      0.100  0.100  0.100  0.100  0.050  0      0.100  0      0      0.010  0.050  0.010
   7      0.100  0.100  0.100  0.050  0.050  0      0.100  0      0      0.010  0      0.010
   8      0.100  0.100  0.100  0.050  0.050  0      0.100  0.050  0.050  0.010  0.050  0.010
   9      0.100  0.050  0      0.050  0.050  0      0      0.050  0.050  0.010  0      0.010
  10      0.100  0.050  0      0.050  0.050  0      0      0.400  0.300  0.010  0.050  0.010

For all subsequent computations, these data were transformed with an arcsine transformation.

-------
              TABLE 10.  MATRIX OF DISTANCE COEFFICIENTS COMPUTED FROM THE DATA IN TABLE 9
                                      AFTER ARCSINE TRANSFORMATION

                                               Station
Station     A      B      C      D      E      F      G      H      I      J      K      L

   A        0
   B      0.037    0
   C      0.064  0.039    0
   D      0.055  0.039  0.032    0
   E      0.082  0.064  0.050  0.039    0
   F      0.124  0.108  0.096  0.090  0.082    0
   G      0.076  0.040  0.046  0.056  0.068  0.106    0
   H      0.154  0.149  0.169  0.153  0.147  0.186  0.152    0
   I      0.162  0.147  0.163  0.151  0.141  0.178  0.138  0.049    0
   J      0.181  0.150  0.153  0.149  0.138  0.150  0.119  0.154  0.111    0
   K      0.244  0.215  0.219  0.217  0.206  0.225  0.184  0.182  0.134  0.080    0
   L      0.337  0.308  0.312  0.310  0.297  0.312  0.276  0.265  0.219  0.166  0.098    0

-------
     TABLE 11.  GENERALIZED TROPHIC, FUNCTIONAL CLASSIFICATION OF
      ZOOMACROBENTHIC INVERTEBRATES (ADAPTED FROM CUMMINS, 1973)

Level in
hierarchy   Name                 Subdivision

    I       Functional group     Shredders (vascular plant tissues)
                                 Collectors (detrital materials)
                                 Grazers (Aufwuchs)
                                 Predators
                                 Parasites

   II       Feeding mechanism    Chewers and miners
                                 Filterers (suspension feeders)
                                 Gatherers (sediment or deposit feeders)
                                 Scrapers
                                 Chewers and suckers
                                 Swallowers and chewers
                                 Piercers
                                 Attachers

  III       Dependence           Obligate
                                 Facultative

   IV       Food habit           Herbivory
                                 Detritivory
                                 Carnivory
                                 Omnivory

-------
   TABLE 12.  HIERARCHICAL CLASSIFICATION OF THE TROPHIC-FUNCTIONAL
        ROLE OF ORGANISMS; INCLUDES ONLY THOSE CATEGORIES THAT
                          OCCURRED IN SAMPLES

Functional group   Feeding mechanism         Dependence      Food habit      Code

1-Shredders        1-Chewers & miners        1-Obligate      1-Herbivory     1111
                                                             2-Detritivory   1112
                                             2-Facultative   4-Omnivory      1124

2-Collectors       2-Filterers               1-Obligate      2-Detritivory   2212
                                             2-Facultative   4-Omnivory      2224
                   3-Gatherers               1-Obligate      2-Detritivory   2312
                                             2-Facultative   2-Detritivory   2322
                   6-Swallowers & chewers    1-Obligate      2-Detritivory   2612

3-Grazers          4-Scrapers                1-Obligate      1-Herbivory     3411
                                                             2-Detritivory   3412
                                             2-Facultative   4-Omnivory      3424
                   5-Chewers & suckers       1-Obligate      1-Herbivory     3511
                                             2-Facultative   4-Omnivory      3524

4-Predators        6-Swallowers & chewers    1-Obligate      3-Carnivory     4613
                                             2-Facultative   4-Omnivory      4624
                   7-Piercers                1-Obligate      3-Carnivory     4713
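
Each four-digit code in Table 12 concatenates the category numbers of the four hierarchical levels. A code can therefore be decoded with one lookup per digit, as sketched below. The dictionary and function names are illustrative; the category numbers are those printed in the table.

```python
# Category numbers as printed in Table 12, one dictionary per level.
FUNCTIONAL_GROUP = {1: "Shredders", 2: "Collectors", 3: "Grazers", 4: "Predators"}
FEEDING_MECHANISM = {
    1: "Chewers & miners",
    2: "Filterers",
    3: "Gatherers",
    4: "Scrapers",
    5: "Chewers & suckers",
    6: "Swallowers & chewers",
    7: "Piercers",
}
DEPENDENCE = {1: "Obligate", 2: "Facultative"}
FOOD_HABIT = {1: "Herbivory", 2: "Detritivory", 3: "Carnivory", 4: "Omnivory"}

LEVELS = (FUNCTIONAL_GROUP, FEEDING_MECHANISM, DEPENDENCE, FOOD_HABIT)

def decode(code):
    """Translate a four-digit trophic-functional code into its four categories."""
    digits = str(code)
    if len(digits) != len(LEVELS):
        raise ValueError("expected a four-digit code")
    return tuple(level[int(d)] for level, d in zip(LEVELS, digits))

print(decode("2322"))
print(decode(4713))
```

Because the levels are nested, truncating a code to its first one, two, or three digits identifies the same organism at a coarser level of the hierarchy.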

-------
  TABLE 13.  HIERARCHICAL CLASSIFICATION AND NUMERICAL CODES ASSIGNED
         FOR ZOOMACROBENTHIC INVERTEBRATES BASED ON FUNCTIONAL
    MORPHOLOGY:  HEAD POSITION, BODY SHAPE, AND RESPIRATORY ORGANS

Level in
hierarchy   Name                     Subdivision

    I       Head position            Hypognathous
              (feeding category)     Prognathous
                                     Opisthorhynchous
                                     Vestigial or other

   II       Body shape               Flattened irregular
              (current of stream)    Flattened oval
                                     Flattened elongate
                                     Compressed laterally
                                     Cylindrical
                                     Elongate
                                     Short, compact
                                     Fusiform
                                     Irregular
                                     Hemicylindrical or subtriangular

  III       Respiratory organs       Simple filamentous gills
              (substratum)           Compound filamentous gills
                                     Platelike gills
                                     Operculate gills
                                     Leaflike gills or organs
                                     Respiratory dish
                                     Respiratory tube
                                     Spiracular gills
                                     Caudal chamber
                                     Plastron
                                     Body integument
                                     Tracheal respiration

-------
   TABLE 14.  RANGE AND MEAN OF PHYSICOCHEMICAL DATA FOR THE CLINCH RIVER,
                           JUNE TO SEPTEMBER 1970

Characteristic                              Range                        Mean

pH                                          7.4-8.9                       8.2

Temperature, °C                            14.4-28.3                     21.1

Dissolved oxygen, mg/l                      4.6-9.8                       7.4

Total hardness, mg/l as CaCO3               127-198                       155

Conductivity, µmhos/cm                      150-305                       246

-------
       TABLE 15.  DESCRIPTIONS OF HABITATS AT STATIONS 4, 7, 8, 9, 10, AND
                          11 ON THE CLINCH RIVER(a)

                                                   Station
Characteristic                  4         7          8        9        10       11

River kilometer                934       430        427      423      413      402
Mean depth, m                  0.4       0.4        0.4      0.4      0.4      0.4
Mean width, m                   60        65         25       60       65       60
Stream gradient, m/km          1.1       1.0        1.7      0.3      0.3      0.9
Composition of
substratum (percent)(b)
  Bedrock                     5-40      5-60(c)    10-30    10-20     0-10     5-15
  Rubble                     40-75     20(c)-75    40-70    40-40    70-80    40-70
  Gravel                     10-20     10-15      10-15    10-15     5-20    15-30
  Sand                       10-15     10-15      10-15     5-10      0-5    10-15
Maximum rooted vegetation    <--- Restricted ---|--------- Moderate (10-30%) --------->
Dominant streamside
vegetation                      W         W          C        C        W        C

(a) Adapted from a classification developed by Pennak (1971).

(b) Because each station was sampled along the right bank, left bank, and
    midchannel, the composition is expressed as a range.

(c) The right bank area of station 7 was a solid, calcareous substratum.

    Symbols:  W = woodland; C = combination of woodland on one side of the
    stream and brush with herbs and grasses on the other side.

-------
                                  SECTION 5

                              CLUSTER ANALYSIS
5.1  GENERAL DESCRIPTION

     Only in recent years has cluster analysis been applied in aquatic ecology
and problems related to water pollution.   Cairns and Kaesler (1969, 1971),
Cairns et al. (1970), Roback et al.  (1969), Kaesler and Cairns (1972), and
Kaesler et al. (1971) were the first to use cluster analysis to evaluate the
impact of a power plant on a river environment.  In a similar study Crossman
et al. (1973, 1974) used cluster analysis to describe the response of
zoomacrobenthic communities to pH stress and the recovery of a stream from a
spill of hazardous materials.  Stephenson and Dredge (1976) and Stephenson
et al. (1976) also used numerical classification and other methods of analysis
to estimate the impact of construction on estuarine fisheries and macrobenthic
communities.

     The purpose of cluster analysis is to produce a classification that
expresses the degree of similarity between the items being classified.  The
major application of cluster analysis in stream surveys has been to determine
the similarities between stations or samples on the basis of their contained
biotas.  Such analysis is referred to as Q-mode analysis.  The use of Q-mode
analysis, of course, is not limited to biological applications.  It can also
be used to compare habitats on the basis of their physical and chemical
properties.  One such application was Shannon's (1970) use of cluster analysis
to group 55 lakes in Florida according to water quality and trophic structure.

     R-mode analysis quantifies the similarities between species on the basis
of their distribution among the samples studied.  Little use has been made of
this heuristic technique, although ecologists have long recognized the need to
quantify associations of species (Forbes, 1907; Shelford, 1915).  Buchanan and
Lighthart (1973), however, were able to associate water parcels in a eutrophic
lake with  characteristic assemblages of species.  Stephenson et al. (1972)
also used R-mode comparison to determine whether Petersen communities that can
be characterized by one or two dominant species actually exist in natural
systems.

5.2  ANALYTICAL PROCEDURES

5.2.1  Selection of Similarity Coefficients

     When  considering the use of cluster analysis for summarizing data, the
investigator must first select an appropriate similarity coefficient.  In this
study 26 coefficients were tested to (1) evaluate their usefulness in
analyzing  environmental data and (2) determine which ones were highly
correlated so that redundant expressions could be identified and duplication
eliminated.  Three kinds of coefficients were tested.  Pearson's

                                     54

-------
product-moment correlation coefficient and Sokal's (1961) average taxonomic
distance were used with species count data, while a number of similarity
coefficients were used with presence-absence data.  Some of these coefficients
have been analyzed and compared previously (Simpson, 1960; Sokal and Sneath,
1963; and Cheetham and Hazel, 1969) but never in the context of aquatic
ecology.  Their equations are given in Table 16, where n is the number of
species, s_j is the standard deviation of the jth variable, s_jk is the square
root of the covariance of the jth and kth variables, X_ij is the value of the
ith species in the jth sample, and a, b, c, and d are terms from a 2 by 2
contingency table (Table 17).

     In general, correlation coefficients give a high similarity for samples
with species present in the same proportions, whereas distance coefficients
give a high similarity (low distance) for samples with species present in the
same numbers.  The similarity coefficients vary, depending on whether negative
matches are included.  For a more detailed discussion of these coefficients,
the reader should consult Sokal and Sneath (1963).
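
The contrast among the three kinds of coefficients can be illustrated with a few hypothetical samples. The sketch below assumes the usual conventions: the product-moment correlation, an average distance (root mean squared difference), and a 2 by 2 table in which a counts species present in both samples, b and c count species present in only one, and d counts species absent from both (negative matches). Table 17 may label the cells differently.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two samples of species counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_distance(x, y):
    """Root mean squared difference between two samples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def contingency(x, y):
    """Cells (a, b, c, d) of the 2 by 2 presence-absence table (assumed layout)."""
    a = sum(1 for p, q in zip(x, y) if p > 0 and q > 0)
    b = sum(1 for p, q in zip(x, y) if p > 0 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q > 0)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d

def simple_matching(x, y):
    """Counts negative matches (d) as agreement."""
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)

def jaccard(x, y):
    """Ignores negative matches; undefined if both samples are empty."""
    a, b, c, _ = contingency(x, y)
    return a / (a + b + c)

sample_1 = [10, 20, 30, 0, 0]      # hypothetical counts of five species
sample_2 = [100, 200, 300, 0, 0]   # same proportions, ten times the numbers
sample_3 = [5, 0, 0, 0, 0]         # shares one species with sample_1

print(pearson(sample_1, sample_2), average_distance(sample_1, sample_2))
print(simple_matching(sample_1, sample_3), jaccard(sample_1, sample_3))
```

Samples 1 and 2 are perfectly correlated but far apart in distance, because the proportions agree while the numbers do not. Samples 1 and 3 score 0.6 with the simple matching coefficient and 1/3 with Jaccard's coefficient, and the difference is due entirely to the two negative matches.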

     In addition to the different coefficients, several transformations of
quantitative data were evaluated:  standardization by rows (species),
logarithm of the abundance plus 1, and square root of the abundance plus 0.5.
Standardization is the process of transforming each row in a data matrix by
subtracting the row mean and dividing by the row standard deviation.  When a
matrix is standardized, each species has a mean abundance of 0 and a standard
deviation of 1.  Of course, it is not possible to conceptualize a species with
a mean abundance of 0, because it would necessitate negative abundances to
offset the positive ones.  The justification for standardization is that very
abundant species are given proportionally less weight.

     The logarithmic transformation of species abundance is a drastic
procedure (Table 18).  It has the net effect of reducing species abundances to
ranks and effectively eliminating the impact of very abundant species.

     The square-root transformation is a compromise between the use of
untransformed data and the logarithmic transformation (Table 19).  Throughout
the study, correlation and distance coefficients were computed on
untransformed and transformed data.  In these computations, the coefficients
were labeled as shown in Table 20.
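
The three transformations can be sketched as follows. Two details are assumptions, because the text does not state them: the row standard deviation is taken as the sample standard deviation (n - 1 in the denominator), and the logarithm is taken to the base e. The function names and example counts are hypothetical.

```python
import math
import statistics

def standardize_row(row):
    """Subtract the row mean and divide by the row standard deviation.

    Assumption: sample standard deviation (statistics.stdev).
    """
    mean = statistics.mean(row)
    sd = statistics.stdev(row)
    return [(x - mean) / sd for x in row]

def log_transform(row):
    """Logarithm of the abundance plus 1 (natural logarithm assumed)."""
    return [math.log(x + 1) for x in row]

def sqrt_transform(row):
    """Square root of the abundance plus 0.5."""
    return [math.sqrt(x + 0.5) for x in row]

# Hypothetical abundances of one species at five stations.
abundances = [0, 3, 8, 99, 1200]

print(standardize_row(abundances))
print(log_transform(abundances))
print(sqrt_transform(abundances))
```

In the printed output the logarithmic transformation compresses the very large count far more than the square-root transformation does, which is why the latter serves as a compromise.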

     To compare the various coefficients and transformations, matrices of
correlation, distance, and similarity were computed for 30 zoomacrobenthic
samples.  The data were collected at six stations from June to October 1973 in
the vicinity of the Cumberland River power plant.  All correlation and
similarity matrices were then compared by using the coefficient of cophenetic
correlation, r_cc (Sokal and Rohlf, 1962; Kaesler, 1970; Kaesler and Cairns,
1972; and Kaesler et al., 1974).  The r_cc is a product-moment correlation
coefficient computed between corresponding elements of the matrices.  If the
matrices have identical or exactly proportional values, they are perfectly
correlated and r_cc = 1.0.  If r_cc = -1.0, the values in the matrices show a
perfect negative correlation.  After the r_cc's were computed, they were
arranged in a correlation matrix (Table 21).   This matrix was clustered by
using the unweighted pair-group method with arithmetic averages (UPGMA), and a
dendrogram was constructed (Figure 11).  The dendrogram shows the overall
similarities in the matrices and groups those coefficients that produce


-------
 Figure 11.  Dendrogram computed from cluster analysis of a matrix
             of coefficients of cophenetic correlation showing
             similarity between the various correlation and similarity
             matrices in Tables 17 and 20 (r_cc = 0.914).

-------
similar matrices.  The distance coefficients were not included.  Because
distance is the opposite of similarity, distance matrices are negatively
correlated with the similarity matrices that they otherwise resemble.
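
The comparison of two matrices by a product-moment correlation between corresponding elements can be sketched as follows. The sketch assumes that only the off-diagonal elements of the lower triangle enter the calculation; the matrices and function names are hypothetical.

```python
import math

def lower_triangle(matrix):
    """Off-diagonal elements below the main diagonal, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def matrix_correlation(matrix_1, matrix_2):
    """Correlation between corresponding elements of two symmetric matrices."""
    return pearson(lower_triangle(matrix_1), lower_triangle(matrix_2))

# Hypothetical similarity matrix for three samples.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
proportional = [[2 * v for v in row] for row in similarity]  # same pattern, doubled
distance = [[1 - v for v in row] for row in similarity]      # opposite of similarity

print(matrix_correlation(similarity, proportional))
print(matrix_correlation(similarity, distance))
```

The proportional matrix gives a correlation of 1.0 and the distance matrix gives -1.0, which is the behavior described above for similarity and distance matrices that carry the same information.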

     With 0.75 selected as an arbitrary limit for clustering, five distinct
clusters were formed (Figure 11):  (1) Corr 1 and 7; (2) Corr 2, 4, and 6; (3)
S_SM, H, UN1, and RT; (4) S_J, UN2, D, OCH, K2, UN4, RHI, UN5, RR, and Y; and
(5) UN3 and K1.  One coefficient was selected for further consideration from
each of the first four clusters.  The low intracorrelations in cluster 5 and
the similarity of cluster 5 to clusters 3 and 4 suggested that it could be
ignored.  The representatives selected from the first four clusters were
Corr 7 (square-root transformation, unstandardized data), Corr 6 (square-root
transformation, standardized data), S_SM (simple matching coefficient), and
S_J (Jaccard's coefficient).

     The S_SM and S_J coefficients were chosen from their respective clusters
because of their simplicity and their relatively widespread use in analyzing
environmental data.  Corr 7 was selected over Corr 1 because of the utility of
the square-root transformation.  Although Corr 1 and 7 gave similar results,
it was possible that some species might be present in very large numbers in
other data sets, thus giving misleading results with the product-moment
correlation coefficient.  Similarly, Corr 6 was selected over Corr 2 and 4
because of the square-root transformation.  In this case, however, the
transformed data were standardized by rows before the correlation matrix was
computed.

     A similar rationale was used to select representative distance
coefficients (Table 22).  Dist 1 and 7 were highly intracorrelated, as were
Dist 2, 4, and 6.  Dist 7 and 6 were selected as exemplars because of the
square-root transformation.  The negative correlations between the distance
matrices and the correlation and similarity matrices are also presented in
Table 22.
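
The two data treatments distinguished above (square-root transformation with and without standardization by rows) can be sketched as follows.  This is only an illustration under stated assumptions, not the program used in the study: the counts are invented, "standardized by rows" is assumed to mean centering each taxon (row) on its mean and dividing by its standard deviation, and the distance is assumed to be an average Euclidean distance.  Neither definition is given in this section.

```python
import math

def sqrt_transform(matrix):
    """Apply the square-root transformation to every count."""
    return [[math.sqrt(x) for x in row] for row in matrix]

def standardize_rows(matrix):
    """Assumed form of row standardization: (x - row mean) / row SD."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / (len(row) - 1))
        out.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return out

def column(matrix, j):
    """Return sample j (a column) as a list over taxa."""
    return [row[j] for row in matrix]

def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def avg_distance(a, b):
    """Assumed distance: root of the mean squared difference over taxa."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical counts: rows are taxa, columns are samples A, B, C.
counts = [
    [400, 361, 4],   # one very abundant taxon in samples A and B
    [9, 16, 1],
    [0, 1, 25],
    [4, 4, 36],
]

root = sqrt_transform(counts)        # basis for the "7"-type coefficients
std = standardize_rows(root)         # basis for the "6"-type coefficients

corr7_like = pearson(column(root, 0), column(root, 1))
corr6_like = pearson(column(std, 0), column(std, 1))
dist7_like = avg_distance(column(root, 0), column(root, 1))
print(round(corr7_like, 3), round(corr6_like, 3), round(dist7_like, 3))
```

The square root shrinks the count of 400 to 20, so a single very abundant taxon no longer dominates the correlation, which is the utility of the transformation cited above.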

5.2.2  Reducing Size of Data Matrices

     After the similarity coefficients were selected, data matrices were
prepared for the Clinch and Cumberland Rivers data sets.  In the case of the
Clinch River data set, it was necessary to reduce the size of the data matrix.
If all the 1970 Clinch River data had been included (i.e., stations 1
through 21 with right bank, left bank, and midchannel substations), the matrix
would have had 248 substations and 123 taxa.  Such a matrix would not only
have been computationally unmanageable, but the resulting dendrogram would
also have been difficult to interpret (Figure 12).  Two mechanisms were used
to limit the size of the data matrices: (1) partitioning the data set into
subsets on the basis of the season in which the stations were sampled and
(2) reducing the data set to those samples collected in the immediate
vicinity of the spill, i.e., stations 4, 7, 8, 9, 10, and 11.

     To reduce the data set further to a manageable size, the number of
taxa also had to be reduced.  Relative abundance of species was selected as the
criterion upon which to base this reduction.  The rationale for using relative
abundance was based on Patrick's (1961) discussion on how to  determine whether
a diatom species was an established resident or a temporary inhabitant of a
station.  Patrick "considered those species established which were represented


-------

Figure 12.  Dendrogram computed from Q-mode cluster analysis of a
            matrix of Jaccard's coefficients showing faunal similarities
            between samples collected from the Clinch River in 1970; data
            include total insect fauna.

-------
by six or more specimens when 8,000 or more diatoms are counted."  Since this
criterion appeared applicable to any biological community, relative species
abundance was selected as the most reasonable means of reducing the number of
species.

     The formula used for computing relative abundance (RA) was

                        RA = N_i / \sum_{i=1}^{s} N_i ,

where    N_i = number of individuals of the ith species,

         \sum_{i=1}^{s} N_i = total number of organisms per station.

To determine the minimum acceptable value of RA, RA's were calculated for each
taxon collected at a station unaffected by the spill.  Four levels were
considered: 0.10, 0.05, 0.01, and 0.005.  The 0.10 and 0.05 levels were too
exclusive.  Emphasis was therefore placed on RA values of 0.01 and 0.005.

     A minimum RA of 0.01 reduced the number of taxa from 123 to 29 (Table 23).
Representatives of each major taxonomic group normally found in the stream
were present.  To determine whether major functional groups remained
represented, trophic, functional designations were assigned (refer to
Tables 11 and 12, section 4.1.3) to each taxon.  With the exception of the
shredders, every functional group had at least five representatives at the
0.01 level of discrimination (Table 24).

     In addition to determining whether major functional groups were present,
numerical importance was considered.  Taxa with an RA ≥0.01 accounted for
38 to 64 percent of the total number of taxa found at each station (Tables 25
and 26) and 82 to 99 percent of all the organisms found at each site
(Tables 27 and 28).  While species with an RA <0.01 were important in
determining total diversity, they were not considered a major component of the
stream's macrobenthic community and were, therefore, treated as outliers and
deleted from further cluster analyses.

     At the RA = 0.005 level of discrimination, the number of taxa increased
from 29 to 37 taxa.  Since this was an increase of only 8 taxa, the 0.005
level of discrimination was not considered further.
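
The screening by relative abundance described above can be illustrated with a short sketch.  The taxon names and counts are hypothetical, and the function names are introduced here only for illustration; the cutoffs are the four levels considered in the text.

```python
def relative_abundance(counts):
    """RA for each taxon: its count divided by the station total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def retained_taxa(counts, min_ra):
    """Taxa whose RA is at least min_ra, in alphabetical order."""
    ra = relative_abundance(counts)
    return sorted(t for t, value in ra.items() if value >= min_ra)

def share_of_individuals(counts, taxa):
    """Fraction of all organisms that belong to the retained taxa."""
    return sum(counts[t] for t in taxa) / sum(counts.values())

# Hypothetical counts at one unaffected station (1,000 organisms in all).
station = {
    "Taxon A": 600,
    "Taxon B": 350,
    "Taxon C": 42,
    "Taxon D": 6,
    "Taxon E": 2,
}

for cutoff in (0.10, 0.05, 0.01, 0.005):
    kept = retained_taxa(station, cutoff)
    print(cutoff, kept, share_of_individuals(station, kept))
```

In this invented example the 0.01 cutoff drops two of five taxa yet still retains more than 99 percent of the individuals, the same kind of trade-off reported in Tables 25 through 28.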

     After the taxa to be included had been selected, three data matrices were
considered for the Clinch River data set:

1.   Those stations affected by the pH stress: stations 7RB[1], 8, 9, and 10
     after the spill on June 19, 1970.

2.   Those stations unaffected by the spill: stations 4, 7LB&MS[1], and 11
     throughout the study and stations 8, 9, and 10 before the spill.

3.   A composite data matrix that considered each station sampled.

[1] Station 7 was divided into 7RB (right bank) and 7LB&MS (left bank and
    midstream) because effluents from the power plant flowed along the right
    bank.

The analytical techniques, types of comparison, and similarity coefficients
used to analyze each data set and the number of dendrograms and scatter
diagrams computed are listed in Table 29.

5.2.3  Evaluation of Distortion

     The amount of distortion introduced by the clustering procedure was
measured by the coefficient of cophenetic correlation (rcc).  If values were
>0.8, it was assumed that no serious distortion was introduced by the
clustering procedure.
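
As a rough illustration of how such a coefficient can be obtained, the sketch below clusters a small distance matrix, records the level at which each pair of samples first joins, and correlates those levels with the original distances.  The clustering algorithm is not stated in this section, so unweighted pair-group averaging (UPGMA) is assumed here; the matrix is invented and the function names are hypothetical.

```python
import math
from itertools import combinations

def upgma_cophenetic(dist):
    """Cluster by UPGMA and return the matrix of joining levels."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}      # cluster id -> member samples
    d = {frozenset((i, j)): dist[i][j] for i, j in combinations(range(n), 2)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        level = d[frozenset((a, b))]
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = level
        # Average-linkage update, weighted by cluster size.
        na, nb = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c not in (a, b):
                d[frozenset((next_id, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return coph

def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def cophenetic_correlation(dist):
    """Correlate original distances with dendrogram joining levels."""
    coph = upgma_cophenetic(dist)
    pairs = list(combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in pairs],
                   [coph[i][j] for i, j in pairs])

# Hypothetical distances among four samples.
dist = [
    [0, 1, 4, 5],
    [1, 0, 4, 5],
    [4, 4, 0, 2],
    [5, 5, 2, 0],
]
coph = upgma_cophenetic(dist)
rcc = cophenetic_correlation(dist)
print(round(rcc, 3), "acceptable" if rcc >= 0.8 else "distorted")
```

The dendrogram replaces the four between-group distances (4, 5, 4, 5) with a single joining level, and that loss of detail is what lowers the correlation below 1.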

5.3  RESULTS

5.3.1  Q-Mode Analysis

5.3.1.1  Clinch River Data Set--Of the six Q-mode dendrograms developed from
the Clinch River data set, four had acceptable values of rcc (≥0.8) (Table 30).

     5.3.1.1.1  Presence-absence data--The dendrogram from cluster analysis of
the matrix of Jaccard's coefficients had the least distortion of any of the
Q-mode dendrograms (rcc = 0.93).  Two clusters were formed at a similarity
level of 0.62 (Figure 13).  The first cluster contained 22 samples, the second
had 6 samples, and 8 samples were left unclustered.  The first cluster
was dominated by samples from stations that were either unaffected by the
spill or had recovered from the pH stress (Table 31).  The second cluster
consisted of samples from stations 8, 9, and 10, the stations impacted by the
spill.

     A sequence of stream degradation and recovery is indicated by the
clusters in Table 31.  Station 8, the downstream station nearest the power
plant and the most severely impacted site, had only three samples in the first
cluster, samples collected before the pH stress and in August, six and eight
weeks after the spill.  Station 9, the next station downstream, had four
samples in the first cluster.  They were the same as those for station 8, but
also included the late July sample collected four weeks after the spill.  At
station 10, the farthest downstream station affected by the spill, only the
sample collected immediately after the pH stress was missing from the cluster.
This indicated that the farther downstream a station was located, the less
severe the impact and the faster its recovery.

     The six samples in the second cluster are from stations 8, 9, and 10
after the spill.  Their clustering indicates that their zoomacrobenthic
communities were similar immediately after the spill.  The severity of the
impact and the time required for recovery depended on the station's proximity
to the site of the spill.  This cluster also indicates that the Clinch River's
zoomacrobenthic community had a high natural resiliency or ability to recover
from an acute stress.

     The eight unclustered samples were those collected from station 7RB
and, in August, from station 7LB&MS.  As noted in the description of the
sites, effluents from the power plant were channeled along the right bank,
resulting in a chronic stress.  The low similarities of samples from station
7RB to other samples were therefore not unexpected.  The failure of samples
from station 7LB&MS

-------
Figure 13. Dendrogram computed from Q-mode cluster analysis of a matrix of
           Jaccard's coefficients showing faunal similarities between 36
           zoomacrobenthic samples collected from stations 4, 7, 8, 9, 10,
           and 11 in the Clinch River, 1970 (rcc = 0.93).

-------
to cluster with the unaffected stations was not expected, however, and the
only possible explanation was an increase in the size of the discharge plume
during the period of extremely low stream flow in August.

     The dendrogram resulting from use of the simple-matching coefficient
(S_SM) had an unacceptable level of distortion (rcc <0.8) and is not
discussed here.

     5.3.1.1.2  Quantitative data, counts of species--The Corr 6 and Dist 6
dendrograms had acceptable rcc values of 0.82 and 0.91, respectively.  The
resulting clusters, however, had very low levels of similarity and were
uninterpretable.  For example, at the 0.06 and 1.2 levels of similarity for
the Corr 6 and Dist 6 dendrograms, respectively, samples from the upstream
control station clustered with the stations impacted by the spill (Tables 32
and 33).

     The Dist 7 dendrogram had an rcc of 0.80.  Since it was computed from a
data matrix transformed by the square-root transformation, the level of
similarity was expected to be low.  At a level of similarity of 4.8, three
clusters were formed (Table 34).  The first cluster was dominated by 11
samples from the control stations and from stations 7 through 10 before the
spill.  The second cluster consisted of 20 samples, from stations 7 through
10, with station 7LB&MS clustering with the most severely impacted stations.
These results were contrary to previous findings by Crossman (1973, 1974) and
suggest that dendrograms computed from correlation and distance coefficients
should be interpreted with caution.

5.3.1.2  Cumberland River Data Set—1973--

     5.3.1.2.1  Substrate—Stations on the Cumberland River were established
in areas with sediments of similar texture.  The intent was to sample only
substrates dominated by very fine sand to clay.  Cluster analysis of the
particle size data using various coefficients showed similarity of the
substratum at the various stations.

     Cluster analysis of a matrix of correlation coefficients (Figure 14),
however, shows changes in sediment composition with time, especially in
overbank areas (level of similarity 0.7).  A higher level of similarity, 0.82,
shows the trend even more strongly with almost all samples collected in
September remaining unclustered, whereas samples collected in August tended to
join clusters.  At both levels of similarity, it was evident that the
substratum was more consistent (1) in the channel than in overbank areas and
(2) in downstream areas than in upstream areas around the intake and discharge
canals.

     Cluster analysis of a matrix of distance coefficients showed the same
patterns (Figure 15):  (1) August samples clustered; (2) September samples generally
remained unclustered, especially in overbank areas; and  (3) downstream and
channel stations generally had more consistent substratum composition than
upstream and overbank stations.

     5.3.1.2.2  Zoomacrobenthos--Six different similarity coefficients (two
presence-absence and four quantitative) were used to examine the total
macrobenthic data set for 1973.

-------
Figure 14.  Dendrogram computed from Q-mode cluster analysis of a matrix of
            correlation coefficients computed from proportions of each phi
            size in the substrate after arcsine transformation; shows
            similarity of substrate between samples collected from the
            Cumberland River in 1973 (rcc = 0.760).

-------
 Figure 15.  Dendrogram computed from Q-mode cluster analysis of a matrix of distance
            coefficients computed from proportions of each phi size in the substrate
            after arcsine transformation; shows similarity of substrate between
            samples collected from the Cumberland River in 1973.

-------
     5.3.1.2.2.1  Presence-absence data--Jaccard's coefficient (S_J) produced a
dendrogram with an acceptably low level of distortion (rcc = 0.85).  At
similarity levels of 0.5 and 0.6, samples from stations 2 and 3 tended to form
clusters (Figure 16).  These stations were located at the discharge and on the
overbank area below the discharge.  Moreover, samples collected in September
and October tended to form clusters except for samples from stations 2 and 3.

     The simple-matching coefficient (S_SM) produced higher similarities than
Jaccard's coefficient because a number of samples contained rare species
(i.e., species that did not occur in more than one or two other samples).  The
large number of negative matches (joint absences) increased overall
similarity.  A more thorough discussion of the results obtained with the
simple-matching coefficient is not warranted because its dendrogram had an
unacceptable level of distortion (rcc <0.8).
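
The effect of negative matches can be seen in a small worked example.  The sketch below uses the standard definitions of the two coefficients (a = taxa present in both samples, b and c = taxa present in only one, d = taxa absent from both); the presence-absence vectors are invented for illustration.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two presence-absence vectors."""
    a = sum(1 for p, q in zip(x, y) if p and q)          # present in both
    b = sum(1 for p, q in zip(x, y) if p and not q)      # only in x
    c = sum(1 for p, q in zip(x, y) if not p and q)      # only in y
    d = sum(1 for p, q in zip(x, y) if not p and not q)  # absent from both
    return a, b, c, d

def jaccard(x, y):
    """Jaccard's coefficient: negative matches are ignored.
    Assumes at least one taxon is present in x or y."""
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c)

def simple_matching(x, y):
    """Simple matching coefficient: negative matches count as agreement."""
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)

# Two hypothetical samples scored for ten taxa, six of which are rare
# taxa found in neither sample.
sample_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
sample_2 = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(jaccard(sample_1, sample_2))          # 2 / 4
print(simple_matching(sample_1, sample_2))  # (2 + 6) / 10
```

The two samples share only half of the taxa that occur in either of them, yet the six joint absences raise the simple matching value well above Jaccard's.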

     5.3.1.2.2.2  Quantitative data, counts of species—The distance
coefficient Dist 7 formed clusters of samples at a high level of similarity
and with an acceptably low level of distortion (Figure 17).  Two groups of
samples appear as distinct clusters at a level of  similarity of 2.3:
(1) samples collected from station 2 during the summer and from station 1
during the late summer and fall (the temperature at station 1 in late summer
and fall was similar to that at station 2 in early summer as a result of
warmer weather and possible upstream migration of the thermal plume) and
(2) samples collected at all downstream stations in the month of September.
At a higher level of similarity, samples from stations 1 and 2 remained
clustered, but two new clusters appeared:  (1) samples from stations located
in overbank areas only and (2) samples predominantly from the channel stations
and entirely from the stations downstream from the plant.

     With the correlation coefficient Corr 7, clusters were formed at
extremely high levels of similarity (Figure 18).  At a level of 0.9, for
example, three clusters were formed.  The first cluster comprised only samples
collected from stations 1 and 2 in June, a similarity that cannot be explained
by temperature alone because ΔT between these two stations in June was 5.6°C.
The biotic similarity may have resulted from similarity of substrate.  The
second cluster contained 57 percent of the samples.  The only noticeable trend
was the complete absence of samples from September, during which time the
lowest flows and highest temperatures were recorded.  Cluster 3 contained only
two samples (3-SEP and 3-OCT), indicating a distinct biota during the autumn
at station 3.

     5.3.1.2.3  Summary--A thorough discussion of  results obtained with Corr 6
and Dist 6 is not warranted because use of these two coefficients produced
unacceptable levels of distortion (rcc <0.8).  In addition, use of these
coefficients produced dendrograms with less structure and lower similarities
than those of their counterparts (Corr 7 and Dist  7), indicating that the
standardization of data may increase distortion and decrease the structure and
similarity level of dendrograms.

     A summary of the cluster analyses for the 1973 data set is presented in
Table 35.  Three of the six coefficients had acceptably low levels of
distortion (S_J, Dist 7, and Corr 7).  They also produced dendrograms that were
readily interpretable, and all three indicated that both station location and
the month in which the samples were collected were causes for association.


-------
Figure 16.  Dendrogram computed from Q-mode cluster analysis of a matrix of Jaccard's
            coefficients showing faunal similarities between samples collected from
            the Cumberland River in 1973 (rcc = 0.852).

-------
[Figure 17:  Dist 7 dendrogram; distance axis from 6.0 to 0.0.]
-------
Figure 18.  Dendrogram computed from Q-mode cluster analysis of a matrix of
            correlation coefficients (Corr 7) computed from data transformed with the
            square-root transformation; shows faunal similarities between samples
            collected from the Cumberland River in 1973 (rcc = 0.979).

-------
     Two of the three  coefficients whose use produced unacceptable  levels of
distortion (Corr 6, Dist 6) also produced dendrograms that had low
similarities and were  not  readily interpretable.  This  result weighs against
the use of standardization in the analysis of ecological data.

     The high degree of structure and  similarity in the dendrogram  resulting
from use of the simple-matching coefficient was due largely to its  inclusion
of negative matches, a possible source of bias.

     In this case, however, despite the possible bias and the introduction of
more distortion than Jaccard's coefficient, the stations associated at the
highest levels of similarity were nearly the same with  these two
presence-absence coefficients (Table 35).  The highly similar stations using
these presence-absence similarity coefficients were in many cases, however,
different from those that  were associated at the highest levels by the various
quantitative coefficients.

5.3.1.3  Cumberland River  Data Set—1975--

     5.3.1.3.1  Substratum—Sediment data collected in  1975 from seven
stations on the Cumberland River were  clustered through use of the Pearson
product-moment correlation coefficient.  Moderate distortion was introduced
through the clustering procedure (rcc = 0.76).  At a similarity level of 0.40,
two large clusters were formed.  Cluster 1 contains most of the samples and is
dominated by samples from  stations 2, 3, and 6, indicating that sediments at
the downstream overbank stations were similar.  Cluster 2 is dominated by
channel stations, suggesting that composition of substratum differed between
the overbank and channel sites.  At higher levels of similarity, change in
sediment composition with  time is suggested.

5.3.1.3.2  Zoomacrobenthos—

     5.3.1.3.2.1  Presence-absence data--Jaccard's coefficient (S_J) produced a
dendrogram that clustered samples at low levels of similarity but with little
distortion (rcc = 0.88).  At a similarity level of 0.68, seven clusters were
formed.  The first three clusters were dominated by samples collected in May,
the fourth cluster was dominated by samples collected in June and July, and
the last three clusters were dominated by samples collected in August and
September.   These groupings suggest that faunal assemblages were influenced
more by seasonal factors than by their proximity to the thermal discharge.

     This type of distribution is expected when environmental conditions are
the same at all stations.   Examination of temperature data revealed that
temperatures differed very little between the control stations upstream and
the stations downstream from the power plant.   Temperatures in May were
13 ± 1.5°C at all stations, increasing to 26.5 ± 2°C at all stations in
August.

     The simple matching coefficient (S_SM) resulted in overall higher
similarities, but a great deal more distortion was introduced (rcc = 0.70).
Because 0.80 had been established as the minimum rcc value for an acceptable
dendrogram, interpretation of this dendrogram is limited to the general
observation that seasonal factors tended to influence similarities between
stations more than location.


-------
     5.3.1.3.2.2  Quantitative data, species counts--Cluster analysis with
the distance coefficient Dist 6 resulted in a dendrogram with little
distortion (rcc = 0.95).  At a level of similarity of 0.84, one large cluster
and three small clusters were formed, with 11 samples remaining unclustered
(Figure 19).   The large cluster was dominated by samples collected in May and
June and samples from station 3, whereas the unclustered samples were
collected predominantly at station 2  or in the month of September.

     Only 47 taxa were collected either at station 3 or in May when the
diversity was low, whereas 78 taxa were collected at station 2 or in
September.  Similarly, the mean number of organisms collected per square meter
in May was 1,325, while the mean number in September was 2,337.  These facts
probably account for the separation of these sample groups.

     Cluster analysis with the distance coefficient Dist 7 produced a
dendrogram with a stair-stepped appearance and little distortion of
similarities (rcc = 0.84) (Figure 20).  At a similarity level of 1.55, two
large clusters and five small clusters were formed.  Both large clusters
contained samples from (1) at least four of the five months sampled, (2) both
overbank and channel stations, and (3) stations both upstream and downstream
from the plant.  This arrangement seems to indicate that all the samples
collected were similar.  At higher levels of similarity, however, we were able
to detect slight differences with time, a trend that was also noted in cluster
analyses with other coefficients.

     Cluster analysis of coefficients Corr 6 and Corr 7 resulted in
dendrograms that contained an unacceptable amount of distortion and a
structure that made interpretations difficult.  Neither temporal nor spatial
factors were shown to predominate in the formation of biotic associations.

     5.3.1.3.4  Summary--Only three of the six coefficients tested, Jaccard's
coefficient and the distance coefficients Dist 6 and Dist 7, produced
dendrograms with acceptably low levels of distortion (rcc >0.80) (Table 36).
Samples with highest similarities are nearly the same in all cluster analyses.
That is, some samples are closely similar regardless of which coefficient is
used.  The principal difference among samples with a high similarity was that
the two presence-absence coefficients (S_J and S_SM) indicated a high similarity
between a channel and an overbank sample collected during May, which the
quantitative coefficients associated at a much lower level.   Similarly, the
quantitative coefficients indicated high similarity for one pair of overbank
samples in May that the presence-absence coefficients clustered at a much
lower level.  All three dendrograms with acceptable levels of distortion
indicated that the season during which the samples were collected was the
primary cause of association.  Only the Jaccard coefficient (presence-absence
data only), however,  failed to indicate that station location was a secondary
factor.

     For both the 1973 and 1975 Cumberland River data sets, only three of the
six coefficients tested produced dendrograms with acceptably low levels of
distortion.  The Jaccard coefficient and the Dist 7 (quantitative, distance,
square root transformation with no standardization) were two of the three that
were acceptable for both data sets.  For the 1973 data set, the Corr 7
coefficient (quantitative, correlation, square root transformation with no
standardization) produced acceptably low distortion, while the Dist 6


-------
[Dendrogram graphic not reproduced.  Panel title: DIST 6.  Distance axis:
2.240 to -0.210 in steps of 0.350.  Leaf labels are station-month sample
codes (for example 1-MAY, 6-SEP).]
Figure 19.  Dendrogram  computed from Q-mode  cluster analysis of a matrix of distance
            coefficients computed  from  data  that  had been transformed by the square-
            root transformation and standardized  by rows; shows faunal similarities
            between samples collected from the  Cumberland River in 1975.


-------
[Dendrogram graphic not reproduced.  Panel title: DIST 7.  Distance axis:
3.300 to -0.200 in steps of 0.500.  Samples (station-month) in plotted
order, top to bottom:
1-MAY, 2-MAY, 3-MAY, 6-MAY, 3-AUG, 1-JUN, 7-AUG, 7-MAY, 5-JUN, 5-JUL, 1-AUG,
4-AUG, 5-SEP, 4-MAY, 5-MAY, 1-JUL, 3-JUN, 3-JUL, 2-AUG, 3-SEP, 2-JUN, 5-AUG,
6-AUG, 7-SEP, 7-JUN, 7-JUL, 4-JUN, 4-JUL, 1-SEP, 6-JUN, 6-JUL, 2-JUL, 2-SEP,
4-SEP, 6-SEP]
Figure  20.   Dendrogram  computed from Q-mode  cluster analysis  of a matrix of distance
             coefficients computed from data that had been transformed by the square-
             root transformation and standardized by rows; shows faunal similarities
             between samples  collected from the Cumberland River in 1975.


-------
coefficient (quantitative, distance, square root transformation with
standardization by rows) produced the third acceptably low level of
distortion for the 1975 data set.
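
     The Dist 6 and Dist 7 coefficients differ only in how the counts are
prepared before distances are taken.  The Python sketch below illustrates that
difference under stated assumptions: "standardization by rows" is taken to
mean rescaling each taxon (row) to zero mean and unit standard deviation, and
the distance is taken to be a root-mean-square difference between two sample
profiles.  Neither form is confirmed by the text here, and the counts and the
function names are hypothetical.

```python
import math

def sqrt_transform(matrix):
    """Square-root transform every count in a taxa-by-samples matrix."""
    return [[math.sqrt(x) for x in row] for row in matrix]

def standardize_rows(matrix):
    """Rescale each row to zero mean and unit (population) standard deviation.

    Assumption: this is what 'standardization by rows' means.  A constant
    row is mapped to zeros to avoid dividing by zero.
    """
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row))
        out.append([(x - mean) / sd if sd > 0 else 0.0 for x in row])
    return out

def column(matrix, j):
    """Return sample j as a profile across all taxa."""
    return [row[j] for row in matrix]

def average_distance(a, b):
    """Root-mean-square difference between two profiles (assumed form)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def q_mode_distances(counts, standardize):
    """Distances between samples (columns) after the chosen preprocessing."""
    data = sqrt_transform(counts)
    if standardize:
        data = standardize_rows(data)
    n = len(data[0])
    return [[average_distance(column(data, i), column(data, j))
             for j in range(n)] for i in range(n)]

# Hypothetical counts: 3 taxa (rows) by 3 samples (columns).
counts = [[0, 4, 16],
          [9, 9, 9],
          [1, 0, 1]]
dist7_like = q_mode_distances(counts, standardize=False)  # no standardization
dist6_like = q_mode_distances(counts, standardize=True)   # standardized by rows
```

Without standardization an abundant taxon dominates the distances; with it,
every taxon carries equal weight, which is one reason the two coefficients can
cluster the same samples differently.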

     For the 1973 data set, the principal cause of association (faunal
similarity) based on an interpretation of biological and physicochemical data
(including cluster analyses interpretation based on the formation of distinct
clusters) was the location of the sampling station.  In some cases this
analysis differed from an analysis based solely on a listing of the highest
(i.e., highest similarity level) biological associations in the dendrogram.
For the 1975 data set, the principal cause of association based on an
interpretation of biological and physicochemical data was in every case
different from an interpretation based solely on a listing of the clusters
formed at the highest levels of similarity.  While these lists identify the
most similar pair groups, they do not provide insight into community
structure, and for this reason we do not recommend the use of mere
similarity-level lists as a basis for interpretation of biological data.

5.3.2  R-Mode Analysis

5.3.2.1  Introduction--
     The purpose of R-mode cluster analysis is to identify recurring groups of
species that form biological communities (i.e., associations).  If the
biological communities are discrete, the cluster analysis will form distinct
clusters.  If the communities intergrade, the results of cluster analysis will
show this.

     Whether discrete communities exist has been the subject of numerous
ecological studies, and investigators are almost equally divided as to whether
their analyses support or contradict the community concept.  In a series of
studies of rivers of the eastern and southeastern United States,  Patrick
(1961, 1967) noted that "the number of species of the major groups of
organisms--that is, the algae, protozoa, other lower invertebrates, insects,
and fish--remain similar from season to season in the same stream.  Likewise
in similar types of streams they are similar."  In other words, the total
number of different kinds of organisms is remarkably constant from one system
to another.  Patrick (1961, 1967) also observed that unstressed,  healthy
systems usually consist of many species with only a few individuals and a few
species with many individuals.  Plotting ranked abundance of species against
number of species characteristically gives a truncated log-normal curve
(Figure 21).  Although such generalities do not prove the existence or
nonexistence of communities or even associations of species, they indicate
similarity among the biotic components of unpolluted streams.
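
     The abundance classes behind a curve of this kind double in width from
one class to the next (1-2, 2-4, 4-8, and so on).  The Python sketch below
shows only the binning step, not the curve fit.  It assumes that a taxon with
n individuals belongs to the class k for which 2**k <= n < 2**(k+1); the
treatment of counts that fall exactly on a class boundary is not stated in the
text, and the abundances are hypothetical.

```python
from collections import Counter

def octave_index(n):
    """Return k such that 2**k <= n < 2**(k+1) for a positive integer n."""
    if n < 1:
        raise ValueError("abundance must be a positive integer")
    return n.bit_length() - 1

def species_per_octave(abundances):
    """Count how many species fall in each doubling class of abundance."""
    counts = Counter(octave_index(n) for n in abundances)
    top = max(counts)
    return [counts.get(k, 0) for k in range(top + 1)]

# Hypothetical abundances (individuals per taxon) for one sample.
abundances = [1, 1, 2, 3, 3, 5, 6, 7, 12, 20, 40, 150]
histogram = species_per_octave(abundances)
# histogram == [2, 3, 3, 1, 1, 1, 0, 1]
```

In an unstressed system the counts per class would be expected to rise to a
mode and then tail off, with the rarest classes cut off because very rare
species are seldom collected.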

     A familiar work suggesting the existence of specific
species-environmental associations is the study of indicator organisms
developed in Europe by Kolkwitz and Marsson (1908, 1909).  In their saprobian
system, Kolkwitz and Marsson associated certain species with different zones
or regions of water quality.  As pointed out by Cairns et al.  (1972), this
system "was a logical extrapolation of the niche concept (Hutchinson, 1957;
Parker and Turner,  1961).   That is, each organism has a particular set of
environmental prerequisites essential to its survival."  An illustration of
the type of information resulting from study of indicator organisms is shown in
Figure 22.


-------
[Histogram not reproduced.  Vertical axis: number of species (scale 10 to 35).
Horizontal axis: individuals per species, in doubling classes 1-2, 2-4, 4-8,
8-16, 16-32, 32-64, 64-128, 128-256, 256-512, 512-1024, 1024-2048, 2048-4096,
4096-8192, 8192-16384.]
      Figure 21.  A truncated log-normal distribution fitted to a distribution of species in an aquatic
                 ecosystem not adversely affected by environmental stress.

-------
[Graphic not reproduced.  Surviving labels: ACTIVE DECOMPOSITION; scale
values 36 and 24.]
Figure 22.   Responses of organisms to severe organic enrichment:  changes in types of
            organisms present, population densities, and biological diversity.

-------
     In general, studies indicate that structural pattern exists within
freshwater communities.  Problems confronting the ecologist are (1) identifying and
characterizing species associations and (2) identifying and understanding
environmental variables that control the presence of these associations.
Stephenson (1972) discussed the first problem in a paper describing the use of
computers to classify marine benthic communities.  He identified three basic
characteristics or attributes that a species must have to belong to an
association.  The first characteristic was dominance, which is usually
expressed as the number of organisms or biomass per unit area.  The other
characteristics were constancy and fidelity.  According to Stephenson (1972),
"A species is highly constant if it appears in all the samples or quadrants
within an association, but it need not be restricted to a single association.
Conversely, a species is highly faithful if it occurs in a single association,
but it need not occur within all the samples within the association."
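
     Constancy and fidelity as quoted above can be put in numerical form.
The Python sketch below uses one possible pair of definitions, which are
assumptions and not taken from Stephenson (1972): constancy is the fraction of
an association's samples that contain the taxon, and fidelity is the fraction
of the taxon's occurrences that fall inside the association.  The sample
identifiers and taxa are hypothetical.

```python
def constancy(presence, association_samples):
    """Fraction of the association's samples in which the taxon is present."""
    hits = sum(1 for s in association_samples if s in presence)
    return hits / len(association_samples)

def fidelity(presence, association_samples):
    """Fraction of the taxon's occurrences that fall inside the association."""
    if not presence:
        return 0.0
    inside = sum(1 for s in presence if s in association_samples)
    return inside / len(presence)

# Hypothetical data: the set of samples in which each taxon was found.
occurrences = {
    "taxon_a": {"s1", "s2", "s3", "s4", "s5"},  # found everywhere
    "taxon_b": {"s1"},                          # rare but restricted
}
association = {"s1", "s2", "s3"}  # hypothetical group of samples

scores = {name: (constancy(found, association), fidelity(found, association))
          for name, found in occurrences.items()}
# taxon_a: constancy 1.0, fidelity 0.6 (constant but not faithful)
# taxon_b: constancy 1/3, fidelity 1.0 (faithful but not constant)
```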

     Although the theory of community ecology is based on associations of
species, the discussion that follows is based on taxonomy at the generic level
or higher.  Use of less refined taxonomy is necessitated by the difficulties
inherent in large aquatic ecological surveys where taxonomy must be based
solely on morphological observations of often limited size classes and
sometimes damaged specimens.  There also remain, of course, taxonomic
uncertainties and discrepancies in certain groups.

5.3.2.2  Clinch River Data Set--
     To identify representative assemblages of taxa, R-mode clustering of the
Clinch River data was undertaken with six similarity coefficients--S_J, S_SM,
Corr 6, Corr 7, Dist 6, and Dist 7.  Because the complete Clinch River data
set included samples collected before and after the pH stress, it was divided
into affected and unaffected subsets.  R-mode clustering was done on each
subset to determine (1) whether assemblages of taxa changed after the spill
and (2) whether keeping all 36 zoomacrobenthic samples in the same data matrix
changed the clusters of taxa identified in each subset.
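
     In R-mode the coefficients are computed between taxa, each described by
its presence or absence across the samples.  The Python sketch below shows the
two presence-absence coefficients in their usual textbook form, Jaccard
a/(a+b+c) and simple matching (a+d)/(a+b+c+d), where a counts joint presences,
b and c count mismatches, and d counts joint absences.  Returning 0.0 when two
taxa are never present is an assumption, and the presence-absence rows are
hypothetical.

```python
def match_counts(x, y):
    """Count joint presences (a), mismatches (b, c), and joint absences (d)."""
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if q and not p)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d

def jaccard(x, y):
    """Jaccard coefficient: joint absences are ignored."""
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0

def simple_matching(x, y):
    """Simple matching coefficient: joint absences count as agreement."""
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)

def r_mode_matrix(presence, coefficient):
    """Similarity between every pair of taxa (rows of the data matrix)."""
    taxa = sorted(presence)
    return {(s, t): coefficient(presence[s], presence[t])
            for s in taxa for t in taxa}

# Hypothetical presence (1) or absence (0) of three taxa in five samples.
presence = {
    "taxon_a": [1, 1, 0, 0, 1],
    "taxon_b": [1, 0, 0, 0, 1],
    "taxon_c": [0, 0, 0, 1, 0],
}
s_j = r_mode_matrix(presence, jaccard)
s_sm = r_mode_matrix(presence, simple_matching)
```

Because simple matching credits joint absences, rare taxa can appear similar
to one another merely by being absent from the same samples; the Jaccard
coefficient does not have that property.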

     The rcc for each dendrogram is listed in Table 27 (refer back to section
5.3.1.1).  Because rcc values for Corr 6, Corr 7, and Dist 6 dendrograms were
less than 0.8, no discussion of these dendrograms is included here.  The rcc
values for the S_J, S_SM, and Dist 7 dendrograms were greater than 0.8, except
for one S_SM dendrogram that had a value of 0.76.

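
     A distortion measure of this kind compares the original coefficients
with the levels at which each pair of items first joins in the dendrogram.
The Python sketch below assumes that rcc is the ordinary product-moment
correlation between those two sets of values and that clustering is by
unweighted pair-group averaging; the clustering method is not stated in this
section, and the distance matrix is hypothetical.

```python
import math
from itertools import combinations

def upgma_cophenetic(dist):
    """Cluster by pair-group averaging and return the cophenetic values.

    dist is a symmetric list of lists.  The result maps each pair (i, j),
    i < j, to the level at which items i and j first join.
    """
    clusters = {i: [i] for i in range(len(dist))}
    coph = {}

    def average(p, q):
        total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
        return total / (len(clusters[p]) * len(clusters[q]))

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        p, q = min(combinations(sorted(clusters), 2),
                   key=lambda pair: average(*pair))
        level = average(p, q)
        for i in clusters[p]:
            for j in clusters[q]:
                coph[(min(i, j), max(i, j))] = level
        clusters[p] = clusters[p] + clusters.pop(q)
    return coph

def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def cophenetic_correlation(dist):
    """Correlation between original and dendrogram-implied distances."""
    coph = upgma_cophenetic(dist)
    pairs = sorted(coph)
    return pearson([dist[i][j] for i, j in pairs], [coph[p] for p in pairs])

# Hypothetical distances among four items forming two tight pairs.
dist = [[0, 1, 6, 7],
        [1, 0, 7, 8],
        [6, 7, 0, 2],
        [7, 8, 2, 0]]
r_cc = cophenetic_correlation(dist)
acceptable = r_cc > 0.8  # the threshold used in the text
```
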
     To interpret the results, one or two arbitrary levels of similarity were
selected for each dendrogram.  The clusters or assemblages of taxa found at
each level of similarity were tabulated, and the tables compared.  For eight
dendrograms (Figures 23 to 30) with an rcc of >0.8, 11 tables were formed, and
the dendrograms were grouped according to their overall similarity to each
other (Tables 37 through 39).  Trophic-functional (TF) codes (refer to
Table 12) were then assigned to each taxon, and intercluster reordering by
numeric code was initiated.  The TF clusters were then rearranged within each
dendrogram to show similarities among dendrograms (Tables 40 to 42).  The last
step was to decode the TF clusters and list by scientific name (Tables 43 to
45).
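
     The first step of this procedure, reading off the clusters that exist at
a chosen level of similarity, can be sketched in Python as follows.  The
nested-tuple representation of a dendrogram, the function names, and the taxa
are all hypothetical, and the sketch assumes that similarity never decreases
from a node to its branches.

```python
def leaves(node):
    """List the taxa under a node; a leaf is a plain string."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)

def cut(node, level):
    """Return the clusters that exist at the given similarity level.

    A node is (similarity, left, right).  A node joined at or above the
    cut level stays together as one cluster; otherwise its two branches
    are cut separately.
    """
    if isinstance(node, str):
        return [[node]]
    similarity, left, right = node
    if similarity >= level:
        return [leaves(node)]
    return cut(left, level) + cut(right, level)

# Hypothetical dendrogram over five placeholder taxa.
tree = (0.40,
        (0.90, "taxon_a", (0.95, "taxon_b", "taxon_c")),
        (0.70, "taxon_d", "taxon_e"))

clusters_high = cut(tree, 0.87)
# [['taxon_a', 'taxon_b', 'taxon_c'], ['taxon_d'], ['taxon_e']]
clusters_low = cut(tree, 0.60)
# [['taxon_a', 'taxon_b', 'taxon_c'], ['taxon_d', 'taxon_e']]
```

Tabulating the clusters at two levels in this way shows which groupings are
stable and which appear only when the level is relaxed.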

     5.3.2.2.1  Presence-absence data--In Table 39, four clusters of taxa were
identified at the 0.87 and 0.89 levels of similarity for the S_J and S_SM
dendrograms computed from samples unaffected by the spill.  Cluster 1 had
eight taxa, dominated by pH-tolerant beetles.  Two major trophic-functional

-------
[Dendrogram graphic not reproduced.  Similarity axis: 0.015 to 1.065 in steps
of 0.150.  Taxa in plotted order, top to bottom:
STENELMIS, OPTIOSERVUS, DUBIRAPHIA, EPHEMERELLA, BAETIS, HYDROPSYCHE,
PSEPHENUS HERRICKI, GONIOBASIS SPINELLA, STENONEMA, ISONYCHIA,
GONIOBASIS CARINIFERA, ACRONEURIA, TRICORYTHODES, CHIRONOMIDAE,
CORYDALUS CORNUTUS, CHEUMATOPSYCHE, POTAMANTHUS, ANCULOSA,
ANCULOSA SUBGLOBOSA, HETAERINA, EPHORON, MICROCYLLOEPUS, HEMERODROMIA,
PROMORESIA, SIMULIUM, PERLESTA PLACIDA, PLEUROCERIDAE, SPHAERIUM, HEPTAGENIA]
Figure 23.   Dendrogram computed  from R-mode  cluster  analysis  of a matrix  of
              Jaccard's  coefficients, showing  distributional similarities of
              taxa collected  from  stations on  the Clinch River  unaffected by
              low-pH stress that resulted  from the 1970 spill of  acid  (rcc  = 0.97).

-------
[Dendrogram graphic not reproduced.  Panel title: S_SM.  Similarity axis:
0.369 to 0.999 in steps of 0.090.  Taxa in plotted order, top to bottom:
STENELMIS, OPTIOSERVUS, DUBIRAPHIA, EPHEMERELLA, BAETIS, HYDROPSYCHE,
PSEPHENUS HERRICKI, GONIOBASIS SPINELLA, STENONEMA, ISONYCHIA,
GONIOBASIS CARINIFERA, ACRONEURIA, TRICORYTHODES, CHIRONOMIDAE,
CORYDALUS CORNUTUS, CHEUMATOPSYCHE, POTAMANTHUS, ANCULOSA,
ANCULOSA SUBGLOBOSA, EPHORON, HETAERINA, MICROCYLLOEPUS, PROMORESIA,
HEMERODROMIA, SIMULIUM, PERLESTA PLACIDA, HEPTAGENIA, PLEUROCERIDAE, SPHAERIUM]
Figure 24.  Dendrogram computed from R-mode  cluster  analysis of a matrix of  simple
             matching coefficients,  showing distributional similarities  of taxa
             collected from  stations  on the Clinch River unaffected by low-pH stress
             that resulted from the  1970 spill of acid (rcc  = 0.91).

-------
[Dendrogram graphic not reproduced.  Panel title: DIST 7.  Distance axis:
13.800 to -0.200 in steps of 2.000.  Taxa in plotted order, top to bottom:
STENELMIS, ISONYCHIA, STENONEMA, CHEUMATOPSYCHE, MICROCYLLOEPUS,
HEMERODROMIA, HETAERINA, PROMORESIA, PLEUROCERIDAE, SPHAERIUM,
PSEPHENUS HERRICKI, ANCULOSA SUBGLOBOSA, GONIOBASIS SPINELLA, ACRONEURIA,
HEPTAGENIA, PERLESTA PLACIDA, EPHEMERELLA, CORYDALUS CORNUTUS, TRICORYTHODES,
EPHORON, POTAMANTHUS, CHIRONOMIDAE, ANCULOSA, GONIOBASIS CARINIFERA,
SIMULIUM, OPTIOSERVUS, BAETIS, DUBIRAPHIA, HYDROPSYCHE]
Figure  25.  Dendrogram  computed from R-mode  cluster analysis of  a matrix  of distance
             coefficients  computed  from data  that had been transformed by  the square-
             root  transformation, showing distributional  similarities of taxa collected
             from  stations  on the Clinch  River  unaffected by low-pH stress  that resulted
             from  the 1970  spill of  acid  (rcc = 0.92).

-------
[Dendrogram graphic not reproduced.  Similarity axis: 0.015 to 1.065 in steps
of 0.150.  Taxa in plotted order, top to bottom:
STENELMIS, OPTIOSERVUS, CHIRONOMIDAE, CORYDALUS CORNUTUS, MICROCYLLOEPUS,
CHEUMATOPSYCHE, DUBIRAPHIA, HEMERODROMIA, HYDROPSYCHE, PROMORESIA,
ACRONEURIA, GONIOBASIS SPINELLA, ISONYCHIA, BAETIS, TRICORYTHODES,
HETAERINA, EPHEMERELLA, GONIOBASIS CARINIFERA, STENONEMA, SIMULIUM,
ANCULOSA, PSEPHENUS HERRICKI, POTAMANTHUS, PERLESTA PLACIDA,
ANCULOSA SUBGLOBOSA]
Figure 26.   Dendrogram computed  from R-mode cluster analysis  of a matrix of  Jaccard's
              coefficients,  showing  distributional  similarities  of taxa  collected from
              stations on the Clinch River affected by low-pH stress that resulted
              from the 1970 spill of acid (rcc = 0.97).

-------
[Dendrogram graphic not reproduced.  Similarity axis: 0.408 to 0.968 in steps
of 0.080.  Taxa in plotted order, top to bottom:
STENELMIS, OPTIOSERVUS, MICROCYLLOEPUS, CORYDALUS CORNUTUS, CHIRONOMIDAE,
DUBIRAPHIA, HEMERODROMIA, HYDROPSYCHE, CHEUMATOPSYCHE, PROMORESIA,
ACRONEURIA, GONIOBASIS SPINELLA, ISONYCHIA, BAETIS, EPHEMERELLA,
GONIOBASIS CARINIFERA, TRICORYTHODES, HETAERINA, PSEPHENUS HERRICKI,
SIMULIUM, ANCULOSA, POTAMANTHUS, ANCULOSA SUBGLOBOSA, STENONEMA,
PERLESTA PLACIDA]
Figure 27.  Dendrogram computed from  R-mode  cluster  analysis  of a matrix  of  simple
             matching  coefficients,  showing distributional  similarities of  taxa
             collected from stations on the Clinch River affected by  low-pH stress
             that resulted from the  1970 spill of acid (rcc =  0.84).

-------
[Dendrogram graphic not reproduced.  Panel title: DIST 7.  Distance axis:
17.330 to -0.170 in steps of 2.500.  Taxa in plotted order, top to bottom:
STENELMIS, OPTIOSERVUS, CHEUMATOPSYCHE, MICROCYLLOEPUS, CORYDALUS CORNUTUS,
BAETIS, CHIRONOMIDAE, DUBIRAPHIA, PROMORESIA, HEMERODROMIA,
PSEPHENUS HERRICKI, POTAMANTHUS, ANCULOSA SUBGLOBOSA, GONIOBASIS CARINIFERA,
ANCULOSA, GONIOBASIS SPINELLA, STENONEMA, PERLESTA PLACIDA, EPHEMERELLA,
TRICORYTHODES, HETAERINA, ACRONEURIA, ISONYCHIA, SIMULIUM, HYDROPSYCHE]
Figure 28.  Dendrogram computed from R-mode cluster analysis of a matrix of distance
             coefficients computed from data that had been transformed by the square-
             root transformation, showing distributional similarities of taxa collected
             from stations on the Clinch River affected by low-pH stress that resulted
             from the 1970 spill of acid (rcc = 0.97).

-------
0.000
          0.150
                    0.300
o.wo
          0.600
0.750
                               0.900
                                         1.050
                                                                              STENELMIS

                                                                              OPTIOSERVUS

                                                                              DUBIRAPHIA

                                                                              HYDROPSYCHE

                                                                              CHIRONOMIDAE

                                                                              CORYDALUS CORNUTUS

                                                                              CHEUMATOPSYCHE

                                                                              ACRONEURIA

                                                                              MICROCYLLOEPUS

                                                                              PROMORESIA

                                                                              HEMERODROMIA

                                                                              GONIOBASIS SPINELLA

                                                                              ISONYCHIA

                                                                              BAETIS

                                                                              EPHEMERELLA

                                                                              GONIOBASIS CARINIFERA

                                                                              TRICORYTHODES

                                                                              HETAERINA

                                                                              SIMULIUM

                                                                              STENONEMA

                                                                              ANCULOSA

                                                                              ANCULOSA SUBGLOBOSA

                                                                              PSEPHENUS HERRICKI

                                                                              EPHORON

                                                                              POTAMANTHUS

                                                                              PERLESTA

                                                                              PLEUROCERIDAE

                                                                              SPHAERIUM

                                                                              HEPTAGENIA
0.000     0.150     0.300     0.450     0.600     0.750     0.900     1.050
Figure 29.   Dendrogram computed from R-mode cluster analysis of a matrix of Jaccard's
              coefficients, showing distributional similarities of taxa collected from
              stations on the Clinch River both affected and unaffected by the low-pH
              stress that resulted from the 1970 spill of acid (r_cc = 0.95).  Only
              those taxa are included that comprise 10 percent or more of the total
              sample.

-------
[Dendrogram graphic not reproduced.  Coefficient: Dist 7; scale from 17.000 to -0.500.
 Taxon labels, in order: STENELMIS, OPTIOSERVUS, CHEUMATOPSYCHE, ISONYCHIA, MICROCYLLOEPUS,
 CORYDALUS CORNUTUS, CHIRONOMIDAE, BAETIS, DUBIRAPHIA, PROMORESIA, HEMERODROMIA,
 PSEPHENUS HERRICKI, ANCULOSA SUBGLOBOSA, PLEUROCERIDAE, SPHAERIUM, GONIOBASIS SPINELLA,
 HEPTAGENIA, HETAERINA, ACRONEURIA, PERLESTA PLACIDA, EPHEMERELLA, TRICORYTHODES, EPHORON,
 POTAMANTHUS, STENONEMA, ANCULOSA, GONIOBASIS CARINIFERA, SIMULIUM, HYDROPSYCHE.]

Figure 30.   Dendrogram computed from R-mode cluster analysis of a matrix of distance
              coefficients computed from data that had been transformed by the square-
              root transformation, showing distributional similarities of taxa collected
              from stations on the Clinch River both affected and unaffected by the
              low-pH stress that resulted from the 1970 spill of acid (r_cc = 0.97).  Only
              those taxa are included that comprise 10 percent or more of the total sample.

-------
groups were represented—collectors and grazers.  Clusters 2 and 3 had four
and five taxa, respectively, but each cluster had three major functional
groups—collectors, grazers, and predators.  This indicated that assemblages
of taxa represented in clusters 2 and 3 were more characteristic of the total
zoomacrobenthic community than those in cluster 1.  The fourth cluster in
Table 39 consisted of two snail taxa representing one TF group.

     Table 43 shows that several taxa clustered together (using
presence-absence coefficients), regardless of the level of similarity, type of
data considered, or size of the data matrix: (1) Optioservus-Stenelmis, (2)
Chironomidae-Corydalus cornutus, (3) Hydropsyche-Cheumatopsyche-Dubiraphia,
(4) Isonychia-Baetis, (5) Goniobasis carinifera-Ephemerella, and (6)
Tricorythodes-Hetaerina.  These were identified as (1) grazers, (2)
grazer-predators, (3) collector-grazers, (4) collector-grazers, (5) grazers,
and (6) collector-predators.

     The six clusters were evaluated with regard to constancy and fidelity.

     5.3.2.2.2  Quantitative data, species counts—Associations of taxa based
on counts of individuals per taxon were more difficult to identify and less
discrete.  Table 44 summarizes clusters of taxa computed with the distance
coefficient Dist 7.  Only three taxa (Psephenus herricki, Goniobasis spinella,
and Hetaerina) met the criteria of constancy and fidelity.  Unfortunately, the
information provided by this coefficient was of limited value, and other
quantitative techniques should be considered in future studies.

     5.3.2.2.3  Summary--One justification for dividing the original data
matrix into subsets was to determine whether the associations of taxa found at
stations unaffected by the spill were different from those found at stations
affected by the spill.  To answer this question, the associations identified
in Tables 39 and 43 were collated.  Table 45 shows that species associations
after the spill were much smaller (two to three taxa per association) than
before the spill.  The table also shows that Stenelmis-Optioservus and
Chironomidae-Corydalus cornutus were the only associations found in
dendrograms of both unaffected and affected stations.  This indicates that
associations were influenced by the pH stress and that division of the
original data matrix of 36 samples into subsets was justified.

5.3.2.3  Cumberland River Data Set—1973--

     5.3.2.3.1  Presence-absence data--Jaccard's coefficient S_J produced
clusters with low levels of similarity (Figure 31).  At a similarity level of
0.27, four clusters were formed, with several taxa left unclustered.  The
first cluster consisted of a heterogeneous association of nine taxa from three
phyla—five detrital collectors, three predators, and one grazer.  This group
included one omnivore and almost equal numbers of detritivores and carnivores.
Cluster 2 contained a small but equally heterogeneous group of taxa displaying
three different food habits.  The last two clusters consisted entirely of
wormlike taxa that are usually abundant only in streams that have been
organically enriched.

-------
[Dendrogram graphic not reproduced.  Similarity scale (Jaccard's coefficient): -0.105 to 0.945.
 Taxon labels, in order: BRANCH., PROCLAD., C.(CRYP), HEXAGEN., LIM.SP.2, CHAOBOR., PENT.SPE,
 LIM.SP.1, CORBIUC., COELOTAN, PECTINAT, PENT.MON, CHIR.SP., LUM.SP.1, NEM.SP.1, CHIR.RIP,
 SMITTIA, TANYPUS, CHIR.TAN, HELOB., SPHAER., NEURECL.]

Figure 31.   Dendrogram computed from R-mode cluster analysis of a matrix of Jaccard's
             coefficients, showing distributional similarities of taxa collected from
             the Cumberland River in 1973.

-------
     Cluster analysis of the simple matching coefficient S_SM produced clusters
with low levels of similarity (Figure 32).  Two clusters were formed at a
level of similarity of 0.75.  The first cluster comprised four taxa that were
also closely clustered with Jaccard's coefficient S_J.  The second cluster was a
heterogeneous association of taxa that had been unclustered or placed in small
clusters by Jaccard's coefficient.
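
     The two coefficients treat joint absences differently: the simple matching
coefficient counts them as matches, whereas Jaccard's coefficient ignores them.
This may explain part of the difference in the similarity levels at which
clusters formed.  The following is a minimal sketch only, using hypothetical
presence-absence records for two taxa at ten stations (not data from this
study); the function names are illustrative.

```python
def contingency(x, y):
    """Return (a, b, c, d) for two equal-length presence-absence lists.

    a = present in both, b = present in x only, c = present in y only,
    d = absent from both (negative match).
    """
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d


def jaccard(a, b, c, d):
    # S_J = a / (a + b + c); joint absences (d) are ignored.
    # Undefined when a + b + c == 0 (both taxa absent everywhere).
    return a / (a + b + c)


def simple_matching(a, b, c, d):
    # S_SM = (a + d) / n; joint absences count as matches.
    return (a + d) / (a + b + c + d)


# Hypothetical records: two infrequently collected taxa at ten stations.
taxon_1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
taxon_2 = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

counts = contingency(taxon_1, taxon_2)
s_j = jaccard(*counts)
s_sm = simple_matching(*counts)
```

     For infrequently collected taxa, d is large, so S_SM is much higher than
S_J for the same pair of taxa.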

     5.3.2.3.2  Quantitative data, species counts--The distance coefficient
Dist 6 (square-root transformation of data and standardization by rows)
produced clusters with low levels of similarity (Figure 33).  Seven small
clusters were formed at a level of similarity of 1.11, with six taxa remaining
unclustered.  Only one association had also occurred when presence-absence
coefficients were used.

     The distance coefficient Dist 7 (square-root transformation of data)
produced a dendrogram in which almost all the taxa were grouped at the same
level of similarity and thus contained little useful information about
discrete associations (Figure 34).

     The correlation coefficients Corr 6 and Corr 7 produced identical
matrices and dendrograms due to the standardization inherent in the use of the
product-moment correlation coefficient.  Again, levels of similarity of
clusters were fairly low.  At levels of 0.46 and 0.35, five and seven small
clusters, respectively, were formed (Figure 35).  Some clusters contained
functionally and morphologically similar taxa, but others did not.  Some
contained only frequently collected taxa, whereas others contained mixtures of
frequent and rare taxa.  Very little ecological information that pertained to
the question of associations of taxa came from study of these dendrograms.
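
     The identity of Corr 6 and Corr 7 can be checked numerically.  The sketch
below assumes that "standardization by rows" means rescaling each row to zero
mean and unit standard deviation; that reading, the function names, and the
counts are illustrative assumptions, not values from this study.  It shows that
the product-moment correlation is unchanged by such rescaling, whereas the
average distance is not.

```python
import math


def mean(values):
    return sum(values) / len(values)


def standardize(row):
    """Rescale a row to zero mean and unit (sample) standard deviation."""
    m = mean(row)
    sd = math.sqrt(sum((v - m) ** 2 for v in row) / (len(row) - 1))
    return [(v - m) / sd for v in row]


def pearson(x, y):
    """Product-moment correlation, r = s_xy / (s_x s_y)."""
    mx, my = mean(x), mean(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def average_distance(x, y):
    """Average distance, d = sqrt(sum((x_i - y_i)^2) / n)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)) / len(x))


# Hypothetical counts of two taxa in five samples.
taxon_1 = [0, 4, 25, 100, 9]
taxon_2 = [1, 9, 16, 64, 36]
root_1 = [math.sqrt(c + 0.5) for c in taxon_1]  # square-root transformation
root_2 = [math.sqrt(c + 0.5) for c in taxon_2]

r_standardized = pearson(standardize(root_1), standardize(root_2))  # as in Corr 6
r_unstandardized = pearson(root_1, root_2)                          # as in Corr 7
d_standardized = average_distance(standardize(root_1), standardize(root_2))  # as in Dist 6
d_unstandardized = average_distance(root_1, root_2)                          # as in Dist 7
```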

-------
[Dendrogram graphic not reproduced.  Similarity scale (simple matching coefficient): 0.413 to 0.938.
 Taxon labels, in order: BRANCH., PROCLAD., C.(CRYP), HEXAGEN., CHAOBOR., PENT.SPE, LIM.SP.2,
 CHIR.RIP, CHIR.TAN, HELOB., NEURECL., SPHAER., TANYPUS, SMITTIA, NEM.SP.1, PENT.MON, COELOTAN,
 PECTINAT, CORBIUC., CHIR.SP., LUM.SP.1, LIM.SP.1.]

Figure 32.   Dendrogram computed from R-mode cluster analysis of a matrix of simple
             matching coefficients, showing distributional similarities of taxa
             collected from the Cumberland River in 1973.

-------
[Dendrogram graphic not reproduced.  Coefficient: Dist 6; scale from 1.422 to 0.792.]

Figure 33.   Dendrogram computed from R-mode cluster analysis of a matrix of distance
             coefficients with data transformed with the square-root transformation
             and standardized by rows, showing distributional similarities of taxa
             collected from the Cumberland River in 1973.

-------
[Dendrogram graphic not reproduced.  Coefficient: Dist 7; scale from 16.000 to -1.500.
 Taxon labels, in order: BRANCH., CHIR.RIP, CHIR.TAN, TANYPUS, HELOB., NEM.SP.1, PENT.MON,
 SPHAER., NEURECL., SMITTIA, COELOTAN, CHIR.SP., CORBIUC., PECTINAT, C.(CRYP), PENT.SPE,
 PROCLAD., HEXAGEN., LUM.SP.1, CHAOBOR., LIM.SP.1, LIM.SP.2.]

Figure 34.   Dendrogram computed from R-mode cluster analysis of a matrix of distance
             coefficients with data transformed by the square-root transformation,
             showing distributional similarities of taxa collected from the Cumberland
             River in 1973.

-------
[Dendrogram graphic not reproduced.  Coefficient: Corr 6; scale from -0.210 to 0.840.]

Figure 35.   Dendrogram computed from R-mode cluster analysis of a matrix of correlation
             coefficients with data transformed by the square-root transformation and
             standardized by rows, showing distributional similarities of taxa collected
             from the Cumberland River in 1973.

-------
TABLE 16.  COEFFICIENTS OF CORRELATION, DISTANCE, AND SIMILARITY:  ABBREVIATIONS, EQUATIONS,
                                AND UPPER AND LOWER LIMITS^a

                                                                                     Lower   Upper
Coefficient                   Abbreviation   Equation                                limit   limit

Pearson product-moment        Corr           r_jk = s_jk / (s_j s_k)                  -1       1
  correlation coefficient
Sokal's average taxonomic     Dist           d_jk = sqrt[SUM_i (X_ij - X_ik)^2 / n]    0       ∞
  distance coefficient
Simple matching               S_SM           S_SM = (a + d)/n                          0       1
  coefficient
Unnamed coefficient 1         UN1            S_UN1 = 2(a + d)/(n + a + d)              0       1
Rogers and Tanimoto's         RT             S_RT = (a + d)/(a + d + 2b + 2c)          0       1
  coefficient
Unnamed coefficient 3         UN3            S_UN3 = (a + d)/(b + c)                   0       ∞
Hamann                        H              S_H = (a + d - b - c)/n                   0       1
Jaccard                       S_J            S_J = a/(a + b + c)                       0       1
Russell and Rao               RR             S_RR = a/n                                0       1
Dice                          D              S_D = 2a/(2a + b + c)                     0       1
Unnamed coefficient 2         UN2            S_UN2 = a/(a + 2b + 2c)                   0       1

-------
                                    TABLE 16 (continued)

                                                                                     Lower   Upper
Coefficient                   Abbreviation   Equation                                limit   limit

Kulczynski 1                  K1             S_K1 = a/(n_J + n_K - 2a)                 0       ∞
Kulczynski 2                  K2             S_K2 = 1/2(a/n_J) + 1/2(a/n_K)            0       1
Unnamed coefficient 4         UN4            S_UN4 = 1/4(a/n_J) + 1/4(a/n_K)           0       1
Ochiai (Otsuka)               OCH            S_OCH = a/sqrt(n_J n_K)                   0       1
Unnamed coefficient 5         UN5            S_UN5 = ad/sqrt(n_J n_K n_j n_k)          0       1
Yule                          Y              S_Y = (ad - bc)/(ad + bc)                -1       1
Phi                           PHI            S_PHI = (ad - bc)/sqrt(n_J n_K n_j n_k)  -1       1

        ^a See Sokal and Sneath (1963) for further discussion.
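
The coefficients that use marginal totals can be sketched in Python as follows.
This is an illustration only: it assumes that n_J and n_K are the numbers of
species present in samples j and k (a + c and a + b) and that n_j and n_k are
the numbers absent from them (b + d and c + d), which is an interpretation of
the subscripts rather than a definition given in the table.  The counts are
hypothetical.

```python
import math


def margins(a, b, c, d):
    """Marginal totals of the 2 x 2 table (assumed meaning of the subscripts)."""
    n_J = a + c  # species present in sample j
    n_K = a + b  # species present in sample k
    n_j = b + d  # species absent from sample j
    n_k = c + d  # species absent from sample k
    return n_J, n_K, n_j, n_k


def kulczynski_1(a, b, c, d):
    n_J, n_K, _, _ = margins(a, b, c, d)
    return a / (n_J + n_K - 2 * a)  # equals a / (b + c); no upper bound


def kulczynski_2(a, b, c, d):
    n_J, n_K, _, _ = margins(a, b, c, d)
    return 0.5 * (a / n_J) + 0.5 * (a / n_K)


def ochiai(a, b, c, d):
    n_J, n_K, _, _ = margins(a, b, c, d)
    return a / math.sqrt(n_J * n_K)


def yule(a, b, c, d):
    return (a * d - b * c) / (a * d + b * c)


def phi(a, b, c, d):
    n_J, n_K, n_j, n_k = margins(a, b, c, d)
    return (a * d - b * c) / math.sqrt(n_J * n_K * n_j * n_k)


# Hypothetical counts for one pair of samples.
a, b, c, d = 6, 2, 3, 9
k1 = kulczynski_1(a, b, c, d)
k2 = kulczynski_2(a, b, c, d)
s_och = ochiai(a, b, c, d)
s_y = yule(a, b, c, d)
s_phi = phi(a, b, c, d)
```

With these counts Kulczynski 1 exceeds 1, which is consistent with its having
no finite upper limit.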

-------
         TABLE 17.  CONTINGENCY TABLE (2 X 2) DEFINING THE TERMS
           a, b, c, AND d AS USED IN THE EQUATIONS IN TABLE 13

                                         Sample j
   Sample k            Species present              Species absent

Species present        a (present in both)          b (present in k, absent
                                                       from j)

Species absent         c (present in j,             d (absent from both;
                          absent from k)               negative match)

         TABLE 18.  EFFECT OF THE TRANSFORMATION LOG_10 (X_ij + 1),
      WHERE X_ij IS THE ABUNDANCE OF THE ith SPECIES IN THE jth SAMPLE

          Abundance          Value after transformation

                0                        0
                9                        1
               99                        2
              999                        3
             9999                        4

     TABLE 19.  EFFECT OF THE TRANSFORMATION SQRT(X_ij + 0.5), WHERE X_ij
          IS THE ABUNDANCE OF THE ith SPECIES IN THE jth SAMPLE

          Abundance          Value after transformation

                0                      0.707
               10                      3.240
              100                     10.025
             1000                     31.631
            10000                    100.003
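
The two transformations in Tables 18 and 19 can be reproduced with a short
Python sketch.  The function names are illustrative, and the logarithm is
assumed to be base 10, as the tabulated values imply.

```python
import math


def log_transform(x):
    """log10(x + 1): compresses each order of magnitude into one unit."""
    return math.log10(x + 1)


def sqrt_transform(x):
    """sqrt(x + 0.5): a milder compression that still maps 0 to a finite value."""
    return math.sqrt(x + 0.5)


# Abundances tabulated in Table 18 and the first four in Table 19.
table_18 = {x: round(log_transform(x), 3) for x in (0, 9, 99, 999, 9999)}
table_19 = {x: round(sqrt_transform(x), 3) for x in (0, 10, 100, 1000)}
```

The logarithmic transformation reduces a range of four orders of magnitude to
the interval 0 to 4, whereas the square-root transformation leaves abundant
taxa with considerably more weight.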

-------
      TABLE 20.  LABELS OF CORRELATION AND DISTANCE MATRICES
                    WITH VARIOUS TRANSFORMATIONS

      Transformation                 Correlation        Distance

None (raw data)                        Corr 1            Dist 1

Standardization by rows                Corr 2            Dist 2

Log transformation,
  standardization by rows              Corr 4            Dist 4

Square root transformation,
  standardization by rows              Corr 6            Dist 6

Square root transformation,
  no standardization                   Corr 7            Dist 7

-------
      TABLE 21.   MATRIX OF COEFFICIENTS OF COPHENETIC CORRELATION COMPUTED BETWEEN CORRESPONDING
                          ELEMENTS OF 21 CORRELATION AND SIMILARITY MATRICES^a

         Corr 1  Corr 2  Corr 4  Corr 6  Corr 7  S_SM    UN1     RT      UN3     H       S_J

Corr 1   1.000
Corr 2   0.021   1.000
Corr 4   0.137   0.786   1.000
Corr 6   0.086   0.925   0.943   1.000
Corr 7   0.988   0.055   0.202   0.139   1.000
S_SM     0.121   0.494   0.646   0.571   0.171   1.000
UN1      0.116   0.478   0.622   0.550   0.165   0.996   1.000
RT       0.127   0.509   0.668   0.590   0.176   0.994   0.981   1.000
UN3      0.129   0.446   0.600   0.525   0.163   0.759   0.718   0.809   1.000
H        0.121   0.494   0.646   0.571   0.171   1.000   0.996   0.994   0.759   1.000
S_J      0.224   0.283   0.537   0.421   0.282   0.700   0.684   0.712   0.599   0.700   1.000
RR       0.223   0.138   0.348   0.257   0.273   0.424   0.418   0.427   0.338   0.424   0.916
D        0.240   0.253   0.508   0.389   0.300   0.668   0.658   0.673   0.532   0.668   0.987
UN2      0.211   0.302   0.554   0.441   0.266   0.711   0.689   0.732   0.662   0.711   0.989
K1       0.169   0.286   0.507   0.409   0.208   0.611   0.577   0.654   0.805   0.611   0.804
K2       0.336   0.237   0.531   0.386   0.407   0.648   0.636   0.654   0.521   0.648   0.900
UN4      0.289   0.348   0.631   0.491   0.360   0.828   0.817   0.832   0.651   0.828   0.858
OCH      0.282   0.253   0.529   0.397   0.348   0.674   0.663   0.679   0.538   0.674   0.972
UN5      0.251   0.339   0.618   0.484   0.316   0.812   0.796   0.822   0.674   0.812   0.958
Y        0.344   0.211   0.486   0.341   0.415   0.619   0.619   0.608   0.417   0.619   0.678
PHI      0.270   0.351   0.637   0.498   0.341   0.834   0.832   0.837   0.657   0.834   0.895

^a Abbreviations defined in Table 20.

-------
                                                  TABLE 21 (continued)

         RR      D       UN2     K1      K2      UN4     OCH     UN5     Y       PHI

RR       1.000
D        0.910   1.000
UN2      0.889   0.955   1.000
K1       0.705   0.731   0.871   1.000
K2       0.813   0.916   0.870   0.669   1.000
UN4      0.667   0.856   0.845   0.697   0.951   1.000
OCH      0.889   0.987   0.940   0.719   0.968   0.911   1.000
UN5      0.787   0.953   0.943   0.765   0.941   0.957   0.968   1.000
Y        0.546   0.721   0.633   0.445   0.906   0.912   0.809   0.811   1.000
PHI      0.718   0.891   0.880   0.707   0.951   0.995   0.933   0.975   0.886   1.000
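
The entry of 1.000 between the simple matching and Hamann coefficients follows
from the equations in Table 16: S_H = (a + d - b - c)/n = 2 S_SM - 1, so one is
a linear function of the other and corresponding elements of the two matrices
are perfectly correlated.  A minimal Python check, using hypothetical (a, b, c,
d) counts rather than data from this study:

```python
import math


def simple_matching(a, b, c, d):
    return (a + d) / (a + b + c + d)


def hamann(a, b, c, d):
    return (a + d - b - c) / (a + b + c + d)


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical contingency counts for five pairs (n = 20 in each case).
pairs = [(6, 2, 3, 9), (1, 1, 1, 17), (10, 0, 4, 6), (3, 7, 5, 5), (0, 2, 2, 16)]

ssm_values = [simple_matching(*p) for p in pairs]
h_values = [hamann(*p) for p in pairs]

# Correlation between corresponding elements of the two sets of coefficients.
r_ssm_h = pearson(ssm_values, h_values)
```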

-------
      TABLE 22.  COEFFICIENTS OF COPHENETIC CORRELATION COMPARING
 DISTANCE MATRICES AND SELECTED CORRELATION AND SIMILARITY MATRICES^a

Dist 1
Dist 1 1.000
Dist 2 0.332
Dist 4 0.227
Dist 6 0.326
Dist 7 0.890
Corr 6 	
Corr 7 	
S_SM -0.157
S_J 0.093
Dist 2 Dist 4 Dist 6 Dist 7

1.000
0.878 1.000
0.978 0.951 1.000
0.416 0.320 0.431 1.000
	 	 	 	
-0.276 -0.514
-0.412 -0.680 -0.532 -0.190

-0.100 -0.358 -0.210 -0.060


^a Data collected from Cumberland River (Old Hickory Reservoir) in 1973.

-------
   TABLE 23.  NUMBER OF TAXA IN EACH MAJOR TAXONOMIC GROUP IN THE
        CLINCH RIVER (1970) BEFORE AND AFTER RELATIVE SPECIES
       ABUNDANCE WAS DETERMINED AND RARE TAXA WERE ELIMINATED

                            Number of taxa per group
Major taxonomic                              Rare taxa
group                    All taxa            eliminated

Amphipoda                    1
Annelida                     3
Coleoptera                  22                   6
Decapoda                     4
Diptera                     13                   3
Ephemeroptera               15                   8
Gastropoda                  14                   5
Hemiptera                    1
Hydracarina                  1
Lepidoptera                  2
Megaloptera                  3                   1
Odonata                     14                   1
Pelecypoda                   6                   1
Plecoptera                   7                   2
Trichoptera                 17                   2
Total                      123                  29

Based on a relative abundance (RA) < 0.01.
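
Elimination of rare taxa by relative abundance can be sketched as follows.
This is an illustration under stated assumptions: RA is taken here to be a
taxon's count divided by the total count of all organisms, taxa with RA of
exactly 0.01 are retained, and the taxon names and counts are hypothetical.

```python
def relative_abundance(counts):
    """Return each taxon's share of the total number of organisms."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def drop_rare(counts, threshold=0.01):
    """Keep only taxa whose relative abundance is at least the threshold."""
    ra = relative_abundance(counts)
    return {taxon: n for taxon, n in counts.items() if ra[taxon] >= threshold}


# Hypothetical counts (1000 organisms in total).
counts = {
    "Taxon A": 620,
    "Taxon B": 310,
    "Taxon C": 64,
    "Taxon D": 4,
    "Taxon E": 2,
}

kept = drop_rare(counts)
percent_of_organisms_kept = 100 * sum(kept.values()) / sum(counts.values())
```

In this example two of five taxa are dropped, yet the retained taxa still
account for more than 99 percent of the organisms.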

-------
    TABLE 24.   TWENTY-NINE TAXA AND THEIR RESPECTIVE TROPHIC CODES
              FOR THE REDUCED CLINCH RIVER DATA SET, 1970

Taxon                    Trophic code*     Taxon                     Trophic code*

Stenelmis sp.                3411          Tricorythodes sp.             2312
Microcylloepus sp.           3411          Baetis sp.                    3424
Optioservus sp.              3411          Anculosa sp.                  3411
Dubiraphia sp.               3411          Anculosa subglobosa           3411
Promoresia sp.               3411          Pleuroceridae                 3411
Psephenus herricki           3412          Goniobasis spinella           3411
Chironomidae                 3524          Goniobasis carinifera         3411
Simulium sp.                 2212          Corydalus cornutus            4613
Hemerodromia sp.             4713          Hetaerina sp.                 4613
Ephoron sp.                  2612          Sphaerium sp.                 2212
Potamanthus sp.              2612          Perlesta placida              4624
Stenonema sp.                3424          Acroneuria sp.                4624
Heptagenia sp.               3424          Hydropsyche sp.               2224
Isonychia sp.                2224          Cheumatopsyche sp.            2224
Ephemerella sp.              3424

*Trophic codes are listed in Tables 11 and 12 (section 4.1.3).

-------
TABLE 25.  NUMBER OF TAXA WITH A RELATIVE ABUNDANCE >0.01
      DIVIDED BY THE TOTAL NUMBER OF TAXA PER STATION,
                     CLINCH RIVER, 1970

Station     Early     Late      Early     Late      Early     Late
number      June      June      July      July      August    August

   4        28/64                         27/72               25/57
   7        24/54     24/46     23/38     24/43     20/43     25/50
   8        24/47     16/28     16/25     16/30     19/33     15/29
   9        25/48     12/24     13/30     17/36     16/29     18/35
  10        26/50     19/39     19/35     20/39     22/48     21/43
  11        26/49                         22/56               23/44

TABLE 26.  TAXA WITH A RELATIVE ABUNDANCE >0.01 AS PERCENT OF
   THE TOTAL NUMBER OF TAXA PER STATION, CLINCH RIVER, 1970

Station     Early     Late      Early     Late      Early     Late
number      June      June      July      July      August    August

   4         44                            38                  44
   7         44        52        61        56        47        50
   8         51        57        64        53        58        52
   9         52        50        43        47        55        51
  10         52        49        54        51        46        49
  11         53                            39                  52
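
The percentages in Table 26 follow directly from the ratios in Table 25.  A
short Python sketch for the late July samples (the function name and the
round-half-up convention are assumptions made for illustration):

```python
def percent(part, whole):
    """Percentage rounded to the nearest whole number (halves round up)."""
    return int(part / whole * 100 + 0.5)


# Late July ratios from Table 25: station -> (taxa with RA > 0.01, total taxa).
late_july = {
    4: (27, 72),
    7: (24, 43),
    8: (16, 30),
    9: (17, 36),
    10: (20, 39),
    11: (22, 56),
}

late_july_percent = {station: percent(*ratio) for station, ratio in late_july.items()}
```

The result agrees with the late July column of Table 26.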

-------
TABLE 27.  TOTAL NUMBER OF ORGANISMS PER STATION,
               CLINCH RIVER, 1970

Station     Early     Late      Early     Late      Early     Late
number      June      June      July      July      August    August

   4        2693                          3847                2522
   7        1830       853       989      1636       562       912
   8        1465       703       901      2618       656       397
   9        1738       936      1191      5042      1010       812
  10        3857      1186      2189      5614      2945      2351
  11        1645                          2723                1557

TABLE 28.  TWENTY-NINE TAXA WITH RELATIVE ABUNDANCE >0.01
   AS PERCENT OF TOTAL NUMBER OF ORGANISMS PER STATION,
                  CLINCH RIVER, 1970

Station     Early     Late      Early     Late      Early     Late
number      June      June      July      July      August    August

   4         95                            93                  93
   7         95        90        93        88        84        88
   8         96        97        97        98        89        82
   9         96        97        97        97        96        89
  10         97        97        98        99        97        90
  11         91                            96                  86

-------
     TABLE 29.  ANALYTICAL TECHNIQUE, TYPE OF COMPARISON, AND THE
      NUMBER AND TYPE OF SIMILARITY COEFFICIENTS USED TO ANALYZE
                THE REDUCED 1970 CLINCH RIVER DATA SET

                                                     Nonmetric multidimensional
                           Cluster analysis^a                scaling^b
                           Q-mode      R-mode          Q-mode      R-mode

Unaffected stations          -           6               -           -

Affected stations            -           6               -           -

All stations                 6           6               2           -

           Total dendrograms = 24     Total scatter diagrams = 2

^a Six coefficients were used:  two correlation, two
   distance, and two presence-absence coefficients.

^b Two coefficients were used:  one distance and one
   presence-absence coefficient.

-------
TABLE 30.  COPHENETIC CORRELATION VALUES (r_cc) FOR 24 DENDROGRAMS
  COMPUTED FROM THE CLINCH RIVER DATA SET, JUNE TO AUGUST 1970

                        S_J    S_SM     Corr 6    Corr 7    Dist 6    Dist 7

                                 R-mode

Affected stations       0.97   0.84      0.73      0.63      0.76      0.97

Unaffected stations     0.97   0.91      0.63      0.63      0.69      0.92

All stations            0.95   0.76      0.75      0.75      0.78      0.97

                                 Q-mode

All stations            0.93   0.77      0.82      0.70      0.91      0.80

-------
   TABLE 31.  RESULTS OF Q-MODE CLUSTER ANALYSIS OF ZOOMACROBENTHIC
        SAMPLES FROM THE CLINCH RIVER, 1970:  MINIMUM LEVEL OF
   SIMILARITY USED TO DEFINE CLUSTERS = 0.62; JACCARD'S COEFFICIENT


                                             Station
  Sampling Interval             4   7LB&MS   7RB   8   9   10   11

                             Cluster 1
Before pH stress                XX           XXX
Immediately after pH stress            X                         X
2 weeks after                          X                    X
4 weeks after                   XX               XXX
6 weeks after                                      XXX
8 weeks after                   X                  X   X    X    X
                             Cluster 2
Before pH stress
Immediately after pH stress                        XXX
2 weeks after                                      X   X
4 weeks after                                      X
6 weeks after
8 weeks after
                       Unclustered stations
Before pH stress                              X
Immediately after pH stress                   X
2 weeks after                                 X
4 weeks after                                 X
6 weeks after                          X      X
8 weeks after                          X      X

-------
   TABLE 32.  RESULTS OF Q-MODE CLUSTER ANALYSIS OF ZOOMACROBENTHIC
       SAMPLES FROM THE CLINCH RIVER, 1970:  LEVEL OF SIMILARITY
      USED TO DEFINE CLUSTERS = 0.06; CORRELATION COEFFICIENT 6,
          SQRT(Y + 0.5) TRANSFORMATION AND STANDARDIZATION BY ROWS
                                             Station
        Date                    4   7LB&MS   7RB   8   9   10   11

                             Cluster 1
Before pH stress                XXX
Immediately after pH stress            X      X
2 weeks after                          X      X
4 weeks after                   XXX
6 weeks after                          X      X
8 weeks after                   XXX
                             Cluster 2
Before pH stress                                   X
Immediately after pH stress                        X   X
2 weeks after                                      X   X
4 weeks after
6 weeks after
8 weeks after
                             Cluster 3
Before pH stress
Immediately after pH stress
2 weeks after                                               X
4 weeks after                                      X   X    X    X
6 weeks after                                      XX    X
8 weeks after                                      X   X    X    X
                             Cluster 4
Before pH stress                                       X    X
Immediately after pH stress                                 X
2 weeks after
4 weeks after
6 weeks after
8 weeks after

-------
   TABLE 33.  RESULTS OF Q-MODE CLUSTER ANALYSIS OF ZOOMACROBENTHIC
       SAMPLES FROM THE CLINCH RIVER, 1970:  LEVEL OF SIMILARITY
        USED TO DEFINE CLUSTERS = 1.2; DISTANCE COEFFICIENT 6,
          SQRT(Y + 0.5) TRANSFORMATION AND STANDARDIZATION BY ROWS
                                             Station
        Date                    4   7LB&MS   7RB   8   9   10   11
                             Cluster 1
Before pH stress
Immediately after pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
X
X
X
X
X
X
X
X
X

X
X
X
X

X
X
           Clusters 2 and 4 and unclustered stations (U)
Before pH stress U
Immediately after pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after

2
2
2
2
U
U


4





4


                             Cluster 3
Before pH stress
Immediately after pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
                            X
                            X
                   X

                   X

-------
   TABLE 34.  RESULTS OF Q-MODE CLUSTER ANALYSIS OF ZOOMACROBENTHIC
       SAMPLES FROM THE CLINCH RIVER,  1970:  LEVEL OF SIMILARITY
         USED TO DEFINE CLUSTERS = 4.8; DISTANCE COEFFICIENT 7,
          SQRT(Y + 0.5) TRANSFORMATION AND STANDARDIZATION BY ROWS
                                             Station
        Date                    4   7LB&MS   7RB   8   9   10   11
Cluster 1
Before pH stress                XX           XXX
Immediately after pH stress
2 weeks after
4 weeks after                   X
6 weeks after                                               X
8 weeks after                   X                           X
                             Cluster 2
Before pH stress                              X
Immediately after pH stress            X      X    X   X    X
2 weeks after                          X      XXX
4 weeks after                          X      X
6 weeks after                          X      XXX
8 weeks after                          X      XXX
                Cluster 3 and unclustered stations
Before pH stress
Immediately after pH stress
2 weeks after                                               X
4 weeks after                                      X   X    U    U
6 weeks after
8 weeks after

-------
                    TABLE 35.  SUMMARY OF RESULTS OF Q-MODE CLUSTER ANALYSES, CUMBERLAND RIVER,  1973
                                          Coefficients
Characteristics       SJ            SSM           Dist 6        Dist 7        Corr 6        Corr 7

rcc                   0.85          0.64          0.62          0.83          0.62          0.98

Similarity level      1             3             1             2             1             3
of clusters

Interpretable         good          good          poor          good          poor          fair
structure

Highest               4 Oct         2 Jun         6 Jul         1 Oct         1 Oct         3 Jul
associations          5 Oct         6 Jul         2 Aug         6 Oct         6 Oct         5 Jul
                      3 Jun         3 Jun         2 Oct         2 Jul         6 Jul         1 Oct
                      6 Jun         6 Jun         6 Jul         2 Oct         2 Aug         6 Oct
                      1 Jul         4 Oct         2 Jul         6 Jul         1 Jun         3 Jun
                      5 Jul         5 Oct         2 Oct         2 Oct         2 Jun         6 Jul
                                                  6 Jul
                                                  2 Aug
                      1 Oct         1 Oct         2 Jul         3 Jul         4 Oct         4 Oct
                      6 Oct         6 Oct         2 Oct         5 Jul         5 Oct         5 Oct
                                                  6 Jul
                                                  2 Aug
                                                  3 Sep
                      6 Aug         1 Jul         1 Oct         4 Aug         3 Jun         2 Aug
                      4 Oct         5 Jul         6 Oct         5 Oct         5 Aug         5 Aug
                      5 Oct

Principal causes      Station       Station       Station       Station       Station       Month
of association        location and  location and  location      location and  location and  sampled and
(in order of          month sampled month sampled               month sampled month sampled station
importance)                                                                                 location
as indicated
by analysis of
biological and
chemical data

-------
                                               TABLE 35  (continued)
                                          Coefficients
Characteristics       SJ            SSM           Dist 6        Dist 7        Corr 6        Corr 7

Principal causes      Month         Month         Station       Station       Month         Month
of association as     sampled and   sampled and   location      location and  sampled and   sampled and
indicated solely      station       station                     month sampled station       station
by the highest        location      location                                  location      location
biological
associations


  a 1 = distinct clusters form only at relatively low similarity levels;
    2 = distinct clusters form at intermediate similarity levels;
    3 = distinct clusters form at relatively high similarity levels.

  b Results of cluster analysis were weighed heavily in this evaluation.
  c Associations listed in characteristic four of this table.

-------
TABLE 36.  SUMMARY OF RESULTS OF Q-MODE CLUSTER ANALYSES, CUMBERLAND RIVER, 1975

                                          Coefficients
Characteristics       SJ            SSM           Dist 6        Dist 7        Corr 6        Corr 7

rcc                   0.88          0.70          0.95          0.84          0.73          0.74

Similarity level      1             2             2             2             1             2
of clusters

Interpretable         good          good          poor          fair          poor          poor
structure

Highest               3 Jun         3 Jun         3 Jun         3 Jun         3 Jun         3 Jun
associations          3 Jul         3 Jul         3 Jul         3 Jul         3 Jul         3 Jul
                      4 Jun         4 Jun         4 Jun         4 Jun         4 Jun         4 Jun
                      4 Jul         4 Jul         4 Jul         4 Jul         4 Jul         4 Jul
                      5 Jun         5 Jun         5 Jun         5 Jun         5 Jun         5 Jun
                      5 Jul         5 Jul         5 Jul         5 Jul         5 Jul         5 Jul
                      6 Jun         6 Jun         6 Jun         6 Jun         6 Jun         6 Jun
                      6 Jul         6 Jul         6 Jul         6 Jul         6 Jul         6 Jul
                      7 Jun         7 Jun         7 Jun         7 Jun         7 Jun         7 Jun
                      7 Jul         7 Jul         7 Jul         7 Jul         7 Jul         7 Jul
                      2 May         2 May         3 May         3 May         3 May         1 May
                      5 May         5 May         6 May         6 May         3 Aug         3 May

-------
                                             TABLE 36 (continued)

                                          Coefficients
Characteristics       SJ            SSM           Dist 6        Dist 7        Corr 6        Corr 7

Principal cause(s)    Month in      Month in      Month in      Month in      Month in      Month in
of association        which         which         which         which         which         which
(in order of          collected     collected     collected     collected     collected     collected
importance) as                      and station   and station   and station   and station   and station
indicated by                        location      location      location      location      location
analysis of both
biological and
physicochemical
data

Principal cause(s)    Station       Station       Station       Station       Station       Station
of association        location      location      location      location      location      location
(in order of          and month     and month     and month     and month     and month     and month
importance) as        in which      in which      in which      in which      in which      in which
indicated solely      collected     collected     collected     collected     collected     collected
by the highest
biological
associations


a 1 = distinct clusters form only at relatively low similarity levels;
  2 = intermediate;
  3 = distinct clusters form at relatively high similarity levels.
b Results of cluster analysis were included in this evaluation.
c Associations listed in characteristic four of this table.

-------
                TABLE 37.  RESULTS OF R-MODE CLUSTER ANALYSIS OF JACCARD'S (SJ) AND SIMPLE-MATCHING (SSM)
                                        COEFFICIENTS, CLINCH RIVER DATA SET, 1970

                 CATEGORY OF STATIONS INCLUDED (COEFFICIENT, SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)
     Affected            Affected            All stations        Affected            Affected
    (SJ, 0.88)          (SSM, 0.88)          (SJ, 0.77)         (SSM, 0.71)         (SJ, 0.58)
Stenelmis
Optioservus
Chironomidae
Corydalus cornutus
                       Stenelmis
                       Optioservus
                       Microcylloepus
                       Corydalus cornutus
                       Chironomidae
                       Stenelmis
                       Optioservus
                       Dubiraphia
                       Hydropsyche
                       Chironomidae
                       Corydalus cornutus
                       Cheumatopsyche
                       Acroneuria
                       Microcylloepus
                       Promoresia
                       Hemerodromia
                          Stenelmis
                          Optioservus
                          Microcylloepus
                          Corydalus cornutus
                          Chironomidae
                          Dubiraphia
                          Hemerodromia
                          Hydropsyche
                          Cheumatopsyche
                          Promoresia
                          Acroneuria
                          Isonychia
                          Baetis
                          Stenelmis
                          Optioservus
                          Chironomidae
                          Corydalus cornutus
                          Microcylloepus
                          Cheumatopsyche
                          Dubiraphia
                          Hemerodromia
                          Hydropsyche
                          Promoresia
                          Acroneuria
                          Goniobasis spinella

                          Isonychia
                          Baetis
                          Tricorythodes
                          Hetaerina
Microcylloepus
Cheumatopsyche
Dubiraphia
Hemerodromia
Dubiraphia
Hemerodromia
Hydropsyche
Cheumatopsyche
Isonychia
Baetis
Ephemerella
Goniobasis carinifera
Tricorythodes
Hetaerina
                                                                                                  Ephemerella
                                                                                                  Goniobasis carinifera
                       Tricorythodes
                       Hetaerina
                                              Ephemerella               Simulium
                                              Goniobasis carinifera     Anculosa
                                              Tricorythodes
                                              Hetaerina

-------
TABLE 38.  RESULTS OF R-MODE CLUSTER ANALYSIS OF DISTANCE COEFFICIENTS, CLINCH RIVER DATA SET, 1970

     CATEGORY OF STATIONS INCLUDED (COEFFICIENT, SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)
      Affected
    (Dist 7, 1.8)
      Affected
    (Dist 7,  5.3)
    All stations
   (Dist 7, 4.2)
    Unaffected
   (Dist 7, 2.3)
Promoresia
Hemerodromia
Psephenus herricki
Potamanthus
Anculosa subglobosa
Goniobasis carinifera
Anculosa
Goniobasis spinella
Stenonema
Perlesta placida
Ephemerella
Tricorythodes
Hetaerina
Stenelmis
Optioservus
Microcylloepus
Corydalus cornutus
Baetis
Chironomidae
Dubiraphia
Promoresia
Hemerodromia
Psephenus herricki
Potamanthus
Anculosa subglobosa
Goniobasis carinifera
Anculosa
Goniobasis spinella
Stenonema
Microcylloepus
Corydalus cornutus
Chironomidae
Baetis
Promoresia
Hemerodromia
Psephenus herricki
Anculosa subglobosa
Pleuroceridae
Sphaerium
Goniobasis spinella
Heptagenia
Hetaerina
Acroneuria
Perlesta placida
Ephemerella
Tricorythodes
Ephoron
Microcylloepus
Hemerodromia
Hetaerina
Promoresia
Pleuroceridae
Sphaerium
Psephenus herricki
Anculosa subglobosa
Goniobasis spinella
Acroneuria
Ephemerella
Corydalus cornutus

-------
                                 TABLE 38  (continued)
  Affected                    Affected                    All stations                Unaffected
(Dist 7, 1.8)               (Dist 7, 5.3)               (Dist 7, 4.2)               (Dist 7, 2.3)

                        Perlesta placida              Potamanthus
                        Ephemerella
                        Tricorythodes
                        Hetaerina                     Stenonema
                        Acroneuria                    Anculosa
                                                      Goniobasis carinifera

-------
            TABLE 39.   RESULTS OF R-MODE CLUSTER ANALYSIS OF SJ AND SSM,
               CLINCH RIVER STATIONS UNAFFECTED BY THE 1970 pH STRESS

CATEGORY OF STATIONS INCLUDED (COEFFICIENT,  SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)

          Unaffected (SSM, 0.89)                  Unaffected (SJ, 0.87)

          Stenelmis                               Stenelmis
          Optioservus                             Optioservus
          Dubiraphia                              Dubiraphia
          Ephemerella                             Ephemerella
          Baetis                                  Baetis
          Hydropsyche                             Hydropsyche
          Psephenus herricki                      Psephenus herricki
          Goniobasis spinella                     Goniobasis spinella

          Stenonema                               Stenonema
          Isonychia                               Isonychia
          Goniobasis carinifera                   Goniobasis carinifera
          Acroneuria                              Acroneuria
          Tricorythodes                           Tricorythodes

          Chironomidae                            Chironomidae
          Corydalus cornutus                      Corydalus cornutus
          Cheumatopsyche                          Cheumatopsyche
          Potamanthus                             Potamanthus

          Anculosa                                Anculosa
          Anculosa subglobosa                     Anculosa subglobosa

-------
TABLE 40.  TROPHIC-FUNCTIONAL CODES FOR TAXA CLUSTERED BY R-MODE CLUSTER ANALYSIS OF
 JACCARD'S (SJ) AND SIMPLE-MATCHING (SSM) COEFFICIENTS, CLINCH RIVER DATA SET, 1970

 CATEGORY OF STATIONS INCLUDED (COEFFICIENT, SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)
Affected
(SJ, 0.88)


3524
4613


2224
3411
4713










Affected
(SSM, 0.88)

3411
3524
4613


2224
3411
4713








2312
3424
All stations
(Sj, 0.77)
2224
3411
3524
4613
4624


3411
4713
2224
3424



3411
3424

2313
4613
Affected
(SSM, 0.71)
2224
3411
3524
4613
4624
4713



2224
3424


2312
3411
3424
4613
2212
3411
Affected
(Sj, 0.58)
2224
3411
3524
4613
4624
4713



2224
2312
3424
4613

3411
3424




-------
                TABLE 41.  TROPHIC-FUNCTIONAL CODES FOR TAXA CLUSTERED BY R-MODE CLUSTER ANALYSIS
                                OF DIST 7 COEFFICIENTS CLINCH RIVER DATA SET, 1970

              CATEGORY OF STATIONS INCLUDED (COEFFICIENT, SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)
Affected
(Dist 7, 1.8)
3411
4713

2312
2612
3411
3412
3424
4613
4624






Affected
(Dist 7, 5.3)
3411

2312
2612
3411
3412
3424
3524
4613
4624
4713





All stations
(Dist 7, 4.2)
3411
3424
3524
4613

2212
2312
2612
3411
3412
3424
4613
4624
4713
3411
3424
Unaffected
(Dist 7, 2.3)
2212
3411
3412
4613
4624
4713

3424
4613








-------
  TABLE 42.  TROPHIC-FUNCTIONAL CODES FOR TAXA
     CLUSTERED BY R-MODE CLUSTER ANALYSIS OF
    SIMPLE-MATCHING (SSM) AND JACCARD'S (SJ)
      COEFFICIENTS, CLINCH RIVER, 1970


         Unaffected(a) (SJ(b), 0.87(c) and SSM(b), 0.89(c))

                        2224
                        3411
                        3412
                        3424
                         2224
                         2312
                         3411
                         3424
                         4624

                         2224
                         2612
                         3524
                         4613

                         3411
a Stations unaffected by spill.
b Clustering coefficient utilized.
c Similarity level at which clusters formed.

-------
TABLE 43.  RESULTS OF R-MODE CLUSTER ANALYSIS OF SJ AND SSM AFTER REORDERING CLUSTERS
              ACCORDING TO TROPHIC-FUNCTIONAL CODES, CLINCH RIVER, 1970

CATEGORY OF STATIONS INCLUDED (COEFFICIENT,  SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)

Affected
(SJ, 0.88)
Optioservus
Stenelmis
Chironomidae
Corydalus cornutus
Cheumatopsyche
Hydropsyche
Dubiraphia
Microcylloepus
Hemerodromia

Affected
(SSM, 0.88)
Optioservus
Stenelmis
Microcylloepus
Chironomidae
Corydalus cornutus
Cheumatopsyche
Hydropsyche
Dubiraphia
Hemerodromia

All stations
(Sj, 0.77)
Cheumatopsyche
Hydropsyche
Dubiraphia
Optioservus
Stenelmis
Chironomidae
Corydalus cornutus
Acroneuria
Microcylloepus
Promoresia
Hemerodromia
Isonychia
Baetis
Affected
(SSM, 0.71)
Cheumatopsyche
Hydropsyche
Dubiraphia
Microcylloepus
Optioservus
Promoresia
Stenelmis
Chironomidae
Corydalus cornutus
Acroneuria
Hemerodromia
Isonychia
Baetis
Affected
(Sj, 0.58)
Cheumatopsyche
Hydropsyche
Dubiraphia
Microcylloepus
Optioservus
Promoresia
Stenelmis
Goniobasis spinella
Chironomidae
Corydalus cornutus
Acroneuria
Hemerodromia
Isonychia
Tricorythodes
                                                                                Baetis
                                                                                Hetaerina

-------
                                                    TABLE 43 (continued)
  Affected          Affected          All stations      Affected          Affected
 (SJ, 0.88)        (SSM, 0.88)        (SJ, 0.77)       (SSM, 0.71)       (SJ, 0.58)
                                                Goniobasis carinifera
                                                Ephemerella
                                             Tricorythodes
                                             Goniobasis carinifera
                                             Ephemerella
                                             Hetaerina
                                                Goniobasis  carinifera
                                                Ephemerella
                         Tricorythodes
                         Hetaerina
                   Tricorythodes
                   Hetaerina
                                                                          Simulium
                                                                          Anculosa

-------
TABLE 44.  RESULTS OF R-MODE CLUSTER ANALYSIS OF DIST 7 COEFFICIENTS AFTER REORDERING
CLUSTERS ACCORDING TO TROPHIC-FUNCTIONAL CODES, CLINCH RIVER, 1970

CATEGORY OF STATIONS INCLUDED  (COEFFICIENT, SIMILARITY LEVEL AT WHICH CLUSTERS FORMED)
      Affected
    (Dist 7, 1.8)
      Affected
    (Dist 7, 5.3)
    All stations
   (Dist 7, 4.2)
    Unaffected
   (Dist 7, 2.3)
Promoresia
Hemerodromia
Tricorythodes
Potamanthus
Anculosa
Anculosa subglobosa
Goniobasis carinifera
Goniobasis spinella
Psephenus herricki
Ephemerella
Stenonema
Hetaerina
Perlesta placida
Optioservus
Stenelmis

Tricorythodes
Potamanthus
Dubiraphia
Microcylloepus
Promoresia
Anculosa
Anculosa subglobosa
Goniobasis carinifera
Goniobasis spinella
Psephenus herricki
Baetis
Ephemerella
Stenonema
Chironomidae
Corydalus cornutus
Hetaerina
Acroneuria
Perlesta placida
Hemerodromia
Microcylloepus
Baetis
Chironomidae
Corydalus cornutus

Sphaerium
Tricorythodes
Ephoron
Potamanthus
Promoresia
Anculosa subglobosa
Goniobasis spinella
Pleuroceridae
Psephenus herricki
Ephemerella
Heptagenia
Hetaerina
Acroneuria
Perlesta placida
Hemerodromia
Sphaerium
Microcylloepus
Promoresia
Anculosa subglobosa
Goniobasis spinella
Pleuroceridae
Psephenus herricki
Hetaerina
Acroneuria
Hemerodromia
                                                                                        Ephemerella
                                                                                        Corydalus cornutus

-------
                     TABLE 45.   CLUSTERS OF TAXA DEFINED BY R-MODE CLUSTER ANALYSIS OF JACCARD'S AND THE
              	SIMPLE-MATCHING COEFFICIENTS,  CLINCH RIVER, 1970	

              ~                  Unaffected  stations                      Affected stations


                                  Stenelmis                              Stenelmis
                                  Optioservus                           Optioservus
                                  Dubiraphia
                                  Ephemerella                           Hydropsyche
                                  Baetis                                 Cheumatopsyche
                                  Hydropsyche                           Dubiraphia
                                  Psephenus herricki
                                  Goniobasis  spinella                    Isonychia
                                                                        Baetis
                                  Stenonema
                                  Isonychia                              Goniobasis carinifera
                                  Acroneuria                             Ephemerella
                                  Tricorythodes
                                  Chironomidae                          Chironomidae
                                  Corydalus  cornutus                    Corydalus  cornutus
                                  Cheumatopsyche
                                  Potamanthus                           Tricorythodes
                                                                        Hetaerina
                                  Anculosa
                                  Anculosa subglobosa

-------
                                  SECTION 6

               ORDINATION--NONMETRIC MULTIDIMENSIONAL SCALING
6.1  GENERAL DESCRIPTION

     Ordination is an analytical technique often used when dealing with large
sets of data.  In aquatic ecology, ordination involves plotting either the
samples or the species found in the samples in a two- or three-dimensional
diagram.  The choice of axes on which the points are plotted depends on the
method of ordination selected.  In most ordinations, either samples are
plotted in a space where the axes are species or species are plotted in a
space defined by the samples.  The primary advantage of ordination over
cluster analysis is that it "allows one to examine a scatter diagram
displaying a summary of the structure of the data without having to first
assume that clusters are present" (Rohlf, 1970).  In this sense, ordination is
a powerful alternative to cluster analysis.

     Several techniques of ordination are available, including principal
component ordination, principal coordinate ordination, the Bray-Curtis polar
ordination technique (Whittaker, 1975), and nonmetric multi-dimensional
scaling (MDS) (Kruskal, 1964a and 1964b).  Only nonmetric MDS is dealt with in
this study because it seems ideally suited to the kinds of data obtained from
biological surveys.  Specifically, nonmetric MDS is computationally robust
when data are missing; it can accommodate quantitative, ranked, or
presence-absence data; and it can be used with any kind of measure of
correlation, similarity, or distance.

6.2  ANALYTICAL PROCEDURES

     In nonmetric MDS, similarities or distances between samples are treated
on the ordinal scale; that is, they are ordered from smallest to largest.  A
configuration is then found in which "rank order of (ratio scaled) distances
best produces the original input ranks.  One tries to do this in the lowest
dimensionality that produces a 'close enough' ordinal fit" (Green and
Carmone, 1970).  The important point is that the MDS program operates on
ranked similarities and distances rather than on actual similarities.  Thus,
nonmetric MDS is perfectly applicable to matrices of similarity and distance
computed from ranked data or even presence-absence data.

     For a specified number of dimensions chosen in advance by the
investigator, the computer programs "try to find a configuration of points
whose interpoint distances are monotone—that is, have the same (or possibly
the inverse) ranks as the input data" (Green and Carmone, 1970).  The
coordinates of this new configuration are the values used in the ordination.
In practice, perfect configurations are unusual.  The measure of departure
from monotonicity is called stress.  The higher the stress, the less nearly
perfect the degree of monotonicity.  In general, the stress decreases with the
number of dimensions and the number of iterations used in the analysis.
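
     The stress measure described above can be illustrated with a short
sketch.  It is not the program used in this study; it assumes Kruskal's
stress formula 1 and fits the monotone values with the pool-adjacent-violators
method.  The function names, the three-sample dissimilarities, and the two
trial configurations are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def pool_adjacent_violators(values):
    """Least-squares non-decreasing fit to `values` (monotone regression)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while neighbouring block means are out of order.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def kruskal_stress(dissimilarity, points):
    """Stress of a configuration: departure of its distances from monotonicity.

    dissimilarity: dict mapping sample pairs (i, j), i < j, to input values
    points: list of coordinate tuples, one per sample
    """
    # Only the rank order of the input dissimilarities is used.
    pairs = sorted(combinations(range(len(points)), 2),
                   key=lambda pair: dissimilarity[pair])
    d = [dist(points[i], points[j]) for i, j in pairs]
    d_hat = pool_adjacent_violators(d)
    numerator = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    denominator = sum(a ** 2 for a in d)
    return sqrt(numerator / denominator)


# Hypothetical dissimilarities among three samples.
diss = {(0, 1): 1.0, (1, 2): 3.0, (0, 2): 5.0}
monotone_config = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]   # same rank order
reversed_config = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0)]   # inverted rank order

stress_monotone = kruskal_stress(diss, monotone_config)
stress_reversed = kruskal_stress(diss, reversed_config)
```

     In this sketch the first configuration preserves the rank order of the
input dissimilarities and has zero stress; the second does not and has a
positive stress.  An MDS program would adjust the coordinates iteratively to
reduce this value.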


-------
6.3  RESULTS

6.3.1  Q-Mode Analysis—Clinch River Data

6.3.1.1  Presence-Absence Data—
     Nonmetric MDS was tested as an alternative to cluster analysis with two
kinds of data: presence-absence and quantitative.  Figure 36 shows
the Q-mode ordination of presence-absence data from 36 zoomacrobenthic samples
from the Clinch River.  In this three-dimensional display, one large, fairly
tightly grouped set of 16 samples lies near the right-center portion of the
diagram, indicating the faunal similarity of these samples.  This group of
samples is quite similar to the first cluster in Table 31, which shows the
results of cluster analysis of the matrix of Jaccard's coefficients (SJ).
Samples outside the main group in Figure 36 are from station 8, six and eight
weeks after the spill; station 9, four, six, and eight weeks after the spill;
and station 10, two weeks after the spill.

     Samples collected at stations 8, 9, and 10 after the spill tended to form
individual groupings near the upper center of Figure 36.  For example, samples
collected at station 9 four, six, and eight weeks after the spill grouped
together.  Since the biological communities at these stations were exposed to
different pH conditions for varying lengths of time, they were expected to
stand alone, not to cluster together.

     Samples from substation 7RB did not form any distinct pattern but were
scattered along the left side of Figure 36.  This agreed with the results
obtained from Q-mode clustering, where samples from 7RB did not cluster at the
0.62 level of similarity (see Table 31).  Samples from 7LB&MS, collected
six and eight weeks after the spill, were also widely separated.  The lack of
similarity of these samples with the other 34 samples also agreed with the
Q-mode clustering results presented in Table 31.

6.3.1.2  Quantitative Data (Species Counts)—
     When species counts were used in nonmetric MDS, the 36 zoomacrobenthic
samples appeared to be randomly scattered (Figure 37).  Only one recognizable
group of stations was present.  This cluster, near the upper left margin of
the diagram, consisted primarily of samples from substation 7RB.  Two samples
from 7LB&MS, collected six and eight weeks after the spill, were also in this
cluster, indicating that the left bank and midstream sections of station 7
were similar to the right bank.

     The results of cluster analysis of the distance coefficient Dist 7
(Table 34) were compared with Figure 37 to determine any similarities between
clustering and the ordination of quantitative data.  At a level of similarity
of 4.8, three clusters were formed in the Dist 7 dendrogram: (1) 11 samples
from either the control stations (stations 4 and 11) or stations 8, 9, and 10
before the spill;  (2) 20 samples from stations 7  through 10, including station
7LB&MS; and (3) samples from stations 8 and 9, four weeks after the spill, and
station 10, two weeks after the spill.   Figure 37 has only one poorly defined
group of nine stations:  station 8, immediately after the spill and two and
eight weeks after the spill; station 9, immediately after the spill and two,
six, and eight weeks after the spill; station 7LB&MS, immediately after the
spill and two weeks after the spill; and station  11, six weeks after the
spill.   It could be equated with the second cluster in Table 34, but only  in a

-------
[Figure 36 legend, sample numbers by collection date: 27-32, before spill;
33-36, immediately after spill; 47-50, 2 weeks after; 51-56, 4 weeks after;
58-61, 6 weeks after; 62-67, 8 weeks after.  NOTE: 7R = 7RB; 7LM = 7LB&MS.]

        Figure 36.  Three-dimensional ordination by nonmetric multidimensional scaling computed from distance
                    coefficients based on presence-absence data, showing faunal similarities between samples
                    collected from the Clinch River in 1970.

-------
[Figure 37 legend, sample numbers by collection date: 27-32, before spill;
33-36, immediately after spill; 47-50, 2 weeks after; 51-56, 4 weeks after;
58-61, 6 weeks after; 62-67, 8 weeks after.  NOTE: 7R = 7RB; 7LM = 7LB&MS.]

        Figure 37.  Three-dimensional ordination by nonmetric multidimensional scaling computed from distance
                    coefficients based on counts of species, showing faunal similarities between samples
                    collected from the Clinch River in 1970.

-------
general sense.   No other clusters were found in Figure 37 that would compare
favorably with those noted in Table 34, indicating that cluster analysis
tended to group stations that nonmetric MDS did not.

-------
                                  SECTION 7

                              DIVERSITY INDICES
 7.1  GENERAL DESCRIPTION

     The third kind of analysis of community structure considered was the use
 of indices of species diversity and indices of diversity at higher taxonomic
 levels.  A diversity index is a statistic that combines information on both
 the number of species in an assemblage and the evenness of the distribution of
 individuals among those species (Pielou, 1969).  A high index of species
 diversity results from the presence of many species with nearly even
 abundances; a low index results from a few species in an assemblage dominated
 by one species.  A diversity index of intermediate value can result either
 from a few species with nearly even distributions or from many species with
 uneven distributions.  Therein lies the method's greatest weakness: a given
 value of species diversity can result from assemblages with quite
 different distributions of species.  A second weakness is that, unlike cluster
 analysis and ordination, which operate on similarity coefficients, diversity
 indices are not affected by the species present, only by their numbers.  Thus,
 although quite different communities can be compared, equal diversity does not
 imply equal tolerance to potential environmental impacts.
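
     The first weakness can be shown with a small numerical sketch.  It uses
Simpson's index in the form D = 1 - sum(p_i^2), described in section 7.2; the
two assemblages and the function name are hypothetical and were chosen only
so that the arithmetic is exact.

```python
from fractions import Fraction


def simpson_diversity(counts):
    """Simpson's diversity D = 1 - sum(p_i**2), using exact fractions."""
    total = sum(counts)
    return 1 - sum(Fraction(n, total) ** 2 for n in counts)


# Hypothetical assemblages (counts of individuals per species).
even_pair = [3, 3]        # two species with equal abundances
uneven_trio = [4, 1, 1]   # three species, one of them dominant

d_even = simpson_diversity(even_pair)
d_uneven = simpson_diversity(uneven_trio)
print(d_even, d_uneven)   # both print as 1/2
```

     The two assemblages differ in both richness and evenness yet receive
the same index value.  Because the function sees only counts, relabeling the
species would not change the result either, which is the second weakness
noted above.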

     Failure to recognize the importance of these two inherent weaknesses has
 led to widespread misuse of diversity indices and misunderstanding of the
 concept of diversity.  As a result, an extensive literature exists on the
 disadvantages of their use and the drawbacks of the concept (see, e.g.,
 Hamilton, 1975; Hedgpeth, 1973).  Diversity indices remain, however, as one
 tool available for ecologists.  Like any tool they can be misused, and getting
 the most out of them requires an understanding of their limitations.  Like
 many other tools, they are most valuable when used in conjunction with other
 information or tools.  The concept of diversity has been widely applied in all
 branches of ecology (Woodwell, 1970), and its use for evaluating potentially
 impacted communities in applied aquatic ecology is particularly noteworthy.
 Former EPA administrator R. E. Train (1973) has stated that "for top
 management and general public policy development, monitoring data must be
 shaped into easy-to-understand indices that aggregate data into understandable
 forms.   Failure to do so will result in suboptimum achievement of goals at
much greater expense."  Whitten (1975) has concluded that "the most suitable
means of analyzing community structure for the purpose of pollution assessment
 appears to be the diversity index."

     Although most of the theory of diversity of communities has been based  on
 species diversity,  applied ecologists often have to enumerate organisms at
higher taxonomic levels  because of the difficulty of identifying species (see
earlier discussions,  section 5.3.2.1) and,  consequently,  the high cost of
doing so.   Results  of such analyses,  while internally consistent within a
study,  are by no means comparable  from study to study.   Therefore,  the search



-------
 for absolute values of diversity is unrealistic and inconsistent with the
 purpose of applied aquatic ecology.  Nonetheless, in the examples that follow,
 we continue to refer to species diversity as a concept although the diversity
 values are in fact sometimes based on undifferentiated higher taxa, especially
 for such groups as oligochaetes and chironomids.

 7.2  ANALYTICAL PROCEDURES

     Measures of species diversity can be grouped into four general types:
 (1) species richness; (2) indices that assume a particular distribution, for
 example Williams' α index (Fisher et al., 1943); (3) indices measuring
 probability of encounter, such as Simpson's (1949) index and the sequential
 comparison index (SCI) (Cairns et al., 1968); and (4) indices adapted from
 information theory, such as Shannon's index (Shannon and Weaver, 1949),
 Brillouin's index (Brillouin, 1962), and the approximate index (e.g., Wilhm
 and Dorris, 1968).

     Species richness (S) is simply the number of species present in a sample
 or community.  This measure of diversity has the same drawbacks as the use of
 presence-absence data in cluster analysis and ordination.  It makes no
 distinction, for example, between a sample with 1 individual each of species A
 and B, a sample with 1 individual of species A and 1000 of species B, and a
 sample with 1000 of each species.  Moreover, it is highly dependent on sample
 size.
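
     The three samples mentioned above can be written out directly.  This is
a minimal sketch; the dictionary layout and function name are assumptions
made for illustration.

```python
def species_richness(counts):
    """Number of species with at least one individual in the sample."""
    return sum(1 for n in counts.values() if n > 0)


# The three samples from the text: counts of species A and B.
samples = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 1000},
    {"A": 1000, "B": 1000},
]

richness = [species_richness(sample) for sample in samples]
```

     All three samples have a richness of 2, although their abundance
structures are very different.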

     Dennison and Hay (1967) presented an equation that allows one to compute
 the number of individuals one must collect to be certain (with a given
 probability of error) of collecting species that comprise a given proportion
 of the community.

$$N = \frac{\log E}{\log (1 - p)}$$

where E = the probability of error the investigator is willing to accept,
p = the proportion each target species comprises of the community, and N = the
number of individuals to be collected.  For example, to be 95 percent certain
(i.e., 5 percent error: E = 0.05) of collecting species that comprise 1
percent of the community (p = 0.01), one must collect

$$N = \frac{\log 0.05}{\log 0.99} \approx 298$$

individuals.

     The use of this equation shows that rather large samples are needed to
assure that even relatively common species are not missed by chance alone and
casts doubt on the value of species richness as a measure of species diversity
in applied aquatic ecology.
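
     The equation attributed above to Dennison and Hay (1967) is easy to
evaluate for other error levels and proportions.  The sketch below assumes
only the formula N = log(E) / log(1 - p) as given in the text; the function
name is hypothetical, and rounding up to a whole individual is a choice made
here rather than part of the original equation.

```python
from math import ceil, log


def individuals_needed(error, proportion):
    """N = log(E) / log(1 - p); the base of the logarithm cancels out."""
    if not (0 < error < 1 and 0 < proportion < 1):
        raise ValueError("error and proportion must lie strictly between 0 and 1")
    return log(error) / log(1 - proportion)


# 5 percent error, species comprising 1 percent of the community.
n_one_percent = individuals_needed(0.05, 0.01)

# Same error level for more common species (5 and 10 percent of the community).
n_five_percent = individuals_needed(0.05, 0.05)
n_ten_percent = individuals_needed(0.05, 0.10)

# Whole individuals to collect, rounding up (an assumption of this sketch).
whole_individuals = ceil(n_one_percent)
```

     The required sample falls quickly as the target species become more
common, but it remains in the hundreds for species at the 1 percent level.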

     Williams' α index (Fisher et al., 1943) is based on the assumption that
the abundances of organisms in the community being studied fit a logarithmic
series.  Williams (1950) proposed that the parameter of such distributions be
used as an index of diversity of the community.  The parameter α is an
intrinsic property of communities, unaffected by sample size, that is
proportional to the number of species present.
                                     130

-------
          S = a log  (1 + -)
                   e      a

where α = the parameter of the logarithmic series used to measure species
diversity, N = the number of individuals in the sample, and S = the number of
species in the sample.  Thus, although α is a property of the community that
is independent of sample size, since its estimation is based on number of
individuals and species in a sample, the size of the sample is of paramount
importance.  This index can be applied only if species abundance fits a
logarithmic series, which may be impossible to determine when only a few
species are present.  Thus, as Pielou (1969) has pointed out, "α is unsuitable
as a diversity index unless the collection at hand has many species and also
unless abundances form a logarithmic series."
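
     Because the equation for α cannot be solved in closed form, α must be
found numerically from N and S.  The sketch below uses simple bisection; the
function name fisher_alpha, the tolerance, and the example values of N and S
are assumptions made for illustration and are not taken from the study.

```python
import math


def fisher_alpha(n_individuals: int, n_species: int, tol: float = 1e-10) -> float:
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    if not 0 < n_species < n_individuals:
        raise ValueError("need 0 < S < N for a finite positive alpha")

    def residual(alpha: float) -> float:
        # Increasing in alpha: negative near zero, approaching N - S for large alpha.
        return alpha * math.log1p(n_individuals / alpha) - n_species

    lo, hi = 1e-12, 1.0
    while residual(hi) < 0.0:          # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical sample: 1000 individuals distributed among 30 species.
alpha = fisher_alpha(1000, 30)
print(f"alpha = {alpha:.3f}")
```

The calculation returns a value for any N and S; it does not test whether the
abundances actually fit a logarithmic series, which is the condition Pielou
emphasized.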

     Simpson's (1949) index was derived from probability theory and answers
the question "What is the probability that two specimens picked at random from
a community of infinite size are the same species?"  If a species i is present
in the community in the proportion p_i, the probability of selecting two
individuals of species i at random is the joint probability p_i^2.  Simpson's
equation is:

          $$ D = 1 - \sum_{i=1}^{S} p_i^2 $$

where S = the number of species, p_i = the proportion of individuals in the ith
species, and D = the diversity.  As with some of the other indices discussed
below, the correct computation of Simpson's diversity requires a fully
censused community, which never occurs in practical ecology.  Moreover, Krebs
(1972) has noted that Simpson's index gives relatively little weight to rare
species, a characteristic that is undesirable where rare species have great
impact on the community, as is the case with predators.
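
     A short sketch of Simpson's equation follows.  It substitutes sample
proportions for the community proportions p_i, which is an assumption made
here for illustration (the text notes that the true p_i are never known); the
function name simpson_diversity is likewise hypothetical.

```python
def simpson_diversity(counts):
    """Return D = 1 - sum(p_i ** 2), with p_i taken as count_i / total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one individual is required")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# The three two-species samples that species richness cannot tell apart.
for counts in ([1, 1], [1, 1000], [1000, 1000]):
    print(counts, round(simpson_diversity(counts), 4))
```

The second sample, dominated by one species, yields a value near zero, which
also illustrates how little weight the index gives to a rare species.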

     A problem universally associated with the use of diversity indices has
been the time and level of professional expertise required for taxonomic
identification of organisms in the sample.  Cairns et al. (1968) developed the
sequential comparison index (SCI) to overcome this problem.  Given a random
sample of specimens, A_1, A_2, . . . A_N, sequential comparisons are made
between organisms (e.g., A_1 vs. A_2, A_2 vs. A_3), and the number of runs
(sequences of consecutive specimens of the same species) is determined.  The
number of runs divided by (N + 1), where N = the number of organisms, is the
measure of diversity.  Patil and Taillie (1976) have demonstrated that, with a
minor correction for bias, the SCI becomes an unbiased estimator of Simpson's
index derived from probability theory.
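
     The counting of runs described above can be sketched as follows.  The
divisor (N + 1) follows the description given in this section; the function
names and the example sequence are hypothetical.  Specimens need only be
recognized as the same as or different from the preceding specimen, so
arbitrary labels stand in for taxonomic names.

```python
from itertools import groupby


def count_runs(labels):
    """Number of runs, i.e., maximal blocks of consecutive identical labels."""
    return sum(1 for _ in groupby(labels))


def sequential_comparison_index(labels):
    """Runs divided by (N + 1), as the index is described in the text."""
    n = len(labels)
    if n == 0:
        raise ValueError("at least one specimen is required")
    return count_runs(labels) / (n + 1)


# Hypothetical random sequence of nine specimens of three kinds.
specimens = list("AABABBBCA")
print(count_runs(specimens), sequential_comparison_index(specimens))
```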

     Margalef (1956) proposed analysis of mixed-species communities by methods
derived  from information theory.  In this sense,  diversity is equated with the
uncertainty that exists when a  single organism is selected at random from the
community.  The more species present in a community and the more even their
distribution, the greater the uncertainty and the larger the species
diversity.  Information content is a measure of uncertainty and is  a
reasonable measure of diversity.  The most commonly employed diversity indices
derived from information theory are  Shannon's index (H'), Brillouin's index
 (H), and the approximate index  (H").  Pielou (1966,  1967, 1969,  1974,  1975,
1977) has elucidated the theory behind the use of these indices in ecology,
and Kaesler and Herricks (1977) and Kaesler et al. (1978), drawing on her
work, have evaluated their use in applied aquatic ecology.

     Shannon and Weaver (1949) introduced the following equation for the
information content per symbol of a code made up of S different symbols, each
with a probability of occurrence of p_i:

          $$ H' = - \sum_{i=1}^{S} p_i \log_e p_i $$

In an ecological context, S = the number of species in a conceptually infinite
community and p_i = the proportion the ith species comprises of the community.
Note that this equation is not intended for use with sample data but rather
requires knowledge of the proportions each species comprises of the community,
information that is never available to the applied aquatic ecologist.  In
their evaluation of this index of diversity, Kaesler and Herricks (1977)
wrote:  "The primary problem with the use of this equation is that in the real
world it is usually impossible to define or to sample randomly from the
conceptually infinite population.  How, for example,  would one define the
limits of a community occupying the polluted reaches of a stream in which
downstream recovery and accommodation, seasonal change, and recruitment due to
stream drift are important factors?  Moreover, the probabilities of occurrence
or the proportion of each species in the community can never be known, and a
reasonably precise estimation may require unreasonably large samples,
especially for rare species."
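
     For illustration only, Shannon's equation can be evaluated for
hypothetical communities whose proportions p_i are assumed to be known
exactly, which, as noted above, is never the case in practice.  The function
name shannon_index and the example proportions are assumptions.

```python
import math


def shannon_index(proportions):
    """Return H' = -sum(p_i * ln p_i) for known community proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return -sum(p * math.log(p) for p in proportions if p > 0.0)


even = [0.25, 0.25, 0.25, 0.25]        # four equally common species
uneven = [0.97, 0.01, 0.01, 0.01]      # one dominant species, three rare
print(round(shannon_index(even), 3), round(shannon_index(uneven), 3))
```

With the same number of species, the even community has the larger H', in
keeping with the statement that diversity increases with both the number of
species and the evenness of their distribution.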

     Brillouin's equation gives the information content per symbol of a
message and is thus based on samples rather than conceptually infinite
communities:

          $$ H = \frac{1}{N} \log_e \frac{N!}{N_1! \, N_2! \cdots N_S!} $$

In an ecological context, S = the number of species in a sample, N = the number
of organisms in the sample, and N_i = the number of organisms belonging to the
ith species.  Kaesler and Herricks (1977) wrote:  "Brillouin's H is the species
diversity per individual of a collection in which all N specimens have been
assigned to one of s species and counted to give the N_i's.  This equation has
not been popular, partly because the factorials involved often become
astronomically large, but with readily available, high-speed computers, and
tables of logarithms of factorials it need no longer be avoided.  Moreover, it
is important because it, and not Shannon's equation, gives the actual diversity
of a fully censused collection of organisms.  It is not a statistical estimate
but an actual measurement of the diversity of the working ecologist's basic
unit—the sample."
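
     The large factorials mentioned in the quotation need not be formed
explicitly.  The sketch below evaluates Brillouin's equation with the
log-gamma function, using the identity ln(n!) = lgamma(n + 1); this is one
convenient approach and not necessarily the procedure used in the study, and
the function name and example counts are hypothetical.

```python
import math


def brillouin_index(counts):
    """Return H = (1/N) * ln( N! / (N_1! * N_2! * ... * N_S!) )."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one individual is required")
    # ln(n!) computed as lgamma(n + 1), so no factorial is ever formed.
    log_multinomial = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_multinomial / n


# Hypothetical sample: counts of individuals in each of five species.
print(round(brillouin_index([40, 25, 30, 20, 10]), 3))
```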

     The approximate index is given by:

          $$ H'' = - \sum_{i=1}^{S} \frac{N_i}{N} \log_e \frac{N_i}{N} $$

where S = the number of species in a sample and N_i/N = the proportion of the
ith species in the sample.  Kaesler and Herricks (1977) have pointed out that
the approximate index "is the one that is used most often in practice.  It
resembles Shannon's equations, but the probabilities of occurrence of species
have been replaced by their proportion in the sample, N_i/N.  In spite of its
popularity, the approximate index has some serious drawbacks that have not
always been thoroughly appreciated by some of its users.

     "Pielou (1966) has  shown that this equation  can be used for two quite
different  purposes.   Unfortunately, it suits neither purpose particularly
well.  First, it has  been used to estimate Shannon's diversity, H', when the
N_i's are sample values.  H" is, in fact, a maximum likelihood estimator of H',
but  it is  also a biased  estimator.  An appropriate correction term has been
derived, but in order to apply it, it is necessary to know s, the number of
species in the community (in the universe, not the sample) (Basharin, 1959).
It is, of  course, not possible to know the number of species in a real
community  without making a complete census of the community, which, even if it
were possible, would  not be consistent with the purposes of environmental
protection.  Wilhm  (1968) has pointed out that for a finite s the bias
approaches zero as  the number of individuals in the sample, N, approaches
infinity;  and Peet  (1974) has suggested that the bias is small for most
ecological applications.  However, when equal-effort sampling methods are
used, one  is not assured of large samples, particularly from heavily polluted
stations.  Thus, the  very samples in which the applied ecologist is most
interested will in  many  instances be those in which the bias is most severe.

     "Second, H" has  been used as a substitute for Brillouin's H to
approximate  (not estimate in a statistical sense) the reputedly cumbersome H
in order to avoid computing factorials.  This is done when the N_i's are
regarded as population values.  The equation for H"...can be derived from
Brillouin's equation,...but the derivation is valid only when all the N_i's
are...very large indeed...(Pielou, 1969).  The reason for this stipulation of
very large N_i's is that the derivation of [H"]...from Brillouin's
equation...depends on the substitution of N(ln N - 1) for ln N! in...[H].  For
small values of N, this substitution is simply not warranted.  Especially with
equal-effort sampling, one cannot be sure that all the N_i's will be very
large, and again the  low diversity samples that are usually of greatest
interest to applied ecologists will be incorrect—in this case with
artificially high values because H" is always larger than H.

     "Wilhm  (1968) has argued that H seldom applies to ecological measures of
diversity because the N_i's "...are rarely population values and must be
estimated from the sample."  But, as was pointed out earlier, this is entirely
a matter of defining  the population (universe).   If ecological samples are
regarded as fully-censused collections, then H ... gives the exact diversity
not  an estimate or approximate.   Because it is usually not possible to define
the  limits of a population,  especially in stream surveys,  nor to  sample at
random from it,  we judge that it is most appropriate to regard a  sample as a
message from the ecosystem.   That message has an information content or
diversity that is appropriately computed with Brillouin's  equation..."
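
     The difference between H" and H discussed in the quotation can be seen
numerically.  The sketch below computes both indices for a small sample and
for a sample with the same proportions but 1,000 times as many individuals.
Both samples and the function names are hypothetical and serve only to
illustrate that the discrepancy is largest when the N_i's are small.

```python
import math


def brillouin_index(counts):
    """H = (1/N) * ln( N! / prod(N_i!) ), with ln(n!) = lgamma(n + 1)."""
    n = sum(counts)
    return (math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)) / n


def approximate_index(counts):
    """H'' = -sum( (N_i/N) * ln(N_i/N) ) using sample proportions."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


small = [3, 2, 1]                 # hypothetical sample of 6 individuals
large = [3000, 2000, 1000]        # same proportions, 6000 individuals
for counts in (small, large):
    h, h2 = brillouin_index(counts), approximate_index(counts)
    print(counts, f'H = {h:.4f}  H" = {h2:.4f}  difference = {h2 - h:.4f}')
```

In both cases H" exceeds H, but the excess is substantial only for the small
sample, the situation most likely at heavily stressed stations.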

     Pielou  (1967) has shown that diversity indices from information theory
may be studied at each level in the taxonomic hierarchy and that  the
components at various levels are independent and additive.  Thus, for example,
one may study diversity of orders, families within orders, genera within
families, and species within genera.  Furthermore, one may devise other
schemes of classification that provide greater ecological insight than the
taxonomic hierarchy.  In this study two nontaxonomic hierarchical
classifications were used in addition to the taxonomic hierarchy.  The first,
adapted from Cummins' (1973) work, emphasized the trophic-functional (TF) role
of each organism in the stream ecosystem.  The major categories used in the
classification are functional group, feeding mechanism, dependence, and
principal food habit (Table 11).  To adapt this scheme to our needs, a
numerical code was assigned to each designation in a category and a hierarchy
established (Table 12).  The second classification scheme was based on the
observations of Steinmann (1907, 1908) and Hynes (1960, 1970), who related
anatomical and behavioral adaptations of benthic invertebrates to habitat
preference and the organism's ability to obtain food.  This system used three
categories that stressed functional morphology:  head position, H; general
body shape, B; and type and shape of respiratory organs, R (Table 13).

7.3 DIVERSITY OF SAMPLES FROM THE CLINCH RIVER

     Species diversity of 36 samples collected in 1970 from the Clinch River
was computed using Brillouin's H and, for comparison, the approximate index
H".

7.3.1  Species Diversity

     Results of diversity analysis with Brillouin's index (H) are summarized
in Figure 38, in which time in days is plotted along the abscissa and stations
along the ordinate.  The ordinate was complicated somewhat by dividing
station 7 into substations 7RB (right bank) and 7LB&MS (left bank and
midstream).  Both substations were at the same river kilometer, but 7LB&MS was
plotted closer to station 4 because it was unaffected by the spill and
day-to-day operation of the power plant.  Substation 7RB, on the other hand,
was strongly affected by both because the effluent from the power plant flowed
along the right bank.

     H was rounded to one decimal place, values were plotted, and diversity
was contoured at an interval of 0.4 to illustrate geographic and temporal
differences between stations or groups of stations.  The most discernible
differences involved the periods before and after the spill and substations
7LB&MS and 7RB (Figure 38).  The contours of diversity near these samples are
closely spaced, indicating a pronounced change in community structure.

     After the spill, diversity decreased to between 1.6 and 1.8 at the
affected stations.  This decrease was followed by an increase in diversity
over the next 60 days until H equaled or approximated the values found before
the pH stress.  Examination of the samples contained within each contour line
indicated a sequence of recovery for the affected stations.  Biological
recovery through time began at the farthest downstream station and proceeded
upstream.  This is contrary to the sequence postulated by some investigators,
who claim that biological recovery begins at the site closest to healthy
sites.  Although their hypothesis may explain the recovery of some streams, it
does not in this instance.  In this study, biological recovery depended more
on the severity of the initial damage than on a site's proximity to unaffected
tributaries or headwater areas.

Figure 38.  Species diversity (Brillouin's H) of 36 zoomacrobenthic samples
            from the Clinch River, 1970; contour interval 0.4.  [Contour plot
            not reproduced; abscissa: time in days; ordinate: stations.]

     Values of H tabulated in Table 46 form groups that are remarkably similar
to those obtained from clustering presence-absence data with Jaccard's
coefficient (Table 31).  In both analyses, the upstream and downstream control
stations (stations 4 and 11) are similar.  Moreover, samples from stations 8,
9, and 10 collected before the pH stress are similar to the control stations.
After the spill, the most severely impacted sites tended to cluster together,
as evidenced by the cluster containing station 8, immediately after the spill
and two and four weeks later; station 9, immediately after the spill and two
weeks later; and station 10, immediately after the spill.

     Results using the approximate index H" (Figure 39, Table 47) are similar,
but H" appears to depend more on sample size than other investigators have
thought (Wilhm and Dorris, 1968).

7.3.2  Hierarchical Diversity

     It is important to remember that species diversity is a single statistic
that cannot provide a complete picture of community structure.  The statistic
does, however, provide a useful quantitative measure of environmental
conditions, and it has characteristics that allow it to be used as a heuristic
tool.  Pielou (1967, 1969, 1975) has demonstrated the possibility of
partitioning species diversity into additive components that express the
amount of diversity contributed by each level or component of a taxonomic
hierarchy.

     The use of hierarchical diversity as a heuristic tool enables one to
determine whether additional insight can be gained from species-level
determinations in biomonitoring programs.  For example, if only one species of
each genus is found in a community, the component of diversity due to species
within genera equals zero.  Similarly, if only one genus of each family is
found, the component of diversity due to genera within families also equals
zero.  Thus, under some circumstances, costly and time-consuming species
determinations may be of limited value, and environmental assessment can be
based on discrimination of higher taxa rather than identification of lower
ones.
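
     The additive partitioning described above can be sketched for
Brillouin's H.  For this index, the component contributed by one level within
the next higher level equals the difference between H computed with
individuals pooled at the two levels, so the components sum exactly to the
total.  The records, counts, and function names below are hypothetical; the
study's own computations may have been organized differently.

```python
import math
from collections import Counter


def brillouin_index(counts):
    """H = (1/N) * ln( N! / prod(N_i!) ), with ln(n!) = lgamma(n + 1)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    return (math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)) / n


def hierarchical_components(records, n_levels):
    """Additive components of H, from the highest level to the lowest.

    Each record is (level_1, ..., level_k, count), for example
    (order, family, genus, count).
    """
    components, previous = [], 0.0
    for depth in range(1, n_levels + 1):
        pooled = Counter()
        for *taxa, count in records:
            pooled[tuple(taxa[:depth])] += count   # pool counts at this level
        h = brillouin_index(pooled.values())
        components.append(h - previous)            # diversity added by this level
        previous = h
    return components


# Hypothetical sample (order, family, genus, number of individuals).
records = [
    ("Ephemeroptera", "Baetidae", "Baetis", 40),
    ("Ephemeroptera", "Heptageniidae", "Stenonema", 25),
    ("Trichoptera", "Hydropsychidae", "Cheumatopsyche", 30),
    ("Trichoptera", "Hydropsychidae", "Hydropsyche", 20),
    ("Diptera", "Chironomidae", "Polypedilum", 10),
]
order, family, genus = hierarchical_components(records, n_levels=3)
print(f"order {order:.2f}  family {family:.2f}  genus {genus:.2f}  "
      f"total {order + family + genus:.2f}")
```

If the two hydropsychid genera were pooled into one, the generic component
would be zero, which is the situation in which identification below the
family level adds no information to the index.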

     To test the usefulness of hierarchical diversity, the Clinch River data
set was partitioned into the taxonomic categories order, family, and genus.
Diversity was calculated for each substation (left bank, right bank, and
midchannel sections of the stream) (Table 48).  For the control stations,
stations 4 and 11, the component of diversity for each taxonomic category
decreased from order to family to genus.

     At stations affected by day-to-day operation of the power plant and the
acid spill, however, the trend of decreasing diversity within the taxonomic
hierarchy was not found.  At the stations impacted by the  power plant, the
component of diversity for genera within families contributed more to H than
did families within orders.

     In previous studies of the Clinch River, Grossman et al. (1973) noted
that station 8 was chronically stressed by the power plant and was the most
severely impacted station after the pH stress.  Stress not only caused a
reduction in the number of taxa, but it changed the relative abundance of each
species.  As a result, diversity decreased.  At substation 8LB the diversity
component for genera (0.52) before the spill was higher than that for families
(0.33), but less than that for orders (1.44) (total = 2.29).  Immediately
after the spill, the difference between familial and generic diversity was
even more pronounced:  0.57 (genera), 0.00 (families), 0.79 (orders), and
total = 1.36.

Figure 39.  Species diversity (approximate index H") of 36 zoomacrobenthic
            samples from the Clinch River, 1970.  [Contour plot not
            reproduced.]

     This example also suggests that each component of diversity in a
taxonomic hierarchy may be a useful indicator of stressed vs.  unstressed
communities.  Note, however, that the diversity for each category in the
taxonomic hierarchy did not decrease in proportion to the overall decrease in
H.  And, if the Clinch River data set is representative, one can conclude that
in unstressed, healthy ecosystems the combined components of diversity for
higher taxonomic categories (order and family) contribute more to measures of
diversity than genera and species.

     A second test of the usefulness of hierarchical diversity deals with
functional morphology.  If all species in a genus have the same functional
role in the ecosystem, how important is their diversity to the overall
stability of the system when compared with the diversity of other taxa?  To
answer this question, two hierarchical classifications that bypassed the
classical taxonomic hierarchy were developed.

     The first classification was modified from the system proposed by Cummins
(1973) (refer to Tables 11 through 13, section 4.1.3).  It stresses the
functional group, feeding mechanism, dependence, and the principal food habit
of each species.  In adapting it to our needs, numerical codes were assigned to
each designation in a category (Table 12) and a hierarchy was developed that
is admittedly artificial (Tables 11 and 13).  Hierarchical diversity was
computed for five samples or subsamples from the Clinch River  immediately
after the spill (Table 49).  Station 7 was divided into substation 7LB&MS,
unaffected by the spill and day-to-day operation of the power  plant, and
substation 7RB, strongly affected by both.  Stations 8, 9, and 10 were located
in the zone of mixing downstream from the spill site.  Stations 7LB&MS and 10
had similar numbers of taxa, but quite different components of diversity at
all levels.  Other than number of taxa, samples from stations  7RB, 8, 9, and
10 were quite similar.

     In addition to summarizing diversity for various levels in each
hierarchical classification, Table 49 gives the standard deviations.  In this
example, the standard deviation was a convenient measure of scatter of the
diversity components at each level.  For the TF classification, the standard
deviation for the control station was much larger than that for any other
stations.  The high standard deviation found at the control substation
resulted primarily from the large component of diversity contributed by
species within food habit.  Also, the component contributed by dependence
within feeding mechanism was appreciably larger for the control than for the
affected stations.  Substation 7RB had a higher diversity component
contributed by feeding mechanism within functional group (Table 50).
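
     The summary statistics discussed above can be recomputed from the
additive components.  The sketch below uses the components listed in Table 49
for substations 7LB&MS and 7RB and assumes that the tabulated standard
deviation is the sample standard deviation (n - 1 denominator) of the five
components; that assumption is ours, although it appears consistent with the
tabulated values for these two substations.  Percentages computed from the
rounded components may differ slightly from those in Table 50.

```python
import statistics

# Additive components from Table 49 (functional group through species).
components = {
    "7LB&MS": [0.90, 0.36, 0.29, 0.03, 1.16],
    "7RB": [0.70, 0.32, 0.12, 0.00, 0.58],
}


def summarize(values):
    """Return total H, percent contributed by each level, and sample SD."""
    total = sum(values)
    percents = [100.0 * v / total for v in values]
    return total, percents, statistics.stdev(values)


for station, values in components.items():
    total, percents, sd = summarize(values)
    print(station, f"H = {total:.2f}", [round(p, 1) for p in percents],
          f"SD = {sd:.2f}")
```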

     The second classification (HBR) resulted from the work of Steinmann
(1907, 1908) and Hynes (1960, 1970), who related anatomical and behavioral
adaptations of benthic invertebrates to conditions found in lotic
environments.  Three categories stress functional morphology,  i.e., an
organism's functional role in the ecosystem and its habitat preference as
reflected by its morphology:  head position, body shape, and type and shape of
respiratory organs (refer to Table 13).

     The relative contribution of each component of diversity did not vary
appreciably from sample to sample as one proceeded from one category to the
next (Table 51).  More importantly, each component within the hierarchy
contributed to the overall diversity, suggesting that the HBR
(head-body-respiratory) classification scheme warrants further investigation.

     In the hierarchical diversity study, different classification schemes
were used.  Hynes (1970) stated, "It should be borne in mind that these are,
to a considerable extent, artificial, and that any one species or higher taxon
may display several types of adaptation.  Alas, some of the phenomena to be
discussed are not strictly adaptation; they are changes brought about by life
in this peculiar environment."  Although it is still too early to discern
whether some of the observed trends are real, it appears that classifications
emphasizing functional relationships merit further consideration and that
hierarchical diversity is a useful heuristic technique.

-------
          TABLE 46.  SPECIES DIVERSITY (BRILLOUIN'S H) OF 36
          ZOOMACROBENTHIC SAMPLES FROM THE CLINCH RIVER, 1970
Date
       7LB&MS
         7RB
              10
       11
                       Group I—H values >2.1
Before pH stress
Immediately after
  pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
X
X

X
X
X
X
X
X
X
                                                  X

                                                  X
                                  X
               X
                   Group II—H values >1.7 to <2.1
Before pH stress
Immediately after
  pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
                   X
                   X
                  X
                  X
                                  X
                                  X
               X
               X
               X
                   Group III—H values >0.9 to <1.7
Before pH stress
Immediately after
  pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
                   X
                   X
                   X
                  X
                  X
                  X
       X
       X
                                          X

-------
       TABLE 47.  SPECIES DIVERSITY  (APPROXIMATE INDEX H") OF 36
          ZOOMACROBENTHIC SAMPLES FROM THE CLINCH RIVER, 1970
Date
7LB&MS
7RB
       10
        11
                       Group I—d values >3.0
Before pH stress    X       X
Immediately after
  pH stress                 X
2 weeks after               X
4 weeks after       X       X
6 weeks after               X
8 weeks after       X       X
            X

            X



            X
         X
         X
         X
        X
        X
        X

        X

        X
                  Group II—d values >1 to <3.0
Before pH stress
Immediately after
  pH stress
2 weeks after
4 weeks after
6 weeks after
8 weeks after
            X
            X
            X
         X
         X
         X
X
X
X
X
X
X
X

-------
TABLE 48.  HIERARCHICAL TAXONOMIC DIVERSITY (BRILLOUIN'S H) OF
      ZOOMACROBENTHIC SAMPLES FROM THE CLINCH RIVER, 1970

                                       Component
Sample   Period                 Order    Family   Genus       H
4 RB     Before pH stress       1.19     0.97     0.21       2.37
4 LB     Before pH stress       1.55     0.56     0.39       2.50
4 MS     Before pH stress       1.32     0.42     0.32       2.06
4 RB     4 weeks after          1.32     0.75     0.55       2.62
4 LB     4 weeks after          1.28     0.59     0.32       2.19
4 MS     4 weeks after          1.42     0.54     0.44       2.40
4 RB     8 weeks after          1.30     0.63     0.24       2.17
4 LB     8 weeks after          1.20     0.64     0.33       2.17
4 MS     8 weeks after          1.37     0.39     0.35       2.11
7 RB     Before pH stress       1.41     0.47     0.14       2.02
7 LB     Before pH stress       1.37     0.92     0.41       2.70
7 MS     Before pH stress       1.09     1.09     0.25       2.43
7 RB     Immediately after      1.21     0.08     0.50a      1.79
7 LB     Immediately after      1.20     0.96     0.29       2.45
7 MS     Immediately after      1.27     0.40     0.52a      2.19
7 RB     2 weeks after          0.63     0.06     0.53a      1.22
7 LB     2 weeks after          1.35     0.76     0.33       2.44
7 MS     2 weeks after          1.37     0.41     0.45       2.23
7 RB     4 weeks after          0.82     0.10     0.40a      1.32
7 LB     4 weeks after          1.21     0.74     0.31       2.26
7 MS     4 weeks after          1.34     0.43     0.34       2.11
7 RB     6 weeks after          0.60     0.12     0.17a      0.89
7 LB     6 weeks after          0.83     0.99     0.26       2.08
7 MS     6 weeks after          1.31     0.61     0.19       2.11
7 RB     8 weeks after          1.20     0.18     0.34a      1.72
7 LB     8 weeks after          1.26     0.87     0.24       2.37
7 MS     8 weeks after          1.35     0.72     0.22       2.29
8 RB     Before pH stress       1.37     0.29     0.53a      2.19
8 LB     Before pH stress       1.44     0.33     0.52a      2.29
8 MS     Before pH stress       1.39     0.42     0.54a      2.35

-------
TABLE 48 (continued)

                                       Component
Sample   Period                 Order    Family   Genus       H
8 RB     Immediately after      0.87     0.22     0.44       1.33
8 LB     Immediately after      0.79     0.00     0.57a      1.36
8 MS     Immediately after      0.84     0.07     0.52a      1.43
8 RB     2 weeks after          0.61     0.05     0.95a      1.61
8 LB     2 weeks after          0.62     0.04     0.94a      1.60
8 MS     2 weeks after          0.49     0.01     0.89a      1.39
8 RB     4 weeks after          1.32     0.75     0.55       2.62
8 LB     4 weeks after          1.28     0.59     0.32       2.19
8 MS     4 weeks after          1.42     0.54     0.44       2.40
8 RB     6 weeks after          0.56     0.05     0.62a      1.23
8 LB     6 weeks after          1.01     0.12     0.65a      1.78
8 MS     6 weeks after          0.86     0.07     0.64a      1.57
8 RB     8 weeks after          1.19     0.25     0.42a      1.86
8 LB     8 weeks after          1.33     0.33     0.17       1.83
8 MS     8 weeks after          1.15     0.16     0.40a      1.71
9 RB     Before pH stress       1.41     0.33     0.64a      2.38
9 LB     Before pH stress       1.40     0.69     0.38       2.47
9 MS     Before pH stress       1.21     0.19     0.44a      1.84
9 RB     Immediately after      0.67     0.03     0.65a      1.35
9 LB     Immediately after      0.43     0.13     0.83a      1.39
9 MS     Immediately after      0.85     0.06     0.71a      1.62
9 RB     2 weeks after          0.28     0.03     0.75a      1.06
9 LB     2 weeks after          0.97     0.09     0.81a      1.87
9 MS     2 weeks after          0.71     0.01     0.80a      1.52
9 RB     4 weeks after          0.97     0.14     0.69a      1.80
9 LB     4 weeks after          0.90     0.14     0.53a      1.57
9 MS     4 weeks after          1.07     0.17     0.60a      1.84
9 RB     6 weeks after          1.39     0.12     0.39a      1.90
9 LB     6 weeks after          1.29     0.20     0.44a      1.93
9 MS     6 weeks after          1.31     0.05     0.16a      1.52
9 RB     8 weeks after          1.38     0.17     0.50a      2.05
9 LB     8 weeks after          1.58     0.31     0.14       2.03
9 MS     8 weeks after          1.56     0.24     0.10       1.90

-------
TABLE 48 (continued)

                                       Component
Sample   Period                 Order    Family   Genus       H
10 RB    Before pH stress       1.20     0.29     0.38a      1.87
10 LB    Before pH stress       1.46     0.62     0.31       2.39
10 MS    Before pH stress       1.32     0.36     0.35       2.03
10 RB    Immediately after      1.11     0.04     0.44a      1.59
10 LB    Immediately after      1.25     0.06     0.48a      1.79
10 MS    Immediately after      0.86     0.03     0.37a      1.26
10 RB    2 weeks after          1.19     0.18     0.40a      1.77
10 LB    2 weeks after          1.20     0.20     0.33a      1.73
10 MS    2 weeks after          1.33     0.39     0.33       2.05
10 RB    4 weeks after          1.28     0.19     0.35a      1.82
10 LB    4 weeks after          1.15     0.20     0.33a      1.68
10 MS    4 weeks after          1.14     0.11     0.34a      1.59
10 RB    6 weeks after          1.36     0.15     0.50a      2.01
10 LB    6 weeks after          1.24     0.19     0.34a      1.77
10 MS    6 weeks after          1.23     0.22     0.27a      1.72
10 RB    8 weeks after          1.42     0.24     0.46a      2.12
10 LB    8 weeks after          1.63     0.35     0.29       2.27
10 MS    8 weeks after          1.36     0.39     0.16       1.91
11 RB    Immediately after      1.34     0.71     0.24       2.29
11 LB    Immediately after      1.12     0.78     0.26       2.16
11 MS    Immediately after      1.24     0.51     0.35       2.10
11 RB    4 weeks after          1.17     0.49     0.36       2.02
11 LB    4 weeks after          1.22     0.46     0.39       2.07
11 MS    4 weeks after          1.30     0.57     0.37       2.24
11 RB    8 weeks after          1.37     0.85     0.17       2.38
11 LB    8 weeks after          1.42     0.63     0.18       2.23
11 MS    8 weeks after          1.33     0.67     0.26       2.26

a  The generic component of diversity contributed more to H than the
   familial component of diversity.

-------
            TABLE 49.  COMPONENT OF SPECIES DIVERSITY (H) AT EACH LEVEL IN THE TROPHIC-FUNCTIONAL HIERARCHY
                   FOR FIVE SAMPLES OR SUBSAMPLES COLLECTED IMMEDIATELY AFTER THE ACID SPILL ON THE
                                                  CLINCH RIVER, 1970

                                              Additive component
                                 Feeding
                                 mechanism    Dependence   Food
                                 within       within       habit        Species
                    Functional   functional   feeding      within       within                Standard
Station    Taxa     group        group        mechanism    dependence   food habit    H       deviation
7LB&MS      44      0.90         0.36         0.29         0.03         1.16          2.74    0.47
7RB         12      0.70         0.32         0.12         0.00         0.58          1.72    0.30
8           28      0.82         0.09         0.04         0.00         0.65          1.60    0.38
9           24      0.78         0.12         0.02         0.03         0.74          1.69    0.39
10          39      0.89         0.09         0.03         0.01         0.54          1.65    0.38

-------
TABLE 50.  PERCENT OF SPECIES DIVERSITY (H) CONTRIBUTED AT EACH LEVEL IN THE TROPHIC-FUNCTIONAL
      HIERARCHY FOR FIVE SAMPLES OR SUBSAMPLES COLLECTED IMMEDIATELY AFTER THE ACID SPILL
                                   ON THE CLINCH RIVER, 1970

                                              Additive component
                                 Feeding
                                 mechanism    Dependence   Food
                                 within       within       habit        Species
                    Functional   functional   feeding      within       within
Station    Taxa     group        group        mechanism    dependence   food habit
7LB&MS      44      32.7         13.0         10.6         1.0          42.6
7RB         12      40.9         18.2          6.9         0.0          34.0
8           28      51.2          5.9          2.3         0.0          40.6
9           24      46.0          7.0          1.4         1.9          43.7
10          39      53.9          5.6          7.6         0.1          32.8

-------
TABLE 51.   COMPONENT OF SPECIES DIVERSITY (H) AT EACH LEVEL IN THE HEAD-BODY-RESPIRATORY
        FUNCTIONAL MORPHOLOGY HIERARCHY FOR FIVE SAMPLES OR SUBSAMPLES COLLECTED
               IMMEDIATELY AFTER THE ACID SPILL ON THE CLINCH RIVER, 1970

                                  Additive component
                        Body shape    Respiratory    Species
                        within        organ          within
           Head         head          within         respiratory              Standard
Station    position     position      body shape     organ            H       deviation
7LB&MS     0.67         1.29          0.84           0.61             2.74    0.87
7RB        0.63         0.29          0.36           0.44             1.72    0.47
8          0.70         0.19          0.09           0.62             0.60    1.39
9          0.69         0.19          0.08           0.73             1.69    0.44
10         0.64         0.43          0.09           0.46             1.62    0.40

-------
                                  SECTION 8

                           SUMMARY AND DISCUSSION
8.1  INTRODUCTION

     Different methods of analysis were used with varying results.  The
purpose of this section is to evaluate the results obtained, to assess the
different levels of interpretability,  and to make general recommendations
about the application of these methods and the results one can expect from
them.

8.2  NATURE OF THE ECOSYSTEMS FROM WHICH DATA BASES WERE SELECTED

     A point that must be stressed is  that the nature of the ecosystem and the
nature of the data collected from the  ecosystem determine both the methods to
be used in analyzing the data and the  ease with which the results can be
interpreted.  For example, the Clinch River is a shallow, rapidly flowing
stream with a normally diverse benthic community that was adversely impacted
by an acute pH stress.  Numerous large samples were collected at closely-timed
intervals both before and after the acute pH stress.  Like many streams, it
could be regarded as a linear system,  in which contemporaneous samples from
adjacent stations are expected to be similar.  The Cumberland River, on the
other hand, was studied in an impounded area with deep, slowly moving water
that received chronic thermal stress.   Although both high- and low-flow
conditions were studied, the velocity  was never as great as the Clinch River.
Moreover, although temperatures in the range of the Cumberland River are
common in nature and adapted to by numerous warm-water organisms, the low pH
stress received by the Clinch River was a shock that few organisms were able
to tolerate.  Finally, the Cumberland  River was not a linear system, and the
thermal plume (potential stress) moved upstream during times of low flow.

     The depth of water in the Clinch River was much less than the depth in
the Cumberland River, where benthic stations were located at depths varying
from 2.5 to 9 meters.  Samples from the Clinch River contained more
individuals and were more diverse than samples from the Cumberland River.
This difference stems in part from more extensive sampling in the Clinch and
in part from real differences between the two ecosystems.  The relative
uniformity of the substratum sampled in the Cumberland River provided fewer
data from fewer habitats and niches.  Depth reduced the impact of the stress
in the Cumberland, whereas all macroinvertebrates in the Clinch River
downstream from the site of the acid spill were exposed to the low pH shock.
The net effect of the differences between the ecosystems and of the kind and
degree of stress was that the Clinch River ecosystem was more understandable
and definable than the Cumberland River system.

     While major environmental changes are evident with coarse methods of
analysis, subtle impacts may go undetected even with elaborately designed
studies.  This is especially true when unexpected factors, such as upstream
flow of the plume, inhibit the effectiveness of the analytical protocol
established for the study.  In our examples, the Clinch River was studied in
greater detail than necessary to determine initial impact and recovery.  The
Cumberland River was probably not sampled adequately in view of the problem
caused by upstream flow.  The important point is that it is virtually
impossible to detect most subtleties or the lack of them before the fact.
Thus, the adequacy of a sampling program may be in doubt until after some of
the data have been analyzed.  Preliminary studies are too rarely conducted.  As
a result, errors of judgment may be made that limit the value of the study or
seriously overextend resources.

 8.3  METHODS

 8.3.1  Relationships Between  Methods

     Cluster analysis and ordination of data are intended to serve the same
purpose when used in the Q-mode, i.e., to express similarities between samples
from different stations.  Similarities between samples are based on their
faunal content, with species given equal weight a priori.  Because cluster
analysis will force samples into clusters whether or not clusters exist in
nature, it may be unsuited for analysis of strictly linear (e.g., riverine)
systems.  When numerous samples are collected through time, however, as was
done in the study of the Clinch River, the strict linearity of the system is
overcome by the imposition of temporal trends.  In such cases, cluster
analysis may be a useful analytical tool.  For river-reservoir systems, such
as the Cumberland River, where overbank areas and channel areas may have high
intracorrelations regardless of their position along the river, cluster
analysis may be entirely appropriate.

     Throughout this report we have stressed cluster analysis and have
deemphasized ordination because the application of cluster analysis to
problems of applied aquatic ecology, and the interpretation of its results,
are much better established.  Specifically, procedures have not been developed
and tested for limiting the size of data sets to be studied by ordination and
for excluding rare species that may be present in some samples and absent from
others due to chance alone.  Although somewhat more difficult to interpret,
ordination methods such as nonmetric multidimensional scaling can show
everything cluster analysis shows without some of the inherent disadvantages.
Ordination methods merit further investigation by ecologists.

     Analysis of species diversity serves a different purpose from cluster
analysis and ordination.  Indices of species diversity do not consider which
species are present.  They are based on the number of species present and the
evenness of the distribution of individuals among species.  Thus, species
diversity is a measure of community structure that is independent of the
particular community being studied.  It is tempting to compute indices of
diversity for numerous communities and to compare the state of health of the
ecosystems by comparing indices.  Three aspects of communities suggest that
this approach must be used with caution, if at all.  First, not all natural
communities have high diversity.  For example, undisturbed communities of
macroinvertebrates from rapidly flowing streams are more diverse than those
from slower moving streams and reservoirs.  The difference in diversity is not
a difference in health of the ecosystems, although it may bear on their
resiliency.  Second, undisturbed upstream communities may be as diverse as
undisturbed downstream ones, but the faunas may be quite different.  Third, a
diverse community that is tolerant of heated water or other pollutional stress
may become established.  Such a community may be described as perched if it
is faunally different from upstream communities that would ordinarily
contribute to repopulation of the area.  An incremental increase in
pollutional stress or the temporary removal of stress may decimate the perched
community and leave it degraded, since no source of stress-tolerant species
exists upstream.  None of these three points can be considered adequately by
the rote comparison of species diversity alone.

8.3.2  Data

     The kind and nature of data dictate the type of procedures that should be
used for a study, the conclusions that can be drawn, and the confidence one
has in those conclusions.  The most important aspect of planning any
ecological study is to make sure the data collected are adequate for answering
the question at hand.  Sampling must have a well-defined purpose, and the more
clearly that purpose is stated, the greater the likelihood that the data will
be adequate for it.  In Section 7 we dealt briefly with problems that may
arise from collecting too small a sample.  Other aspects of sampling design
should also be considered to develop sampling protocols that produce data of
the desired quality and quantity.

8.3.2.1  Presence-Absence Data--
     The use of presence-absence data has one serious drawback:  all
information contributed by the differential abundances of organisms is lost.
For some purposes, however, the benefits to be gained may outweigh the
disadvantages.  First, presence-absence data can be obtained much more
quickly, and large samples can be processed at less expense.  Second,
depending on the purpose of the study, presence-absence data may give all the
information needed.  It was also determined that quantitative data collected
during a cursory survey or poorly documented field work should be used only as
presence-absence data.  Note also that a wide selection of coefficients of
similarity is available for comparing stations and assembling groups of
species by cluster analyses.

     Of the numerous coefficients available for cluster analyses of
presence-absence data, two were tested:  Jaccard's coefficient, which omits
negative matches, and the simple matching coefficient, which regards negative
matches (the absence of a species from two stations) as contributing to
similarity.  Both coefficients range from 0 to 1.  In practice, however,
Jaccard's coefficients are lower than simple matching coefficients because
data from aquatic environmental surveys are often characterized by a large
number of species with few individuals that differ from station to station.
As a result, mutual absences (negative matches) may indicate more similarity
between samples than warranted.  To minimize this effect we recommend
Jaccard's coefficient except for those rare cases where mutual absences
clearly result from the same cause.  If, however, a species is absent from one
station because the temperature is too high and absent from another because
the substrate is unsuitable, the simple matching coefficient may cause an
investigator to draw an erroneous conclusion.  If Jaccard's coefficients are
uniformly low, one should be wary of randomness or absence of species by
chance alone due to sampling error.
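
     The contrast between the two coefficients can be illustrated with a
short sketch.  The station records below are hypothetical and the function
names are ours, not those of the programs used in this study; the formulas
are the usual definitions, Jaccard = a / (a + b + c) and simple matching =
(a + d) / (a + b + c + d), where a is the number of joint presences, b + c
the number of mismatches, and d the number of joint absences.

```python
def match_counts(a, b):
    """Return (joint presences, mismatches, joint absences) for two 0/1 lists."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    neither = sum(1 for x, y in zip(a, b) if not x and not y)
    mismatch = len(a) - both - neither
    return both, mismatch, neither


def jaccard(a, b):
    """Jaccard's coefficient: joint absences are ignored."""
    both, mismatch, _ = match_counts(a, b)
    denom = both + mismatch
    return both / denom if denom else 0.0


def simple_matching(a, b):
    """Simple matching coefficient: joint absences count as agreement.

    Assumes the two lists are non-empty and of equal length.
    """
    both, mismatch, neither = match_counts(a, b)
    return (both + neither) / (both + mismatch + neither)


# Hypothetical presence-absence records for ten species at two stations.
station_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
station_b = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]

print(jaccard(station_a, station_b))          # 2 / (2 + 2) = 0.5
print(simple_matching(station_a, station_b))  # (2 + 6) / 10 = 0.8
```

     In this invented case six of the ten species are absent from both
stations, and those mutual absences alone raise the simple matching
coefficient well above Jaccard's coefficient.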

-------
8.3.2.2  Quantitative Data--
     Quantitative data usually consist of the numbers of individuals per
species in a sample.  Sometimes not all groups of organisms are distinguished
at the species level, so the counts may be expressed as numbers of individuals
at higher taxonomic levels.  Specifically, oligochaetes and chironomids are
difficult, costly, and time consuming to identify to species and are often
lumped.  Quantitative data may also be reported as ranked abundances or as
proportions of each species or higher taxon in a sample.

     Individual species counts are sometimes transformed before analyses are
performed.  Transformations are usually intended to reduce the quantitative
impact of highly abundant species.  Ranking data is a kind of transformation
that is roughly equivalent to using a logarithmic transformation.  Either
procedure is drastic.  The logarithmic transformation (base 10), for example,
changes abundances of 1, 10, 100, and 1000 to 0, 1, 2, and 3.  The square-root
transformation, $\sqrt{y + 0.5}$, is less drastic, changing abundances of 1,
10, 100, and 1000, respectively, to 1.22, 3.24, 10.02, and 31.63.  This
transformation worked well with all three data sets studied.  Finally, counts
may be standardized, whereby a data matrix of t samples (columns) and n
species (rows) is operated on such that each element in a row has the row mean
subtracted from it and is divided by the row standard deviation to give a new
row mean of 0 and a new row standard deviation of 1.  This procedure is
effective in removing inordinate effects of species that are very abundant
throughout a study.  Use of data with species abundances expressed as
proportions was not examined in this report.  Because of problems that may
arise with proportions, the arcsine transformation, $\arcsin\sqrt{p}$, is
recommended for proportional data.  We suggest that use of proportional data
be tested in the future.
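
     The transformations described above can be written out as a brief
sketch.  The function names are illustrative only.  We assume a base-10
logarithm (which reproduces the values 0, 1, 2, and 3 quoted above) and the
sample (n - 1) standard deviation for standardization; the text does not say
which form of the standard deviation was used.

```python
import math
from statistics import mean, stdev


def log_transform(y):
    """Base-10 logarithm of a count (undefined for a count of zero)."""
    return math.log10(y)


def sqrt_transform(y):
    """Square-root transformation sqrt(y + 0.5)."""
    return math.sqrt(y + 0.5)


def arcsine_transform(p):
    """Arcsine transformation arcsin(sqrt(p)) for a proportion 0 <= p <= 1."""
    return math.asin(math.sqrt(p))


def standardize(row):
    """Rescale one species (row) to mean 0 and standard deviation 1.

    Assumption: sample standard deviation; a constant row cannot be rescaled.
    """
    m, s = mean(row), stdev(row)
    if s == 0:
        raise ValueError("row has zero variance and cannot be standardized")
    return [(x - m) / s for x in row]


counts = [1, 10, 100, 1000]
print([round(log_transform(y), 2) for y in counts])   # [0.0, 1.0, 2.0, 3.0]
print([round(sqrt_transform(y), 2) for y in counts])  # [1.22, 3.24, 10.02, 31.63]

# Hypothetical counts of one species across five samples.
z = standardize([3, 12, 7, 40, 18])
print(round(mean(z), 6), round(stdev(z), 6))
```

     The two printed lists reproduce the worked values given in the text; the
last line confirms that the standardized row has mean 0 and standard
deviation 1 to within rounding.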

     Two coefficients for use in cluster analysis of quantitative data were
tested:  the Pearson product-moment correlation coefficient and Sokal's
taxonomic distance coefficient.  In general, the correlation coefficient shows
high similarities between samples with species present in the same relative
proportions, whereas the distance coefficient shows high similarity (low
distance) between samples with species present in the same absolute
abundances.  Either coefficient produced useful results with data transformed
by the square-root transformation and unstandardized.  Of course, since the
two coefficients measure different things, the results were different; and, in
general, cluster analysis of correlation coefficients proved to be more
readily interpretable.
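
     The different behavior of the two coefficients can be seen in a small
sketch with hypothetical counts.  The taxonomic distance is written here as
the root mean squared difference over species, which is our reading of
Sokal's coefficient and should be treated as an assumption; the sample
values and function names are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def taxonomic_distance(x, y):
    """Root mean squared difference over species (assumed form)."""
    n = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)


# Hypothetical abundances of four species in three samples.
sample_1 = [4, 16, 36, 64]
sample_2 = [8, 32, 72, 128]   # same proportions as sample_1, doubled counts
sample_3 = [5, 15, 37, 63]    # nearly the same absolute counts as sample_1

print(pearson(sample_1, sample_2), taxonomic_distance(sample_1, sample_2))
print(pearson(sample_1, sample_3), taxonomic_distance(sample_1, sample_3))
```

     Samples 1 and 2 are perfectly correlated yet far apart in distance,
whereas samples 1 and 3 are close in distance; this is the sense in which
the two coefficients measure different things.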

8.3.3  Cluster Analysis

     Cluster analysis was computed in two modes:  Q-mode, which shows
similarities among samples on the basis of their contained species; and
R-mode, which shows similarities among species on the basis of their
distributions and abundances in samples.  Q-mode cluster analysis has been
used much more than R-mode in applied aquatic ecology.  Results of Q-mode
analysis are applicable only to the stream or reservoir from which data have
been collected and clustered.  R-mode analysis, on the other hand, has promise
as a method of comparing faunal associations from stream to stream and from
basin to basin.

     Fortunately, the distortion introduced during clustering can be assessed
by the coefficient of cophenetic correlation (r).  Cluster analysis not
accompanied by such a coefficient is suspect, and we have used an r of at
least 0.8 as the criterion for consideration in data interpretation.  Once it
has been determined that distortion is acceptably low, the overall shape and
the levels of clustering of a dendrogram provide useful information about the
data being clustered and the reliability of interpretations.  Ideally a
dendrogram should have several good, tight clusters that are distinct from
other clusters or samples in the dendrogram (see, e.g., Figures 11 and 15).
Such dendrograms have sufficient structure that they may be easily
interpreted, depending, of course, on the samples that make up the clusters.
An example of a dendrogram in which the similarities are all rather low and of
about the same value is Figure 12.  One interpretation is that similarities
among samples are actually low and more or less equal throughout the study.  A
common cause of such clusters, however, is randomness.  In Figure 12, for
example, all species were included in the data matrix, and no attempt was made
to remove rare species that were present in some samples and, perhaps, absent
from others by chance alone due to sampling error.  When such data are used
with Jaccard's coefficient, which tends to give low similarities because it
ignores negative matches, the resulting similarities can have a large,
unknowable component of randomness.  Another cause of the shape of such
dendrograms can be lack of natural clusters.  Recall that cluster analysis
forces samples into clusters whether or not such clusters exist in nature.  If
the system sampled is strictly linear with sequential similarities such as
might be expected in a stream, the resultant dendrogram can lack structure.
(To use cluster analysis with such data is analogous to trying to logically
subdivide a piece of string.  In such cases ordination is called for as an
alternative to cluster analysis.)  Other dendrograms have a stair-stepped
arrangement and shape (see, e.g., Figure 31).  Such shape usually implies the
lack of real clusters in nature and again calls for ordination.
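
     The cophenetic correlation discussed above can be sketched in a few
lines.  The sketch assumes average-linkage (unweighted pair-group) clustering
of a similarity matrix, which may differ in detail from the program used in
this study; the matrix and function names are hypothetical.  The cophenetic
value of a pair of samples is the similarity level at which the two are first
joined in the dendrogram, and the coefficient is the ordinary correlation
between those values and the original similarities.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cophenetic_similarities(sim):
    """Average-linkage clustering; return {(i, j): joining level} for i < j."""
    clusters = {i: [i] for i in range(len(sim))}
    coph = {}
    while len(clusters) > 1:
        best = None
        # Find the two clusters with the highest mean between-cluster similarity.
        for a, b in combinations(sorted(clusters), 2):
            values = [sim[i][j] for i in clusters[a] for j in clusters[b]]
            level = sum(values) / len(values)
            if best is None or level > best[0]:
                best = (level, a, b)
        level, a, b = best
        # Every pair split between the two clusters is joined at this level.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = level
        clusters[a] = clusters[a] + clusters.pop(b)
    return coph


def cophenetic_correlation(sim):
    """Correlation between original and cophenetic similarities."""
    coph = cophenetic_similarities(sim)
    pairs = sorted(coph)
    return pearson([sim[i][j] for i, j in pairs], [coph[p] for p in pairs])


# Hypothetical similarity matrix: samples 0-1 and 2-3 form two tight clusters.
sim = [
    [1.00, 0.90, 0.20, 0.30],
    [0.90, 1.00, 0.25, 0.20],
    [0.20, 0.25, 1.00, 0.80],
    [0.30, 0.20, 0.80, 1.00],
]
print(cophenetic_correlation(sim) >= 0.8)  # True: little distortion here
```

     With two well-separated clusters, as in this invented matrix, the
dendrogram reproduces the similarity matrix closely and the coefficient is
near 1; a matrix with no natural clusters would give a lower value.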

     In our examples, we have usually chosen a single level of similarity for
comparison of all clusters in a dendrogram.  In practice, there is no
compelling reason why more than one level cannot be chosen, with a different
level of similarity used in different parts of the dendrogram.  Samples from
the high-gradient, upstream portion of a stream, for example, may generally be
more similar to each other than samples from the low-gradient, downstream
portion.  Similarly, samples collected in winter may generally be more similar
to each other than samples collected in summer.  As with analysis of species
diversity, a thorough understanding of the ecosystem and a grasp of the
questions to which answers are being sought are required, not mere cookbook
application of complex quantitative analytical techniques.

     In some dendrograms, especially those with low overall similarities, a
number of samples may be left unclustered at any reasonable level of
similarity.  In Figure 19, for example, six of the lower eight samples in the
dendrogram are unclustered.  Lack of clustering can result from real
dissimilarity or can be an artifact of clustering.  One should refer to the
original similarity matrix to check the values and, if necessary, go back to
the data matrix to see which species contributed to the anomalous lack of
clustering of some samples.

     Cluster analysis is best for expressing similarities at the tips of
dendrograms (smaller pair groups, higher similarity levels) and is less
reliable for interpretation of intercluster similarities (lower similarity
levels).  Reliability is lost because the clustering algorithm averages
similarities at each step in the clustering procedure.  If one is interested
in intergroup similarities, one is well advised to use principal component
ordination, which was not considered in this report.

8.3.4  Ordination

     As was pointed out earlier, ordination has been little used in applied
aquatic ecology, and our evaluation of it here is mainly in terms of how well
it agrees with the results of cluster analysis.  In general, however,
ordination is a more widely applicable technique than cluster analysis because
no a priori assumptions need be made about clusters in the data.  Such
ordination techniques as principal component analysis are well suited for
showing intercluster similarities rather than fine details of intracluster
similarities.  Nonmetric multidimensional scaling, the technique tested here,
is a compromise between principal component analysis and cluster analysis.  We
have found that when no structure exists in a data matrix due to small sample
sizes or randomness, nonmetric multidimensional scaling is quick to show it by
producing a totally meaningless and uninterpretable ordination.  We recommend
that more work be done with ordination of other data sets to test the
applicability of nonmetric multidimensional scaling to data from aquatic
surveys.

8.3.5  Species Diversity and Hierarchical Diversity

     Because species diversity is meant to provide a different measure of
community structure than cluster analysis or ordination, we suggest that it be
used in conjunction with the other methods.  Indices of species diversity have
been widely misused, and a vast literature discusses the disadvantages of
using them.  The principal difficulty stems from the nonuniqueness of a given
value of species diversity.  Thus, a moderate value of species diversity may
result from a wide range of community structures.  Communities with many
species with few individuals each, a moderate number of species with a
moderate number of individuals each, and few species with many individuals
each could all give the same value of species diversity, depending on the
sampling and evenness of distribution of individuals among species.  With such
latitude, the method must of course be used with care.  In particular, the
search for absolute, global values of species diversity to indicate healthy or
damaged ecosystems is wrong in principle.  Nevertheless, species diversity of
samples from within a stream, reservoir, or drainage basin can be compared,
especially if used with care and in conjunction with cluster analysis or
ordination.
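
     The nonuniqueness of a diversity value can be shown with a short sketch.
We use Shannon's formula with natural logarithms purely for illustration;
the index and logarithm base used elsewhere in this report may differ, and
the species names and counts below are hypothetical.

```python
import math


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over the nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Three hypothetical communities of 100 individuals each.
community_a = {"mayfly_1": 25, "mayfly_2": 25, "caddisfly_1": 25, "stonefly_1": 25}
community_b = {"midge_1": 25, "midge_2": 25, "worm_1": 25, "snail_1": 25}
community_c = {"sp1": 55, "sp2": 21, "sp3": 9, "sp4": 5,
               "sp5": 4, "sp6": 3, "sp7": 2, "sp8": 1}

h_a = shannon(community_a.values())
h_b = shannon(community_b.values())
h_c = shannon(community_c.values())
print(round(h_a, 2), round(h_b, 2), round(h_c, 2))  # 1.39 1.39 1.38
```

     Communities A and B share no species yet have identical diversity, and
community C, with twice as many species distributed unevenly, has nearly the
same value.  The index alone cannot distinguish among them.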

     Hierarchical diversity has been little used in applied aquatic ecology
except by authors of this report and their coauthors.   We recommend that it be
further explored because of its value in showing ways  in which the high cost
of aquatic environmental surveys can be reduced.
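
     The additive character of hierarchical diversity, as tabulated in the
additive components of Table 51, can be sketched for a two-level hierarchy.
Shannon's formula is used here as a stand-in, since with it the components
sum exactly to the total; the computations in this report may have used a
different information measure.  The functional groups, species labels, and
counts are hypothetical, and each species is assumed to appear only once.

```python
import math
from collections import defaultdict


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over the nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def hierarchical_components(records):
    """Split total diversity into a group component and a within-group one.

    records: (group, species, count) tuples, one per species.
    Returns (H among groups, abundance-weighted mean H of species within groups).
    """
    by_group = defaultdict(list)
    for group, _species, count in records:
        by_group[group].append(count)
    total = sum(sum(c) for c in by_group.values())
    h_groups = shannon([sum(c) for c in by_group.values()])
    h_within = sum((sum(c) / total) * shannon(c) for c in by_group.values())
    return h_groups, h_within


# Hypothetical sample classified into invented functional groups.
sample = [
    ("collector", "sp1", 40), ("collector", "sp2", 10),
    ("scraper", "sp3", 25), ("scraper", "sp4", 5),
    ("predator", "sp5", 20),
]
h_groups, h_within = hierarchical_components(sample)
h_total = shannon([count for _, _, count in sample])
print(round(h_groups, 3), round(h_within, 3), round(h_total, 3))
```

     The first two printed values sum to the third.  When the within-group
component is small, identification below the group level adds little
information, which is the basis for the cost savings suggested above.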

-------
                                 REFERENCES


Barr, A. J., J. H. Goodnight, J. P. Sall, and J. T. Helwig.  1976.  A User's
     Guide to SAS-76.  SAS Institute, Inc., Raleigh, North Carolina.
     Sparks Press.  329 pp.

Basharin, G.  P.   1959.  On a Statistical Estimate for the Entropy of a
     Sequence of Independent Random Variables.   Theory Probab.   Its Appl.
     4:333-36.

Brillouin, L.  1962.  Science and Information Theory.  2nd ed.  New York:
     Academic Press.  347 pp.

Buchanan, R.  J.  and B. Lighthart.  1973.  Indicator Phytoplankton Communities:
     A Cluster Analysis Approach.  B. C. Prov.  Mus.  Nat.  Hist.  Anthropol. Rep.
     6:1-10.

Cairns, John, Jr., D. W. Albaugh, F. Busey, and M. D. Chanay.  1968.  The
     Sequential Comparison Index--A Simplified Method for Nonbiologists to
     Estimate Relative Differences in Biological Diversity in Stream Pollution
     Studies.  J. Water Pollut. Control Fed.  40:1607-1613.

Cairns, J., Jr., and R. L. Kaesler.   1969.  Cluster Analysis of Potomac River
     Survey Stations Based on Protozoan Presence-Absence  Data.   Hydrobiol.
     34:414-32.

Cairns, J., Jr., and R. L. Kaesler.   1971.  Cluster Analysis of Fish in a
     Portion of the Upper Potomac River.  Trans.  Am. Fish Soc.  100:750-56.

Cairns, J., Jr., R. L. Kaesler,  and R.  Patrick.  1970. Occurrence and
     Distribution of Diatoms and Other Algae in the Upper Potomac River.
     Nat. Acad.  Nat. Sci. Philadelphia  436:1-12.

Cairns, J., Jr., G. R. Lanza,  and B. C. Parker.  1972. Pollution Related
     Structural and Functional Changes in Aquatic Communities with Emphasis on
     Freshwater Algae and Protozoa.   Proc. Acad.  Nat. Sci. Philadelphia
     124(5):79-127.

Cheetham, A.  H.  and J. E. Hazel.   1969.  Binary (Presence-Absence)  Similarity
     Coefficients.   J. Paleontol.  43(5):1130-36.

Crossman, J. S., J. Cairns, Jr., and R. L. Kaesler.  1973.  Aquatic
     Invertebrate Recovery in the Clinch River Following Hazardous Spills and
     Floods.  Va. Polytech. Inst. State Univ. Water Resour. Res. Cent.
     Bull.  63:1-56.

Crossman, J. S., R. L. Kaesler, and J. Cairns, Jr.  1974.  The Use of Cluster
     Analysis in the Assessment of Spills of Hazardous Materials.  Am.
     Midl. Nat.  92(1):94-114.

Cummins, K. W.  1973.  Trophic Relations of Aquatic Insects.  Annu.
     Rev. Entomol.  18:183-206.

Dennison, J. M. and W. W. Hay.  1967.  Estimating the Needed Sampling
     Area for Subaquatic Ecologic Studies.  J. Paleont.  41:706-708.

Environmental Protection Agency.  1976.  Federal Interagency Energy/
     Environment Research and Development Program, Status Report II.
     Office of Energy, Minerals, and Industry, Office of Research and
     Development, Washington, DC, 1976.

Farris, J. S.  1969.  On the Cophenetic Correlation Coefficient.  Syst.
     Zool.  18:279-85.

Fisher, R. A., A. S. Corbet, and C. B. Williams.  1943.  The Relation
     Between the Number of Species and the Number of Individuals in a
     Random Sample of an Animal Population.  J. Anim. Ecol.  12:42-58.

Forbes, S. A.  1907.  On the Local Distribution of Certain Illinois
     Fishes:  An Essay in Statistical Ecology.  Bull. Ill. State Lab.
     Nat. Hist.  7:273-303.

Green, P. E. and F. J. Carmone.  1970.  Multidimensional Scaling
     and Related Techniques in Marketing Analysis.  Boston, Mass.:
     Allyn and Bacon, Inc.  203 pp.

Green, R. H., Jr.  1979.  Sampling Design and Statistical Methods for
     Environmental Biologists.  Wiley-Interscience, New York, 257 pp.

Hamilton, M. A.  1975.  Indexes of Diversity and Redundancy.  J. Water
     Pollut. Control Fed.  47:630-32.

Hedgpeth, Joel.  1973.  Temperature Relationships of Near Shore Oceanic
     and Estuarine Communities.  In Effects and Methods of Control of
     Thermal Discharges, pp. 1271-1431.  Serial No. 93-14, Pt. 3,
     Washington, DC:  U.S. Govt. Printing Office.

Hoaglin, David C. and Roy E. Welsch.  1975.  MIT-SNAP:  An Interactive Data
     Analysis System.  MIT.  60 pp.

Hurlbert, S. H.  1971.  The Nonconcept of Species Diversity:  A Critique
     and Alternative Parameters.  Ecology  52:577-86.

Hutchinson, G. E.  1957.  Concluding Remarks, Cold Spring Harbor Symposium.
     Quant. Biol.  22:415-27.

Hynes, H.B.N.  1960.  Biology of Polluted Waters.  Liverpool:
     Liverpool Univ. Press.  202 pp.

Hynes, H.B.N.  1970.  The Ecology of Running Waters.   Toronto:
     Univ. of Toronto Press.   555 pp.

Jaccard, P.  1908.  Nouvelles Recherches sur la Distribution Florale.
     Soc. Vaudoise Sci. Nat. Bull.  44:233-70.

Kaesler, R. L.  1970.  The Cophenetic Correlation Coefficient in Paleo-
     Ecology.  Bull. Geol. Soc.  Am.   81:1261-66.

Kaesler, R. L. and J. Cairns, Jr.  1972.  Cluster Analysis of Data
     from Limnological Surveys of the Upper Potomac River.  Am. Midl.
     Nat.  88:56-67.

Kaesler, R. L.,  J. Cairns, Jr.,  and J. M. Bates.  1971.   Cluster
     Analysis of Noninsect Macroinvertebrates of the Upper Potomac
     River.  Hydrobiol.  37(2):173-81.

Kaesler, R. L.,  J. Cairns, Jr.,  and J. S. Crossman.  1974.  Redundancy
     in Data from Stream Surveys.  Water Res.  8:637-42.

Kaesler, R. L. and E. E. Herricks.  1977.  Analysis of Data from
     Biological Surveys of Streams:  Diversity and Sample Size.  Water
     Res. Bull.  13(1):125-35.

Kaesler, R. L.,  E. E. Herricks,  and J. S. Crossman.  1978.  Uses of
     Indices of Diversity and Hierarchical Diversity in Stream Surveys.
     In ASTM Symposium on Quantitative and Statistical Analyses of
     Biological Data for the Assessment of Water and Wastewater Quality,
     Minneapolis, Minn. (June 20-21, 1977).

Kolkwitz, R. and M. Marsson.  1908.  Ecology of Plant Saprobia.  Rep.
     Ger. Bot. Soc.  26a:505-19.

Kolkwitz, R. and M. Marsson.  1909.  Ecology of Animal Saprobia.  Int.
     Rev. Hydrobiol. Hydrogeogr.  2:126-52.

Krebs, C. J.  1972.  Ecology:  The Experimental Analysis of Distribution
     and Abundance.  New York:  Harper and Row.  694 pp.

Kruskal, J. B.  1964a.  Multidimensional Scaling by Optimizing Goodness
     of Fit to a Nonmetric Hypothesis.  Psychometrika.  29:1-27.

Kruskal, J. B.  1964b.  Nonmetric Multidimensional Scaling:  A Numerical
     Method.  Psychometrika.  29:115-29.

Margalef, R.  1956.  Información y Diversidad Específica en las Comunidades
     de Organismos.  Invest. Pesq.  3:99-106.

Parker, B. C. and B. L. Turner.   1961.  Operational Niches and Community-
     Interaction Values as Determined from In Vitro Studies of Some Soil Algae.
     Evolution.   15(2):228-238.

Patil, G. P. and C. Taillie.   1976.  Ecological Diversity:  Concepts,
     Indices, and Applications.   Proc. 9th Int. Biom. Conf.  2:383-411.

Patrick, R.  1961.  A Study of the Numbers and Kinds of Species Found
     in Rivers in Eastern United States.   Proc. Acad.  Nat.  Sci.
     Philadelphia  113(10):215-58.

Patrick, R.  1967.  Natural and Abnormal  Communities of Aquatic Life
     in Rivers.  Bull. S. C. Acad. Sci.  29:19-28.

Peet, R. K.  1974.  The Measurement of Species Diversity.  Annu. Rev. Ecol.
     Syst.  5:285-307.

Pennak, R. W.  1971.  Toward a Classification of Lotic Habitats.
     Hydrobiol.  30:321-334.

Pielou, E. C.  1966a.  The Measurement of Diversity in Different Types
     of Biological Collection.  J. Theor. Biol.  13:131-44.

Pielou, E. C.  1967.  The Use of Information Theory in the Study of the
     Diversity of Biological Populations.  Proc. 5th Berkeley Symp. Math.
     Stat. Probab.  4:163-77.

Pielou, E. C.  1969.  An Introduction to  Mathematical Ecology.   New York:
     John Wiley & Sons.  286 pp.

Pielou, E. C.  1974.  Population and Community Ecology:  Principles
     and Methods.  New York:  Gordon and  Breach.  424 pp.

Pielou, E. C.  1975.  Ecological Diversity.  New York:  Wiley-Interscience.
     165 pp.

Pielou, E. C.  1977.  Mathematical Ecology.  Wiley-Interscience, New
     York, 311 pp.

Roback, S. S., J. Cairns, Jr., and R. L. Kaesler.  1969.  Cluster
     Analysis of Occurrence and Distribution of Insect Species in a
     Portion of the Potomac River.  Hydrobiol.  34:484-502.

Rohlf, F. J.  1970.  Adaptive Hierarchical Cluster Schemes.  Syst. Zool.
     19:58-82.

Rohlf, F. J.  1972.  An Empirical Comparison of Three Ordination
     Techniques in Numerical Taxonomy.  Syst. Zool.  21:271-280.

Shannon, C. E. and W. Weaver.  1949.  The Mathematical Theory of
     Communication.  Urbana:  Univ. of Illinois Press.

Shannon, E. E.  1970.  Eutrophication-Trophic State Relationships in
     North and Central Florida Lakes.  Ph.D. Thesis, Univ.  of Florida.
     258 pp.

Shelford, V. E.  1915.  Principles and Problems of Ecology as Illustrated
     by Animals.  J. Ecol.  3:1-23.

Simpson, E. H.  1949.   Measurement of Diversity.   Nature  163:688.

Simpson, G. G.  1960.  Notes on the Measurement of Faunal Resemblance.
     Am. J. Sci.  258a:300-11.

Sneath, P.H.A. and R. R. Sokal.  1973.  Numerical Taxonomy.
     San Francisco:  W. H. Freeman and Co.  573 pp.

Sokal, R. R.  1961.  Distance as a Measure  of Taxonomic Similarity.
     Syst. Zool.  10:70-79.

Sokal, R. R. and F. J. Rohlf.  1962.  The Comparison of Dendrograms  by
     Objective Methods.  Taxon  11(2):33-40.

Sokal, R. R. and P.H.A. Sneath.  1963.  Principles of Numerical Taxonomy.
     San Francisco:  W. H.  Freeman and Co.   359 pp.

Soukup, J. F.  1970.  Fish Kill # 70-025, Clinch River, Carbo,
     Russell County.  Unpublished.

Steinmann, P.  1907.  Die Tierwelt der Gebirgsbache:  Eine
     Faunistisch-Biologische Studie.  Ann.  Biol.  Lacustre  2:30-150.

Steinmann, P.  1908.  Die Tierwelt der Gebirgsbache.  Arch. Hydrobiol.
     3:266-73.

Stephenson, W.  1972.   The Use of Computers in Classifying Marine
     Bottom Communities.  Oceanogr. South Pac.  31:463-73.

Stephenson, W. and M.C.L. Dredge.  1976.  Numerical Analysis of Fish
     Catches from Serpentine Creek.  Proc.  R. Soc.  87:33-43.

Stephenson, W., Y. I. Raphael, and S. D. Cook.  1976.  The Macrobenthos
     of Bramble Bay, Moreton Bay, Queensland.  Mem. Qd. Mus.  17(3):425-47.

Stephenson, W., W. T. Williams, and S. D. Cook.  1972.  Computer Analysis
     of Petersen's Original Data on Bottom Communities.  Ecol. Monogr.
     42(4):387-415.

Train, Russell E.  1973.  Address to the National Conference on Managing
     the Environment.  U.S. EPA.

Whittaker, R. H.  1975.  Communities and Ecosystems.  2nd ed.  New York:
     Macmillan.  pp. 124-25.

Whitton, B. A., ed.  1975.  River Ecology.  Berkeley:  Univ. of Calif.
     Press.  725 pp.

Wilhm, J.  L.  1968.   Use of Biomass Units in  Shannon's Formula.  Ecology
     49:153-156.

Wilhm, J. L. and T. C. Dorris.  1968.  Biological Parameters for Water
     Quality Criteria.  Bioscience  18(6):477-81.

Williams, C. B.  1950.  The Application of the Logarithmic Series to the
     Frequency of Occurrence of Plant Species in Quadrats.  J.  Ecol.
     38:107-138.

Woodwell, G. M.  1970.  Effects of Pollution on the Structure and
     Physiology of Ecosystems.  Science  168:429-33.

-------
                                  TECHNICAL REPORT DATA
                 (Please read Instructions on the reverse before completing)

 1. REPORT NO.
      EPA-600/7-84-042
 3. RECIPIENT'S ACCESSION NO.
 4. TITLE AND SUBTITLE
      Consolidation of Baseline Information, Development of Methodology, and
      Investigation of Thermal Impacts on Freshwater Shellfish, Insects, and
      Other Biota
 5. REPORT DATE
      March 1984
 6. PERFORMING ORGANIZATION CODE
 7. AUTHOR(S)
      John S. Grossman, James R. Wright, Jr., and Roger L. Kaesler
 8. PERFORMING ORGANIZATION REPORT NO.
      TVA/EP-78/09
 9. PERFORMING ORGANIZATION NAME AND ADDRESS
      Office of Natural Resources
      Tennessee Valley Authority
      Knoxville, Tennessee  37902
10. PROGRAM ELEMENT NO.
      INE-625A
11. CONTRACT/GRANT NO.
      EPA-IAG-DS-E721
12. SPONSORING AGENCY NAME AND ADDRESS
      Office of Environmental Processes and Effects Research
      Office of Research and Development
      U.S. Environmental Protection Agency
      Washington, DC  20460
13. TYPE OF REPORT AND PERIOD COVERED
14. SPONSORING AGENCY CODE
      EPA/600/16
15. SUPPLEMENTARY NOTES
      This project is part of the EPA planned and coordinated Federal
      Interagency Energy/Environment Research and Development Program.
16. ABSTRACT
      A computerized information system was developed for storing,
      retrieving, and analyzing data collected during limnological surveys.
      To facilitate storage of information, a series of hierarchical codes
      was developed.  These codes not only reduced storage requirements, but
      also helped reduce computing costs.

      The information system utilized three analytical procedures:  cluster
      analysis, ordination using nonmetric multidimensional scaling (MDS),
      and measurement of species diversity from information theory.

      Results indicated that identification to species contributed little
      information about the structure of communities that discrimination of
      genera had not already provided.

      The heuristic properties of species diversity were used to evaluate
      two classifications stressing functional morphology and
      trophic-functional relationships of benthic invertebrates, independent
      of the taxonomic hierarchy.  Both methods produced results similar to
      ones obtained by cluster analysis, suggesting that they merit further
      investigation.
17. KEY WORDS AND DOCUMENT ANALYSIS
    a. DESCRIPTORS
         Ecology, Environment, Hydrology, Methodology, Limnology,
         Information Systems
    b. IDENTIFIERS/OPEN ENDED TERMS
         Control Technology
         Thermal, Nuclear, Coal
         Effects
         Environmental, Nuclear, Coal
    c. COSATI Field/Group
         6F, 8A
18. DISTRIBUTION STATEMENT
      Release to public
19. SECURITY CLASS (This Report)
      Unclassified
20. SECURITY CLASS (This page)
      Unclassified
21. NO. OF PAGES
      159
22. PRICE

EPA Form 2220-1 (9-73)

-------