United States
 Environmental Protection
 Agency
 Office of Monitoring Support and Quality Assurance
-------
     or otherwise associated in epidemi-
     ological studies.
  2) Differences or similarities of corre-
     lations between sexes or races may
     indicate genetic or  occupational
     factors of importance in tracing dis-
     ease etiology.
  3) Negative correlations may indicate
     competing causes of death.
  The objective of the study was to evalu-
ate use of the UPGRADE system to calcu-
late all possible correlations, determine
the strongest correlations, record  them
for future use by interested researchers,
and investigate the geographical varia-
tion of the strongly correlated diseases.

Experimental Procedure.
  The data base used in the study con-
sisted  of  county-level,  age-adjusted
mortality rates averaged over the five-
year period 1968-1972. The rates were
calculated by Herb Sauer of the Univer-
sity of Missouri, using the detailed mor-
tality records provided by the National
Center for Health Statistics (NCHS). All
deaths were recorded between  1968
and 1971, but only every other death in
1972. Thus, some sampling error  could
exist  for  the  less  common diseases.
Death  rate calculations were based on
each county's 1970 population. About
50 causes of death were studied  for
white males and females (Table  1).
  Because the 3082 mortality rates for
almost any  cause of death contained
some 10-30 extraordinarily  high rates,
due often to confounding factors such
as the existence of a major institution
(Indian reservation,  regional hospital,
prison) in the county, and because  these
rates could exert undue influence on the
Pearson  correlation  coefficient,  such
outliers were eliminated by use of a scat-
terplot screening technique. Visual  in-
spection of the scatterplots suggested
reasonable upper and lower bounds for
county mortality rates to be included in
the correlation  calculations.  Varying
these limits provided an indication of the
sensitivity of  the  calculations  to the
number of counties included: only about
a 10-15% variation in the most signifi-
cant correlations was observed.
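
  The screening step described above can be illustrated with a minimal
sketch. This is not the UPGRADE system: the function names, the county
rates, and the bounds below are all hypothetical, chosen only to show how
one extreme county can dominate the Pearson coefficient and how applying
upper and lower bounds changes the result.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def screened_r(xs, ys, x_bounds, y_bounds):
    """Correlation over counties whose two rates both fall inside the bounds."""
    kept = [(x, y) for x, y in zip(xs, ys)
            if x_bounds[0] <= x <= x_bounds[1]
            and y_bounds[0] <= y <= y_bounds[1]]
    kept_x, kept_y = zip(*kept)
    return pearson_r(kept_x, kept_y), len(kept)

# Hypothetical rates for six counties; the last county is an outlier of the
# kind attributed in the text to a major institution.
rates_a = [100.0, 120.0, 90.0, 110.0, 105.0, 900.0]
rates_b = [50.0, 45.0, 60.0, 48.0, 55.0, 400.0]

r_all = pearson_r(rates_a, rates_b)
# Hypothetical bounds, standing in for limits read off a scatterplot.
r_screened, n_kept = screened_r(rates_a, rates_b, (0.0, 300.0), (0.0, 300.0))
print(r_all, r_screened, n_kept)
```

In this made-up example the single outlier produces a strong positive
coefficient, while the five remaining counties are negatively correlated.
Rerunning screened_r with different bounds is one way to gauge sensitivity
to the number of counties included.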
  A stringent significance criterion of p
< .0001 was chosen to lessen the likeli-
hood of error in identifying significant
correlations. Even so, of the approxi-
mately 1200 possible correlations for
each sex, 152 correlations were signifi-
cant for white females and 136 correla-
tions were significant for white males at
the p < .0001 level.
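
  The figure of roughly 1200 possible correlations follows from counting
pairs of causes of death. The sketch below assumes 49 causes per sex (the
text says only "about 50"), so the exact count is an assumption. It also
shows how many correlations would be expected to pass the criterion by
chance alone if no pair were truly associated.

```python
from math import comb

n_causes = 49      # assumption: the text gives "about 50 causes of death"
alpha = 0.0001     # significance criterion used in the study

n_pairs = comb(n_causes, 2)            # number of distinct disease pairs
expected_by_chance = n_pairs * alpha   # expected false positives under the null

print(n_pairs, expected_by_chance)
```

Under these assumptions far fewer than one correlation per sex would be
expected by chance, compared with the 152 and 136 reported above.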
Table 1.    UPGRADE Variables Used in This Study with Corresponding ICDA Codes

UPGRADE                                           ICDA CODES
CODE      VARIABLE                                (8th Revision)
071       Tuberculosis, All Forms                 010-019
072       Other Infective Disease                 000-009, 020-136
073       Ca Buccal Cavity, Pharynx               140-149
074       Cancer of Esophagus                     150
075       Cancer of Stomach                       151
076       Cancer of Intestine                     152, 153
077       Cancer of Rectum                        154
078       Ca Liver, Gall B., Ducts                155, 156
079       Cancer of Pancreas                      157
080       Other Digestive Cancer                  158, 159
081       Cancer of Resp. System                  160-163
082       Cancer of Breast                        174
083       Cancer of Cervix                        180
084       Cancer of Uterus                        181, 182
085       Ca Prost, Other Female Ca               183, 184, 185
086       Cancer of Bladder                       188
087       Cancer of Kidney, Etc.                  189
088       Cancer of Central Nervous System (CNS)  191, 192
089       Residual Cancer                         170-3, 183, 186-7, 190, 194
090       Cancer, Ill-Def. & Sec.                 195-199
091       Lymphosarcoma, Etc.                     200
092       Hodgkin's Disease                       201
093       Multiple Myeloma                        203
094       Leukemia                                204-207
095       Other Lymphatic                         202, 208, 209
096       Neoplasms, Benign & Unspecified         210-239
097       Diabetes                                250
098       Alcoholism                              303
099       Rheumatic Heart Dis.                    390-398
100       Hypertension                            400-404
101       Acute Ischemic Heart Dis.               410, 411
102       Chronic Ischemic Heart                  412, 413
103       Other Heart Disease                     420-429
104       Cerebrovascular Disease                 430-438
105       Arteriosclerosis                        440
106       Aortic Aneurysm                         441
107       Other Arteries, Etc.                    442-448
108       Veins, Etc.                             450-458
109       Influenza and Pneumonia                 470-486
110       Chronic Resp. Dis.                      490-493, 517-519
111       Cirrhosis of Liver                      571
112       Chronic Nephritis, Etc.                 582-584
113       Infections of Kidney                    590
114       Congenital Heart & Circ.                746, 747
115       Other Congenital                        740-745, 748-759
116       Other Early Infancy                     760-778
117       Symptoms, Ill-Defined                   780-796
126       Major CV Diseases                       390-448
127       Cancer, All Sites and Forms             140-209

Table 2.    Correlation Coefficients for the Top Twenty Correlations

                                White Females
                                                       Scatterplot    Exclusion/
Disease Title                                            Method      Filter Method
  1. Cancer of the Respiratory System - Cirrhosis         .224          .206
  2. Cancer of the Intestine - Cancer of the Breast       .211          .161
  3. Chronic Ischemic - Cancer, All Forms                 .189          .238
  4. Chronic Ischemic - Cirrhosis                         .182          .198
  5. Cancer of the Intestine - Cancer of the Rectum       .180          .145
  6. Cancer of the Cervix - Major CV                      .180          .189
  7. Other Heart Disease - Symptoms, Ill-Defined          .179          .203
  8. Rheumatic Heart Disease - Chronic Ischemic           .175          .193
  9. Chronic Ischemic - Other Heart                      -.172         -.201
 10. Acute Ischemic - Cerebrovascular                     .171          .173
 11. Cancer of the Rectum - Rheumatic Heart               .170          .143
 12. Aortic Aneurysm - Cirrhosis                          .170          .152
 13. Cancer of the Esophagus - Cirrhosis                  .169          .118
 14. Cancer of the Rectum - Chronic Ischemic              .167          .151
 15. Rheumatic Heart - Cirrhosis                          .166          .164
 16. Diabetes - Major CV                                  .166          .210
 17. Rheumatic Heart - Aortic Aneurysm                    .161          .141
 18. Cancer of the Rectum - Cancer of the Breast          .159          .155
 19. Cirrhosis - Cancer, All Forms                        .150          .194
 20. Major CV - Cancer, All Forms                         .146          .149

  The top 30 of these correlations for
each sex were further examined. If
outliers were suspected, a new modified
regression was run using different
boundaries for excluding counties. (In
most cases, fewer than 1% of all
counties were excluded.) This procedure
resulted in some changes of order among
the top correlations, but few sharp
changes in the magnitudes of the
correlation coefficients.

Results and Discussion
  From the procedures discussed above,
a final list of the 20 strongest correla-
tions was obtained (Tables 2 and 3). No
fewer than eleven correlations appear in
both tables, and  only two pairs of dis-
eases for each sex were not strongly cor-
related in the other sex (Table 4). Thus,
sex is not a strong factor in the co-varia-
tion of mortality rates for most diseases.
  However, population density is very
clearly an important factor in the most
strongly correlated disease pairs, as can
be seen by comparing those causes of
death most strongly associated with
county population to those most strongly
correlated with each other. For white
females, six of 48 causes of death
investigated showed a strong (p < .0001)
increase in mortality rates in the more
populous counties (Table 5). Four of
these six appear most often in the
strongest 20 correlations for females.
Similarly, nine of 46 causes of death
investigated for white males showed a
strong (p < .0001) increase with county
population (Table 6). Six of these nine
appear most often in the strongest 20
correlations for white males.
  The  strongest  negative correlations
are dominated by the "miscellaneous"
categories of "Other Heart  Disease"
and "Symptoms, Ill-defined" (Table 7).
These categories probably "compete"
with other causes of death in the sense
that inexperienced or untrained county
medical officers are more likely to classify
difficult cases in the miscellaneous  cate-
gory. However, the frequent appearance
of rheumatic heart disease in  this  table
does not appear to be explainable in the
same way. Rheumatic heart disease ap-
pears only for white females and only in
association with diseases that  have
higher mortality rates in rural regions.
This phenomenon seems worthy of fur-
ther study.
Table 3.    Correlation Coefficients for the Top Twenty Correlations

                                White Males
                                                       Scatterplot    Exclusion/
Disease Title                                            Method      Filter Method
  1. Other Heart Disease - Symptoms, Ill-Defined          .286          .256
  2. Cancer of the Respiratory System - Major
     Cardiovascular                                       .286          .302
  3. Chronic Ischemic Heart Disease - Aortic
     Aneurysm                                             .268          .201
  4. Chronic Ischemic - Cirrhosis of the Liver            .263          .221
  5. Chronic Ischemic - Cancer, All Forms and Sites       .250          .217
  6. Cirrhosis - Aortic Aneurysm                          .246          .146
  7. Cirrhosis - Cancer, All Forms                        .243          .249
  8. Cancer of the Respiratory System - Chronic
     Ischemic                                             .242          .223
  9. Cancer of the Rectum - Cancer of the Intestine       .242          .168
 10. Cancer of the Rectum - Chronic Ischemic              .241          .205
 11. Major CV - Cancer, All Forms                         .239          .220
 12. Acute Ischemic Heart Disease - Cerebrovascular       .235          .301
 13. Cancer of the Buccal Cavity, Pharynx - Cancer
     of the Respiratory System                            .233          .170
 14. Cancer of the Esophagus - Cirrhosis                  .231          .186
 15. Cancer of the Respiratory System - Chronic
     Respiratory                                          .231          .203
 16. Cancer of the Respiratory System - Aortic
     Aneurysm                                             .226          .173
 17. Aortic Aneurysm - Cancer, All Forms                  .214          .178
 18. Cancer of the Rectum - Cirrhosis                     .205          .161
 19. Cancer of the Respiratory System - Cancer
     Ill-Defined and Unspecified                          .202          .175
 20. Cancer of the Respiratory System - Cirrhosis         .201          .183

Table 4.    Correlations That Are Strong For One Sex But Not The Other

                                                          Correlation Coefficient
Rank (WF)                                                     WF        WM
   11     Cancer of the Rectum - Rheumatic Heart Disease     .170      .130
   16     Diabetes - Major CV Diseases                       .166      .098

Rank (WM)
    1     Cancer of the Resp. System - Major CV Diseases     .084      .286
   13     Cancer of the Resp. System - Ca. Buccal Cavity     .074      .233

  The Pearson product-moment correlation
coefficient calculation assumes a normal
distribution. However, the distribution
of county mortality rates was calculated
for six causes of death for each of
three race-sex groups, and not one of
the 18 data sets passed chi-square tests
for normality. In every case, the
distributions were more strongly
clustered toward the mean and
simultaneously more dispersed in the
tails than the normal distribution. Such
distributions are termed leptokurtic.
The 18 distributions were then plotted
on logarithmic probability paper but
failed to display log-normal behavior.
(Figure 1 provides an example of the
nonlinear shape of the distribution.)
When a more homogeneous set of counties
is selected, the distribution of
mortality rates may more nearly approach
log-normality. For example, lung cancer
death rates for white males in 234
mostly urban counties were much closer
to a log-normal distribution than the
rates from all 3082 counties (Figure 2).
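
  A distribution that is both more peaked at the mean and heavier in the
tails than the normal has positive excess kurtosis. The sketch below
computes that moment-based measure. It is offered as a simple diagnostic,
not as the chi-square test the study used, and both samples are
hypothetical.

```python
def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n   # fourth central moment
    return m4 / (m2 ** 2) - 3.0

# Hypothetical samples: one clustered at the mean with two extreme values,
# one spread evenly.
peaked_heavy_tails = [-10.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 10.0]
evenly_spread = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(excess_kurtosis(peaked_heavy_tails))   # positive
print(excess_kurtosis(evenly_spread))        # negative
```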
  Thus we are uncertain of the interpre-
tation to be given to the absolute values
of the Pearson product-moment correla-
tions calculated in Tables 2 and 3, al-
though the relative values may be more
trustworthy.  For this reason, we have
considered only  correlations  with p
<.0001. Nonparametric statistics would
have been preferable, but because of the
large number of  counties involved, it
was not feasible to calculate Spearman
or Kendall correlation coefficients.
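
  For readers who wish to repeat the analysis with a rank-based statistic,
Spearman's coefficient is the Pearson coefficient computed on ranks. The
following is a minimal sketch with hypothetical data and function names;
it was not part of the original study.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical rates: monotonically related, with one extreme county.
rates_a = [10.0, 20.0, 30.0, 40.0, 1000.0]
rates_b = [1.0, 2.0, 3.0, 4.0, 5.0]
rho = spearman_rho(rates_a, rates_b)
print(rho)
```

Because only the ordering of the counties enters the calculation, the
extreme value in rates_a has no more influence than any other county.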
  It should also be noted that the lack of
normality  of the county mortality rate
distributions probably decreases the al-
lowed  range  of  negative correlations.
(For example, two log-normally distri-
buted variables have a minimum r of
-0.369, although the positive limit re-
mains at +1.0.) Thus, a negative r is
probably indicative of  a stronger  rela-
tionship than a positive one of the same
magnitude.
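
  The restricted negative range can be made concrete under one stated
assumption: both variables are log-normal with the same standard deviation
sigma on the log scale. The most negative dependence then pairs
X = exp(sigma Z) with Y = exp(-sigma Z) for a standard normal Z, and the
correlation of that pair works out to -exp(-sigma^2). The sketch below
evaluates this bound. The parameters the authors had in mind are not
stated, so the agreement with the value quoted above is suggestive only.

```python
from math import exp

def min_lognormal_correlation(sigma):
    """Most negative Pearson r for two log-normals sharing log-scale sigma.

    Derived for X = exp(sigma*Z), Y = exp(-sigma*Z), Z standard normal:
    Cov(X, Y) = 1 - exp(sigma^2) and Var(X) = Var(Y) = exp(sigma^2) * (exp(sigma^2) - 1),
    so r = -exp(-sigma^2).
    """
    return -exp(-sigma ** 2)

for sigma in (0.5, 1.0, 2.0):
    print(sigma, min_lognormal_correlation(sigma))
```

With sigma = 1 the bound is about -0.368, and it moves toward zero as the
distributions become more skewed.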
  Geographic variations were  studied
using bivariate color maps created by the
Domestic  Information  Display  System
(DIDS). Rates for each disease were cat-
egorized in quartiles, and colors assigned
to each of the 16 cells of the resulting
4x4 matrix. Geographic characteriza-
tions of six disease pairs showing  high
correlations for both white males and
white females were prepared. An exam-
ple is given in Table 8.
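
  The quartile classification behind a bivariate map can be sketched as
follows. This is not the DIDS software: the function names and rates are
hypothetical, and the quartile cut points use the default method of
Python's statistics.quantiles, which may differ from the method actually
used.

```python
from bisect import bisect_right
from statistics import quantiles

def quartile_classes(rates):
    """Quartile index (0 = lowest, 3 = highest) for each rate."""
    cuts = quantiles(rates, n=4)              # three cut points
    return [bisect_right(cuts, r) for r in rates]

def bivariate_cells(rates_a, rates_b):
    """Pair of quartile indices per county: one cell of the 4x4 matrix."""
    return list(zip(quartile_classes(rates_a), quartile_classes(rates_b)))

# Hypothetical rates for eight counties and two diseases.
rates_a = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
rates_b = [80.0, 70.0, 60.0, 50.0, 40.0, 30.0, 20.0, 10.0]

cells = bivariate_cells(rates_a, rates_b)
print(cells)
```

Each county falls into one of 16 cells, and a map color would be assigned
per cell. Counties in cells (3, 3) or (0, 0) are high or low for both
diseases.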
  Two other studies have used similar
programs for investigating correlations
between diseases. Sauer¹ has grouped
the same basic mortality data (1968-72)
by state and by state economic area;
thus, the present study of county data
can be viewed as complementary to
Sauer's work. Wellington, MacDonald,
and Wolf² considered cancer mortality
between 1950 and 1969 on a statewide
basis. Comparisons with results from
both works reveal considerable
agreement, although different choices
of disease groups make detailed
comparisons impossible.

Table 5.     Variation in Mortality Rate with County Population
            (Age-Adjusted Mortality Rate per Million at Risk (1968-72) - White Males)

                                    1970 White Male County Population (in thousands)
Cause of Death                         0-5    5-10   10-25  25-100    >100       P*
Tuberculosis, All Forms                 25      30      32      31      35        -
Other Infective Disease                 64      65      56      56      53        -
Ca Buccal Cavity, Pharynx               45      44      47      54      64      .02
Cancer of Esophagus                     26      29      33      38      46    .0001
Cancer of Stomach                       94      95      92      92     107        -
Cancer of Intestine                    152     157     162     175     208    .0007
Cancer of Rectum                        39      46      56      62      72    .0001
Cancer of Liver, Gall B., Ducts         31      28      28      29      35        -
Cancer of Pancreas                     110     105     108     108     111        -
Other Digestive Cancer                   7       8       8       7       8        -
Cancer of Resp. System                 512     529     566     600     645    .0001
Cancer of Breast                         3       2       3       3       3        -
Cancer of Prostate                     202     195     200     200     198        -
Cancer of Bladder                       61      55      59      70      78        -
Cancer of Kidney, Etc.                  44      43      43      47      46        -
Cancer of CNS                           47      47      49      47      51        -
Residual Cancer                         77      77      72      70      68        -
Lymphosarcoma, Etc.                     38      36      40      41      43        -
Cancer Ill-Def. and Sec.               109     109     111     113     118        -
Hodgkin's Disease                       22      22      21      20      21        -
Multiple Myeloma                        24      24      24      25      24        -
Leukemia                               108      96      97      92      91        -
Other Lymphatic                         25      26      26      25      24        -
Neoplasms, Benign and Unspec.           23      20      22      25      26        -
Diabetes                               170     166     171     174     173        -
Alcoholism                              24      26      27      25      26        -
Rheumatic Heart Disease                 55      61      65      72      83    .0001
Hypertension                           117     133     129     121     104        -
Acute Ischemic Heart Dis.            3,006   3,004   3,019   2,860   2,629    .0001
Chronic Ischemic Heart               1,223   1,300   1,445   1,682   1,930    .0001
Other Heart Disease                    376     321     293     247     190    .0001
Cerebrovascular Disease              1,207   1,229   1,252   1,166   1,056    .0003
Arteriosclerosis                       198     198     202     199     180        -
Aortic Aneurysm                         90      92     102     125     129    .0001
Influenza and Pneumonia                415     403     397     387     393        -
Chronic Resp. Disease                  403     376     395     423     397        -
Cirrhosis of Liver                     115     116     128     158     219    .0001
Chronic Nephritis                       44      45      43      47      35        -
Infections of Kidney                    50      46      49      44      39        -
Congenital Heart & Cir.                 42      44      44      42      41        -
Other Congenital                        48      45      47      45      42        -
Other Early Infancy                    237     234     228     216     199      .02
Major Cardiovascular Diseases        6,355   6,416   6,587   6,549   6,311        -
Cancer, All Sites and Forms          1,777   1,774   1,848   1,926   2,071    .0001

 *Probability that the increase (decrease) in rates is due to chance (Pearson product-moment correlations applied to all counties).

References
  1. Sauer, H.I., Geographic Patterns in
     the Risk of Dying and Associated
     Factors: U.S. 1968-72, National Center
     for Health Statistics, U.S. Dept. of
     Health & Welfare, Wash. D.C. 1979.
  2. Wellington, MacDonald, and Wolf,
     Cancer Mortality: Environmental and
     Ethnic Factors, Academic Press, New
     York, 1979.

-------
Table 6.    Variation in Mortality Rate with County Population
           (Age-Adjusted Mortality Rate per Million at Risk (1968-72) - White Females)

                                    1970 White Female County Population (in thousands)
Cause of Death                         0-5    5-10   10-25  25-100    >100       P*
Tuberculosis, All Forms                  9      10      10      10      10        -
Other Infective Disease                 56      52      46      42      40      .03
Cancer of Buccal Cavity, Pharynx        16      15      16      17      19        -
Cancer of Esophagus                      9       8       9       9      12      .03
Cancer of Stomach                       50      46      48      45      52        -
Cancer of Intestine                    143     147     156     160     173     .002
Cancer of Rectum                        28      29      34      38      40     .002
Cancer of Liver, Gall B., Ducts         37      31      31      32      32        -
Cancer of Pancreas                      61      62      64      64      66        -
Other Digestive Cancer                   8       6       6       6       6        -
Cancer of Resp. System                  85      90      97     107     126    .0007
Cancer of Breast                       212     219     228     249     279    .0007
Cancer of Cervix                        52      59      61      61      50        -
Cancer of Uterus                        42      46      46      44      46        -
Other Female Cancer                     81      84      91      95      99     .007
Cancer of Bladder                       20      17      19      22      23        -
Cancer of Kidney, Etc.                  22      21      21      21      21        -
Cancer of CNS                           31      32      33      30      32        -
Residual Cancer                         46      44      43      40      39        -
Lymphosarcoma, Etc.                     23      24      26      28      28        -
Cancer Ill-Def. and Sec.                85      97      91      89      92        -
Hodgkin's Disease                       11      12      11      12      13        -
Multiple Myeloma                        17      15      16      17      17        -
Leukemia                                58      59      59      56      57        -
Other Lymphatic                         16      16      16      15      16        -
Neoplasms, Benign and Unspec.           20      21      18      20      21        -
Diabetes                               192     180     187     186     175        -
Alcoholism                               6       5       5       6       7        -
Rheumatic Heart Disease                 51      50      55      67      82    .0007
Hypertension                           104     111     112      97      86      .05
Acute Ischemic Heart Dis.            1,197   1,212   1,202   1,161   1,114      .01
Chronic Ischemic Heart                 900     938   1,026   1,179   1,263    .0001
Other Heart Disease                    214     195     175     148     137     .003
Cerebrovascular Disease              1,000     978     996     962     899     .009
Arteriosclerosis                       162     162     164     168     149        -
Aortic Aneurysm                         28      26      28      33      34        -
Influenza and Pneumonia                264     252     243     233     225      .04
Chronic Resp. Disease                   89      88      89      95      94        -
Cirrhosis of Liver                      51      50      57      74     101    .0001
Chronic Nephritis, Etc.                 28      29      29      25      22        -
Infections of Kidney                    47      41      42      37      34        -
Congenital Heart & Circ.                33      35      35      34      33        -
Other Congenital                        43      41      45      43      40        -
Other Early Infancy                    172     161     164     154     143      .04
Major Cardiovascular Diseases        3,712   3,723   3,813   3,869   3,815        -
Cancer, All Sites and Forms          1,153   1,178   1,227   1,254   1,359    .0001

*See note to Table 5.

-------
Table 7.     Strongest Negative Correlations

                                                              WF        WM
Chronic Ischemic            vs. Other Heart Disease          -.172     -.197
Ca. Rectum                  vs. Other Heart Disease          -.105     -.146
Aortic Aneurysm             vs. Other Heart Disease          -.086     -.101
Ca. Breast                  vs. Other Heart Disease          -.137      NA
Ca. Intestine               vs. Other Heart Disease            *       -.130
Acute Ischemic              vs. Other Heart Disease            *       -.097
Ca. Intestine               vs. Other Heart Disease          -.125       *
Ca., All Forms              vs. Other Heart Disease          -.123       *
Acute Ischemic              vs. Symptoms, Ill-Defined        -.118     -.196
Chronic Ischemic            vs. Symptoms, Ill-Defined        -.137     -.155
Major CV                    vs. Symptoms, Ill-Defined        -.114     -.123
Ca. Rectum                  vs. Symptoms, Ill-Defined        -.083     -.125
Ca. All Forms               vs. Symptoms, Ill-Defined        -.139       *
Ca. Breast                  vs. Symptoms, Ill-Defined        -.122       *
Ca. Intestine               vs. Symptoms, Ill-Defined        -.123       *
Other Heart Disease         vs. Rheumatic Heart Disease      -.141       *
Infections of Breast        vs. Rheumatic Heart Disease      -.137       *
Symptoms, Ill-Defined       vs. Rheumatic Heart Disease      -.125       *
Acute Ischemic              vs. Rheumatic Heart Disease      -.118       *
Cerebrovascular             vs. Rheumatic Heart Disease      -.092       *
Chronic Isch. Heart Disease vs. Cerebrovascular              -.141     -.093
Ca. Rectum                  vs. Cerebrovascular                *       -.082

* Blanks indicate correlations that were not significant at p < .0001 for the particular sex.

-------
[Figure 1: log-probability plot of county mortality rates against cumulative
percent of counties; the plotted curve is labeled "Major CV diseases".]

Figure 1.  Cumulative frequency distribution of mortality rates (1968-72).
[Figure 2: log-probability plot of county mortality rates against cumulative
percent of counties, with three curves labeled "234 mostly urban U.S.
counties", "1313 U.S. counties with more than 10,000 white male population",
and "All 3,082 U.S. counties".]

Figure 2.  Cumulative frequency distribution of lung cancer mortality rates -
           white males (1968-72).

-------
Table 8.    Respiratory Cancer vs Cirrhosis of Liver

 Respiratory  Cirrhosis of
   Cancer        Liver      Sex  Geographic Location
    HIGH         HIGH       WM   New England, California, Florida
                            WF   New England*, California, Florida, Nevada, Arizona,
                                 New Mexico, Washington (Seattle-Tacoma-Everett),
                                 Gulf Coast, Alaska
    LOW          LOW        WM   Tennessee, Kentucky, Virginia
                            WF   West**, Southeast
    HIGH         MIXED      WM   Georgia, South Carolina, Lower Mississippi River
    LOW          HIGH       WM   West, Southwest***

    * Particularly CONNECTICUT, MASSACHUSETTS, SOUTHERN VERMONT AND
      NEW HAMPSHIRE, EASTERN NEW YORK STATE, COASTAL PARTS OF MAINE,
      MOST OF NEW JERSEY
   ** Particularly WESTERN PORTION OF NORTH AND SOUTH DAKOTA,
      NEBRASKA, KANSAS, SOUTHERN PORTION OF MONTANA, IDAHO, UTAH
  *** Particularly NEW MEXICO, COLORADO, WYOMING

  The EPA authors Lance Wallace and Valarie J. Gill (also the EPA Project
  Officer, see below) are with the Office of  Monitoring Support and Quality
  Assurance, Washington, DC 20460.
  The complete report, entitled "Correlations Between Age-Adjusted Mortality
    Rates for White Males and Females in the United States, by County: 1968-
    1972," (Order No. PB 82-224 114; Cost: $10.50, subject to change) will be
    available only from:
         National Technical Information Service
         5285 Port Royal Road
         Springfield, VA 22161
         Telephone: 703-487-4650
  The EPA Project Officer can be contacted at:
         Office of Monitoring Support and Quality Assurance
         U.S. Environmental Protection Agency (RD-680)
         Washington, DC 20460