Ecological Research Series
EFFECT OF MECHANICAL COOLING DEVICES ON
AMBIENT SALT CONCENTRATION
Environmental Research Laboratory
Office of Research and Development
U.S. Environmental Protection Agency
Corvallis, Oregon 97330
-------
RESEARCH REPORTING SERIES
Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into five series. These five broad
categories were established to facilitate further development and application of
environmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The five series are:
1. Environmental Health Effects Research
2. Environmental Protection Technology
3. Ecological Research
4. Environmental Monitoring
5. Socioeconomic Environmental Studies
This report has been assigned to the ECOLOGICAL RESEARCH series. This series
describes research on the effects of pollution on humans, plant and animal
species, and materials. Problems are assessed for their long- and short-term
influences. Investigations include formation, transport, and pathway studies to
determine the fate of pollutants and their effects. This work provides the technical
basis for setting standards to minimize undesirable changes in living organisms
in the aquatic, terrestrial, and atmospheric environments.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia 22161.
-------
EPA-600/3-76-034
April 1976
EFFECT OF MECHANICAL COOLING DEVICES ON
AMBIENT SALT CONCENTRATION
by
Herbert E. Hunter
ADAPT Service Corporation
Reading, Massachusetts
Contract 68-03-2176
Project Officer
Bruce A. Tichenor
Assessment and Criteria Development Division
Corvallis Environmental Research Laboratory
Corvallis, Oregon 97330
U.S. ENVIRONMENTAL PROTECTION AGENCY
OFFICE OF RESEARCH AND DEVELOPMENT
CORVALLIS ENVIRONMENTAL RESEARCH LABORATORY
CORVALLIS, OREGON 97330
-------
DISCLAIMER
This report has been reviewed by the Corvallis Environ-
mental Research Laboratory, U.S. Environmental Protection Agency,
and approved for publication. Approval does not signify that
the contents necessarily reflect the views and policies of the
U. S. Environmental Protection Agency, nor does mention of trade
names or commercial products constitute endorsement or recommenda-
tion for use.
-------
CONTENTS
Page
List of Tables iv
List of Figures vi
I Introduction 1
II Conclusions and Recommendations 3
III Effect of Cooling Devices on Airborne Salt
Concentration 8
Estimate of the Precision Run Error 8
Estimate of Background Concentrations 17
Effect of Cooling Tower on Ambient Concentration 38
Effect of Spray Modules on Ambient Concentration 40
IV Extension to Other Sites 47
V References 49
VI Appendices
Appendix A - Mathematical Description of ADAPT 50
B - Analysis of Optimal Bases Used for
Salt Concentration Studies 99
C - Selection and Analysis of Algorithms 114
D - Algorithms for Calculating Ambient
Concentration 124
-------
LIST OF TABLES
Number Page
-------
LIST OF TABLES (CONT'D)
Number Page
16 The Most Important Environmental Variables for Estim- 34
ate of Background Concentration at Station 11
17 Summary Statistics for Down Wind Minus Background Con- 39
centration for Cooling Tower
18 Summary Statistics for Down Wind Minus Background Con- 39
centration for Spray Modules
19 The Most Important Environmental Variables for Estim- 45
ate of Increase in Salt Concentration Due to Spray
Modules at Station 7
20 The Most Important Environmental Variables for Estim- 45
ate of Increase in Salt Concentration Due to Spray
Modules at Stations 6 through 9
-------
LIST OF FIGURES
Number Page
1 Airborne Particles Sampler Station Locations 4
2 Relative Importance of Indexing Variable Defined by 11
Table 1 to Estimate of Precision Run Error Using an
Eleven Dimensional Analysis
3 Estimated Versus Actual Precision Run Error 14
4 Estimated Versus Actual Ambient Salt Concentration 19
Pooled Over All Measurement Stations
5 Definition of Wind Vector 20
6 Relative Importance of Indexing Variable Defined by 22
Table 1 to Estimate of Ambient Salt Concentration
Pooled Over All Stations
7 Relative Importance Vector for East Wind (-45°) 25
8 Relative Importance Vector for Absolute East Wind 26
(-45°)
9 Relative Importance Vector for Ambient Concentration 35
at Station 10
10 Relative Importance Vector for Ambient Concentration 36
at Station 9
11 Relative Importance Vector for Spray Modules at 43
Station 7
12 Relative Importance Vector for Spray Modules at Stations 44
6 through 9
-------
SECTION I
INTRODUCTION
The purpose of this study is to analyze the airborne
salt concentration data collected during the demonstration of
salt water mechanical cooling devices at the Turkey Point
power plant and reported in detail in Reference 1. Airborne
particle samplers were used to collect data on ambient salt
concentration. The purpose of the study reported in Reference
1 was to measure the amount of cooling device drift which
was emitted from the cooling devices and subsequently collected
at downwind samplers. The data consist of a series of measure-
ments to define the background airborne salt concentration at
each of the stations shown in Figure 1. Note that Stations 1
and 2 are co-located and thus provide data on the repeatability
of the measurements or the "precision run error". A second
set of data were collected at Stations 3 through 11 during
the operation of either a single cell salt water cooling tower
or a pair of Powered Spray Modules. For all of the data runs,
only one of the two types of cooling devices was operating so
that the data may be divided into three classes: 1) background
salt concentration, 2) cooling tower plus background salt con-
centration and 3) powered spray modules plus background salt
concentration. By definition, background data define the con-
ditions with no cooling device in operation.
In general, the early portion of the data collection
consisted of collecting background data only. This situation
existed from approximately August of 1973 through early January
of 1974. During this time period, only measurement Stations 1
through 6 were used. Beginning at the end of January 1974,
measurements were made using each of the cooling devices in-
dependently as well as a few background measurements. During
this time period, Stations 3 through 11 were used. The last
series of background measurements were made in April of 1974.
Between April of 1974 and the end of July of 1974, all of the
measurements were made with one of the cooling devices operating.
-------
The approach to the analysis of these data consisted of
utilizing the data obtained at the two co-located stations to
make an estimate of the precision of the measurements. The
background data were then used to develop regression algorithms
relating the background salt concentration to the environmental
variables. These algorithms were then used to estimate
the background concentrations at the time that the
data were being collected during the operation of the cooling
devices. This estimated background concentration was then
subtracted from the measured concentration during the operation
of the cooling device to determine the effect of the cooling
device on the ambient concentration. Regression analysis was
then performed to relate this effect of the cooling device
on the ambient concentration to the environmental variables.
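The approach described above can be sketched in a few lines of code. The following is a minimal illustration only, not the ADAPT implementation: it substitutes an ordinary least-squares fit for the ADAPT regression, uses synthetic data in place of the Turkey Point measurements, and all names (fit_background, estimate_background, device_effect) are hypothetical.

```python
import numpy as np

def fit_background(env, conc):
    """Least-squares fit of background concentration on environmental variables."""
    design = np.column_stack([np.ones(len(env)), env])  # intercept plus variables
    coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
    return coef

def estimate_background(coef, env):
    """Apply the fitted algorithm to a new set of environmental conditions."""
    design = np.column_stack([np.ones(len(env)), env])
    return design @ coef

def device_effect(coef, env_operating, conc_operating):
    """Measured concentration minus the estimated background concentration."""
    return conc_operating - estimate_background(coef, env_operating)

# Synthetic data (assumption): three environmental variables, known coefficients.
rng = np.random.default_rng(0)
true_coef = np.array([5.0, 1.0, -0.5, 0.2])

env_bg = rng.normal(size=(200, 3))                      # no device operating
conc_bg = true_coef[0] + env_bg @ true_coef[1:] + rng.normal(scale=0.1, size=200)
coef = fit_background(env_bg, conc_bg)

env_op = rng.normal(size=(50, 3))                       # device operating
conc_op = true_coef[0] + env_op @ true_coef[1:] + 3.0 + rng.normal(scale=0.1, size=50)

effect = device_effect(coef, env_op, conc_op)
print(effect.mean(), effect.std(ddof=1))                # mean is near the added 3.0
```

The mean and standard deviation of the difference are the same kind of summary statistics that the report uses to judge whether a device effect is measurable.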
The ADAPT family of computer programs, based on the
concept of preceding empirical analysis with the transforma-
tion of the data to the principal component space, offers a
unique capability to analyze these data. This transformation
not only reduces the amount of computation required to analyze
the data, but also provides analysis of the data which is
useful for understanding the structure of the data. This
analysis helps ensure that no major errors have been made in
the preparation of the data. The advantages as well as a
detailed description of the ADAPT approach for performing
the regression analyses required for this study are summarized
in Appendix A.
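As a rough illustration of regression preceded by a transformation to principal component space, the sketch below implements a generic principal component regression with NumPy. It is an assumption-laden stand-in, not the ADAPT algorithm of Appendix A: the function name, the choice of n_dims, and the synthetic data are all hypothetical.

```python
import numpy as np

def principal_component_regression(x, y, n_dims):
    """Regress y on the first n_dims principal components of x.

    Returns weights on the original variables and an intercept.
    """
    x_mean = x.mean(axis=0)
    y_mean = y.mean()
    centered = x - x_mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_dims].T                 # (n_variables, n_dims)
    scores = centered @ basis             # data expressed in the reduced space
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    weights = basis @ gamma               # map back to the original variables
    intercept = y_mean - x_mean @ weights
    return weights, intercept

# Synthetic data (assumption): five measured variables driven by two factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
x = latent @ rng.normal(size=(2, 5))
y = latent @ np.array([2.0, -1.0]) + 4.0

weights, intercept = principal_component_regression(x, y, n_dims=2)
predicted = x @ weights + intercept
print(np.max(np.abs(predicted - y)))      # two dimensions suffice for these data
```

Working in the reduced space means the regression is solved for n_dims unknowns rather than one per original variable, which is the computational saving the text refers to.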
-------
SECTION II
CONCLUSIONS AND RECOMMENDATIONS
This section will present a brief summary of the results
and recommendations resulting from the studies of the Turkey
Point data. Justification for these results and recommenda-
tions is presented in Section 3 and the supporting appendices.
EFFECT OF COOLING DEVICES ON BACKGROUND SALT CONCENTRATION
The primary results of the present study are: 1) the
operation of the cooling tower at the Turkey Point power
plant did not increase the background salt concentration
by a measurable amount at any of the stations, and 2) the
operation of the spray modules at the Turkey Point
power station probably increased the background salt con-
centration at Station 7 by approximately 50% and did not
increase it by a measurable amount at any other station.
These results are developed in detail in Sections 3.3
and 3.4. They are based on the results of statistical summaries
of the difference between the concentration measured with
the device operating and the expected background concentration
made at each station and pooled over all stations. The
average difference between the measured concentration with
the cooling tower operating and the expected background con-
centration was of the order of 2 x 10^-3 micrograms per cubic
meter with a standard deviation of 4.8 micrograms per cubic
meter. Similar results obtained for each of the individual
stations indicate there were no stations for which the diff-
erence between the measured concentration with the cooling
tower operating and the expected background concentration
averaged over all of the measurements at the station exceeded
the standard deviation.
The average of all of the concentrations obtained during
the operation of the spray modules, minus the expected back-
ground concentration is 1.32 micrograms per cubic meter with
a standard deviation of 9.9 micrograms per cubic meter. When
considering the individual stations, the average difference
between the concentration measured with the spray modules
operating and the expected background concentration only exceeded
the standard deviation for two stations. One of these stations
was Station 10 for which only 6 cases were available and thus
no meaningful conclusions can be based on the results obtained
at this station. The other station was Station 7 which Figure 1
-------
Figure 1. Airborne particle sampler station locations. Distances in meters.
-------
shows to be one of the three closest stations to the spray
modules. The standard deviation of the measurements at this station
was smaller than any of the other nearby stations indicating
that the measurement errors at this station were unusually
small. For this special situation of a nearby station with
an unusually small measurement error, there is approximately
85% confidence that the spray module has increased the
expected background salt concentration. For this station,
the most likely effect is that on the average the spray module
increases the background salt concentration from approximately
5 to 8 micrograms per cubic meter.
It was not possible to prepare algorithms for estimating
the effect of the cooling devices on the salt concentration
as a function of environment and position because of the
small effects of the cooling devices on the salt concentration.
A more specialized optimal base may allow development of an
algorithm for estimating the effect of the environment on
the salt concentration at a fixed position; namely, that
defined by Station 7. Algorithms were developed to calculate
the background concentration at each station as a function
of the environment. These algorithms are given in Appendix D.
Thus, the major usefulness of the present study is: 1) the
general result that the effect of these cooling devices is
small compared to the measurement accuracy and 2) the informa-
tion which can be gained by examining the ADAPT analysis out-
puts to determine the relative importance of the environmental
variables to the concentrations measured and to define require-
ments for additional measurements.
-------
RECOMMENDATIONS
The results of this study which are presented in Sections 3
and 4 of this report have led to the following specific recommend-
ations:
1. For future tests in which it is desired to determine
the precision run error, data should be obtained on co-located
stations for at least 300 days. This amount of data is required
in order to provide a sufficient number of cases to develop an
algorithm to determine the precision run error as a function
of the environment.
2. Co-located stations should be provided at a number of
different locations in order to allow the development of algori-
thms to determine the effect of location on the precision run
error.
3. Additional emphasis should be placed on checking the
equipment and motivating test personnel on Mondays or after
any period of inactivity.
4. For those analyses where the major objective of the
study is to determine the effect of the environment on the
background salt concentration, the wind vector should be defined
as the projection of the wind on a compass direction rather than on
the position direction. Thus, separate ADAPT bases will be
required for developing algorithms to study the effect of the
environment and location on the background concentration and
to determine the effect of cooling devices on this concentration
as a function of location and environment.
5. At least 250 to 300 measurements are required at each
station to adequately define the effect of environment on the
background salt concentration.
6. In planning future test programs, the location
of the stations relative to open water, including cooling
basins, should be considered with a significant number of the
stations located at distances greater than 300 meters from
such open water.
-------
7. A new ADAPT optimal base should be developed using
only the data from Stations 6 through 9 and a second optimal
base using only the data from Station 7. These bases should
then be used to rederive the algorithms for estimating the
increase in background concentration due to the spray modules
at these stations.
-------
SECTION III
EFFECT OF COOLING DEVICES ON SALT CONCENTRATION
The use of the Turkey Point data to estimate the effect
of two cooling devices on the ambient, airborne salt con-
centration may be divided into the following three steps:
1) the estimation of the precision of the measurements,
2) an estimate of the ambient concentration had the device
not been operating, i.e., the background concentration, and 3) the
statistical analysis of the difference between the actual
measured concentration with the cooling device operating
and the expected background concentration. The first two
of these steps are identical for the cooling tower and
the spray modules; the final step was carried out
independently for the data obtained when the cooling tower was
operating and the data obtained when the spray modules were
operating.
ESTIMATION OF THE PRECISION RUN ERROR
The precision run error is estimated from the results
obtained when no cooling device was operating and measurements
were made at Stations 1 and 2, which Figure 1 shows were co-
located. These measurements were made for 65 different cases.
The average precision run error for these 65 cases was 6.11%
with a standard deviation of 9.55%. Thus, if the distribution
is Gaussian, we have 70% confidence that the precision run error
was between 2% and 10%.
A regression analysis was performed to develop an algor-
ithm for estimating the precision run error as a function of
the environmental variables. The independent variables which
were used to prepare the estimate of the precision run error
are defined in column three, headed PRE, of Table 1. The
importance of each of these variables to the estimate of the
precision run error is shown in the relative importance plot
presented in Figure 2. The ordinate of Figure 2 is the relative
importance of each of the environmental variables to the estimate
of the precision run error. The number of learning cases was
insufficient and covered an insufficient variation in environ-
mental conditions to allow one to develop an algorithm having
sufficiently high confidence and accuracy to warrant the
application to the test conditions. However, the data were
adequate to provide an indication of which environmental factors
have the greatest impact on the precision run error. Figure 2
presents this information. The absolute magnitude of the
-------
TABLE 1
DEFINITION OF DATA VECTOR

     VARIABLE NO.
82Pt    75Pt    PRE      SYMBOL         DESCRIPTION
1       -       1(80)    C              Mass of Salt/Unit Volume of Air
2       -       2(81)    VOL            Sampled Air Volume, m3
3       -       3(82)    Na             Mass of Sodium on Mesh Pair
4       1       -        de             Projection of Position Vector on East Direction
5       2       -        dn             Projection of Position Vector on North Direction
6       3       4        CC1            Binary Code for Light Rain
7       4       5(83)    CC3            Binary Code for Bugs on Sample
8       5       6(84)    CC4            Binary Code for Dust Contamination
9       6       7(85)    CC5            Binary Code for Combination of Comments
10      7       8(86)    CC9            Binary Code for White Caps
11      8       9(87)    ts             Start Time
12      9       10(88)   te             End Time
13-22   10-19   11-20    dwi (i=1,10)   Projection of Wind Vector on Position Direction - 10 Samples Between ts and te
23-32   20-29   21-30    Nwi (i=1,10)   Projection of Wind Vector on Normal to Position Direction - 10 Samples Between ts and te
33-42   30-39   31-40    Ti (i=1,10)    Dry Bulb Temperature - 10 Samples Between ts and te
43-52   40-49   41-50    Di (i=1,10)    Difference Between Dry and Wet Bulb Temperature - 10 Samples Between ts and te
53-62   50-59   51-60    Hi (i=1,10)    Relative Humidity - 10 Samples Between ts and te
63      -       61       CDC1           Binary Variable Indicating Cooling Tower Operation
64      -       62       CDC2           Binary Variable Indicating Spray Modules Operating
65      60      63       DY             Day of Year
66      61      64       DFT            Days Since First Test
67      62      65       S1             Binary Variable Indicating Spring
68      63      66       S2             Binary Variable Indicating Summer
69      64      67       S3             Binary Variable Indicating Fall
70      65      68       M              Binary Variable - Test on Monday
71      66      69       Tu             Binary Variable - Test on Tuesday
72      67      70       W              Binary Variable - Test on Wednesday

( ) - Variable Taken from STA-2 for PRE Estimate
-------
Table 1, continued

     VARIABLE NO.
82Pt    75Pt    PRE      SYMBOL         DESCRIPTION
73      68      71       Th             Binary Variable - Test on Thursday
74      69      72       F              Binary Variable - Test on Friday
75      70      73       S              Binary Variable - Test on Saturday
76      71      74       dw(-1)         Projection of Preceding Day's Average Wind Vector on Position Direction
77      72      75       dn(-1)         Projection of Preceding Day's Average Wind Vector on Normal to Position Direction
78      73      76       dwsp(-1)       Preceding Day's Spread in Wind Speed
79      74      77       dnsp(-1)       Preceding Day's Spread in Wind Direction
80      75      78       du(-1)         Preceding Day's Standard Deviation of Wind Speed
81      -       79                      Preceding Day's Standard Deviation of Wind Direction
82      -       -        PRE            Precision Run Error
-------
FIGURE 2 - RELATIVE IMPORTANCE OF INDEXING VARIABLE DEFINED BY TABLE 1 TO
ESTIMATE OF PRECISION RUN ERROR USING AN 11 DIMENSIONAL ANALYSIS
(Ordinate: relative importance. Abscissa: indexing variable (see Table 1),
grouped as wind, humidity, temporal, and previous day's wind.)
-------
ordinate indicates the effect of each of these variables.
One must also account for correlation between variables in
interpreting this figure. For these data, this may be
accomplished by multiplying the average value of each of
the sets of variables representing the wind and the humidity
(i.e., variables 11 through 60 taken in sets of 10) by 10.
The need for this procedure can be seen by noting that if the
same variable is included "n" times, the relative importance
of each of the indices corresponding to these variables is
reduced to 1/n of its value. Considering each of the wind
and humidity variables to be made up of an average plus a
small variation about the average, the effect of this average
is entered 10 times. When this is done, one obtains the
results which are presented in Table 2. This table summarizes
the most important variables for estimating the precision
run error as defined by Figure 2. Table 2 shows that the
most important single variable (which accounted for less than
10% of the explained variation) was whether the test was
performed on Monday or not. If the test was performed on
Monday, the precision run error was larger. The next four
most important variables, each of approximately the same
importance and nearly as important as whether the test was
performed on Monday, were: the spread in the preceding day's
wind speed, the component of the wind normal to the line between
Stations 1 and 2 and the cooling tower, the difference between the
wet and dry bulb temperatures, and the relative humidity. Note
that the line between Stations 1 and 2 and the cooling devices
is approximately an east-west line so that the component of the
wind most important to the precision run error was the north-
south component of the wind.
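The 1/n argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the minimum-norm least-squares solution from NumPy as a stand-in for the ADAPT regression, and the data are synthetic. When one variable is entered ten times, its weight is split equally among the ten copies, so each copy shows one tenth of the weight and the sum over the copies recovers the original value.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 3.0 * x                              # dependent variable driven by x alone

# Variable entered once.
single = np.linalg.lstsq(x[:, None], y, rcond=None)[0]

# Same variable entered n times; lstsq returns the minimum-norm solution.
n = 10
repeated = np.linalg.lstsq(np.tile(x[:, None], (1, n)), y, rcond=None)[0]

print(single)           # one weight, close to 3.0
print(repeated)         # n equal weights, each close to 3.0 / n
print(repeated.sum())   # summing the copies recovers the single weight
```

Summing the ten equal weights is the same as multiplying their average by 10, which is the correction applied to the wind and humidity sets before Table 2 was prepared.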
Of the five most important variables contributing to
precision run error, only the first appears to represent a
factor which can easily be controlled. If one assumes that
the importance of Monday to the precision run error is due
to the effect of the weekend on the test equipment or personnel
performing the test, some improvement in the precision run
error could be achieved by particular emphasis on motivating
the personnel to exercise special efforts on the first day of
the week. It is also recommended that should any future tests
be performed, information on co-located stations should be
obtained for a minimum of 300 days. If this were done, it is
extremely likely that an algorithm could be developed to
-------
TABLE 2 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF PRECISION RUN ERROR

ENVIRONMENTAL VARIABLE NAME                    INDEX NO.
Test on Monday                                 68
Preceding Day's Spread in Wind Speed           76
Wind Normal to Position Direction              21-30
Difference in Dry & Wet Bulb Temperature       41-50
Relative Humidity                              51-60
Salt Concentration Deposited                   1
Bugs on Sample                                 5
Start Time                                     9
Test Performed in the Fall                     67
Test on Wednesday                              70
Test on Friday                                 72
RELATIVE
IMPORTANCE
EFFECT ON PRE
INCREASE DECREASE
X
X
4
4
i
2
2
2
2
2
X
X
X
X
X
X
X
X
X
-------
FIGURE 3 - ESTIMATED VERSUS ACTUAL PRECISION RUN ERROR
LEAST SQUARE ESTIMATED VERSUS ACTUAL
(Ordinate: estimated precision run error. Abscissa: actual precision run error.)
-------
estimate the precision run error as a function of the en-
vironmental variables shown in Table 1.
The results of applying the algorithm represented by
the relative importance vector in Figure 2 to the data
used to derive the algorithm are shown in Figure 3. Note
that this is a plot of the data which were used to derive
the algorithm rather than independent test data. These data
are often referred to as learning or training data. Thus,
Figure 3 presents a plot of the estimated precision run
error versus the actual precision run error for the learning
data. The ordinate of this figure is the precision run
error estimated using the regression algorithm corresponding
to the relative importance vector of Figure 2. The abscissa
of Figure 3 is the actual precision run error for each of
these cases. The performance shown on this figure corres-
ponds to a correlation coefficient of 0.56, which is equival-
ent to explaining approximately 17% of the variation in the
data.
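The report does not state how the explained variation is computed from the correlation coefficient. The quoted pairs (0.56 with 17% here, and 0.54 with 16% for the pooled background estimate) are not the squared correlation coefficient, but they are consistent with the fractional reduction in the standard deviation of the residual, 1 - sqrt(1 - r^2). The sketch below evaluates that relation; treating it as the report's definition is an assumption, and the function name is hypothetical.

```python
import math

def explained_variation(r):
    """Fractional reduction in residual standard deviation for correlation r.

    Assumed definition: 1 - sqrt(1 - r**2). This is inferred from the quoted
    numbers and is not stated in the report.
    """
    return 1.0 - math.sqrt(1.0 - r * r)

for r in (0.56, 0.54):
    print(r, round(explained_variation(r), 2), round(r * r, 2))
```

For r = 0.56 the assumed relation gives 0.17, whereas r squared would give 0.31, so the distinction matters when comparing these figures with results reported elsewhere.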
Each of the points shown on Figure 3 may be related to
the raw data presented in Appendix C of Reference 1 by use
of Table 3 which indicates the sequential order in which
the symbols appear on the plot. The cases were plotted in
chronological order according to this sequence. Since only
17% of the precision run error could be explained by the
environmental factors, no attempt was made to estimate
precision run error for the other environmental conditions
occurring during the later tests. All of the learning data
were obtained during the early portions of the test. The
ADAPT analysis of the data which is reported in Appendix B
showed that the character of the environment for the data
obtained during the early portions of the experiment in which
the precision run error was obtained was significantly
different from the character of the environment during the
later portions of the experiment. Since only 17% of the
variation could be explained by the environmental factors,
the best approach is to use 6% as the estimate of the
precision run error with a 70% confidence that the actual
precision run error lies between 2% and 10%. The analysis of
the precision run error regression algorithm discussed in
Appendix C of this report indicates that the most likely explana-
tion of the inability to develop a successful algorithm for esti-
mating the precision run error was the fact that only 65 learning
cases were available for this estimate.
-------
TABLE 3
ORDER OF PLOT SYMBOLS. READ FROM LEFT TO RIGHT
123456789=+-*/.C3ta"$0(?/3
-------
ESTIMATION OF BACKGROUND CONCENTRATIONS
The Turkey Point data were processed through the ADAPT
programs to develop regression algorithms to estimate the
background concentrations (i.e., concentrations with no
cooling device operating) as a function of the environmental
variables defined in Table 1 for each of the measurement
stations. These estimates were made for all of the stations
pooled together, for all of the stations during east winds,
and for each of the individual stations. The results of these
estimates are summarized in Table 4. Physical reasoning
indicates that the cooling device will only affect downwind
stations; therefore, the background cases consisted of both
the data obtained when the cooling devices were not operating
and the upwind data.
Table 4 provides a summary of both the number of learning
cases and the number of dimensions used for each algorithm,
the performance of each algorithm in terms of the correlation
coefficient and the explained variation and the mean con-
centrations and standard deviations of the learning data used
to derive each of the algorithms. Each of the three sets of
algorithms developed will be discussed independently in the
following sections.
Estimate of Background Salt Concentration
Pooled Over All Stations
The algorithm derived to estimate the background salt
concentration using the data pooled over all of the stations
proved to have a relatively poor performance as indicated by
the correlation coefficient of 0.54 which provides an explained
variation of 16%. Figure 4 shows the performance of this
algorithm on each of the 478 learning cases used to derive the
algorithm. This figure shows that those cases having very large
salt concentrations were badly underestimated using this algorithm.
It was hypothesized that this poor performance
is due to the format of the wind vector.
The wind vector used was selected to be optimal for
estimating the concentration due to the cooling devices and
not the background concentration. To avoid the discontinuous
change in direction that occurs as one moves from 360 to 0
degrees in a polar coordinate system, the wind vector
used was the projection on two perpendicular directions as shown
in Figure 5. Since the primary objective of this study is to
-------
TABLE 4 - SUMMARY CONCENTRATION STATISTICS FOR
BACKGROUND CONDITIONS

                  NO OF   NO OF                    MEAN CONC.        STD DEV.
                  LEARN   DIM     CORR    EXPL              (ug/m3)
STATION           CASES   USED    COEF    VAR.     LEARN   DWN WD    LEARN   DWN WIND
ALL-POOLED        478     30      0.54    0.16     6.41              4.80
POOLED EAST WD    181     20      0.65    0.25     6.29              4.42
POOLED ABSOLUTE
  EAST WIND       181     20      0.78    0.38     6.29    -         4.42    -
1,2               140     20      0.74    0.33     6.36    -         4.60    -
3                 181     16      0.62    0.22     6.12    6.4       5.18    2.5
4                 136     16      0.53    0.15     5.80    4.9       5.33    2.5
5                 135     16      0.57    0.18     4.91    5.8       4.15    2.1
6                 137     16      0.69    0.28     5.46    7.8       4.38    2.9
7                 122     12      0.63    0.23     5.04    1.8       4.33    2.6
8                 31      4       0.64    0.23     5.66    7.1       5.21    2.7
9                 95      8       0.63    0.23     4.14    5.39      3.48    2.4
10                127     12      0.77    0.36     3.92    6.4       3.42    1.7
11                46      4       0.26    0.03     7.39    4.4       8.68    2.1

TABLE 5 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND SALT CONCENTRATION
(BASED ON EAST WIND DATA PLUS OR MINUS 45°)

ENVIRONMENTAL VARIABLE NAME                                    INDEX NO.
Preceding Day's Spread and Standard Deviation of Wind Speed    73,75
Projection of Wind Vector on Position Direction                10-19
Dry Bulb Temperature                                           30-39
Presence of White Caps                                         7
Humidity                                                       50-59
Projection of Position on East Direction                       1
RELATIVE
IMPORTANCE
46
15
10
9
8
EFFECT ON PRECISION
INCREASE DECREASE
X
X
X
X
X
-------
FIGURE 4 - ESTIMATED VERSUS ACTUAL AMBIENT SALT CONCENTRATION
POOLED OVER ALL MEASUREMENT STATIONS
(Ordinate: estimated concentration. Abscissa: actual concentration.)
-------
FIGURE 5 - DEFINITION OF WIND VECTOR
(ILLUSTRATION FOR NORTH WIND)
Wind vector of magnitude WS, shown with its components at Stations 9 and 10
relative to the cooling device.
Wind components at Station 10:
  Projection of wind vector on position direction = -WS * COS(alpha10)
  Projection of wind vector on normal to position direction = WS * SIN(alpha)
Wind components at Station 9:
  Projection of wind vector on position direction = +WS * COS(alpha9)
  Projection of wind vector on normal to position direction
-------
determine the effect of the cooling devices, the wind is
projected on a coordinate system located relative to the
line connecting the cooling devices and the measurement
stations. Since measurement stations were located on all
sides of the cooling devices, this coordinate system only
has meaning with respect to the background salt concentration
on a station by station basis. For example, Figure 5 shows
that a North wind will be positive at Station 9 but negative
at Station 10. Thus, the effect of this wind is cancelled
when pooled over these two stations. Consequently, the linear re-
gression algorithm cannot make use of the wind information
when the data are pooled over all stations.
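The cancellation described above can be illustrated with a short calculation. This is a sketch under stated assumptions, not the report's code: the station coordinates are hypothetical (one station 300 meters due south of the cooling device and one 300 meters due north, rather than the actual positions of Stations 9 and 10), and the wind is described by the compass bearing toward which it blows.

```python
import math

def wind_components(wind_speed, wind_to_deg, station_east, station_north):
    """Project the wind on the device-to-station direction and on its normal.

    wind_to_deg is the bearing the wind blows toward, in degrees clockwise
    from north (an assumed convention). Station coordinates are measured
    from the cooling device.
    """
    wind_e = wind_speed * math.sin(math.radians(wind_to_deg))
    wind_n = wind_speed * math.cos(math.radians(wind_to_deg))
    dist = math.hypot(station_east, station_north)
    unit_e, unit_n = station_east / dist, station_north / dist
    along = wind_e * unit_e + wind_n * unit_n       # position direction
    normal = -wind_e * unit_n + wind_n * unit_e     # position rotated 90 degrees
    return along, normal

# A north wind blows toward the south (bearing 180 degrees).
south_station = wind_components(4.0, 180.0, 0.0, -300.0)
north_station = wind_components(4.0, 180.0, 0.0, 300.0)

print(south_station[0], north_station[0])   # equal magnitude, opposite sign
print(south_station[0] + north_station[0])  # pooled average of the two is zero
```

A linear algorithm fitted to data pooled over both stations sees the same wind as a positive value in one case and a negative value in the other, so the pooled wind variable carries no usable information about wind magnitude.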
This hypothesis is supported by the relative importance
plot presented in Figure 6. Figure 6 presents the relative
importance of each of the variables shown in Table 1 to the
estimate of the background salt concentration pooled over
all stations. The environmental variables may be identified
by using Column 2, headed 75Pt, of Table 1. The average
values of variables 10 through 19 and 20 through 29, the projections
of the wind vector on the position direction and on the normal
to the position direction respectively, are very close to 0.
This indicates that the background salt concentration algorithm
derived using the pooled data is not using the average magnitude
of the wind during the test. Physically, one knows that wind
should be important to this estimate and thus, we have verified
that the change of coordinate systems from station to station
is denying the algorithm this information. This figure shows
that the most important information available for the pooled
estimate is the temporal information, indicating time of year
during which the tests were performed and the spread of
variation in the previous day's wind speeds.
The difficulty resulting from the definition of the wind
vector could be corrected in two different ways. The first
would be to redefine the wind vector by projecting the wind on
two perpendicular compass directions and then rederive the
ADAPT optimal base using these new data vectors. This base
could then be used to rederive the regression algorithms. The
second approach is to use the data vectors as presently defined
and derive the concentrations for each of the individual
stations. Since the second approach was already planned as
part of the study, and eliminates the need to explain the
variation due to position, this approach was the solution chosen.
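For comparison, the first alternative (projecting the wind on two perpendicular compass directions) can be sketched as follows. The function name and the convention of measuring the bearing the wind blows from, in degrees clockwise from north, are assumptions made for this illustration. The example also shows why components are preferred to the raw bearing: winds from 359 degrees and from 1 degree are nearly identical, yet their bearings differ by 358 degrees.

```python
import math

def compass_components(wind_speed, wind_from_deg):
    """East and north components of the direction the wind blows from."""
    theta = math.radians(wind_from_deg)
    return wind_speed * math.sin(theta), wind_speed * math.cos(theta)

east_a, north_a = compass_components(5.0, 359.0)
east_b, north_b = compass_components(5.0, 1.0)

print(abs(359.0 - 1.0))                               # raw bearings: 358 degrees apart
print(abs(east_a - east_b), abs(north_a - north_b))   # components: nearly equal
```

Because these components do not depend on station position, they keep the same meaning when data from all stations are pooled.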
-------
FIGURE 6 - RELATIVE IMPORTANCE OF INDEXING VARIABLE DEFINED BY TABLE
1 TO ESTIMATE OF AMBIENT SALT CONCENTRATION POOLED OVER ALL STATIONS
(Abscissa: indexing variable, grouped as wind, humidity, temporal, and
previous day's wind.)
-------
Estimate of Background Salt Concentration Pooled Over All
Stations During East Winds
A second study pooling all of the data was performed using
only the data collected when the wind was within plus or minus
45 degrees of an east wind. Since the wind direction is
limited, the difficulty created by the selection of the wind
vector is less important. Table 4 shows that, even though
it was necessary to reduce the dimensionality to 20 dimensions
because only 181 learning cases were available, the performance
of the algorithm improved, giving a correlation coefficient
of 0.65, which corresponds to explaining 25% of the variation.
Figure 7 presents the relative importance vector for this
algorithm. Comparing this to Figure 6, we see that variables
10-19 are all positive and thus the absolute magnitude of
the wind is now playing a significant role. It is interesting
to note that the effect of the spread in the previous day's
wind, variables 73 and 75, is now opposite to that seen in
Figure 6. One explanation for this is that when the magnitude
information for the current day's wind is available, the
algorithm no longer tries to estimate the current day's wind
from the spread in the previous day's wind. Table 5 summarizes
the most important variables indicated by Figure 7. The spread
in the previous day's wind is still the most important variable,
although the next most important variable has now become the
magnitude of the current day's wind, followed by the presence
of white caps.
In order to determine whether the selection of the wind vector
still affects the performance when the winds are restricted
to within plus or minus 45 degrees of an east wind, an east
wind algorithm was rederived after modifying the data vectors
such that the absolute values of the projections of the wind on
the position direction and on the normal to the position
direction were used. Although this eliminates the effect of
wind direction, it allows the linear regression algorithm to
use the wind magnitude even when the data are pooled over all
of the stations. Since the wind direction is limited to within
plus or minus 45 degrees of the east wind, this is not a severe
restriction. Table 4 shows that the algorithm based on the
pooled absolute wind shows significant improvement in
performance, with a correlation coefficient of 0.78 and an
explanation of 38% of the variation. This algorithm is presented
in Appendix D. The relative importance vector showing the
effect of each of the environmental variables shown in
Table 1 on the estimated salt concentration is shown in
Figure 8, and the most important of these variables are
summarized in Table 6. Comparison of Figures 7 and 8 verifies
that the form of the wind data vectors selected still has a
significant effect even when the wind is restricted to plus or
minus 45 degrees, but that a significant amount of the
detrimental effect may be eliminated by using the absolute
magnitude of the wind vector. Note that variables 10 through 19
are now significantly more important and that variables 20
through 29 have also become far more important. Referring to
Table 6, we see that the most important variables are now the
two components of the present day's wind, followed again by the
spread of variation in the previous day's wind and the presence
of white caps. Note that some of the variables associated
with relative humidity and the position of the station in
the east-west direction are approximately as important as these
latter variables. Thus, from Figure 8, we may conclude that
for the case of east winds, the most important environmental
factor for determining the background concentration of salt
is the magnitude of the current day's wind. This is followed
by the distance of the station from the ocean, the previous
day's variation in wind, the presence of white caps and the
humidity, all of which have approximately an equal influence
of about 1/10th that of the magnitude of the wind. To obtain
similar conclusions on the effect of environmental variables
for all of the wind directions, it would be necessary to redefine
the wind vector by projecting it on two perpendicular compass
directions and then rederive both the optimal base and the
regression algorithm.
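The pairing of correlation coefficients with explained variation
quoted above (0.65 with 25%, 0.78 with 38%) does not match the
squared correlation coefficient. One reading that appears
consistent with the quoted figures is the fractional reduction in
standard deviation, 1 - sqrt(1 - r^2). This is an assumption made
here for illustration; the definition actually used in the study
is not stated in this section. A minimal Python sketch:

```python
import math

def explained_std_fraction(r):
    """Fractional reduction in standard deviation for correlation r.

    Assumed definition: 1 - sqrt(1 - r^2), i.e. the residual
    standard deviation relative to the original one, subtracted
    from unity.
    """
    return 1.0 - math.sqrt(1.0 - r * r)

def explained_variance_fraction(r):
    """Conventional fraction of variance explained, r^2."""
    return r * r

for r in (0.65, 0.78):
    print(r,
          round(100.0 * explained_std_fraction(r), 1),
          round(100.0 * explained_variance_fraction(r), 1))
```

Under this assumed definition the two correlation coefficients
give roughly 24% and 37%, close to the quoted 25% and 38%, whereas
the squared coefficients give roughly 42% and 61%.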
BACKGROUND CONCENTRATIONS AT EACH STATION
Regression algorithms were developed to estimate the
background concentrations at each of the individual measurement
stations. These algorithms have the advantage that the variation
due to position need not be explained by the regression algorithm
and that the optimal definition of the wind vector for
estimating the concentration with the devices operating is also
optimal for estimating the background conditions. Thus, these
algorithms were used to estimate the background concentration
which would occur at the times when the downwind data were taken.
The second and third columns of Table 4 summarize the number
of learning cases which were available for developing each
-------
FIG-7 RELATIVE IMPORTANCE VECTOR FOR EAST WIND (45-N-20)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
FIG-8 RELATIVE IMPORTANCE VECTOR FOR ABSOLUTE EAST WIND (43-N-20)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
TABLE 6 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR ESTIMATE
OF BACKGROUND SALT CONCENTRATION (BASED ON ABSOLUTE
EAST WIND DATA)
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
WIND SPEED DURING THE TEST                         10-29
WHITE CAPS                                         7
HUMIDITY                                           50-59
DRY BULB TEMPERATURE                               30-39
DISTANCE OF STATION FROM THE OCEAN                 1
PRECEDING DAYS SPREAD AND STD DEV IN WIND SPEED    73,75
RELATIVE IMPORTANCE (as extracted, in column order; five values
survive for six rows): 400, 6.8, 6, 6, 5.7
EFFECT ON ESTIMATE (INCREASE / DECREASE): six X marks; the column
of each mark is not recoverable.
TABLE 7 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR ESTIMATE
OF BACKGROUND CONCENTRATION AT STATIONS 1 AND 2
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE
SPREAD AND STANDARD DEVIATION OF PREVIOUS DAYS WIND  73,75     150
HUMIDITY                                           50,59       40
TESTS PERFORMED IN THE SPRING                      62          27
DIFFERENCE BETWEEN DRY AND WET BULB TEMPERATURE    40-49       25
LIGHT RAIN                                         4           11
PROJECTION OF WIND ON POSITION DIRECTION           10-19       10
END TIME                                           9           9
START TIME                                         8           8
EFFECT ON ESTIMATE (INCREASE / DECREASE): one X mark per row; the
column of each mark is not recoverable.
-------
of these algorithms and the number of dimensions used.
Performance of the algorithms for each of these stations
is summarized in Columns 4 and 5 of Table 4. These algorithms
are included in Appendix D. The performance ranges from
very poor at Station 11 to marginally good at Stations 1,
2 and 10. The last four columns of this table show the
measured mean and standard deviation of the background
concentration at each of the stations as well as the predicted
background mean and standard deviation of the concentrations
which would have occurred at each of the stations during the
time period when the cooling devices were operating.
Comparison of the mean concentrations for the learning data
(i.e. the time period during which the cooling devices were not
operating) and the mean obtained when the cooling devices
were operating shows that they are usually within a standard
deviation of each other. The standard deviation of the
estimates is smaller than that of the measured cases. This is
due to the elimination of noise from the estimated values, which
reduces the spread about their mean value.
Examination of Table 4 shows that, in general, the explained
variation for the algorithms developed for the independent
stations is less than the explained variation for the algorithm
developed for the stations pooled over the absolute east wind.
There are two reasons for this lower value of the explained
variation. The first is that there is less variation to
explain, since the algorithm pooled over all of the stations
must also explain the variation due to station location. If
this variation is relatively easy to explain, the percentage
of variation explained would be greater when this variation
is included in the data. The second reason is that the number
of learning cases is significantly reduced for the individual
stations and, as discussed in Appendices & and d, this reduces
the number of dimensions which can be used for the analysis.
The performance of the algorithms developed for those stations
for which a relatively large number of cases were available is
in general not too different from that obtained for the cases
which were pooled over stations with the absolute east wind.
This implies that the use of a new base with the redefined
wind vector, with the data pooled over all of the stations,
would not yield a significantly better estimate for the
concentration. However, the advantage of this new pooled
algorithm would be a capability to predict the background
concentration at almost any point in the vicinity of the Turkey
Point facility.
-------
The calculated background concentration at each of the
stations could be improved either by deriving a separate
base for each of the stations, thus eliminating the position
variation from the ADAPT base and decreasing the number of
dimensions required to achieve a given degree of
representation, or by increasing the number of learning cases
available, or by a combination of the two. Since only a
relatively small improvement will occur as a result of
developing a new base at each station and approximately a factor
of two increase in the dimensionality is required, it is
recommended that for any future test at least 250 to
300 measurements be made at each station. It should be noted
that the improvement in the estimate of the background
conditions will decrease the standard deviation of the estimate,
but the mean value of the background concentration averaged
over all of the stations will probably remain approximately
the same as was found in this study. It is unlikely that any
improvement in the ability to estimate the background
concentration will allow a better estimate of the effect of the
cooling device unless there is a corresponding improvement
in the ability to measure the concentration.
Tables 7 through 16 summarize the relative importance
of the environmental variables for the concentration at each
of the stations. These tables summarize the corresponding
relative importance plots such as those presented in Figures 9
and 10. The relative importance vectors for the other stations
are included in Appendix C. It is interesting to compare the
relative importance plot presented in Figure 9 for Station 10
with the relative importance plot presented in Figure 8 for
the pooled stations with an east wind. The dominant variable,
the magnitude of the wind, is still the same. However, the
effect of the station location, Variable No. 1 in Figure 8, is
now very nearly 0. This should be the case since the location
of Station 10 is fixed and this algorithm does not include
the variation in position in the learning data. We also see
that the effect of white caps and the spread in the previous
day's wind is not important at Station 10. Examination of
Table 7 shows that these variables are important at Stations 1
and 2. Thus, we conclude that the effect of the previous day's
wind is most important for those stations located near the ocean
and becomes relatively unimportant for those stations located
far back from the ocean. Figure 9 shows that Variable 64,
which Table 1 shows as a test performed in the fall, is more
important than was observed in Figure 8. Examination of Tables 7
-------
TABLE 8 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 3
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE   EFFECT ON PRE
Projection of Wind on Position Direction           10-19       20                    X
Dry Bulb Temperature                               30-39       10                    X
Variation in Relative Humidity During Test         40-59
Projection of Wind Vector on Normal to
  Position Direction                               20-29       6                     X
Tests Performed in the Fall                        64          3                     X
(Whether each X falls under INCREASE or DECREASE is not recoverable.)
TABLE 9 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 4
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE   EFFECT ON PRE
Projection of Wind Vector on Position Direction    10-19       15                    X
Projection of Wind Vector on Normal to
  Position Direction                               20-29       9                     X
Relative Humidity                                  50-59       9                     X
Difference Between Dry & Wet Bulb Temperature      40-49       6                     X
Dry Bulb Temperature                               30-39       5                     X
Tests Performed in the Summer                      63          3                     X
(Whether each X falls under INCREASE or DECREASE is not recoverable.)
-------
TABLE 10 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 5
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Projection of Wind on Position Direction           10-19
Projection of Wind on Normal to Position Direction 20-29
Dry Bulb Temperature                               30-39
Variation in Humidity                              40-59
Tests Performed in the Summer                      63
Tests Performed on Friday                          69
RELATIVE IMPORTANCE (as extracted, in column order; five values
survive for six rows): 10, 5, 4, 2%, 2
EFFECT ON PRE (INCREASE / DECREASE): five X marks; rows and
columns not recoverable.
TABLE 11 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 6
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Projection of Wind Vector on Normal to
  Position Direction                               20-29
Projection of Wind Vector on Position Direction    10-19
Dry Bulb Temperature                               30-39
Variation in Humidity                              40-59
White Caps                                         7
RELATIVE IMPORTANCE (as extracted, in column order; three values
survive for five rows): 24, 18, 10
EFFECT ON PRE (INCREASE / DECREASE): four X marks; rows and
columns not recoverable.
-------
TABLE 12 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND CONCENTRATION
AT STATION 7
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Projection of Wind on Position Direction           10-19
Projection of Wind on Normal to Position Direction 20-29
Tests Performed in the Fall                        64
Tests Performed in the Summer                      63
Variation in Humidity                              40-59
RELATIVE IMPORTANCE (as extracted, in column order; four values
survive for five rows): 20, 20, 13, 8
EFFECT ON PRE (INCREASE / DECREASE): four X marks; rows and
columns not recoverable.
TABLE 13 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 8
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE
Projection of Wind on Position Vector              10-19       22
Difference Between Dry and Wet Bulb Temperature    40-49       5
Dry Bulb Temperature                               30-39       2
Relative Humidity                                  50-59       2
Days Since First Test                              60,61       3
EFFECT ON PRE (INCREASE / DECREASE): one X mark per row; the
column of each mark is not recoverable.
-------
TABLE 14 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 9
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE
Magnitude of the Cross Wind                        20-29       21
Magnitude of Wind Towards Cooling Device           10-19       5.5
Difference Between Dry and Wet Bulb Temp           40-49       2
Humidity                                           50-59       2
Test Performed in the Spring                       62          2
Test Performed on Friday                           69          1.2
Test Performed in the Fall                         64          1.1
Test Performed on Wed                              67          1.1
EFFECT ON ESTIMATE (INCREASE / DECREASE): one X mark per row; the
column of each mark is not recoverable.
TABLE 15 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 10
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Magnitude of Wind On Position Direction            10-19
Wind Magnitude Normal to Position Direction        20-29
Difference Between Dry and Wet Bulb Temp           30-39
Test Performed in the Fall                         64
Length Time Since First Test                       60,61
RELATIVE IMPORTANCE (as extracted, in column order; four values
survive for five rows): 28, 17, 7, 4.5
EFFECT ON ESTIMATE (INCREASE / DECREASE): five X marks; the
column of each mark is not recoverable.
-------
TABLE 16 - MOST IMPORTANT ENVIRONMENTAL VARIABLES
FOR ESTIMATE OF BACKGROUND CONCENTRATION AT
STATION 11
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Projection of Wind Vector on Position Direction    10-19
Difference Between Dry and Wet Bulb Temperature    40-49
Dry Bulb Temperature                               30-39
Relative Humidity                                  50-59
Tests Performed in the Spring                      62
Days Since First Test                              60,61
Projection of Wind on Normal to Position Direction 20-29
Tests Performed on Monday                          65
RELATIVE IMPORTANCE (as extracted, in column order; seven values
survive for eight rows): 11, 5, 4, 3, 3, 4, 1.5
EFFECT ON PRE (INCREASE / DECREASE): seven X marks; rows and
columns not recoverable.
-------
FIG-9 RELATIVE IMPORTANCE VECTOR FOR BACKGROUND CONC AT STATION-10 (N=12)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
FIG-10 RELATIVE IMPORTANCE VECTOR FOR BACKGROUND CONC AT STATION-9 (N=8)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
through 16 and the corresponding figures in Appendix C shows
that the seasonal effects vary considerably from station to
station. Thus, a pooled estimate over all stations tends
to lose these seasonal effects. The reason why the seasonal
effects are different from station to station is unknown.
Comparison of Figure 10 with Figure 9 verifies the
previous explanation for why the estimate pooled over all
stations without limitation to wind direction loses the
importance of the magnitude of the wind. At Station 9 the
effect of the wind projected on the position direction is
exactly opposite to that for Station 10. Thus, when pooled
over all stations, the data for these two stations tends to
cancel the effect of wind direction.
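The cancellation described above can be reproduced with a small
synthetic example. The Python sketch below is illustrative only:
the data are simulated, the coefficient of 2.0 and the noise level
are arbitrary assumptions, and an ordinary least-squares slope
stands in for the regression algorithms of this study. Two
hypothetical stations see the same wind, but their position
directions are opposite, so the projected wind has opposite signs.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)

# Simulated wind magnitudes; salt rises with magnitude at both stations.
wind = [rng.uniform(0.0, 10.0) for _ in range(200)]
salt_a = [2.0 * w + rng.gauss(0.0, 1.0) for w in wind]
salt_b = [2.0 * w + rng.gauss(0.0, 1.0) for w in wind]

# Station-relative projections: same wind, opposite position directions.
proj_a = [+w for w in wind]
proj_b = [-w for w in wind]

slope_a = slope(proj_a, salt_a)                       # close to +2
slope_b = slope(proj_b, salt_b)                       # close to -2
pooled = slope(proj_a + proj_b, salt_a + salt_b)      # close to 0
pooled_abs = slope([abs(p) for p in proj_a + proj_b],
                   salt_a + salt_b)                   # close to +2
print(slope_a, slope_b, pooled, pooled_abs)
```

Each station alone recovers a strong wind effect, the pooled fit on
signed projections loses it, and the pooled fit on absolute values
recovers it, which parallels the behavior reported for the pooled
and absolute east wind algorithms.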
Examination of the relative importance of the wind for each
of the stations also shows the effect of distance of the station
from the ocean and from the cooling basin located south of the
cooling devices. For Stations 1 and 2 the dominant effect is
the variability of the previous day's wind, and the current day's
wind is not as significant. This suggests that for stations
very near the ocean, the most important factors are those
controlling the salt concentration over the ocean. In going from
Station 3 to Station 4, the reader should recall that the
definition of the wind vector is such that the sign of the
effect of a given wind is reversed because of the change in
coordinate system. Thus, if the projection of the wind on the
position direction is positive at Station 3, the same wind
projection is negative at Station 4. At Stations 3, 4 and 5,
all of which are located on an East-West line adjacent to the
cooling basin, the effect of the projection of the wind on the
position direction (i.e. the East-West wind component) is of
the same order as that of the wind component in the North-South
direction. However, for Station 6, which is located on the same
East-West line but not directly adjacent to the cooling basins,
the effect of the North-South wind relative to the East-West
wind has been increased. Thus, as we move further away from the
cooling basins, we see an effect similar to that which was
observed as we moved away from the ocean. This
suggests that the cooling basins probably have a significant
effect on the concentration at those stations located close to
them. Thus, it is recommended that for any future experiments
any open water, including cooling basins, be considered as a
source of background salt concentration.
-------
EFFECT OF COOLING TOWER ON AMBIENT CONCENTRATION
The effect of the cooling tower on the ambient concentration
was obtained by using the algorithms discussed in the preceding
section and summarized in Table 4 to estimate the background
salt concentration which would have occurred during each of the
tests where the cooling tower was operating. This calculated
background concentration was subtracted from the measured
concentration to determine the salt deposition which could be
attributed to the cooling tower. When this difference was
averaged over the entire set of 398 measurements made during
the operation of the cooling tower, the average increase in
ambient concentration over the expected background concentration
was 0.002 micrograms per cubic meter with a standard deviation
of 4.8 micrograms per cubic meter. This clearly indicates that
on the average the increase in concentration was less than
could be measured in this test program.
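The subtraction described above can be sketched in a few lines of
Python. The concentrations below are invented for illustration and
are not data from this study; the function name is hypothetical,
and the predicted background values stand in for the output of the
station regression algorithms.

```python
from statistics import mean, stdev

def increment_statistics(measured, predicted_background):
    """Mean and sample standard deviation of measured minus background."""
    diffs = [m - b for m, b in zip(measured, predicted_background)]
    return mean(diffs), stdev(diffs)

# Hypothetical concentrations in micrograms per cubic meter.
measured = [5.1, 3.0, 7.4, 2.2]
background = [4.8, 3.5, 6.9, 2.6]

avg, sd = increment_statistics(measured, background)
print(avg, sd)
# An average much smaller in magnitude than the standard deviation
# gives no evidence of a measurable increase.
print(abs(avg) < sd)
```

In this invented example, as in the pooled result quoted above, the
average difference is far smaller than its standard deviation.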
Similar analyses were made for each of the individual
stations. The results of these analyses are summarized in
Table 17. The first column of Table 17 indicates the station
number for which the statistical summary is provided. The
second column indicates the number of cases for which
measurements were made at that station while the cooling tower
was operating. The third column gives the mean value of the
difference between the measured concentration and the expected
background concentration at that station. The standard deviation
of this average is provided in the fourth column. The fifth
and sixth columns provide the maximum and minimum values which
were observed for the difference between the measured and the
expected background concentration. The seventh column gives the
confidence that the increase in concentration at the station
due to the cooling tower was less than the standard deviation
of the measurement. The average of this confidence for all
stations was 83%. This table shows that for Station 10 there
were only three cases and thus a statistical summary has no
meaning for this station. Thus, for the remainder of this
discussion we shall only consider Stations 3-9 and 11. For
each of these stations we note that the standard deviation of
the measurement exceeds the mean value of the measurement. This
fact alone is a strong indication that no measurable enhancement
of the expected background concentration resulted from operating
the cooling tower. A further strengthening of this conclusion
beyond the 83% confidence level is the fact that approximately
half of the mean values are negative and the other half are
positive. This is exactly the situation that would be expected
in the event the cooling tower had no effect upon the background
-------
TABLE 17 - SUMMARY STATISTICS FOR DOWN WIND MINUS BACKGROUND
CONCENTRATION FOR COOLING TOWER
Columns: STA; NO CASES; MEAN (ugr/m3); STD DEV (ugr/m3);
MAX VALUE (ugr/m3); MIN VALUE (ugr/m3); CONFIDENCE THAT
INCREASE IS LESS THAN STD DEV (%)
STA    NO CASES
3      21
4      53
5      54
6      59
7      51
8      63
9      50
10     3
11     44
[The MEAN, STD DEV, MAX VALUE, MIN VALUE and CONFIDENCE entries
are not legible in this copy.]
TABLE 18 - SUMMARY STATISTICS FOR DOWN WIND MINUS BACKGROUND
CONCENTRATION FOR SPRAY MODULES
Columns: STA; NO CASES; MEAN (ugr/m3); STD DEV (ugr/m3);
MAX VALUE (ugr/m3); MIN VALUE (ugr/m3); CONFIDENCE THAT
INCREASE IS LESS THAN STD DEV (%)
STA    NO CASES
3      21
4      41
5      43
6      43
7      37
8      54
9      40
10     6
11     40
[The MEAN, STD DEV, MAX VALUE, MIN VALUE and CONFIDENCE entries
are not reliably legible in this copy.]
-------
concentration. This, combined with the extremely small value
of the average difference over all 398 cases, suggests that
the increase in the salt concentration over the expected
background concentration as a result of the operation of the
cooling tower is probably significantly less than the measurement
accuracies of the instrumentation used in this program.
EFFECT OF SPRAY MODULES ON AMBIENT CONCENTRATION
The effect of the spray modules on the ambient concentration
was obtained in exactly the same way as the effect of the
cooling tower. That is, the algorithms discussed in Section 3.2
were used to calculate the expected background salt concentration
for each of the tests performed with the spray modules operating.
The average increase over the expected background concentration
over all 325 tests was 1.32 micrograms per cubic meter with a
standard deviation of 9.9 micrograms per cubic meter. Again
we conclude that on the average the effect is smaller
than could be measured by this instrumentation. Table 18
presents the summary statistics for each one of the stations
with a spray module operating. This table is of the same form
as Table 17. Examination of this table also shows that for
Station 10 an inadequate number of cases was available for one
to draw a meaningful statistical conclusion. For the remaining
stations, we again note that approximately half of the stations
had negative mean values and half had positive mean values.
However, the fact that the average over all 325 tests was 1.32
micrograms per cubic meter shows that in general those with
positive mean values had slightly larger magnitudes than those
with negative mean values. This would be an indication that
there may be some stations for which the concentration was
increased by a small amount by the operation of the spray module.
Table 18 shows that for Station 7 the mean value of the
difference between the observed and expected background
concentration is slightly greater than the standard deviation.
Since approximately 85% of the Gaussian distribution lies above
the point one standard deviation below the mean, there is
approximately 85% confidence that at Station 7 there was some
increase in the background salt concentration due to the
operation of the spray modules. At the other stations, the mean
value for the difference between the measured concentration and
the expected background concentration is less than the standard
deviation of the measurement with an average confidence of 84%.
Thus, for all of the other stations, we can conclude that the
effect of the spray modules on the background concentration is
smaller than the ability of the present instrumentation to
measure this effect.
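The confidence figures used in this discussion correspond to a
one-sided Gaussian bound: about 84% of a Gaussian distribution
lies on one side of a point one standard deviation from its mean,
whereas about 68% lies within plus or minus one standard
deviation. The Python sketch below illustrates this. The function
names are hypothetical, and the second function is only an assumed
reading of the "confidence that increase is less than std dev"
column; the study's own computation is not given in this section.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def confidence_positive(mean_diff, std_diff):
    """P(difference > 0) for a Gaussian with the given mean and std."""
    return normal_cdf(mean_diff / std_diff)

def confidence_below_one_std(mean_diff, std_diff):
    """P(difference < std_diff) under the same Gaussian assumption."""
    return normal_cdf((std_diff - mean_diff) / std_diff)

one_sided = normal_cdf(1.0)                      # about 0.84
two_sided = normal_cdf(1.0) - normal_cdf(-1.0)   # about 0.68

# Hypothetical station whose mean difference equals its standard deviation.
print(round(one_sided, 3), round(two_sided, 3))
print(round(confidence_positive(3.0, 3.0), 3))
print(round(confidence_below_one_std(0.0, 3.0), 3))
```

A mean difference equal to one standard deviation therefore gives
roughly 84% confidence of a positive increase, and a mean
difference of zero gives roughly 84% confidence that the increase
is below one standard deviation.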
Station 7 represents an unusual station for this study
in that it is the only station which is both relatively near
the spray modules and has a small standard deviation for the
difference between the measured salt concentration and the
expected background concentration. This implies that if the
measurements at some of the other nearby stations such as
Stations 3, 4, 11 and 5 had standard deviations of the order
of three micrograms per cubic meter, increases may also have
been observed at these stations. Since the smallest standard
deviations occurred at Stations 6, 7 and 9, which are not
located adjacent to the cooling basin, one might speculate that
the cooling basin itself introduced a significant amount of
variation into the measurements. In this case, it is suggested
that the location of Station 7 away from the cooling basins may
have been the reason why the accuracy was sufficient that one
was able to measure an increase due to the effect of the spray
modules. This leads to the recommendation that for future tests
one should locate the majority of the measurement stations at
least 300 meters from the cooling basin. We note that an
increase in the error due to the proximity to the cooling basin
is in general agreement with the results obtained by examination
of the relative importance vector for the background
concentration, which indicated that the cooling basin does have
a significant effect on the salt concentration and that this
effect is in part strongly dependent on the magnitude and
direction of the wind. It is not unreasonable to assume that
there may be other factors which affect the increase in salt
concentration due to the cooling basin which have not been
included in the present study.
Since the results in general for the effect of the cooling
devices on the background concentration were negative, algorithms
for estimating this effect could not be prepared except for
Station 7. Thus, it is impossible to use the present data to
obtain algorithms for estimating the effect of position and
environmental variables on the increase in the background
concentration due to the operation of either the cooling tower
or the spray modules. However, the data from Station 7 offer a
potential for making an algorithm to determine the effect of
the environmental variables on the salt concentration at the
location of Station 7. Since the standard deviations are small,
the data from Stations 6, 7, 9 and 10 may be pooled to obtain
a limited effect of position on the salt concentration. Thus,
algorithms were developed using the data from Station 7 to
predict the difference between the measured and expected
background salt concentration for the spray module. A similar
algorithm was developed using the data for Stations 6 through
9. The performance of these algorithms was quite poor, with
correlation coefficients of 0.40 and 0.60 for the Station 7
and Station 6 through 9 algorithms, respectively. The fact
that the correlation coefficient is greater for the Station 6
through 9 algorithm is the result of the fact that a larger
number of cases was available to make this algorithm. Thus,
a dimensionality of 16 could be used, which allowed one to
incorporate a significant portion of the information using
the base derived for the analysis of this salt spray data.
For Station 7, there were only 37 cases available and thus
the maximum dimensionality which could be used was 4. This
corresponded to using only approximately half of the
information in the data. Thus, the performance of this algorithm
is severely restricted by the limited number of cases available
for the analysis. This restriction could be alleviated to some
extent by the development of a new base using only the data
from Station 7 for the Station 7 algorithm and only the data
from Stations 6 through 9 to develop the Station 6 through 9
algorithm. Since this new base would not have to incorporate
the information regarding position variation or the information
concerned with the other stations, it would be a better base
for the analysis of the data from these limited subsets of
stations.
Although the performance of the algorithms for Stations
6-9 was very poor, there is reason to believe that the dominant
variables occurring in the relative importance vectors will
probably remain important even if more data were available for
the analysis. Thus, the relative importance vectors for these
two algorithms are presented in Figures 11 and 12. Figure 11
presents the relative importance vector for the algorithm
for estimating the increase in ambient salt concentration due
to the spray module at Station 7. The dominant variables for
this algorithm are summarized in Table 19. Similar results
are presented for Stations 6 through 9 in Figure 12 and summarized
in Table 20. In both of these cases, only a few of the most
dominant variables are included in these tables because of the
-------
FIG-11 RELATIVE IMPORTANCE VECTOR FOR SPRAY MODULE STA 7 (N=4)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
FIG-12 RELATIVE IMPORTANCE VECTOR FOR SPRAY MODULE STA 6-9 (N=16)
[Plot not reproduced. Horizontal axis: INDEXING VARIABLE (SEE
TABLE 1), grouped as WIND, HUMIDITY, TEMPORAL, PREV WIND.]
-------
TABLE 19 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR
ESTIMATE OF INCREASE IN SALT CONCENTRATION DUE
TO SPRAY MODULES AT STATION 7
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.
Projection of Wind on Position Direction           10-19
Projection of Wind on Normal to Position Direction 20-29
RELATIVE IMPORTANCE (as extracted; one value survives for two
rows): 16
EFFECT ON ESTIMATE (INCREASE / DECREASE): one X mark per row; the
column of each mark is not recoverable.
TABLE 20 - MOST IMPORTANT ENVIRONMENTAL VARIABLES FOR ESTIMATE
OF INCREASE IN SALT CONCENTRATION DUE TO SPRAY
MODULES AT STATIONS 6 THROUGH 9
ENVIRONMENTAL VARIABLE NAME                        INDEX NO.   RELATIVE IMPORTANCE
Test Performed in the Fall                         64          96
Test Performed in the Summer                       63          27
Projection of Wind on Position Direction           10-19       20
EFFECT ON ESTIMATE (INCREASE / DECREASE): one X mark per row; the
column of each mark is not recoverable.
-------
poor performance of these algorithms, which suggests that only
the most significant of the variables can be considered
meaningful. It is recommended that, if additional information is
desired regarding the amount of increase at Station 7 as a
function of variation in the environment, these data be
reprocessed through the ADAPT programs to derive an optimal
base for this station by itself and that this base be used
to rederive an algorithm to estimate the effect of the
environment on the concentration at Station 7. A similar
procedure is recommended for the data obtained from Stations 6-10.
This will result in significantly improved relative importance
vectors and possibly an algorithm allowing application of the
results of this study to other power plant sites (see Section
4.0).
-------
SECTION IV
EXTENSION TO OTHER SITES
The approach to the present study consisted of determining
the effect of the cooling devices on the ambient concentration
by subtracting the background concentration expected at the
Turkey Point site. Thus, any measured increase in concentration
found in this analysis would be independent of the site at which
these specific cooling devices are located. However, since the
effect observed was essentially no increase, these results are
trivial except for the effect of the spray module at Station 7.
Thus, we may state in general that the effect of the particular
cooling tower used in the present study would be smaller than
the accuracy of the present measurements regardless of the site
at which it was located. We may also state, based on the results
of the observations at Station 7, that a pair of spray modules as
used in this study can be expected to increase the background
salt concentration by approximately three micrograms per cubic
meter at distances of approximately 400 meters from the spray module,
averaged over environmental conditions similar to those
observed during the present test program.
Although the estimated increase in the background concentration
due to the cooling devices obtained as a result of
this study is independent of the site with respect to the
measurement accuracies found in this study, the measurement
accuracies themselves are not independent of the site. This
can be seen from the fact that the most likely reason that an
effect was observed at Station 7 and not at Stations 3, 4, 5
and 11 is that Station 7 was located at a greater
distance from the cooling basins than the other stations and
thus had more accurate estimates. Thus, we must conclude
that a similar study made at a different site where the measure-
ment accuracies can be expected to be more similar to those
observed on Station 7 would result in measurable increases in
the background concentrations at least at the stations located
near the spray module.
In the absence of the additional analysis recommended in
Section 3.4 and/or the testing required to develop a site-independent
algorithm for the spray modules, it can only be concluded that
for environmental conditions such as those observed at the Turkey
Point site the average increase in background salt concentration
due to a pair of spray modules is approximately three micrograms
per cubic meter. The relative importance vectors show that
these concentrations will be strongly affected by the wind and
the season during which the tests are performed. Clearly,
many other variables are still important and could only be
defined through a more complete analysis of additional test
data.
The results obtained for the estimate of the ambient
concentration are peculiar to the Turkey Point
location. However, the relative importance vectors allow
one to generalize these results in terms of the characteristics
of this site. For example, it was observed that very near
the ocean, the dominant effect is the variability of the
previous day's wind. This indicates that at locations near the ocean
the background salt concentration is determined primarily by those
environmental conditions which tend to increase the amount
of salt over the ocean. As one moves to larger distances
from the ocean, the current day's wind becomes more important
since a transport mechanism is required to carry the salt
to the measurement site. In general, the
season is quite important to the background salt concentration.
The humidity also appears to be quite important. However,
this may be the result of the particular measurement
instrumentation used. Some effects indicated by the relative importance
tables presented in the previous section may also be due to
the measurement procedures. This is particularly likely for
some of the variables associated with the relative importance
vectors for the precision run error. The importance of Monday
testing is almost certainly highly dependent on either the
instrumentation or the test procedure.
-------
SECTION V
REFERENCES
Schrecker, Gunter O., et al., "Drift Data Acquired on
Mechanical Salt Water Cooling Devices," Final Report,
EPA Contract 68-02-1365, prepared by Environmental Systems
Corporation, U.S. Environmental Protection Agency Report
No. EPA-650/2-75-060, July 1975.
-------
SECTION VI
APPENDIX - A
MATHEMATICAL DESCRIPTION OF ADAPT
SUMMARY
The ADAPT analysis techniques consist of a family of computer
programs which are capable of performing empirical analysis of
any type of data. The generality of these programs follows from
the character of an empirical analysis which may be considered
to be made up of two separate steps. The first step of an
empirical analysis is the learning or training step. In this
step, data for which the answer is known is processed to deter-
mine the algorithm, i.e. rule, for obtaining the answer from
the data. The second step is to apply the algorithm derived in
Step 1 either to proof test data to demonstrate performance or
to operational data to get the desired answer. The ADAPT programs
incorporate this entire procedure with a very general input format
and the capability to apply a large number of the classical
empirical analysis techniques to the derivation of the algorithm.
This alone makes the ADAPT programs an extremely useful tool, since
further programming is not required in order to develop algorithms
once a given data set has been properly formatted for input to
the ADAPT programs.
The unique empirical analysis capabilities of the ADAPT
programs arise from preceding the classical empirical analysis
techniques with an efficient representation of the data. This
representation enhances the subsequent empirical analysis. It
reduces the dimensionality of the problem so that the empirical
analysis may be applied to considerably larger data sets. This
approach has the additional benefits of requiring less learning
data, providing both empirical validity criteria and additional
insight into the nature of the data. The optimal representation
is obtained by transformation of the data to the ordered optimal
coordinate system as defined by the Karhunen-Loeve expansion
(see Reference 1). This optimal representation is also known as
principal component analysis or optimal empirical orthogonal functions,
and is closely related to factor analysis. The ADAPT programs
incorporate a unique approach to numerically deriving this
transformation. The ADAPT programs can derive this transformation
for an unlimited number of vectors having over 2,000 components
each. This capability represents an order of magnitude increase
over what can be accomplished with the classical approach to
finding the Karhunen-Loeve expansion.
-------
Representation in this optimal space usually requires
only 1/10th to 1/100th as many numbers as were required by the
original format of the data. With the data represented
efficiently, any of the classical empirical techniques including
regression, classification, pattern recognition and clustering
may be performed with considerably fewer computational resources
and also with a smaller amount of learning data. The detailed
description of the methods used to obtain the optimal representa-
tion and the use of this optimal representation to improve
empirical data analysis will be presented in four parts:
1) definition of the data histories, 2) description of the
optimal representation of the data histories, 3) description
of the use of the optimal representations for empirical data
analysis and 4) evaluation of algorithm performance.
-------
DEFINITION OF DATA HISTORIES
Empirical analysis in general and the ADAPT techniques
in particular address themselves to the analysis of information
which appears as data histories, where data histories are defined
as an indexed series of numbers which convey information.
Although the indexing variable is often time or some other
continuous function, it can be anything. The histories may
consist of numbers with different physical meanings. For example,
ADAPT analyses have been performed on data histories consisting
of pressure versus time adjoined to dimensional measurements
associated with the hardware which produced the pressure versus
time history. ADAPT analyses have also been performed on data
consisting of temperature as a function of spatial location
adjoined to quantities such as latitude, longitude and day of the
year.
The histories may be given in continuous (analog) form or
in discrete form; since the ADAPT programs operate in digital
computers, analog histories are each digitized into a set of N
numbers. Thus, a history is treated as an N-dimensional vector
in Euclidean space. If there are M histories the result is an
M by N matrix of numbers which represents a given ensemble of
information.
It may be desirable to perform some pre-processing on any
given data set to bring out features or characteristics of this
data before entering the ADAPT programs. Such pre-processing
can be performed using the ADAPT programs and includes such nonlinear
pre-processing as normalization, raising to a power, taking
logarithms or anti-logarithms, taking Fourier transforms, equalizing
the data, etc. The particular pre-processing which is required
for any given problem is normally suggested by previous data
processing experience and a priori knowledge of some significant
characteristics of the data. For example, data containing a
large number of irrelevant spikes should be pre-processed by
taking the log of the data. On the other hand, if the spikes
contain the significant information the anti-log of the data might
be more useful. If the data consists in part of quantities which
are totally unrelated and therefore measured in different units,
the relative magnitude of two such quantities, such as temperature
and length, is entirely dependent on which units are selected;
for example, if degrees and miles are chosen as the units, the
ratio of the magnitudes of the distance measurements to the
temperature measurements is considerably smaller than if degrees
and angstroms are selected as the units. To compensate for this
the ADAPT programs have the capability of introducing an equalization
which emphasizes the variation in the data rather than the
absolute magnitude. This equalization is accomplished by
adjusting the magnitude of each variable, V, by the following
law:
\[ V_{eq} = 1 + \frac{V - \mathrm{MIN}}{\mathrm{MAX} - \mathrm{MIN}} \tag{1} \]
where MAX and MIN are the largest and smallest values of V which
occur. When each variable is processed by this law, it has a
maximum value of two, a minimum value of one and all other values
fall between one and two. Normalizations based on producing all
data vectors with unity absolute magnitude and normalization
based on the first data history are also available in the ADAPT
programs.
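The equalization law above can be illustrated with a minimal sketch. The function name `equalize` and the sample values are illustrative assumptions, not part of the ADAPT programs, and the handling of a constant variable is a choice made here because the law is undefined when MAX equals MIN.

```python
def equalize(values):
    """Map a variable onto the range [1, 2] as 1 + (V - MIN) / (MAX - MIN)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable has no variation, so map it to 1.
        return [1.0 for _ in values]
    return [1.0 + (v - lo) / (hi - lo) for v in values]


# Hypothetical temperature (degrees) and distance (angstroms) measurements.
temperature = [10.0, 15.0, 20.0]
distance = [1.0e10, 3.0e10, 5.0e10]

print(equalize(temperature))  # [1.0, 1.5, 2.0]
print(equalize(distance))     # [1.0, 1.5, 2.0]
```

Both variables land on the same scale regardless of the units chosen, which is the purpose of the equalization described in the text.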
-------
OPTIMUM REPRESENTATION OF DATA HISTORIES
The choice of the N numbers which were used to represent
each data history is to some extent arbitrary, the chief
criterion being that the desired physical phenomena are properly
contained in the N numbers. From a theoretical viewpoint, one
could use a continuous data history; however, this would require
processing of an infinite number of numbers. Clearly, the
realities of numerical analysis on digital computers require
that the input be in vector (digitized) form, rather than functional
(analog) form. Thus, the first problem is the classical sampling
problem. In addition to this problem, it is possible through
proper choice of coordinate transformations to still further
reduce the number of numbers required to represent a given amount
of information. The approach taken in the ADAPT programs is
to solve these two problems sequentially. Thus, the first step
in the optimization is to optimize the sampling of the data matrix.
The second step in the optimization is to find the best coordinate
system for representing the data. These two steps will now be
discussed.
ADAPT SAMPLING PROCEDURE
The first step in finding the optimal base for the ADAPT
programs is to examine the entire ensemble of data to determine
how to best sample the data matrix. Here best is defined as that
sampling which contains more than a specified amount of new
information as defined by the sampling criteria. The results of
this procedure may be considered equivalent to an incomplete
classical Gram-Schmidt procedure. The degree of incompleteness
is a function of the sampling criteria. The resulting dimensionality
of the new orthogonal representation is also a function of
the sampling criteria. The trade-off between the two conflicting
requirements of maximizing the completeness of the base and
minimizing the dimensionality of the new representation is accom-
plished by varying the sampling criteria. The impact of the degree
of incompleteness of this first step on the final representation
will be discussed in Section 3.3. For the special case of selection
criteria of the order of unity this first step in the ADAPT
optimization reduces identically to that of the Gram-Schmidt procedure.
Since the first step of the optimization procedure is not
a classical technique, the best way for the reader to comprehend
the result of this step is to consider it as a modification of the
Gram-Schmidt procedure. The process retains the capability of the
Gram-Schmidt procedure to discretize the data regardless of the
form of the input or of its dimensionality. That is, the
Gram-Schmidt base vectors obtained from a set of data histories
represent these histories with a discrete number, NC, of
components even when these data histories are continuous functions.
Since the Gram-Schmidt procedure eliminates all linearly
dependent cases, the value of NC cannot exceed the number
of cases which are processed through the Gram-Schmidt procedure.
In general, the number will be smaller yet because of linear
dependence within the data set. With the ADAPT modification
of this procedure, the resulting number of components is normally
considerably less than even that which would result from the
application of the Gram-Schmidt procedure. Therefore, after
the first step of the representation the new representation
has already significantly reduced the dimensionality and is largely
independent of the particular way the data histories were digitized.
However, just as in the case of the Gram-Schmidt base there
is no reason to believe that a given representation obtained by
this procedure is the best one for representing the data. Base
vectors are to a great extent determined by the order in which
histories were arranged when processed through either the ADAPT
selection procedure or the Gram-Schmidt procedure. The next
step of the ADAPT optimization is to find the base which is the
best representation for the given data. This is accomplished
by the second step of the optimization.
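The report does not give the formula for the sampling criteria, so the following is only a sketch of one way an incomplete Gram-Schmidt selection could work. The function `select_base`, the parameter `completeness`, and the rule "admit a history when the fraction of its square magnitude not yet represented exceeds 1 - completeness" are all assumptions made for illustration. With `completeness` equal to one, the sketch reduces to the ordinary Gram-Schmidt procedure, up to a round-off tolerance.

```python
import numpy as np


def select_base(histories, completeness=1.0, tol=1e-12):
    """Thresholded Gram-Schmidt over the rows of an M x N array."""
    base = []  # orthonormal N-dimensional vectors admitted so far
    for x in np.asarray(histories, dtype=float):
        norm_sq = float(x @ x)
        if norm_sq == 0.0:
            continue
        residual = x.copy()
        for b in base:
            residual -= (residual @ b) * b  # remove the part already represented
        new_fraction = float(residual @ residual) / norm_sq
        # Admit the history only if it adds enough new information.
        if new_fraction > (1.0 - completeness) + tol:
            base.append(residual / np.linalg.norm(residual))
    return np.array(base)


histories = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],  # linearly dependent on the first history
    [1.0, 0.1, 0.0],  # mostly explained by the first history
    [0.0, 0.0, 1.0],
])

print(len(select_base(histories, completeness=1.0)))  # 3
print(len(select_base(histories, completeness=0.9)))  # 2
```

Lowering the hypothetical `completeness` value drops the third history, which adds only about one percent of new square magnitude, and so trades completeness of the base for lower dimensionality.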
SECOND STEP OPTIMIZATION
To find the best representation a new set of NC N-dimensional
orthonormal vectors, rotated from the data represented in the
first step orthogonal base is postulated. This set is to be
chosen in an ordered fashion, so that the first vector is the
best, and so on. Only a limited number, NR ≤ NC, of these vectors
will be used as new base vectors for representing the histories.
They are chosen as follows: Each history vector is represented
by its coefficients in the first step base, and is projected onto
the NR new vectors, giving M x NR components in the new base.
If there were as many new vectors as first step vectors, NR = NC,
and the first step selection criteria had produced a complete
representation, this would be an exact representation of the
history vectors. Since, in general, NR < NC, there is a mean
square error in representing the history vectors
in only NR new base vectors.
The new orthonormal set of vectors is chosen by minimizing
this mean square error, thus defining the meaning of a "best" set
of vectors. If only one vector is used, NR = 1, it is that
vector which makes the one-vector representation error the
smallest. If a second vector is used also, it is chosen so
that together with the first vector, it minimizes the two-
vector representation error. This is continued for as many
vectors, i.e., as large a value of NR < NC, as is necessary or
desirable.
When formulated mathematically, this criterion requires
the maximization of a quadratic form whose unknowns are the
first-step components of one of the "best" base vectors, and
whose coefficient matrix is the sum of the covariance matrices
of the first step components of the input histories. This
problem is a classical one in linear algebra, which often
appears under the names of the principal components analysis
of a matrix, Karhunen-Loeve or eigen function expansion and
optimum empirical orthogonal function. The solutions for the
unknown vector components are the normalized eigenvectors of
the covariance matrix sum, and the resulting values of the
quadratic form are the eigenvalues of this matrix. Once they
are obtained, they are simply arranged in order of decreasing
size of the eigenvalues. The largest eigenvalue gives the
most reduction in mean square error that can be achieved with
only one new base vector; and the corresponding eigenvector is
this new base vector. The next largest eigenvalue gives the
most reduction in the error that can be achieved by using a second
new base vector in addition to the first one found above, and
this second vector is the eigenvector of this second largest
eigenvalue. This process can be continued until the desired
accuracy is achieved. The sum of the NR largest eigenvalues
gives the maximum mean square error reduction which can be
achieved with NR new base vectors; when adding additional eigen-
values does not significantly increase this sum, the use of the
corresponding eigenvectors as additional base vectors does not
significantly improve the representation.
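The eigenvalue ordering described above can be checked numerically with a short NumPy sketch. It is a simplification under stated assumptions: it works directly on the N-dimensional histories instead of on first-step components, it uses the average outer product of the histories without removing a mean, and the function name `optimal_base` and the synthetic data are hypothetical.

```python
import numpy as np


def optimal_base(histories, nr):
    """Return the nr largest eigenvalues and their eigenvectors (as rows)."""
    x = np.asarray(histories, dtype=float)     # M histories, N numbers each
    moment = x.T @ x / x.shape[0]              # N x N average outer product
    eigvals, eigvecs = np.linalg.eigh(moment)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:nr]     # largest eigenvalue first
    return eigvals[order], eigvecs[:, order].T


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 8)) * np.array([5, 3, 1, 1, 1, 1, 1, 1])

eigvals, base = optimal_base(data, nr=3)
components = data @ base.T                     # M x NR optimal components
mse = np.mean(np.sum((data - components @ base) ** 2, axis=1))
total = np.mean(np.sum(data ** 2, axis=1))     # average square magnitude

# The reduction in mean square error equals the sum of the NR eigenvalues.
print(np.isclose(total - mse, eigvals.sum()))  # True
```

The final check is the statement in the text that the sum of the NR largest eigenvalues gives the mean square error reduction achieved with NR base vectors.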
The optimal set of base vectors defined by this procedure
is known in the statistical literature as the Karhunen-Loeve
coordinate system.* The ADAPT processing of a collection of
histories yields the components of the history vectors in this
optimal base vector system, as well as the components of these
base vectors themselves.
* For a complete description of the Karhunen-Loeve representation
see Reference 1-A.
For each history the NR components in the optimal system
are the optimal representation of the data in the sense
described above. Alternatively, the approach taken is
conceptually analogous and numerically identical to finding a set
of orthogonal functions to be used for a generalized Fourier
series expansion of the original data histories. This problem
is often encountered in classical boundary value problems of
mathematical physics. In the case of the classical boundary
value problem, the appropriate differential equation defines
a set of orthonormal functions. To satisfy a given function on
the boundary, this boundary function is expanded in the set of
orthonormal functions which are defined by the differential
equation. This set of orthonormal functions is optimal for
representing this boundary condition in that it requires fewer
terms in the series to achieve a given degree of representation.
This is equivalent to requiring fewer numbers to represent a given
amount of information, which is the exact criterion on which the
Karhunen-Loeve expansion is based. In the case of empirical
analysis, there is no differential equation to define the optimal
set of orthogonal functions. However, the Karhunen-Loeve or eigen
function expansion provides the numerical tool required to make
any set of empirical learning data define its own set of optimal
functions.
The optimal components derived in this manner are used in
all further empirical analysis tasks performed by the ADAPT
programs. Thus, the original M x N numbers representing M
histories have been reduced to M x NR components, plus N x NR
numbers to define the optimal vector base. Since the base system
is optimal, the number of terms, NR, necessary to give a useful
representation of a history is small, of the order of 10 to 30
and the reduction in the number of numbers is large, often of the
order of 50 to 100.
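As a worked illustration of this count, take hypothetical sizes that are not from the study: M = 5000 histories, N = 2000 numbers per history and NR = 20 retained terms. The reduction factor is then

\[
\frac{M \times N}{M \times NR + N \times NR}
= \frac{5000 \times 2000}{5000 \times 20 + 2000 \times 20}
= \frac{10{,}000{,}000}{140{,}000} \approx 71
\]

which falls within the range quoted above. The factor grows with the number of histories, since the N x NR numbers defining the base are stored only once.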
In the process described so far, the optimal vectors are
represented by their NC components in the first step orthogonal
base, but this means they are a linear combination of the NC
first step vectors, the coefficients being these NC components.
Since these vectors are N-dimensional vectors, the optimal vectors
can also be represented in the original N-dimensional space of the
data history vectors, by performing the linear combination.
-------
The ADAPT representation process just outlined can be
clarified with the simple example of two input histories, which
is carried through analytically and described in appendices of
References 5, 20, 21 and 32. For this special case the first
optimal function is proportional to the average of the two
history functions, the second to their difference, a result in
accord with simple intuition. The relative sizes of the two
eigenvalues are found to depend on the degree of correlation
of the two histories, which has implications discussed later.
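The two-history result can be checked numerically. The sketch below assumes two histories of equal magnitude with positive correlation, the case in which the average and difference result is exact. The histories `h1` and `h2` are synthetic, and working from the 2 x 2 matrix of inner products is a convenience chosen here, not a description of the ADAPT code.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 200)
h1 = np.sin(2 * np.pi * t) + 0.3 * np.cos(2 * np.pi * t)
h2 = np.sin(2 * np.pi * t) - 0.3 * np.cos(2 * np.pi * t)
h2 *= np.linalg.norm(h1) / np.linalg.norm(h2)    # enforce equal magnitudes

x = np.vstack([h1, h2])
gram = x @ x.T / 2                               # 2 x 2 matrix of inner products
eigvals, coeffs = np.linalg.eigh(gram)
eigvals, coeffs = eigvals[::-1], coeffs[:, ::-1]  # largest eigenvalue first
functions = coeffs.T @ x                         # optimal functions built from h1, h2


def cosine(a, b):
    """Absolute cosine of the angle between two vectors."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


print(round(cosine(functions[0], h1 + h2), 6))   # 1.0
print(round(cosine(functions[1], h1 - h2), 6))   # 1.0
```

For equal magnitudes a and inner product b, the two eigenvalues are (a + b)/2 and (a - b)/2, so the more strongly correlated the histories, the more the first eigenvalue dominates.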
EVALUATION OF DATA USING ADAPT REPRESENTATION
Although the major objective of the representation is to
reduce the dimensionality of the data for future processing it
also provides an opportunity to better understand the nature of
the data and to establish validity criteria for the application
of the empirical analysis which may be performed using this
learning data. The ADAPT programs provide the tools which are
required to understand the quality of the base which has been
derived. These tools are also useful in understanding the nature
of the data. This understanding of the data provides a basis
on which to select the dimensionality for the analysis. This
section will present the tools which are available in the ADAPT
programs for understanding the representation, analyzing the
data and establishing empirical validity criteria.
A convenient measure of the degree of representation
achieved with a given number of base vectors is the sum of the
eigen values of the vectors used, divided by the average square
magnitude of all of the original data history vectors. This
represents the reduction in mean square error achieved divided
by the total error reduction possible. In statistical terms
this is the percent of the variation of the data explained by
the representation used. If a set of data has zero variation,
it does not contain any information. Extending this concept, we
see that information must be at least monotonically related to
the variation in the data. Furthermore, the variation has the
form of an energy. Thus, the ratio of the sum of the eigen values
divided by the average square magnitude of the original data
histories is defined as the information energy. The ADAPT
programs plot the percent information energy versus the number
of dimensions used. These information energy curves are useful
in at least three ways: 1) they provide a basis for evaluating
the quality of representation which has been achieved, 2) they
provide a basis to determine the different types of information
which are available in the data set and 3) they provide a basis
for selecting a dimensionality to use for the data analysis.
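A minimal sketch of how the term-by-term and cumulative information energy could be computed is given below. It assumes the eigenvalues are taken from the average outer product of the histories, so that they sum to the average square magnitude. The function name `information_energy` and the synthetic data are hypothetical.

```python
import numpy as np


def information_energy(histories):
    """Return per-term and cumulative percent information energy."""
    x = np.asarray(histories, dtype=float)
    eigvals = np.linalg.eigvalsh(x.T @ x / x.shape[0])[::-1]  # largest first
    eigvals = np.clip(eigvals, 0.0, None)    # guard against round-off negatives
    total = np.mean(np.sum(x ** 2, axis=1))  # average square magnitude
    per_term = 100.0 * eigvals / total
    return per_term, np.cumsum(per_term)


rng = np.random.default_rng(1)
signal = np.outer(rng.normal(size=40), np.ones(6))  # one strongly correlated pattern
data = signal + 0.1 * rng.normal(size=(40, 6))      # plus uncorrelated noise

per_term, cumulative = information_energy(data)
print(per_term[0] > 90.0)                 # True
print(np.isclose(cumulative[-1], 100.0))  # True
```

In this synthetic case the first term carries nearly all of the variation and the remaining terms form the flat, noise-like tail that the text describes.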
-------
Fig 1-A presents the information energy curve obtained
from a data set consisting of 50 histories of approximately 200
measurements each. These data histories were taken from
Reference 34 and contain both dimensional measurements on diesel
capsule valves and measurements of the performance of these
capsule valves. There are two separate curves shown on this
figure which have their initial point in common. The lower curve
on this figure is the ratio of the eigen value associated with
the optimal function indicated on the abscissa to the sum of all
the eigen values or the sum of the square magnitudes of the
learning data vectors. The upper curve is the cumulative sum
of this ratio. This particular information energy curve is an
example of a complete base in terms of the first step of the
ADAPT optimization. That is for this case the first step of the
ADAPT optimization is identical to that which would have been
obtained using the classical Gram-Schmidt procedure. This is
demonstrated by the fact that the cumulative information energy
(given by the upper curve) reaches 100% of the available informa-
tion.
The first point on Fig 1-A which is common to both the
upper and lower curves shows that the first term in the optimal
representation explains approximately 12% of the variation in
the original data set. The third value on the lower curve shows
that the third dimension explains approximately 6% of the varia-
tion in the data set and the corresponding point on the upper
curve shows that the first three terms taken together explain
approximately 27% of the information in the data set. Both the
consideration of the characteristics of noise and the results of
two case closed form solutions indicate
that the lower or term by term information energy curve should
be flat for random noise. This can be seen by noting that there
should be no preferred optimal directions or functions for
representing noise and each function should explain an equal amount
of the variation when the variation is due to a random noise.
Thus, in a complete base those terms lying above the point at
which there is no further change in the slope of the lower curve
of the information energy plot, are terms which are dominated by
noise and should not be included in an analysis. For Fig 1-A
this point occurs at a dimensionality of approximately 25.
Examination of the upper curve corresponding to this dimensionality
shows that only about 80% of the information in this data set is
usable for analysis and the maximum useful dimensionality is 25.
Note the ADAPT approach to determining the maximum usable
dimensionality for the eigen value expansion is based on the quality
of the information which is available, rather than the classical
approach of simply assuming that when the eigen values have
fallen below some small but arbitrarily selected percentage of
the largest eigen value the analysis should be discontinued.
A second feature appearing in the lower curve on Fig 1-A
is the break in the slope or knee occurring at the third
dimension. This implies that the change of correlation between
the information contained in the second and third terms of the
representation is extremely great. This normally occurs when
the phenomena being represented by those terms changes signi-
ficantly. For this particular set of data, one can determine
the phenomena causing this knee by examination of the projection
of the entire set of learning data on the first two dimensions
of the optimal space. This projection illustrates another analysis
tool which the ADAPT programs provide. This tool is a plot of
the projection of the entire learning space on to a plane defined
by two of the optimal coordinates. Since the first two optimal
coordinates together represent those two coordinates which contain
the greatest amount of information possible for the learning
data set, the projection of the learning data on to these two
coordinate directions represents the best possible two dimensional
display of all of the information contained in the learning data.
Fig 2-A presents such a display for the data used to make the
learning base from which Fig 1-A was obtained. Examination of
the information energy plot shown in Fig 1-A indicates that
Fig 2-A represents approximately 20% of the information contained
in the entire learning set. Fig 2-A may also be interpreted as
a scatter plot of the coefficients of the first and second terms
in the generalized Fourier series representation of each of the
data histories using the optimal empirical orthogonal functions
which are identical to the eigen functions derived in the Karhunen-
Loeve procedure. Based on this interpretation, it follows that
one can form the two term reconstruction of any of the learning
data histories simply by multiplying the first optimal function by
the coefficient along the NP1 direction and adding this to the
product of the second optimal function times the coefficient along
the NP2 direction.
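The two-term reconstruction can be sketched as follows. The data are synthetic, built from two hypothetical patterns `f1` and `f2` plus a small amount of noise, and the optimal functions are obtained here with a singular value decomposition, a standard way to compute the same eigenvectors that is not necessarily how the ADAPT programs derive them.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(0.0, np.pi, 100)
f1, f2 = np.sin(grid), np.cos(grid)             # two underlying patterns
weights = np.column_stack([np.linspace(1.0, 4.0, 30),
                           np.linspace(-2.0, 2.0, 30)])
data = weights @ np.vstack([f1, f2]) + 0.01 * rng.normal(size=(30, 100))

# Rows of vt are the optimal functions, ordered by decreasing singular value.
_, _, vt = np.linalg.svd(data, full_matrices=False)
np1 = data @ vt[0]   # coefficient of each history along the first direction
np2 = data @ vt[1]   # coefficient of each history along the second direction

history = 0
two_term = np1[history] * vt[0] + np2[history] * vt[1]
error = np.linalg.norm(data[history] - two_term) / np.linalg.norm(data[history])
print(error < 0.05)  # True
```

A scatter plot of `np1` against `np2` would be the analogue of the display described for Fig 2-A.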
Examination of Fig 2-A shows that the variation in these
two dimensions is dominated by differences in the models of the
engines in which the capsule valves were used. Examination of
similar scatter plots for higher order dimensions shows that this
natural grouping according to model no longer occurs at the higher
dimensions. Thus, the knee occurring at the third optimal function
in the information energy curve for this data set is due to the
fact that the variation introduced by the different models of
engines is contained in the first two terms of the optimal
representation. This illustrates that the information energy
curve can be used to select candidate dimensionalities for
different types of problems. For example, if one were planning
to use this data to study the differences between the model of
engine used then two dimensions would probably be adequate for
the analysis. However, if the data were to be used for the
analysis of some other features associated with this data, the
first two dimensions probably would not contain all of the
pertinent information and a dimensionality between 2 and
25 would be required.
Fig 3-A presents an information energy curve for a base
which although complete in terms of all of the dominant informa-
tion does not fully represent all of the uncorrelated or noise-
like information. This base is as useful for an analysis as the
complete base which is illustrated by the information energy
curve presented in Fig 1-A. The information which has been
omitted by this base due to the first step in the ADAPT selection
criteria is only that information which is of a noise-like
character. This can be seen by noting that the slope in informa-
tion energy curve has become essentially zero after approximately
the 40 to 45th term in the series.
Fig 4-A presents the information energy curve for a poor
base. This curve shows an approximately constant rate of change
of curvature over the majority of the higher dimensional portion
of this curve. This suggests that one is still observing a change
in the degree of correlation of the information as one increases
the dimensionality. This implies that there is still useful
information being added by increasing the dimensionality. However,
the dimensionality can not be increased beyond that which was
admitted by the first step selection criteria. This suggests
that better results could be obtained if the base were re-derived
using a different selection criteria for the first step in the
ADAPT analysis. Even for a relatively poor base as illustrated by
this information energy curve, the leading terms will be essentially
identical to those that will be obtained with the complete first
step base. Thus, the dominant results which can be obtained from
the earlier terms in the series are valid. The results obtained
using relatively high dimensionality with a base such as this, will
generally be inferior to those obtained using a complete base.
The ADAPT empirical validity criterion is based on a measure
of representation which may be applied to individual data vectors.
This measure is the ratio of the square magnitude of the particular
data vector in the optimal base to its square magnitude in the original
data base. This ratio measures how much of the information content
is retained, and hence how much is lost, as one transforms the data
vector from the original space to the new space. This quantity is
provided for each of the learning cases as a part of the standard
ADAPT analysis. This test may also be applied to any test case to
which one intends to apply the empirical algorithms which have been
derived using this ADAPT representation. If this ratio is
significantly smaller for the test cases than it was for the
learning cases, there is a significant difference between the test
data and the learning data and one is not justified in applying
the empirical analysis to that test data. Thus, this ratio, when
applied to test histories, serves as a basis for an a priori test
of the validity of performing the empirical data analysis on
that test case.
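The validity ratio described above can be sketched in a few lines. This is a minimal illustration, not the ADAPT implementation: the function names, the toy base and the acceptance margin of 0.1 are assumptions introduced here, and the base vectors are assumed to be orthonormal.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def representation_ratio(data, base):
    """Square magnitude in the reduced base divided by that in the original space.

    `base` is a list of orthonormal vectors (the retained optimal directions).
    """
    coefficients = [dot(data, direction) for direction in base]
    return dot(coefficients, coefficients) / dot(data, data)


def passes_validity_check(test_case, base, learning_ratios, margin=0.1):
    """Hypothetical rule: accept a test case whose ratio is no more than
    `margin` below the smallest ratio found in the learning set."""
    return representation_ratio(test_case, base) >= min(learning_ratios) - margin


# Toy base (assumed): the first two coordinate axes of a 3-dimensional data space.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
learning = [[3.0, 4.0, 0.5], [1.0, 2.0, 0.2]]
learning_ratios = [representation_ratio(x, base) for x in learning]

print(passes_validity_check([2.0, 1.0, 0.1], base, learning_ratios))  # well represented
print(passes_validity_check([0.1, 0.2, 5.0], base, learning_ratios))  # poorly represented
```

How much smaller a ratio must be before it counts as "significantly smaller" is a judgment the text leaves to the analyst; the fixed margin above is only a stand-in for that judgment.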
62
-------
Fig 1-A - EXAMPLE OF AN INFORMATION ENERGY CURVE FOR A
COMPLETE BASE
[Plot not reproduced. Abscissa: NUMBER OF DIMENSIONS USED; ordinate: a cumulative quantity, label illegible.]
63
-------
Fig 2-A - EXAMPLE OF A SCATTER PLOT OF THE FIRST AND
SECOND OPTIMAL DIRECTIONS OF A BASE FOR
REPRESENTING DIESEL CAPSULE VALVE DATA
[Scatter plot not reproduced. Legible plot labels include "Model T236", "E209", "Capsule Valve" and "ELEMENT".]
64
-------
Fig 3-A - EXAMPLE OF INFORMATION ENERGY CURVE
FOR A GOOD INCOMPLETE BASE
NUMBER OF DIMENSIONS USED
65
-------
Fig 4-A - EXAMPLE OF AN INFORMATION ENERGY CURVE
FOR A POOR INCOMPLETE BASE
[Plot not reproduced. Abscissa: NUMBER OF DIMENSIONS USED; ordinate: a cumulative quantity, label illegible.]
66
-------
APPLICATIONS USING OPTIMAL REPRESENTATION
Having arrived at the optimal (in the Karhunen-Loeve sense)
representation, attention may now be turned to the use of the
components of each of the data histories in this optimal space to
accomplish the desired empirical analysis. Two of the more common
forms of empirical analysis to which the ADAPT programs have been
applied are classification or pattern recognition and parameter
estimation or regression. The optimal representation not only
greatly simplifies the classical approach to this empirical
analysis but also provides additional capabilities. The use of
ADAPT to accomplish these types of analysis will be discussed in
the following two sections. The optimal representation is also
extremely useful for other empirical analysis including such tasks
as clustering, modeling and extrapolation. The ADAPT representation
provides unique opportunities for empirical clutter subtraction
and compacting of data.
CLASSIFICATION ANALYSIS
The derivation of classification algorithms in the optimal
space is benefited by the fact that both the coordinates of the
individual cases being studied and the statistics associated with
these cases are expressed by fewer numbers. The covariance matrix
which defines the statistics of the ensemble of the learning data
is now a matrix having a dimensionality of NR by NR rather than
the original dimensionality of N by N. Thus, the amount of analysis
involving the covariance matrix is reduced by the square of the
reduction in the dimensionality. Furthermore, the orthogonal
properties of the space and the knowledge of the eigenfunctions
can greatly simplify the derivation of some of the classical
discriminants.
67
-------
The simplest derivation of classification schemes incor-
porated in the ADAPT programs is simply visual examination of
the scatter plots such as that shown in Fig 2-A. Had the
objective of that analysis been to identify the engine from which
the capsule valve was taken, the classification law could easily
have been specified by examination of this scatter plot. This
approach to the derivation of the classification laws is perhaps
the most effective so long as one can deal with two or possibly
three dimensions. However, for higher dimensionalities the
inability to visualize separation surfaces requires the use of
more formal approaches. The approaches which have been used
for ADAPT analyses and are part of ADAPT include Fisher
classification schemes, special classification schemes for populations
having equal means, maximum likelihood, Eckart and energy detectors.
Other linear and nonlinear schemes may also be incorporated as
required and will benefit by the simplifications noted above. Since
space does not allow detailed description of all of the classification
schemes which have been applied with the ADAPT programs, the
remainder of this section will be confined to the description of
the Fisher discriminant as it is incorporated in the ADAPT programs.
Since many of the features discussed are common to any classification
scheme, it will serve to illustrate the tools which can be
made available with the ADAPT approach. The Fisher discriminant
has been selected because experience shows that for many problems
the Fisher discriminant is the most effective. An extensive
comparison of the Fisher discriminant with other linear and
nonlinear discriminants has been performed and is presented in
Reference 30.
The Fisher classification scheme is a linear classifier which
seeks a line on which to project all of the data. This line is
selected to satisfy the criterion that, when all of the learning
data histories are projected on this line, the ratio of the distance
between the means of the two classes to the sum of the
dispersions of each of the classes is maximized. In the original
derivation of this classification scheme by Fisher, the dispersions
of the classes were weighted according to the number of
members of each class. However, there are conditions under which
it is advisable to use different weightings for each of these
classes. It is this more general scheme which is included in the
ADAPT programs.
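A weighted Fisher direction of this kind can be sketched as follows. This is a sketch under stated assumptions, not the ADAPT code: the class dispersions are taken to be the class covariance matrices S1 and S2, combined as P*S1 + (1 - P)*S2 with weighting parameter P, and the direction is taken as the solution of (P*S1 + (1 - P)*S2) d = m1 - m2. All function names are hypothetical.

```python
def mean(rows):
    """Component-wise mean of a list of equal-length cases."""
    count = len(rows)
    return [sum(row[j] for row in rows) / count for j in range(len(rows[0]))]


def covariance(rows):
    """Sample covariance matrix of a list of cases."""
    center = mean(rows)
    size = len(center)
    return [[sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows) / (len(rows) - 1)
             for j in range(size)] for i in range(size)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def fisher_direction(class1, class2, p=0.5):
    """Direction d solving (p*S1 + (1-p)*S2) d = m1 - m2 (assumed weighting form)."""
    m1, m2 = mean(class1), mean(class2)
    s1, s2 = covariance(class1), covariance(class2)
    size = len(m1)
    weighted = [[p * s1[i][j] + (1.0 - p) * s2[i][j] for j in range(size)]
                for i in range(size)]
    return solve(weighted, [a - b for a, b in zip(m1, m2)])


# Toy learning cases, already expressed in a 2-dimensional optimal space.
class1 = [[2.0, 0.0], [3.0, 1.0], [4.0, 0.0], [3.0, -1.0]]
class2 = [[-2.0, 0.0], [-3.0, 1.0], [-4.0, 0.0], [-3.0, -1.0]]
direction = fisher_direction(class1, class2, p=0.5)
projections1 = [sum(w * x for w, x in zip(direction, case)) for case in class1]
projections2 = [sum(w * x for w, x in zip(direction, case)) for case in class2]
print(direction, projections1, projections2)
```

Because the derivation is carried out in the reduced space, the matrix that must be solved is only NR by NR, which is the saving noted under CLASSIFICATION ANALYSIS.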
SETTING OF THRESHOLD
The approach to setting the threshold to be used to classify
the projection value obtained from applying the Fisher discriminant
is based on the analysis presented by Anderson and Bahadur in
Reference 3. Strictly speaking, this analysis requires that all
possible projection vectors produce Gaussian projections. In
general, this is only true if the input data is itself Gaussian.
For the great majority of projection directions, in particular
those directions which are normally determined by the application
of the Fisher discriminant, the Central Limit Theorem will result
in a Gaussian projection. Thus, although the theory is not
rigorously applicable, it is usually applicable to a large
percentage of the possible projection directions when the data
space is sufficiently large to invoke the Central Limit Theorem.
Thus, one suspects it may still be a valid guide as to the selection
of the Fisher weighting parameter and the threshold to be used
with the Fisher discriminant. Experience with a great variety of
data has shown that this is indeed the case.
Reference 3 shows that if one desires to minimize the total
number of errors made by the Fisher classification algorithm,
one should select the Fisher weighting parameter, P, according
to the following relation:
$P \sigma_1 = (1 - P) \sigma_2$ (2)
where $\sigma_1$ and $\sigma_2$ are the standard deviations of the projection
values of the first and second classes, respectively. Assuming
that the origin has been selected midway between the means of
the projection values of each class, the threshold, TH, is given by:
$TH = (\tfrac{1}{2} - P) V$ (3)
where V is the maximized Fisher quantity discussed under
EVALUATION OF CLASSIFICATION PERFORMANCE.
Another criterion which one may wish to use, rather than
minimizing the number of errors, establishes an algorithm which
will achieve a desired false alarm rate. This special case is
also discussed in Reference 3. Suppose one desires a probability
$P_N$ that there will be no false alarms in Class 1 when N Class 1
cases are examined (i.e. no Class 1 cases will be classified as
belonging to Class 2). The following relation will define the
false alarm probability for Class 1, $P_{FA}$:
$P_N = (1 - P_{FA})^N$ (4)
Solving this equation for the probability of false alarm for Class 1
under the assumption that $P_N$ is equal to 0.5 gives:
$P_{FA} = 1 - \exp(\ln P_N / N) \approx 0.693 / N$ (5)
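As a numerical check of Equation 5, the short sketch below evaluates the exact expression and the 0.693/N approximation for N = 100. It assumes, as the definition of the no-false-alarm probability implies, that the N cases are independent, so that this probability equals (1 - PFA) raised to the power N; the function name is hypothetical.

```python
import math


def per_case_false_alarm(p_no_alarm, n_cases):
    """Equation 5: per-case false alarm probability PFA for a desired
    probability `p_no_alarm` of no false alarms in `n_cases` cases."""
    return 1.0 - math.exp(math.log(p_no_alarm) / n_cases)


n_cases = 100
p_fa = per_case_false_alarm(0.5, n_cases)
print(p_fa, math.log(2.0) / n_cases)  # exact value and the 0.693/N approximation
print((1.0 - p_fa) ** n_cases)        # recovers the no-false-alarm probability
```

The approximation comes from the first-order expansion of the exponential, so it improves as N grows.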
69
-------
Once the desired false alarm rate has been defined, Reference 3
shows that the proper Fisher weighting parameter to achieve
this false alarm rate is given by:
[equation illegible] (6)
where $\beta_1$ is the variable in the cumulative standard normal
distribution function of the probability $1 - P_{FA}$. The corresponding
threshold is given by:
$TH = \mu_1 - \beta_1 \sigma_1$ (7)
where $\mu_1$ is the mean of Class 1 and $\sigma_1$ is the standard deviation
of Class 1.
The above equations, although strictly valid only for the
case of Gaussian data, may be expected to give a good approximation
(even in the case where the data is not Gaussian) when the
data space is relatively large. Experience with the utilization
of these equations in a large number of real problems has verified
that they do provide good guidance for the selection of both
the Fisher weighting parameter and the best threshold to achieve
either the goal of minimum errors or a predefined false alarm rate.
Analysis of Classification Law
The procedure for deriving the Fisher discriminant in the ADAPT
programs consists of the following steps: 1) the use of the
learning data to derive the optimal representation, 2) the projection
of all of the learning data into the optimal space, 3) the use
of the learning cases represented in the optimal space to derive
the Fisher classification direction and 4) the transformation of
the Fisher classification direction back to the original data
space. The first step has already been discussed in detail in
Section 3. The projection of any learning case into the optimal
space is accomplished by taking a dot product of the learning data
vector with each of the base vectors of the optimal space.
The derivation of the Fisher discriminant using this
representation yields a direction or a line in the optimal space.
The components of this line may be considered as a spectrum
indicating the importance of each of the optimal directions to the
classification which will be performed. The squares of these
components have been plotted in Fig 5-A for an algorithm derived
for the separation of diesel capsule valves which could be
expected to have high fuel flow rates from those which could be
expected to have low fuel flow rates. This figure shows that
the most important dimensions for performing this separation
were the 10th, 12th, 13th, 14th and 20th. This provides information
regarding the effect of reducing dimensionality on the
availability of the pertinent information for the decision.
It may be used in conjunction with the information energy curve
discussed in Section 3 to reach decisions as to the dimensionality
which should be used for the analysis. However, since
this line is invariant under coordinate transformation, its
coefficients in the original data space may also be obtained
by transforming the line back to the original data space. The
transformation between the original and the optimal data space
is defined by the ADAPT optimal representation. Since this
transformation is defined by an orthogonal matrix, the inverse
of this transformation, which is required to go from the optimal
space to the original data space, is the transpose of the matrix
of the optimal functions. Thus, one may simply transform the
Fisher classification line from the optimal space to the original
space and examine the importance of each of the original
measurements to the decision.
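The transformation back to the original data space can be illustrated with a small sketch. The two-vector orthonormal base, the classification line and the data history below are invented for illustration, and the function names are hypothetical. The point shown is that the detection statistic is the same whether it is computed in the optimal space or with the transformed line in the original space.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def to_optimal(data, base):
    """Project a data history onto each base vector of the optimal space."""
    return [dot(data, direction) for direction in base]


def to_original(line_optimal, base):
    """Apply the transpose of the base: weight each base vector by the
    corresponding component of the line and sum."""
    size = len(base[0])
    return [sum(w * direction[i] for w, direction in zip(line_optimal, base))
            for i in range(size)]


# Assumed toy values: two orthonormal optimal functions in a 3-dimensional data space.
base = [[0.6, 0.8, 0.0], [-0.8, 0.6, 0.0]]
fisher_optimal = [2.0, -1.0]   # classification line in the optimal space
history = [1.0, 2.0, 3.0]      # one data history in the original space

importance = to_original(fisher_optimal, base)
statistic_optimal = dot(fisher_optimal, to_optimal(history, base))
statistic_original = dot(importance, history)
print(importance, statistic_optimal, statistic_original)
```

In this toy case the third component of the transformed line is zero, so the third measurement cannot influence the decision.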
A plot of the line in Fig 5-A when transformed back to
the original data space is presented in Fig 6-A. For this
problem each indexing variable may be associated with a specific
measurement or tolerance in the capsule valve. For this study
these measurements were grouped as indicated by the labels: fuel
pressure, spring (SP), seat (ST), etc. This plot has been given
the name of the relative importance vector since it defines the
importance of each of the independent variables to the decision
which is made using the algorithm. It also provides the capability
to apply the algorithm to test cases in the original data space
without first transforming the test cases to the optimal space.
Since the algorithm is just the dot product of the relative
importance vector with the data history, the absolute magnitude
of the value plotted on the relative importance vector is a measure
of the significance of a given variable to the decision. For example,
if a given variable shown in Fig 6-A has a value of zero relative
importance, the value of this variable in any data history will
be multiplied by zero when it is added into the detection statistic.
Thus, this variable can have no influence on the decision. On the
other hand, if the component of the relative importance vector
corresponding to a given variable has a very large negative or
positive value, even a relatively small change in the corresponding
variable in the data vector may have a significant effect on the
detection statistic and therefore on the decision which is reached.
Thus, even a casual examination of Fig 6-A shows that the most
important factors controlling the fuel flow from the capsule valve
are variables controlling the fuel pressure and dimensions of the
valve seat.
Although the relative importance spectra and vectors
illustrated in this section have been derived for the Fisher
discriminant it is clear that they may be derived for any
linear classification scheme. These outputs are standard
outputs for all the linear classification schemes which are
incorporated in the ADAPT programs and are used to
understand the roles of both the optimal coordinates and each of
the original measurements. This often provides a basis for
physically understanding how the algorithm works.
REGRESSION ANALYSIS
If one wishes to associate a data history with a number
rather than a class, one classical approach is a multiple
regression analysis. The ADAPT programs include both least
square and canonical regression schemes which may be used to
derive parameter estimation algorithms in the optimal space.
Both of these schemes require the inversion of matrices whose
number of elements is the square of the dimensionality of the space
in which the data is represented. Thus, once again the transformation
of the data from the original space to the optimal
space has resulted in a reduction in computation of the order
of the square of the ratio of the dimensionality of the original
space to that of the optimal space. Since this ratio is often
one to two orders of magnitude, we have again reduced
the complexity of the derivation by several orders of magnitude.
In many cases, this represents the difference between a feasible
and an infeasible task.
The availability of the canonical regression scheme allows
one to simultaneously fit any given data history to a number of
dependent variables. However, in either case the algorithm
derived is the dot product of the regression line with the data
vector. Thus, as in the case of the linear classification laws,
this line may be transformed from the optimal space to the
original space and the algorithm may be applied in the original
space without the necessity of transforming the data histories to
the optimal space.
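A least square regression line in the optimal space, and its transformation to the original space, can be sketched as below. The sketch rests on an assumption that is not stated in the text: that the optimal-space coefficients of the learning cases are mutually orthogonal over the learning set, in which case the least square solution decouples into one division per dimension and no matrix inversion is needed. The data, base and function names are hypothetical.

```python
def decoupled_least_squares(coefficients, targets):
    """Least square regression line when the coefficient columns are orthogonal
    over the learning cases (assumption): one independent fit per dimension."""
    n_dims = len(coefficients[0])
    line = []
    for k in range(n_dims):
        column = [row[k] for row in coefficients]
        line.append(sum(c * y for c, y in zip(column, targets))
                    / sum(c * c for c in column))
    return line


def to_original(line_optimal, base):
    """Transform a line from the optimal space back to the original data space."""
    size = len(base[0])
    return [sum(w * direction[i] for w, direction in zip(line_optimal, base))
            for i in range(size)]


# Toy learning set: optimal-space coefficients of four cases and the parameter
# to be estimated for each case (generated here as 2*c1 + 3*c2).
coefficients = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
targets = [5.0, -1.0, 1.0, -5.0]
base = [[0.6, 0.8, 0.0], [-0.8, 0.6, 0.0]]  # assumed orthonormal optimal functions

regression_optimal = decoupled_least_squares(coefficients, targets)
regression_original = to_original(regression_optimal, base)
print(regression_optimal, regression_original)
```

When the coefficient columns are not orthogonal, the general normal equations must be solved instead; they remain small because their size is set by the reduced dimensionality.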
One is also able to form the relative importance spectrum,
i.e. the components of the regression line in the optimal space,
and a relative importance vector, i.e. the components of the
regression line in the original data space. Fig 7-A presents an
example of such a relative importance vector for a regression
analysis for predicting the central pressure of a cyclone using
the longitude of the storm, the latitude of the storm, the day of
the year and 79 satellite observed temperature measurements.
Although this display is very useful when dealing with a linear
data history such as the diesel capsule valve problem, it is
difficult to interpret a pictorial data history, such as the effect
of the temperature distribution on the estimate of the central
pressure, from this format. One can invert the transformation which
was used to transform the picture to a linear data history and
transform the relative importance vector back to the pictorial
display. When this is accomplished one obtains a relative
importance vector such as that shown in Fig 8-A, which shows
the importance of each of the grid points on the radiation map
to the calculation of the central pressure. Reference 8 shows
that an experienced meteorologist can use this picture to
understand the mechanisms which ADAPT has selected to predict
the central pressure of the cyclone.
OTHER EMPIRICAL ANALYSIS-CLUSTERING ANALYSIS
In addition to the classification and regression analysis
schemes which are incorporated in the ADAPT program, the optimal
representation offers opportunities for other empirical analyses.
One such opportunity is for clustering analysis, which is often
included under the general scope of pattern recognition. In
the present discussion, we separate classification from clustering
analysis in that classification analysis as used here refers to
the derivation of a law to separate two a priori classes. On
the other hand, a clustering analysis examines a set of data to
determine the natural classes or groupings of the data
which occur. After such clustering has been identified, one may
derive a classification law to separate these clusters and examine
the relative importance vector to determine the reasons for the
clustering. Thus, clustering analysis is often a useful tool for
evaluating the general nature of a set of data. It can also
be useful in sub-dividing the data into sets for analysis.
One of the most useful tools for clustering analysis is the
ADAPT scatter plot. In fact, when very strong clustering occurs
in the scatter plot of the first few optimal dimensions and when
these first few optimal dimensions contain a large portion of
the information energy, it is usually desirable to make separate
bases to analyze each of the clusters which are formed. One
then has a two step analysis. The first step of the analysis uses
the first few dimensions of the universal base to establish
what cluster a given test data history belongs to. The second
step analyzes that history using the base derived for its cluster.
If the data histories vary over time this clustering is equivalent to
finding the time epoch which is most appropriate for the
particular data history. This epoch analysis may be considered
as one step beyond classical trendline analysis. Rather than
simply using the trend to update failure criteria, time clusters
found by examination of the data allow one to account for
discontinuities and non-linearities in the time variation when
updating failure criteria.
In addition to the use of scatter plots to find clusters,
the ADAPT programs also incorporate a nearest neighbor clustering
scheme. This scheme is based on an algorithm which identifies
those cases which are closest to one another in a high dimensional
space. The performance of this algorithm can best be
visualized by considering a nearest neighbor plot such as that
presented in Fig 9-A. This is a plot of the 50 capsule valves
used in the study presented in Reference 30. The capsule valve
number appears on both the abscissa and the ordinate of the plot.
The ordinate is the capsule valve which is closest in the optimal
space to the capsule valve listed on the abscissa. Thus, capsule
valve No. 11 is closest to capsule valve No. 1, capsule valve
No. 17 is closest to capsule valve No. 2, capsule valve No. 8
is the closest capsule valve to No. 3, capsule valve No. 43 is
the closest capsule valve to No. 4, etc. When this plot has
been constructed, it is read beginning with the ordinate instead
of the abscissa. As an illustration consider capsule valve 6
as the starting capsule valve. If we read starting at an ordinate
value of 6 we find that capsule valve No. 6 is the closest
capsule valve to capsule valves No. 8, 13, 18 and 49. This is
shown in the second tree of Fig 10-A, where Capsule Valve
6 has attached to it Valves 8, 13, 18 and 49. If we then examine
the ordinate corresponding to Capsule Valve 8 we see that
Capsule Valve 8 is the closest capsule valve to Capsule Valves 3,
6, 7 and 27, forming the second branch of the tree. Examination
of the ordinate for values of 13, 18 and 49 shows that these
capsule valves are not closest to any other capsule valves.
Similarly, examination of the second branch in the tree shows
that only Capsule Valve 3 is closest to any other capsule valve:
it is closest to Capsule Valve 44, which in turn is not the
closest to any other capsule valve. Thus, we conclude that the
members of this tree, Capsule Valves 6, 8, 13, 18, 49, 3, 7, 27
and 44, form a natural cluster. It is clear that this process
is not as effective in identifying clusters as the human eye;
however, this process is applicable to any number of dimensions,
whereas the human eye has serious limitations if one attempts
to find clusters in more than three dimensions.
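The tree-reading procedure described above amounts to linking every case to its nearest neighbor and collecting the cases that end up linked together. The sketch below is one way to do this, not the ADAPT routine: it uses squared Euclidean distance in the optimal space, groups the links with a simple union-find structure, and runs on six invented two-dimensional points rather than the capsule valve data.

```python
def nearest_neighbors(points):
    """For each case, the index of the other case closest to it."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min((j for j in range(len(points)) if j != i),
                key=lambda j: dist2(points[i], points[j]))
            for i in range(len(points))]


def nearest_neighbor_clusters(points):
    """Group cases connected through nearest neighbor links (one group per tree)."""
    nearest = nearest_neighbors(points)
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Invented coordinates of six cases in a 2-dimensional optimal space.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5),
          (10.0, 10.0), (11.0, 10.0), (10.0, 12.0)]
nearest = nearest_neighbors(points)
clusters = nearest_neighbor_clusters(points)
print(nearest)
print(clusters)
```

The same code applies unchanged to points of any dimensionality, which is the advantage the text claims for this scheme over visual inspection.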
74
-------
MODELING AND COMPACTING
The ADAPT representation provides the most efficient way
to represent any given amount of the information in the data.
Thus, if one is trying to decide on what features to use in
constructing a model based on observed data, the optimal
representation contained in the ADAPT programs provides this
answer. Similarly, the ADAPT representation provides a natural
mechanism for compacting the amount of data which must be
stored or transformed when there exists an adequate empirical base
to derive this optimal representation.
EXTRAPOLATION
The ADAPT representation provides a basis for extrapolating
data histories. If one has a learning set consisting of a
relatively large number of complete data histories, these
histories may be used to construct a data base. If one now
receives additional data histories which are incomplete, ADAPT
programs exist to make a least square estimate of the coefficients
which best fit that portion of the incomplete data history which
is available. When these coefficients have been estimated, they
may then be combined with the optimal functions derived from the
complete data histories from the learning data to reconstruct
the entire data history. Thus, the entire backlog of learning
data is incorporated in the optimal functions. The available
portion of the data history to be extrapolated is used to
find the least square fit for the coefficients. This
procedure has been applied both to the continuation of velocity
altitude histories and to the extrapolation of the sunspot cycles,
which is reported in Reference 31.
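The extrapolation procedure can be sketched as a least square fit over the available samples followed by reconstruction with the full optimal functions. The sketch below is illustrative only: the two optimal functions are invented, the available portion is assumed to be the leading samples of the history, and the function names are hypothetical. The fit solves the normal equations formed from the truncated optimal functions, which are in general no longer orthogonal.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def fit_coefficients(partial, base):
    """Least square coefficients fitting the available leading samples."""
    truncated = [direction[:len(partial)] for direction in base]
    gram = [[dot(a, b) for b in truncated] for a in truncated]
    rhs = [dot(a, partial) for a in truncated]
    return solve(gram, rhs)


def reconstruct(coefficients, base):
    """Rebuild the entire data history from coefficients and optimal functions."""
    return [sum(c * direction[i] for c, direction in zip(coefficients, base))
            for i in range(len(base[0]))]


# Assumed optimal functions derived from complete learning histories of length 4.
base = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
partial = [3.0, 1.0, 3.0]  # incomplete history: the last sample is missing

coefficients = fit_coefficients(partial, base)
extrapolated = reconstruct(coefficients, base)
print(coefficients, extrapolated)
```

The fit is only determined when the number of available samples is at least the number of optimal functions used.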
CLUTTER SUBTRACTION
Clutter subtraction makes use of a modification of the first
step in the formation of the optimal representation to eliminate
certain directions from consideration in the optimization. This
can be accomplished if the directions to be eliminated can be
characterized. Classical situations for which this occurs include
the effect of ground clutter on radar signatures and the effect
of self-noise on sonar signatures. In these cases, one can obtain
considerable amounts of data from the no target environment and
utilize this data to determine the characteristics or the regions
of the space in which the features which produce the clutter
occur. If these regions of the space are then eliminated in the
first step of the ADAPT optimization procedure they will not be
available for consideration when the Karhunen-Loeve expansion
is derived and the optimal representation will not include
these regions of the space. This procedure is useful if one
wishes to visually display reconstructed data histories without
the clutter or if one wishes to use data from a number of
sensors each of which has different clutter. In the latter
case, clutter subtraction may be used to subtract clutter from
each of the sensors prior to the comparison of the results.
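One way to picture the elimination of clutter directions is as a projection that removes, from every data history, its components along the directions that characterize the clutter. The sketch below shows only that projection. It is an assumption about how the first step could be realized, not a description of the ADAPT procedure, and it requires the clutter directions to be orthonormal. The names and values are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def remove_clutter(data, clutter_directions):
    """Subtract from `data` its components along orthonormal clutter directions."""
    cleaned = list(data)
    for direction in clutter_directions:
        amount = dot(data, direction)
        cleaned = [c - amount * d for c, d in zip(cleaned, direction)]
    return cleaned


# Assumed clutter direction, e.g. estimated from data taken with no target present.
clutter = [[0.6, 0.8, 0.0]]
history = [3.0, 4.0, 1.0]
cleaned = remove_clutter(history, clutter)
print(cleaned)
```

Histories cleaned in this way have no component along the clutter directions, so an optimal representation derived from them cannot include those regions of the space.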
76
-------
FIGURE 5A- EXAMPLE OF A RELATIVE IMPORTANCE SPECTRUM FOR
A FISHER CLASSIFICATION LAW
[Plot not reproduced. Title: RELATIVE IMPORTANCE SPECTRUM. Ordinate: RELATIVE IMPORTANCE OF COEFFICIENT; abscissa: COEFFICIENT NUMBER.]
77
-------
FIGURE 6A- EXAMPLE OF A RELATIVE IMPORTANCE VECTOR
FOR A FISHER CLASSIFICATION LAW
[Plot not reproduced. Abscissa: VARIABLE. Plot note: Use Index B of Tables 4.2 and 4.3 to Identify Abscissa.]
78
-------
FIGURE 7A- EXAMPLE OF A RELATIVE IMPORTANCE VECTOR
FOR A REGRESSION LAW
[Plot not reproduced. Title: RELATIVE IMPORTANCE VECTOR FOR FIFTEEN-TERM PARAMETER PREDICTION USING 30 CYCLONE DATA BASE FOR PREDICTING CENTRAL PRESSURE PC. Abscissa: INDEXING VARIABLE.]
79
-------
FIGURE 8A- EXAMPLE OF RELATIVE IMPORTANCE VECTOR
TRANSFORMED TO TWO DIMENSIONAL FORMAT
[Map not reproduced. The grid spans longitudes 156.5°E to 168.5°W and latitudes 16.6°N to 41.6°N; legend classes: >160, 120-160, 80-120, 0-80.]
The relative importance of each grid point for the prediction of $P_c$. The values are the same as those
shown in Fig. Y A grid point number is shown beneath each dot which identifies the grid point position.
80
-------
FIGURE 9A- EXAMPLE OF A NEAREST NEIGHBOR PLOT USED TO CONSTRUCT
A NEAREST NEIGHBOR TREE
[Plot not reproduced. Abscissa: CASE; ordinate: nearest case, label illegible.]
-------
FIGURE 10A- EXAMPLE OF A NEAREST NEIGHBOR TREE
[Tree diagram not reproduced.]
-------
PERFORMANCE EVALUATION
The classical method of evaluating the performance of an
empirical algorithm is to apply this algorithm to independent
test cases and derive the performance statistics from the
results of these independent tests. Although this is the only
acceptable approach for demonstrating this performance to the
scientific community it has several significant disadvantages
as an analysis tool. The most important disadvantage is the
cost and time required to perform the independent tests at
each stage of the analysis. Thus, experience with empirical
analysis has shown a need for developing a capability to
estimate the performance that a given algorithm will achieve
without the need for performing independent tests.
The ADAPT programs include procedures which allow the
estimate of this performance based on the performance of the
algorithm on the learning data. These procedures have been
developed for both classification analysis and regression
analysis. The procedures are based on the concept that when
the ratio of number of learning cases to number of dimensions
is sufficient the performance of the algorithm on the independent
test cases will approach the performance on the learning data.
Experience has shown that acceptable ratios of the number of
cases to the number of dimensions used are dependent upon the
performance of the algorithm. Thus, an experimental performance
map has been developed which provides the analyst with a basis
for estimating the performance of an empirical algorithm on
independent test data from the performance of that algorithm on
the learning data. This tool is utilized in the process of
developing algorithms to reduce both the time and cost required
to select the best approach to deriving the empirical algorithm.
Final demonstration of the best algorithm which has been achieved
is still accomplished by independent test, if test data can be
obtained.
EVALUATION OF CLASSIFICATION PERFORMANCE
The simplest approach to visualize the performance of a
classification algorithm is to examine the values obtained by
application of the algorithm. This may be accomplished by
presenting these values as a bar chart of the detection statistic
versus each of the cases examined. This presentation is included
in the ADAPT programs and is very useful to visualize the detailed
characteristics of the performance. However, it is often
desirable to be able to compare the performance of a large number
of algorithms. One such situation is the study of the trade-off
between detection probability and false alarm rate. The
classical approach for accomplishing this study is to present
the data in the form of receiver operating characteristic (ROC)
curves. These curves are simply plots of the detection probability
versus the false alarm rate and can be obtained by evaluating the
algorithm performance as one varies the threshold.
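An empirical curve of this kind can be traced by sweeping the threshold over the detection statistics of the learning cases. The sketch below assumes a sign convention that is not fixed by the text at this point: a case is assigned to Class 2 when its detection statistic falls below the threshold, so a Class 1 case below the threshold is a false alarm. The statistics and function names are invented for illustration.

```python
def rates(class1_stats, class2_stats, threshold):
    """Return (false alarm rate, detection rate) for one threshold.

    Assumed convention: a case is assigned to Class 2 when its detection
    statistic falls below the threshold.
    """
    false_alarms = sum(1 for s in class1_stats if s < threshold)
    detections = sum(1 for s in class2_stats if s < threshold)
    return false_alarms / len(class1_stats), detections / len(class2_stats)


def roc_curve(class1_stats, class2_stats):
    """Evaluate the rates at every observed statistic and at one higher threshold."""
    thresholds = sorted(set(class1_stats) | set(class2_stats))
    thresholds.append(thresholds[-1] + 1.0)  # a threshold above every statistic
    return [rates(class1_stats, class2_stats, t) for t in thresholds]


# Invented detection statistics for three Class 1 and three Class 2 cases.
class1_stats = [1.0, 2.0, 3.0]
class2_stats = [-1.0, 0.0, 1.5]
curve = roc_curve(class1_stats, class2_stats)
for false_alarm_rate, detection_rate in curve:
    print(false_alarm_rate, detection_rate)
```

Each pair printed is one point on the curve; both rates can only increase as the threshold is raised.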
It is also difficult to evaluate the effect of dimension-
ality on the performance of algorithms by studying large
numbers of bar charts. Thus, it is desirable to introduce a
single measure of algorithm performance which can be used to
study the effect of dimensionality on the performance of the
algorithm. One convenient measure is the probability of error.
This measure has the advantages of a simple intuitive meaning
and has a unique relationship to the receiver operating curve.
Thus, it is desirable to express the performance for the
classification law in terms of the probability of error. This
can be accomplished for the Fisher discriminant by examination
of a quantity V. Since the Fisher discriminant is the result
of a maximization of V, which can be defined by:
[equation illegible] (8)
it is clear that the maximum value of V is itself a good measure
of the performance of the algorithm. The maximum value of V,
over all possible projections, turns out to occur when the
denominator of Equation 8 is equal to the square root of the
numerator, which means V becomes, geometrically, the distance
between the means of the projections of the two classes on the
Fisher direction. Thus, for the Fisher discriminant Equation 8
provides a relationship between the projection of the means of
the two classes, the standard deviation of each class, and the
Fisher weighting parameter.
It is interesting to consider the special case in which the
standard deviation of each of the classes is equal. For this case
$V = \mu_1 - \mu_2$ (9)
and
$\Sigma\sigma / V = (\sigma_1 + \sigma_2) / V = 2\sigma / V$ (10)
84
-------
This parameter is used as a measure of the goodness of
performance of the discriminant. Regardless of the relationship
between the standard deviations of the two classes, the smaller
$\Sigma\sigma / V$ (the larger V) the better the performance of the
algorithm. In the case where the standard deviations of both
classes are equal, V may be related directly to the probability
of error, PE.
For the case where the threshold is selected to minimize
the number of errors, the situation is shown in Fig 11-A.
The threshold is set halfway between the mean projections of
the two classes, because the criterion requires that the errors
for the two classes are the same. Then the probability of error
is the shaded area, which is the value of the cumulative normal
distribution centered on $\mu_1$, up to $\mu_1 - V/2$. If G is the standard
cumulative normal distribution, this is
$P_E = G\left(-\frac{V}{2\sigma}\right) = G\left(-\frac{V}{\Sigma\sigma}\right)$ (11)
PE is the probability of making an error in either class, and
$P_D = 1 - P_E$ (12)
is the probability of correctly identifying a member of either
class.
For the case where one wishes the maximum detection of
Class 2 for a specified error probability $P_{FA}$ of Class 1, the
threshold is set by Equation 7. Again, for equal standard
deviations, the situation is quite simple. If we take the origin
halfway between the mean projections of the two classes then
$\mu_1 = V/2$ and Equation 7 becomes
$TH = V/2 - \beta_1 \sigma$ (13)
or
$\beta_1 = (V/2 - TH) / \sigma$ (14)
This is the standard normal deviate at which
$1 - P_{FA} = G(\beta_1)$ (15)
The detection probability $P_D$ of Class 2 is the area under the
normal curve centered on $\mu_2 = -V/2$ up to TH. The normal deviate
for this curve at that point is
$\beta_2 = (TH + V/2) / \sigma$ (16)
and
$P_D = G(\beta_2)$ (17)
But TH can be eliminated from (14) and (16) to give
$\beta_2 = \frac{V}{\sigma} - \beta_1 = \frac{2}{\Sigma\sigma / V} - \beta_1$ (18)
Thus, for the case of equal standard deviations, the
detection probability of Class 2 depends only on the false
alarm probability of Class 1 and on the Fisher maximum through
$\Sigma\sigma / V$. Fig 12-A presents these ROC curves for various values
of the parameter $\Sigma\sigma / V$.
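For equal standard deviations, the theoretical curves can be evaluated directly. The sketch below assumes the reading of the equal-deviation relations in which the normal deviate for Class 1 satisfies G(beta_1) = 1 - PFA, the deviate for Class 2 is beta_2 = V/sigma - beta_1, and the detection probability is G(beta_2), with the ratio of the summed standard deviations to V equal to 2*sigma/V. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # G, the standard cumulative normal distribution


def detection_probability(p_false_alarm, sum_sigma_over_v):
    """Detection probability of Class 2 for a given Class 1 false alarm
    probability, assuming equal class standard deviations."""
    beta_1 = STANDARD_NORMAL.inv_cdf(1.0 - p_false_alarm)
    beta_2 = 2.0 / sum_sigma_over_v - beta_1  # V/sigma = 2 / (sum of sigmas / V)
    return STANDARD_NORMAL.cdf(beta_2)


def minimum_error_probability(sum_sigma_over_v):
    """Probability of error with the threshold midway between the class means."""
    return STANDARD_NORMAL.cdf(-1.0 / sum_sigma_over_v)


for ratio in (0.5, 1.0, 2.0):
    points = [(p_fa, detection_probability(p_fa, ratio)) for p_fa in (0.01, 0.1, 0.5)]
    print(ratio, minimum_error_probability(ratio), points)
```

As a consistency check, setting the false alarm probability equal to the minimum error probability returns a detection probability of one minus that error, which is the midway-threshold case.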
In addition to obtaining an understanding of the trade-off
between detection probability and false alarm rate, it is
important to have a measure of algorithm performance to evaluate
the effect of dimensionality of the space in which the algorithm
is derived. This is extremely important since the use of too
large a dimensionality in the derivation of an algorithm will
result in the algorithm being derived by fitting the learning
data according to special characteristics of the particular
learning sample, and not according to characteristics of the
population sample. That is, the major basis for the separation
will be the difference between the population and sample means,
rather than the difference between the means of the two popula-
tions being classified. When this occurs the classification
may be called "overdetermined"1. This phenomenon is quite
analogous to the fitting of a third order polynomial through a
set of data. If a third order polynomial is fit to 3 data points,
there is no reason to believe that a general law has been derived.
However, if this same third order polynomial makes a reasonably
good fit to 100 points, there is little doubt that these 100
points are related by some phenomenon which is well expressed by
a third order polynomial.
Thus, it is important to understand the capabilities of a
Fisher discriminant to derive classification algorithms simply
on the difference between sample and population means. Originally
the ADAPT analysis team evaluated this by performing separations
of odd cases versus even cases from both classes for each problem
being considered. The performance of these separations was then
compared with the performance of the classification algorithm
derived between the desired classes. If the algorithm derived
for separating the odd versus even gave a similar performance to
the desired algorithm then one concluded that the algorithm was
not based on physical characteristics but rather on the differences
between the sample and the population means and was considered
to be overdetermined. This experience can be summarized in a
plot such as that presented in Fig 13-A. This figure plots the
number of cases divided by the number of dimensions versus the
performance measure. The cross hatched curve is an experimental
curve separating valid from overdetermined separations. It is
based on separations of odd from even (i.e. random separations)
for a large variety of problems and data. The extrapolation of
this curve for low values of the performance measure was
accomplished by making a similar plot on a linear scale and
noting that for a number of cases over number of dimensions of
unity the performance measure should go to 0. It is interesting
to compare the cross hatched curve of Fig 13-A with the results
of a similar analysis presented in Reference 4 which indicated
that when the number of cases to number of dimensions exceeded
six, one could have confidence in the performance of the algorithm.
Fig 13-A shows why this is the case. Remembering that for
$\sigma_1 = \sigma_2$ we may relate $2\sigma/V$ to the probability of error, we
note that for a performance measure of 2 the probability of error
is approximately one in three. Since a random process for selecting
a class has a probability of error of one in two, it is clear
that an algorithm whose performance measure is two or greater is
not interesting. Thus, this curve shows that any algorithm of
interest derived in a space such that the number of cases divided
by the number of dimensions is greater than six lies to the left
of the experimental curve.
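The odd-versus-even check described above can be imitated on purely random data. The following sketch is a simplified illustration only: it replaces the Fisher discriminant with the direction joining the two sample means (equivalent to assuming an identity covariance matrix), uses synthetic Gaussian data rather than any ADAPT data, and all names in it are hypothetical. Because the odd and even cases are drawn from the same population, any apparent separation on the learning data arises only from differences between sample means.

```python
import random


def odd_even_training_error(n_cases: int, n_dims: int, seed: int = 0) -> float:
    """Learning-data error rate when separating odd from even cases of random data.

    Simplification (an assumption of this sketch): the discriminant direction is
    the difference of the two sample means, not the full Fisher direction.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_cases)]
    odd_cases = data[0::2]   # cases 1, 3, 5, ... (zero-based indices 0, 2, 4, ...)
    even_cases = data[1::2]  # cases 2, 4, 6, ...

    def mean(rows):
        return [sum(column) / len(rows) for column in zip(*rows)]

    mean_odd, mean_even = mean(odd_cases), mean(even_cases)
    direction = [a - b for a, b in zip(mean_odd, mean_even)]
    midpoint = [(a + b) / 2.0 for a, b in zip(mean_odd, mean_even)]
    # Threshold halfway between the two projected class means.
    threshold = sum(w * m for w, m in zip(direction, midpoint))

    errors = 0
    for index, case in enumerate(data):
        projection = sum(w * x for w, x in zip(direction, case))
        predicted_odd = projection > threshold
        actually_odd = index % 2 == 0
        errors += predicted_odd != actually_odd
    return errors / n_cases


# Same number of cases, two different ratios of cases to dimensions.
for dims in (2, 200):
    rate = odd_even_training_error(n_cases=200, n_dims=dims)
    print(f"cases/dimensions = {200 / dims:6.1f}  learning-data error rate = {rate:.3f}")
```

In this sketch the learning-data error rate for the meaningless odd-versus-even separation falls as the ratio of cases to dimensions approaches unity, which is the behavior the performance map is intended to flag.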
When an algorithm is derived using the Fisher discriminant
it may be placed at some point on Fig 13-A by noting the number
of cases used in the learning data, the number of dimensions of
the space in which the algorithm is derived, and the performance
measure for that algorithm. All of these parameters are available
in the ADAPT output for the derivation of the Fisher discriminant.
If the algorithm falls to the right of the cross hatched region
in this figure, one knows that it is overdetermined and is not a
valid algorithm. If it falls near but to the left of the cross
hatched area, one realizes that the performance of this algorithm
on the learning data is significantly better than one can expect
on the test data. Only if the algorithm falls to the left of
and reasonably far away from this cross hatched area does one
have an algorithm whose learning data performance is indicative
of the performance which can be expected on a test case.
It is useful to visualize the path of a typical algorithm on
this performance map in conjunction with the other ADAPT analysis
tools. If one were to examine the projection of all of the
learning data on to the first optimal coordinate direction, one
could determine for any particular classification law a
probability of error or $2\sigma/V$ for the algorithm consisting of
the projection on the first eigenvector. This would be
the performance of any linear classifier derived using only the first
optimal direction. If the desired separation is based on
information which dominates the variation of the data set one
might expect that this classification procedure would yield a
useful result. If not, the classification procedure may be of
no value at all. In either case, it can be located on a
performance map. If we assume that the first eigen direction
has no bearing on the classification which is desired, then this
point would fall to the right of the cross hatched curve on
Figure 13A. As one now increases the number of dimensions used
and repeats the application of a Fisher discriminant at each
dimensionality, the curve will continue some path to the right
of the cross hatched region until a sufficient number of dimensions
is used that some of the data pertinent to the desired classifi-
cation is incorporated into the analysis. At this point the
classification will no longer have the character of a purely
random classification and the curve can be expected to move to
the left of the cross hatched curve. Analysis of the relative
importance spectrum would show that the first significant dimension
was also reached at this point. It is also likely that the
information energy plot will show a knee at this point in the
curve indicating that a different type of data having a more
noise-like characteristic is now being considered.
As one continues to increase the dimensionality, the
performance of the algorithm may be expected to increase for
two reasons. The first is that additional hopefully pertinent
information is being added. At any dimensionality where pertinent
information is added one can expect that the track of the algorithm
on the performance map will become more horizontal and that this
dimension will correspond to a large value in the relative
importance spectrum. A second reason for improvement in the
performance is that as the number of dimensions gets sufficiently
large the algorithm will approach an overdetermined situation.
The dimensionality at which this occurs is a function of the number
of learning cases used and the performance of the algorithm.
As the ratio of number of cases to number of dimensions approaches
unity the path of the algorithm will again intersect the cross
hatched line on Figure 13A.
The preceding discussion suggests that the ADAPT scatter
plot and relative importance spectrum will provide a good estimate
of the shape of the track of an algorithm for higher values
of the ratio of number of cases to the number of dimensions
than were used for the derivation of the algorithm. By examination
of this track, one can estimate whether the performance
achieved on the learning data will also be achieved on the
test data and whether the dimensionality used contains any
information which is pertinent to the desired classification.
If one must use a relatively small ratio of number of cases
to number of dimensions to achieve a valid classification on
the performance map, more learning data is needed to derive a
useful algorithm. When an algorithm is to the left of the
cross hatched curve the data does contain information which is
pertinent to the desired classification. However, if this can
only be accomplished at a small ratio of the number of cases
to number of dimensions, additional learning cases would allow
one to increase this ratio and still use a sufficiently high
dimensionality to obtain the information which is pertinent to
the desired classification. For specific examples illustrating
tracks of algorithms on the performance map, the reader is
referred to the discussions of performance maps presented in
References 30, 32 through 34 and 36 through 38.
THE PERFORMANCE EVALUATION OF REGRESSION ANALYSIS
The performance evaluation of regression analysis is similar
to that of the Fisher classification analysis discussed in the
preceding section. The pictorial presentation of performance of
a single algorithm corresponding to the bar chart is now a plot
of the actual parameter versus the estimated value of the parameter.
These plots can be prepared showing either the learning or the
test data or both the learning and test data on the same plot.
Perfect agreement on this plot results in the data falling on the
line having a 45 degree slope and passing through the origin.
This plot has the same disadvantages for evaluating large numbers
of algorithms and for studying the effect of dimensionality on
performance that the bar charts have for the Fisher discriminant.
It is useful to examine the performance of a regression law
on the learning cases and estimate what the performance would be
on the test cases. The ADAPT programs include performance maps
for accomplishing this for regression analysis which are analogous
to the performance maps presented for the Fisher classification
algorithms. For the case of the regression analysis, the
performance maps are a plot of the ratio of the number of cases
to the number of dimensions versus the performance of the algorithm.
The major difference is that the performance of the algorithm is
now measured by the quantity $\sigma_{rat}$. This quantity is defined as
the ratio of the standard deviation of the estimate about the
actual value divided by the standard deviation about the mean
value and is given by:
    $\sigma_{rat} = \sqrt{\dfrac{\sum_i (V_i - Z_i)^2}{\sum_i (V_i - \bar{V})^2}}$
where:
V = actual value
Z = estimated value
$\bar{V}$ = mean of actual values
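A minimal sketch of this regression performance measure follows. It assumes the measure is the square root of the ratio of the summed squared estimation errors to the summed squared deviations of the actual values about their mean, which follows from the verbal definition above when both standard deviations are taken over the same cases. The function name and the sample data are hypothetical.

```python
import math


def sigma_rat(actual, estimated):
    """Ratio of the scatter of the estimates about the actual values
    to the scatter of the actual values about their own mean."""
    if len(actual) != len(estimated):
        raise ValueError("actual and estimated must have the same length")
    if len(actual) < 2:
        raise ValueError("at least two cases are needed")
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((v - z) ** 2 for v, z in zip(actual, estimated))
    squared_spread = sum((v - mean_actual) ** 2 for v in actual)
    if squared_spread == 0:
        raise ValueError("the actual values have no variation about their mean")
    return math.sqrt(squared_error / squared_spread)


# Hypothetical actual values and two hypothetical sets of estimates.
actual_values = [1.0, 2.0, 3.0]
print(sigma_rat(actual_values, [1.0, 2.0, 3.0]))  # perfect estimates
print(sigma_rat(actual_values, [2.0, 2.0, 2.0]))  # always estimating the mean
```

A value of zero corresponds to perfect agreement (all points on the 45 degree line), and a value of one corresponds to an algorithm that does no better than always estimating the mean of the actual values.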
As in the case of the classification performance map, experimental
analyses have been performed on a large variety of data which
provide experimental curves as to the confidence that the
algorithm derived is not overdetermined as a function of its
location on a performance map. A performance map including these
curves is presented in Fig 14-A. This regression performance
map can be used in a manner exactly analogous to that explained
for the classification performance map. The track of an algorithm
on this performance map follows the same logic as was developed
for the track of an algorithm on the classification performance
map.
-------
FIGURE 11A
PERFORMANCE MEASURE FOR CLASSIFICATION
[Figure: HAZARDOUS and NON-HAZARDOUS distribution curves with a detection level. Labels: measure of quality of separation; detection probability, PD (area under hazardous curve to left of detection level); false alarm rate (area under non-hazardous curve to left of detection level); sum of areas under hazardous and non-hazardous curves to left of detection level.]
-------
FIGURE 12A
CLASSIFICATION PERFORMANCE TRADE-OFF CURVES FOR
EQUAL-GAUSSIAN STANDARD DEVIATIONS
[Figure: detection probability plotted against FALSE ALARM RATE.]
-------
FIGURE 13A
PERFORMANCE MAP
FOR FISHER CLASSIFICATION ALGORITHMS
[Figure: performance map with the annotation FIRST PERTINENT INFORMATION and a scale labeled PROBABILITY OF ERROR.]
-------
FIGURE 14A
REGRESSION PERFORMANCE MAP
[Figure: performance map with curves labeled CONFIDENCE IN PHYSICAL BASIS; performance axis $\sigma_{rat}$.]
-------
ADAPT REFERENCES
1A Watanabe, S., "Karhunen-Loeve Expansion and Factor Analysis
Theoretical Remarks and Predictions", Transactions of the 4th
Prague Conference on Information Theory, Statistical Decision
Functions, and Random Processes, 1965, pages 635 thru 660.
2A Andrews, Harry C., "Introduction to Mathematical Techniques in
Pattern Recognition," John Wiley & Sons, Inc. 1972.
3A Anderson, T.W. and Bahadur, R.R., "Classification into Two
Multi-Variate Normal Distributions with Different Co-Variance
Matrices", Annals of Mathematical Statistics, Vol. 33, p. 420,
1962.
4A Foley, Don H., "Probability of Error in the Design Set As A
Function of A Sample Size Dimensionality", Thesis, Syracuse
University, 1971.
5A Hunter, H. E., N. Kemp, "Application of Avco Data Analysis and
Prediction Techniques (ADAPT) to Prediction of Cyclone Central
Pressure and Its Derivatives Using NIMBUS HRIR Data",
AVSD-0362-70-RR, August 1970.
6A Hunter, H. E., N. Kemp, "Application of Avco Data Analysis and
Prediction (ADAPT) to Prediction of Cyclone Central Pressure and
Its Derivatives Using NIMBUS HRIR Data", AVSD-0142-71-RR,
March 1971.
7A Hunter, H. E., N. Kemp, "Application of Avco Data Analysis and
Prediction Techniques (ADAPT) to Prediction of Cyclone Present
Motion and 12 Hour Motion, and Re-centering Effects, Using
NIMBUS HRIR Data", AVSD-0334-71-RR, July 1971.
8A Shenk, William E., Herbert E. Hunter, Frederick V. Menkello,
Robert Holub, Vincent V. Salomonson, "The Estimation of
Extratropical Cyclone Parameters from Satellite Radiation Measurements,"
Journal of Applied Meteorology, April 1973.
9A Kemp, N. H., H. E. Hunter, R. A. Amato, "Application of Avco
Data Analysis and Prediction Techniques (ADAPT) to Multi-Spectral
Extra-Tropical Cyclone Accuracy Investigation", AVSD-0128-72-CR,
March 1972.
10A Hunter, H. E., N. H. Kemp, "ADAPT Hurricane Data Selection
and Performance Study", AVSD-0400-72-RR, Nov. 1972.
11A Hunter, H. E., N. H. Kemp, "ADAPT Hurricane Forecast
Improvement Demonstration", AVSD-0020-73-CR, January 1973.
12A Hunter, H. E., "Use of Satellite Data and the ADAPT programs
to Improve Hurricane Forecasts", AVSD-0138-73-RR, April 1973.
13A Avco Corp., ADTECH (Advanced Decoy Technology) Program Final
Report, Vol. 3, Appendices, Avco TR No. RAD TR-65-4, Contract
AF04(694)-593, DDC #AD363081, April 30, 1965, (Secret) Pages
157-
14A Avco Corporation, ADTECH II Final Report, May 1966, BSD-TR-66-192
Sponsored by Advanced Research Projects Agency (ARPA), DOD
ARPA Order #441 Amendment #4, DDC #AD374278, May 1966, (Secret)
Pages 64-89.
15A Avco Corp., ADTECH III Final Report, Feb. 1968, AVMSD-0835-67-RR,
Contract AF04(694)-9560 (Secret)
16A Avco Corp., ADTECH IV Final Report, June 1969, AVMSD-0465-68 RR,
Sponsored by ARPA, DOD ARPA Order #441 Amendment #12 (Secret).
17A Choiniere, Dill, Hines; Chaff Masking Effectiveness Paper 61 in
AMRAC Proceedings, Volume XVIII, Part I, (AD-390700), Published
by University of Michigan, April 1968. (Secret)
18A Avco Corp., Test and Evaluation Study Report - Vol. II, Data
Bank Study, Prepared for Institute for Defense Analysis,
Contract FO 4701-68-C-0012, AVMSD-0300-68-RR, 22 April 1968.
(Secret)
19A Hunter, H. E., "Discussion of Patterns Recognition Techniques
Applied to Diagnosis", presented at the Society of Automotive
Engineers, Mid-year meeting, May 18-20, 1970.
20A Hunter, H. E., R. Amato, J. Conway, N. Kemp, "Demonstration of
Applicability of Avco Data Analysis Technique to Sonar Signature,"
AVSD-0605-70-RR, December 1970.
21A Hunter, H. E., J. Conway, "Demonstration of Feasibility of Using
the Avco Data Analysis and Prediction Techniques (ADAPT) to
Develop Algorithms for Automating the Identification of Solar
Burst", AVSD-0255-71-RR, 21 May 1971.
22A Avco Corp., Tethered Radar Reflectors (TRR) Report, Sept. 1971.
SAMSO TR-71-181, Vol. II, Contract F04701-68-C-0289. (Secret)
23A Hunter, H. E., N. Kemp, "Demonstration of Feasibility of Avco
Data Analysis and Prediction Techniques (ADAPT) for Sonar
Detection", AVSD-0411-71-RR, September 1971.
24A Hunter, H. E., L. Meixsell and J. Conway, "Feasibility
Demonstration for Optically Implementing an Avco Data Analysis
and Prediction Techniques (ADAPT) Algorithm for Recognizing
Spirals", AVSD-0026-72-RR, January 1972.
25A Hunter, H. E., J. Conway, "Use of the Avco Data Analysis and
Prediction Techniques (ADAPT) to Develop Analytical Techniques
for a Comprehensive Attack on Auto Theft and Burglary in
Lawrence, Mass.", AVSD-0042-72-RR, 31 January 1972.
26A Hunter, H. E., L. M. Meixsell, "Preliminary ADAPT Analysis
Feasibility of Discriminating Crash Sensor Signature,"
AVSD-0398-71-RR, 30 Aug. 1971.
27A Hunter, H. E., "Application of ADAPT to Analysis of Bi-Static
RCS Crash Signature," AVSD-0180-72-RR, May 1972.
28A Jones, T. O., D. M. Grimes, R. A. Dork, "A Critical Review of
Radar as a Predictive Crash Sensors," presented at the Second
International Conference on Passive Restraints, Detroit,
Michigan, May 22-25, 1972, SAE Report 720424, Pages 22-24, 38.
29A Hunter, H. E., L. M. Meixsell, R. A. Amato, "Final Letter Report
ADAPT Solar Burst Compacting Study," AVSD-0209-72-CR, May 1972.
30A Kemp, N. H., H. E. Hunter, R. A. Amato, "Application of Avco
Data Analysis and Prediction Techniques (ADAPT) to a Gauss-in-
Gauss Detection Study," AVSD-0260-72-RR, July 1972.
31A Hunter, H. E., R. A. Amato, "Application of Avco Data Analysis
and Prediction Techniques (ADAPT) to Prediction of Sunspot
Activity," AVSD-0287-72-CR, August 1972.
32A Hunter, H. E., "Demonstration of the Use of ADAPT to Derive
Predictive Maintenance Algorithms for the KSC Central Heat Plant,"
Nov. 1972, Contract No. NAS 10-7926, AVSD-0084-73-RR.
33A Kemp, N. H., H. E. Hunter, "Application of Avco Data Analysis
and Prediction Techniques (ADAPT) to Analysis of TUMS Sonar Data,"
AVSD-0433-72-CR, December 1972.
34A Hunter, H. E., "Application of ADAPT to Determination of Effect
of Diesel Capsule Valve Design Criteria on Fuel Flow Performance,"
AVSD-0102-73-RR, March 1973.
35A Kemp, N. H., H. E. Hunter, R. A. Amato, "Avco Data Analysis
and Prediction Techniques (ADAPT) Tri-Class Passive Sonar
Classification Study", January 1973, (Confidential)
36A Hunter, H. E., "Application of ADAPT to Selecting Optimal
Features for Study and Modeling of Rain Cell Radar Signatures",
AVSD-0122-73-RR, April 1973.
37A Hunter, H. E., "Final Report-ADAPT Cyclone Forecast Correction
Study", ADAPT 73-1, September 1973.
38A Hunter, H. E., "Task-1 Final Report - Application Of ADAPT
to Quick Look Classification of Composite Radar Signatures",
ADAPT 73-3, November 1973.
39A Hunter, H. E., "Final Report-Application of ADAPT to Integrated
Trend Analysis for Checkout of Space Vehicles", ADAPT 74-2,
April 1974.
40A Hunter, H. E., "Final Report-Application of ADAPT to Quick
Look Classification of Composite Radar Signatures", ADAPT 74-4,
BSD TR-74-345, November 1974.
41A Hunter, H. E., "Summary Letter Report - ADAPT Studies to Define
Diagnostic Potential of Preliminary Brake Analysis Data",
ADAPT 75-1, May 1975.
-------
APPENDIX B
ANALYSIS OF OPTIMAL BASES
Two ADAPT optimal bases were prepared for the analysis
of the Turkey Point salt concentration data. These optimal
bases were prepared using the procedures described in
Appendix A. Originally only one optimal base was to be
prepared; however, analysis of this base showed that there
were significant keypunching errors in its preparation.
It was necessary to prepare a second base using the corrected
data. This appendix presents the most significant character-
istics of each of these bases. These characteristics are
presented as scatter plots, plots of the ADAPT optimal functions
and ADAPT information energy plots. For the interpretation
and meaning of each of these plots, the reader is referred to
the descriptions provided in Appendix A.
Fig 1-B presents the scatter plot of every third case
from the first two optimal coordinates for the original base.
This scatter plot led to the discovery of the keypunching
errors in the data. This scatter plot is the projection
of each of the data histories on the first two ADAPT optimal
directions. Both the data histories and optimal directions
are made up of Variables 3 through 79 listed under the 82pt
column of Table 1 in the main body of the report. This projection is
obtained by taking the dot product of each of these data
histories with the ADAPT optimal function corresponding to
the coordinate onto which the data history is to be projected.
Thus, the abscissa of Fig 1-B is obtained by taking the dot
product of the data history with the first ADAPT optimal
function presented in Figure 2B. The ordinate of Figure 1-B
is obtained by taking the dot product of the data history
with the second ADAPT optimal function which is presented
in Figure 3B. Thus, the ADAPT optimal functions may be
considered as relative importance vectors for defining the
location of a point on the scatter plot coordinate corresponding
to that optimal function. Examination of Figure 2B
shows that the first optimal function is primarily a time
measure such that projections on the first optimal function
should have negative values for tests performed early in the
program and positive values for tests performed later in the
program. Examination of the second optimal function presented
in Figure 3B shows that the second optimal function is primarily
a measure of whether the cooling device was operating or not.
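The projection just described amounts to one dot product per scatter plot coordinate. The following sketch is a minimal illustration with hypothetical four-variable data histories and hypothetical optimal functions; the actual ADAPT histories contain Variables 3 through 79 and the actual optimal functions are those plotted in the figures. Subtracting the average data history before projecting is included here as an assumption of the sketch.

```python
def project(history, mean_history, optimal_functions):
    """Project one data history onto each optimal function by a dot product.

    The mean data history is subtracted first (an assumption of this sketch).
    Returns one coordinate per optimal function.
    """
    if len(history) != len(mean_history):
        raise ValueError("history and mean_history must have the same length")
    centered = [x - m for x, m in zip(history, mean_history)]
    coordinates = []
    for function in optimal_functions:
        if len(function) != len(centered):
            raise ValueError("each optimal function must match the history length")
        coordinates.append(sum(c * f for c, f in zip(centered, function)))
    return coordinates


# Hypothetical data: four variables per history, two optimal functions.
mean_history = [1, 1, 0, 0]
optimal_functions = [
    [1, 0, 0, 0],  # stands in for the first optimal function (abscissa)
    [0, 1, 0, 0],  # stands in for the second optimal function (ordinate)
]
abscissa, ordinate = project([3, 1, 0, 0], mean_history, optimal_functions)
print(abscissa, ordinate)
```

Each element of an optimal function weights one variable of the data history, which is why the optimal functions can be read as relative importance vectors for the corresponding scatter plot coordinate.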
-------
FIGURE 1B - PROJECTION OF EVERY THIRD CASE ON FIRST TWO OPTIMAL
COORDINATES OF FIRST ANALYSIS BASE
[Figure: scatter plot with regions labeled TOWER, NO COOLING DEVICE and SPRAY MODULE; abscissa NP1 ELEMENT.]
-------
FIGURE 2B
FIRST ADAPT OPTIMAL FUNCTION FOR FIRST ANALYSIS BASE
[Figure: optimal function plotted against the indexing variable; variable groups labeled include TEMPORAL and PREV WIND.]
-------
FIGURE 3B
SECOND ADAPT OPTIMAL FUNCTION FOR FIRST ANALYSIS BASE
[Figure: optimal function plotted against INDEXING VARIABLE; variable groups labeled WIND, HUMIDITY, TEMPORAL and PREV WIND.]
-------
High values of the projections on the second optimal function
correspond to cases where the cooling tower was operating,
values near zero correspond to the time period where no
cooling device was operating and large negative values
correspond to the time values where the spray module was
operating.
The numbers 1 through 5 used to designate the data
histories projected on the scatter plot shown in Figure 1-B
are a chronological ordering of the data histories. The
Number 1 designates those data histories which were obtained
during the time period when neither cooling device was
operating. The Numeral 2 designates those cases obtained
during the remainder of 1973. The 3's and 4's are from early
1974 and the 5's are from the later period of 1974 when data
were obtained with a cooling device operating. Examination
of Fig 1-B shows the anomalous result that a group of
5's is located on the left hand side with an NP1 projection
of minus 1.4 to minus 1.5. This is inconsistent with the
definition of the first ADAPT optimal function presented in
Fig 2-B since the fives should occur on the right hand side
of this figure. There are also several anomalous groupings
of numeral 4's on this scatter plot. Investigation of each
of these groupings shows that they were a result of keypunching
errors in the data preparation. Since these keypunching errors
had a significant effect on the variation of the data set used
to derive the optimal base, it was necessary to recreate this
base using the corrected data.
The average of all of the data histories used to develop
the second analysis base using the corrected data is shown in
Fig 4-B. Prior to processing these data histories to develop
this ADAPT optimal base, this average is subtracted from all
the data histories so that all of the succeeding analysis is
performed on zero mean data. Fig 5-B presents a plot of the
effect of dimensionality on the information available for
analysis using the second analysis base. Fig 5-B is actually
the plot of two curves on a single grid. The lower curve
presents the amount of information available in each of the
terms of the optimal base. The upper curve is the cumulative
sum of the lower curve. Thus, this curve indicates that the
scatter plot containing the first two dimensions in this
optimal base contains approximately a third of the information
in the entire set of data being analyzed. Similarly, if one
performs an analysis using 16 dimensions the sixteenth dimension
FIGURE 4B - AVERAGE OF ALL DATA HISTORIES USED TO
DEVELOP SECOND ANALYSIS BASE
[Figure: average data history plotted against INDEXING VARIABLE; variable groups labeled WIND, HUMIDITY, TEMPORAL and PREV WIND.]
-------
FIGURE 5B - EFFECT OF DIMENSIONALITY ON THE AVAILABLE
INFORMATION FOR SECOND ANALYSIS BASE
[Figure: information per term and cumulative information plotted against NUMBER OF DIMENSIONS USED.]
-------
only contains approximately 1% of the information
contained in the original data set. The first sixteen
dimensions taken together contain approximately 95% of the
information available in the entire data set. Analysis of
the optimal functions associated with this base suggested
that up to approximately the first 26 optimal dimensions
the information contained was sufficiently general to be
useful to the type of analysis being performed in this
study. However, the number of cases available for analysis
restricted the dimensionality to between 4 and 20
dimensions depending upon a particular algorithm being
developed. This implies that almost 99% of the information
available in the data set could be useful to the present
analysis. Fig 5-B shows that the limitations on the
number of cases available for analysis have restricted the
amount of information which could be used in this study to
approximately 40% for the four dimensional algorithms to as
much as 98% for the 20 dimensional algorithms. The majority
of the algorithms were developed at 16 dimensions which
corresponded to approximately 95% of the available information.
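The relationship between the two curves of Fig 5-B, and the way a dimensionality is read from the cumulative curve, can be sketched as follows. This is a minimal illustration only: the per-term information values are hypothetical round numbers chosen for the sketch (not the Turkey Point values), and the function names are not part of the ADAPT programs.

```python
from itertools import accumulate


def information_curves(term_information):
    """Return (per-term percent, cumulative percent) of the total information."""
    total = sum(term_information)
    if total <= 0:
        raise ValueError("total information must be positive")
    per_term = [100.0 * value / total for value in term_information]
    cumulative = list(accumulate(per_term))  # running sum = upper curve
    return per_term, cumulative


def dimensions_needed(term_information, target_percent):
    """Smallest number of leading dimensions reaching the target cumulative percent."""
    _, cumulative = information_curves(term_information)
    for count, percent in enumerate(cumulative, start=1):
        if percent >= target_percent - 1e-9:
            return count
    return len(term_information)


# Hypothetical information carried by each term of an optimal base.
hypothetical_terms = [4.0, 2.0, 1.0, 1.0]
per_term, cumulative = information_curves(hypothetical_terms)
print(per_term)
print(cumulative)
print(dimensions_needed(hypothetical_terms, 75.0))
```

The lower curve of the figure corresponds to `per_term` and the upper curve to `cumulative`; limiting the dimensionality because of the number of cases available simply truncates the cumulative curve at that number of dimensions.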
Fig 6-B presents the scatter plot projection of every
third case on the first two optimal dimensions of the second
analysis base. The effect of correcting the keypunching errors
can be seen by comparing Figures 1B through 3B with Figures 6-B
through 8B. Fig 7-B and 8B present the first and second optimal
functions using the corrected data. The first optimal function
remains strongly dependent on the temporal variables; however,
humidity and wind are more important to the first optimal function
than in the base with the keypunch errors. The second optimal
function is considerably different and is dominated by informa-
tion concerning the wind and humidity.
The numerals used to designate the cases on Figure 6-B have
the same meaning as for Figure 1B. Examination of Figure 6-B
shows that in general the numeral indicating the test increases
from left to right on this figure. This is in general
agreement with the temporal nature of the first optimal function.
Note that there are two major groupings of data on this scatter
plot. This indicates there are probably significant differences
between the characteristics of the tests performed in the
first half of the program and those performed in the second
half of the program. These groupings indicate potential areas
of incomplete data. For this data set, the majority of the
scatter plots are similar to the scatter plot projection data
on the third and fourth optimal directions shown in Figure 9B.
-------
FIGURE 6B - PROJECTION OF EVERY THIRD CASE ON FIRST TWO OPTIMAL
COORDINATES OF SECOND ANALYSIS BASE
[Figure: scatter plot of cases designated by numerals; abscissa NP1 ELEMENT.]
-------
FIGURE 7B - FIRST OPTIMAL FUNCTION SECOND ANALYSIS BASE
[Figure: optimal function plotted against INDEXING VARIABLE; variable groups labeled include HUMIDITY.]
-------
FIGURE 8B - SECOND OPTIMAL FUNCTION SECOND ANALYSIS BASE
[Figure: optimal function plotted against INDEXING VARIABLE; variable groups labeled WIND, HUMIDITY, TEMPORAL and WIND.]
-------
FIGURE 9B - PROJECTION OF EVERY THIRD CASE ON THIRD AND FOURTH OPTIMAL
DIRECTIONS OF SECOND ANALYSIS BASE
[Figure: scatter plot of cases designated by numerals; abscissa NP3 ELEMENT.]
-------
This scatter plot shows an even distribution of the data
indicating that the data set should be complete for the
analysis over the variation in variables considered for
this study. The next significant dichotomy occurred in
the scatter plot of the seventh and eighth optimal functions.
This scatter plot is shown in Figure 10B. The grouping in
the lower right hand corner of this scatter plot may be
attributed to cases for which the cross wind values are
low. This follows from the seventh optimal function which
is shown in Figure 11B. This function is dominated by the
cross wind variables, 20 through 29.
-------
FIGURE 10B - PROJECTION OF EVERY THIRD CASE ON SEVENTH AND EIGHTH
OPTIMAL DIRECTION OF SECOND ANALYSIS BASE
[Figure: scatter plot of cases designated by numerals; abscissa NP7 ELEMENT; grouping in lower right labeled LOW CROSSWIND.]
-------
FIGURE 11B - SEVENTH OPTIMAL FUNCTION SECOND ANALYSIS BASE
[Figure: optimal function plotted against INDEXING VARIABLE; variable groups labeled WIND, HUMIDITY and PREV WIND.]
-------
APPENDIX C
SELECTION AND ANALYSIS OF ALGORITHMS
Two types of algorithms were developed for the present
study. The first was an algorithm to estimate the precision
run error as a function of the environment and the second
were algorithms to estimate the background concentration as
a function of the environment and station location.
The algorithm for estimating the precision run error
was developed using available variables which defined the
environment at Stations 1 and 2. Unfortunately, only 65
cases were available to develop this algorithm. Analysis
of the information energy indicated that one should use
more than 20 dimensions for this algorithm. However, the
65 cases restrict one to less than 11 dimensions. Algorithms
were developed using both 20 and 11 dimensions. Confidence
in applying these algorithms to independent test cases is
low because of the dimensionality required by the poor
ratio of the number of independent cases to the dimensionality
of the analysis. The relative importance vector
for the 11 dimensional analysis is presented in the main
report. The ADAPT performance map suggests that the relative
importance vector for the 20 dimensional analysis is not
meaningful because of the small number of cases available.
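The arithmetic behind the dimensionality limits quoted above can be checked with a short sketch. It assumes, for illustration only, the rule of thumb cited in Appendix A that the number of cases divided by the number of dimensions should exceed six; the report itself bases the limit on the ADAPT performance map, which also depends on the performance of the algorithm. The function names are hypothetical.

```python
import math


def cases_per_dimension(n_cases: int, n_dims: int) -> float:
    """Ratio of learning cases to dimensions, the ordinate of the performance map."""
    if n_cases <= 0 or n_dims <= 0:
        raise ValueError("n_cases and n_dims must be positive")
    return n_cases / n_dims


def max_dimensions(n_cases: int, min_ratio: float = 6.0) -> int:
    """Largest dimensionality whose cases-per-dimension ratio still exceeds min_ratio.

    min_ratio = 6 is the rule of thumb assumed for this sketch.
    """
    if n_cases <= 0 or min_ratio <= 0:
        raise ValueError("n_cases and min_ratio must be positive")
    return max(math.ceil(n_cases / min_ratio) - 1, 0)


# The 65 precision run cases at the two dimensionalities used in the study.
for dims in (11, 20):
    print(f"{dims} dimensions: {cases_per_dimension(65, dims):.2f} cases per dimension")
print("largest dimensionality with ratio above six:", max_dimensions(65))
```

Under this assumed rule, 65 cases support fewer than 11 dimensions, and the 20 dimensional analysis has only about three cases per dimension, which is consistent with the low confidence expressed for that algorithm.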
Several options were available for algorithms to estimate
the background concentration at each of the stations. The
first option was to use the data pooled over all stations
to make an algorithm which would estimate the background
concentration as a function of both location and environment.
As discussed in the main body of the report, this option has
the severe restriction imposed by the manner in which the
wind vector was defined. Thus, it was decided that for this
study one would develop algorithms to estimate the background
concentration at each of the individual stations. These
algorithms determine the effect of the environment on the
background concentration of each of the measurement stations.
At all stations having approximately 100 or more measurements
available, algorithms were developed using 20, 16 and 12
dimensions. The performance of these algorithms was then evaluated
using the ADAPT performance map. In all of these cases, the
ADAPT performance map indicated that the 20 dimensional analysis
should yield physically meaningful algorithms. This implies that
the relative importance vector for the 20 dimensional analysis
should have physically significant meaning. However, the validity
114
-------
of the algorithm when applied to independent test cases may be
limited to a small set of cases for the higher dimensional
algorithms. For many applications, the ADAPT validity criteria
(see Appendix A) can be used to eliminate those cases for
which the algorithm is not valid. However, for the present
study the amount of data available was not sufficient to allow
one to discard a significant portion of the learning cases
to satisfy the validity criteria. Thus, a technique for
selecting a dimensionality which would ensure the applicability
of the algorithms to almost all of the cases was required.
The technique used to select a dimensionality allowing
sufficient generality of the resulting algorithms consisted
of comparing the average value estimated for the station under
the environmental conditions during which the cooling device
was operating with the average value observed at the station
during the time in which the cooling device was not operating. A
significant difference between these two values suggests that
the dimensionality used was too high and restricted the applicability
of the algorithm to a set of cases which did not include
a significant portion of the environmental conditions occurring
during the actual operation of the cooling device. For the cases
where more than 100 learning cases were available, 16 dimensions
were usually adequate. The exceptions to this were Stations 7
and 10, where only 12 dimensions could be used. The algorithms
for Stations 1 and 2 were developed using 20 dimensions
since these algorithms were not used to estimate background
concentrations. They were only used for interpretation of the
relative importance vectors. Since significantly fewer cases were
available at Stations 8, 9 and 11, dimensionalities of 4,
8 and 4 were used, respectively. The relative importance vectors
for the algorithms using these reference dimensionalities are
presented in the main body of the report for Stations 9 and 10
and in Figures 1-C through 8-C of this appendix for the remainder
of the stations. These relative importance vectors were used to
develop Tables 7 through 16 in the main body of the report.
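
The selection technique described above can be sketched in code. The
following Python fragment is a minimal illustration only, not the ADAPT
implementation. The function name, the data layout and the example numbers
are hypothetical, and the numerical tolerance used to decide what counts as
a "significant difference" is an assumption, since no numerical threshold
is stated here.

```python
from statistics import mean


def select_dimensionality(candidates, estimates_operating,
                          observed_not_operating, tolerance):
    """Pick the highest candidate dimensionality that still generalizes.

    candidates              -- dimensionalities tried, e.g. [12, 16, 20]
    estimates_operating     -- dict: dimensionality -> list of background
                               estimates for the conditions under which the
                               cooling device was operating
    observed_not_operating  -- concentrations observed at the station while
                               the cooling device was not operating
    tolerance               -- ASSUMED threshold on the difference of the two
                               averages; not a value taken from the report
    """
    observed_mean = mean(observed_not_operating)
    # Try the highest dimensionality first and step down until the two
    # averages agree to within the assumed tolerance.
    for n in sorted(candidates, reverse=True):
        estimated_mean = mean(estimates_operating[n])
        if abs(estimated_mean - observed_mean) <= tolerance:
            return n
    return None  # no candidate met the criterion


# Hypothetical numbers, for illustration only.
example_estimates = {20: [9.0, 11.0], 16: [5.5, 6.5], 12: [5.0, 6.0]}
example_observed = [5.0, 6.0, 7.0]
print(select_dimensionality([12, 16, 20], example_estimates,
                            example_observed, tolerance=1.0))
```

Searching from the highest dimensionality downward reflects the preference
expressed in the text for retaining as many dimensions as the data allow
while keeping the algorithm applicable to almost all cases.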
115
-------
FIGURE 1C
RELATIVE IMPORTANCE OF INDEX TO AMBIENT SALT AT STA-1+2 (N=20)
[Plot: relative importance of index versus indexing variable (see Table 2); variable groups labeled WIND, HUMIDITY and TEMPORAL.]
116
-------
FIGURE 2C
RELATIVE IMPORTANCE OF INDEX TO AMBIENT SALT AT STA-3 (N=16)
[Plot: relative importance of index versus indexing variable (see Table 2).]
117
-------
FIGURE 3C
RELATIVE IMPORTANCE VECTOR FOR AMBIENT CONC AT STATION-4 (N=16)
[Plot: relative importance of index versus indexing variable (see Table 2).]
118
-------
FIGURE 4C
RELATIVE IMPORTANCE VECTOR FOR AMBIENT CONC AT STATION-5 (N=16)
[Plot: relative importance of index versus indexing variable (see Table 2).]
119
-------
FIGURE 5C
RELATIVE IMPORTANCE VECTOR FOR AMBIENT CONC AT STATION-6 (N=16)
[Plot: relative importance of index versus indexing variable (see Table 2).]
-------
FIGURE 6C
RELATIVE IMPORTANCE VECTOR FOR AMBIENT CONC AT STATION-7 (N=12)
[Plot: relative importance of index versus indexing variable (see Table 2).]
121
-------
FIGURE 7C
RELATIVE IMPORTANCE OF INDEX TO AMBIENT SALT AT STA-8 (N=4)
[Plot: relative importance of index versus indexing variable (see Table 2).]
122
-------
FIGURE 8C
RELATIVE IMPORTANCE VECTOR FOR AMBIENT CONC AT STATION-11 (N=4)
[Plot: relative importance of index versus indexing variable (see Table 2).]
123
-------
APPENDIX D
ALGORITHMS FOR CALCULATING AMBIENT CONCENTRATION
Tables 1-D through 12-D provide the information required
to apply the algorithms derived to calculate the background
concentrations at Stations 3 through 11, which were used to
estimate the background concentrations under the conditions
for which the cooling devices were operating. Table 1-D
provides a definition of the indexing variable i, Table 2-D
presents the variable $\bar{V}_i$, and Table 3-D the variables $VMAX_i$
and $VMIN_i$. Tables 4-D through 12-D present the average, $SCAMB_k$,
and the algorithm vectors $A_{ki}$. To find the concentration
at Station k, $CAMB_k$, the numbers presented in these tables
should be combined according to the equation:
$$CAMB_k = SCAMB_k + \sum_i A_{ki}\,(V_i - \bar{V}_i)$$
where
$$V_i = 1 + \frac{VD_i - VMIN_i}{VMAX_i - VMIN_i}$$
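
As an illustration of how the tabulated quantities combine, the calculation
can be sketched in Python as follows. This is a minimal sketch under stated
assumptions, not the ADAPT code: the function and argument names are
hypothetical, the summation term is assumed to be added to the station
average (the sign convention should be checked against the original printed
report), and the numbers in the example are invented for illustration and
are not taken from Tables 1-D through 12-D.

```python
def normalize(vd, vmin, vmax):
    """Scale each raw data value VD_i to V_i = 1 + (VD_i - VMIN_i)/(VMAX_i - VMIN_i).

    Variables with VMAX_i equal to VMIN_i are not handled in this sketch.
    """
    return [1.0 + (d - lo) / (hi - lo) for d, lo, hi in zip(vd, vmin, vmax)]


def background_concentration(scamb, a, vd, vbar, vmin, vmax):
    """Estimate CAMB_k for one station.

    scamb -- station average SCAMB_k
    a     -- algorithm vector A_ki
    vd    -- raw data vector VD_i (see Table 1-D for the variable order)
    vbar  -- average vector (V-bar)_i
    vmin, vmax -- scaling limits VMIN_i and VMAX_i

    ASSUMPTION: the summation term is added to SCAMB_k.
    """
    v = normalize(vd, vmin, vmax)
    return scamb + sum(ai * (vi - vbi) for ai, vi, vbi in zip(a, v, vbar))


# Invented two-variable example, for illustration only.
print(background_concentration(
    scamb=5.0,
    a=[2.0, -1.0],
    vd=[5.0, 0.0],
    vbar=[1.25, 1.5],
    vmin=[0.0, 0.0],
    vmax=[10.0, 1.0],
))
```

In practice the vectors would have one entry per indexing variable, with the
entries taken in the order defined in Table 1-D.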
124
-------
TABLE 1-D
DEFINITION OF DATA VECTOR - VDi

VARIABLE NO   SYMBOL          DESCRIPTION
1             dE              Projection of position vector on East direction
2             dN              Projection of position vector on North direction
3             CC1             Binary Code for Light Rain
4             CCS             Binary Code for Bugs on Sample
5             CC4             Binary Code for Dust Contamination
6             CCS             Binary Code for Combination of Comments
7             CC9             Binary Code for White Caps
8             ts              Start Time
9             te              End Time
10-19         dwi (i=1,10)    Projection of Wind Vector on Position Direction,
                              10 Samples Between ts and te
20-29         Nwi (i=1,10)    Projection of Wind Vector on Normal to Position
                              Direction, 10 Samples Between ts and te
30-39         Ti (i=1,10)     Dry Bulb Temperature, 10 Samples Between ts and te
40-49         Di (i=1,10)     Difference Between Dry and Wet Bulb Temperature,
                              10 Samples Between ts and te
50-59         Hi (i=1,10)     Relative Humidity, 10 Samples Between ts and te
              CDC1            Binary Variable Indicating Cooling Tower Operation
              CDC2            Binary Variable Indicating Spray Modules Operating
60            DY              Day of Year
61            DFT             Days Since First Test
62            S1              Binary Variable Indicating Spring
63            S2              Binary Variable Indicating Summer
64            S3              Binary Variable Indicating Fall
65            M               Binary Variable, Test on Monday
125
-------
TABLE 1-D
DEFINITION OF DATA VECTOR - VDi (CONT'D)

VARIABLE NO   SYMBOL          DESCRIPTION
66            T               Binary Variable, Test on Tuesday
67            W               Binary Variable, Test on Wednesday
68            Th              Binary Variable, Test on Thursday
69            F               Binary Variable, Test on Friday
70            S               Binary Variable, Test on Saturday
71            dw (-1)         Projection of Preceding Day's Average Wind Vector
                              on Position Direction
72            dn (-1)         Projection of Preceding Day's Average Wind Vector
                              on Normal to Position Direction
73            dwsp (-1)       Preceding Day's Spread in Wind Speed
74            dnsp (-1)       Preceding Day's Spread in Wind Direction
75            dw (-1)         Preceding Day's Standard Deviation of Wind Speed
              du (-1)         Preceding Day's Standard Deviation of Wind Direction
-             PRE             Precision Run Error
126
-------
TABLE 2-D THE AVERAGE VECTOR
AIV( I )
AIV( I )
AIV(I)
AIVF
0.358.3F
0 .3 310F
ol3376F
0 .97 OOF
C.8300E
0.80 3 3F
0 .9 300F
0.9033F
0.9278F
OT.9744F
0.3350F
0. IOOOE
0. IOOOE
""IT. rfiiyoT
0 . 1 302E
04
0 1
0 1
02
02
02
02
0 P
02
02
0"?""
02
0 1
0 1
0 1
0 1
02
0?
02 """"
03
0 1
0 1
04
2
5
3
1 1
14
1 7
20
26
29
3?
3b
38
41
44
4 ^
50
53
5b
by
6?
65
68
74
-0
r
-0
-0
-0
-0
0
0
o
0
f\
C
n
0
0
0
6
0
-0
-o
2200E
O
3093E
2926E
31 22E
2 3 OOF
10 12F
1443E
9444F
1222E
n
.
0
O
G """" "
0
0
0
12 OOF
04
03
02
02
02
02
02
02
02
02
00
01
00
02
02
0. 760CF
0 . 1 OOOF
0. 2091F
7JT3 F72E
0 .3253F
0. 3334F
0 . 3440F
0". 340~5C
0 . 3390F
0 .368CE
0 . 330CF
0.3314E
0.3436F
0.9233E
0. 8189E
0 . 8344E
0. 9500E
0. 8967E
0. 9433E
ol IOOOE
0. IOOOE
0. IOOOE
T3.5889ET
0. 3COOE
03
01
04
32
02
02
02
02
02
02
02
01
01
01
02
02
02
01
01
01
o p
01
3
6
T5
15
18
21
27
30
33
39
42
45
48
51
54
57
6TT
63
66
69
75
0.0
ol 10 14E 04
-OI3037E 02
-0.3093E 02
-0.3645E 02
-0^30 18E 02
0.8500E 01
0. 1093F. C2
o!l450E 02
0.1289E 01
0.9778E 00
0.2444E 00
C.O
0.0
0.0
0.2360e 03
0.0
0.0
0.0
-C.7749E 02
0. 1400E 00
0. IOOOE
0. IOOOE
0.3386E
0.31 9 9E
0 .3280E
0 .3361E
0.3428E
ot3487E
0 .3320 F
0 .3290E
0~, 3333E
0 .3500E
C.8767E
0.81 1 IE
0.8822E
0 .9267E
0 .9 122E
0 .9589E
dlOOOE
0. IOOOE
0. IOOOE
ol2968E
0 1
0 1
04
02
02
02
02
02
02
02" ""
02
0 1
0 1
01
02
02
02
03
0 1
0 1
01
0 3
-------
TABLE 4D- ALGORITHM FOR STATION 3 (16 DIMENSIONS)
SCAMB3= 6.116851
2.42546077D CO
^4.966312650 00
..~e^l.3 493^X00^- C 1
-1.2051A681D-C1
-1.887193820 00
1.13870194D-01
-2^-14620XX2-2D OO
-1.64860387D 00
-1.029C5422D OC
1.94046728D 00
4-973045690 00
2.002f36740 00
2 .03222009D 00
2.06227453D CC
.Q926.1A.&-7D OC
2.123215460 CO
2.15371250O CO
2.10936124D 00
2..067S4866O OO
-5.953816280-01
-5.92855891D-C1
-5.90369861D-01
-5*876637530-01
-5.84928356D-C1
-5.826949670-01
"31
5.772*387970-01
5.64-7SS259O-CI
5.517274210-01
S.33437956D-C1
1.916559790 OO
1 .8035-49070. 00
1.67927958O CO
1.54477161D 00
1.38837149O 00
1 ȣ 1 24 2 -7S9D C C
I.025581500 00
8 . 168C691 1D-01
5.76968049O-01
3»425<;i873Q-4H
2.01CE6788D 00
1.67049351O CO
1.25636689D 00
7-. 197 7-7 2 57O=-4) 1
-7.03490223D-02
5.£54359410-01
1.15264176D 00
1.A3-783347D OC
2.019025760 OO
2.340922120 00
1.14532684O CO
6 .514846040-01
3.77777736O-01
9.74823498D-02
-4.353736130-01
-6.88753^220-01
-9.3417205OD-O1
4D -
-9. 77321486 D-C1
1-9 .77321 486 O-C1
7.274584170-01
-3. 44 7 44 68 7O- 01
3.062317440 00
"1 .21209601 D 00
^4 -^5S922-84-OO - 04J
-r4 .59490257O-C1
2.34905064O-01
1 .724033300-01
6. 08663 796 O- 01
-3. 52706584D-OI
9 . 260 49 1 33D-0 I
6 *^O4B3O 45O-04;
9. 33 0 8795 7 D- 01
TABLE 5D- (16 DIMENSIONS) STATION 4 SCAMB4 = 5.795147
1
2
3
4
5
f>
7
8
9
10
U
12
13
14
15
16
17
18
!9_
20
21
22
?3
24
25
-1.604768170 01
-3.23332704D 01
1«959715_P7D CO
4.93630355D-C1
1.2451J573D CC
-4.299C2948D-C1
9-1210361 7D-C1
.287122920 CC
.14152726D CC
.77463994D 00
.72659467D 00
.67441184O CC
,622677080 CO
.570C6485D CC
.51638385O CO
.46239796O CO
.40824997D CC
.306E1331D 00
.21076462O OC
, 19E48951D CC
, 14437545D CC
*09090530D CO
«,03f24439D CC
1
-1
1
1
1
1
1
9,795£347>D-C1
.9*209151320-01
26 8 .57624871D-C1 """Si
27 , 7.81133134Q-01 52
28 7.04714597D-C1 53
29 6.P2064905D-C1 54
30 -3.62037717D-.C1 55
31 -4. 1110I552D-C1 56
32 -4 .41635792D-01 57
33 -4.75527059D-C1 58
34 -5.C7939864D-01 59
3-5 -5,39387105D-C1 ^°
36 -5.729C9335D-C1 61
37 -6.01429153D-C1 £2
38 -6 .07372354D-C1 63 ,
39 -6.073133760-Cl 64 '
*0 7.47489023D-C1 65 -
41 7.70121110D-01 6& -
42 7.969500970-01 67
43 8,06923872D-C1 68
44 7.34971071D-C1 69
45 6.50722667D-01 70
46 5.859^36280-01 71
47 4.90863728D-C1 72
48 4.00725716D-01 73 -
49 3.243494840-C1 74 -
50 - -9.932673440-01 75 _
41
-9. 875228420-01
-9.81964400D-01
-9.57846042D-CI
-9. 108128250-01
-8. 648572360-01
8. 2 12577380-01
7.783796940-01
7.37427351D-C1
3.37054285O-C1
3.370S4285D-C1
6. 99 8 84300^01
2. 13200872O
4.75755515O-C1
4. 08747097D-01
4.9893 3727J?- C 1
1.410624310 00
1 . 805771970-01
1 .34045971D 00
5.63372007D-C1
6. 12978796D-C1
2.26522530D-01
1^767028430 00
6, 43405277 D-Ol
1.72C88997D 00
128
-------
TABLE 6D- (16 DIMENSIONS) STATION 5 SCAMB5 =4.91
A
51
A
51
A
51
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-3.33253519D CC
-5.57589773D CC
7.27794518D-0
-2. 13340726D-C
- 1 .29410470D C
- 1 . 162323600-C
7 .26561543D-C
9 .29432755D-C
6.78530171D-0
-1.01 744160D 00
-1.020315120 CC
- 1 .02137330D
- 1 .022587040
-1 -02374602D
- 1 .02477156D
025?C749D
- 1
-1 .026987570
CO
CC
oc
CC
oc
CO
9 .925431 1OD-C1
-9 .60 1 191 28D-C1
5 * 13926464O-C1
5.07199276D-01
5.00C899170-01
4.92909484D-01
4 .854257380-01
4 .77624051D-Cl
26 4.68274980D-C1
27 4.521 76599D-C1
28 4.35976205D-01
29 4. 156^5418D-C1
30 - 1 .49560762D-01
31 -1.994340040-01
32 -2.53127250D-C1
33 -3. 1 1441268D-C1
34 -3.71 <;3C1 930-C1
35 -4.34111133O-01
36 -5.00101582D-C1
37 -5.636C0899D-01
38 -6.05852249D-01
39 -.405665120-01
40 5.58663261D-C1
41 5.048367200-01
42 4.389363860-01
43 3 .46ei0676D-Cl
44 2.12436133O-C1
45 8.112?4059D-C2
46 -4.418139490-02
47 - 1 .S5841553D-C1
48 -2.477647230-01
49 -3.254687960-01
50 -4.36248263O-01
51 -3.88432601O-C1
52 -3.38237678D-C1
53 -2 .79995408D-C1
54 -2.15626567O-C1
55 - 1 .53270588D-C1
56 -9.31573656D-C2
57 -3.485C9930O-C2
58 2.149E0327D-C2
59 7.61324350O-C2
60 4.24794499D-C1
61 4 .24794499D-0I
62 - I .80910435D CC
63 -2.529030400 CC
64 4 .27678632O-C1
65 2.1 1950534D-C1
66 -6.04377566D-01
67 7. 61 7<5444 1 O- C 1
68 7.25334854D-01
69 1.8725754O CC
70 1.22291540D CC
71 -3 .30142092O-C1
72 7.84579452D-C2
73 -5.466102600-01
74 -6.393701000-C2
75 - 5 . 25412729O-C1
TABLE 7D- (16 DIMENSIONS) STATION 6 SCAMB6 = 5.459854
I
2
3
4
S.
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
5.848675870 00
1 . 102580810 01
-1.113810790 CO
-3.925Q3345D-C 1
- 2. 150<58054D CC
5.384345170-02
4.78532422D 00
-1.2574S313D OC
-7, 866C7345D-
- 1 . 54099437D
- I .591292340
-1.639616460
.6SJ36047O
.73744624D
.78724136D
-1.83742111D
-1.887535850
-1.87156680D
.856527790
.16«2<32C30
2.223387990
2.2aOf2027D
2.33C41501D
2.40037857O
2.46343749O
C1
00
CC
OC
00
00
CC
CO
CC
00
00
CC
CO
OC
00
OC
00
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
A, .
61
2 521C91C8D CO
2 54S47157D CC
2 56S41554O CC
2 56565914D CC
1 266E0800O 00
1 229170700 CC
18914638D CC
145520920 OC
09034119O CC
02430728O CC
1
1
1
1
9.54073617D-0I
8.68££3515D-C1
7.505651820-C1
6.307E3769O-C1
1 .231617890 CO
1 .090250710 CC
9.I7967742O-01
6.835423830-01
3 .62C28321D-C1
4 .90877328O-C2
2 .495209900-01
5 .03339538D-C1
7. 1862 16 190.-01
8.96292248D-C1
8.94165961D-C1
51
52
«S3
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
"61
7.863C8133O-C1
6 .73042046D-CI
5 .4343696 1D-0 1
4 .01915084O-CI
1.327672410-C1
4.714291 55O-C3
I . 191C6774D-C1
2.391C7165O-C 1
1 094621 75D- C 1
1 .09462175D-01
5.292471 91O-C1
7. 75753581D-C1
1 .00473025D 00
1 .238*03600-0 1
1.544177C6D CO
3.75493291O-C1
5 .34301955O-C1
A .21641 14 1O-01
1.092033250-01
4.554831 89O-01
1 .39672773D-C 1
1.149750280 00
6.064887460-01
1.182802270 CC
129
-------
TABLE 8D- (12 DIMENSIONS) STATION 7 SCAMB7 = 5.046803

 i   A7i                 i   A7i                 i   A7i
 1    5.89134390D 00    26    1.76010596D 00    51   -6.20216796D-01
 2   -1.73470027D 01    27    1.67821369D 00    52   -4.98271360D-01
 3    8.00221059D-01    28    1.59549484D 00    53   -3.62920435D-01
 4    8.15547703D-01    29    1.49790401D 00    54   -2.20160465D-01
 5    8.66471276D-01    30    1.85662322D-01    55   -8.19285882D-62
 6   -4.6302906D-01     31    8.77855321D-02    56    5.138C9277D-02
 7   -3.39706609D-01    32   -1.74765244D-02    57    1.80642583D-01
 8    1.55761793D 00    33   -1.318^4589D-01    58    3.05524303D-01
 9    5.96955537D-01    34   -2.53957641D-01    59    4.26722377D-01
10    1.439104Q5D 00    35   -3.82676114D-01    60   -1.05232853D 00
11    1.523335D 00      36   -5.19483617D-01    61   -1.05232853D 00
12    1.61314941D 00    37   -6.57171666D-01    62   -5.41566243D-02
13    1.69960359D 00    38   -7.70228501D-01    63   -8.15917483D 00
14    1.78741608D 00    39   -8.70961319D-01    64    1.32690723D 01
15    1.87620201D 00    40    6.900E3937D-01    65   -1.22039150D 00
16    1.96565601D 00    41    4.77275153D-01    66   -7.524C1933D-01
17    2.05523014D 00    42    2.17957174D-01    67   -1.460C4724D 00
18    2.07219412D 00    43   -1.00*^8445D-01    68    2.46961T67D-01
19    2.0S6C8964D 00    44   -4.33222526D-01    69   -1.17463340D 00
20    2.04953593D 00    45   -7.49016054D-01    70   -3.61623714D-01
21    2.00615675D 00    46   -1.05035731D 00    71    2.09731682D-01
22    1.961C4459D 00    47   -1.272C4984D 00    72    3.328S7380D-01
23    1.91467603D 00    48   -1.435C2948D 00    73   -2.47548793D-01
24    1.86664090D 00    49   -1.57426548D 00    74   -6.79397119D-01
25    1.816S4613D 00    50   -7.363S7816D-01    75   -2.59394553D-01
TABLE 9D- (4 DIMENSIONS) STATION 8 SCAMB8 = 5.663548

 i   A8i                 i   A8i                 i   A8i
 1   -4.33869465D-01    26   -1.10297053D-01    51   -3.18229512D-01
 2    2.22795162D-01    27   -1.06760573D-01    52   -2.99457511D-01
 3   -i.73989683D-02    28   -1.02160708D-01    53   -2.74498761D-01
 4   -0.35142927D-02    29   -^.85901810D-02    54   -2.43068936D-01
 5    1.08164161D-03    30    3.26652656D-01    55   -2.12711587D-01
 6   -3.11876957D-02    31    J.091C9185D-01    56   -1.83350606D-01
 7   -4.40092867D-02    32    2.90171144D-01    57   -1.S4942133D-01
 8    4.48024737D-01    33    2.69669550D-01    58   -1.27444530D-01
 9    3.27055201D-01    34    2.45570612D-01    59   -1.00802102D-01
10   -2.06188924D 00    35    2.18274675D-01    60   -1.43356166D 00
11   -2.09586349D 00    36    1.89260971D-01    61   -1.43356166D 00
12   -2.12655912D 00    37    1.56509919D-01    62   -1.77791361D 00
13   -2.15753494D 00    38    1.17784106D-01    63    1.09210798D 00
14   -2.16879162D 00    39    7.97079943D-02    64    4.43258288D-01
15   -2.22034656D 00    40    6.33366645D-01    65    2.21988650D-02
16   -2.25217795D 00    41    0.48336527D-01    66   -1.52869957D 00
17   -2.28393597D 00    42    6.66560216D-01    67    J.63871226D-01
18   -2.23626953D 00    43    6.69126835D-01    68    1.23338082D 00
19   -2.19133795D 00    44    0.03594940D-01    69    J.92427912D-01
20   -i.i98C9473D-01    45    5.34860991D-01    70   -2.99710691D-01
21   -1.18404514D-01    46    4.69256711D-01    71   -4.34761780D-01
22   -1.16943390D-01    47    J.86468290D-01    72   -D.29709879D-02
23   -1.15449349D-01    48    3.08510079D-01    73   -6.74931844D-02
24   -1.13900934D-01    49    2.42649065D-01    74   -1.99453779D-01
25   -i.12273125D-01    50   -3.36089249D-01    75   -0.48748376D-02
130
-------
TABLE 10D- (8 DIMENSIONS) STATION 9 SCAMB9 = 4.139158

 i   A9i                 i   A9i                 i   A9i
 1   -J.49663659D-02    26    2.31379252D 00    51    1.542S5250D-01
 2   -7.46958314D-01    27    2.31304965D 00    52    1.72156067D-01
 3    i.07058601D-01    28    2.30608587D 00    53    1.87435327D-01
 4   -J.46394474D-02    29    2.27655705D 00    54    1.97961415D-01
 5    i.60746174D-01    30    1.74752327D-01    55    2.08173345D-0
 6   -1.55359926D-01    31    1.64790652D-01    56    2.18015766D-0
 7   -7.33086190D-02    32    1.54043761D-01    57    2.27553342D-0
 8    2.77614409D-01    33    i.42391095D-01    58    2.36782529D-0
 9    1.69036120D-01    34    1.2£790869D-01    59    2.45721957D-0
10   -0.54367004D-01    35    1.13429258D-01    60   -1.20359985D 00
11   -5.52167867D-01    36    J.71534339D-02    61   -1.20359985D 00
12   -J.48999131D-01    37    7.88473730D-02    62   -1.86616703D 00
13   -5.45808707D-01    38    0.75322634D-02    63    3.56292429D-01
14   -0.42584407D-01    39    J.66664705D-02    64    1.135S7702D 00
15   -0.39295958D-01    40   -1.35399224D-01    65    J.38347162D-01
16   -0.36017888D-01    41   -i.65883265D-01    66    4.18734861D-01
17   -5.32627376D-01    42   -2.02402164D-01    67    i.09052718D 00
18   -o.1G362246D-01    43   -2.40545059D-01    68    3.41334281D-01
19   -*.90557527D-01    44   -2.56952335D-01    69    1.263^8525D 00
20    2.14664076D 00    45   -2.70326098D-01    70    2.67422078D-01
21    2.17406679D 00    46   -2.33031226D-01    71    2.81401830D-01
22    2.20238538D 00    47   -<;.806a4231D-01    72    J.01810801D-01
23    2.23169178D 00    48   -2.7310360O-01     73    4.46681181D-02
24    2.262C0045D 00    49   -2.66d43374D-01    74    1.97659178D-01
25    2.29334846D 00    50    1.37297689D-01    75    4.77784267D-02
TABLE 11D- (12 DIMENSIONS) STATION 10 SCAMB10 = 3.918661
A10i
'101
Sl0i
1
2
3
4
5
6
7
8
9
10
1 1
12
13
14
15
16
17
18
19
20
21
22
23
24
25
1.54383943O 00 gfi
1*521150870 00 27
8 . 14643000D-C2 28
2.51CC8622D-01 29
3.50393505D-02 30
1.485398670-01 31
1.58156644D-01 32
-4.59069261D-01 33
- 1.932997460-01 34
2.574431170 OG 35
2.619548190 00 35
2.66C57311D 00 37
2.70200B59D CC 38
2.74381828D CC 39
2.78597582D CC 40
2.82652307D OC 41
2.87102705D 00 42
2.813608240 CC 43
2.75
-------
TABLE 12D- (4 DIMENSIONS) STATION 11 SCAMB11 = 7.388696
A
Hi
A
Hi
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-1
5.86323127O-02
7.509774100-02
1.57452110D-02
1.3516S326D-C2
1.88676914D-01
2.263794530-C3
3.73047853D-02
5.510532790-01
4.65S1364D-01
1 .15121 1 750 00
170220820
1.18740365D
1.20474748D
1.222242060
1.239890960
1.25771539D
1.275503680
24890902O
-1
-1.22367818D
CC
00
oc
00
CO
00
oc
oc
CO
1.353532360-01
1.362399060-01
1 .371 £17920-01
1 .381242 530*-01
1.39123788B-01
1.40147733D-01
26 -1.407643090-01 51
27 - 1.3952264 ID-Cl 52
28 -1.381571610-01 53
29 - 1.35449666D-01 54
30 -2.219330400-01 55
31 -2.54711443D-01 56
32 -2.90091914D-01 57
33 -3.284C75800-C1 58
34 -3.673356950-01 59
35 -4.06478829D-01 60
36 -4.480576560-C1 61
37 -4.86438852D-01 62
38 -5.0645991 OD-01 63
39 -5.209099730-01 64
40 5.596593530-01 65
41 5.70344776D-C1 66
42 5.833374820-01 67
43 5 .82C45362D-01 68
44 5.212499490-01 69
45 4.578525590-01 70
46 3.97347772D-C1 71
47 3.22721877D-C1 72
48 2.52«69204D-01 73
49 1.94023043D-01 74
50 -3.85413706D-01 75
-3.702284 06O-C1
-3.54267707D-C1
-3.311682610-01
-3.002444620-01
-2.70340534D-01
-2.41451323D-C1
-2.13477296D-01
1.86413367O-C1
-1.601899920-01
-1.952C6617D 00
1.95206617D 00
-3.295883640 OC
7.51950727D-C1
1.179425770 CO
1.54421644O 00
9.20644355O-C1
1.634428630-01
6.45855191D-01
2.037641840-01
4.910453650-01
2.36S82827D-01
6.13333172D-C2
-8.27530747D-02
2.028680860-01
8.385S6629D-C2
132
-------
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.
   EPA-600/3-76-034
2.
3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
   Effect of Mechanical Cooling Devices on
   Ambient Salt Concentration
5. REPORT DATE
   April 1976
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
   Herbert E. Hunter
8. PERFORMING ORGANIZATION REPORT NO.
   ADAPT 75-8
9. PERFORMING ORGANIZATION NAME AND ADDRESS
   ADAPT Service Corporation
   23 Pine Ridge Circle
   Reading, Massachusetts 01867
10. PROGRAM ELEMENT NO.
   1BA032
11. CONTRACT/GRANT NO.
   68-03-2176
12. SPONSORING AGENCY NAME AND ADDRESS
   EPA/Pacific Northwest Research Lab
   Corvallis, Oregon 97330
13. TYPE OF REPORT AND PERIOD COVERED
   Final-Feb 1975-Sept 1975
14. SPONSORING AGENCY CODE
   EPA-ORD
15. SUPPLEMENTARY NOTES
16. ABSTRACT
   This report presents an analysis of the airborne salt concentration
   data collected during the demonstration of the salt water mechanical
   cooling devices at the Turkey Point power plant. The data were analyzed
   using the ADAPT family of empirical analysis programs, which are based
   on the concept that empirical analysis should be preceded by the
   development of an optimal (in the Karhunen-Loeve sense) representation
   of the data. The analysis presented in the report shows that the
   increase in the background salt concentration due to the cooling tower
   was less than the measurement accuracy of approximately three to five
   micrograms per cubic meter. The analysis also shows that the spray
   modules used in this test probably increased the background concentration
   at one station located approximately 430 meters from the spray module
   by approximately three micrograms per cubic meter. These results were
   obtained by analysis of statistical summaries of the difference between
   the measured concentration with the cooling device operating and the
   calculated background concentration for the same conditions.
17. KEY WORDS AND DOCUMENT ANALYSIS
   a. DESCRIPTORS                  b. IDENTIFIERS/OPEN ENDED TERMS
   Airborne Salt Concentration     Cooling Towers
   Regression Analysis             Spray Modules
   Statistical Data Analysis
   Thermal Pollution
   c. COSATI Field/Group
   13/02, 06/06, 12/01, 04/02, 18/05, 20/04
18. DISTRIBUTION STATEMENT
   RELEASE TO PUBLIC
19. SECURITY CLASS (This Report)
   Unclassified
20. SECURITY CLASS (This page)
   Unclassified
21. NO. OF PAGES
22. PRICE
EPA Form 2220-1 (9-73)
133
U.S. GOVERNMENT PRINTING OFFICE: 1976697-053183 REGION 10
-------