United States
Environmental Protection
Agency
EPA/600/R-10/128
September 2010
Office of
Research & Development
National Health and
Environmental Effects
Research Laboratory
-------
Approaches to Identify Exceedances
of Water Quality Thresholds
Associated with Ocean Conditions
Cheryl A. Brown and Walter G. Nelson
National Health and Environmental Effects Research Laboratory
Western Ecology Division
Pacific Coastal Ecology Branch
2111 SE Marine Science Dr.
Newport, OR 97366
-------
Abstract
Estuaries along the west coast of the United States periodically have high nutrient,
high chlorophyll a, and low dissolved oxygen levels due to the intrusion of oceanic water
into the estuaries. This oceanic water often has water quality conditions which exceed
water quality standards and indicators of eutrophication status. Tools are needed to
distinguish such exceedances of water quality thresholds related to import of oceanic
water from other causes. In this report, we present an application of logistic regression
models to predict the probability of exceedance of water quality thresholds using flood-
tide nutrient and dissolved oxygen data from the Yaquina Estuary. Models including
water temperature and salinity correctly classified exceedances of dissolved inorganic
nitrogen and phosphorus thresholds about 90% of the time, and for dissolved oxygen
about 80% of the time. Inclusion of in situ fluorescence in the logistic regression model
for dissolved oxygen improved the model performance and reduced the rate of false
positives.
-------
Table of Contents
Abstract
List of Figures
List of Tables
Disclaimer
Acknowledgments
1. Introduction/Background
2. Methods
2.1 Data Used in the Analyses
2.1.1 Yaquina Estuary Flood-tide Nutrient Data
2.1.2 Yaquina Estuary Continuous Data
2.1.3 Additional Yaquina Estuary Data
2.1.4 Other Sources of Data
2.2 Data Analyses
2.2.1 Logistic Regression
3. Results and Discussion
3.1 Role of Oceanic Conditions in Causing Exceedances of Water Quality Thresholds
3.1.1 Nutrients
3.1.2 Dissolved Oxygen
3.2 Development of Indicators of Ocean Influence
3.2.1 Nutrients
3.2.2 Dissolved Oxygen
4. Summary
5. Literature Cited
Appendices
-------
List of Figures
Figure 1. Location map showing the location of flood-tide nutrient sampling and
continuous monitoring station (Y1) inside Yaquina Estuary, and inner shelf stations
from Wetz et al.
Figure 2. Interannual variability in the percent of dry season observations with a) DIN >
14 uM and b) DIP > 1.3 uM in Zone 1
Figure 3. Percent of dry season observations with DIN > 14 uM in the lower estuary and
median flood-tide water temperatures for each year
Figure 4. Percent of dry season observations of flood-tide dissolved oxygen < 6.5 mg L⁻¹
at station Y1 for each year
Figure 5. DIN as a function of temperature and salinity generated using dry season data
from a) the inner shelf off of Newport, Oregon and b) flood-tide samples from station
Y1 in the Yaquina Estuary
Figure 6. DIP as a function of temperature and salinity generated using dry season data
from a) the inner shelf off of Newport, Oregon and b) flood-tide samples from station
Y1 in the Yaquina Estuary
Figure 7. False positive and false negative rates as a function of prediction point for the
logistic regression model for DIN > 14 uM using water temperature and salinity
Figure 8. ROC curve for logistic regression model for DIN > 14 uM using water
temperature and salinity with an AUC value of 0.94
Figure 9. Temperature and salinity of cruise data measured in the Yaquina Estuary
during the dry seasons of 1998-2008 with DIN < 14 uM and DIN > 14 uM, and
contours of probability of DIN > 14 uM generated from logistic regression model
with water temperature and salinity as explanatory variables
Figure 10. Example of a mixing diagram showing a riverine DIN source
Figure 11. Temperature and salinity of cruise data measured in the Yaquina Estuary
during the dry seasons of 1998-2008 with DIP < 1.3 uM and DIP > 1.3 uM, and
contours of probability of DIP > 1.3 uM generated from logistic regression model
with water temperature and salinity as explanatory variables
Figure 12. Flood-tide dissolved oxygen at station Y1 in the Yaquina Estuary plotted
versus a) temperature and salinity, and b) sigma-t and in situ fluorescence
Figure 13. False positive and false negative rates as a function of prediction point for
logistic regression model for dissolved oxygen < 6.5 mg L⁻¹ using water temperature
and salinity as explanatory variables
Figure 14. Temperature and salinity of cruise data measured in the Yaquina Estuary
during May to October of 2006 and 2007 with DO > 6.5 mg L⁻¹ and DO < 6.5 mg L⁻¹,
and contours of probability of DO < 6.5 mg L⁻¹ generated from logistic regression
model with water temperature and salinity as explanatory variables
-------
List of Tables
Table 1. Water quality thresholds used in development of logistic regression models
Table 2. Intercepts and coefficients for logistic regression models for exceedances of
DIN and DIP thresholds
Table 3. Sample size and area under the receiver operating characteristic curve (AUC)
for DIN and DIP models
Table 4. Classification table showing accuracy of the water temperature and salinity
logistic regression equation at predicting DIN > 14 uM using the reserved data
Table 5. Classification table showing accuracy of the water temperature and salinity
logistic regression equation at predicting DIP > 1.3 uM using the reserved data
Table 6. Observed DIN and DIP, water temperature and salinity, and probability of
exceeding nutrient thresholds calculated using water temperature and salinity at time
of sampling and previous flood tide
Table 7. Observed median NO3 + NO2 and PO4 for May - September 2008 and
modeled using water temperature at time of sampling and water temperature during
flood tide previous to sampling
Table 8. Intercepts and coefficients for logistic regression models for occurrences of
dissolved oxygen < 6.5 mg L⁻¹
Table 9. Sample size and area under the receiver operating characteristic curve (AUC)
for the dissolved oxygen models
Table 10. Classification table showing accuracy of the water temperature and salinity
logistic regression equation for predicting the occurrence of flood-tide DO < 6.5
mg L⁻¹ using data collected during May - September 2009 at station Y1
Table 11. Classification table showing accuracy of the water temperature, salinity, and in
situ fluorescence logistic regression equation for predicting the occurrence of flood-
tide DO < 6.5 mg L⁻¹ using data collected during May - September 2009 at
station Y1
Table A1. Classification table for DIN logistic regression with water temperature and
salinity as explanatory variables for probability prediction points ranging from
0 to 1
Table A2. Classification table for DIP logistic regression with water temperature and
salinity as explanatory variables for probability prediction points ranging from
0 to 1
Table A3. Classification table for dissolved oxygen logistic regression with water
temperature and salinity as explanatory variables for probability prediction points
ranging from 0 to 1
-------
Disclaimer
The information in this document has been funded wholly by the U.S.
Environmental Protection Agency. It has been subjected to review by the National
Health and Environmental Effects Research Laboratory and approved for publication.
Approval does not signify that the contents reflect the views of the Agency, nor does
mention of trade names or commercial products constitute endorsement or
recommendation for use.
Acknowledgments
The authors would like to thank the following for their contributions to this
report: Mr. T Chris Mochon Collura of Western Ecology Division (WED) for his work in
calibration of the instruments; Dr. Robert Ozretich of WED for cruise data; Mr. Pat
Clinton of WED for map production; Dr. Peter Eldridge (deceased) of WED for nutrient
data; employees of Dynamac, Inc. for field work; and Dr. Melanie Frazier of WED for
statistical advice and reviewing a previous version of this report.
-------
1. Introduction/Background
In response to the Clean Water Act requirements to protect and restore the quality
of surface waters of the nation, EPA has developed a strategy of assisting the States to
develop numeric nutrient criteria as part of water quality standards designed to protect the
designated uses of State waters. EPA has provided guidance to the States and Tribes for
developing nutrient criteria for estuarine and coastal waters (US EPA, 2002). The Office
of Research and Development, National Health and Environmental Effects Research Laboratory
(NHEERL) has been conducting research to support improvements to the scientific basis
for estuarine, numeric nutrient criteria. In the Pacific Northwest (PNW) region,
NHEERL scientists have previously synthesized the research results of field sampling,
trend analyses, and modeling approaches to produce a case study for development of
numeric nutrient targets for Yaquina Estuary, Oregon (Brown et al., 2007).
Due to the seasonal variability in water quality conditions within the Yaquina
Estuary, Brown et al. (2007) recommended that separate criteria be developed for wet
(November - April) and dry seasons (May - October). Since there is little biological
utilization of nutrients during the wet season, development of dry season criteria was
suggested as a higher priority. In addition, it was recommended that the estuary be
divided into two zones and separate criteria be developed for the ocean-dominated (Zone
1) and watershed and point source dominated (Zone 2) regions (see Figure 1). Using in
situ observations within the Yaquina Estuary as a basis for determining an Estuarine
Reference Condition, median values were suggested as potential dry season criteria for
dissolved inorganic nitrogen (DIN) and phosphate. In the present report, the potential
numeric criteria are termed "water quality thresholds."
During April through September along the Pacific Northwest coast of the U.S.,
seasonal, wind-driven coastal upwelling advects relatively cool, nutrient rich water to the
surface, which is then advected into the estuaries during flood tides. Previous studies
have demonstrated that water quality conditions within PNW estuaries during the
summer are influenced by intrusions of upwelled oceanic water into the estuaries,
affecting nutrients (Haertel et al., 1969; de Angelis and Gordon, 1985; Brown and
Ozretich, 2009), phytoplankton (Roegner and Shanks, 2001; Roegner et al., 2002; Brown
and Ozretich, 2009), and dissolved oxygen (Pearson and Holt, 1960; Haertel et al., 1969;
Brown and Power, in review) levels. The coupling of water quality conditions between
the coastal ocean and the adjacent estuaries can be problematic when assessing compliance
with water quality standards and when evaluating eutrophication status for estuarine systems
in the region.
The objective of this report is to provide a set of simple, statistical approaches that
may be used to distinguish exceedances of water quality thresholds resulting from natural
conditions in the near coastal ocean as distinct from other causes of exceedances. We
used the physical characteristics of upwelled water, namely temperature and salinity, as
indicators of upwelled water within the estuary. Statistical methods were then applied to
estimate the probability that an observed water quality parameter, such as a nutrient or
dissolved oxygen concentration, was strongly influenced by conditions in the nearshore
water, and therefore that an exceedance was associated with ocean conditions at the time
of sampling. An additional approach is presented in which observed nutrient levels are
compared to modeled values based on temperature-nutrient relationships developed using
data from outside the estuary on the adjacent continental shelf.
-------
2. Methods
2.1 Data Used in the Analyses
We assembled data primarily from a variety of research projects conducted by the
Pacific Coastal Ecology Branch (U.S. EPA) to serve as the basis for development of
statistical methods to detect the influence of near coast ocean waters on estuarine water
quality conditions. EPA data were supplemented with additional data sources described
below. Due to the multiple data sources, there is considerable interannual variability in
sampling locations and sampling frequency. Although the studies used were not
specifically designed to address issues of exceedances of water quality thresholds, they
do allow us to examine the importance of variability of ocean conditions on water quality
measurements within the estuary. All of the data used in the analyses were from the dry
season, which is defined as the months of May through October.
Figure 1. Map showing the location of flood-tide nutrient sampling and continuous
monitoring station (Y1) inside Yaquina Estuary, and inner continental shelf stations
(NH-1 and NH-5) from Wetz et al. (2005). Meteorological data are available from NOAA
station NWP03 and flood-tide water temperature data from station SBEO3. The
boundary delineating Zone 1 and Zone 2 is also presented.
-------
2.1.1 Yaquina Estuary Flood-tide Nutrient Data
During the period of May through October of 2002, 2003, and 2004, once daily
water samples were collected during flood tides at an approximate depth of 0.5 m at the
Oregon State University Dock (Y1, Figure 1), which is located inside Yaquina Estuary 4
km from the seaward end of the inlet jetties. Water samples were immediately filtered
using GF/F filters and frozen for storage until analysis. Dissolved inorganic nutrients
(NO3 + NO2, NH4, and PO4) were analyzed by MSI Analytical Laboratory (University
of California-Santa Barbara, CA) using Lachat flow injection instrumentation (Zellweger
Analytics, Milwaukee, WI). Dissolved inorganic nitrogen (DIN) is composed of NO3,
NO2, and NH4, and dissolved inorganic phosphorus (DIP) represents PO4.
Commencing on August 28, 2002 and continuing through September 2004, an automated
sampler (ISCO®, Model 3700FR, Lincoln, NE, USA), programmed using the predicted
time of each high tide, was used to collect water samples for each flood tide. Samples
were held in a dark, refrigerated compartment and were collected daily, filtered and
frozen for nutrient analysis.
2.1.2 Yaquina Estuary Continuous Data
During 2002-2009, time-series data (temperature, salinity and dissolved oxygen)
were collected every 15 minutes at station Y1 in the Yaquina Estuary using water quality
monitoring sondes (YSI 6600, YSI, Inc., Yellow Springs, OH, USA). Beginning in 2004,
in situ fluorescence was also measured. The sondes were calibrated prior to use
following the manufacturer's recommendations. Temperature sensors were factory
calibrated by the manufacturer and their performance was checked prior to and
subsequent to deployment. Conductivity was calibrated with a one-point calibration.
The dissolved oxygen sensor was calibrated using the saturated air-in-water method. In
situ fluorescence was calibrated with a two-point calibration, using reverse osmosis water
and a rhodamine WT solution with data reported as ug L⁻¹. Sonde performance was
checked in a flow-through seawater bath in the laboratory immediately before and after
deployment. Several techniques were used to identify time periods of significant
biological fouling of the sensors. These techniques included post-deployment calibration
checks, comparison of results from adjacent stations, comparison to independent discrete
measurements (if available), and comparison of the last few readings of a deployment to
the first few readings of the newly deployed sonde. If there was evidence of biofouling
or sensor drift, these data were excluded from the analyses. Flood-tide values were
identified and extracted from the 15-min data record using the maximum salinity data
that occurred closest to the time of predicted high tides. During 2002-2004, two sondes
were deployed at station Y1, one deployed about 1 m below the water surface and the
second deployed about 2.5 m below the water surface. During 2007-2009, only one sonde
was deployed at 2.5 m. Data from the 2.5-m depth sonde were utilized, with data from
the 1-m depth sonde substituted to fill data gaps. During 2005 and 2006, there appeared
to be a positive bias in salinity data; therefore, these data were excluded from analyses.
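
The flood-tide extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the tuple layout, and the 3-hour search window are hypothetical, and "maximum salinity closest to predicted high tide" is interpreted here as the highest salinity reading within the window, with ties broken by closeness to the predicted high tide.

```python
from datetime import datetime, timedelta

def flood_tide_record(records, high_tide, window_hours=3.0):
    """Pick the flood-tide reading for one predicted high tide.

    records: list of (timestamp, salinity, temperature) tuples from the
             15-minute sonde record.
    window_hours: HYPOTHETICAL search window; the report gives no value.
    Returns the record with maximum salinity inside the window, breaking
    ties by closeness to the predicted high tide, or None if the window
    holds no data.
    """
    window = timedelta(hours=window_hours)
    nearby = [r for r in records if abs(r[0] - high_tide) <= window]
    if not nearby:
        return None
    return max(
        nearby,
        key=lambda r: (r[1], -abs((r[0] - high_tide).total_seconds())),
    )

# Illustrative (made-up) readings at 15-minute spacing.
start = datetime(2003, 7, 1, 12, 0)
readings = [(31.2, 12.1), (32.8, 10.4), (33.4, 9.2), (33.4, 9.0), (32.5, 10.8)]
records = [
    (start + timedelta(minutes=15 * i), sal, temp)
    for i, (sal, temp) in enumerate(readings)
]
best = flood_tide_record(records, high_tide=start + timedelta(minutes=40))
```

Here two readings share the maximum salinity, so the one nearer the predicted high tide is selected.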
2.1.3 Additional Yaquina Estuary Data
Additional data were collected by the Pacific Coastal Ecology Branch in Zone 1
of the Yaquina Estuary during the months of May through October of 1998-2008 either
from water quality cruises or the OSU dock. Sampling stations extended from near the
mouth to a distance of about 12 km up estuary. Data compiled included dissolved
inorganic nutrients, dissolved oxygen, water temperature, and salinity. The data collected
from 1998-2006 were previously described in Brown et al. (2007). During 2007, seven
stations were sampled in the lower estuary approximately once per month from June 8 to
September 25. During May through October of 2007 and 2008, nutrient samples were
collected approximately weekly at station Y1. This sampling was random with respect to
tidal stage.
2.1.4 Other Sources of Data
Classification Dataset
As part of a study to classify estuaries with regard to their susceptibility to
nutrient enrichment (Lee and Brown, 2009), we have conducted short-term deployments
of instruments to measure temperature, salinity and dissolved oxygen levels in several
Oregon estuaries. Data sondes were deployed near the mouths of the Siletz (June 23-25,
2008), Tillamook (July 19-25, 2005), and Umpqua (June 21-26, 2005) estuaries during
the summers of 2005 and 2008. Length of deployments varied from 2 to 7 days and the
same calibration procedures described above were used for these deployments.
-------
Coos Bay Dataset
Continuous sonde data were also available for a station near the mouth of Coos
Bay (Charleston Bridge Station at Latitude: 43° 20' 15.72" N, Longitude 124° 19'
13.92" W) from the South Slough National Estuarine Research Reserve
(http://nerrs.noaa.gov/SouthSlough/).
Near shore Data
Additionally, we compared the flood-tide data collected at station Y1 to
temperature, salinity and nutrient data from 1997 through 2004 that were available from
Wetz et al. (2005) for two stations on the inner continental shelf off Newport, Oregon
(NH-5, NH-15; Figure 1). Data from the months of May through October, which
coincides with the period during which upwelling predominantly occurs, were extracted.
Hourly wind speed and direction data were available from a near shore NOAA weather
station adjacent to the Yaquina Estuary (NWP03, Figure 1; http://www.ndbc.noaa.gov/).
Flood-tide nutrient, water temperature, and dissolved oxygen conditions at station Y1
have previously been correlated with integrated alongshore wind stress (Brown and
Ozretich, 2009; Brown and Power, in review). Integrated alongshore wind stress (with a
decay coefficient of 2 days) was calculated using wind data from station NWP03 during
the years of 1998-2008. Details on calculation of integrated alongshore wind stress are
provided in Brown and Ozretich (2009). In this report, a positive wind stress indicates
upwelling favorable wind stress from the north. Water temperature data were available
from a NOAA station located inside the Yaquina Estuary (SBEO3, Figure 1;
http://www.ndbc.noaa.gov/station_page.php?station=sbeo3).
2.2 Data Analyses
The first step in identifying exceedances of water quality thresholds associated
with ocean input is to specify the target thresholds (Table 1). Potential numeric targets
for DIN and DIP have been previously identified by Brown et al. (2007). These
thresholds are based on dry season median values of DIN and DIP for Zone 1 in the
Yaquina Estuary. The threshold for dissolved oxygen is the state of Oregon criterion for
estuarine waters (http://arcweb.sos.state.or.us/rules/OARs_300/OAR_340/340_041.html).
-------
Table 1. Water quality thresholds used in development of logistic regression models.
Parameter          Threshold     Source
DIN                14 uM         Brown et al. (2007)
DIP                1.3 uM        Brown et al. (2007)
Dissolved Oxygen   6.5 mg L⁻¹    State of Oregon Criterion for Estuaries
2.2.1 Logistic Regression
To develop indicators that can be used to determine whether ocean conditions are
responsible for exceedances of water quality thresholds, we used logistic regression
models. Logistic models can be used to predict the probability of an event when the
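dependent variable is dichotomous. Logistic regression models have the form
$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \qquad \text{(Eq. 1)}$$
where in our application $p$ is the probability of exceedance of a water quality threshold,
$\beta_0$ is a constant, and $\beta_1, \beta_2, \ldots, \beta_k$ are the regression coefficients of variables
$x_1, x_2, \ldots, x_k$, respectively. The probability of an event occurring can be calculated as
$$p = \frac{1}{1 + e^{-\operatorname{logit}(p)}} \qquad \text{(Eq. 2)}$$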
The logistic regression models were used to predict the probabilities that water
quality thresholds were exceeded due to ocean conditions at the time of sampling.
Logistic regression equations were generated for DIN, DIP, and dissolved oxygen water
quality thresholds (Table 1). To create a dichotomous outcome for each dependent
variable, a threshold value which is indicative of a potential water quality objective or
threshold was specified. For DIN and DIP, if nutrient levels exceeded the threshold of
either 14 uM or 1.3 uM, respectively, then a value of 1 was assigned (i.e., water quality
objective not met), otherwise it was assigned a value of 0. For dissolved oxygen, if
concentrations were below the threshold of 6.5 mg L⁻¹, a value of 1 was assigned;
otherwise it was assigned a 0. All logistic regression models were generated using R
(version 2.8.1; R Development Core Team, 2008).
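
The coding of the dichotomous outcome and the conversion from the linear predictor to a probability can be sketched in a few lines. The report's models were fitted in R; the Python below is only an illustration, the function names are hypothetical, and the coefficients are invented for demonstration and are not the fitted values reported in Table 2 or Table 8.

```python
import math

def exceedance_flag(value, threshold, below=False):
    """Return 1 when the water quality objective is not met, else 0.

    below=False: an exceedance means value > threshold (DIN, DIP).
    below=True:  an exceedance means value < threshold (dissolved oxygen).
    """
    if below:
        return 1 if value < threshold else 0
    return 1 if value > threshold else 0

def logit(coefficients, predictors):
    """Linear predictor: beta_0 + beta_1*x_1 + ... + beta_k*x_k."""
    intercept, *slopes = coefficients
    return intercept + sum(b * x for b, x in zip(slopes, predictors))

def exceedance_probability(coefficients, predictors):
    """Probability of exceedance: p = 1 / (1 + exp(-logit(p)))."""
    return 1.0 / (1.0 + math.exp(-logit(coefficients, predictors)))

# HYPOTHETICAL coefficients (intercept, temperature, salinity),
# chosen only so that cold, salty water gives a high probability.
demo_coefficients = (-20.0, -1.0, 1.0)
p_cold_salty = exceedance_probability(demo_coefficients, (9.0, 33.5))
p_warm_fresh = exceedance_probability(demo_coefficients, (14.0, 31.0))

din_flag = exceedance_flag(15.2, 14.0)             # DIN above 14 uM
do_flag = exceedance_flag(6.0, 6.5, below=True)    # DO below 6.5 mg/L
```

With these made-up coefficients, the cold, high-salinity case yields a probability near 1 and the warm, lower-salinity case a probability near 0, which is the qualitative behavior an upwelling indicator is expected to show.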
-------
Logistic regression models were generated using the following explanatory
variables: 1) water temperature, 2) water temperature and salinity, and 3) sigma-t
(calculated from water temperature and salinity). For dissolved oxygen, an additional
logistic regression model was developed using water temperature, salinity, and in situ
fluorescence as explanatory variables. All models were developed using flood-tide data,
since this is representative of oceanic water advected into the estuary.
To validate the logistic regression models, we randomly selected 20% (108 data
points) of the 2002-2004 flood-tide nutrient data and reserved it for model validation.
These reserved observations were randomly selected for DIN and DIP independently.
For the dissolved oxygen logistic regression model, we used flood-tide dissolved oxygen
data collected at station Y1 in May to September 2009 for model validation. Since water
temperature and salinity were not measured at the time of nutrient sample collection, we
used the continuous data described in the previous section for these parameters.
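
A random hold-out of this kind, followed by a tally of correct and incorrect classifications on the reserved data, might look like the sketch below. It is an assumption-laden illustration rather than the authors' code: the function names, the seed, and the 0.5 prediction point are hypothetical, and the total of 540 observations is simply the value implied by 108 being 20% of the data.

```python
import random

def reserve_for_validation(observations, fraction=0.2, seed=0):
    """Randomly split observations into (training, validation) lists."""
    rng = random.Random(seed)           # fixed seed so the split is repeatable
    indices = list(range(len(observations)))
    rng.shuffle(indices)
    n_reserved = round(fraction * len(observations))
    reserved = set(indices[:n_reserved])
    training = [o for i, o in enumerate(observations) if i not in reserved]
    validation = [o for i, o in enumerate(observations) if i in reserved]
    return training, validation

def classification_table(observed, predicted_prob, prediction_point=0.5):
    """Count outcomes when a modeled probability above the prediction
    point is treated as a predicted exceedance."""
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for obs, prob in zip(observed, predicted_prob):
        predicted = 1 if prob > prediction_point else 0
        if predicted == 1 and obs == 1:
            counts["true_pos"] += 1
        elif predicted == 1 and obs == 0:
            counts["false_pos"] += 1
        elif predicted == 0 and obs == 0:
            counts["true_neg"] += 1
        else:
            counts["false_neg"] += 1
    return counts

# 540 placeholder observations: 20% of 540 is the 108 points cited above.
training, validation = reserve_for_validation(list(range(540)))

# Made-up observed flags and modeled probabilities for four samples.
table = classification_table([1, 1, 0, 0], [0.9, 0.3, 0.6, 0.1])
```

Raising the prediction point trades false positives for false negatives, so the tally would normally be repeated over a range of prediction points.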
3. Results and Discussion
3.1 Role of Oceanic Conditions in Causing Exceedances of Water Quality
Thresholds
Numerous studies have found that there is considerable interannual variability in
oceanic conditions on the shelf in the California Current Region resulting in variability in
nutrient, chlorophyll a, and dissolved oxygen levels (Corwith and Wheeler, 2002;
Thomas et al., 2003; Wheeler et al., 2003; Grantham et al., 2004; Barth et al., 2007). In
addition, it has been demonstrated that conditions on the shelf influence water quality
conditions within PNW estuaries (Roegner and Shanks, 2001; Roegner et al., 2002;
Brown and Ozretich, 2009). Upwelling conditions typically result in high concentrations
of DIN and DIP and lower concentrations of dissolved oxygen in surface waters that get
advected into PNW estuaries. The magnitudes of increased nutrient or decreased oxygen
concentrations vary with the year-to-year strength of upwelling. Therefore, it follows
that non-attainments of estuarine water quality criteria may be related to interannual
variability in ocean conditions.
-------
3.1.1 Nutrients
To examine the importance of interannual variability in ocean conditions on
nutrient levels within the estuary, we used the lower estuary (Zone 1) dry season nutrient
threshold values presented in Table 1. We then calculated the number of exceedances of
these thresholds for each year over the period 1998-2008 based on nutrient-cruise data. It
is important to note that the data used to examine interannual variability in exceedances
includes the data used to generate the criteria for the years of 1998-2006. The 90%
confidence intervals for the percent of observations constituting exceedances of the
nutrient thresholds were calculated following the method of Donohue and van Looij
(2001). An annual exceedance of the nutrient threshold is declared if the lower bound of the
90% confidence interval falls above 50% of observations being exceedances.
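
The decision rule above can be illustrated with a short sketch. Note the substitution: the report computes its intervals following Donohue and van Looij (2001), whereas the code below uses the Wilson score interval for a binomial proportion as a stand-in, so its bounds will not necessarily match those in Figure 2. The function names and the example counts are hypothetical.

```python
import math

def wilson_interval(n_exceed, n_total, z=1.6449):
    """Two-sided Wilson score interval for a proportion.

    z = 1.6449 corresponds to 90% confidence. This is a stand-in for the
    Donohue and van Looij (2001) method, not a reproduction of it.
    """
    p_hat = n_exceed / n_total
    denom = 1.0 + z * z / n_total
    center = (p_hat + z * z / (2.0 * n_total)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / n_total + z * z / (4.0 * n_total ** 2))
        / denom
    )
    return center - half_width, center + half_width

def annual_exceedance(n_exceed, n_total, level=0.5):
    """True when the lower confidence bound lies above `level`."""
    lower, _ = wilson_interval(n_exceed, n_total)
    return lower > level

# Made-up counts: 70 of 100 and 12 of 21 observations above the threshold.
large_sample = annual_exceedance(70, 100)
small_sample = annual_exceedance(12, 21)
```

The second case shows why the rule is conservative for small samples: more than half of the observations exceed the threshold, yet the interval is too wide for the year to be flagged.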
Based on this technique, the criterion for DIN (based on median values) would be
exceeded during 2001 and 2002 (Figure 2a) and the DIP criterion would be exceeded in
2002 (Figure 2b). For an additional comparison, we used the flood-tide nutrient samples
collected at the OSU dock in 2002, 2003, and 2004. The percent of observations
exceeding the DIN threshold was similar to that for the cruise data in each year
(Figure 2a). There was a significant correlation between the annual percent of
observations exceeding the DIN and DIP thresholds (r = 0.87, p = 0.001, Pearson Product
Moment Correlation), because the ocean is the primary source of both of these nutrients
in the lower estuary.
In the dry season, there is also a significant correlation between interannual
variability in exceedances and median flood-tide water temperature (Figure 3), which is
an indicator of the relative strength of coastal upwelling (Brown and Ozretich, 2009).
The rate of exceedances of the DIN threshold was lowest in 1998, concurrent with El
Niño conditions on the Oregon shelf, and an associated reduction in coastal upwelling.
El Niño conditions occurred on the Oregon shelf from August 1997 through July 1998,
and as a result, nutrient conditions were low on the inner shelf (Peterson et al., 2002).
The highest rate of exceedances of the DIN threshold occurred in 2002, which coincided
with anomalous conditions on the Oregon shelf. There was an intrusion of a subarctic
water mass (Kosro, 2003) onto the Oregon shelf, and as a result the shelf water was
cooler than usual and had higher than normal nutrient levels (Wheeler et al., 2003).
Interestingly, the rate of exceedances is not correlated with the Bakun upwelling index
(daily values for latitude 45°N, longitude 125°W averaged over the interval of May to
October for each year). Menge et al. (2009) suggested that the Bakun index does not
adequately reflect the magnitude of upwelling conditions occurring in the nearshore
region.
[Figure 2 image: panels a) and b); x-axis: Year, 1998-2008; legend: Cruise, Flood-tide.
Sample sizes printed below panel b: (66) (95) (164) (141) (336) (268) (43) (38) (57) (21)]
Figure 2. Interannual variability in the percent of dry season observations with a) DIN >
14 uM (filled symbols are cruise data and open symbols are flood-tide OSU dock
samples) and b) DIP > 1.3 uM in Zone 1. The sample size of the cruise data is presented
in parentheses below each year in panel b. The error bars represent 90% confidence
intervals.
-------
[Figure 3 image: a) DIN and water temperature by year, 1998-2008; b) x-axis: Water Temperature (deg C)]
Figure 3. a) Percent of dry season observations with DIN > 14 uM in the lower estuary
and median flood-tide water temperatures for each year, b) Linear regression of the two
variables (Percent observations exceeding threshold = 224.3 - 17.27 * water temperature,
r = 0.61, p < 0.01). Median water temperatures were calculated using flood-tide values
from station Y1 for 2001 and 2003-2008, other years were calculated using water
temperature data from SBEO3.
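
As a worked illustration of the regression in Figure 3b (an added calculation, not one made in the report), setting the fitted percent of exceedances to 50% gives the median flood-tide water temperature $T$ below which more than half of the dry season observations would be expected to exceed the DIN threshold. This assumes the fitted line applies across the observed range of annual medians:

```latex
50 = 224.3 - 17.27\,T
\quad\Longrightarrow\quad
T = \frac{224.3 - 50}{17.27} \approx 10.1\ ^{\circ}\mathrm{C}
```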
3.1.2 Dissolved Oxygen
There is considerable interannual variability in the percent of flood-tide
observations with dissolved oxygen levels less than 6.5 mg L⁻¹ at station Y1 (Figure 4a).
During 2006, 56% of the flood tides had dissolved oxygen < 6.5 mg L⁻¹, which coincided
with strong upwelling conditions near Newport, Oregon. The lowest percentage
occurrence of dissolved oxygen < 6.5 mg L⁻¹ occurred in 2005. There was a delay in the
onset of upwelling on the Oregon coast during 2005 (Barth et al., 2007). Previously, we
have demonstrated that there is a significant correlation between integrated alongshore
wind stress and flood-tide dissolved oxygen values in the Yaquina Estuary (Brown and
Power, in review). Figure 4b shows that there is also a significant correlation between
percent of flood-tide observations with dissolved oxygen below the criterion and median,
integrated alongshore wind stress for May through October (p < 0.05).
[Figure 4 image: a) x-axis: Year, 2002-2009; b) x-axis: Integrated Wind Stress]
Figure 4. a) Percent of dry season observations of flood-tide dissolved oxygen < 6.5
mg L⁻¹ at station Y1 for each year, b) Linear regression of exceedances versus median
integrated wind stress at station NWP03, with positive values indicating upwelling
conditions. Years with the highest percentage occurrence of dissolved oxygen < 6.5
mg L⁻¹ coincide with strong upwelling conditions (Percent observations exceeding threshold =
15.59 - 8.308 * wind stress, r² = 0.54, p < 0.05).
3.2 Development of Indicators of Ocean Influence
3.2.1 Nutrients
In the previous section, it was shown that ocean conditions can influence water
quality conditions within the Yaquina Estuary. Consequently, it is desirable to develop
indicators which will allow us to distinguish exceedances of water quality thresholds that
are related to ocean conditions from other causes. Previously, we have demonstrated that
flood-tide water temperatures are strongly correlated with inner shelf water temperature
(Brown and Power, in review) and alongshore wind stress (Brown and Ozretich, 2009).
In addition, Nelson and Brown (2008) demonstrated that nitrate and phosphate levels in
flood-tide water samples collected in the Yaquina Estuary can be modeled using flood-
tide water temperatures. Figures 5 and 6 show DIN and DIP as a function of water
temperature and salinity generated using either data from the inner shelf or flood-tide
samples from the Yaquina Estuary. Water density (sigma-t values) contours are also
presented. High DIN and DIP concentrations occur at high salinities (>33 psu), cold
water temperatures (< 10 °C), and high water densities (sigma-t > 26 kg m⁻³) both on the
shelf and within the estuary. Peak DIN and DIP concentrations associated with coastal
upwelling in the PNW are equivalent to concentrations defined as representing medium
and high categories of eutrophication status when DIN and DIP are used as water quality
indicators (Bricker et al., 2003). The DIN and DIP thresholds presented in Table 1 are
exceeded in 45% and 48% of the flood-tide observations at station Y1 (Figures 5b and
6b), respectively.
Logistic Regression
We used the flood-tide DIN and DIP data (Figure 5) to generate logistic
regression models, which can be used to predict the probability of DIN and DIP
exceeding the thresholds of 14 uM and 1.3 uM, respectively. Logistic regression models
were generated for three sets of explanatory variables: 1) water temperature, 2) water
temperature and salinity, and 3) water density (sigma-t).
The intercepts and coefficients (including standard errors and p-values) of each of
the logistic regression models generated for DIN and DIP are presented in Table 2. To
use a logistic regression model to predict the probability of an occurrence of an event
being modeled, the user needs to specify the prediction point. If the modeled probability
exceeds the prediction point, then the model predicts that the event being modeled has
occurred (in this case the nutrient threshold has been exceeded). The selection of the
prediction point represents a trade off between type I (false positive) and type II (false
negative) errors. Examples of classification tables for the DIN and DIP logistic
regression models using water temperature and salinity as explanatory variables are
presented in Tables A1 and A2. These classification tables show the number of
observations correctly and incorrectly classified as exceeding the DIN and DIP thresholds
as a function of prediction point generated with the data used to create the models.
[Figure 5: scatter plots of salinity versus water temperature (deg C) with symbols colored by DIN concentration (uM); panels a) inner shelf and b) flood tide at Y1.]
Figure 5. DIN as a function of temperature and salinity generated using dry season data
from a) the inner shelf off of Newport, Oregon (Stations NH-5 and NH-15 from Wetz et
al., 2005) and b) flood-tide samples from station Y1 in the Yaquina Estuary. The color
of the symbol indicates DIN concentration (uM) and the contours indicate sigma-t values.
[Figure 6: scatter plots of salinity versus water temperature (deg C) with symbols colored by DIP concentration (uM); panels a) inner shelf and b) flood tide at Y1.]
Figure 6. DIP as a function of temperature and salinity generated using dry season data
from a) the inner shelf off of Newport, Oregon (Stations NH-5 and NH-15 from Wetz et
al., 2005) and b) flood-tide samples from station Y1 in the Yaquina Estuary. The color
of the symbol indicates DIP concentration (uM) and the contours indicate sigma-t values.
A false positive is when the logistic model predicts that the nutrient threshold will
be exceeded but the observed value does not exceed the threshold. A false negative is
when the logistic model predicts that the threshold will not be exceeded, but the observed
value exceeds the threshold. The false positive rate is the total number of false positives
divided by the number of observations below the threshold. The false negative rate is the
total number of false negatives divided by the number of observations that exceed the
threshold. Selection of a lower prediction point (i.e., lowering the probability threshold
that needs to be exceeded) results in an increase in sensitivity (number of times the model
correctly predicts an exceedance of the threshold compared to the total number of
observations that exceed the threshold) at the cost of increasing the occurrence of false
positives. To evaluate the optimal prediction point, the false positive and false negative
rates are plotted versus prediction point (Figure 7). The optimal prediction point is where
the false positive and false negative rates are equal (or the intersection of the two curves).
The optimal prediction points for each of the nutrient models are presented in Table 3.
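The search for the prediction point at which the false positive and false negative rates are equal can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used for the report: the function names, the 0.01 grid spacing, and the ten probability/observation pairs are hypothetical and serve only to show the mechanics.

```python
def error_rates(probabilities, exceeded, prediction_point):
    """Return (false positive rate, false negative rate) at one prediction point.

    An exceedance is predicted when the modeled probability exceeds the
    prediction point, as described in the text.
    """
    pairs = list(zip(probabilities, exceeded))
    false_pos = sum(1 for p, y in pairs if p > prediction_point and not y)
    false_neg = sum(1 for p, y in pairs if p <= prediction_point and y)
    n_below = sum(1 for _, y in pairs if not y)   # observations below threshold
    n_above = sum(1 for _, y in pairs if y)       # observations above threshold
    return false_pos / n_below, false_neg / n_above


def optimal_prediction_point(probabilities, exceeded, n_steps=100):
    """First grid point at which the two error rates are closest to equal."""
    best_point, best_gap = None, float("inf")
    for i in range(n_steps + 1):
        point = i / n_steps
        fpr, fnr = error_rates(probabilities, exceeded, point)
        if abs(fpr - fnr) < best_gap:
            best_point, best_gap = point, abs(fpr - fnr)
    return best_point


# Hypothetical modeled probabilities and observed outcomes (True = exceedance).
probabilities = [0.052, 0.118, 0.305, 0.447, 0.613,
                 0.408, 0.561, 0.702, 0.884, 0.957]
exceeded = [False] * 5 + [True] * 5

best_point = optimal_prediction_point(probabilities, exceeded)
print(best_point, error_rates(probabilities, exceeded, best_point))
```

With real data the two curves rarely cross exactly on a grid point, which is why the sketch takes the point where the gap between the rates is smallest.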
The logistic regression models can be used to develop cutpoints for identification
of exceedances associated with ocean input. Development of such cutpoints would allow
users to simply compare their observed temperature or density to these values to discern
whether exceedances are related to ocean input. Combining the optimal prediction points
and the equations for the logistic regression using only water temperature as the
explanatory variable results in a water temperature cutpoint for DIN and DIP
exceedances of 10.6 °C. Thus, these logistic regression models predict that exceedances
of DIN and DIP thresholds are related to ocean input when the water temperature is less
than 10.6 °C. The water density cutpoints for DIN and DIP exceedances are sigma-t
values of 25.67 and 25.59 kg m⁻³, respectively (see Figures 5 and 6).
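As a worked illustration of how a cutpoint follows from a single-variable model, the sketch below solves the logistic equation for the value of the explanatory variable at which the modeled probability equals the prediction point. It assumes the standard logistic form, logit(p) = intercept + coefficient * x, which the report does not write out explicitly. The function name is hypothetical, and the inputs are the rounded DIP coefficients and optimal prediction points from Tables 2 and 3.

```python
import math


def cutpoint(intercept, coefficient, prediction_point):
    """Value of x at which the modeled probability equals the prediction point.

    Assumes logit(p) = intercept + coefficient * x.
    """
    logit = math.log(prediction_point / (1.0 - prediction_point))
    return (logit - intercept) / coefficient


# DIP using water temperature (Table 2), prediction point 0.49 (Table 3).
dip_temperature_cutpoint = cutpoint(14.4256, -1.3656, 0.49)

# DIP using sigma-t (Table 2), prediction point 0.60 (Table 3).
dip_sigma_t_cutpoint = cutpoint(-106.5114, 4.1778, 0.60)

print(round(dip_temperature_cutpoint, 2), round(dip_sigma_t_cutpoint, 2))
```

With these rounded inputs the DIP cutpoints come out at about 10.6 °C and 25.59 kg m⁻³, in line with the values quoted in the text. The same calculation with the rounded DIN values gives roughly 10.5 °C and 25.6 kg m⁻³, slightly below the reported 10.6 °C and 25.67 kg m⁻³, presumably because the tabulated prediction points and coefficients are rounded.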
Receiver operating characteristic (ROC) curves are often used to evaluate the
predictive capabilities of models (Figure 8). ROC curves are generated by plotting true
positive rate or sensitivity (ratio of the number of times the model correctly predicts an
exceedance of the threshold compared to the total number of observations that exceed the
threshold) versus false positive rate (ratio of the number of times the model incorrectly
predicts an exceedance of the threshold compared to the total number of observations less
than the threshold) for prediction points ranging from 0 to 1. The values plotted in ROC
curves are expressed as ratios rather than the percentages presented in the Appendix
tables. An example of an ROC curve for the DIN > 14 uM model using water
temperature and salinity as explanatory variables is presented in Figure 8.
Table 2. Intercepts and coefficients for logistic regression models for exceedances of
DIN and DIP thresholds.

                                            Parameter   Standard error   p value
DIN using water temperature
  Intercept                                   14.729        1.390        p < 0.001
  Water temperature                           -1.417        0.133        p < 0.001
DIN using water temperature and salinity
  Intercept                                  -56.5829      11.4365       p < 0.001
  Water temperature                           -0.9373       0.1345       p < 0.001
  Salinity                                     1.9937       0.3333       p < 0.001
DIN using sigma-t
  Intercept                                  -88.9207       9.3790       p < 0.001
  Sigma-t                                      3.4799       0.3648       p < 0.001
DIP using water temperature
  Intercept                                   14.4256       1.3796       p < 0.001
  Water temperature                           -1.3656       0.1306       p < 0.001
DIP using water temperature and salinity
  Intercept                                  -91.6108      14.8188       p < 0.001
  Water temperature                           -0.8167       0.1333       p < 0.001
  Salinity                                     3.0139       0.4374       p < 0.001
DIP using sigma-t
  Intercept                                 -106.5114      11.1929       p < 0.001
  Sigma-t                                      4.1778       0.4362       p < 0.001
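To show how the coefficients in Table 2 translate into a probability, the following Python sketch evaluates the water temperature and salinity models. It assumes the standard logistic form, p = 1 / (1 + exp(-(intercept + b_T * T + b_S * S))), which the report does not state explicitly; the dictionary layout and function names are hypothetical.

```python
import math

# Intercepts and coefficients copied from Table 2
# (water temperature and salinity models).
MODELS = {
    "DIN": {"intercept": -56.5829, "temperature": -0.9373, "salinity": 1.9937},
    "DIP": {"intercept": -91.6108, "temperature": -0.8167, "salinity": 3.0139},
}


def exceedance_probability(nutrient, temperature_c, salinity_psu):
    """Modeled probability that the nutrient threshold is exceeded."""
    m = MODELS[nutrient]
    logit = (m["intercept"]
             + m["temperature"] * temperature_c
             + m["salinity"] * salinity_psu)
    return 1.0 / (1.0 + math.exp(-logit))


def predicts_exceedance(nutrient, temperature_c, salinity_psu, prediction_point):
    """True when the modeled probability exceeds the chosen prediction point."""
    p = exceedance_probability(nutrient, temperature_c, salinity_psu)
    return p > prediction_point


# Cold, saline flood-tide water: 8.6 deg C and 33.6 psu.
p_din = exceedance_probability("DIN", 8.6, 33.6)
p_dip = exceedance_probability("DIP", 8.6, 33.6)
print(round(p_din, 2), round(p_dip, 2))
```

For 8.6 °C and 33.6 psu this gives probabilities of roughly 0.91 for DIN and 0.93 for DIP, so both would be classified as exceedances at prediction points of 0.55 and 0.60.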
Table 3. Sample size and area under the receiver operating characteristic curve (AUC)
for DIN and DIP models. The optimal prediction points (false negative = false positive
rates) are presented for each model.

Model                                        N      AUC    Optimal Prediction Point
DIN using water temperature                  431    0.91   0.45
DIN using water temperature and salinity     431    0.94   0.55
DIN using sigma-t                            431    0.94   0.55
DIP using water temperature                  431    0.90   0.49
DIP using water temperature and salinity     431    0.96   0.60
DIP using sigma-t                            431    0.96   0.60
[Figure 7: line plot of false positive rate and false negative rate versus prediction point; the curves intersect at the optimal prediction point of 0.55.]
Figure 7. False positive and false negative rates as a function of prediction point for the
logistic regression model for DIN > 14 uM using water temperature and salinity.
Optimal prediction point (0.55) is where overall failure rate is minimized and is located at
the intersection of the two curves.
Models with high predictive capacity have curves that rise rapidly and have larger
areas under the curve (AUC). An ideal model would have an AUC value of 1. Hosmer
and Lemeshow (2000) suggest that if the area under the ROC curve is > 0.9 the model
has 'outstanding' discrimination; for AUC values between 0.8 and 0.9, the model has
'excellent' discrimination capability; for AUC values between 0.7 and 0.8, the model has
'acceptable' discrimination; and if the AUC = 0.5, the model has no discrimination
capability. All of the models computed for the nutrient thresholds had 'outstanding'
discrimination capability (Table 3); however, the models which included salinity
consistently had higher AUC values than those models which omitted salinity.
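The construction of an ROC curve and its AUC can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the software used for the report: the function names, the 0.01 spacing of prediction points, the trapezoidal integration, and the ten probability/observation pairs are all hypothetical.

```python
def roc_points(probabilities, exceeded, n_steps=100):
    """(false positive rate, true positive rate) for prediction points 0 to 1."""
    pairs = list(zip(probabilities, exceeded))
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    points = []
    for i in range(n_steps + 1):
        point = i / n_steps
        true_pos = sum(1 for p, y in pairs if p > point and y)
        false_pos = sum(1 for p, y in pairs if p > point and not y)
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical modeled probabilities and observed outcomes (True = exceedance).
probabilities = [0.052, 0.118, 0.305, 0.447, 0.613,
                 0.408, 0.561, 0.702, 0.884, 0.957]
exceeded = [False] * 5 + [True] * 5

auc = area_under_curve(roc_points(probabilities, exceeded))
print(round(auc, 2))
```

A model whose probabilities separate the two groups perfectly would give an area of 1, while probabilities unrelated to the outcome would give an area near 0.5.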
[Figure 8: ROC curve with both axes running from 0.0 to 1.0; x-axis: False Positive Rate (1 - Specificity).]
Figure 8. ROC curve for logistic regression model for DIN > 14 uM using water
temperature and salinity with an AUC value of 0.94. The dashed line indicates the line of
no discrimination capability (i.e., random guess).
Model Validation
The predictive capability of the models was tested using the data which were
reserved for model validation. For model validation, we used the models with water
temperature and salinity as explanatory variables which had the highest AUC values and
used the optimal prediction points of 0.55 and 0.60 for DIN and DIP, respectively. For
each observation, the probability of exceeding the threshold was predicted using the
logistic regression model, and if the probability was greater than the optimal prediction
point, then the observation was predicted to exceed the threshold. Because the data used
to generate the logistic regression models were exclusively flood-tide values, a
modeled exceedance of a water quality threshold represents the effect of ocean conditions
at the time of sampling.
Prediction accuracies of the models for the reserved data are presented in 2x2
classification tables (Tables 4 and 5). The logistic regression models had an overall
accuracy of 88.9% and 90.7% for DIN and DIP, respectively. Sensitivity is the ratio of the
number of times the model correctly predicts an exceedance to the total number of observed
exceedances. Specificity is defined as the ratio of correctly classified occurrences of
nutrients less than the threshold to total number of observed occurrences less than the
threshold. The sensitivity of the models was 89.8% and 89.3% for DIN and DIP,
respectively. The specificity of the models was 88.1% and 92.3% for DIN and DIP,
respectively.
Table 4. Classification table showing accuracy of the water temperature and salinity
logistic regression equation at predicting the occurrence of DIN > 14 uM using the
reserved data. Prediction point = 0.55.

                               Observed        Observed
Predicted occurrence of DIN    DIN < 14 uM     DIN > 14 uM     Total
< 14 uM                            52               5            57
> 14 uM                             7              44            51
Total                              59              49           108
Table 5. Classification table showing accuracy of the water temperature and salinity
logistic regression equation for predicting the occurrence of DIP > 1.3 uM using the
reserved data. Prediction point = 0.60.

                               Observed        Observed
Predicted occurrence of DIP    DIP < 1.3 uM    DIP > 1.3 uM    Total
< 1.3 uM                           48               6            54
> 1.3 uM                            4              50            54
Total                              52              56           108
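The accuracy, sensitivity, and specificity quoted for the validation data follow directly from the counts in Tables 4 and 5. The short Python sketch below recomputes them from those counts; the function and variable names are hypothetical, while the counts are taken from the two tables.

```python
def classification_metrics(true_neg, false_neg, false_pos, true_pos):
    """Summary statistics for a 2x2 classification table."""
    total = true_neg + false_neg + false_pos + true_pos
    return {
        # Fraction of all observations classified correctly.
        "accuracy": (true_pos + true_neg) / total,
        # Correctly predicted exceedances / observed exceedances.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Correctly predicted non-exceedances / observed non-exceedances.
        "specificity": true_neg / (true_neg + false_pos),
    }


# Table 4 (DIN > 14 uM, prediction point 0.55).
din = classification_metrics(true_neg=52, false_neg=5, false_pos=7, true_pos=44)

# Table 5 (DIP > 1.3 uM, prediction point 0.60).
dip = classification_metrics(true_neg=48, false_neg=6, false_pos=4, true_pos=50)

for name, metrics in (("DIN", din), ("DIP", dip)):
    print(name, {key: round(100 * value, 1) for key, value in metrics.items()})
```

The false positive and false negative rates defined earlier are simply one minus the specificity and one minus the sensitivity, respectively.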
Demonstration of Application of Logistic Regression Model
We applied the logistic regression models for DIN and DIP to water temperature
and salinity data collected concurrently with nutrient data during the dry seasons of 1998-
2008 in the marine-dominated portion (Zone 1) of the Yaquina Estuary. The temperature
and salinity of DIN observations that either exceeded the 14 uM threshold (filled circles)
or fell below the threshold (open circles), together with the probability contours of the
logistic regression model are presented in Figure 9. Fifty percent of the DIN
observations exceeded the 14 uM threshold, which is to be expected since it was based on
the median value of dry season data from 1998-2006. The logistic regression model
predicts that 46% of the DIN exceedances of the 14 uM threshold are associated with
ocean input (these data points are indicated with a green "x" in Figure 9). There is also
evidence of a riverine DIN source, which the logistic regression model does not identify
as ocean input. We, therefore, examined mixing diagrams (i.e., salinity versus DIN
graphs) to identify observations where the exceedance of the DIN threshold can be
attributed to a riverine source (red "x", Figure 9). Figure 10 shows an example of a
mixing diagram which indicates a riverine source for DIN.
The temperature and salinity of DIP observations that either exceeded the 1.3 uM
threshold (filled circles) or fell below the threshold (open circles), together with the
probability contours of the logistic regression model are presented in Figure 11.
Forty-seven percent of the DIP observations exceeded the 1.3 uM threshold and the model
predicts that 44% of these exceedances are associated with ocean input (indicated by a
green "x" in Figure 11). There are fewer observations exceeding the DIP threshold at
relatively low salinities (< 27 psu) than there are for the DIN observations because
the ocean is the primary source of DIP.
[Figure 9: scatter plot of salinity versus water temperature (deg C) over contours of the probability of DIN > 14 uM. Legend: observed DIN < 14 uM; observed DIN > 14 uM; DIN > 14 uM identified as ocean input; DIN > 14 uM with riverine source.]
Figure 9. Temperature and salinity of cruise data measured in the Yaquina Estuary during the dry seasons of 1998-2008 with DIN <
14 uM (open circles) and DIN > 14 uM (filled circles), and contours of probability of DIN > 14 uM generated from logistic
regression model with water temperature and salinity as explanatory variables. The green "x" symbols are observations of DIN > 14
uM identified as ocean input from the logistic regression model with a prediction point of 0.55; the red "x" symbols are those that
appear to have a riverine DIN source, as determined from mixing diagrams. The white arrow indicates a heating and mixing line.
[Figure 10: mixing diagram of DIN versus salinity (psu).]
Figure 10. Example of a mixing diagram showing a riverine DIN source (generated
using cruise data from May 6, 2003).
[Figure 11: scatter plot of salinity versus water temperature (deg C) over contours of the probability of DIP > 1.3 uM. Legend: observed DIP < 1.3 uM; observed DIP > 1.3 uM; observed DIP > 1.3 uM identified as ocean input.]
Figure 11. Temperature and salinity of cruise data measured in the Yaquina Estuary during the dry seasons of 1998-2008 with DIP <
1.3 uM (open circles) and DIP > 1.3 uM (filled circles), and contours of probability of DIP > 1.3 uM generated from logistic
regression model with water temperature and salinity as explanatory variables. The green "x" symbols are observations of DIP > 1.3
uM identified as ocean input from the logistic regression model with a prediction point of 0.60. The white arrow indicates a heating
and mixing line.
One of the limitations of using logistic regression models to calculate the probability of
an exceedance being due to ocean conditions is that the cool, high nutrient oceanic water warms
up and mixes with low salinity and warm water both inside the estuary and on the shelf, reducing
the distinctive thermohaline signature. Studies from the Oregon shelf off of Newport have
demonstrated that most of the water that upwells is Subarctic water, which has salinity ranging
from 32.5 to 33.8 psu and peak nutrients similar to those entering the estuary (Wheeler et al.,
2003; Huyer et al., 2005). In addition, the Columbia River plume influences shelf water off of
Newport, with plume water having salinity < 32.5 psu. Offshore of Newport (80 km), the mean
summer time water temperature is about 17°C and water temperatures off of Newport are
strongly influenced by mixing, upwelling and advection (Huyer et al., 2005). Those samples
identified as being associated with ocean input are a conservative estimate, and other observed
exceedances that fall along the white arrows in Figures 9 and 11 are probably associated with
upwelled ocean water that has heated and mixed with lower salinity water. Along this line, there
is a relatively large change in temperature (~8 °C) and a relatively small change in salinity (2
psu), suggesting that heating dominates. A portion of this heating and mixing is occurring on the
shelf and some is occurring inside the estuary.
Alternate Approach - Using Lagged Flood-Tide Data
An alternate approach to identify exceedances associated with ocean input is to calculate
the probability using water temperature and salinity from the previous flood tide. Values for
temperature and salinity at the time of nutrient sampling are compared to those from the previous
flood tide for May-October 2008 (Table 6), together with the probabilities calculated from the
logistic regression using conditions at time of sampling and for the previous flood tide. As an
example, the sampling on May 15, 2008 shows an event where the nitrogen and phosphorus
water quality thresholds were exceeded. However, the logistic regression model calculated using
the water temperature and salinity at time of sampling would not classify this sampling event as
an exceedance associated with ocean conditions. In contrast, the use of data from the previous
flood-tide would identify the event as an exceedance associated with ocean input.
Of the 8 observed exceedances of the DIN threshold during May - September 2008, the logistic
regression model using temperature and salinity at time of sampling identified 4 as being
associated with ocean input, while by using the temperature and salinity from the previous flood
tide, an additional 3 of the exceedances would be identified as being associated with ocean input
(Table 6). Of the 9 exceedances of the DIP threshold, using temperature and salinity at time of
sampling identified 5 as being associated with ocean input, while by using data from the previous
flood tide, an additional 2 would be identified as being related to ocean input. However, using
the previous flood-tide temperature and salinity combined with the logistic regression model
would incorrectly indicate that 6 of the DIN and DIP observations less than threshold would be
expected to exceed the threshold.
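The lagged comparison can be illustrated with the May 15, 2008 sampling, for which Table 6 lists 12.4 °C and 31.2 psu at the time of sampling and 9.0 °C and 33.5 psu for the previous flood tide. The Python sketch below is a hedged illustration only: it assumes the standard logistic form for the Table 2 models, and the function names and returned labels are hypothetical.

```python
import math

# Water temperature and salinity models (Table 2) with the optimal
# prediction points (Table 3): (intercept, temperature, salinity).
MODELS = {
    "DIN": {"coef": (-56.5829, -0.9373, 1.9937), "prediction_point": 0.55},
    "DIP": {"coef": (-91.6108, -0.8167, 3.0139), "prediction_point": 0.60},
}


def probability(nutrient, temperature_c, salinity_psu):
    """Modeled probability of exceeding the nutrient threshold."""
    b0, b_temp, b_sal = MODELS[nutrient]["coef"]
    logit = b0 + b_temp * temperature_c + b_sal * salinity_psu
    return 1.0 / (1.0 + math.exp(-logit))


def ocean_input_flag(nutrient, at_sampling, previous_flood):
    """Report which set of (temperature, salinity) conditions, if any,
    identifies an observed exceedance as associated with ocean input."""
    cut = MODELS[nutrient]["prediction_point"]
    if probability(nutrient, *at_sampling) > cut:
        return "time of sampling"
    if probability(nutrient, *previous_flood) > cut:
        return "previous flood tide"
    return "not identified"


# May 15, 2008 row of Table 6.
at_sampling = (12.4, 31.2)
previous_flood = (9.0, 33.5)
flag_din = ocean_input_flag("DIN", at_sampling, previous_flood)
flag_dip = ocean_input_flag("DIP", at_sampling, previous_flood)
print(flag_din, flag_dip)
```

For this date the modeled DIN probability is close to zero with conditions at the time of sampling and about 0.85 with conditions from the previous flood tide, so only the lagged data flag the exceedance.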
Table 6. Observed DIN and DIP, water temperature and salinity, and probability of exceeding nutrient thresholds calculated using
water temperature and salinity at time of sampling and previous flood tide. The shaded observed nutrient cells identify those that
exceeded the nutrient threshold, and the shaded probability cells are those that exceeded the optimal prediction point.

                                                                                  Probability of Exceeding Nutrient Threshold
                                                                                  Calculated Using Conditions from:
              Observed         Time of Sampling       Previous Flood Tide        Time of Sampling    Previous Flood Tide
Sampling      DIN     DIP      Temperature  Salinity  Temperature  Salinity      DIN      DIP        DIN      DIP
Date          (uM)    (uM)     (deg C)      (psu)     (deg C)      (psu)
5/1/2008       1.6    0.22        9.8        30.7        9.8        30.9         0.01     0.00       0.02     0.00
5/9/2008      32.0    1.80        8.6        33.6        8.6        33.6         0.92     0.94       0.92     0.94
5/15/2008     22.2    1.56       12.4        31.2        9.0        33.5         0.00     0.00       0.85     0.87
5/21/2008     21.5    1.62        9.8        33.1        9.8        33.0         0.53     0.50       0.52     0.49
6/9/2008      14.5    0.81       13.4        26.3       10.1        33.3         0.00     0.00       0.58     0.61
6/26/2008     10.1    1.20       14.5        30.0        8.1        34.1         0.00     0.00       0.98     0.99
6/27/2008     10.3    1.29       14.7        30.5        8.5        34.2         0.00     0.00       0.97     0.99
6/30/2008      9.4    0.94       11.6        32.6       10.8        33.8         0.08     0.06       0.66     0.81
7/2/2008      10.5    1.12       13.7        31.9       12.1        33.6         0.00     0.00       0.27     0.41
7/16/2008     29.8    2.13        8.9        33.7        8.7        33.8         0.90     0.93       0.93     0.95
7/24/2008     25.6    1.88        9.2        33.4        8.7        34.5         0.79     0.81       0.98     0.99
7/31/2008      6.3    0.69       10.3        34.2       10.3        34.3         0.89     0.96       0.89     0.96
8/7/2008      13.9    1.47       10.7        33.9        9.8        34.2         0.72     0.85       0.91     0.97
8/12/2008      9.3    0.99       12.0        34.1       13.0        34.0         0.53     0.78       0.26     0.54
8/22/2008     10.8    1.42       13.9        32.4       14.0        32.5         0.01     0.01       0.01     0.01
8/26/2008      6.7    1.13       15.1        32.3       14.3        32.5         0.00     0.00       0.01     0.00
9/5/2008      16.5    1.96       12.0        31.0        9.4        33.5         0.00     0.00       0.78     0.82
9/10/2008     33.5    2.65        8.9        33.5        8.9        33.5         0.86     0.88       0.87     0.90
9/18/2008      5.6    0.74       10.6        34.1       10.4        34.2         0.82     0.92       0.85     0.94
9/23/2008      9.1    1.06       10.8        34.1       10.6        34.1         0.01     0.00       0.02     0.00
Alternate Approach - Using Modeled Flood-Tide Nutrients
Nelson and Brown (2008) present equations to model NO₃⁻ + NO₂⁻ and PO₄³⁻ levels using
water temperature (generated using inner shelf data from Wetz et al. (2005)). Presented in
Table 7 are observed and modeled median values for NO₃⁻ + NO₂⁻ and PO₄³⁻ calculated using
water temperature at time of nutrient sampling and water temperature during the flood tide
previous to the nutrient sampling for data collected at station Y1 during May to October 2008.
The observed nutrients are significantly higher than those modeled using water temperature at
time of sampling (Mann Whitney Rank Sum, p = 0.05), but there is not a significant difference
between observed values and those modeled using flood-tide water temperatures, suggesting that
observed nutrients are a result of ocean input. This analysis suggests that comparing observed
nutrients to those modeled using flood-tide water temperatures may be an alternate approach to
determine if observed nutrient levels are consistent with ocean input.
Table 7. Observed median NO₃⁻ + NO₂⁻ and PO₄³⁻ for May - September 2008 and
modeled using water temperature at time of sampling and water temperature during
flood tide previous to sampling. Modeled values are calculated using the equations in
Nelson and Brown (2008). N=20.

                                      Modeled Median (uM) Calculated Using
                  Observed Median     Temperature at        Flood Tide
Nutrient          (uM)                Time of Sampling      Temperature
NO₃⁻ + NO₂⁻       7.9                 3.7                   9.1
PO₄³⁻             1.3                 0.9                   1.3
3.2.2 Dissolved Oxygen
Dissolved oxygen levels in the lower portion of Yaquina Estuary are also
influenced by upwelling conditions on the inner shelf. Figure 12 shows flood-tide
dissolved oxygen levels at station Y1 as a function of a) temperature and salinity and b)
density and in situ fluorescence. Low dissolved oxygen levels (< 5 mg l⁻¹) tend to occur
at cool water temperatures (8-10 deg C), high salinities (32.5-34.5 psu), high water
densities (sigma-t values > 25 kg m⁻³), and are associated with low in situ fluorescence (<
5 ug l⁻¹), all of which are characteristics of recently upwelled water (Pearson and Holt,
1960; Park et al., 1962; Bourke and Pattulo, 1975; Brown and Power, in review).
Occurrences of relatively high dissolved oxygen levels (> 6.5 mg l⁻¹) that occur at high
water densities (sigma-t > 25 kg m⁻³) also tend to have relatively high in situ fluorescence
(an indicator of phytoplankton chlorophyll a levels). In this dataset, flood-tide dissolved
oxygen was less than 6.5 mg l⁻¹ (State of Oregon criterion for estuarine waters) 38% of
the time at station Y1.
Logistic Regression
We used flood-tide dissolved oxygen data collected at station Y1 in the Yaquina
Estuary (Figure 12) to generate logistic regression models, which predict the probability
of dissolved oxygen levels < 6.5 mg l⁻¹. Logistic regression models were generated for
four sets of explanatory variables: 1) water temperature, 2) water temperature and
salinity, 3) sigma-t, and 4) water temperature, salinity and in situ fluorescence. Due to
diel fluctuations in dissolved oxygen levels, time of day is usually included in regression
models of dissolved oxygen; however, our analysis did not indicate that time of day was a
significant explanatory variable in these logistic models. The logistic regression models,
standard errors and p-values are presented in Table 8.
The AUC values and optimal prediction points for the logistic regression models
generated for the dissolved oxygen threshold are presented in Table 9. All of the models
developed for dissolved oxygen had 'excellent' discrimination capability; however, there
was improvement in model performance with the addition of in situ fluorescence. In situ
fluorescence data were not available in 2002 and 2003; therefore, there is a large
difference in sample size between the models that include fluorescence and those that
exclude it (Table 9). The water temperature model and the water temperature and salinity
model were re-calculated using the subset of data used for the model which includes in situ
fluorescence. These re-calculated values resulted in AUC values of 0.85, similar to those
obtained for the full dataset, indicating that sample size was not producing the difference
in AUC values presented in Table 9. The false negative and false positive rates as a
function of prediction point for the water temperature and salinity logistic regression
model are presented in Figure 13 and the classification table is presented in Table A3.
[Figure 12: two panels with symbols colored by dissolved oxygen (mg l⁻¹); a) salinity versus water temperature (deg C), b) in situ fluorescence versus sigma-t (kg m⁻³ - 1000).]
Figure 12. Flood-tide dissolved oxygen at station Y1 in the Yaquina Estuary plotted
versus a) temperature and salinity, and b) sigma-t and in situ fluorescence. The upper
panel included flood-tide data from May-October of 2002, 2003, 2004, 2007, and 2008.
The lower panel includes data from May-October of 2004, 2007, and 2008.
Combining the optimal prediction points and the equation for the dissolved
oxygen logistic regression, using only water temperature as the explanatory variable,
results in a cutpoint of 10.3 °C for occurrences of dissolved oxygen < 6.5 mg l⁻¹ as
predictive of ocean input. A sigma-t value of 25.4 kg m⁻³ represents the density cutpoint
for the occurrence of dissolved oxygen level < 6.5 mg l⁻¹ consistent with ocean input.
[Figure 13: line plot of false positive rate and false negative rate versus prediction point (0.0 to 1.0); the curves intersect at the optimal prediction point of 0.43.]
Figure 13. False positive and false negative rates as a function of prediction point for the
logistic regression model for dissolved oxygen < 6.5 mg l⁻¹ using water temperature and
salinity as explanatory variables. Optimal prediction point (0.43) is where overall failure
rate is minimized and is located at the intersection of the two curves.
Table 8. Intercepts and coefficients for logistic regression models for occurrences of
dissolved oxygen < 6.5 mg l⁻¹.

                                            Parameter   Standard error   p value
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature
  Intercept                                  10.47223      0.72229       p < 0.001
  Water temperature                          -1.03563      0.06999       p < 0.001
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature and salinity
  Intercept                                 -13.08966      3.95360       p < 0.001
  Water temperature                          -0.84449      0.07281       p < 0.001
  Salinity                                    0.65113      0.10986       p < 0.001
Dissolved oxygen < 6.5 mg l⁻¹ using sigma-t
  Intercept                                 -42.5248       2.8411        p < 0.001
  Sigma-t                                     1.6589       0.1114        p < 0.001
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature, salinity, and in situ
fluorescence
  Intercept                                 -23.59785      6.25981       p < 0.001
  Water temperature                          -0.84656      0.09254       p < 0.001
  Salinity                                    1.00692      0.17905       p < 0.001
  In situ fluorescence                       -0.47812      0.05780       p < 0.001
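The effect of in situ fluorescence in the four-term model of Table 8 can be illustrated with a short Python sketch. It assumes the standard logistic form and that fluorescence is entered in the same units used in the text (ug l⁻¹); the function name and the two sets of input values (9 °C, 33.5 psu, with fluorescence of 2 and 10) are hypothetical and chosen only to show the direction of the effect.

```python
import math

# Coefficients from Table 8 for the water temperature, salinity,
# and in situ fluorescence model.
DO_MODEL = {
    "intercept": -23.59785,
    "temperature": -0.84656,
    "salinity": 1.00692,
    "fluorescence": -0.47812,
}


def low_do_probability(temperature_c, salinity_psu, fluorescence):
    """Modeled probability that dissolved oxygen is below 6.5 mg/l."""
    logit = (DO_MODEL["intercept"]
             + DO_MODEL["temperature"] * temperature_c
             + DO_MODEL["salinity"] * salinity_psu
             + DO_MODEL["fluorescence"] * fluorescence)
    return 1.0 / (1.0 + math.exp(-logit))


# Same cold, saline water with low and high fluorescence (hypothetical values).
p_low_fluorescence = low_do_probability(9.0, 33.5, 2.0)
p_high_fluorescence = low_do_probability(9.0, 33.5, 10.0)
print(round(p_low_fluorescence, 2), round(p_high_fluorescence, 2))
```

Because the fluorescence coefficient is negative, the same cold, saline water is assigned a much lower probability of low dissolved oxygen when fluorescence is high, which is consistent with the observation that dense water with high fluorescence tends to have relatively high dissolved oxygen.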
Table 9. Sample size and area under the receiver operating characteristic curve (AUC)
for the dissolved oxygen models. The optimal prediction points where false negative and
false positive rates are equal are presented for each model.

Model                                                    N       AUC    Optimal Prediction Point
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature    1126    0.84   0.44
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature
  and salinity                                           1126    0.85   0.43
Dissolved oxygen < 6.5 mg l⁻¹ using sigma-t              1126    0.82   0.41
Dissolved oxygen < 6.5 mg l⁻¹ using water temperature,
  salinity, and in situ fluorescence                      760    0.90   0.47
Model Validation
The predictive capability of each of the dissolved oxygen models was tested using
flood-tide dissolved oxygen, temperature, salinity, and in situ fluorescence data collected
at station Y1 from May-September 2009. For model validation, we used the water
temperature and salinity model as well as the one that included in situ fluorescence. For
this analysis, we used the optimal prediction point where the false positive and the false
negative rates were equal. Tables 10 and 11 show the number of times the logistic
regression model correctly and incorrectly predicted flood-tide dissolved oxygen falling
below 6.5 mg l⁻¹ for the 2009 dataset. The logistic regression model including
temperature and salinity had an overall accuracy of 79.8%, while the one including these
variables and in situ fluorescence had an accuracy of 88.9%. Including in situ
fluorescence in the logistic regression model resulted in a reduction of false positives by
almost a factor of 2. The sensitivity and specificity of the logistic regression model
including temperature and salinity were 90.3% and 75%, respectively. Inclusion of in
situ fluorescence increased the sensitivity and specificity to 93.5% and 86.8%,
respectively.
Table 10. Classification table showing accuracy of the water temperature and salinity
logistic regression equation for predicting the occurrence of flood-tide dissolved oxygen
< 6.5 mg l⁻¹ using data collected during May - September 2009 at station Y1. Prediction
point = 0.43.

                                Observed Occurrence of Dissolved Oxygen
Predicted Occurrence of
Dissolved Oxygen                ≥ 6.5 mg l⁻¹    < 6.5 mg l⁻¹    Total
≥ 6.5 mg l⁻¹                        51               3            54
< 6.5 mg l⁻¹                        17              28            45
Total                               68              31            99
Table 11. Classification table showing accuracy of the water temperature, salinity, and in
situ fluorescence logistic regression equation for predicting the occurrence of flood-tide
dissolved oxygen < 6.5 mg l⁻¹ using data collected during May - September 2009 at
station Y1. Prediction point = 0.47.

                                Observed Occurrence of Dissolved Oxygen
Predicted Occurrence of
Dissolved Oxygen                ≥ 6.5 mg l⁻¹    < 6.5 mg l⁻¹    Total
≥ 6.5 mg l⁻¹                        59               2            61
< 6.5 mg l⁻¹                         9              29            38
Total                               68              31            99
Demonstration of Application of Logistic Regression Model
We used additional datasets obtained from the Yaquina, Coos, Umpqua,
Tillamook and Siletz estuaries to see how effective the logistic regression models were at
identifying occurrences of dissolved oxygen less than 6.5 mg l⁻¹ associated with ocean
input.
Yaquina Estuary
The water temperature and salinity logistic regression presented in Table 8 was
applied to cruise data collected in the lower portion of Yaquina Estuary during the dry
seasons of 2006 and 2007. If the probability calculated from the logistic regression
exceeded the prediction point of 0.43, then the model predicted that dissolved oxygen
would be less than 6.5 mg l⁻¹ due to ocean conditions.
The logistic regression model identified 35 (or 73%) of the 48 occurrences of
dissolved oxygen less than 6.5 mg l⁻¹ as being associated with ocean input (Figure 14).
The median dissolved oxygen of the entire 2006 and 2007 cruise dataset was 6.9 mg l⁻¹ (n
= 159). Removing the observations which the logistic model predicts would have a
dissolved oxygen less than 6.5 mg l⁻¹ results in a median value of 7.4 mg l⁻¹ (n = 110).
The logistic model should not identify all of the occurrences of dissolved oxygen less
than the threshold because there may be other causes of low dissolved oxygen conditions.
There are some events of dissolved oxygen < 6.5 mg l⁻¹ which the model does not
attribute to ocean input (filled symbols outside of the colored region); however, mixing
diagrams (salinity versus dissolved oxygen plots) demonstrate that there was also an
up-estuary source of low dissolved oxygen during this period.
[Figure 14: scatter plot of salinity versus water temperature (deg C) over contours of the probability of dissolved oxygen < 6.5 mg l⁻¹. Legend: observations with dissolved oxygen > 6.5 mg l⁻¹; observations with dissolved oxygen < 6.5 mg l⁻¹; identified as ocean input.]
Figure 14. Temperature and salinity of cruise data measured in the Yaquina Estuary during May to October of 2006 and 2007 with
dissolved oxygen > 6.5 mg l⁻¹ (open circles) and dissolved oxygen < 6.5 mg l⁻¹ (filled symbols), and contours of probability of
dissolved oxygen < 6.5 mg l⁻¹ generated from the logistic regression model with water temperature and salinity as explanatory
variables. The "x" symbols are those identified as ocean input from the logistic regression model with a prediction point of 0.43.
Coos Bay
As an additional test of the logistic regression model at predicting events of
dissolved oxygen < 6.5 mg l⁻¹, we applied the water temperature and salinity logistic
regression to continuous data from the South Slough National Estuarine Research
Reserve for a location near the entrance of Coos Bay during the period of June-
September, 2006. Flood-tide temperature, salinity and dissolved oxygen values were
identified and extracted from this dataset. The flood-tide data from Coos Bay exhibited
similar patterns to the flood-tide data from Yaquina Estuary, with low dissolved oxygen
levels occurring at cool water temperatures and high salinities (similar to Figure 12).
Flood-tide dissolved oxygen levels were <6.5 mg I"1 about 39% of the time. The logistic
regression model (developed using water temperature and salinity generated using flood-
tide data from the Yaquina Estuary) identified 45% of these events in Coos Bay as being
associated with ocean input.
Classification Dataset
The classification dataset was used to determine if the logistic regressions
generated using data from the Yaquina Estuary would be applicable to other estuaries in
the region. During a deployment near the entrance of the Siletz Estuary, 40% of the
observations had dissolved oxygen values of < 6.5 mg l⁻¹, and 93% of these observations
were predicted to be associated with ocean input. During a deployment near the entrance
of the Umpqua Estuary, 29% of the observations had dissolved oxygen values of
< 6.5 mg l⁻¹, and 60% of these observations were predicted by the logistic regression
model to be related to ocean input. During a deployment near the mouth of Tillamook
Estuary, 43% of the observations had dissolved oxygen values < 6.5 mg l⁻¹, and 49% of
these observations were predicted to be associated with ocean input. Based on these
results, we believe that the logistic models developed in this report may be applicable to
other estuaries in the region.
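The two percentages reported for each deployment can be tallied as in the following minimal sketch. It assumes each observation already carries a dissolved oxygen value and a model probability; the field names `do_mg_l` and `p_ocean`, the function name, and the five sample observations are hypothetical and are not data from this report. The default prediction point of 0.43 is the one used with the water temperature and salinity model.

```python
def summarize_attribution(observations, do_threshold=6.5, prediction_point=0.43):
    """Return (percent of observations below the threshold,
    percent of those low observations attributed to ocean input)."""
    if not observations:
        raise ValueError("no observations supplied")
    # Observations with dissolved oxygen below the threshold (mg per liter).
    low = [obs for obs in observations if obs["do_mg_l"] < do_threshold]
    pct_low = 100.0 * len(low) / len(observations)
    # Low observations whose model probability exceeds the prediction point.
    attributed = [obs for obs in low if obs["p_ocean"] > prediction_point]
    pct_attributed = 100.0 * len(attributed) / len(low) if low else 0.0
    return pct_low, pct_attributed


# Hypothetical observations, for illustration only.
sample = [
    {"do_mg_l": 5.0, "p_ocean": 0.8},
    {"do_mg_l": 6.0, "p_ocean": 0.3},
    {"do_mg_l": 7.2, "p_ocean": 0.9},
    {"do_mg_l": 8.1, "p_ocean": 0.1},
    {"do_mg_l": 6.4, "p_ocean": 0.5},
]
pct_low, pct_attributed = summarize_attribution(sample)
```

Only observations below the threshold enter the second percentage, which mirrors the phrasing "of these observations" used for each deployment.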
-------
4. Summary
Observations from the Yaquina and other estuaries in the Pacific Northwest show
that intrusions of coastal ocean water into the estuaries can result in high nitrogen
(~32 µM), phosphorus (~3 µM), and chlorophyll a levels (up to 50 µg l⁻¹), and low
dissolved oxygen (at times < 2 mg l⁻¹) conditions. These natural intrusions of oceanic
water into PNW estuaries thus often have values of water quality parameters that exceed
water quality criteria, or are greater than values for eutrophication indicators associated
with highly eutrophic status (e.g., Bricker et al., 2003). Many states, including
Washington and Oregon, have a narrative criterion that specifies that if natural conditions
are the cause of non-attainment of a water quality standard, then the natural conditions
become the standard; thus, tools that identify natural events that will cause
non-attainment of water quality standards are needed.
This report demonstrates an approach for distinguishing exceedances of water
quality thresholds associated with ocean conditions by using logistic regression models.
These types of models have been used to forecast the occurrence of poor water
quality conditions within streams, estuaries, and the coastal ocean. For example, logistic
regression models have been used to predict occurrences of toxic phytoplankton blooms
in the coastal ocean (Lane et al., 2009), to forecast non-attainment of water quality
standards in an estuarine impoundment (Worrall et al., 1998), to predict the
eutrophication status of estuaries (Lowery, 1998), to forecast the probability of
exceedance of a turbidity criterion in streams (Towler et al., 2010), and to assess
attainment of dissolved oxygen criteria in Chesapeake Bay (US EPA, 2003).
The logistic regression models presented in this report are simple tools that
provide the probability of an observation exceeding a water quality threshold due to
ocean conditions based on water temperature, salinity, and in situ fluorescence (for
dissolved oxygen) at the time of sampling. It is possible to distinguish oceanic inputs by
their distinctive thermal and saline signatures. Typically, dissolved oxygen levels
decrease with increasing temperature due to both reduced solubility of oxygen in water
and increased respiration and decomposition (e.g., Lee and Lwiza, 2008; Verity et
al., 2006). However, in the marine-dominated portion of Pacific Northwest estuaries,
minimum dissolved oxygen values often occur at cool water temperatures, which are
distinct from the water temperatures associated with within-estuary causes of low
dissolved oxygen. In addition, water masses with high nutrients associated with ocean
input have temperatures and salinities which differ from those associated with riverine
and point source inputs. Based on the analysis presented in this report, we suggest that
water temperature and salinity data always be collected at the same time as nutrient and
dissolved oxygen concentrations are measured.
To apply these regression models, the user would substitute the parameters in
Tables 2 and 8 into Equation 1 and evaluate it using the measured temperature, salinity,
and (for dissolved oxygen, if available) in situ fluorescence. The probability of the
exceedance being associated with ocean input is then calculated from the value of
Equation 1 using Equation 2. The user then needs to provide the prediction
point, either using the optimal prediction point presented in Tables 3 and 9 or specifying a
different prediction point based on the needs of their application. If the calculated
probability is greater than the prediction point, then the exceedance is predicted to be
associated with ocean input. The regression models presented in this report were
calculated using the water quality thresholds presented in Table 1. If the user desires
other water quality thresholds, then the logistic regression equations would need to be
recalculated.
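The steps in the preceding paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's implementation: it assumes that Equation 1 is a linear combination of the explanatory variables and that Equation 2 is the standard logistic transform of that value. The function names and the coefficient values are hypothetical placeholders; the fitted parameters are those in Tables 2 and 8 and are not reproduced here.

```python
import math


def linear_predictor(coeffs, temperature, salinity, fluorescence=None):
    """Assumed form of Equation 1: intercept plus weighted explanatory variables."""
    g = (coeffs["intercept"]
         + coeffs["temperature"] * temperature
         + coeffs["salinity"] * salinity)
    if fluorescence is not None:
        # Only the dissolved oxygen model with fluorescence uses this term.
        g += coeffs["fluorescence"] * fluorescence
    return g


def exceedance_probability(g):
    """Assumed form of Equation 2: logistic transform, written to avoid overflow."""
    if g >= 0:
        return 1.0 / (1.0 + math.exp(-g))
    e = math.exp(g)
    return e / (1.0 + e)


def is_ocean_input(probability, prediction_point):
    """An exceedance is attributed to ocean input when p exceeds the prediction point."""
    return probability > prediction_point


# Placeholder coefficients, NOT the fitted values from Tables 2 and 8.
PLACEHOLDER_COEFFS = {"intercept": -10.0, "temperature": -0.5, "salinity": 0.5}

g = linear_predictor(PLACEHOLDER_COEFFS, temperature=9.0, salinity=33.0)
p = exceedance_probability(g)
attributed = is_ocean_input(p, prediction_point=0.43)
```

Keeping the prediction point as an explicit argument reflects the choice described above between the optimal prediction point and one selected for a particular application.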
Occasionally, water masses with high chlorophyll a are advected into PNW
estuaries from the coastal ocean (Brown and Ozretich, 2009). However, these
phytoplankton blooms do not have as distinctive a temperature and salinity signature
as high nutrient or low dissolved oxygen events. The signature is less
distinctive because peak chlorophyll a levels entering estuaries lag upwelling
conditions by about 4 to 7 days (Brown and Ozretich, 2009). Phytoplankton blooms
develop while upwelled water with high nutrient concentrations is exposed to sunlight
and warms up. Therefore, we do not expect that this approach will be capable of
distinguishing exceedances of chlorophyll a thresholds related to ocean input. However,
Newton and Horner (2003) demonstrated that phytoplankton species can be used to
identify the origin of phytoplankton blooms inside PNW estuaries, including those
advected from the ocean.
Nutrient and dissolved oxygen observations identified as being due to ocean input
should not be used in assessing compliance with water quality standards or for assessing
eutrophication status (e.g., using the approach of Bricker et al., 2003). By excluding
observations associated with ocean input from water body assessments, type I errors in
listings (i.e., falsely listing a segment as impaired when it is not) may be reduced. Falsely
declaring an estuarine reach as impaired results in unnecessary planning and costs (Smith
et al., 2001). Logistic regression models such as those presented in this report could also
be used to remove the effect of ocean input in a water quality dataset, and then the
remaining data could be used in the development of nutrient criteria (e.g., using the
percentile approach presented in Brown et al., 2007).
For estuarine assessments, such as the EPA's National Coastal Assessment (EPA,
2004), water quality is assessed as "good", "fair" or "poor" by comparing observed water
quality indicators to thresholds established for each of these categories. For example, for
the west coast of the United States, the thresholds for DIP in the last National Coastal
Assessment were as follows: DIP < 0.32 µM was rated as "good", 0.32 µM < DIP <
3.2 µM was rated as "fair", and DIP > 3.2 µM was rated as "poor." In the most recent
assessment of west coast estuaries, 86% of the sites had DIP levels in the "fair" category,
and 10% in the "poor" category (EPA, 2004). That assessment also states that coastal
upwelling may have been an important contributing factor to the high DIP levels. In the
flood-tide dataset from the Yaquina Estuary, 91% of the DIP observations would be
classified as "fair", and 9% as "good" using the National Coastal Assessment thresholds.
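As an illustration of the rating scheme just described, the sketch below assigns a category to a single DIP concentration using the 0.32 µM and 3.2 µM thresholds quoted above. The function name is hypothetical, and the handling of values exactly equal to a threshold (rated "fair" here) is an assumption, since the text gives only strict inequalities.

```python
def rate_dip(dip_um, good_below=0.32, poor_above=3.2):
    """Rate a DIP concentration (in micromolar) as good, fair, or poor.

    Values exactly on a threshold are rated "fair"; that boundary
    convention is an assumption made for this sketch.
    """
    if dip_um < good_below:
        return "good"
    if dip_um > poor_above:
        return "poor"
    return "fair"


# Hypothetical concentrations, for illustration only.
ratings = [rate_dip(value) for value in (0.1, 1.5, 4.0)]
```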
In this report, we present logistic regression models for the thresholds presented in
Table 1; however, additional logistic regressions could be developed for other water
quality thresholds such as those used in EPA (2004) and Bricker et al. (2003). An
alternate approach would be to use flood-tide or inner shelf data to develop thresholds for
the categories that are a function of water temperature and salinity. Based on the data
presented in this report, the nutrient thresholds for the water quality indicators would
need to be increased and the dissolved oxygen threshold decreased at low water
temperatures and high salinities due to the influence of coastal upwelling.
The logistic regression models developed in this report may be applicable at a
regional scale for estuaries extending from northern California to outer coast estuaries
along the Washington coast. If the models presented in this report were applied to other
Oregon estuaries, this would require the assumption that the nutrient and dissolved
oxygen levels and their distinctive thermal and saline signatures are uniform along the
coast. More extensive verification of the approach would be needed prior to applying
these models to other estuaries. In addition, this method also assumes that the nutrient
and oxygen levels entering the Yaquina Estuary are related to coastal upwelling, rather
than being influenced by plume effects or runoff from coastal watersheds. The Columbia
River plume has been shown to influence the coastal ocean along the Oregon shelf;
however, Huyer et al. (2005) found that there is no evidence that the plume supplies any
nitrates to the region off of Newport. In addition, peak flood-tide nutrient concentrations
entering the Yaquina during dry season flood tides are consistent with recently upwelled
water on the shelf and strongly correlated with upwelling-favorable wind stress (Brown
and Ozretich, 2009). Based on these lines of evidence, we conclude that nutrient and dissolved
oxygen levels in flood-tide water entering the Yaquina Estuary result from upwelling
processes rather than plume effects. For other estuaries that are in closer proximity
to the Columbia River (such as Willapa Bay, WA), this may not be the case. These models
would be most useful for tide-dominated estuaries in the Pacific Northwest, such as Coos,
Yaquina and Tillamook Bays (Lee and Brown, 2009). This method may also be useful
for distinguishing upwelling-caused hypoxic events in other regions. For example, Glenn
et al. (2004) suggested that recurrent hypoxia on the inner shelf off the coast of New
Jersey is related to coastal upwelling.
The models developed in this report will not capture all of the oceanic import
events, because as the water heats up and mixes with low salinity and warm water in the
estuary, the ocean signature becomes obscured. Hence, the number of exceedances identified as ocean
input by the logistic regression models will be underestimated. More work is required to
incorporate the heating of cool ocean water both inside and outside the estuary. An
alternate approach to deal with this limitation may be to use water temperature and
salinity observations for the flood tide prior to observed sampling, or to compare
observed nutrients to modeled nutrient values using flood-tide water temperatures.
Additionally, since both high nutrients and low dissolved oxygen conditions are
characteristic of recently upwelled water, if there are exceedances of nitrogen,
phosphorus, and dissolved oxygen thresholds simultaneously during a sampling event,
this may provide additional confidence in attributing these exceedances to ocean input.
The best predictor of dissolved oxygen events < 6.5 mg l⁻¹ included in situ
fluorescence. The models which excluded in situ fluorescence had more observations
classified as false positives (i.e., the model predicted dissolved oxygen < 6.5 mg l⁻¹, while
observed values exceeded this threshold). Worrall et al. (1998) found similar
misclassification due to the presence of algal blooms when using a logistic regression
model to predict exceedances of a dissolved oxygen criterion in an estuarine
impoundment. We included this parameter in the model to demonstrate that in situ
fluorescence (a measure of phytoplankton chlorophyll a) improves model performance.
However, we caution against applying this specific logistic model to other datasets due to
instrumentation and calibration differences in the measurement of in situ fluorescence.
The specific equation would only be applicable to datasets that use the same YSI
chlorophyll a sensor and calibration methods described in this report.
The traditional method of identifying nutrient sources or causes of low dissolved
oxygen within estuaries is the use of mixing diagrams, which requires that end members
(ocean and river) remain relatively constant (Loder and Reichard, 1981). Previous
research has demonstrated high temporal variability in ocean conditions (Brown and
Ozretich, 2009), with water quality conditions in the nearshore coastal ocean changing on
the scales of hours to days, which prohibits creating mixing diagrams using data collected
over multiple days. The use of mixing diagrams requires sampling multiple stations
along the axis of the estuary within a short period of time. However, due to logistical
constraints, water quality sampling is often random with respect to tidal stage, and often
only a few locations are sampled within an estuary on a particular sampling date. One
advantage of the approach presented in this report is that it can be used even when only
one location in the estuary is sampled on a given day.
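For readers unfamiliar with mixing diagrams, the sketch below shows the conservative mixing line against which observations are compared: the concentration expected at a given salinity if river and ocean end members mix without any internal source or sink. The function name and the end-member values are hypothetical and are not data from this report; the calculation holds only while the end members stay constant, which is the limitation discussed above.

```python
def conservative_mixing(salinity, river, ocean):
    """Concentration expected from linear mixing of two end members.

    river and ocean are (salinity, concentration) pairs. Observations that
    fall off this line suggest a source or sink within the estuary.
    """
    river_salinity, river_conc = river
    ocean_salinity, ocean_conc = ocean
    # Fraction of ocean water in the sample, from salinity alone.
    ocean_fraction = (salinity - river_salinity) / (ocean_salinity - river_salinity)
    return river_conc + ocean_fraction * (ocean_conc - river_conc)


# Hypothetical end members: (salinity, concentration), for illustration only.
expected = conservative_mixing(16.0, river=(0.0, 10.0), ocean=(32.0, 30.0))
```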
Many of the watersheds of PNW estuaries presently have relatively low levels of
development and human population density (Lee and Brown, 2009);
however, populations are expected to increase. Similar to other regions of the United
States (e.g., Crossett et al., 2004), highest human populations are located along the coast
and adjacent to many PNW estuaries (Lee and Brown, 2009); hence it is desirable to
develop an approach that identifies water quality conditions associated with the coastal
ocean. One can then remove those observations related to ocean conditions, and use the
remaining data to assess the status of estuarine water quality conditions and to assess
attainment of water quality standards. In addition, filtering out oceanic conditions from
estuarine water quality datasets may aid in identifying other sources of water quality
degradation. For example, land uses in watersheds are often correlated with water quality
conditions within estuaries (e.g., Dauer et al., 2000; Kauppila et al., 2003). Development
and highest population densities are often co-located adjacent to the most seaward portion
of estuaries, adjacent to the portion of the estuarine system where ocean conditions are
most likely to cause high nutrient, low dissolved oxygen, and high chlorophyll a
conditions. If the influence of the coastal ocean is not removed from samples taken for
compliance monitoring of water quality criteria, it may obscure anthropogenic effects and
cause misinterpretation of results.
In order to make the application of the approach developed in this report as
accessible as possible, we are presently developing a data exploration tool, which
graphically displays user-provided data, and compares it to flood-tide data from Yaquina
and the inner shelf, and identifies data points associated with ocean input using the
logistic regression models presented in this report.
Some studies have suggested that future climate change may lead to changes in
seasonality or intensity of coastal wind-driven upwelling (Snyder et al., 2003). It has
recently been suggested that there has been an increase in the occurrence of severe
hypoxic conditions on the Oregon shelf (Chan et al., 2008). If future studies demonstrate
anthropogenic-related changes in the amount of coastal upwelling or in the occurrence of
hypoxia on the inner shelf, then those exceedances identified as associated with ocean
input may have some component related to anthropogenic activities, which will greatly
complicate decisions with regard to compliance monitoring for water quality criteria.
New approaches will doubtless be required.
-------
5. Literature Cited
Barth, J.A., B.A. Menge, J. Lubchenco, F. Chan, J.M. Bane, A.R. Kirincich, M.A.
McManus, K.J. Nielsen, S.D. Pierce, and L. Washburn. 2007. Delayed upwelling
alters nearshore coastal ocean ecosystems in the northern California current.
Proceedings of the National Academy of Sciences, 104(10):3719-3724,
www.pnas.org/cgi/doi/10.1073/pnas.0700462104.
Bourke, R.H. and J.G. Pattullo. 1974. Seasonal variation of the water mass along the
Oregon-Northern California coast. Limnology and Oceanography, 19(2):190-198.
Bricker, S.B., J.G. Ferreira, and T. Simas. 2003. An integrated methodology for
assessment of estuarine trophic status. Ecological Modelling, 169:39-60.
Brown, C.A., W.G. Nelson, B.L. Boese, T.H. DeWitt, P.M. Eldridge, J.E. Kaldy, H. Lee
II, J.H. Power, and D.R. Young. 2007. An approach to developing nutrient criteria
for Pacific Northwest estuaries: A case study of Yaquina Estuary, Oregon. US EPA,
Washington, D.C., EPA/600/R-07/046. 183 pp.
Brown, C.A. and R.J. Ozretich. 2009. Coupling between the coastal ocean and Yaquina
Bay, Oregon: Importance of oceanic inputs relative to other nitrogen sources.
Estuaries and Coasts, 32:219-237, doi: 10.1007/s12237-008-9128-6.
Brown, C.A. and J.H. Power. In review. Historic and recent patterns in dissolved
oxygen within the Yaquina Estuary (Oregon, USA): Importance of anthropogenic
activities and oceanic conditions. Submitted to Estuarine, Coastal and Shelf Science.
Chan, F., J.A. Barth, J. Lubchenco, A. Kirincich, A. Weeks, W.T. Peterson, and B.A.
Menge. 2008. Emergence of anoxia in the California Current large marine
ecosystem. Science, 319:920.
Corwith, H.L. and P.A. Wheeler. 2002. El Nino related variations in nutrient and
chlorophyll distributions off Oregon. Progress in Oceanography, 54:361-380.
Crossett, K.M., T.J. Culliton, P.C. Wiley, and T.R. Goodspeed. 2004. Population trends
along the coastal United States: 1980-2008. Coastal Trends Report Series. National
Oceanic and Atmospheric Administration.
Dauer, D.M., J.A. Ranasinghe, and S.B. Weisberg. 2000. Relationships between benthic
community condition, water quality, sediment quality, nutrient loads, and land use
patterns in Chesapeake Bay. Estuaries, 23(1): 80-96.
De Angelis, M.A. and L.I. Gordon. 1985. Upwelling and river runoff as sources of
dissolved nitrous oxide to the Alsea estuary, Oregon. Estuarine, Coastal and Shelf
Science, 20:375-386.
Donohue, R. and E. van Looij. 2001. The derivation of percentile quality criteria for the
Swan-Canning Estuary; A binomial approach. Swan River Trust, Swan-Canning
Cleanup Program Report SCCP 24.
EPA. 2004. National Coastal Condition Report II. U.S. Environmental Protection
Agency, Washington, DC., EPA-620/R-03/002. www.epa.gov/owow/oceans/nccr2/
Glenn, S., R. Arnone, T. Bergmann, W.P. Bissett, M. Crowley, J. Cullen, J. Gryzmski, D.
Haidvogel, J. Kohut, M. Moline, M. Oliver, C. Orrico, R. Sherrell, T. Song, A.
Weidemann, R. Chant, and O. Schofield. 2004. Biogeochemical impact of
summertime coastal upwelling on the New Jersey Shelf. Journal of Geophysical
Research, 109, C12S02, doi: 10.1029/2003JC002265.
Grantham, B.A., F. Chan, K.J. Nielsen, D.S. Fox, J.A. Barth, A. Huyer, J. Lubchenco,
and B.A. Menge. 2004. Upwelling-driven nearshore hypoxia signals ecosystem and
oceanographic changes in the northeast Pacific. Nature, 429:749-754.
Haertel, L., C. Osterberg, H. Curl, Jr., and P.K. Park. 1969. Nutrient and plankton
ecology of the Columbia River estuary. Ecology, 50(6):962-978.
Hosmer, D.W. and S. Lemeshow. 2000. Applied logistic regression. 2nd edition, John
Wiley, New York, NY.
Huyer, A., J.H. Fleischbein, J. Keister, P.M. Kosro, N. Perlin, R.L. Smith, and P.A.
Wheeler. 2005. Two coastal upwelling domains in the northern California Current
system. Journal of Marine Research, 63:901-929.
Kauppila, P., J.J. Meeuwig, and H. Pitkanen. 2003. Predicting oxygen in small estuaries
of the Baltic Sea: a comparative approach. Estuarine, Coastal and Shelf Science,
57:1115-1126.
Kosro, P.M. 2003. Enhanced southward flow over the Oregon shelf in 2002: A conduit
for subarctic water. Geophysical Research Letters, 30(15): 8023, doi:
10.1029/2003GL017436.
Lane, J.Q., P.T. Raimondi, and R.M. Kudela. 2009. Development of a logistic
regression model for the prediction of toxigenic Pseudo-nitzschia blooms in
Monterey Bay, California. Marine Ecology Progress Series, 383:37-51.
Lee, H. II and C.A. Brown. 2009. Classification of Regional Patterns of Environmental Drivers
and Benthic Habitats in Pacific Northwest Estuaries. USEPA, Washington, D.C.,
EPA/600/R-09/140. 298 pp.
Lee, Y.J. and K.M.M. Lwiza. 2008. Characteristics of bottom dissolved oxygen in Long
Island Sound, New York. Estuarine, Coastal and Shelf Science, 76(2): 187-200.
Loder, T.C. and R.P. Reichard. 1981. The dynamics of conservative mixing in estuaries.
Estuaries, 4(1):64-69.
Lowery, T.A. 1998. Modelling estuarine eutrophication in the context of hypoxia,
nitrogen loadings, stratification and nutrient ratios. Journal of Environmental
Management, 52:289-305.
Menge, B.A., F. Chan, K.J. Nielsen, E. Di Lorenzo, and J. Lubchenco. 2009. Climatic
variation alters supply-side ecology: impact of climate patterns on phytoplankton and
mussel recruitment. Ecological Monographs, 79(3):379-395.
Nelson, W.G. and C.A. Brown. 2008. Use of probability-based sampling of water-
quality indicators in supporting development of quality criteria. ICES Journal of
Marine Science, 65:1421-1427.
Newton, J.A. and R.A. Horner. 2003. Use of phytoplankton species indicators to track
the origin of phytoplankton blooms in Willapa Bay, Washington. Estuaries, 26(4B):
1071-1078.
Park, K., J.G. Pattullo, and B. Wyatt. 1962. Chemical properties as indicators of
upwelling along the Oregon coast. Limnology and Oceanography, 7: 435-437.
Pearson, E.A. and G.A. Holt. 1960. Water quality and upwelling at Grays Harbor
entrance. Limnology and Oceanography, 5(1):48-56.
Peterson, W.T., J.E. Keister, and L.R. Feinberg. 2002. The effects of the 1997-1999 El
Nino/La Nina on hydrography and zooplankton off the central Oregon coast.
Progress in Oceanography, 54:381-398.
R Development Core Team. 2008. R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, Austria.
ISBN 3-900051-07-0, http://www.R-project.org.
Roegner, G.C., B.M. Hickey, J.A. Newton, A.L. Shanks, and D.A. Armstrong. 2002.
Wind-induced plume and bloom intrusions into Willapa Bay, Washington.
Limnology and Oceanography, 47(4):1033-1042.
Roegner, G. and A. Shanks. 2001. Import of coastally-derived chlorophyll a to South
Slough, Oregon. Estuaries, 24:224-256.
Smith, E.P., K. Ye, C. Hughes, and L. Shabman. 2001. Statistical assessment of
violations of water quality standards under Section 303(d) of the Clean Water Act.
Environmental Science and Technology, 35(3):606-612.
Snyder, M.A., L.C. Sloan, N.S. Diffenbaugh, and J.L. Bell. 2003. Future climate change
and upwelling in the California Current. Geophysical Research Letters, 30(15):1823,
doi:10.1029/2003GL017647.
Thomas, A.C., P.T. Strub, and P. Brickley. 2003. Anomalous satellite-measured
chlorophyll concentrations in the northern California Current in 2001-2002.
Geophysical Research Letters, 30(15):8022, doi: 10.1029/2003GL017409.
Towler, E., B. Rajagopalan, R.S. Summers, and D. Yates. 2010. An approach for
probabilistic forecasting of seasonal turbidity threshold exceedance. Water Resources
Research, 46, W06511, doi: 10.1029/2009WR007834.
Verity, P.G., M. Alber, and S.B. Bricker. 2006. Development of hypoxia in well-mixed
subtropical estuaries in the Southeastern USA. Estuaries and Coasts 29(4): 665-673.
Wetz, J.J., J. Hill, H. Corwith, and P.A. Wheeler. 2005. Nutrient and extracted
chlorophyll data from the GLOBEC long-term observation program, 1997-2004.
Data Report 193, COAS Reference 2004-1, Oregon State University, Corvallis,
Oregon.
Wheeler, P.A., A. Huyer, and J. Fleischbein. 2003. Cold halocline, increased nutrients
and higher chlorophyll off Oregon in 2002. Geophysical Research Letters, 30(15):
8021, doi: 10.1029/2003GL017395.
Worrall, F., D.A. Wooff, and P. McIntyre. 1998. A simple modelling approach for water
quality: The example of an estuarine impoundment. The Science of the Total
Environment, 219:41-51.
U.S. EPA. 2003. Ambient Water Quality Criteria for Dissolved Oxygen, Water Clarity
and Chlorophyll a for the Chesapeake Bay and Its Tidal Tributaries. Appendix I:
Analytical approaches for assessing short-duration dissolved oxygen criteria.
USEPA, Washington, D.C., EPA 903-R-03-002.
-------
Appendices
Provided in the appendices are classification tables of the logistic regression models
generated for DIN, DIP, and dissolved oxygen using water temperature and salinity.
-------
Table A1. Classification table for DIN logistic regression with water temperature and salinity as explanatory variables for probability
prediction points ranging from 0 to 1. The row that is shaded shows the optimal prediction point where false negative and false positive
rates are approximately equal.
Prediction  Total Correct             Total Incorrect           Percent  Sensitivity  Specificity  False     False
Point       DIN > 14 µM  DIN < 14 µM  DIN > 14 µM  DIN < 14 µM  Correct                            Positive  Negative
0           194          0            0            237          45.0     100.0        0.0          100.0     0.0
0.05        193          147          1            90           78.9     99.5         62.0         38.0      0.5
0.10        193          159          1            78           81.7     99.5         67.1         32.9      0.5
0.15        190          167          4            70           82.8     97.9         70.5         29.5      2.1
0.20        189          173          5            64           84.0     97.4         73.0         27.0      2.6
0.25        188          180          6            57           85.4     96.9         75.9         24.1      3.1
0.30        185          186          9            51           86.1     95.4         78.5         21.5      4.6
0.35        181          188          13           49           85.6     93.3         79.3         20.7      6.7
0.40        176          191          18           46           85.2     90.7         80.6         19.4      9.3
0.45        173          193          21           44           84.9     89.2         81.4         18.6      10.8
0.50        169          198          25           39           85.2     87.1         83.5         16.5      12.9
0.55        163          200          31           37           84.2     84.0         84.4         15.6      16.0
0.60        161          204          33           33           84.7     83.0         86.1         13.9      17.0
0.65        156          209          38           28           84.7     80.4         88.2         11.8      19.6
0.70        148          214          46           23           84.0     76.3         90.3         9.7       23.7
0.75        143          221          51           16           84.5     73.7         93.2         6.8       26.3
0.80        133          224          61           13           82.8     68.6         94.5         5.5       31.4
0.85        112          228          82           9            78.9     57.7         96.2         3.8       42.3
0.90        78           237          116          0            73.1     40.2         100.0        0.0       59.8
0.95        40           237          154          0            64.3     20.6         100.0        0.0       79.4
1.00        0            237          194          0            55.0     0.0          100.0        0.0       100.0
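The percentage columns of Table A1 follow from the four count columns. The sketch below recomputes them for the 0.50 prediction point (169 and 198 correct, 25 and 39 incorrect). The function and argument names are hypothetical, and the formulas are the standard definitions of sensitivity and specificity, with the false negative and false positive rates taken as their complements; they are assumed here to be the definitions used in the table.

```python
def classification_metrics(correct_pos, correct_neg, incorrect_pos, incorrect_neg):
    """Percent correct, sensitivity, specificity and error rates (all in percent).

    correct_pos / incorrect_pos: observed exceedances classified correctly / incorrectly.
    correct_neg / incorrect_neg: observed non-exceedances classified correctly / incorrectly.
    """
    n_pos = correct_pos + incorrect_pos  # all observed exceedances
    n_neg = correct_neg + incorrect_neg  # all observed non-exceedances
    sensitivity = 100.0 * correct_pos / n_pos
    specificity = 100.0 * correct_neg / n_neg
    return {
        "percent_correct": 100.0 * (correct_pos + correct_neg) / (n_pos + n_neg),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive": 100.0 - specificity,
        "false_negative": 100.0 - sensitivity,
    }


# Counts from the 0.50 row of Table A1.
row = classification_metrics(169, 198, 25, 39)
rounded = {name: round(value, 1) for name, value in row.items()}
```

Rounded to one decimal place, the results are 85.2, 87.1, 83.5, 16.5 and 12.9, which agree with the tabulated row.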
-------
Table A2. Classification table for DIP logistic regression with water temperature and salinity as explanatory variables for probability
prediction points ranging from 0 to 1. The row that is shaded shows the optimal prediction point where false negative and false positive
rates are approximately equal.
Prediction  Total Correct               Total Incorrect             Percent  Sensitivity  Specificity  False     False
Point       DIP > 1.3 µM  DIP < 1.3 µM  DIP > 1.3 µM  DIP < 1.3 µM  Correct                            Positive  Negative
0           204           0             0             227           47.3     100.0        0.0          100.0     0.0
0.05        203           154           1             73            82.8     99.5         67.8         32.2      0.5
0.10        202           166           2             61            85.4     99.0         73.1         26.9      1.0
0.15        200           170           4             57            85.8     98.0         74.9         25.1      2.0
0.20        199           175           5             52            86.8     97.5         77.1         22.9      2.5
0.25        198           180           6             47            87.7     97.1         79.3         20.7      2.9
0.30        197           187           7             40            89.1     96.6         82.4         17.6      3.4
0.35        195           188           9             39            88.9     95.6         82.8         17.2      4.4
0.40        193           191           11            36            89.1     94.6         84.1         15.9      5.4
0.45        191           194           13            33            89.3     93.6         85.5         14.5      6.4
0.50        188           195           16            32            88.9     92.2         85.9         14.1      7.8
0.55        185           197           19            30            88.6     90.7         86.8         13.2      9.3
0.60        181           200           23            27            88.4     88.7         88.1         11.9      11.3
0.65        176           203           28            24            87.9     86.3         89.4         10.6      13.7
0.70        169           205           35            22            86.8     82.8         90.3         9.7       17.2
0.75        158           207           46            20            84.7     77.5         91.2         8.8       22.5
0.80        149           212           55            15            83.8     73.0         93.4         6.6       27.0
0.85        142           215           62            12            82.8     69.6         94.7         5.3       30.4
0.90        118           225           86            2             79.6     57.8         99.1         0.9       42.2
0.95        62            227           142           0             67.1     30.4         100.0        0.0       69.6
1.00        0             227           204           0             52.7     0.0          100.0        0.0       100.0
-------
Table A3. Classification table for dissolved oxygen logistic regression with water temperature and salinity as explanatory variables for
probability prediction points ranging from 0 to 1. The row that is shaded shows the optimal prediction point where false negative and false
positive rates are approximately equal.
Prediction  Total Correct               Total Incorrect             Percent  Sensitivity  Specificity  False     False
Point       < 6.5 mg l⁻¹  ≥ 6.5 mg l⁻¹  < 6.5 mg l⁻¹  ≥ 6.5 mg l⁻¹  Correct                            Positive  Negative
0           432           0             0             694           38.4     100.0        0.0          100.0     0.0
0.05        427           183           5             511           54.2     98.8         26.4         73.6      1.2
0.10        419           265           13            429           60.7     97.0         38.2         61.8      3.0
0.15        415           329           17            365           66.1     96.1         47.4         52.6      3.9
0.20        403           366           29            328           68.3     93.3         52.7         47.3      6.7
0.25        389           408           43            286           70.8     90.0         58.8         41.2      10.0
0.30        379           446           53            248           73.3     87.7         64.3         35.7      12.3
0.35        365           484           67            210           75.4     84.5         69.7         30.3      15.5
0.40        347           514           85            180           76.5     80.3         74.1         25.9      19.7
0.43        332           534           100           160           76.9     76.9         76.9         23.1      23.1
0.45        322           544           110           150           76.9     74.5         78.4         21.6      25.5
0.50        304           568           128           126           77.4     70.4         81.8         18.2      29.6
0.55        278           593           154           101           77.4     64.4         85.4         14.6      35.6
0.60        255           614           177           80            77.2     59.0         88.5         11.5      41.0
0.65        229           632           203           62            76.5     53.0         91.1         8.9       47.0
0.70        183           647           249           47            73.7     42.4         93.2         6.8       57.6
0.75        124           669           308           25            70.4     28.7         96.4         3.6       71.3
0.80        62            680           370           14            65.9     14.4         98.0         2.0       85.6
0.85        34            688           398           6             64.1     7.9          99.1         0.9       92.1
0.90        5             694           427           0             62.1     1.2          100.0        0.0       98.8
0.95        0             694           432           0             61.6     0.0          100.0        0.0       100.0
1.00        0             694           432           0             61.6     0.0          100.0        0.0       100.0
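The table captions describe the optimal prediction point as the one where false negative and false positive rates are approximately equal. A minimal sketch of that selection rule follows, using five rows copied from Table A3 as (prediction point, false positive, false negative). The function name is hypothetical, and choosing the smallest absolute difference between the two rates is an assumed reading of "approximately equal".

```python
# (prediction point, false positive %, false negative %) from Table A3.
TABLE_A3_ROWS = [
    (0.35, 30.3, 15.5),
    (0.40, 25.9, 19.7),
    (0.43, 23.1, 23.1),
    (0.45, 21.6, 25.5),
    (0.50, 18.2, 29.6),
]


def optimal_prediction_point(rows):
    """Return the prediction point whose two error rates are closest to each other."""
    best_row = min(rows, key=lambda row: abs(row[1] - row[2]))
    return best_row[0]


best_point = optimal_prediction_point(TABLE_A3_ROWS)
```

For these rows the rule selects 0.43, the prediction point used with the water temperature and salinity model for dissolved oxygen in Figure 14.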
-------