EPA
  United States
  Environmental Protection
  Agency
  Six Key Steps
  for Developing and Using
  Predictive Tools at Your Beach

                      U.S. Environmental Protection Agency
                      Office of Water
                      March 8, 2016
                      820-R-16-001

-------
Foreword

This non-technical guide was developed by the U.S. Environmental Protection
Agency (EPA) to provide local government officials, beach managers, health
department personnel, and others basic information on how to develop predictive
tools in the context of an overall beach monitoring and notification program. Five
case studies are presented toward the end of this document as examples of how
predictive tools have been developed and used at actual beaches. Readers seeking
more in-depth design and implementation information are encouraged to review the
sources used to develop this document as well as various on-line resources provided
by EPA and other agencies.
    Front cover photos, starting upper left and moving clockwise:
       • Little boy enjoying the waves, ©istockphoto.com.
       • Water quality notification sign, USEPA.
       • qPCR analysis, City of Racine Health Department.
       • Lake Superior, Michigan, upper peninsula, ©istockphoto.com.
       • Miami Beach, ©istockphoto.com.

    Case study images are courtesy of the Chicago Parks District, Charles River Watershed
    Association, Milwaukee Department of City Development, Milwaukee County Parks,
    University of Wisconsin Zilber School of Public Health, and City of Racine Health Department.

-------
Contents

Foreword
Acronym List	v
Introduction	1
      The Time-Lag Problem	2
      Predictive Tools	3
      Developing a Predictive Model	4
Step 1: Evaluate the Appropriateness of a FIB Predictive Tool	6
      Introduction to Step 1	6
      Is There a Need for a Predictive Tool?	6
      Are Beach Characteristics Compatible with Predictive Tools?	7
      Are There Sufficient Historical Data to Develop and Test a Predictive Tool?	7
      Are There Funding and Other Resources Available to Develop, Operate,
      Maintain, and Update a Predictive Tool?	8
         Personnel and Technical Experts	9
         Data Collection	9
         Monitoring Equipment and Supplies	10
         Modeling and Statistical Software  	  10
         Model Evaluation Over Time	11
Step 2: Identify Variables and Collect Data	12
      Introduction to Step 2	12
         Key Attributes of Variable Data Sets	12
      FIB Density	14
      Independent Variables	16
         Variables Relating to Bacteria Movement through the Drainage Area	16
         Variables Relating to Bacteria Movement through the Receiving Water	17
         Variables Relating to the Fate of Bacteria in the Swimming Area	19
         Variables Relating to Activities and Conditions at the Beach	20
Step 3: Perform Exploratory Data Analysis	23
      Introduction to Step 3	23
         Virtual Beach Software	24
         Data Management	24
      Characterize the FIB and Independent Variable Data Sets	27
         BoxPlots	27
         Outliers	29
         Comparing Data Distributions among Variable Subsets	30

-------
                    Examine the Relationship between FIB and Independent Variables	31
                       Scatterplots	31
                       Variable Transformation	32
                       Creation of New Variables	32
                       Correlation among Independent Variables	32
                       Analysis of Variance for Categorical Variables	33
              Step 4: Develop and Test a Predictive Model	34
                    Introduction to Step 4	34
                       Data Sets	34
                       Reducing Errors	35
                       Virtual Beach	35
                       Models that Do Not Meet Performance Goals	40
                    Exceedance Probability Threshold	41
              Step 5: Integrate the Predictive Tool into a Beach Monitoring and
                    Notification Program	44
                    Introduction to Step 5	44
                    Frequency of Running the Model	45
                    Notification Protocols	45
                    Types of Beach Notifications	47
                       Beach Advisories	47
                       Beach Closings	47
                       Preemptive Advisories	47
                       Permanent Advisories	48
                    Public Communication	49
                       Public Education	49
                       Public Outreach	50
                    Other Uses for Predictive Models	50
              Step 6: Evaluate the Predictive Tool over Time  	51
                    Introduction to Step 6	51
                    Changes to the Fate and Transport of FIB	51
                    Changes to Data Sources	52
                    Changes to Your Beach Program	53
              Bibliography	54
              Case Studies	61

-------
Figures

Figure 1.  Using sampling and culture analysis to make a beach notification decision	2

Figure 2.  Using predictive modeling to make a beach notification decision	3

Figure 3.  Box plot attributes	28

Figure 4.  Box plots of E. coli density sorted by wind direction	29

Figure 5.  Comparison of E. coli density over a four-year period	30

Figure 6.  Scatterplots of E. coli vs. rainfall without transformation (A) and with
          a log-transformation (B)	32

Figure 7.  Plot of persistence model results of 2005 data (adapted from Francy
          and Darner 2006.)	38

Figure 8.  Plot of predictive model results of 2005 data (adapted from Francy
          and Darner 2006.)	40

Figure 9.  Plot of predictive model results of 2005 data expressed as exceedance
          probability threshold (adapted from Francy & Darner 2006.)	42

Figure 10. Notification protocol for a beach program that uses sampling results and
          a predictive model to make notification decisions	46

Figure 11. Notification protocol for a beach that uses only model results to make
          notification decisions	46


Tables

Table 1. Beaufort Wind Scale	18

Table 2. Independent variables used in final statistical models from case studies	21

-------
Acronym List
ANN	Artificial Neural Network

ANOVA	Analysis of Variance

CART	Classification and Regression Tree

CESN	Coastal Environmental Sensing Network

CFU	Colony Forming Unit

CPD	Chicago Parks District

CRWA	Charles River Watershed Association

CSO	Combined Sewer Overflow

DOY	Day of the Year

EDA	Exploratory Data Analysis

EMPACT	Environmental Monitoring for Public Access and Community Tracking

EnDDaT	Environmental Data Discovery and Transformation

EPA	U.S. Environmental Protection Agency

FIB	Fecal Indicator Bacteria

GBM	Gradient Boosting Machine

GIS	Geographic Information System

GLRI	Great Lakes Restoration Initiative

MHD	Milwaukee Health Department

MLR	Multivariable Linear Regression

MPN	Most Probable Number

NDBC	National Data Buoy Center

NOAA	National Oceanic and Atmospheric Administration

NWIS	National Water Information System

NWS	National Weather Service

OLS	Ordinary Least Squares

PLS	Partial Least Squares

-------
       QA	Quality Assurance

       QAPP	Quality Assurance Project Plan

       QC	Quality Control

       qPCR	Quantitative Polymerase Chain Reaction

       SCDHEC	South Carolina Department of Health and Environmental Control

       SSO	Sanitary Sewer Overflow

       TMDL	Total Maximum Daily Load

       USACE	U.S. Army Corps of Engineers

       USGS	U.S. Geological Survey

       UTC	Coordinated Universal Time

       UV	Ultraviolet

       UWM	University of Wisconsin-Milwaukee

       VB 	Virtual Beach

       VB3	Virtual Beach Version 3

       WDNR	Wisconsin Department of Natural Resources

       WWTP	Wastewater Treatment Plant

-------
Introduction

Even the most pristine waters contain a variety of microscopic
organisms. Most of them are harmless, but a small portion can cause
illness in humans, including gastroenteritis; eye, ear, and throat infections;
hepatitis; and giardiasis. Generally, disease-causing (pathogenic) organisms
encountered at swimming beaches originate from the feces of humans
and warm-blooded animals and are carried into recreational waters by
stormwater runoff.

Monitoring directly for pathogens in recreational waters is currently
impractical for a number of reasons, including the difficulty of
identifying which pathogens are present, the need to filter large volumes
of water to isolate enough organisms to measure, and the high cost of
analytical methods. Fortunately, some types of nonpathogenic fecal bacteria are
transported along with disease-causing microbes. Known generically as
"fecal indicator bacteria" (FIB), they exist in far greater numbers than
pathogens and are easier to isolate and enumerate in the laboratory.
Consequently, FIB can serve as markers for the potential presence of
pathogens.

Currently, EPA recommends two types of FIB for use in beach monitoring
programs: enterococci and Escherichia coli (E. coli). Either type can be used
at freshwater beaches, and enterococci are recommended for marine water.
State beach programs use exceedance of a beach notification threshold based
on the U.S. Environmental Protection Agency's (EPA's) national criteria
recommendation or a site-specific water quality standard for these bacteria
to determine when to issue a swimming advisory or close a beach (beach
notification).

Information on EPA's recommended water quality criteria is provided in the
National Beach Guidance and Required Performance Criteria for Grants (the
National Beach Guidance) at
http://www.epa.gov/sites/production/files/2014-07/documents/beach-guidance-final-2014.pdf.

    Key Resources on Predictive Tools
    • Predictive Tools for Beach Notification. Volume I, Review and Technical Protocol (USEPA 2010a)
    • Predictive Modeling at Beaches. Volume II, Predictive Tools for Beach Notification (USEPA 2010b)
    • Developing and Implementing Predictive Models for Estimating Recreational Water Quality at Great Lakes
      Beaches (Francy et al. 2013a)
    • Virtual Beach 3.0.4: User's Guide (Cyterski et al. 2013)
    • Accessing Online Data for Building and Evaluating Real-Time Models to Predict Beach Water Quality
      (Mednick 2009)
    • Report of the Experts Scientific Workshop on Critical Research Needs for the Development of New or Revised
      Recreational Water Quality Criteria (USEPA 2007)
    • Beach Water Quality Decision Support System (Rockwell et al. 2013)

-------
The Time-Lag Problem
At first glance, the process for determining when beach water is safe for
swimming seems fairly straightforward. If laboratory results indicate FIB
densities above the state water quality standard or other threshold value, a
beach notification is issued. If FIB densities are below the threshold value,
no action is taken (Figure 1).

[Figure 1 timeline: 8:00 a.m., collect FIB sample; 10:00 a.m., deliver sample
to lab for culture analysis; 7:00 a.m. the next day, receive sample results;
7:00 a.m. the next day, make beach notification decision.]
Figure 1. Using sampling and culture analysis to make a beach notification decision.

Underlying this beach notification system is the assumption that FIB
densities do not change (i.e., they persist) between the time a water sample
is taken and the time the laboratory results are known (usually a span of
18-24 hours for culture methods, the methods most often used). At some
beaches, this "persistence model" is valid, especially when natural or
artificial barriers restrict water movement at the beach. At many open water beaches,
however, studies have shown that FIB density can fluctuate significantly
over relatively short periods of time. This phenomenon sets up possible
undesirable scenarios, for example:

     Beach water is sampled on Monday. Results obtained on Tuesday
     indicate that FIB density was above the state standard, so the
     beach manager issues an advisory. On Wednesday, results of
     follow-up samples taken on Tuesday reveal that FIB density was
     back to normal and the water was actually safe for swimming (i.e.,
     Monday's FIB levels did not persist into Tuesday). Consequently,
     Tuesday—a perfectly good beach day—was lost. Monday's
     swimmers, on the other hand, were exposed to high levels of FIB
     and potentially unhealthy levels of pathogens.

None of the consequences are good: (1) Monday's swimmers might have
swum in contaminated water, (2) beachgoers might have lost recreational
time on Tuesday, and (3) area businesses might have suffered economic losses
due to the lack of customers. The 18-24-hour time-lag can be a problem.
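
One way to see how well the persistence assumption holds at a beach is to replay past daily results and compare each day's decision (based on the previous day's sample) with what that day's own sample later showed. The sketch below is only an illustration: the function name, the outcome labels, the sample densities, and the threshold of 235 are assumptions made for the example and are not taken from this guide.

```python
def persistence_outcomes(daily_fib, threshold):
    """Tally how persistence-based decisions compare with actual conditions.

    daily_fib: FIB densities for consecutive days, oldest first.
    Under the persistence model, the decision for a given day is based on
    the sample collected the day before.
    """
    counts = {"correct": 0, "missed_exceedance": 0, "unneeded_advisory": 0}
    for yesterday, today in zip(daily_fib, daily_fib[1:]):
        advisory_posted = yesterday > threshold    # what the manager did
        actually_exceeded = today > threshold      # what the water was like
        if advisory_posted == actually_exceeded:
            counts["correct"] += 1
        elif actually_exceeded:
            counts["missed_exceedance"] += 1       # swimmers were exposed
        else:
            counts["unneeded_advisory"] += 1       # a good beach day was lost
    return counts


# Hypothetical densities for five consecutive days (illustrative values only).
example_week = [300, 80, 90, 400, 120]
print(persistence_outcomes(example_week, threshold=235))
```

In this made-up week, only one of the four persistence-based decisions matches the actual conditions, which is the kind of result that would suggest the time lag is a real problem at the beach.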

-------
Predictive Tools
The time-lag problem of culture analysis and the shortcomings of the
persistence model have led to the development of tools that predict whether
the applicable water quality standard has been or is likely to be exceeded so
that beach notifications can be issued in a more timely way. When integrated
properly into a beach notification program, these tools can provide an early
warning of potentially unsafe swimming conditions. This guide presents
an overview of how to develop a predictive tool for your beach program. It
focuses mainly on implementation activities and issues and not on technical
details.

In most instances, the "tool" is actually a mathematical equation or
"model" designed to produce one of two types of output: (1) a FIB density
prediction, or (2) a probability prediction that expresses the chances that an
applicable water quality standard or notification threshold will be exceeded
(e.g., "There is a 60 percent chance that the standard or threshold will be
exceeded."). Either output type can be used by beach managers to "trigger"
a beach notification. Throughout this document, when "bacteria density"
is mentioned as the model output, assume that it includes "exceedance
probability" as an alternative form of output, unless indicated otherwise.
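
The two output types lead to two slightly different decision rules, which can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names, the numeric threshold, and the probability cutoff are hypothetical, and whether a value equal to the threshold counts as an exceedance is an assumption your program would need to settle.

```python
def notify_from_density(predicted_density, notification_threshold):
    """Output type 1: the model predicts a FIB density.

    A notification is triggered when the predicted density is above the
    applicable standard or notification threshold.
    """
    return predicted_density > notification_threshold


def notify_from_probability(exceedance_probability, probability_cutoff):
    """Output type 2: the model predicts the chance of an exceedance.

    A notification is triggered when that chance reaches a cutoff chosen by
    the beach program (the cutoff value here is an assumption).
    """
    return exceedance_probability >= probability_cutoff


# Hypothetical values for illustration only.
print(notify_from_density(predicted_density=310.0, notification_threshold=235.0))
print(notify_from_probability(exceedance_probability=0.60, probability_cutoff=0.50))
```

Both calls above return True with these made-up inputs, so either model would trigger a notification that morning.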

Figure 2 shows the timeline for a beach program using predictive modeling.
The time required to make the beach notification decision is significantly
shorter than the time required in the scenario shown in Figure 1.

[Figure 2 timeline: 8:00 a.m., collect model input variables (e.g., rainfall,
turbidity); 8:30 a.m., run predictive model; 8:45 a.m., interpret results and
compare to WQS or other notification threshold; 9:00 a.m., make beach
notification decision.]
Figure 2. Using predictive modeling to make a beach notification decision.

In addition to improving the timeliness of beach notifications, predictive
models can also help reduce sampling and increase the accuracy of
identifying notification days by adding to the existing monitoring
program (e.g., if FIB sampling occurs only once or twice a week because of
resource constraints, predictive models can provide information for timely
notification on other days).

-------
Developing a Predictive Model
This document presents six basic steps that an interdisciplinary project team
(the Beach Team) might take to analyze, develop, implement, and evaluate
the success of a predictive model. Each step is discussed in a separate section.
   • Step 1: Evaluate the Appropriateness of a Predictive Tool. This
    section outlines factors that your Beach Team should consider before
    proceeding with a modeling project. The team should assess the
    degree of risk to the public from swimming at the beach, confirm that
    essential historical FIB data exist that can be used to develop the model,
    identify any beach conditions or attributes that are not compatible with
    FIB modeling, and evaluate whether sufficient resources are available
    locally to support model development, operation, and maintenance.

   • Step 2: Identify Variables and Collect Data. This section introduces
    independent variables influencing the movement of bacteria from their
    sources, through the drainage system and receiving water, and into the
    swimming area of a beach. It offers insights into which independent
    variables might serve as the best candidates for modeling FIB at a
    beach.

   • Step 3: Perform Exploratory Data Analysis. Once a set of candidate
    independent variables is selected, they must be statistically evaluated to
    see how well they correlate with FIB densities. Results from exploratory
    data analysis further refine the list of candidate variables.

   • Step 4: Develop and Test the Predictive Tool. Models can range from
    simple to complex. This section begins with a discussion of rainfall-
    based models that need only one independent variable to develop and
    run. The discussion continues with modeling using multiple variables
    and concludes with techniques for testing the model.

   • Step 5: Integrate the Predictive Tool into a Beach Monitoring and
    Notification Program. Predictive tools are one component of an
    overall beach program. To successfully integrate a model into a beach
    monitoring program, your Beach Team should develop protocols for
    collecting input data, running the model, and using model results.

   • Step 6: Evaluate the Predictive Tool over Time. To ensure that model
    output remains accurate and relevant over time as beach conditions
    change, your Beach Team should evaluate the model's accuracy at least
    annually.

-------
This document concludes with a series of case studies that illustrate various
ways that predictive models have been developed and implemented. The
following case studies helped inform this guidance:
  • The Grand Strand, South Carolina. The South Carolina Department
    of Health and Environmental Control (SCDHEC) developed a
    stormwater model to predict FIB densities at South Carolina state
    beaches. This case study highlights the limitations of monitoring
    equipment and the value of collaboration and technology.

  • Charles River, Massachusetts. The Charles River Watershed Association
    (CRWA) worked with Tufts University to develop a statistical model
    to predict water quality in the Lower Charles River Basin. CRWA's
    experience highlights the importance of model simplicity and the
    availability of real-time data when resources are limited.

  • Chicago, Illinois. The Chicago Parks District (CPD) developed a
    predictive model in 2011 with the assistance of the U.S. Geological
    Survey (USGS). CPD's experience emphasizes the need for
    comprehensive knowledge of the beach environment as well as
    adequate funding and technical resources to collect data and conduct
    statistical analyses.

  • Racine, Wisconsin. The City of Racine and the Wisconsin Department
    of Natural Resources developed NOWCAST statistical models for
    Racine's two beaches using EPA's Virtual Beach (VB) software.
    Racine's experience illustrates the importance of a robust data set and
    the advantages of reinforcing a model with other beach monitoring
    components.

  • South Shore Beach, Wisconsin. With assistance from the University
    of Indiana and USGS, the Milwaukee Health Department (MHD)
    developed a statistical model for three of its public beaches based on
    24-hour rainfall data and previous 24-hour bacterial sampling data.
    MHD's experience shows that a model can be a good fit for the local
    public health department.

-------
Step 1: Evaluate the Appropriateness of a FIB Predictive Tool

Introduction to Step 1
While a predictive tool might provide a huge benefit to some beach
programs, your Beach Team should first carefully consider and answer the
following questions to make sure that a predictive tool is right for your beach:
  a. Is there a need for a predictive tool?
  b. Are beach characteristics compatible with predictive tools?
  c. Are there sufficient historical data to develop and test a predictive tool?
  d. Are there funding and personnel experienced with model development
     and maintenance available to develop, operate, maintain, and update a
     predictive tool?


Is There a Need for a Predictive Tool?
One of the first things your Beach Team should evaluate is whether a
predictive tool is needed. Remember that the main purpose of a predictive
tool is to predict whether the applicable water quality standard has been or
is likely to be exceeded during the time period prior to culture results
being available (time lag) on a sampling day or on nonsampling days. Using
time series analyses, EPA reports that bacteria levels at the beach can change
over relatively short periods of time (USEPA 2010c). If FIB density at your
beach, however, is known to persist for 24 hours or more, the need for making
predictions is not as important. Traditional water sampling and laboratory
analysis alone might adequately protect swimmer health.

Other situations when a predictive tool might not be needed include
(1) beaches being sampled daily using rapid methods, and (2) beaches never,
or hardly ever, exceeding applicable recreational water quality standards.

Given several beaches to manage and limited budgets, your Beach Team will
likely rank their beaches according to factors such as potential risk to
human health presented by pathogens and beach use. These rankings (described
further in Chapter 3 of EPA's 2014 National Beach Guidance) can also help
identify the beaches that could benefit most from predictive models.

-------
Are Beach Characteristics Compatible with
Predictive Tools?
Beaches that make the best candidates for predictive tools are located in
environmental settings that are themselves predictable. A good candidate
beach operates under a fairly constant range of "normal" conditions that,
when processed through a predictive tool, should yield a good estimate of
FIB levels. The tool operates like an "if...then" statement. If a set of these
conditions occurs, then you get a specific FIB density. Importantly, most
predictive tools are developed using historical data which, in effect, describe
and define the norm in terms of both the conditions and the predicted value.
Once in operation, if the tool is presented with conditions outside the norm,
it might not yield accurate results. Therefore, the team might have to revisit
the conditions predictive for FIB density.
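
A simple safeguard that follows from this point is to check each morning's model inputs against the range seen in the historical data before trusting the prediction. The sketch below assumes a plain minimum-to-maximum range check; the function name, variable names, and values are hypothetical and only illustrate the idea.

```python
def out_of_range_inputs(todays_inputs, historical_inputs):
    """Return the names of inputs that fall outside their historical range.

    todays_inputs: dict mapping variable name to today's value.
    historical_inputs: dict mapping variable name to the list of values
    used to develop the model.
    """
    flagged = []
    for name, value in todays_inputs.items():
        history = historical_inputs[name]
        if value < min(history) or value > max(history):
            flagged.append(name)
    return flagged


# Hypothetical historical record and morning readings (illustrative only).
history = {
    "rainfall_in": [0.0, 0.2, 1.1, 0.5],
    "turbidity_ntu": [3.0, 12.0, 8.5, 20.0],
}
today = {"rainfall_in": 2.4, "turbidity_ntu": 10.0}
print(out_of_range_inputs(today, history))  # ['rainfall_in']
```

When a variable is flagged, as rainfall is in this made-up example, the model is being asked to predict outside the conditions it was built on, and its output deserves extra caution.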

Beaches that might not be good candidates for predictive tools are usually
those subject to a wide or frequently changing set of conditions and
disturbances that impact FIB density, making "normal" difficult to define
and characterize. These conditions might include frequent impacts by spills
or illicit discharges or periodic visits by large flocks of birds. Some open
ocean beaches are not good candidates for modeling simply because of the
sheer complexity of the various meteorological conditions, tidal patterns,
offshore currents, and other factors that occur.

Are There Sufficient Historical Data to Develop and
Test a Predictive Tool?
Access to a sufficient amount of historical FIB density data and
corresponding data describing a variety of environmental conditions (i.e.,
independent variable data) is crucial for developing and testing predictive
models. EPA recommends having at least 50 observations, but 100 or more is
preferable (USEPA 2010b). Ideally, the observations should represent a range
of conditions experienced at the beach and include data collected in normal
seasons, drier-than-normal seasons, and wetter-than-normal seasons. This
is rarely the case, but the closer you can get to this ideal, the more robust
your model will be.

An important part of the model development process is testing the model.
Francy et al. (2013a) recommend that you collect data for at least three
seasons, then use two seasons' data as the training dataset and one season's
data as the testing dataset.
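
The data checks described above (at least 50 observations, preferably 100 or more, from at least three seasons, with two seasons used for training and one for testing) can be sketched as follows. The record layout, field names, function names, and sample values are assumptions made for this illustration.

```python
def summarize_data_sufficiency(observations, minimum=50, preferred=100, min_seasons=3):
    """Compare a historical data set with the counts recommended in the text."""
    seasons = {obs["season"] for obs in observations}
    count = len(observations)
    return {
        "n_observations": count,
        "n_seasons": len(seasons),
        "meets_minimum": count >= minimum,
        "meets_preferred": count >= preferred,
        "enough_seasons": len(seasons) >= min_seasons,
    }


def split_by_season(observations, test_season):
    """Hold out one whole season for testing and train on the others."""
    training = [obs for obs in observations if obs["season"] != test_season]
    testing = [obs for obs in observations if obs["season"] == test_season]
    return training, testing


# Hypothetical data set: 20 paired observations in each of three seasons.
observations = [
    {"season": year, "rain_in": 0.1 * day, "fib": 50 + 10 * day}
    for year in (2013, 2014, 2015)
    for day in range(20)
]
summary = summarize_data_sufficiency(observations)
training, testing = split_by_season(observations, test_season=2015)
print(summary)
print(len(training), len(testing))  # 40 20
```

Splitting by whole season, rather than at random, keeps the test data separate from the conditions the model was trained on, which is closer to how the model will be used in a new season.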

Checklist of Beach and Program Characteristics Compatible with Modeling

•  The beach operates
   under a constant range of
   "normal" conditions.

•  Exceedances of beach
   notification threshold
   values occur occasionally
   but are not a chronic
   problem.

•  FIB densities change over
   relatively short periods of
   time (time-lag problem).

•  A sufficient amount
   of historical FIB and
   independent data exists.

•  Funding for personnel
   and technical experts is
   available.

•  Monitoring equipment is
   available.

•  Computer equipment and
   software are available.

-------
A more complete discussion of independent variable data along with tips
on how to collect them is provided in step 2. For preliminary purposes,
however, your Beach Team should investigate the FIB density, rainfall data,
and data on factors that affect water movement in and around the beach
(i.e., wind and wave direction and magnitude). Water quality data such as
turbidity and water temperature are also important factors at some beaches,
as are data on near-shore sources of fecal pollution (e.g., birds).

Sources of data include federal agencies (e.g., the National Weather Service
(NWS) and USGS) as well as various state and local agencies. A particularly
valuable resource is beach sanitary surveys, especially if they are conducted
on a daily basis (see http://www.epa.gov/beach-tech/beach-sanitary-surveys
for sanitary surveys developed by EPA). Sanitary surveys provide site-specific
data that match exactly to the time a FIB sample is collected.

If a minimum of three seasons' worth of historical data is not readily
available, then your Beach Team might need to collect more data before
developing the model. Step 2 provides more information on data collection.

Are There Funding and Other Resources Available to Develop,
Operate, Maintain, and Update a Predictive Tool?
The development of a predictive model is just the first phase of an overall
predictive modeling program. Once the model is developed, there are a variety
of costs associated with operating and maintaining it. The majority of local
agencies responsible for beach programs (usually city or county public health
departments) have limited staff time, technical expertise, and funding
available for projects. Consequently, resources and costs must be carefully
planned and budgeted. Major costs to consider include:
   • Personnel and technical experts.
   • Data collection.
   • Monitoring equipment and supplies.
   • Modeling and statistical software.
   • Model evaluation over time.

Personnel and Technical Experts
Your Beach Team needs to have the right combination of staff to develop
and implement a predictive tool. The most important staff will be the
following:
   • Field staff—to conduct sampling and maintain equipment.

   • Modeler/statistician—to analyze data and develop, validate, and refine
     the model.

   • Beach manager—to integrate the model into your beach program and
     conduct public outreach.

It can be helpful to collaborate with others, such as universities, federal
agencies, and state or local governments (see Collaboration with Others
text box). They can be excellent resources, especially when the technical
knowledge of a statistician is required.

Data Collection
In addition to gathering historical data, the team will need to continue to
collect data from the same sources to run the model once it is implemented.
Data collection is discussed in more detail in step 2. If the data source changes
for any of the model's variables or significant alterations occur to the beach
and surrounding area, the model will need to be recalibrated (see step 6).

    Collaboration with Others
    Many partnerships have successfully developed a number of modeling programs. For example, USGS
    has played a major role in modeling efforts in the Great Lakes region. They offer extensive resources,
    expertise, and comprehensive knowledge of watersheds (including beaches) and can provide in-depth
    statistical tools and statisticians to run them. Local universities can be another highly valuable resource.
    Graduate students from ecology, biology, and environmental science and engineering departments
    might be available to assist with water quality monitoring, sampling, and even model development. A
    mutually beneficial partnership might develop as students have the opportunity to apply their research
    to a real-world scenario and it allows for low-cost sampling and monitoring. In addition, universities often
    have their own monitoring equipment, laboratories, and even statistical software that can be shared.

    Some beach programs have models that began as part of graduate theses and dissertations. For example,
    SCDHEC developed its model with the help of a graduate student at the University of South Carolina who
    used it as part of a master's thesis. The CRWA's predictive model was also developed as part of a master's
    thesis by a student at Tufts University. These collaborative efforts proved to be highly advantageous,
    providing a wealth of knowledge and expertise, as well as significant cost savings.

-------

                               Monitoring Equipment and Supplies
                               Even when there is an abundance of data for your beach from external
                               sources, use of monitoring equipment such as data sondes, flow meters, and
                               rain gauges might provide you with more accurate data. The main drawback
                               of using this equipment is that it can be expensive to purchase and maintain,
                               especially when it must be placed in harsh environments and exposed to
                               weather, waves, sand, and vandalism. Sufficient funding resources as well as
                               staff to maintain and repair equipment are necessary. As described in the
                               case studies, both MHD and SCDHEC stopped using data sondes and rain
                               gauges because of their high maintenance costs, but were able to develop
                               successful models using other data sources.

                               Modeling and Statistical Software
                               Some models are simple enough to run in a basic Excel spreadsheet, with
                               no additional software required. Statistical software can be purchased,
                               but it might have licensing costs. EPA developed VB, a free model builder
                               software program (described in more detail in steps 3 and 4), that enables

beach managers and others to develop or update models using statistical
techniques. The software is user-friendly; however, preparing data for input
into the modeling software requires considerable time and expertise. Visit
http://www.epa.gov/exposure-assessment-models/virtual-beach-vb for more
information about VB.

For information on how to manage your data set for use in VB, see step 2. If
the answers to the four questions asked in the introduction to this step are
"Yes", you are in a good position to move forward with the development of
a predictive model. If you determine that a predictive model is not needed
or your beach is not a good candidate for a modeling project, consider
working with your public health officials to alter the current monitoring
program to focus your efforts during times when conditions favor high FIB
densities. If your answer is "No" to question c, you might need to collect
additional data to build a historical database for use in future model
development. If your answer is "No" to question d, consider exploring the
potential for collaborating with others interested in model development (see
Collaboration with Others text box). If that option is not available, there
are other ways to increase the level of public health protection at beaches,
including the use of sanitary surveys and preemptive advisories.

Model Evaluation Over Time
Once your model has been developed, it must be maintained to keep it
running properly and performing as expected. This process is covered in
step 6.

                               Step 2: Identify Variables and Collect Data

                               Introduction to Step 2
                               After your Beach Team has determined that a predictive model is
                               appropriate for a beach, it can proceed to Step 2: identifying candidate
                               independent variables for use in model building and collecting a set of high-
                               quality historical data for those variables and FIB density. Refer to page 16
                               for a list of independent variables.

                               To predict an exceedance of a water quality standard, your Beach Team
                               must first identify environmental conditions that likely affect the levels of
                               bacteria at the beach. In the context of predictive modeling, those conditions
                               are the "independent variables." In this step you are trying to identify and
                               collect data for the independent variables that exhibit the strongest statistical
                               relationship with the dependent variable, FIB density. It is important to
                               keep in mind, however, that a strong statistical association should not be
                               interpreted as reflecting actual causative mechanisms for an observed
                               elevation of FIB densities.  The association is based only on the correlation
                               of past observations of independent variables with FIB  density. When such
                               associations are evident, further scientific investigation can inform beach
                               managers of the nature of the association and improve  their understanding
                               of future occurrences and how much weight to give them.
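
                               A first screening of candidate variables can be sketched in a few lines
                               of code. The example below ranks candidates by the strength of their
                               linear correlation with FIB density. It is only an illustration: the
                               variable names and numbers are made up, and the log10 transform of FIB
                               density is an assumption of this sketch rather than a requirement of this
                               guide. As noted above, a high correlation does not establish a cause.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series shows no linear association
    return sxy / math.sqrt(sxx * syy)


def rank_candidates(candidates, fib_density):
    """Rank candidate variables by |r| against log10(FIB density)."""
    log_fib = [math.log10(v) for v in fib_density]  # assumed transform
    scored = [(name, pearson_r(values, log_fib))
              for name, values in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical paired observations (illustrative numbers only).
fib = [20, 35, 400, 90, 1200, 15]
candidates = {
    "rain_24h_in": [0.0, 0.1, 0.9, 0.3, 1.6, 0.0],
    "water_temp_c": [19.0, 21.0, 20.0, 22.0, 20.5, 21.5],
}
ranking = rank_candidates(candidates, fib)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

                               Ranking by the absolute value of r keeps strongly negative relationships
                               (for example, sunlight) from being overlooked.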

                               Key Attributes of Variable Data Sets
                               For model-building, the variable data sets should possess the following basic
                               characteristics.
                                 An adequate amount of data to develop (training dataset) and validate
                                 (testing dataset) the model. EPA recommends at least 50 observations
                                 for model development, but 100 or more are preferred. There are several
                                 ways of partitioning the available data into training and testing datasets.
                                 One common way is to collect three seasons of data, then designate two
                                 seasons as the training dataset and the third as the testing dataset. You
                                 can construct your model using fewer data, but its predictions are likely
                                 to be less accurate.

                                 High-quality data, including quality assurance documentation. Ideally,
                                 a quality assurance plan exists that describes data collection methods,
                                 protocols, and procedures. Given the particular variable, the plan might
                                 include laboratory methods; field sampling protocols, including metadata
                                 (e.g., sampling time and depth of sample); and data processing procedures.

  Easily collected or obtained data. Because predictive models are often
  run daily, all input data must be obtained quickly. Automatic samplers
  with data transmission capabilities and data that are easily downloaded
  from government agencies' websites (e.g., NWS and USGS) represent
  good data sources. The "ease of collection" will likely eliminate many
  potentially good candidate variables from consideration. In some
  cases, a more easily collected surrogate variable might convey similar
  information; in other cases, your Beach Team will have to abandon
  the variable and look elsewhere. In general, data collected locally are
  preferred over data obtained from external sources. If data from external
  sources are used, it is preferable if the collection methods are subject to
  good QA/QC (e.g., USGS or NWS data).

  Consistent procedures for collecting data for pre- and post-model
  development. Independent variable data have two functions: (1)  they
  are used to develop the predictive model, and (2) they are used as input
  variables to run the model. When you use historical data to build a
  model, you assume that the methods used to collect and report that data
  will remain in place for future model  input data collection. Consistency
  is key. You cannot mix and match data sources for the same variables
  without re-validating the model.

  Temporally relevant independent variable and FIB data. FIB sampling
  at swimming beaches usually occurs early in the morning. Some
  independent variable data  are likely collected at the time of sampling.
  Other data might relate to  conditions that occurred prior to the sample
  collection time, such as cumulative rainfall over the previous 12  hours.
  As you determine your independent variables, you must keep them
  temporally relevant to the sample time. If the sample time was at 8:00 a.m.,
  you need to ensure that your independent variables are also based on 8:00
  a.m. or an earlier time based on knowledge of stream effects and  runoff.
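
  The season-based partitioning described under "An adequate amount of data"
  can be sketched as follows. This is a minimal illustration, not a prescribed
  procedure: the record layout (a dictionary with a "season" key) and the
  function name are assumptions made for the example, and the observations
  are placeholders.

```python
from collections import defaultdict

MIN_TRAINING_OBS = 50  # minimum number of observations EPA recommends


def split_by_season(observations, test_season):
    """Hold out one season for testing and train on all other seasons."""
    by_season = defaultdict(list)
    for obs in observations:
        by_season[obs["season"]].append(obs)
    if test_season not in by_season:
        raise ValueError(f"no observations for season {test_season}")
    training = [obs
                for season, rows in sorted(by_season.items())
                if season != test_season
                for obs in rows]
    testing = list(by_season[test_season])
    return training, testing


# Hypothetical records: 40 sampling days in each of three seasons.
observations = [{"season": year, "day": day, "fib": 100.0}
                for year in (2013, 2014, 2015)
                for day in range(40)]

training, testing = split_by_season(observations, test_season=2015)
enough_data = len(training) >= MIN_TRAINING_OBS
print(len(training), len(testing), enough_data)
```

  Holding out a whole season, rather than random days, tests the model on
  conditions it has not seen, which is closer to how it will be used.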
Quality Assurance and Quality
Control

EPA's National Beach
Guidance (2014) provides
important information and
recommendations concerning
primary data collection to
ensure that all observations,
samples, and measurements
are properly and consistently
collected and processed.
Specifically, the Agency
recommends that a quality
assurance project plan (QAPP)
be developed to ensure that
collected data are complete,
accurate, and suitable for the
intended purpose. Essentially,
the QAPP serves as a blueprint
for collection activities and
quality assurance (QA) and
quality control (QC) procedures.
Also included in the plan should
be detailed  descriptions of
standard operating procedures
and staff training requirements.

Some Basic Bacteria Facts

FIB are very small, immobile
single-celled organisms.
They have to be physically
transported from point to point
by some mechanism. Usually
this mechanism is moving
water.

Life expectancy of individual
cells outside their natural
environment is usually short,
around 2-5 days. Many
stressors can shorten it further.

FIB can survive and even
multiply for some time in
sediments and algal mats.
They can be easily stirred up
and resuspended in overlying
waters.
FIB Density
It is important that FIB density measurements are taken at a consistent
location, depth, and time. Sample collection, handling procedures, and
analytical methods must be consistent as well. This section lists EPA
recommendations concerning FIB sampling and analysis. They are presented
not to encourage immediate changes to data collection procedures (which
would disrupt consistency), but as background to allow better interpretation
of FIB density data that have already been collected.
  • Sample Location. Sample sites should be located where the greatest
    recreational use occurs. Features that might directly affect the
    movement of FIB to and from the beach, such as outfalls and jetties,
    should also be taken into account.

  • Sample Depth. Samples should generally be taken in approximately
    knee- to waist-deep water unless that depth poses a safety risk to the
    sampler (e.g., powerful waves). The sample should be drawn 0.5-1 foot
    below the surface. Samples taken from shallower waters might not
    accurately represent ambient FIB density due to the resuspension of
    bacteria from sediments.

  • Sample Time. Samples taken early in the morning are generally
    considered the best for beach monitoring programs because that is
    the time when FIB densities are usually the highest. The sampling
    time should be consistent day to day because FIB density can change
    fairly quickly in response to increasing sunlight intensity, temperature,
    and other environmental conditions. EPA's National Beach Guidance
    includes a detailed discussion on event-scale, diurnal, and tidal
    variability (USEPA 2014).

  • Sample Frequency. For the purpose of developing a predictive model,
    the more samples the better. Most beach programs sample high-priority
    beaches at least once a week during the swimming season. In general,
    a model will be increasingly robust as more FIB data are collected and
    matched with independent variable data. The Report of the Experts
    Scientific Workshop on Critical Research Needs for the Development of
    New or Revised Recreational Water Quality Criteria recommends
    collecting data four or five times a week, covering a variety of sampling
    events, to capture temporal variability (USEPA 2007). Once a model is
    in use, it can also fill gaps: if FIB sampling occurs only once or twice
    a week due to resource constraints, predictive models can provide
    information for timely notifications on the other days.

  • Sample Processing and Analysis. Your Beach Team should consult the
    state for the proper procedures and QA/QC requirements, including
    holding times, for collecting, handling, and analyzing water samples.
    EPA has approved a number of analytical methods for culture analyses
    of recreational waters (40 CFR part 136). In addition, EPA has validated
    quantitative polymerase chain reaction (qPCR) methods for measuring
    water quality at beaches (see http://www.epa.gov/cwa-methods/other-
    clean-water-act-test-methods-microbiological). Standard Methods
    for the Examination of Water and Wastewater (APHA 1998) is also a
    source of valuable information.

FIB density data are usually reported as colony forming units (CFU) or
most probable number (MPN). CFU is a measurement based on a direct
count of bacteria colonies grown on Petri plates and substrate media from
water samples passed through membrane filters. MPN tests involve multiple
tubes  that are allowed to ferment over time. Probability formulas are applied
to the number of tubes that produce a positive reaction, and a FIB density
estimate is calculated. EPA has approved methods for both types of analyses
and either is acceptable to use in modeling. The key is consistency, however.
If the  type of analysis has  changed for your beach, construct (or reconstruct)
your model using only data generated using the current analytical method.


   Sources of Bacteria
   Human Sources
   Some older cities have combined sewer systems that convey both sanitary sewer wastewater and
   stormwater in one piping system. During periods of significant rainfall, the capacity of the combined sewer
   might be exceeded. When this happens, the excess mixture of sanitary wastewater and stormwater is
   discharged at combined sewer overflow (CSO) points, typically to rivers and streams. During dry weather
   periods, human-derived bacteria usually cause a problem at beaches only if septic systems in the area fail
   or wastewater pipes are compromised or illegally connected to storm drains.

   Animal Sources
   In urban and suburban landscapes, animal-derived bacteria and other pollutants tend to collect on
   impervious surfaces. Sources typically include dogs and cats; waterfowl such as geese, gulls, and ducks; and
   scavenger species such as raccoons, rats, and pigeons. During the beginning of a storm, the initial runoff
   flow will sweep up most of the deposited fecal matter and quickly carry it into the drainage network. Known
   as the "first flush" phenomenon, this flow typically has significantly higher concentrations of bacteria than
   subsequent flows that occur as the storm lingers. In general, the amount of first flush pollutants available for
   transport is a function of the number of dry days since the previous storm. Animal-derived bacteria can also
   be transported from feedlots, barnyards, and other confined-animal facilities located in the drainage area.
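
   The "number of dry days since the previous storm" mentioned above is often turned into an
   antecedent dry days variable. A minimal sketch follows. The function name and the rule that a
   day is "dry" when rainfall is at or below a threshold are assumptions of this example; your
   Beach Team would choose its own threshold.

```python
def antecedent_dry_days(daily_rain, threshold=0.0):
    """Count consecutive dry days immediately before the sample day.

    daily_rain: daily rainfall totals, oldest first; the last entry is
                the sample day itself and is not counted.
    threshold:  a day is treated as dry when its total is <= threshold
                (an assumed rule for this sketch).
    """
    count = 0
    for rain in reversed(daily_rain[:-1]):
        if rain > threshold:
            break  # reached the most recent wet day
        count += 1
    return count


# Hypothetical daily totals in inches; the last value is the sample day.
print(antecedent_dry_days([0.5, 0.0, 0.0, 0.0, 0.2]))
```

   In this made-up series, three dry days separate the last wet day from the sample day.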
qPCR

Newer analytical technologies
have accelerated the timeliness
of laboratory results. One
such method is quantitative
polymerase chain reaction
(qPCR) which quantifies a
targeted genetic sequence
for both viable and nonviable
forms of the indicator bacteria.
Because the method does not
require culturing live bacteria,
analysis can be completed in
less time—within 2-4 hours of
receipt of the sample by the
laboratory. Although both qPCR
and bacteria culture methods
report FIB density, they are
derived using significantly
different methods. These
results should not be combined
when building and operating a
predictive model.

Common Parameters Used in
Models
• Parameters relating to sources
  of FIB at the beach
  -  Beach attendance
  -  Bather counts
  -  Dog counts
  -  Bird counts

• Parameters relating to
  movement of FIB through the
  drainage area
  -  Cumulative rainfall
  -  Antecedent dry days
  -  Stream discharge
  -  Stream stage

• Parameters relating to
  movement of FIB in receiving
  waters
  -  Current speed
  -  Current direction
  -  Current A- and O-components
     (created by VB)
  -  Wind speed
  -  Wind direction
  -  Wind A- and O-components
     (created by VB)
  -  Water level
  -  Barometric pressure

• Parameters relating to the fate
  of FIB at the beach
  -  Solar irradiance
  -  Air temperature
  -  Water temperature
  -  Cloud cover
  -  Dew point
  -  Day of year (ordered number)
  -  Turbidity
  -  Conductivity
  -  Wave height
  -  Wave direction
  -  Wave A- and O-components
  -  Chlorophyll
  -  Dissolved Oxygen
Independent Variables
Independent variables relate directly or indirectly to environmental
conditions. To aid in choosing the best candidate variables, your Beach Team
should become familiar with the likely sources of bacteria that affect the
beach, how they are transported to the beach area, and conditions that tend to
increase or decrease FIB density in the swimming area. A useful way to collect
this information is by using a sanitary survey (see Data Sources text box on
page 21 for more details on sanitary surveys). That information can serve as a
valuable starting point for selecting candidate independent variables.

Independent variables can be roughly categorized into one of four groups:
   •  Variables relating to bacteria movement through the drainage area.
   •  Variables relating to bacteria movement through the receiving water.
   •  Variables relating to the fate of bacteria in the swimming area.
   •  Variables relating to activities and conditions at the beach.

Other good sources of guidance on selecting variables include:
   •  Predictive Tools for Beach Notification. Volume I, Review and Technical
      Protocol (USEPA 2010a).
   •  Predictive Modeling at Beaches. Volume II, Predictive Tools for Beach
      Notification. (USEPA 2010b)
   •  Procedures for Developing Models to Predict Exceedances of
      Recreational Water Quality Standards at Coastal Beaches: U.S. Geological
      Survey Techniques and Methods 6-B5 (Francy and Darner 2006).

Variables Relating to Bacteria Movement through the
Drainage Area
The amount, intensity, and duration of a rain event determine the timing
and amount of runoff and the extent of water movement in the drainage
area. Since runoff functions as the primary transport mechanism for both
human- and animal-derived bacteria, rainfall is usually identified as a very
important independent variable for FIB predictive modeling.

Your Beach Team's analysis of the drainage network and the potential
sources of bacteria within the network should help identify the specific types
of rainfall statistics that might be considered for use in the predictive model.
The most common choice is cumulative rainfall over a specific time period
prior to the FIB sample time (e.g., 6-hour, 24-hour, 48-hour lag).
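
Cumulative rainfall for several windows can be derived from one series of
hourly readings. The sketch below is illustrative only: the function name,
the (timestamp, amount) record layout, and the assumption that each timestamp
marks the end of the hour it measures are choices made for this example, and
the rainfall values are made up.

```python
from datetime import datetime, timedelta


def cumulative_rainfall(hourly_rain, sample_time, hours):
    """Sum rainfall recorded in the `hours` hours ending at sample_time.

    hourly_rain: list of (timestamp, amount) pairs. Each timestamp is
                 assumed to mark the end of the hour it measures.
    """
    start = sample_time - timedelta(hours=hours)
    return sum(amount for stamp, amount in hourly_rain
               if start < stamp <= sample_time)


# Hypothetical record: light rain in the 6 hours before an 8:00 a.m.
# sample, plus a heavier burst 30 hours earlier.
sample_time = datetime(2015, 7, 1, 8, 0)
hourly_rain = [(sample_time - timedelta(hours=h), 0.25 if h < 6 else 0.0)
               for h in range(48)]
hourly_rain.append((sample_time - timedelta(hours=30), 1.0))

lags = {hours: cumulative_rainfall(hourly_rain, sample_time, hours)
        for hours in (6, 24, 48)}
print(lags)
```

Because every window ends at the sample time, the variables stay temporally
relevant to the FIB sample, and no rainfall recorded after sampling is used.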

Further analysis might lead you to create an even better variable by assigning
more importance or "weight" to segments within a chosen time range. For
example, Francy and Darner (2006) created a weighted 3-day rainfall statistic
by assigning the most weight to the rainfall total for the 24 hours
immediately prior to the sample time and progressively lesser weights to the
amounts occurring one and two days before sampling (see equation below).

\[ R_w = (3 \times R_{Day1}) + (2 \times R_{Day2}) + R_{Day3} \]

where:
      R_w = weighted cumulative variable

      R_{Day1} = 24-hour total rainfall/0-hour lag

      R_{Day2} = 24-hour total rainfall/24-hour lag

      R_{Day3} = 24-hour total rainfall/48-hour lag
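
A weighted rainfall variable of this kind is straightforward to compute once
the three 24-hour totals are available. The sketch below is a hedged
illustration: the function name is hypothetical, and the default weights of
3, 2, and 1 are an assumption of this example that should be checked against
Francy and Darner (2006) before use. The only property taken from the text
above is that the most recent day carries the most weight.

```python
def weighted_rainfall(r_day1, r_day2, r_day3, weights=(3.0, 2.0, 1.0)):
    """Weighted 3-day rainfall statistic.

    r_day1: 24-hour total rainfall, 0-hour lag (most recent day)
    r_day2: 24-hour total rainfall, 24-hour lag
    r_day3: 24-hour total rainfall, 48-hour lag
    weights: assumed default weights; the most recent day must carry
             the largest weight.
    """
    w1, w2, w3 = weights
    if not (w1 >= w2 >= w3 >= 0):
        raise ValueError("weights must be non-negative and non-increasing")
    return w1 * r_day1 + w2 * r_day2 + w3 * r_day3


# Hypothetical totals in inches for the three days before sampling.
r_w = weighted_rainfall(0.5, 0.25, 1.0)
print(r_w)
```

Passing the weights as a parameter makes it easy to compare alternative
weightings during model building.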

Rainfall data can be collected locally using a rain gauge, or it can be
obtained from an external source such as the NWS. Some water and
wastewater utilities operate rain gauges near beaches and might be good
sources of data. Locally collected data might correlate better with actual
conditions at the beach site; however, operating and maintaining rain gauges
can be challenging. Data available on the Internet can be easy to download
and use, but might not adequately characterize local conditions.

Since drainage flow is a direct result of rainfall, if one or more streams in
the drainage network have monitoring gauges in place that provide daily
or hourly measurements of discharge
or the height of the water surface (i.e.,
stage or gauge height), those data might
also prove to be valuable as independent
variables.

Variables Relating to Bacteria
Movement through the
Receiving Water
The endpoints of the drainage networks
are typically mouths of streams or
drainage outfall structures that discharge
into a lake, river, estuary, or ocean
(the "receiving waters"). When outfalls
are not located directly on the beach,

                                bacteria contained in the discharge must be transported from the outfall,
                                through the receiving water, and to the beach to cause unhealthy conditions
                                for swimmers. In addition to the lateral movement of bacteria from outfall
                                to beach, bacteria residing in sand or sediments can move vertically into the
                                water column when the sand or sediment is stirred up.

                                Wind, waves, and water currents are usually the three most important
                                independent variables associated with the movement of bacteria through the
                                receiving water. They all can be characterized by direction and magnitude
                                measurements. In general, "continuous variables" (those with numeric
                                values) are preferred over "categorical variables" (those with labels as values),
                                but both forms have been successfully used in predictive models.
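
                                A direction and magnitude pair is often easier to use in a model after
                                it is split into an alongshore part and an onshore part. The sketch below
                                shows one way to do that. The sign convention, the function name, and
                                the "beach facing" bearing are assumptions of this example and are not
                                necessarily the convention used by VB or any other software.

```python
import math


def shore_components(speed, from_deg, beach_facing_deg):
    """Split a wind (or wave) vector into alongshore and onshore parts.

    speed:            magnitude, in any unit
    from_deg:         compass bearing the wind blows FROM, in degrees
    beach_facing_deg: compass bearing looking from the beach out over
                      the water, in degrees (assumed definition)

    Returns (alongshore, onshore). Onshore is positive when the wind
    blows from the water toward the beach (assumed sign convention).
    """
    angle = math.radians(from_deg - beach_facing_deg)
    onshore = speed * math.cos(angle)
    alongshore = speed * math.sin(angle)
    return alongshore, onshore


# Hypothetical east-facing beach (bearing 90 degrees), 10-knot winds.
print(shore_components(10.0, 90.0, 90.0))   # wind straight off the water
print(shore_components(10.0, 270.0, 90.0))  # wind blowing from the land
print(shore_components(10.0, 0.0, 90.0))    # wind parallel to the shore
```

                                Two continuous components avoid the problem that compass bearings wrap
                                around at 360 degrees, which makes raw direction awkward in a regression.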

                                Routine sanitary surveys tend to collect either (1) very simple discrete
                                measurements (continuous variable) of wind speed and direction, current
                                speed and direction, and wave height, or (2) categorical descriptions of wind
                                and wave attributes. The Beaufort Wind Scale, developed in 1805 by Sir
                                Francis Beaufort, U.K. Royal Navy, is an example of a categorical approach
                                to measuring wind and waves (Table 1).
    Table 1. Beaufort Wind Scale

    Wind (Knots)   Classification    On the Water
    ------------   ---------------   ----------------------------------------------------
    Less than 1    Calm              Sea surface smooth and mirror-like
    1-3            Light Air         Scaly ripples, no foam crests
    4-6            Light Breeze      Small wavelets, crests glassy, no breaking
    7-10           Gentle Breeze     Large wavelets, crests begin to break, scattered
                                     whitecaps
    11-16          Moderate Breeze   Small waves 1-4 ft. becoming longer, numerous
                                     whitecaps
    17-21          Fresh Breeze      Moderate waves 4-8 ft taking longer form, many
                                     whitecaps, some spray
    22-27          Strong Breeze     Larger waves 8-13 ft, whitecaps common, more spray
    28-33          Near Gale         Sea heaps up, waves 13-19 ft, white foam streaks off
                                     breakers
    34-40          Gale              Moderately high (18-25 ft) waves of greater length,
                                     edges of crests begin to break into spindrift, foam
                                     blown in streaks
    41-47          Strong Gale       High waves (23-32 ft), sea begins to roll, dense
                                     streaks of foam, spray may reduce visibility
    48-55          Storm             Very high waves (29-41 ft) with overhanging crests,
                                     sea white with densely blown foam, heavy rolling,
                                     lowered visibility
    56-63          Violent Storm     Exceptionally high (37-52 ft) waves, foam patches
                                     cover sea, visibility more reduced
    64+            Hurricane         Air filled with foam, waves over 45 ft, sea
                                     completely white with driving spray, visibility
                                     greatly reduced

Automated collection of wind, wave, and current data offers several
advantages over manual collection: (1) the data are more easily obtained,
(2) it eliminates the subjectivity associated with manual measurements, and
(3) assuming data are recorded continuously, it allows for the construction
of antecedent variables (e.g., average wind speed over the previous 24 hours).
The most convenient source of wind, wave, and current data is the National
Data Buoy Center. This agency is part of the NWS and maintains a network
of 90 buoys and 60 coastal stations that collect hourly data on wind speed
and direction and wave height. Some also collect data on currents.
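
An antecedent variable such as average wind speed over the previous 24 hours
can be built from hourly records as sketched below. The function name and
the (timestamp, value) record layout are assumptions of this example, and
the wind speeds are made up. A plain average is appropriate for speed;
averaging compass directions this way would not be.

```python
from datetime import datetime, timedelta


def antecedent_mean(records, sample_time, hours=24):
    """Mean of readings taken in the `hours` hours before sample_time.

    records: list of (timestamp, value) pairs, e.g., hourly wind speed.
    Returns None when no readings fall inside the window.
    """
    start = sample_time - timedelta(hours=hours)
    window = [value for stamp, value in records
              if start <= stamp < sample_time]
    if not window:
        return None
    return sum(window) / len(window)


# Hypothetical hourly wind speeds (knots): 10 during the most recent
# 24 hours before an 8:00 a.m. sample, 20 during the day before that.
sample_time = datetime(2015, 7, 1, 8, 0)
records = [(sample_time - timedelta(hours=h), 10.0 if h <= 24 else 20.0)
           for h in range(1, 49)]

mean_24h = antecedent_mean(records, sample_time, hours=24)
mean_48h = antecedent_mean(records, sample_time, hours=48)
print(mean_24h, mean_48h)
```

Returning None for an empty window makes gaps in the buoy record visible
instead of silently reporting a zero.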

Tides also create currents that can affect
FIB density in beach areas. Incoming tides
usually tend to keep FIB in residence at some
beaches, while outgoing tides can serve to
flush them away. Although very site-specific,
the tidal cycle might be an important
independent variable at some ocean beaches.

Man-made structures such as jetties, groins,
piers, breakwaters, and seawalls can affect
FIB movement through the receiving water at
some beaches. Those structures can enclose
most or part of a beach, preventing water
circulation between the beach and open
water. Several studies have reported higher
densities of FIB in those situations because
of the retention of bacteria from lack of
flushing.

Variables Relating to the Fate of
Bacteria in the Swimming Area
Bacteria residing in the receiving water,
including in the swimming area, are subject
to many conditions that can increase or
decrease their presence in the water column.
One of the more important stressors of
bacteria is sunlight—specifically, ultraviolet
(UV) light. Exposure causes bacteria to die
off, which is why FIB densities are usually
found to be greater in the early morning
before the sun rises higher in the sky.

Waterfowl as a Pollution
Source

Gulls and other waterfowl
are often a source of fecal
contamination at beaches,
particularly in the Great Lakes.
Hansen et al. (2011) concluded
that waterfowl, including
Canada geese, ring-billed
gulls, and mallard ducks were
the primary source of E. coli
contamination at beaches
near Duluth, Minnesota, and
Superior, Wisconsin. Chicago
and Racine have also correlated
gull populations at their beaches
to FIB densities in beach water
samples (Converse et al. 2012;
Whitman and Nevers 2003;
Hartmann et al. 2013). Chicago
has reduced the numbers
of gulls at its beaches by
managing their nests.
Turbidity is a common measurement and often found to be an important
independent variable for predictive modeling. It is essentially the cloudiness
of the water as defined by a measurement of scattered light. Turbidity is
generally caused by a combination of suspended solids, colloidal matter,
and algae. Cloud cover also affects the amount of light penetrating the water
column and is sometimes used as an inverse surrogate for UV light. Staining
of the water by tannins also affects light penetration.

Light penetration is not the only reason turbidity is valuable as
an independent variable. Perhaps even more importantly, stormwater
runoff carries with it a load of suspended solids, silt, and other material.
Consequently, outfall discharge during and following storms is usually more
turbid than the receiving water. Thus, turbid water moves in tandem with
outfall bacteria. Other parameters associated with stormwater runoff—such
as total suspended solids, salinity, and conductivity—can also serve as
independent variables.

Suspended solids can play a role in removing FIB from the water column
via sedimentation. Individual bacteria cells are very small (some are only a
micron in length) and easily remain suspended in water. But they can also
be adsorbed on sediment particles and, in doing so, increase their weight
and their chances of settling to the bottom. Once in the sediments, however,
they can remain viable and be resuspended in the water column by any
number of turbulent forces, including waves and even swimmer activity.

Variables Relating to Activities and Conditions at the Beach
The variables described in the previous subsection are related to sources of
FIB that (1) originate in the drainage area and are subsequently  transported

through the drainage network and receiving waters to the beach, and
(2) have settled from the water column to the sediments. At some beaches,
however, significant sources of bacteria found in or immediately adjacent
to beaches can cause high FIB densities within the swimming area.
For example, resident populations of gulls and Canada geese have been
identified as important contributors to bacteria loads at some beaches.

Table 2 includes a list of the independent variables included in the final
models for the five case studies. The variables were useful in making timely
beach notification decisions. Models must be developed on a beach-specific
basis using site-specific data, as shown by the variety of independent
variables used in the case studies as well as the number of variables used in
similar models (Francy et al. 2013b).

Table 2. Independent variables used in final statistical models from case studies.

 Location               Beaches                  Independent Variables Used in Final Model
 --------------------   ----------------------   -----------------------------------------
 Chicago Parks          Montrose Beach,          6-hour rainfall, 4-hour wave period,
 District               Oak Street Beach,        6-hour solar radiation, 48-hour rainfall,
                        Foster Beach,            6-hour longshore wind, onshore wind,
                        63rd Street Beach,       turbidity
                        and Calumet Beach

 Charles River          Lower Charles            Rainfall volume, river flow, and wind
 Watershed              River Basin from
 Association            Watertown to
                        Boston Harbor

 Milwaukee,             Bradford Beach,          24-hour rainfall, previous 24-hour E. coli
 Wisconsin              McKinley Beach,          sampling, pH, conductivity, wave height,
                        and South Shore          water temperature
                        Beach

 Horry County,          Grand Strand             Cumulative rainfall, rainfall intensity,
 South Carolina         beaches                  preceding dry days, weather (e.g., wind
                                                 speed), tides and lunar phase data,
                                                 current and salinity

 Racine,                North Beach,             Water temperature, air temperature,
 Wisconsin              Zoo Beach                seagull counts, dog counts, wildlife
                                                 counts, wave height and intensity, water
                                                 clarity, sky conditions, color changes,
                                                 odor, algae amount, algae type, bather
                                                 load (in, out, and total), long shore
                                                 current and components, wind direction
                                                 and speed, stream discharge, pollution
                                                 discharge, rainfall (24-, 48-, and 72-
                                                 hour), day of year, season, lake levels,
                                                 and previous day's E. coli values
                                                                                      21
Sand and Algal Mats

Sand in the wave-washed zone
of a beach can be a potential
source of fecal contamination
(Alm et al. 2003). Beach sand
can support large densities
of FIB for prolonged periods,
independent of lake, human,
or animal input (Whitman et al.
2014).

Other research has examined
the presence of FIB in algal
mats along beaches. Whitman
et al. (2003) found that
Cladophora can  provide a
secondary habitat for FIB that
could potentially impact water
quality in affected Great Lakes
swimming areas.

                  Data Sources

                  National Oceanic and Atmospheric Administration (NOAA)/NWS Weather
                  Station Data
                      • NWS airport weather data (e.g., rainfall, temperature, cloud cover, wind speed and
                        direction) are frequently available and easily downloaded.

                      • NOAA maintains a network of buoys, tidal stations, and satellite measurements that
                        provide data on tides, currents, wind, cloud cover, and other marine characteristics
                        (http://tidesandcurrents.noaa.gov).

                      • Additional water quality data are available from NWS (e.g., forecast maps, radar,
                        river/lake levels, rainfall, air quality, and past weather) (http://www.weather.gov).

                  USGS
                      • USGS provides continuous real-time water quality data, including streamflow,
                        water temperature, conductivity, pH, dissolved oxygen, turbidity, and runoff
                        (http://water.usgs.gov/data).

                      • USGS supports the National Water Information System (NWIS), which includes data
                        from more than 1.5 million sites, some in operation for more than 100 years
                        (http://waterdata.usgs.gov/nwis).

                  Sanitary Surveys
                  Sanitary surveys are an excellent source of information on site characteristics that
                  can support the development of predictive models. The surveys provide detailed
                  environmental data, including the following observational variables that could be
                  translated into predictive variables for a model:
                      • Number of swimmers/bathers.
                      • Boat traffic.
                      • Wildlife and domestic animals.

                      • Debris and litter.
                      • Presence of algae.
                      • Infrastructure (e.g., parking lots, storm drains, WWTPs).

                  EPA has developed beach sanitary survey tools—one each for marine and Great Lakes
                  beaches—to help beach managers evaluate all contributing beach and watershed
                  information, including water quality data, pollution source data, and land use data
                  (http://www.epa.gov/beach-tech/beach-sanitary-surveys).

Step 3: Perform Exploratory Data Analysis

After selecting and collecting high-quality FIB data and independent
variable data sets, your Beach Team is ready to proceed to exploratory data
analysis (EDA).

Introduction to Step 3
The primary purpose of EDA is to explore the relationships between the
independent and FIB density variables and identify the best candidate
variables for model development. Another purpose is to assess two
fundamental assumptions of the statistical models described in this
guidance: (1) the data sets represent the normal range of conditions that are
expected in the future, and (2) the FIB density and independent variables
are linearly related. Your Beach Team should consider working with a
statistician who can provide statistical expertise during EDA.

The EDA work is valuable because it adds to your Beach Team's depth of
knowledge about relationships between FIB density and the various drainage
area, receiving water, and fate independent variables. This knowledge is
crucial for integrating predictive modeling into an overall beach program.
The purpose of this section
is to provide an overview of
the approach to exploratory
analysis. It does not attempt to
provide a thorough discussion
of techniques or evaluations.
Further information can be found at
http://www3.epa.gov/caddis/da_exploratory_0.html.

                                  Virtual Beach Software
                                  Your Beach Team will need to use specialized computer software for many of
                                  the data processing and EDA tasks described in this step, as well as for model
                                  development and testing activities described in the next step. EPA's Virtual
                                  Beach (VB) software package is specifically designed for constructing site-
                                  specific FIB prediction models at freshwater and marine beaches. Created
                                  for use by beach managers and researchers, VB includes a variety of EDA
                                  techniques, including the basic ones described in this section.

                                  Although many free and proprietary statistical packages that include EDA
                                  programs are available, VB allows predictive beach modelers to seamlessly
                                  integrate  all the necessary components for preparing and analyzing data
                                  and building and testing models. VB also includes an integrated mapping
                                  component to determine geographic orientation of the  beach and assists
                                  the user in compiling wind/current speed and direction in along-shore and
                                  onshore/offshore components. Your Beach Team should ensure they have
                                  staff with appropriate skills as VB does not replace the  need to work with
                                  someone  knowledgeable in data management and analysis.

                                  For more information about VB and its capabilities, including how to
                                  download a free copy, visit
                                  http://www.epa.gov/exposure-assessment-models/virtual-beach-vb.
                                  You can also visit
                                  http://www.seagrant.wisc.edu/home/Default.aspx?tabid=646#Training
                                  for predictive modeling workshop presentations, webinars on accessing
                                  online data, and step-by-step modules on VB.
[Screenshot: EPA's Exposure Assessment Models web page for Virtual Beach (VB).]

         Data Management
         The management of data is an important
         part of the model development process.
         Before data can be uploaded to VB or other
         modeling software, it must be manipulated
         and formatted properly. This can be a fairly
         complex and time-consuming process and
         enlisting the help of data processing experts
         is often necessary.

         It is important to keep in mind that each
         measurement in an independent variable
         data set must pair with one, and only one,
         FIB density measurement. Some beaches
         collect multiple samples at about the same
time and record all of them in a database. In that case, you can take the
geometric mean and use that as your data point.

As with any data-driven analysis, variable data must be checked carefully
and identified errors or anomalies corrected before they are entered into
any analytical software, including VB. Some basic things to watch out for
include missing data, improperly recorded information, invalid data cells,
and other potential formatting problems. Data formatting and structure
must meet all of the input standards and requirements of the software. For
example, empty data cells are not permitted in VB. In such cases, you need
to either identify and replace these values or delete the observation from the
data set.

VB includes a component that assists users in the input data-check process.
It can go through a spreadsheet cell by cell looking for blanks as well as
non-numeric or user-specified values. If a bad cell or value is identified, the
user is presented with an opportunity to fix it.
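
The kind of cell-by-cell check described above can be sketched outside VB as well. The following Python example is only an illustration, not VB's own routine: the function name, the column names, and the list of flagged values are all hypothetical, and it assumes every column is meant to be numeric.

```python
def find_bad_cells(rows, flagged_values=("", "NA", "-999")):
    """Return (row_index, column, value) for each blank, flagged, or non-numeric cell."""
    bad = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            text = str(value).strip()
            if text in flagged_values:        # blank or user-specified value
                bad.append((i, column, value))
                continue
            try:
                float(text)                   # numeric cells pass this check
            except ValueError:
                bad.append((i, column, value))
    return bad

# Made-up input rows; column names are assumptions for this sketch.
rows = [
    {"ecoli_cfu_100ml": "120", "rain_24h_in": "0.4"},
    {"ecoli_cfu_100ml": "", "rain_24h_in": "0.0"},
    {"ecoli_cfu_100ml": "88", "rain_24h_in": "trace"},
]
bad_cells = find_bad_cells(rows)
```

Each flagged cell can then be corrected by hand, or the whole observation can be dropped.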

Other data checks can include:

  Linking FIB observations with independent variable data. A  key
  challenge in developing the input data sheet for VB is selecting only  those
  data temporally linked to the FIB observations. The challenge is further
  complicated if you are also creating antecedent variables from those data.
  There are several methods for accomplishing this data manipulation task,
  both within and outside of the VB-input file. Mednick (2009) describes
  a system  for joining various data tables into one master table using
  Microsoft's Access database software.
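
As a rough illustration of temporal linking (not the Access-based system Mednick describes), the Python sketch below pairs each FIB sample with the rainfall recorded in the 24 hours before it. The function name, field names, and all data values are made up for the example.

```python
from datetime import datetime, timedelta

def antecedent_rainfall(sample_time, rain_records, hours=24):
    """Sum rainfall recorded in the `hours` leading up to the sample time."""
    start = sample_time - timedelta(hours=hours)
    return sum(amount for when, amount in rain_records
               if start < when <= sample_time)

# (timestamp, hourly rainfall in inches); hypothetical values
rain_records = [
    (datetime(2015, 7, 1, 6), 0.20),   # more than 24 hours before the sample
    (datetime(2015, 7, 1, 20), 0.10),
    (datetime(2015, 7, 2, 5), 0.30),
    (datetime(2015, 7, 2, 12), 0.50),  # after the sample, so it is excluded
]
# (sample time, E. coli in CFU/100 mL); hypothetical values
samples = [(datetime(2015, 7, 2, 8), 410.0)]

# One row per FIB observation, each paired with exactly one rainfall value.
master_table = [
    {"sample_time": t, "ecoli": fib,
     "rain_24h_in": round(antecedent_rainfall(t, rain_records), 2)}
    for t, fib in samples
]
```

Changing `hours` to 48 or 72 would produce the longer antecedent variables in the same way.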

  Numerical conversion of categorical variables. VB requires that all
  categorical variable labels be given a numerical designation. Ordinal
  variables can be simply converted to a continuous-like numerical variable.
  For example, turbidity values can be translated as Clear = 1, Slightly
  Turbid = 2, Turbid = 3, and Opaque = 4. Of course, even though they
  appear as numbers, they are still categorical values and, therefore, most
  summary statistics (e.g., mean values) are not applicable. VB provides
  an opportunity for the user to flag categorical variables to prevent the
  creation of inappropriate summary statistics and  variable transformations
  (e.g., natural log or square root variable).
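
The turbidity example above amounts to a simple lookup table. A minimal Python sketch, with a hypothetical function name, might look like this; it raises an error on unexpected labels rather than silently guessing a code.

```python
# Codes taken from the turbidity example in the text.
TURBIDITY_CODES = {"Clear": 1, "Slightly Turbid": 2, "Turbid": 3, "Opaque": 4}

def encode_ordinal(labels, codes):
    """Map text labels to numeric codes; fail loudly on any unknown label."""
    unknown = sorted(set(labels) - set(codes))
    if unknown:
        raise ValueError(f"Unmapped labels: {unknown}")
    return [codes[label] for label in labels]

observed = ["Clear", "Turbid", "Opaque", "Slightly Turbid"]
encoded = encode_ordinal(observed, TURBIDITY_CODES)
```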

  Data entry errors. Your Beach Team should put in place data
  management QA oversight and QC procedures that include the transfer
  and manipulation of data such as in the VB input data sheet. After data
  have been transferred to the data sheet, review for errors and anomalies in
  the data sets. Excel 2013 and later versions include a Quick Analysis tool
  that allows you to select data and instantly create statistics and charts that
  will help identify problems.

NOAA and USGS have developed tools to help automate the process of
downloading data from online sources and compiling them into a single
data sheet.

• NOAA-PROCESSNOAA. Accesses, compiles, and processes wind speed and
  direction (instantaneous and previous 24 hours) and rainfall totals for
  24-hour windows with lag times of 1, 2, and 3 days. It also has the ability
  to display data graphically and weight rainfall variables. To access the
  tool, visit
  http://pubs.usgs.gov/sir/2013/5166/pdf/sir2013-5166_appendix2.pdf.

• USGS-Environmental Data Discovery and Transformation (EnDDaT).
  Accesses, compiles, and processes data from a variety of data sources,
  including NWS, National Data Buoy Center (NDBC), and NWIS. EnDDaT
  can be used to compile historical data in a single worksheet for model
  development and to create real-time datasets for direct import to VB for
  model operation. To access, visit http://cida.usgs.gov/enddat/.

Placeholders for unmeasured values. Some data sets, especially those
downloaded from online data sources, use numeric placeholders to
indicate unmeasured data (e.g., 999). You need to identify and replace
these values or delete the cells. Empty data cells are not permitted in
many model building programs, including VB.
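
One way to handle placeholders is sketched below in Python. The sentinel value 999 comes from the example above; the column names and function name are hypothetical, and the sentinels are listed per column because a value such as 999 could be a legitimate measurement in some columns.

```python
def drop_placeholder_rows(rows, sentinels):
    """Drop any observation containing a placeholder value.

    `sentinels` maps a column name to the set of placeholder values
    used in that column (check each data source's documentation).
    """
    return [
        row for row in rows
        if not any(row.get(col) in values for col, values in sentinels.items())
    ]

# Made-up observations with 999 standing in for unmeasured data.
rows = [
    {"wind_speed_mph": 12.0, "water_temp_c": 19.5},
    {"wind_speed_mph": 999.0, "water_temp_c": 20.1},
    {"wind_speed_mph": 8.0, "water_temp_c": 999.0},
]
cleaned = drop_placeholder_rows(
    rows, {"wind_speed_mph": {999.0}, "water_temp_c": {999.0}}
)
```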

                       Unit errors. Most numeric data are reported
                       in unit measurements. The units can vary and
                       must be converted to common units for model
                       development. The most common conversions
                       involve converting data from English to metric
                       units, or vice versa. For modeling purposes, the
                       unit chosen is not as important as consistently
                       using the same units. Unit information should be
                       included in the column title.
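
A small Python sketch of unit conversion follows. The two conversions shown (inches to millimeters, Fahrenheit to Celsius) are just examples, and the column names are hypothetical; note that the unit is carried in each column title, as recommended above.

```python
MM_PER_INCH = 25.4

def inches_to_mm(inches):
    return inches * MM_PER_INCH

def fahrenheit_to_celsius(deg_f):
    return (deg_f - 32.0) * 5.0 / 9.0

# Made-up observation in English units.
row = {"rain_24h_in": 0.5, "water_temp_f": 68.0}

# Converted observation; the column titles now name the metric units.
converted = {
    "rain_24h_mm": inches_to_mm(row["rain_24h_in"]),
    "water_temp_c": fahrenheit_to_celsius(row["water_temp_f"]),
}
```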
                       Date/time errors. Some data sets downloaded
                       from online data sources list date and time by the
                       numerical day of the year (DOY, 1-366) and/or
                       Coordinated Universal Time (UTC). Some data
                       sources use "Zulu Time" or "Greenwich mean
                       time." These data need to be converted into the
                       same time zone and the date/time format selected
                       for use in the input data sheet. A UTC conversion
                       tool can be found at
                       http://www.noaanews.noaa.gov/hurricanes/zulu-utc.html.
                       A DOY conversion tool can be found at
                       http://www.ngs.noaa.gov/GRD/GPS/DOC/doy/doy.html.
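
Both conversions can also be scripted. The Python sketch below uses only the standard library; the function names are hypothetical, and the fixed UTC offset of -5 hours is an assumed example that does not account for daylight saving time changes.

```python
from datetime import datetime, timedelta, timezone

def doy_to_date(year, doy):
    """Convert a numerical day of the year (1-366) to a calendar date."""
    return (datetime(year, 1, 1) + timedelta(days=doy - 1)).date()

def utc_to_fixed_offset(utc_naive, offset_hours):
    """Shift a UTC timestamp to a fixed-offset local time zone."""
    aware = utc_naive.replace(tzinfo=timezone.utc)
    return aware.astimezone(timezone(timedelta(hours=offset_hours)))

sample_date = doy_to_date(2015, 182)
local_time = utc_to_fixed_offset(datetime(2015, 7, 1, 14, 0), -5)
```

Whichever time zone is chosen, the same one should be applied to every data set in the input data sheet.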

                       FIB data inconsistencies. A special case is often
                       noted with FIB density data. Because laboratories
                       have minimum density detection limits for FIB,
                       data sets will sometimes have category-type
                       entries mixed in with numerical entries (e.g.,
                       < 10 CFU/100 milliliters (mL)). In this case, your
                       Beach Team must decide how to handle the
                       "below detection limit" entries so the variable is
  continuous. Typically, one of three options is chosen: (1) use the detection
  limit value, (2) use one-half the detection limit value, or (3) use zero
  as the value. Too many detection limit substitutions, however, might
  compromise the integrity of the FIB density data set.
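
The three substitution options can be expressed as a small helper. This Python sketch assumes censored entries are recorded as text beginning with "<" (for example "<10"); the function name and option labels are hypothetical.

```python
def censored_to_numeric(entry, option="half"):
    """Convert an entry such as '<10' to a number using one of the three options."""
    text = str(entry).strip()
    if not text.startswith("<"):
        return float(text)            # ordinary numeric entry
    limit = float(text[1:])           # detection limit after the '<' sign
    if option == "limit":
        return limit                  # option 1: the detection limit value
    if option == "half":
        return limit / 2.0            # option 2: one-half the detection limit
    if option == "zero":
        return 0.0                    # option 3: zero
    raise ValueError(f"Unknown option: {option}")

raw = ["<10", "235", "< 10", "1200"]
numeric = [censored_to_numeric(v, option="half") for v in raw]
```

Keep in mind that a value of zero cannot be log-transformed, which matters if the FIB data will later be converted to a logarithmic scale.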

  Multiple stations or sites. Some beaches have one sample site, others
  use multiple sites. For modeling purposes, however, you need just one
  density measurement to represent the beach as a whole for a sampling
  event. In the case of beaches with multiple sites, some sampling schemes
  are designed to produce a composite sample composed of subsamples
  taken from each of the stations at approximately the same time. In
  that case, you would use the composite sample measurement as your
  FIB observation. Other programs process multiple station samples
  individually, resulting in multiple FIB data points for an event. A
  common approach in that case is to calculate the geometric mean of the
  samples and use that as your FIB observation. Occasionally, you might
  come across duplicate samples taken from the same station for QA or
  other purposes. In that case, using the average of the two samples would
  be appropriate, or a more conservative approach would be to use the
  highest value as your observation.
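
The geometric mean mentioned above is the exponential of the average of the logged values. A minimal Python sketch, with made-up sample values, is:

```python
import math

def geometric_mean(values):
    """Geometric mean of positive values, computed through logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all > 0")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Three stations sampled at about the same time (hypothetical CFU/100 mL values).
station_samples = [10.0, 100.0, 1000.0]
beach_value = geometric_mean(station_samples)

# Duplicate QA samples from one station: the more conservative choice is the maximum.
conservative_value = max([120.0, 150.0])
```

For the three values shown, the geometric mean is 100, well below the arithmetic mean of 370, because it is less influenced by the single high result.
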
Characterize the FIB and Independent Variable Data Sets
EDA usually begins with an examination of the distribution of each of the
data sets. If the "most ideal normal condition" is assumed to be the center
of the data distribution (signal), the spread of data from the center (noise)
should be examined and at least informal inferences made about the range
of environmental circumstances and conditions that produced the variation.

Box Plots
Box plots are an effective way to summarize data distributions. An example
of a box plot is presented in Figure 3. You can generate box plots in VB as
well as other statistical software. Note that the box itself is plotted along the
Y-axis, and the bottom and top of the box represent the lower and upper
quartiles of the ordered data set (25th and 75th percentiles, respectively). The
median is calculated and displayed as a horizontal line inside the box. The
difference between the quartiles is called the "interquartile range." Vertical
lines (whiskers) extend from the quartile lines to represent data above and
below the interquartile range. Traditionally, the box plot's whiskers terminate
with a short horizontal line that represents the highest and lowest data
points of the distribution.
[Figure 3 image: box plot diagram labeling the outlier, largest non-outlier,
75th percentile, median, 25th percentile, interquartile range, and smallest
non-outlier.]
                                 Figure 3. Box plot attributes.

                                 By visually inspecting the box plots, your Beach Team can observe:
                                   •  Outliers—Extreme values in the data set that should be investigated.

                                   •  Median—The central tendency of the data.

                                   •  Spread—The variability in the data set in relationship to the median.
                                     Smaller spreads are generally better for modeling than larger spreads.
                                     The interquartile range is an indicator of spread of the middle half of
                                     the data set.

                                   •  Symmetry and Skewness—The variability of the data set on either side
                                     of the median. A symmetric data set shows the median in the middle
                                     of the box. A skewed data set displays the median closer to one edge
                                     of the box, indicating that the spread is greater for those data on the
                                     other side of the median line. If the data are skewed with outliers, the
                                     interquartile range is often a better measure of variability than the
                                     standard deviation because it is not inflated by extreme values.

                                 Your Beach Team might find that some data sets are difficult to plot and
                                 characterize because the data range over several orders of magnitude. FIB
                                 densities, in particular, often range from very low densities (< 10 CFU per
                                 100 mL) to very high densities (> 10,000 CFU per 100 mL). Data ranges
                                 such as these require that data be transformed to induce symmetry in the
                                 distribution and to make it easier to graph, observe, and interpret results.
                                 The logarithm is the favored method used for this purpose.
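
To illustrate the effect of the transformation, the Python sketch below takes base 10 logarithms of FIB densities spanning several orders of magnitude and then computes the quartiles that a box plot would display. The function name and data are hypothetical, and the "inclusive" quartile method is an assumption; statistical packages differ slightly in how they interpolate quartiles.

```python
import math
import statistics

def log10_summary(densities):
    """Return (q1, median, q3, iqr) of the log10-transformed densities."""
    logs = [math.log10(d) for d in densities]   # every density must be > 0
    q1, median, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    return q1, median, q3, q3 - q1

# Made-up densities in CFU per 100 mL, covering four orders of magnitude.
summary = log10_summary([10, 100, 1_000, 10_000, 100_000])
```

On the log scale these five values are evenly spaced (1 through 5), so the median sits in the middle of the box instead of being crowded against the lower edge.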

As mentioned earlier in step 3, categorical variables have values that
function as labels rather than numbers. Therefore, they are not an ordered
data set and cannot be box-plotted in the same manner as continuous data.
FIB density data, however, can be box-plotted by variable categories. The
resulting plots will indicate how the different categories  of the independent
variable individually influence FIB levels (Figure 4).
[Figure 4 image: box plots of E. coli density (logarithmic y-axis, 100 to
10,000) for each wind direction (North, East, South, West), with the
notification threshold value (235 CFU/100 mL) and outliers marked.]
Figure 4. Box plots of E. coli density sorted by wind direction.

Outliers
An "outlier" is a data point located outside of the overall pattern of a
distribution of other data points. Sometimes outliers are a result of a
faulty measurement or a data entry error. In other cases, the data might
be correctly measured, but the measurement or sampling occurred under
unusual circumstances or conditions. This could be especially significant
at beaches with infrequent but predictable exceedances, such as after a
heavy rain event. In still other cases, the outlier is a legitimate data point
and, while uncommon, might be considered within the normal range of
conditions. Because of this uncertainty, your Beach Team should always try
to identify the reason for or cause of an outlier.

Legitimate outliers can be displayed in the box plot as data points that
extend beyond a reformulated minimum or maximum line. Basically, the
quartiles are constructed as usual, but (invisible) "fences" are added at
the tails of the distribution. These fences mark the boundaries of what is and
is not an outlier. Each fence is usually placed 1.5 times the interquartile
range beyond the nearest quartile. Some analysts like to further categorize
outliers as either mild or extreme. To do this, the analyst calculates an outer
fence beyond the initial (now inner) fence, placed 3.0 times the interquartile
range beyond the quartile. Any data point that lies between the inner and outer
fences is designated as a mild outlier, and any point beyond the outer fence is
considered an extreme outlier.
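
The fence calculation can be sketched in a few lines of Python. This example assumes the common convention of measuring the fences outward from the lower and upper quartiles, and it uses the "inclusive" quartile method, which may differ slightly from the method in VB or other packages. The function name and data (log10 FIB densities) are made up.

```python
import statistics

def classify_outliers(values):
    """Return (value, label) pairs for mild and extreme outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # inner fences
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr   # outer fences
    flagged = []
    for v in values:
        if v < outer_low or v > outer_high:
            flagged.append((v, "extreme"))
        elif v < inner_low or v > inner_high:
            flagged.append((v, "mild"))
    return flagged

log_densities = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 4.5, 6.0]
outliers = classify_outliers(log_densities)
```

Flagging a point this way only identifies it for follow-up; the team still needs to determine why it occurred before deciding whether to keep it.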

                                Comparing Data Distributions among Variable Subsets
                                As mentioned in the introduction to this step, a fundamental assumption
                                of predictive models is that the data used to build the models represent
                                normal conditions that are expected to extend into the future. One way to
                                confirm this assumption, at least for the collected data, is by constructing
                                a time-series plot (Figure 5). If data levels seem to change in certain time
                                periods, your Beach Team might also want to prepare box plot presentations
                                for temporal subsets of the data set to better analyze year-to-year
                                variations and/or season-to-season data variations. By making side-by-side
                                comparisons of box plots, the team might note a significant shift of one
                                subset compared to the others.
[Figure 5 image: box plots of E. coli density (logarithmic y-axis, 100 to
10,000) for each year from 2012 to 2015, with the notification threshold
value (235 CFU/100 mL) and outliers marked.]
                                Figure 5. Comparison of E. coli density over a four-year period.

                                If your Beach Team notes a significant shift of one data subset compared
                                to the others, it should investigate why this is occurring. In some cases,
                                this exercise could lead to the development of different predictive models
                                for spring and summer seasons or even the incorporation of a "time of

-------
Six Key Steps for Developing and Using Predictive Tools at Your Beach
Step 3: Perform Exploratory Data Analysis
season" variable into the predictive model. In other cases, examining subset
distributions might indicate that one entire season's data might be suspect
because of the use of different sample collection protocols or equipment or
because an important change in environmental conditions occurred that
created a new normal for that time period.

Examine the Relationship between FIB and Independent Variables
Once your Beach Team is familiar with the data sets, outliers are explained,
and bad data have been removed, the team can begin examining the
relationship between FIB concentrations and independent variables. The
main purpose of this exercise is to document linear correlations between
FIB density and independent variables—another key assumption of
statistical predictive models.

Scatterplots
The "scatterplot" is a graphical technique that portrays the one-to-
one relationship between a dependent variable (FIB densities) and an
independent variable. A clustering of data points in a nonrandom pattern
along an imaginary line indicates that a linear relationship exists. The
strength of the linear association is measured by Pearson's correlation
coefficient (r). Its value can range from -1.0 to 1.0, where -1.0 is a perfect
inverse correlation, 0.0 is no correlation, and 1.0 is a perfect correlation. The
closer the absolute value is to 1.0, the stronger the association is between the
two variables.
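
For reference, Pearson's r can be computed directly from paired observations. The Python sketch below uses only the standard library; the function name and the rainfall and log10 E. coli values are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return cov / math.sqrt(var_x * var_y)

rainfall_in = [0.0, 0.5, 1.0, 1.5, 2.0]
log_ecoli = [1.0, 1.5, 2.0, 2.5, 3.0]      # rises in step with rainfall
r = pearson_r(rainfall_in, log_ecoli)
```

Because the made-up E. coli values rise by a constant amount for each increment of rainfall, r here is 1.0; real beach data will be far noisier.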
Your Beach Team should keep in mind
that, even though a scatterplot might
reveal a strong association between
dependent and independent variables, it
does not automatically mean that there is
a cause-and-effect mechanism at work. A
definitive connection of this nature must
be made through other means. The only
finding from the scatterplot analysis is the
correlation between the two data sets.
[Figure 6 image: two scatterplots of E. coli density (y-axis) against rainfall
in inches (x-axis, 0.0 to 2.5). Panel A uses a linear y-axis from 0 to 1,000;
panel B uses a logarithmic y-axis from 1 to 1,000.]
Figure 6. Scatterplots of E. coli vs. rainfall without transformation (A) and
with a log-transformation (B).

                                            Variable Transformation
                                            If the relationship is nonlinear, your Beach Team should
                                            consider transforming the data to try to improve linearity.
                                            FIB data, for example, are almost always transformed
                                            to a logarithmic scale. Figure 6 illustrates how a log10
                                            transformation of E. coli data improves linearity. VB
                                            provides several transformation options, including base 10
                                            logs, natural logs, square, and square root.

                                            Creation of New Variables
                                            Your Beach Team might want to explore manipulating or
                                            combining variables to improve linearity or to enhance the
                                            meaning of the variable. This might include:
                                               •  Creating a  new composite variable by summing,
                                                 multiplying, or averaging data when multiple sites
                                                 are measuring the same variable (e.g., multiple FIB
                                                 sampling sites in the swimming area or multiple rain
                                                 gauges in the drainage area).
                                               •  Creating a  new composite-weighted variable by
                                                 including additional weight to select components of the
                                                 same variable (e.g., creating a cumulative 3-day rainfall
                                                 total but manipulating the equation so that the more
                                                 recent 24-hour period receives a higher weight than the
                                                 preceding 24-hour period).
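
The two kinds of composite variable described above can be sketched as follows. The gauge readings and the 3-2-1 weighting are hypothetical; your Beach Team would choose weights that suit local conditions.

```python
from statistics import mean

def weighted_rainfall(daily_totals, weights):
    """Weighted sum of daily rainfall totals, listed most recent day first."""
    if len(daily_totals) != len(weights):
        raise ValueError("need exactly one weight per daily total")
    return sum(rain * weight for rain, weight in zip(daily_totals, weights))

# Composite variable: average of several rain gauges in the drainage area
# (hypothetical readings, in inches).
gauge_readings_in = [0.40, 0.55, 0.25]
mean_rain_in = mean(gauge_readings_in)

# Composite-weighted variable: 3-day rainfall in which the most recent
# 24-hour period counts three times as much as the oldest (assumed weights).
rain_last_3_days_in = (0.4, 1.0, 0.2)   # most recent day first
weights = (3, 2, 1)
weighted_3day_rain = weighted_rainfall(rain_last_3_days_in, weights)

print(f"mean of gauges: {mean_rain_in:.2f} in")
print(f"weighted 3-day rainfall: {weighted_3day_rain:.2f}")
```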

                                            VB allows you to create new variables using sum, maximum,
                                            minimum, mean, or products; it also allows you to define
                                            beach orientation and break down wind, current, wave
                                            direction and magnitude (speed or height) data into
                                            alongshore and onshore/offshore components. These
                                            types of data are often valuable independent variables in
                                            situations in which a major outfall is located near the beach.
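
The decomposition of wind into components can be sketched with basic trigonometry. This is only an illustration under stated assumptions: wind direction is the compass bearing the wind blows from, shore_normal_deg is a hypothetical parameter giving the compass bearing that points from the beach out over the water, and the sign conventions are chosen here for convenience. The angle conventions used inside VB may differ.

```python
import math

def wind_components(speed, wind_from_deg, shore_normal_deg):
    """Split wind into (alongshore, onshore) components.

    speed            -- wind speed in any unit
    wind_from_deg    -- compass bearing the wind blows FROM, in degrees
    shore_normal_deg -- compass bearing pointing from the beach out to the
                        water (assumed convention, not necessarily VB's)

    A positive onshore value means wind blowing from the water toward the
    beach; a negative value means an offshore wind.
    """
    angle = math.radians(wind_from_deg - shore_normal_deg)
    onshore = speed * math.cos(angle)
    alongshore = speed * math.sin(angle)
    return alongshore, onshore

# Beach faces north (water lies to the north); hypothetical 10-unit winds.
for wind_from in (0, 90, 180):
    along, on = wind_components(10.0, wind_from, shore_normal_deg=0.0)
    print(f"wind from {wind_from:3d} deg: alongshore={along:6.2f}, onshore={on:6.2f}")
```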
                               Correlation among Independent Variables
                               Sometimes combinations of independent variables do not work well together
                               in the context of a predictive model. This frequently occurs when two
                               independent variables correlate highly with each other. Therefore, your
                               Beach Team should examine relationships among independent variables
                               during EDA and identify any strong correlations. The correlations might be
important in step 4 of the model development phase. Where there is a strong
correlation, your Beach Team might consider picking one variable and
discarding the other—a decision made easier if data for one of the variables
is more convenient and/or less expensive to collect.
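
A simple screen for correlated independent variables might look like the sketch below. The variable names, the data, and the 0.8 cutoff are all hypothetical; the guide does not prescribe a specific cutoff.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def flag_correlated_pairs(variables, cutoff=0.8):
    """Return (name_a, name_b, r) for every pair with |r| at or above cutoff."""
    flagged = []
    for a, b in combinations(sorted(variables), 2):
        r = pearson_r(variables[a], variables[b])
        if abs(r) >= cutoff:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical independent variables measured on the same six days.
variables = {
    "rain_24h": [0.0, 0.2, 1.0, 0.5, 2.0, 0.1],
    "rain_48h": [0.1, 0.3, 1.4, 0.9, 2.5, 0.2],
    "wave_height": [0.3, 1.2, 0.4, 0.9, 0.5, 1.1],
}

flagged = flag_correlated_pairs(variables)
for a, b, r in flagged:
    print(f"{a} and {b} are strongly correlated (r = {r}); consider keeping one")
```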

Analysis of Variance for Categorical Variables
The relationship between an independent categorical variable and FIB
density cannot be represented in a scatterplot with r values calculated in
the same manner as continuous data. As noted above, you can visually
detect categorical influences on density by using categorical box plots of
FIB density. Your Beach Team can use the analysis of variance (ANOVA)
statistical technique to determine if the means of the categorized data as
they relate to FIB density are significantly different.
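
For readers curious about what ANOVA computes, here is a minimal sketch of the one-way ANOVA F statistic. The categories and log10 FIB values are hypothetical. Judging significance also requires comparing F against the F distribution (with a table or statistical software), which this sketch leaves out.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical log10 FIB densities grouped by a categorical variable.
log_fib_by_sky = {
    "clear": [1.0, 1.5, 2.0],
    "cloudy": [1.5, 2.0, 2.5],
    "rain": [3.0, 3.5, 4.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(log_fib_by_sky.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value suggests that the category means differ by more than the scatter within each category would explain.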

                               Step 4: Develop and Test a Predictive Model

                               After your Beach Team has completed the EDA and selected a set of
                               independent variables that correlate with FIB density, they can proceed to
                               developing and testing the predictive model.

                               Introduction to Step 4
                               Most predictive models in use today are based on linear regression, a
                               statistical method that assumes a linear—or straight-line—relationship
                               between variables. Linear regression can be used to predict a dependent
                               variable measurement (in this case, FIB density) using one or more
                               independent variable measurements.

                               A model that uses only one independent variable is generally described as
                               a "simple linear regression" model. A model using two or more variables is
                               called a "multivariable linear regression" (MLR) model. In either case, the
                               model itself is nothing more than an equation with the dependent variable
                               on one side of the equal sign and independent variable coefficients on the
                               other side. Conceptually, you plug in the appropriate measured independent
                               variable values and calculate a predicted FIB density. You can then compare
                               the FIB density to a state water quality standard or other threshold value
                               and make a decision concerning beach notification actions (e.g., to issue a
                               swimming advisory or close the beach).
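
The "plug in the values" idea can be sketched as below. The intercept and coefficients are invented for illustration and do not come from any real beach model. The sketch also assumes the model was fitted to log10-transformed FIB data, so the prediction is converted back to CFU/100 mL before it is compared to a threshold of 235 CFU/100 mL.

```python
# Hypothetical fitted equation (illustrative numbers only):
#   log10(FIB) = 1.2 + 0.25*wave_height_ft + 0.6*weighted_rain_in
#                    + 0.5*log10_turbidity
INTERCEPT = 1.2
COEFFICIENTS = {
    "wave_height_ft": 0.25,
    "weighted_rain_in": 0.6,
    "log10_turbidity": 0.5,
}
THRESHOLD_CFU = 235  # example notification threshold, CFU/100 mL

def predict_log10_fib(measurements):
    """Evaluate the regression equation for one set of measurements."""
    return INTERCEPT + sum(
        coef * measurements[name] for name, coef in COEFFICIENTS.items()
    )

def should_notify(measurements):
    """Back-transform the prediction and compare it to the threshold."""
    predicted_cfu = 10 ** predict_log10_fib(measurements)
    return predicted_cfu > THRESHOLD_CFU

today = {"wave_height_ft": 2.0, "weighted_rain_in": 1.0, "log10_turbidity": 1.0}
log10_pred = predict_log10_fib(today)
print(f"predicted density: {10 ** log10_pred:.0f} CFU/100 mL")
print("issue notification" if should_notify(today) else "no notification")
```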

                               Three key elements are necessary for producing an effective predictive model:
                                  • Using high-quality data sets to develop and test candidate models.
                                  • Reducing error and increasing predictive power of the model as much
                                    as possible.
                                  • Choosing an appropriate software package.
                               Data Sets
                               The importance of using high-quality dependent and independent variable
                               data sets for model development and testing cannot be overemphasized. A
                               sufficient amount of good empirical data is necessary for an effective and
                               reliable model. As mentioned in step 2, a rule of thumb is to collect at least
                               three years' worth of historical data that represent conditions that are likely
                               to occur in the future. Then, use two of those years' data to develop the
                               model (training data set) and one year's data to assess the model's predictive
                               accuracy (testing data set).
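
The rule of thumb above amounts to splitting the historical record by year. A minimal sketch, assuming each record is a dictionary with a hypothetical "year" field:

```python
def split_by_year(records, test_year):
    """Hold out one year for testing; use all other years for training."""
    training = [r for r in records if r["year"] != test_year]
    testing = [r for r in records if r["year"] == test_year]
    return training, testing

# Hypothetical historical records (one per sampling day).
records = [
    {"year": 2013, "rain_in": 0.2, "fib_cfu": 40},
    {"year": 2013, "rain_in": 1.1, "fib_cfu": 410},
    {"year": 2014, "rain_in": 0.0, "fib_cfu": 25},
    {"year": 2014, "rain_in": 0.8, "fib_cfu": 260},
    {"year": 2015, "rain_in": 0.5, "fib_cfu": 120},
    {"year": 2015, "rain_in": 1.6, "fib_cfu": 700},
]

training, testing = split_by_year(records, test_year=2015)
print(f"{len(training)} training records, {len(testing)} testing records")
```

Holding out a whole season, rather than random days, keeps the test closer to how the model will actually be used: predicting a future season from past ones.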

Reducing Errors
Recall that the correlation coefficient r was used in the context of EDA
scatterplots to measure the linear association of one independent variable
and FIB density. In the context of modeling, r also represents a measure of:
(1) the scatter (variability) of the data points from the regression line, and
(2) the power of the independent variable to correctly predict the value of the
dependent variable. Variability can often be reduced if more independent
variables are added to the mix. This makes sense when you think back to how
bacteria move from land-based sources, through the drainage network and
the receiving water, and to the beach. Rainfall, wind, currents, sunlight, and
other factors work in combination to influence both the journey and the fate
of bacteria cells. While the complexity of model development increases with
the addition of more independent variables, the result is usually increased
accuracy in predicting FIB density.

Virtual Beach
The material presented in this section focuses on VB's traditional MLR
method of model-building. The current version of VB software is version 3
(VB3), which was released in December 2014.

For more complete information about MLR as well as other modeling
methods available in VB3, consult Virtual Beach 3.0.4: User's Guide
(http://www.epa.gov/sites/production/files/2015-02/documents/vb3
manual 3.0.4.pdf)  (Cyterski et al. 2013).

In general, the model-building process in VB3 involves searching for the
combination of independent variables that produces the most accurate
FIB density predictions. Although the VB3 software processes for building
predictive models are automated, you must make important decisions
concerning model construction and testing, including choosing the method
used to build the model, number of variables to include in the model, and
evaluation criteria used to judge model fitness. Unless you or another
member of the team is familiar with VB3, you will probably need to consult a
person who has used it before to help you with these decisions. You can also
visit http://www.seagrant.wisc.edu/home/Default.aspx?tabid=646#Training
for predictive modeling workshop presentations, webinars on accessing
online data, and step-by-step modules on VB.

Model Building
VB3 offers two general methods for selecting variables for the model. One
is called the "genetic algorithm." It is a stepwise procedure that adds or
                                subtracts independent variables from the model based on their level of
                                statistical significance. The software retains the most significant variables
                                and discards the least significant.

                                A more comprehensive approach to model building is called "exhaustive
                                search." It involves measuring the goodness of fit for all possible
                                combinations of the chosen independent variables, beginning with models
                                with a single variable and working up to models with all the variables
                                incorporated. The best model for each number of variables, up to a defined
                                maximum, is then identified based on goodness of fit statistics (e.g., the best
                                2-variable model, the best 3-variable model).
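
The idea behind an exhaustive search can be sketched in a few lines. This is not VB3's implementation: it uses plain ordinary least squares, ranks models by R-squared only, and assumes the candidate variables are not perfectly collinear. The data and the names fit_r2 and exhaustive_search are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_r2(columns, y):
    """Fit least squares with an intercept and return R-squared."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(rows[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    predicted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def exhaustive_search(data, y, max_vars):
    """Return {model size: (best R-squared, variable names)}."""
    best = {}
    names = sorted(data)
    for size in range(1, max_vars + 1):
        scored = [
            (fit_r2([data[name] for name in combo], y), combo)
            for combo in combinations(names, size)
        ]
        best[size] = max(scored)
    return best

# Hypothetical data: y depends on rain and turbidity but not on wave.
data = {
    "rain": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "turbidity": [1.0, 0.0, 2.0, 1.0, 3.0, 0.0],
    "wave": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
y = [1 + 2 * r + 0.5 * t for r, t in zip(data["rain"], data["turbidity"])]

best = exhaustive_search(data, y, max_vars=2)
for size, (r2, combo) in best.items():
    print(f"best {size}-variable model: {combo}, R-squared = {r2:.3f}")
```

Because every combination is fitted, the number of models grows quickly with the number of candidate variables, which is one reason a defined maximum model size is useful.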

                                VB3 provides a variety of criteria for evaluating model fitness. In addition,
                                the software can recommend how many variables are optimal for the model
                                and determine if collinearity among independent variables is a problem.
                                Once model-building is completed in VB3, the software presents you with
                                the 10 best models based on your chosen evaluation criteria. You then
                                evaluate these candidates using one or more metrics described in detail in
                                the Virtual Beach 3.0.4: User's Guide (Cyterski et al. 2013). Based on the
                                results, you select a final model and begin the process of model validation.

                                Model Validation
                                The objective of model validation is to determine whether your final model is
                                good enough to use in your beach program. Keep in mind that your model's
                                output is used to help officials make timely beach management decisions,
including issuing a swimming advisory or closing the beach. These decisions
are not taken lightly because they affect public health and safety and a variety
of related community concerns pertaining to economic prosperity and public
perceptions about the safety of local recreational waters.

How you determine if your model is good enough to use in your program
is up to you. If you have been relying on previous-day sampling results
for making your beach notification decisions, you probably want your
predictive model to at least perform better than this "persistence model"
approach. You can define "how much better" by setting performance goals
and testing to see if your predictive model meets or exceeds those goals. If it
passes this test, you can consider the model validated and acceptable to use.

Discussed below is a four-step method for validating a model using a
performance goals approach:
   1.  Generate evaluation statistics for the persistence model using a testing
      dataset. Common evaluation statistics are overall accuracy, specificity,
      and sensitivity (described in more detail below). They are defined
      and generated in this first step for the persistence model and then
      generated again in the third step for the predictive model.
   2.  Set performance goals for your predictive model based on the
      persistence model's evaluation statistics.
   3.  Generate evaluation statistics for the predictive model using the
      testing dataset.
   4.  Compare  the evaluation statistics of the two models and  determine
      the percentage point increase (or decrease) of the predictive model
      compared to the persistence model.

This approach to model validation is illustrated using the work of
Francy and Darner (2006). They developed an MLR predictive  model
for Huntington Beach, Ohio, a beach located on Lake Erie, using a
training dataset collected during the 2000-2004 beach seasons. The beach
notification threshold value is an E. coli density of 235 CFU/100 mL. The
explanatory variables incorporated into their model were wave height,
weighted rainfall in the previous 48 hours, and log10 turbidity. Data
collected in the 2005 beach season were used as the testing dataset.

   Forecasting
   Future directions that EPA considers likely for predictive tools for beach
   notification include forecasting beach water quality conditions a day or more
   ahead. Researchers are also attempting to develop models applicable to more than
   one beach or to a region of shoreline.

Generate evaluation statistics for the persistence model
Using Francy and Darner's testing dataset, Figure 7 is a plot of the
persistence model results; that is, observed E. coli densities (X-axis) vs.
E. coli densities measured the previous day (Y-axis). The quadrants displayed
in the graph are defined by the vertical and horizontal lines set at the
beach notification threshold value of 235 CFU/100 mL. The numbers in the
                                  parentheses are the number of plot points that appear in the quadrant.
                                  Listed below are the distinguishing characteristics of each quadrant:
                                     •  Upper left quadrant. Data points that fall in this quadrant have
                                       observed E. coli densities below the threshold value, but the model
                                       predicts that they will exceed the threshold value. This is known as a
                                       "false positive," or Type  1 error.
                                     •  Upper right quadrant. Data points that fall in this quadrant
                                       have observed E.  coli densities above the threshold value, and the
                                       model correctly predicts that they will exceed the threshold value.
                                       "Sensitivity" is the percentage of all the observed exceedance data
                                       points that fall in this quadrant.
                                     •  Lower left quadrant. Data points that fall in this quadrant have
                                       observed E. coli densities below the threshold value, and the model
                                       correctly predicts that they will not exceed the threshold value.
                                       "Specificity"  is the percentage of all the observed non-exceedance data
                                       points that fall in this quadrant.
                                     •  Lower right quadrant. Data points that fall in this quadrant have
                                       observed E. coli densities above the threshold value, but the model
                                       predicts that they will not exceed the threshold value. This is known as
                                       a "false negative," or Type 2 error.
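
The four quadrants can be expressed as a small classification function. The sketch below applies it to a persistence model, in which each day's prediction is simply the previous day's measurement. The daily values are hypothetical, and treating "exceeds" as strictly greater than the threshold is an assumption.

```python
from collections import Counter

THRESHOLD_CFU = 235  # notification threshold, CFU/100 mL

def classify(observed, predicted, threshold=THRESHOLD_CFU):
    """Name the quadrant for one (observed, predicted) pair."""
    observed_exceeds = observed > threshold
    predicted_exceeds = predicted > threshold
    if predicted_exceeds and observed_exceeds:
        return "correct exceedance"       # upper right
    if predicted_exceeds:
        return "false positive"           # upper left (Type 1 error)
    if observed_exceeds:
        return "false negative"           # lower right (Type 2 error)
    return "correct nonexceedance"        # lower left

# Hypothetical daily E. coli densities, in date order.
daily_cfu = [50, 300, 120, 80, 400, 500, 90]

# Persistence model: today's prediction is yesterday's measurement.
pairs = list(zip(daily_cfu[1:], daily_cfu[:-1]))  # (observed, predicted)
counts = Counter(classify(obs, pred) for obs, pred in pairs)

for quadrant, n in sorted(counts.items()):
    print(f"{quadrant}: {n}")
```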
                                  Figure 7. Plot of persistence model results of 2005 data (adapted from Francy and
                                         Darner 2006). X-axis: observed E. coli in CFU/100 mL; Y-axis: E. coli
                                         measured the previous day. Lines mark the notification threshold value
                                         (235 CFU/100 mL). Number of responses per quadrant: False Positive (4),
                                         Correct Nonexceedance (31), Correct Exceedance (0), False Negative (6).

Of the four quadrants, plot points that fall in the lower right quadrant
below the horizontal line (Type 2 errors) are the most troubling because
the persistence model is predicting the water is safe for swimming when, in
fact, the water is unsafe because FIB densities exceed the beach notification
threshold value.

The performance statistics for the persistence model are:
   • (Overall) Accuracy = 75.6%
   • Specificity = 88.6%
   • Sensitivity = 0.0%
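
These statistics follow directly from the quadrant counts shown in Figure 7 (4 false positives, 31 correct nonexceedances, 0 correct exceedances, 6 false negatives). A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def evaluation_statistics(false_pos, correct_nonexc, correct_exc, false_neg):
    """Return accuracy, specificity, and sensitivity as percentages."""
    total = false_pos + correct_nonexc + correct_exc + false_neg
    observed_nonexc = correct_nonexc + false_pos   # all observed nonexceedances
    observed_exc = correct_exc + false_neg         # all observed exceedances
    return {
        "accuracy": 100.0 * (correct_nonexc + correct_exc) / total,
        "specificity": 100.0 * correct_nonexc / observed_nonexc,
        "sensitivity": 100.0 * correct_exc / observed_exc,
    }

# Quadrant counts for the persistence model (Figure 7).
persistence = evaluation_statistics(
    false_pos=4, correct_nonexc=31, correct_exc=0, false_neg=6
)
for name, value in persistence.items():
    print(f"{name}: {value:.1f}%")
```

The function assumes the testing dataset contains at least one observed exceedance and one observed nonexceedance; otherwise sensitivity or specificity is undefined.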

Set performance goals for the predictive model
There is no standard formula for setting performance goals; you must use
your judgment in context of the goals and objectives of your beach program.
Assuming you have been relying on the persistence model approach for
making notification decisions, you will want your predictive model to
perform better than the persistence model. Francy et al. (2013a) suggest a
goal of at least 5 percentage points better for accuracy, specificity, and/or
sensitivity.

As discussed above, the sensitivity statistic is especially important because it
characterizes Type 2 errors. Consequently, if you want to take a conservative
approach in protecting public health, you may want to set your sensitivity
performance goal as high as practicable.

For this Huntington Beach example, using the persistence model evaluation
statistics as a baseline, Francy and Darner chose the following performance
goals for model validation purposes:
   • Accuracy goal ≥ 81%
   • Specificity goal ≥ 94%
   • Sensitivity goal ≥ 50%

Generate evaluation statistics for the predictive model and determine if
your performance goals are met
Once you have established performance goals, you can test your predictive
model to see if it meets those goals. Again using Francy and Darner's 2005
testing dataset,  Figure 8 is a plot of observed E. coli densities vs.
E. coli densities  predicted by the 2000-2004 model. The evaluation statistics
derived from this plot are:
   • Accuracy = 88.0% (exceeds performance goal)

                                   • Specificity = 95.2% (exceeds performance goal)

                                   • Sensitivity = 50.0% (meets performance goal)

                                In this example, the Francy and Darner 2000-2004 model passes our
                                performance goal test and can be considered good enough to use in a beach
                                notification decision support system.
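
The comparison in this example can be sketched as follows, using the statistics and goals quoted in this section. Treating a goal as "at least" the stated value is an assumption, based on the 50.0% sensitivity being described as meeting the 50% goal.

```python
# Values quoted in the Huntington Beach example (percent).
persistence = {"accuracy": 75.6, "specificity": 88.6, "sensitivity": 0.0}
goals = {"accuracy": 81.0, "specificity": 94.0, "sensitivity": 50.0}
predictive = {"accuracy": 88.0, "specificity": 95.2, "sensitivity": 50.0}

def compare(predictive, persistence, goals):
    """Report percentage point gains and whether each goal is met."""
    report = {}
    for name in goals:
        report[name] = {
            "gain_points": round(predictive[name] - persistence[name], 1),
            "meets_goal": predictive[name] >= goals[name],
        }
    return report

report = compare(predictive, persistence, goals)
validated = all(entry["meets_goal"] for entry in report.values())

for name, entry in report.items():
    status = "met" if entry["meets_goal"] else "not met"
    print(f"{name}: {entry['gain_points']:+.1f} points vs. persistence, goal {status}")
print("model validated" if validated else "model needs more work")
```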

                                Figure 8. Plot of predictive model results of 2005 data (adapted from Francy and Darner
                                        2006). X-axis: observed E. coli in CFU/100 mL; Y-axis: E. coli predicted
                                        by the 2000-2004 model. Lines mark the notification threshold value
                                        (235 CFU/100 mL). Number of responses per quadrant: False Positive (2),
                                        Correct Nonexceedance (40), Correct Exceedance (4), False Negative (4).

                                Models that Do Not Meet Performance Goals
                                Throughout this guide, we have been optimistically moving forward
                                assuming that you are on the path toward creating a successful model.
                                Unfortunately, this is not always the case. If your model does not meet your
                                performance goals, there are some things you can do to try to improve
                                it. For example, you could revisit Step 2 and identify new independent
                                variables and try rebuilding your model, or segregate your dataset and
                                create sub-models that may individually offer better predictive capabilities
                                than one overall model. Another approach is to consider one or more of the
                                alternative predictive tools described in the text box titled Alternatives to
                                MLR Modeling.

Exceedance Probability Threshold
VB3 provides the ability to express a FIB density prediction in terms of a
probability that a defined notification threshold value will be exceeded.
Predictions in this form have some advantages over a FIB density output:
   • They explicitly convey that there is uncertainty associated with the
     model prediction.

   • They give you the flexibility to select a specific exceedance
     probability—rather than a density number—to function as the beach
     notification threshold value.

If you choose exceedance probability as your model output, you must define
a specific probability percentage to function as a notification threshold
value. In general, try to select the lowest (most conservative) exceedance
probability threshold that produces the most correct responses and the
fewest false negative responses. Recall that false negatives (Type 2 errors)
are especially troubling because the model is predicting the water is safe for
swimming when, in fact, the water is unsafe.
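
One way to picture an exceedance probability is sketched below. It assumes the model predicts log10 FIB density and that prediction errors are normally distributed on that scale with a known standard deviation. Both are simplifying assumptions made for illustration, and the calculation inside VB3 may differ; consult the User's Guide for the actual method.

```python
import math
from statistics import NormalDist

def exceedance_probability(predicted_log10, residual_sd, threshold_cfu=235):
    """Probability that the true density exceeds the threshold.

    predicted_log10 -- model prediction on the log10 scale
    residual_sd     -- assumed standard deviation of prediction errors,
                       also on the log10 scale (hypothetical input)
    """
    errors = NormalDist(mu=predicted_log10, sigma=residual_sd)
    return 1.0 - errors.cdf(math.log10(threshold_cfu))

# Hypothetical predictions with an assumed residual spread of 0.4 log10 units.
for predicted_cfu in (50, 150, 235, 600):
    p = exceedance_probability(math.log10(predicted_cfu), residual_sd=0.4)
    print(f"predicted {predicted_cfu:4d} CFU/100 mL -> "
          f"{100 * p:.0f}% chance of exceeding 235")
```

Note that a prediction below the threshold can still carry a meaningful chance of exceedance, which is why a probability threshold below 50 percent can be the more protective choice.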

Continuing with the Huntington Beach 2000-2004 model example, Francy
and Darner (2006) concluded that a threshold probability of 29 percent
would provide the best balance of correct responses and false negative
responses. Figure 9 is a plot of threshold exceedance prediction and
observed E. coli density using the 2005 testing data set. The quadrants in the
chart are defined by the state standard of 235 CFU/100 mL (vertical line) and
                                   the probability of exceedance threshold of 29 percent (horizontal line). The
                                   performance statistics from this plot are:
                                      •  Accuracy = 82.0 percent
                                      •  Specificity = 88.1 percent
                                      •  Sensitivity = 50.0 percent
                                   Figure 9. Plot of predictive model results of 2005 data expressed as exceedance
                                           probability threshold (adapted from Francy & Darner 2006). X-axis:
                                           observed E. coli in CFU/100 mL; Y-axis: probability of exceedance in
                                           percent. Lines mark the 29-percent threshold and the notification
                                           threshold value (235 CFU/100 mL). Number of responses per quadrant:
                                           False Positive (5), Correct Nonexceedance (37), Correct Exceedance (4),
                                           False Negative (4).

                                   Using this approach, you can establish a beach management protocol that
                                   requires the issuance of a notification if the model predicts a probability of
                                   exceedance of 29 percent or greater.
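
Choosing a probability threshold involves tallying correct responses and false negatives at several candidate values. The sketch below shows that tally for hypothetical testing data; each record pairs a predicted exceedance probability with whether the observed density actually exceeded the standard.

```python
def tally(records, prob_threshold):
    """Count outcomes if a notification is issued at or above prob_threshold."""
    result = {"correct": 0, "false_negative": 0, "false_positive": 0}
    for probability, exceeded in records:
        notify = probability >= prob_threshold
        if notify == exceeded:
            result["correct"] += 1
        elif exceeded:
            result["false_negative"] += 1   # model said safe, water was not
        else:
            result["false_positive"] += 1   # model said unsafe, water was fine
    return result

# Hypothetical (predicted probability, observed exceedance) pairs.
records = [
    (0.05, False), (0.10, False), (0.20, False), (0.30, True),
    (0.35, False), (0.45, True), (0.60, True), (0.15, True),
]

for candidate in (0.10, 0.29, 0.50):
    print(f"threshold {candidate:.2f}: {tally(records, candidate)}")
```

In this made-up data, the lowest threshold removes false negatives at the cost of more false positives, and the highest does the opposite; the middle value balances the two.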
   Alternatives to MLR Modeling
   MLR models are a popular predictive tool used by beach programs, but they are not useful or appropriate
   for all beaches. If for some reason MLR modeling is not right for your beach, you can explore other
   alternatives, including:
   Rainfall Alerts
   This predictive tool is based exclusively on the positive correlation that sometimes exists between rainfall
   and FIB densities: As rainfall totals increase and contaminated runoff reaches the receiving water, there
   is a predictable corresponding increase in FIB density at the beach. By factoring in a beach notification
   threshold, you can predict exceedances of the threshold using a combination of storm duration and
   cumulative precipitation data. Rainfall-based thresholds are derived by simple regression or a frequency of
   exceedance analysis. They represent the oldest approach to predictive modeling and are actively used at
   many beaches in the U.S.
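
   A frequency of exceedance analysis can be sketched as follows. The records and
   candidate rainfall cutoffs are hypothetical; a real analysis would use several
   seasons of paired rainfall and FIB data.

```python
def exceedance_frequency(records, rain_cutoff_in, fib_threshold=235):
    """Fraction of days with rain at or above the cutoff on which FIB
    exceeded the notification threshold. Returns None if no such days."""
    wet_day_fib = [fib for rain, fib in records if rain >= rain_cutoff_in]
    if not wet_day_fib:
        return None
    exceedances = sum(1 for fib in wet_day_fib if fib > fib_threshold)
    return exceedances / len(wet_day_fib)

# Hypothetical (rainfall in inches, FIB in CFU/100 mL) pairs.
records = [
    (0.0, 40), (0.1, 60), (0.3, 150), (0.6, 300),
    (0.8, 210), (1.2, 900), (2.0, 1500),
]

for cutoff in (0.5, 1.0):
    freq = exceedance_frequency(records, cutoff)
    print(f"rain >= {cutoff} in: threshold exceeded on {100 * freq:.0f}% of days")
```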

   Partial Least Squares Models
   Partial least squares (PLS) regression models can be used as an alternative to MLR models if there is a
   large number of independent variables that are not well understood; have poor linear correlation with the
   response variable; or have problems with multicollinearity among the independent variables. The primary
   objective of PLS regression remains the same as MLR: a model that accurately predicts FIB concentration
   given a set of independent variables. The system for selecting the variables is what makes this modeling
   different. VB3 includes PLS regression as an optional modeling approach, and it is described in detail in
   Virtual Beach 3.0.4: User's Guide (Cyterski et al. 2013).

   Decision Trees
   In general, decision trees work best when FIB levels are primarily influenced by only a few factors. They
   are basically a series of yes/no questions concerning conditions that influence FIB density. The "tree" is
   typically portrayed visually as a flow chart with binary decision node "branches." The questions with the
   highest importance generally appear at the top of the tree. By moving down the tree and answering the
   set of ordered questions, you are ultimately led to a beach notification classification: in the simplest form,
   either "issue a notification" or "don't issue a notification." Decision trees range from simple to complex,
   depending on the number of decision nodes and classification endpoints.
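
   A very small decision tree can be written as a series of nested yes/no checks.
   The questions and cutoff values below are hypothetical and are meant only to
   show the structure, not to recommend specific criteria.

```python
def beach_decision(rain_24h_in, turbidity_ntu, onshore_wind):
    """Walk a three-question decision tree and return a classification."""
    # Question 1 (top of the tree, assumed most important): heavy recent rain?
    if rain_24h_in >= 0.5:
        return "issue a notification"
    # Question 2: is the water unusually cloudy?
    if turbidity_ntu >= 20:
        # Question 3: is the wind pushing water toward the beach?
        if onshore_wind:
            return "issue a notification"
        return "don't issue a notification"
    return "don't issue a notification"

print(beach_decision(rain_24h_in=0.8, turbidity_ntu=5, onshore_wind=False))
print(beach_decision(rain_24h_in=0.1, turbidity_ntu=30, onshore_wind=True))
print(beach_decision(rain_24h_in=0.1, turbidity_ntu=30, onshore_wind=False))
```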

   Gradient Boosting Machine
   The "gradient boosting machine" (GBM) is a computerized approach to constructing a large hierarchical
   set of simple decision trees for making FIB predictions. Similar to PLS regression, it is an alternative to
   MLR if there are a large number of independent variables that might not be well understood; have poor
   linear correlation with the response variable; or have problems with multicollinearity among independent
   variables. VB3 includes GBM as an optional modeling approach, and it  is described in detail in Virtual Beach
   3.0.4: User's Guide (Cyterski et al. 2013).

   Artificial Neural Network
   An "artificial neural network" is software that attempts to mimic the working of the biological neural
   network. Still in the research phase, it presents potentially another alternative for dealing with a large
   number of independent variables that might not be well understood; have poor linear correlation with the
   response variable; or have problems with multicollinearity among independent variables. The technique
   incorporates an algorithm that allows it to "learn" relationships between inputs and outputs.

                              Step 5: Integrate the Predictive Tool into a Beach Monitoring and
                                         Notification Program

                              Introduction to Step 5
                              Once your Beach Team has developed the predictive tool, they have to
                              integrate it into their beach monitoring and notification program. Model
                              outputs will typically be either estimated FIB levels or a probability that the
                              beach notification threshold will be exceeded. The method your Beach Team
                              selects to use to integrate the model will depend on several things, including
                              the model's accuracy and the availability of resources. Some questions to ask
                              as you consider an integration strategy include:
                                 • How will you use model results to  determine beach notifications?

                                 • Will advisories be posted based solely on model results or on a
                                  combination of models and sampling?

                                 • What do you do  if the model predicts an exceedance?

                                 • What do you do  if sampling results and model results conflict?

                                 • Will you verify model results before posting advisories?

                                 • Will you use a model to remove an advisory or reopen a beach?

                                 • How often will the model be used during the beach season?

                                 • Will you run the model on weekdays and weekends?

                                 • What time of day will you run the model?

                              As you can see, you must consider many factors when deciding how best
                              to integrate predictive  tools into your beach monitoring and notification
                              program. EPA recommends that you use a predictive tool to complement
                              traditional monitoring. A predictive tool cannot completely replace
                              sampling, but it might allow you to reduce the frequency of sampling. Data
                              from culture samples can be used as a basis for models that provide timely
                              results in a cost-effective manner. Predictive tools might also be useful in
                              developing or adapting routine monitoring programs to focus sampling
                              efforts when conditions (e.g., rain events) correlate with high FIB levels.

You might choose to issue a beach notification if the model predicts an
exceedance of a beach notification threshold, if sampling results are above
the threshold, or both. If you occasionally use model results in conjunction
with sampling results, consider what to do if the model predictions and
sampling results conflict.

Once you have issued a beach notification, you must decide the process for
removing an advisory. Will you rerun the model with more current data?
Will you collect additional samples? The National Beach Guidance (USEPA
2014) recommends lifting actions that were imposed based on the output
of a predictive model after an additional model run estimates that water
quality conditions have improved to within acceptable parameters.

Frequency of Running the Model
Your Beach Team must decide how often to run the model. Consider
resources available to collect data, run the model, and post results. Running
the model daily  might be ideal, but is not always practical. You might want
to have the results available on the weekends when the most people are using
the beach; however, you might not have staff available to collect the data and
run the model. Many beach programs that use predictive models run them
on weekdays while some also run them on weekends.

Notification Protocols
As you consider all the factors that are important in determining beach
notifications, you will use them to develop a protocol for making beach
notification decisions. "Notification protocol" is a general term used
to describe a set of questions or decision points that a beach manager
routinely uses to  determine whether to issue a notification or close a beach.
Notification protocols can be simple or complex, but should include all of
the decisions that your Beach Team needs to make after collecting samples
or running a predictive model. The protocol can also cover the decisions
needed after a pollution event (e.g., a CSO or SSO discharge) or after
hazardous conditions are discovered (e.g., strong rip currents, red tide)
that might affect whether the beach should be open, closed, or under an
advisory. An example of a notification protocol for a beach that uses
sampling results and a predictive model is shown in Figure 10.

[Figure 10 flowchart. Steps: 8:00 a.m. (the previous day): collect sample and send to lab for analysis of FIB; 8:00 a.m. (day of sample collection): collect input variables and run model. Decision points: Did a pollution event (CSO, harmful algal bloom) occur? Does previous day's sample result exceed beach notification threshold? Does model predict exceedance of beach notification threshold? Outcomes: Issue advisory; Beach is open.]

Figure 10. Notification protocol for a beach program that uses sampling results and a predictive model to make notification decisions.

                                    Some beach program managers might choose to use the predictive model
                                    alone to make decisions on notification actions, without considering
                                    sampling results when making those decisions. In that case sample results
                                    might be used only to verify the model is making accurate predictions and
                                    to recalibrate or update the model over time. An example of a notification
                                    protocol for this approach is shown in Figure 11, which is much simpler than
                                    the protocol shown in Figure 10.

[Figure 11 flowchart. Step: 8:00 a.m. (day of sample collection): collect input variables and run model. Decision points: Does model predict exceedance of beach notification threshold? Did a pollution event (CSO, harmful algal bloom) occur? Outcomes: Issue advisory; Beach is open.]

Figure 11. Notification protocol for a beach that uses only model results to make notification decisions.
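
The decision points in Figures 10 and 11 can also be written down as a short function, which some Beach Teams might find useful for documenting their protocol. The Python sketch below is only an illustration. It assumes that a "yes" at any decision point leads to an advisory and that the beach is open otherwise; the function name, argument names, and example numbers (including the threshold value) are hypothetical placeholders, not values from this guide. Leaving out the sample result gives the model-only approach of Figure 11.

```python
from typing import Optional


def notification_decision(
    model_predicts_exceedance: bool,
    pollution_event: bool,
    sample_result: Optional[float] = None,
    threshold: Optional[float] = None,
) -> str:
    """Return "Issue advisory" or "Beach is open".

    Assumption: any "yes" answer leads to an advisory. Leave sample_result
    as None to ignore sampling (the model-only approach).
    """
    if pollution_event:
        return "Issue advisory"
    if sample_result is not None:
        if threshold is None:
            raise ValueError("threshold is required when a sample result is given")
        if sample_result > threshold:
            return "Issue advisory"
    if model_predicts_exceedance:
        return "Issue advisory"
    return "Beach is open"


# Made-up inputs; the threshold is a placeholder, not a recommendation.
print(notification_decision(False, False, sample_result=120.0, threshold=235.0))
print(notification_decision(True, False))  # model-only protocol
```

Writing the protocol this way makes each decision point explicit, so a change to the protocol (for example, dropping the sampling check) is a visible change to one branch.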

You should also explore whether you need different notification protocols
for different seasons or for different parts of the beach season (e.g., if there
is a dry part and a wet part to the beach season). In the case study for the
South Shore model, the MHD found that their Environmental Monitoring
for Public Access and Community Tracking (EMPACT) model was less
accurate as the beach season progressed, suggesting there was some level
of seasonality or unidentified influences on water quality between the
beginning and the end of the beach season.
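
One simple way to look for this kind of seasonality is to compare how often the model's call matched the sample result in each part of the season. The Python sketch below groups results by calendar month; the function name and the records are made up for illustration, and grouping by month is an assumption (your Beach Team might prefer wet and dry periods or some other split).

```python
from collections import defaultdict
from datetime import date


def accuracy_by_month(records):
    """records: iterable of (date, predicted_exceedance, observed_exceedance).

    Returns {month_number: fraction of days on which the model call
    matched the sample-based result}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for day, predicted, observed in records:
        total[day.month] += 1
        correct[day.month] += int(predicted == observed)
    return {month: correct[month] / total[month] for month in sorted(total)}


# Made-up records: (sample date, model predicted exceedance, sample exceeded).
records = [
    (date(2015, 6, 1), False, False),
    (date(2015, 6, 8), True, True),
    (date(2015, 6, 15), False, False),
    (date(2015, 8, 3), True, False),
    (date(2015, 8, 10), False, True),
    (date(2015, 8, 17), False, False),
    (date(2015, 8, 24), True, True),
]
print(accuracy_by_month(records))
```

A clear drop in agreement late in the season would be one signal that a separate protocol or model for that period is worth exploring.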

Types of Beach Notifications
A beach advisory is the most common beach notification based on the use
of a predictive tool.  However, the following types of notifications might be
appropriate at certain times.

Beach Advisories
When a model predicts the exceedance of a water quality standard, many
beach managers issue a beach advisory, which warns beach goers that the
FIB density is above the water quality standard and swimming and wading
are not recommended.

Beach Closings
Modeling results might lead you to decide that water quality conditions are
poor enough to warrant closing the beach rather than issuing an advisory.
If you close your beach, you might choose to continue running your model
regularly to determine when FIB levels are low enough to reopen, thereby
minimizing the number of closure days.

Preemptive Advisories
The exploratory data analysis will give you a good idea of what events
(such as heavy rainfall or CSOs) are correlated with higher FIB levels at
your beach; as a result, you might decide to issue preemptive advisories
or closures based on those events. For example, if you know that a
1-inch rainfall generally causes an exceedance of the notification
threshold and the weather forecast is calling for more than 1 inch of rain
overnight, a preemptive advisory could be issued based on what you already
know about rain events and exceedances. You would not need to run the
model to issue a preemptive advisory or closure.

                               Permanent Advisories
                               Some beach managers issue permanent advisories when a certain type of
                               event is highly correlated with elevated FIB levels. A predictive tool can help
                               determine whether a permanent advisory is necessary. An example of using
                               this type of advisory is when FIB levels often exceed water quality standards
                               after almost any amount of rainfall. In that case, you might choose to issue
                               a permanent advisory that swimming  should be avoided for a certain period
                               after any rainfall has occurred.
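
Rainfall-based rules such as the preemptive and permanent advisories described above are simple enough to express as two small checks. The Python sketch below is illustrative only. The 1-inch trigger comes from the example in the text; the function names, the 24-hour advisory window, and the dates are hypothetical, and your Beach Team would set its own values from its exploratory data analysis.

```python
from datetime import datetime, timedelta


def preemptive_advisory(forecast_rain_in: float, trigger_in: float = 1.0) -> bool:
    """True when forecast rainfall is more than the trigger amount (inches)."""
    return forecast_rain_in > trigger_in


def rain_advisory_active(last_rain_end: datetime, now: datetime,
                         window: timedelta) -> bool:
    """True while 'now' falls within the advisory window after rainfall ended."""
    return last_rain_end <= now < last_rain_end + window


# Made-up example: 1.4 inches forecast; rain ended 10 hours ago; 24-hour window.
now = datetime(2016, 7, 1, 8, 0)
print(preemptive_advisory(1.4))
print(rain_advisory_active(now - timedelta(hours=10), now, timedelta(hours=24)))
```

Neither check runs the predictive model; both rely only on what you already know about rain events and exceedances at your beach.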

Public Communication
The predictive tool development process does not necessarily indicate a
need for public involvement. Much of the process involves scientific and
technical expertise and centers around the staff and resources of state and
local agencies and public health departments. Although much of the process
involves experts, predictive modeling stems from the need to protect public
health and much can be gained from involving the public.

Public Education
Public education is an important part of the outreach process. Outreach
often involves teaching the public about beach health and safety—what an
advisory means, what health risks exist, and what precautions should be
taken. When you are using a predictive model, you need to also explain the
use of the model to the public. Some general questions and answers useful
for public education include:
   What is a predictive model? Predictive models are a means of predicting
   or forecasting water quality conditions in the absence of a current water
   sample. Beach managers  assess previous sampling data to determine
   which factors affect water quality. The model uses these factors to
   estimate water quality under current conditions.

   Why use a model? Predictive models are most useful in increasing the
   timeliness of beach notifications, conserving resources by reducing
   sampling, and improving the accuracy of identifying notification days by
   adding to the existing monitoring program.

   How accurate are models? The accuracy of a model depends on the data
   on which it is based and local conditions. A thorough understanding of
   the beach environment and a strong data set can support accurate and
   reliable models. Models should be routinely verified and validated by
   sampling and laboratory analysis, and continuously updated based on
   sampling results.

   How does the model change postings and advisories? With the use of
   a model, postings and advisories can be updated more frequently and
   provide real-time estimates of water quality at beaches.

   Does this mean samples are no longer collected and analyzed? Water
   samples are still collected regularly and analyzed for FIB, both to
   determine the water quality and to verify and update the model.

   How does this improve public health protection? Beach managers are able
   to predict water quality and post advisories in a more timely manner to
   prevent illnesses associated with recreating in waters with high
   densities of bacteria or pathogens.

The Ohio Nowcast webpage (http://www.ohionowcast.info/index.asp) is a great
example of an outreach website and includes detailed information for the
public, such as:

• Where Nowcast is used (detailed maps).
• How Nowcast works.
• How Nowcast performs.
• Accuracy of Nowcast for each beach.
• List of variables used to make predictions.
• List of current advisories.
• FAQs.

                               Public Outreach
                               Public outreach involves directly communicating with the public about
                               beach health and safety. You should consider whether notifications
                               and advisories are easily accessible and whether you are effectively
                               communicating key information. The National Beach Guidance (USEPA
                               2014) discusses a number of possible formats for conducting outreach. The
                               Chicago Park District has an especially good outreach program, which
                               includes a public education campaign and a Beach Ambassadors program
                               (see the case study for more information).

                               Other Uses for Predictive Models
                               A predictive model might provide other benefits to a beach program besides
                               being used for notifications. For example, the Michigan Department of
                               Natural Resources uses beach models as a tool to identify and remediate
                               sources of contamination to assist with Total Maximum Daily Load (TMDL)
                               development for beaches.

Step 6: Evaluate the Predictive Tool over Time

Introduction to Step 6
You should plan to evaluate your model periodically to verify that the
performance goals are being met. Many programs choose to assess model
accuracy at the end of the beach season. Any significant decreases in
performance might signal that environmental conditions that affect FIB
density have changed. For example, the season might have been unusually
wet or dry. In this case you might want to conduct more exploratory data
analysis (Step 3) and build a new model (Step 4) using the past season's data
as part of the historical database. Your "rebuilt" updated model may or may
not include the same explanatory variables. The overall goal is to keep the
model current with the environmental conditions that affect FIB density at
the beach.
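
An end-of-season check can be as simple as lining up each day's model call against the sample result for the same day. The Python sketch below is illustrative only: the function name, the sample data, and the 80 percent accuracy goal are made-up placeholders, and sensitivity and specificity are shown as common companions to overall accuracy rather than as measures this guide requires.

```python
def evaluate_season(predicted, observed):
    """Compare daily model calls with sample-based exceedances.

    predicted, observed: equal-length lists of booleans (True = exceedance).
    Returns accuracy, sensitivity, and specificity as fractions (None when a
    rate cannot be computed because there were no such days).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)              # exceedance correctly predicted
    tn = sum((not p) and (not o) for p, o in pairs)  # non-exceedance correctly predicted
    fp = sum(p and (not o) for p, o in pairs)        # false alarm
    fn = sum((not p) and o for p, o in pairs)        # missed exceedance
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Made-up season of 10 sampled days (True = exceedance).
predicted = [True, False, False, True, False, False, False, True, False, False]
observed = [True, False, True, False, False, False, False, True, False, False]
results = evaluate_season(predicted, observed)
ACCURACY_GOAL = 0.80  # hypothetical performance goal set by the Beach Team
print(results)
print("Meets accuracy goal:", results["accuracy"] >= ACCURACY_GOAL)
```

Tracking missed exceedances separately from false alarms matters because the two errors have different consequences: one exposes swimmers to poor water quality, and the other closes or posts a beach unnecessarily.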

Several of the case studies at the end of this guide describe situations that
required officials to adjust their model in response to changing conditions or
circumstances.
   •  South Carolina Department of Health and Environmental Control
     updated their stormwater model by using radar data from NexRad
     instead of the data they previously obtained from rain gauges.

   •  Milwaukee Health Department is updating their Nowcast model by
     collaborating with a new partner for their local expertise and using
     improved data at three beach sites instead of at the one beach where
     the model was initially used. They hope to automate data integration,
     translation, and loading to  improve the efficiency of their model.

   •  The City of Racine plans to update their model every year to ensure it
     is still predictive. They also will continue to evaluate whether they can
     decrease monitoring frequency.

   •  Charles River Watershed Association has continued to enhance its
     model over the past 15 years, continues to look for additional
     parameters that might improve model predictions, and has a
     future goal of real-time data collection for a real-time model.

Changes to the Fate and Transport of FIB
The predictive tools described in this guidance assume that the relationships
between FIB and the environmental conditions associated with the
explanatory variables remain constant over time. This is almost never the
case, however, because landscapes and human activities change over time
and may affect bacteria sources and their movement through the drainage
area. An annual sanitary survey of your beach would likely capture many
of these changes. Some of the factors that affect FIB movement include the
following:
                                   • Land use alterations.
                                   • Infrastructure changes (e.g., repairs to leaky sewer lines).
                                   • Changes to bounding structures (e.g., jetties, breaker walls, piers).
                                   • Changes in pollutant sources (e.g., increase or decrease in algal blooms
                                     or presence of wildlife).

                                All of these factors can cause shifts in the underlying processes influencing
                                FIB densities at your beach.

Changes to Data Sources
Step 2 included a discussion of some of the key attributes of the data
needed to build and operate the model to make same-day FIB predictions.
In general, independent variable data need to be collected in a manner
consistent with the historical data used to build the model. Additionally,
data collected locally are preferred over data obtained from external or
online sources, primarily because your model is site-specific. In reality,
however, your choice of data sources is often driven by the availability of
funding and resources. Obtaining data that are readily available online is
much less expensive and resource intensive than deploying and maintaining
your own system of rain gauges, weather stations, water quality sondes, and
other equipment. For example, USGS is working with VB developers to make a
variety of explanatory data collected by Federal agencies easier for users
to access and process using the EnDDaT online system. As described in the
Stormwater Model (Horry County, South Carolina) case study, the SCDHEC
initially used rainfall data collected at local rain gauges for their
predictive models, but over time they switched to using NexRad data, which
eliminated the need for updates to and maintenance of the rain gauges,
while also improving timeliness and accuracy of the model. In other cases,
a beach program might have originally used data from NWS but plans to
install local rain gauges to get more accurate rainfall measurements for
their beach. The MHD initially collected data for its predictive model
using a sonde, but because of high maintenance costs, they chose to use
NWS rainfall data and the previous day's E. coli concentrations, along with
sanitary survey data, which provided additional insight on weather,
rainfall, algae content, litter, and wildlife.

Ozaukee County, Wisconsin
The Ozaukee County Public Health Department developed a model for a lake.
In 2012, they experienced unusual weather conditions: no rain fell, and the
lake temperatures were very warm. The biology and ecology of the lake
changed, and the nearshore environment became the source of high FIB
densities. Advisories were issued for about one-third of the 2012 beach
season, and the model was found to be only 60 percent accurate. A revised
model would only be useful if these conditions become a trend.

If the data source changes, you will need to collect enough data to rebuild
your model (see the recommendations on amount of data in Step 1), because
the relationships between the independent variables and FIB will differ
from the relationships in the original model.

Changes to Your Beach Program
The needs of your beach program and the availability of resources can also
change over time. You will need to reevaluate your beach program and its
need for a predictive tool and assess whether you have the resources to meet
that need.

You also should evaluate your notification protocol over time to make sure
it is still appropriate for making the best decisions about beach notifications.
For example, if model  results are highly accurate, a beach program that
initially used both sampling results and modeling results to make beach
notification decisions might decide to rely solely on modeling results for
their beach.  In that case, they might limit sampling to the days on which
the model predicts an exceedance of the water quality standard or other
notification threshold.
                               Bibliography
                               Alm, E.W., J. Burke, and A. Spain. 2003. Fecal indicator bacteria are
                                   abundant in wet sand at freshwater beaches. Water Research
                                   37:3978-3982.

                               APHA (American Public Health Association). 1998. Standard Methods for
                                   the Examination of Water and Wastewater, 20th ed. American Public
                                   Health Association, Washington, DC.

                               Biedrzycki, Paul, Disease Control and Environmental Health, City of
                                   Milwaukee Health Department. 2012-2013. Personal communication.

                               Boehm, A.B., R.L. Whitman, M.B. Nevers, D. Hou, and S.B. Weisberg.
                                   2007. Nowcasting recreational water quality. In Statistical Framework
                                  for Recreational Water Quality Criteria and Monitoring, ed. L. Wymer.
                                   Wiley-Interscience, Chichester, West Sussex, England.

                               Breitenbach, Cathy, Chicago Parks District. 2012. Personal communication.

                               Briggs, Shannon, Michigan Department of Environmental Quality. 2012.
                                   Personal communication.

                               Brooks, W.R., M.N. Fienen, and S.R. Corsi. 2013. Partial least squares
                                   for efficient models of fecal indicator bacteria on Great Lakes beaches.
                                   Journal of Environmental Management 114:470-475.

                               Charles River Watershed Association. Charles River Water Quality
                                   Notification Flagging Program.
                                   http://www.crwa.org/field-science/water-quality-notification.

                               Chicago Park District. 2012. Chicago Park District Improves Beach
                                   Monitoring for 2012 Season.
                                   http://www.chicagoparkdistrict.com/chicago-park-district-improves-beach-monitoring-for-2012-season.

                               Cicero, K. The 10 Best Beaches for Families: 2011. Parents Magazine. June
                                   2011. Accessed January 22, 2013. http://www.parents.com.

                               Clark, J., Hortobagyi, M., and Yancey, K.B. Just for Summer: 51 Great
                                   American Beaches. USA Today. March 27, 2012. Accessed January 22,
                                   2013. http://travel.usatoday.com.

Converse, R.R., J.L. Kinzelman, E.A. Sams, E. Hudgens, A.P. Dufour, H.
    Ryu, J.W. Santo-Domingo, C.A. Kelty, O.C. Shanks, S.D. Siefring, R.A.
    Haugland, and T.J. Wade. 2012. Dramatic Improvements in Beach Water
    Quality Following Gull Removal. Environmental Science and Technology
    46:10206-10213.

Cyterski, M., W. Brooks, M. Galvin, K. Wolfe, R. Carvin, T. Roddick, M.
    Fienen, and S. Corsi. 2013. Virtual Beach 3.0.4: User's Guide. National
    Exposure Research Laboratory, U.S. Environmental Protection Agency,
    Athens, GA and U.S. Geological Survey, Middleton, WI.

Eleria, A. and R.M. Vogel. 2005. Predicting fecal coliform bacteria levels in
    the Charles River, Massachusetts, USA. Journal of the American Water
    Resources Association. No. 03111. October 2005.

Francy, D. 2009. Use of predictive models and rapid methods to nowcast
    bacteria levels at coastal beaches. Aquatic Ecosystem Health and
    Management 12(2):177-182.

Francy, D.S., and Darner, R.A. 2006. Procedures for Developing Models to
    Predict Exceedances of Recreational Water Quality Standards at Coastal
    Beaches: U.S. Geological Survey Techniques and Methods 6-B5, 34 p.

Francy, D.S., A.M.G. Brady, R.B. Carvin, S.R. Corsi, L.M. Fuller, J.H.
    Harrison, B.A. Hayhurst, J. Lant, M.B. Nevers, P.J. Terrio, and T.M.
    Zimmerman. 2013a. Developing and Implementing Predictive Models for
    Estimating Recreational Water Quality at Great Lakes Beaches. Scientific
    Investigations Report 2013-5166. U.S. Geological Survey, Reston, VA.
    Accessed March 2015.
    http://pubs.usgs.gov/sir/2013/5166/pdf/sir2013-5166.pdf.

Francy, D.S., E.A. Stelzer, J.W. Duris, A.M.G. Brady, and J.H. Harrison.
    2013b. Predictive Models for Escherichia coli Concentrations at Inland
    Lake Beaches and Relationship of Model Variables to Pathogen Detection.
    USGS Staff-Published Research. Paper 706.

Fulton, Jeff. No date. Public Beaches in Chicago. USA Today.
    http://traveltips.usatoday.com/public-beaches-chicago-53741.html.

Hansen, D.L., S. Ishii, M.J. Sadowsky, and R.E. Hicks. 2011. Waterfowl
    abundance does not predict the dominant avian source. Journal of
    Environmental Quality 40:1924-1931.

                               Hartmann, J.W., S.F. Beckerman, R.M. Engeman, and T.W. Seamans. 2013.
                                   Report to the City of Chicago on Conflicts with Ring-billed Gulls and the
                                   2012 Integrated Ring-billed Gull Damage Management Project. USDA
                                   National Wildlife Research Center, Staff Publications. Paper 1145.

                               Helsel, D.R. and R.M. Hirsch. 2002. Statistical Methods in Water Resources.
                                   Elsevier Publishing.

                               Hou, D., S.J.M. Rabinovici, and A.B. Boehm. 2006. Enterococci Predictions
                                   from Partial Least Squares Regression Models in Conjunction with a
                                   Single-Sample Standard Improve the Efficacy of Beach Management
                                   Advisories. Environmental Science and Technology (40)6:1737-1743.

                               Kesteloot, K., A. Azizan, R. Whitman, and M. Nevers. 2012-2013. New
                                   recreational water testing alternatives. Park Science 29(2).

                               Kinzelman, Julie, City of Racine. 2012-2013. Personal communication.

                               Kurdas, Stephan, City of Racine. 2012-2013. Personal communication.

                               Mas, D.M.L., and K. Baker. Fuss and O'Neill. BIT Guidance for Developing
                                   Predictive Models for Ontario Beaches. Ontario Ministry of the
                                   Environment. Toronto, Ontario Canada. February 2011.

                               Mednick, A.C. 2009. Accessing Online Data for Building and Evaluating Real-
                                   Time Models to Predict Beach Water Quality. Publication PUB-SS-1063.
                                   Wisconsin Department of Natural Resources, Madison, WI. Accessed
                                   March 2015. http://dnr.wi.gov/files/PDF/pubs/ss/SS1063.pdf.

                               Mednick, Adam, Wisconsin Department of Natural Resources. 2012.
                                   Personal communication.

                               NRDC (Natural Resources Defense Council). Testing the Waters: South
                                   Carolina, http://www.nrdc.org/water/oceans/ttw/sc.asp.

                               Seltman, H.J. 2013. Experimental Design and Analysis, Chapter 4
                                   Exploratory Data Analysis. June 10, 2013.

                               Olyphant, G.A., and R.L. Whitman. 2004. Elements of a predictive model
                                   for determining beach closures on a real time basis: The case of 63rd
                                   Street Beach Chicago. Environmental Monitoring and Assessment
                                   98(1-3):175-190.

Our 7 Top Midwest City Beaches. Midwest Living Magazine. July-August
    2010. Accessed January 22, 2013. http://www.midwestliving.com.

Porter, Dwayne, University of South Carolina. 2012. Personal
    communication.

Rockwell, D., K. Campbell, G. Lang, D. Schwab, G. Mann, and R.
    Wagenmaker. 2013. Beach Water Quality Decision Support System.
    Technical Memorandum GLERL-156. National Oceanic and
    Atmospheric Administration, Ann Arbor, MI. Accessed March 2015.
    http://www.glerl.noaa.gov/ftp/publications/tech_reports/glerl-156/
    tm-156.pdf.

Rockwell, David, University of Michigan. 2012. Personal communication.

Schwab, D.J., and K.W. Bedford. 1994. The Great Lakes Forecasting System.
    In Coastal and Estuarine Studies: Coastal Ocean Prediction, ed. C.N.K.
    Mooers. American Geophysical Union, Washington, DC.

South Carolina Department of Health and Environmental Control. Beach
    Monitoring Program.
    http://www.scdhec.gov/HomeAndEnvironment/Pollution/
    DHECPollutionMonitoringServices/BeachMonitoring/.

Southeast Coastal Ocean Observing Regional Association. Water Quality
    Observations and Models Help Managers Make Decisions on Issuing
    Swim Advisories, www.secoora.org.

Torrens, Sean, South Carolina Department of Health and Environmental
    Control. 2012-2013. Personal communication.

USEPA (U.S. Environmental Protection Agency). 1999. Action Plan
   for Beaches and Recreational Waters. EPA 600/R-98-079. U.S.
    Environmental Protection Agency, Office of Research and Development
    and Office of Water, Washington, DC.

USEPA (U.S. Environmental Protection Agency). 1999. Review of Potential
    Modeling Tools and Approaches to Support the BEACH Program. EPA-
    823-R-99-002. U.S. Environmental Protection Agency, Office of Science
    and Technology, Washington, DC.

                               USEPA (U.S. Environmental Protection Agency). 2002. Time-Relevant Beach
                                   and Recreational Water Quality Monitoring and Reporting. United States
                                   Environmental Protection Agency, Office of Research and Development,
                                   National Risk Management Research Laboratory. EPA/625/R-02/017.
                                   October 2002. Cincinnati, Ohio.
                                   http://www.scdhec.gov/HomeAndEnvironment/Water/SwimSafety/

                               USEPA (U.S. Environmental Protection Agency). 2007. Report of the Experts
                                   Scientific Workshop on Critical Research Needs for the Development of
                                   New or Revised Recreational Water Quality Criteria. EPA 823-R-07-
                                   006. U.S. Environmental Protection Agency, Office of Water, Office of
                                   Research and Development. Airlie Center, Warrenton, Virginia.

                               USEPA (U.S. Environmental Protection Agency). 2010a. Predictive Tools
                                   for Beach Notification.  Volume I: Review and Technical Protocol. EPA-
                                   823-R-10-003. U.S. Environmental Protection Agency, Office of Water,
                                   Washington, DC.

                               USEPA (U.S. Environmental Protection Agency). 2010b. Predictive
                                   Modeling at Beaches. Volume II: Predictive Tools for Beach Notification.
                                   EPA-600-R-10-176. U.S. Environmental Protection Agency, National
                                   Exposure Research Laboratory, Athens, Georgia.

                               USEPA (U.S. Environmental Protection Agency). 2010c. Sampling and
                                   Consideration of Variability (Temporal and Spatial) for Monitoring of
                                   Recreational Waters. EPA-823-R-10-005. U.S. Environmental Protection
                                   Agency,  Office of Water, Washington, DC. Accessed March 2015.
                                   http://www.epa.gov/sites/production/files/2015-11/documents/sampling-
                                   consideration-recreational-waters.pdf.

                               USEPA (U.S. Environmental Protection Agency). 2012. Recreational Water
                                   Quality Criteria. EPA 820-F-12-058. U.S. Environmental Protection
                                   Agency,  Office of Water, Washington, DC.

                               USEPA (U.S. Environmental Protection Agency). 2014. National Beach
                                   Guidance and Required Performance Criteria for Grants. EPA-
                                   823-B-14-001. U.S. Environmental Protection Agency, Office of Water,
                                   Washington, DC.

                               Whitman, R.L. and M.B. Nevers. 2003. Foreshore Sand as a Source  of
                                   Escherichia coli in Nearshore Water of a Lake Michigan Beach. Applied
                                   and Environmental Microbiology 69(9): 5555-5562.

Whitman, R.L., D.A. Shively, H. Pawlik, M.B. Nevers and M.N.
    Byappanahalli. 2003. Occurrence of Escherichia coli and Enterococci in
    Cladophora (Chlorophyta) in Nearshore Water and Beach Sand of Lake
    Michigan. Applied and Environmental Microbiology 69(8):4714-4719.

Whitman, R.L., V.J. Harwood, T.A. Edge, M.B. Nevers, M. Byappanahalli,
    K. Vijayavel, J. Brandao, M.J. Sadowsky, E.W. Alm, A. Crowe, D.
    Ferguson, Z. Ge, E. Halliday, J. Kinzelman, G. Kleinheinz, K. Przybyla-
    Kelly, C. Staley, Z. Staley, and H. Solo-Gabriele.  2014. Microbes in beach
    sands: integrating environment, ecology and public health. Rev Environ
    Sci Biotechnol 13:329-368.

Wood, Julie, Charles River Watershed Association. 2012-2013. Personal
    communication.

Ziegler, Dan, Ozaukee County Public Health Department. 2012. Personal
    communication.

-------
Case Study
The South Shore Beach Model (Milwaukee, Wisconsin)
Introduction
South Shore Beach is in Milwaukee, Wisconsin's
South Shore Park on the western shore of Lake
Michigan. South Shore Beach is a public beach with
150 meters of sandy shoreline within the South Shore
Marina (owned and operated by the South Shore
Yacht Club). A 20-meter embankment separates the
sandy beach area from a cobble/pebble beach area
that has a high-sloping shore (South Shore Rocky
Area). The entire beach and marina area is partially
enclosed by a breakwall, approximately 300 meters
offshore, which limits wave action, water circulation,
and exchange with the outer  harbor. The beach is a
few kilometers south of Milwaukee Harbor and the
Milwaukee Metropolitan Sewerage District Jones
Island Water Reclamation Facility. Three rivers (the
Milwaukee, Menomonee, and Kinnickinnic) reach
a confluence prior to discharging to Lake Michigan
inside the Milwaukee Harbor breakwall.

Visitors to Milwaukee's beaches on hot summer
weekend days exceed 1,000 persons for all three
public beaches combined: Bradford Beach, McKinley
Beach, and South Shore Beach. South Shore Beach is
home to a number of waterfowl and shore birds given
its proximity to a public park and related greenspace.
South Shore Beach also experiences algal blooms of
Cladophora, which is native to Lake Michigan and
nearshore environments.

In 1998 the City of Milwaukee Health Department
(MHD) decided to develop a beach water quality
predictive model for purposes of (1) improving
water quality forecasting at the public beaches and
(2) improving water quality advisories and related
           messaging to public beachgoers when
           water quality is unsafe for public
           swimming or contact because of
           elevated bacteria levels. In 2005 MHD
           implemented a different predictive
           model, and variations  of the model are
           still in use today.
           Water Quality
           South Shore Beach has a history of
           poor water quality due to elevated fecal
           bacteria levels. Potential sources of
           fecal bacteria contamination include
           combined sewer overflows (CSOs);
           urban/suburban and agricultural
           runoff from the Milwaukee River
           Basin; runoff from impervious surfaces,
           including South Shore Park parking lots,
           pedestrian sidewalks and roadways, and
           marina infrastructure including docks,
           slips, and boats; and domestic and wild
           animal populations including Canada
           geese, gulls, and other waterfowl
           flocks. The beach is directly adjacent to
           the South Shore  Yacht Club and a small
           paved parking area that drains into the
           lake.



  [Photo: Milwaukee Harbor. (USACE)]

  The Natural Resources Defense Council has included
  South Shore Beach several times on its list of the top
  10 dirtiest beaches in the United States. A possible
  contributor to the water quality problem might
  be an offshore breakwall (stone jetty), designed to
  block wave action and protect the lakefront from
  erosion. Unfortunately, it also limits the circulation
  of freshwater into the shallow-depth beach area.
  Pollution that enters the relatively stagnant lake
  through runoff near or around the beach area is
  therefore not readily turned over. To reduce pollution
  entering the lake, Milwaukee County installed a
  trench drain and rain garden along a parking lot
  near the beach. These practices were ineffective.
  The county is considering relocating the beach 100
  yards south—to the other side of the breakwall—as a
  possible long-term solution to improving beach water
  quality conditions during the summer season.

  Model  Development
  MHD used two different models over time—the
  EMPACT model and the Nowcast model. Both are
  described here in separate subsections.

  EMPACT Model
  In 1998 MHD developed a statistical model for three
  of its public beaches using funding awarded through
  the U.S. Environmental Protection Agency (EPA)
  Environmental Monitoring for Public Access and
  Community Tracking (EMPACT) grant. The model is
  based on 24-hour rainfall data and previous 24-hour
  bacterial sampling data (E. coli MPN/100 mL),
           which are the two most predictive variables. The
           University of Indiana and U.S. Geological Survey
           (USGS) assisted MHD in developing the model.
           Key factors when selecting which beach model
           to further develop and refine were the amount of
           funding and availability of technical support (both
           data management and model development) that
           could be leveraged to achieve improved predictive
           water quality outcomes. The EMPACT program
           significantly helped MHD take advantage of new
           technologies to provide environmental risk-related
           information to the public in a reliable  and accurate
           near real-time context.

           When developing the model in 1998, the MHD
           was initially excited for the opportunity to try new
           technology for improving the accuracy of water
           quality advisories; however, the project posed many
           unanticipated technical and maintenance challenges.
           To collect data for the model, USGS used a sonde.
           A sonde is a water quality monitoring instrument
           that can measure numerous parameters including
           temperature, conductivity, salinity, dissolved oxygen,
           pH, turbidity, and depth. The harsh lake environment
           was unsuitable for long-term deployment of
           instrumentation. Furthermore, MHD did not
           have sufficient internal capability or resources to
           adequately manage the myriad sampling, data
           analysis, and routine equipment maintenance tasks. In
           addition, budget and staff cuts made the model too
  complex to sustain by a local public health agency
  with limited environmental health fiscal resources.
  Eventually MHD exhausted all funding and related
  external agency technical support, and stopped using
  the EMPACT model as its primary predictive model
  at the end of the 2004 beach season. The EMPACT
  project provided valuable insight, however, into the
  challenges of developing cost-effective and sustainable
  predictive water quality models at the local level and
  in the context of the Lake Michigan public beach
  environment.

  Nowcast Model
  After the 2004 beach season, MHD decided that
  a simpler Nowcast model would be more effective
  and discontinued use of the EMPACT model. For
  the Nowcast model, the development team selected
  a single public beach in Milwaukee (South Shore
  Beach), where the monitoring equipment could be
  located near a secure power source, protected against
  vandalism, and shielded from harsh environmental
  conditions. South Shore Beach also traditionally
recorded the highest fecal indicator bacteria counts
and, therefore, represented the highest potential
health risk to the public during the typical beach
season (June-August).

It took MHD approximately 6 to 8 months to develop
the Nowcast model, which was ready for use by the
start of the 2005 beach season. If the model were
developed today, MHD could do so more efficiently
because better statistical and modeling software is
more widely available and less costly to the end user.

Data and Variables
EMPACT Model
The initial variables MHD considered for the
EMPACT model included total rainfall for the
previous 24 hours, pH, conductivity, wave height,
water temperature, and Escherichia coli densities
from the previous 24-hour sampling period. The
MHD deployed a sonde in the water near the beach to
collect real-time water quality data. MHD derived daily
rainfall data from the National Weather Service (NWS),
which relies on geographically dispersed city
weather stations and gauges. In addition, sanitary
  surveys (typically conducted annually by the MHD)
  were useful in identifying and describing site-specific
  attributes and pollution influences at each of the
  Milwaukee public beaches. The MHD used regression
  analysis to determine which independent variables
  of interest might be most highly associated with or
  predictive of elevated E. coli counts at public beaches
  on a seasonal basis.
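
The variable screening described above can be illustrated with a short sketch. It ranks candidate predictors by the strength of their linear (Pearson) correlation with log10-transformed E. coli counts, which is one simple precursor to regression analysis and not necessarily the procedure MHD followed. The function names and the sample values are hypothetical and were made up only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_predictors(candidates, ecoli_counts):
    """Rank candidate variables by |r| against log10(E. coli)."""
    log_ecoli = [math.log10(c) for c in ecoli_counts]
    scores = {name: pearson_r(values, log_ecoli)
              for name, values in candidates.items()}
    # Strongest association first, regardless of sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up illustrative values (NOT monitoring data).
candidates = {
    "rain_24h_in": [0.0, 0.1, 0.6, 1.2, 0.0, 0.8],
    "water_temp_c": [18.0, 19.5, 18.5, 20.0, 21.0, 19.0],
}
ecoli_mpn = [40, 60, 300, 900, 35, 500]

for name, r in rank_predictors(candidates, ecoli_mpn):
    print(f"{name}: r = {r:+.2f}")
```

A ranking like this only suggests which variables deserve a closer look; a regression fitted to several seasons of local data is still needed to judge predictive value.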

  Predictive variables differed between beaches, but
  rainfall data were used in determining water quality
  advisories at all three. Total rainfall over a previous
  24-hour period was determined to be an important
  predictive variable  in all three beach models primarily
  due to contributions of: (1) wastewater treatment
  plant induced CSOs and diversions, (2) sanitary sewer
  cross-connections and infiltration, and (3) stormwater
  runoff. MHD continued collecting select physical and
  chemical water quality data to integrate within beach
  water quality modeling through 2004.

            Design of beaches varies greatly and can determine the
            magnitude of impact, as well as duration of a pollution
            event (how much pollution input and time interval
            required for a beach to naturally recover). More
            specifically, for Milwaukee beaches, total rainfall was
            most highly correlated with bacterial contamination
            and predictive of water quality exceedances when
            it exceeded one-half inch and occurred early in
            the beach season (June).
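
As a rough sketch, the rainfall finding above could be turned into a simple screening rule. The one-half inch threshold and the June early-season window come from the text; the function name and the three risk labels are hypothetical, and the rule is an illustration only, not MHD's model.

```python
from datetime import date

RAIN_THRESHOLD_IN = 0.5   # one-half inch of rain in 24 hours (from the text)
EARLY_SEASON_MONTH = 6    # June (from the text)


def rainfall_risk(rain_24h_in, sample_date):
    """Return a coarse risk label from 24-hour rainfall and time of season.

    The labels "low", "elevated", and "high" are hypothetical.
    """
    heavy_rain = rain_24h_in > RAIN_THRESHOLD_IN
    early_season = sample_date.month == EARLY_SEASON_MONTH
    if heavy_rain and early_season:
        return "high"
    if heavy_rain:
        return "elevated"
    return "low"


print(rainfall_risk(0.8, date(2015, 6, 10)))
```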

            Raw and summarized data were available daily or
            by request through a public website. MHD collected
            the data electronically via the sonde and transmitted
            it to the website, after review and analysis, for use
            by academic and research entities, the  general
            public, and other interested parties (e.g., media and
            environmental groups).

            [Figure: Wisconsin Beach Health Advisory website.]

            Nowcast Model
            The Nowcast model that MHD developed after
            the 2004 beach season is based primarily on
            regional precipitation, using the previous
            24-hour rainfall total. The MHD model development
  team continues to explore and identify markers for
  nonpoint sources of pollution including chemical
  biomarkers in stormwater discharge (e.g., caffeine
  and triclosan derivatives). Avian and waterfowl
  populations, as well as algal impact, are noted
  but have not been particularly predictive of beach
  water quality in terms of contributions to microbial
  contamination of public health significance.
  Cladophora blooms, however, have increased in the
  past decade at each public beach, causing primarily
  nuisance and aesthetic concerns (e.g., objectionable
  odor and water discoloration). USGS and the
  Wisconsin Department of Natural Resources support
  all data collection and statistical analyses needed to
  develop and implement the Nowcast model. Most
  recently, the MHD partnered with research faculty
  at the newly formed Zilber School of Public Health
  at the University of Wisconsin-Milwaukee (UWM)
  to improve Nowcast modeling and identify other
  indicators of water contamination predictive of or
  directly associated with adverse public health impact.

  Model Implementation
  MHD exclusively used the EMPACT model for
  beach water quality advisory decision making
  from 1998-2004. However, the model
predicted exceedances with a maximum accuracy of
60 to 70 percent at only one beach and often
approached only 50 percent accuracy at the remaining
two beaches. MHD also noted that the model's
predictive accuracy tended to wane at each beach as
the summer progressed, which suggests some level
of seasonality or unidentified influences to water
quality between early season and late season beach
monitoring periods. As a result, MHD confidence in
sustaining the model diminished over time. MHD
assessed the effectiveness of the model by examining
the degree of sensitivity and specificity. The criterion
for issuing advisories was the exceedance of EPA's
single sample maximum or geometric mean threshold
for E. coli as expressed in MPN/100 mL.
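
The sensitivity and specificity assessment described above can be sketched as follows. The code compares model-predicted concentrations with observed concentrations against a single exceedance threshold. The 235 MPN/100 mL default is an assumed placeholder, not a value given in this case study; substitute the criterion your program uses. The function names and sample values are hypothetical.

```python
ASSUMED_THRESHOLD = 235.0  # MPN/100 mL; placeholder assumption, not from the text


def confusion_counts(predicted, observed, threshold=ASSUMED_THRESHOLD):
    """Count true/false positives and negatives for exceedance predictions."""
    tp = fp = tn = fn = 0
    for pred, obs in zip(predicted, observed):
        pred_exceeds = pred > threshold
        obs_exceeds = obs > threshold
        if pred_exceeds and obs_exceeds:
            tp += 1      # advisory predicted and warranted
        elif pred_exceeds:
            fp += 1      # advisory predicted but not warranted
        elif obs_exceeds:
            fn += 1      # exceedance missed by the model
        else:
            tn += 1      # correctly predicted as below the threshold
    return tp, fp, tn, fn


def model_performance(predicted, observed, threshold=ASSUMED_THRESHOLD):
    """Return sensitivity, specificity, and overall accuracy (None if undefined)."""
    tp, fp, tn, fn = confusion_counts(predicted, observed, threshold)
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "accuracy": (tp + tn) / total if total else None,
    }


# Made-up illustrative values (NOT monitoring data).
predicted = [300, 100, 400, 50, 250, 90]
observed = [500, 80, 120, 60, 300, 400]
print(model_performance(predicted, observed))
```

Sensitivity is the share of actual exceedances the model caught, and specificity is the share of acceptable days it correctly left without an advisory.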

MHD uses the Nowcast model output for beach
advisories. Because model results continue to be less
than optimal in terms of predictive value,  MHD
relies on long-term trending of data and overall
environmental conditions (i.e., water temperature,
multiple day bacterial sampling results, and heavy
rainfall) to refine the issuance of water quality
advisories. MHD posts advisories for 24-hour
intervals and uses the model and trending to
determine when the advisories can be lifted. MHD
would like to see more readily visible, meaningful,
and informative public signs posted on each of the
beaches including explicit illness risk and  prevention

  messaging. However, some key community
  policymakers and associated stakeholders (beach
  operators) are concerned that signs would interfere
  with the beach ambiance, tourism, and patron
  use. Current beach water quality advisory signage,
  therefore, remains limited in size, posting, location,
  and level of content.

  Model Cost
  The overall cost to develop the EMPACT model was
  initially in the range of $50,000-$75,000. The most
  costly aspects were siting, maintaining, and refining
  the beach sonde because of the harsh Lake Michigan
  environment and lack of MHD in-house capacity
  and expertise in this regard. Overall, the model did
  not prove to be cost effective due in large part to the
  cost of maintaining the sonde. Annual maintenance
  costs for the sonde ranged from $5,000-$10,000.
  New equipment replacement and upgrades cost an
  additional $20,000-$50,000 every 2 years.
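
To put the sonde figures above on a per-year basis, the short calculation below spreads the replacement and upgrade cost evenly over its 2-year cycle and adds annual maintenance. The even spreading is a simplifying assumption, and the function name is hypothetical.

```python
def annualized_sonde_cost(maintenance_per_year, replacement_cost,
                          replacement_interval_years=2):
    """Return (low, high) annual cost from (low, high) input ranges in dollars."""
    low = maintenance_per_year[0] + replacement_cost[0] / replacement_interval_years
    high = maintenance_per_year[1] + replacement_cost[1] / replacement_interval_years
    return low, high


# Ranges reported in the text: $5,000-$10,000 per year for maintenance and
# $20,000-$50,000 every 2 years for replacement and upgrades.
low, high = annualized_sonde_cost((5_000, 10_000), (20_000, 50_000))
print(f"${low:,.0f}-${high:,.0f} per year")  # prints "$15,000-$35,000 per year"
```

Under that assumption, the sonde alone cost roughly $15,000 to $35,000 per year, on top of the initial development cost.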

  Milwaukee's beach program currently has a budget of
  around $50,000. MHD is no longer using the sonde
  and has saved additional money by partnering with
  the Zilber School of Public Health at UWM, whose
  graduate students do some of the sampling and data
  collection. MHD has even been able to increase the
  sampling frequency to 5 times per week at each public
  beach over the season. This is a marked improvement
  over the 1-2 times per week that had been in place since 2006.

  Issues Encountered
  For the EMPACT  model, the sonde equipment was
  placed in a very harsh environment. It required
  weekly maintenance. Security and data feed issues
  contributed to the challenges encountered. MHD
  relied on external sources to provide the maintenance
  and replaced equipment on a more frequent basis
  than originally anticipated.

  In addition to the issues with the sonde, MHD
  did not have sufficient funding for refining and
  sustaining use of the model. The only statistical software
  they had in-house (Epi-Info) was primarily directed
  toward use in tracking the spread of communicable
           disease and outbreak management, which was not
           useful for developing an environmental predictive
           model for a beach water quality monitoring program.

           MHD needed software that is readily available
           and easy to use with basic comparison analysis
           capabilities. Most public health agencies do not have
           these resources in-house, and they do not have the
           technical familiarity and capabilities to effectively use
           the resources. This often creates a knowledge gap and
           vulnerability with regard to environmental statistics
           collection, analysis, trending, and interpretation.

           The EMPACT model was not piloted or tested before
           implementation. In hindsight, MHD should have
           presented the model to the regional beach stakeholder
           group for reaction and feedback and conducted
           beta testing. Moreover, a more thorough
           comparative analysis with other available models
           and methodologies as part of model implementation
           would have been helpful. The MHD staff also
           did not have sufficient knowledge and expertise to
           design, develop, implement, and evaluate a model that
           could be cost effective and sustained.

           Moving Forward
           MHD has  developed Nowcast models for each of the
           three public beaches located in Milwaukee. MHD
           developed the Bradford Beach Nowcast model in
           partnership with the Zilber School of Public Health
           at UWM and is working with Dr. Todd Miller
           and graduate students to conduct field sampling and
           monitoring on a seasonal basis. The MHD/UWM
           team collected water samples from Lake Michigan
           at three beach sites (Bradford, McKinley, and South
           Shore) from early June until late August 2015. UWM
           and MHD assessed these water samples for E. coli
           levels. UWM also investigated fecal coliform levels.
           In addition to fecal indicator bacteria, Dr. Miller's
           study is looking at chemical markers in wastewater,
           specifically the identification of wastewater bacteria
           involved in the degradation of triclocarban. This has
           been shown to be very effective in predicting FIB
           exceedances at beaches. Dr. Miller is also looking at

  temporal fluctuations in E. coli sampling; morning
  and afternoon results might be markedly different.

  This collaboration has yielded benefits in both the
  leveraging of available local expertise and improving
  the understanding of beach water quality as related
  to the protection of community health. The MHD/
  UWM team also recorded environmental conditions,
  including weather, rainfall, algae content, litter,
  and wildlife, for each beach on every date that
  they collected water samples. They will continue to
  translate and load the data into a database for long-
  term storage, analysis, and prediction forecasting.
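
A minimal sketch of the kind of long-term store described above is shown below, using Python's built-in sqlite3 module. The table name, column names, and sample record are hypothetical; the fields simply mirror the conditions the text says were recorded (weather, rainfall, algae, litter, and wildlife) alongside the E. coli result. This is not a description of the MHD/UWM team's actual database.

```python
import sqlite3

COLUMNS = ("beach", "sample_date", "ecoli_mpn", "rain_24h_in",
           "weather", "algae", "litter", "wildlife")

SCHEMA = """
CREATE TABLE IF NOT EXISTS beach_observation (
    beach        TEXT NOT NULL,
    sample_date  TEXT NOT NULL,   -- ISO 8601 date, e.g. 2015-06-10
    ecoli_mpn    REAL,            -- E. coli, MPN/100 mL
    rain_24h_in  REAL,            -- previous 24-hour rainfall, inches
    weather      TEXT,
    algae        TEXT,
    litter       TEXT,
    wildlife     TEXT,
    PRIMARY KEY (beach, sample_date)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the observation database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def load_observation(conn, record):
    """Insert or replace one field record (a dict keyed by column name)."""
    values = [record.get(col) for col in COLUMNS]
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT OR REPLACE INTO beach_observation ({', '.join(COLUMNS)}) "
        f"VALUES ({placeholders})",
        values,
    )
    conn.commit()


# Made-up illustrative record (NOT monitoring data).
conn = open_store()
load_observation(conn, {
    "beach": "South Shore", "sample_date": "2015-06-10",
    "ecoli_mpn": 120.0, "rain_24h_in": 0.3,
    "weather": "overcast", "algae": "moderate",
    "litter": "low", "wildlife": "gulls present",
})
```

Keying each row on beach and sample date means a corrected result for the same day overwrites the earlier entry rather than creating a duplicate.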

  Further work will automate data integration,
  translation, and loading. The team will explore
  development of a website and an appropriate secure
  interface to provide access to elements from the
  database and forecasting framework to researchers,
  other government agencies, and members of the
  public. The team continues to use the USGS EnDDaT
  service, NWS data, and sanitary surveys periodically
  conducted by MHD. They believe that these types
  of readily available inputs will result in a more cost-
  effective model for use by the MHD in determining
  seasonal water quality advisories at each of the
  public beaches. The team is no longer using the
  sonde equipment, which has  significantly reduced
  maintenance costs. They are currently using rainfall
  data and the previous day's E. coli concentrations.
  The focus of the new modeling efforts was expanded
  to all three beaches in 2014, although significant
  attention continues to be spent on water quality
  conditions at Bradford Beach. Bradford Beach is very
  popular and supports various recreational activities,
  including national volleyball tournaments, and has
  numerous beachfront attractions including a pavilion,
  beachfront tiki bars, and recreational equipment.

  Finally, the team hopes to refine the predictive model
  and generate more hypotheses on the contribution
  of various sources of intermittent pollution at each
  public beach. For example, they have determined that
  birds and algal blooms were not particularly relevant
  factors at every beach and that chemical markers
in wastewater, along with sub-daily fluctuations
of E. coli concentrations, may be more important
in future predictive modeling initiatives. In 2016,
MHD is planning to pilot the implementation of
buoys equipped with various water quality sensors at
each beach by partnering with Dr. Todd Miller. They
will evaluate the ability to more rapidly collect data
relevant to beach water quality conditions and refine
existing models to improve predictive accuracy.

Advice and Lessons Learned
In 1998 the EMPACT beach predictive model
developed and used by MHD was cutting-edge
because it attempted to identify key environmental
variables other than rainfall that would help predict
elevated E. coli levels at three public beaches in
Milwaukee. It also pioneered the collection and
analysis of select and real-time physical and chemical
characteristics of beach water quality for use by local
public health authorities in determining the need for
posting of beach water quality advisories. However,
the model did not readily improve predictive accuracy
as compared to simple use of previous 24-hour
rainfall measurements, nor was it cost effective. The
project did, however, provide valuable information
about the unique characteristics and attributes of
each beach site in Milwaukee and it allowed MHD
to consider continued exploration of more scientific
and evidence-based approaches important to the
successful development, testing, implementation, and
evaluation of future predictive models.

Overall, the struggles with the initial EMPACT model
had a major impact on MHD's beach monitoring
program. They do not regret going through the
process because of how much they learned. MHD
understands that citizens expect them to protect
public health; therefore, they need the tools to provide
the best available information and meet the needs of
each community. The model used must be a good fit
for the local public health department—in MHD's
case, this meant a simple, low-maintenance, user-
friendly model that allows them to share accurate
health information with the public. It is very

  important to earn and keep the public's trust. False
  positives and errors must be minimized.

  Paul Biedrzycki of MHD also offers the following
  advice to fellow beach managers:
  1. Conduct a broad stakeholder planning and review
     process.

  2. Review evidence-based best practices from other
     jurisdictions and research studies.

  3. Build "buy-in" from local policymakers for
     resource allocation (program funding).

  4. Develop quality assurance and quality control
     criteria.

  5. Anticipate resources needed for sustainability.

  6. Conduct independent evaluation and review.

  7. Conduct a thorough piloting/testing phase before
     implementation.
           In general, local public health departments have
           increasingly limited resources to conduct either
           extensive or comprehensive environmental health
           assessments. It is anticipated that the public health
           sector will continue to experience significant budget
           cuts at the local, state, and federal levels in the near
           future. While sustainability and green movements
           have provided some moderate assistance in terms
           of additional community resource availability,
           governments are not growing, and state agency budgets
           and revenue sharing with local governments are being
           reduced. Therefore, collaboration and information sharing
           between entities is essential if recreational water
           quality monitoring programs are to continue in the
           future. Partnerships between states and within states,
           as well as between a diverse group of stakeholders
           (e.g., environmental groups, universities, community
           organizations, and federal agencies), must be fostered
           and encouraged.

           References
           USEPA (U.S. Environmental Protection Agency).
              2010b. Predictive Modeling at Beaches. Volume
              II, Predictive Tools for Beach Notification. EPA-
              600-R-10-176. U.S. Environmental Protection
              Agency, National Exposure Research Laboratory,
              Athens, Georgia.

           Biedrzycki, Paul. Disease Control and Environmental
              Health, City of Milwaukee Health Department.
              Personal communication. 2012.

-------
Case Study
Charles River Watershed Association Flag Program
(Boston, Massachusetts)
Introduction
The Charles River, flowing about 80 miles from
Hopkinton, Massachusetts, to its terminus in Boston
Harbor, is one of the busiest recreational rivers in
the country. On a typical summer weekend, the river
will attract tens of thousands of people in a large
and often colorful array of vessels including canoes,
kayaks, dragonboats, sailboats, fishing boats, and
rowing shells. Unfortunately, given the urban nature
of development along the river (it runs through 23
cities and towns), a variety of sources of pollution,
including combined sewer overflows (CSOs), cause
water quality problems, especially in the Lower
Basin—the approximately 9-mile stretch from the
Watertown Dam to the New Charles River Dam.
In 1998, the Charles River Watershed Association
(CRWA) initiated a flag program, flying color-
coded flags to alert people about water quality
conditions in the Charles River Lower Basin. This
case study explores the efforts of the CRWA to build
the scientific foundation of the flag program by
developing a water quality model.
Water Quality
In 1995 the U.S. Environmental Protection Agency
(EPA) established the Clean Charles Initiative with
the purpose of restoring the Charles River and
making it fishable and swimmable. Much progress
has been made, thanks to the collaborative efforts
of EPA; other federal, state, and local government
agencies; nonprofit groups; private institutions; and
the public. However, more work remains. Stormwater
runoff and CSOs remain a special concern and, while
water quality is usually sufficient for boating and
other secondary contact water activities, swimming
and other activities involving continuous full-body
contact are not recommended because of bacterial
levels that exceed primary contact standards.

[Image: The Charles River.]

  Model Development
  CRWA was founded in 1965 for the purpose of
  spearheading projects aimed at cleaning up the
  Charles River. Conditions improved over time,
  allowing more people to safely use the river for
  secondary recreation use; however, the river remained
  impaired for bacteria, especially during wet weather.
  Therefore, in 1998 CRWA, in a joint project with Tufts
  University and funding from EPA, began developing
  a statistical model that predicts the likelihood of a
  violation of the state boating standard in the Lower
  Charles River Basin. One of the project's goals was to
  be able to forecast and publicize daily water quality
  conditions. The Lower Charles River  Basin does not
  have a swimming beach, but it is the busiest section
  of the river and secondary recreational activities
           continue to expand. CRWA initially developed two
           different statistical models, adopting the one with
           the best performance. It took a few years to build up
           a data set of indicator bacteria sample results large
           enough to use to develop the model.

           A former staff member developed the original model
           as part of their master's thesis at Tufts University. To
           select the model's variables, the project team conducted
           a literature review of similar projects, with the major
           limitation that data used had to be readily available on
           a daily basis. The best predictive variables were rainfall
           volume, river flow, and wind. The project team used
           the ordinary least squares (OLS) method in Minitab®
           to develop the regression model, and they used
           Microsoft Excel to run the equation on a daily basis.
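
As a rough illustration of the approach described above, the sketch below fits an ordinary least squares regression and then evaluates the fitted equation for one day's inputs. It uses NumPy rather than Minitab or Excel. The data values, the choice of a log10 concentration as the response, and the function names are assumptions made for this example; they are not CRWA's data or equation.

```python
import numpy as np

def fit_ols(predictors, response):
    """Fit response = b0 + b1*x1 + b2*x2 + ... by ordinary least squares."""
    X = np.column_stack([np.ones(len(predictors)), predictors])  # intercept column
    coef, *_ = np.linalg.lstsq(X, response, rcond=None)
    return coef

def predict(coef, predictors):
    """Evaluate the fitted equation for new predictor rows."""
    X = np.column_stack([np.ones(len(predictors)), predictors])
    return X @ coef

# Hypothetical daily records: rainfall (inches), river flow (cfs), wind speed (mph)
predictors = np.array([
    [0.00, 150.0, 5.0],
    [0.10, 180.0, 8.0],
    [0.50, 300.0, 12.0],
    [1.20, 520.0, 6.0],
    [0.00, 140.0, 10.0],
    [0.80, 410.0, 4.0],
])
# Made-up log10 bacteria concentrations for the same days
log_conc = np.array([1.8, 2.1, 2.7, 3.3, 1.7, 3.0])

coef = fit_ols(predictors, log_conc)

# Daily use: plug today's readings into the fitted equation
today = np.array([[0.30, 250.0, 7.0]])
predicted = 10 ** predict(coef, today)[0]  # back-transform to a concentration
```

Once the coefficients are fitted, the daily step is a single evaluation of the equation, which is why a spreadsheet is enough to run such a model.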

           An intern at CRWA who had recently received a
           master's degree, overseen by Julie Wood, updated the
           model in 2009 to account for changes in availability
           of real-time data and a switch from fecal coliform to
           Escherichia coli as the primary indicator bacteria for
           state water quality standards.




   Model Implementation
   In 1998 CRWA began flying color-coded flags to alert
   people about water quality conditions in the Charles
   River Lower Basin. Flown from July through October
   at select shore locations between Watertown and
   Boston Harbor, CRWA flags informed boaters about
   E. coli bacteria levels and blue-green algae blooms.
   Specifically:
   • A blue flag indicates CRWA's forecast that the
     likelihood of bacteria exceeding the boating
     standard is less than 50 percent and a blue-green
     algae bloom is not present.

   • A yellow flag indicates that health risks are
     possible, but the data are insufficient to predict risks
     with certainty. Yellow flags are flown when signs of
     a blue-green algae bloom are present but the actual
     human health risk is unconfirmed or unknown.

   • A red flag means that the probability of the river
     exceeding boating standards is equal to or greater
     than 50 percent, or that a health risk is present
     because of a confirmed blue-green algae bloom.
     Red flags are also flown for 48 hours following a
     reported CSO.¹
   The decision on which color flag to fly is based on
   the results of a mathematical model that uses rainfall
   and other weather factors along with river conditions
   to estimate the probability of the river exceeding the
   state secondary contact recreation (boating) standard
   of 630 E. coli colony forming units per 100 milliliters
   of water (cfu/100 mL). In addition to the model,
   CRWA collects weekly water samples to help verify
   model predictions and to add to the database of water
   quality information.
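
The flag rules described above can be written as a small decision function. This is a minimal sketch, not CRWA's actual procedure: the function name, the argument names, and the order in which the rules are checked (red conditions before yellow) are assumptions made for illustration.

```python
def choose_flag(p_exceed, hours_since_cso=None, bloom="none"):
    """Pick a flag color from the model probability and field reports.

    p_exceed:        modeled probability (0 to 1) of exceeding the boating standard
    hours_since_cso: hours since the last reported CSO, or None if none reported
    bloom:           "none", "suspected", or "confirmed" blue-green algae bloom
    """
    if bloom == "confirmed":
        return "red"
    if hours_since_cso is not None and hours_since_cso <= 48:
        return "red"
    if p_exceed >= 0.5:
        return "red"
    if bloom == "suspected":
        return "yellow"
    return "blue"

# Example: low modeled probability, but a CSO was reported 12 hours ago
flag = choose_flag(0.2, hours_since_cso=12)
```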

Over the past 15 years, CRWA has continued to
enhance the model; water sampling has confirmed
an accuracy rate of about 90 percent for predicting
water quality violations. The program provides daily
advisory information and allows river users to make
more informed decisions about recreating on the river
on that day. The program is not used for enforcement
actions, and the river is never closed to the public on
the basis of model results.

¹ Unfortunately, only 1 of the 11 active CSOs in the Charles River
  Lower Basin provides real-time overflow notifications.

[Image: Red advisory flag indicating potential health risks.]
[Image: Charles River Watershed Association Flag Program website.]

  The model-generated advisory information continues
  to be communicated to the general public through
  the posting of color-coded flags and through email,
  CRWA's website, Twitter, and a telephone hotline.
  Eleven facilities fly the color-coded flags along the
  river, providing a valuable public service. These
  facilities include yacht clubs, boating centers, canoe
  and kayak outfitters, and Harvard University's famed
  Weld Boathouse.

  Model Costs
  Key costs for model development included labor
  costs and sample analyses. Labor to collect and
  compile online data was the most significant cost. In
  some cases, older weather data had to be purchased.
  Collecting and organizing free data into a usable
  format, especially when it must be formatted to work
  with a specific statistical software package, can be
  time-consuming. Collecting and analyzing E. coli
  samples also required staff time and lab costs of about
  $30 per sample. CRWA collects four samples once
  or twice each week to verify its model predictions.
  Before implementing the model, monitoring occurred
  at least twice a week and as often as daily. Since
  implementation, CRWA has been able to reduce
  monitoring frequency to weekly when funding is
  limited. CRWA believes that the cost of the model
  is offset by the value of the daily water quality
  notifications for public health and safety.

  Issues Encountered
  Challenges associated with model development
  included the following:
  •  Choosing input variables that were easily available
     daily. These include rainfall volume (previous 24,
     48, 72, and 168 hours), wind speed, time since
     specific rainfall volume (more than 0.01 inch;
     more than 0.1 inch), flow, and solar radiation.
     These data are available from the National Oceanic
             and Atmospheric Administration or the U.S.
             Geological Survey.
           • Building a database of E. coli concentrations for
             model calibration and verification.
           • Meeting the needs of all users.
           • Working with a limited budget.
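
To show how the rainfall inputs listed above might be derived from a single hourly rain gauge record, the sketch below computes trailing rainfall totals and the time since rainfall last exceeded a threshold. The function name, the dictionary keys, and the reading of "time since specific rainfall volume" as hours since the last hourly reading above the threshold are all assumptions for illustration.

```python
def rainfall_features(hourly_rain, windows=(24, 48, 72, 168), thresholds=(0.01, 0.1)):
    """Derive model inputs from hourly rainfall totals (inches).

    hourly_rain is ordered oldest first and ends at the hour the model is run.
    """
    features = {}
    # Trailing totals over each window (previous 24, 48, 72, and 168 hours)
    for w in windows:
        features[f"rain_{w}h"] = sum(hourly_rain[-w:])
    # Hours since an hourly reading last exceeded each threshold
    for t in thresholds:
        hours_since = None  # stays None if the threshold was never exceeded
        for age, amount in enumerate(reversed(hourly_rain)):
            if amount > t:
                hours_since = age
                break
        features[f"hours_since_rain_gt_{t}"] = hours_since
    return features

# Hypothetical record: 200 dry hours with two rain events
rain = [0.0] * 200
rain[-3] = 0.05   # light rain 2 hours before the most recent reading
rain[-30] = 0.2   # heavier rain 29 hours before the most recent reading
inputs = rainfall_features(rain)
```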

           The biggest challenge that CRWA faced in the
           development phase was the availability of data to test
           predictive factors. The CRWA did not collect any real-
           time data, so it could use only what was available on
           the Web. Consequently, CRWA had to rely on other
           organizations to continue to collect the data and
           publish it in a timely manner.

           Data availability continues to be a challenge in
           the implementation phase. The model is run every
           morning at 8 a.m. in the recreational season when
           data are available. CSO discharge data are usually
           not collected, and CSOs are not part of the
           statistical model; however, any available discharge
           information is incorporated in the notification protocol.

           It is a time-consuming process to develop and
           employ a model. CRWA runs its model Monday
           through Friday from July through October. On Friday
           afternoons, CRWA provides a weekend forecast using
           model simulations based on weather predictions.
           CRWA has discussed running the model on the
           weekends and has done so on occasion; however, this
           is logistically challenging because most of the staff
           work only Monday through Friday. The model is run
           once a day around 8 a.m. This limits its utility for the
           many river users (primarily scullers) who are out on
           the river in the early morning.
           Additionally, the model is not updated throughout
           the day, although in reality water quality conditions
           do change continuously. Finally, since the model is
           not run on weekends, accurate information is not
           available to weekend users.

[Image: Weld Boathouse at Harvard University flying a blue flag indicating suitable boating conditions.]

  Moving Forward
  In 2012 CRWA added two boathouse
  locations where flags are flown (12 sites total) to
  provide more complete coverage of the area.

  CSOs are a major challenge to maintaining the river's
  water quality. Under the CSO control plan for Boston,
  some CSOs may remain in the long term. Under
  the control plan, some CSOs have added primary
  treatment and notification, but several have not. A
  goal and priority for CRWA is to continue to reduce
  CSOs significantly and notify the public in a timely
  manner in the event that CSO discharges occur.

  Recreation continues to expand in the watershed
  and might include swimming in the future if water
  quality improves. Real-time modeling is expected to
  help document improving water quality and serve
  as a notification tool for water-based activities in the
  Charles River.
CRWA is collaborating with the Coastal
Environmental Sensing Network (CESN) at the
University of Massachusetts in Boston. CESN
established a real-time weather station and wrote a
program that allows data from the weather station to
be continuously fed into the model, along with flow
data. The station went online in August 2012; the
group has verified data starting in September 2012.
So far, the group has eight overlapping sampling
points with weather station data for October and
three overlapping sampling points for September.
The group completed the analysis of overlapping data
during the 2013 season. Running the model using
data inputs from this new weather/water quality
station is working well. The accuracy of the model
using inputs from this station has improved when
compared to the current system because the model is
automatically updated every hour based on the most
recent data.

  Although CRWA does not have additional resources
  to put toward the real-time data collection, the group
  would like to develop a real-time model; continuing
  to collaborate with the university will make this
  goal more realistic. CRWA also hopes to add other
  parameters, such as turbidity, to the model. A real-
  time model would be more effective for quickly
  notifying the public of water quality conditions
  because the Charles River hosts a wide variety of
  recreational activities. For example, water quality
  forecasts go out at 9 a.m. (based on NOAA updates
  at 8 a.m.). However, rowers are out on the water at
  5 a.m.—well before any water quality notifications
  are available. Real-time forecasting capabilities
  would greatly improve the program.

  Unfortunately, the long-term outlook for the project
  depends on the resources CESN and CRWA have
  available to continue to maintain the weather station
  and the real-time data feed to the model.

  Advice and Lessons Learned
  In light of the experience and success of CRWA's
  modeling efforts, Julie Wood of CRWA recommends
  that beach managers "go for it" with regard to
  developing their own models. The model does not
  have to be complicated—a simple regression model
           can be effective in many systems to broadly predict
           possible risk. In addition, it is important to consider
           the availability of your staff to run the model and
           post notifications, since that affects how often the
           model can be run. It can be especially challenging if
           you want to run the model on the weekends. Overall,
           resources are a major factor when developing and
           implementing predictive models.

           Based on their experience with the CESN station,
           the CRWA team recommends that model developers
           select a model that can be automated and run
           continuously in real-time based on data readily
           available on the Web. You will still need staff to
           collect the samples to verify the forecasts, but you will
           not need staff to run the model. You can run this type
           of automated model every day of the week and early
           in the morning, providing water quality predictions
           based on the most current data. This would help meet
           public expectations for real-time nowcasting at very
           fine timescales.

           References
           Charles River Watershed Association. Charles River
              Water Quality Notification Flagging Program.
              http://www.crwa.org/field-science/
              water-quality-notification.

           Eleria, A. and R.M. Vogel. 2005. Predicting fecal
              coliform bacteria levels in the Charles River,
              Massachusetts, USA. Journal of the American
              Water Resources Association. No. 03111. October
              2005.

           Wood, Julie. Charles River Watershed Association.
              Personal communication. 2012-2013.

-------
Case Study
Chicago Park District Beach Modeling (Chicago, Illinois)
Introduction
Chicago's 26 miles of shoreline along Lake Michigan
provide residents and visitors with many water-based
recreational opportunities. Especially popular are
a series of 24 beaches owned and managed by the
Chicago Park District (CPD). Over 20 million people
visit these beaches each year between Memorial Day
weekend and Labor Day to swim and enjoy the sand,
sun, and scenery. CPD's mission with these beaches,
as with all their parks, is to provide a customer-
focused experience that prioritizes and responds to
the safety and needs of children and families.

To aid in providing a safe beach environment, CPD
developed a system of colored flags to communicate
safe swimming status at the beaches. A green flag
means that weather conditions and water quality
are good and swimming is permitted. A yellow flag
indicates that swimming is permitted but beach-
goers are cautioned that weather conditions are
unpredictable and/or water quality does not meet
state swimming standards. A red flag indicates that
swimming is not permitted because either weather
or water quality is causing unsafe or dangerous
conditions.

In general, the lifeguards stationed at each beach are
responsible for monitoring weather conditions and
changing swim status when necessary. However,
while beachgoers can usually relate to unsafe weather
conditions such as high waves and lightning, unsafe
water quality conditions are not nearly as obvious.
Currently, CPD's decision to change swim status
due to water quality is based on two complementary
approaches: (1) analysis of water samples and (2) a
computer model that uses weather and hydrology data
and water conditions to predict real-time water quality.

[Map: Chicago's Lake Michigan shoreline and beach locations.]

Water Quality
Most water quality problems found at CPD's beaches
can be linked to nonpoint sources of pollution
originating in the small watersheds along the
shoreline. Runoff from roadways, parklands, and other
nearshore land areas collects and drains to the lake
through a network of stormwater outfalls. Chicago's
human sewage is not directed into Lake Michigan except
during extreme storm events, when the locks that
separate the Chicago River system from Lake Michigan
are opened to minimize or prevent flooding.

CPD believes that the relatively large resident gull
and Canada geese populations are one of the most
significant contributors to the pollution load at the
beaches. In response, the District has initiated various
programs to discourage their presence, including
prohibiting feeding and using uniformed border collies
to chase birds off the beaches.

[Image: Uniformed border collie chasing birds off the beach.]

Similar to most actively managed freshwater beaches,
CPD routinely collects water samples and has them
analyzed in a laboratory for E. coli. Samples are
collected at each beach every Monday through Friday
during the swimming season. Additional samples are
also collected through the weekend if weekday results
show high levels of E. coli.

CPD's sampling program follows U.S. Environmental
Protection Agency (EPA) guidelines and protocol
for water collection and laboratory analysis for E.
coli concentration. The bacteria culture process
takes 18-24 hours to complete (Colilert method);
consequently, sample results are not available until
the day after they are taken. If E. coli levels are found
to be above the state's water quality standard of
235 CFU/100 mL, the water is considered unsafe for
swimming. CPD subsequently notifies the public of
the threat through their website and other outlets and
by posting an advisory at the beach and changing the
swim status flags.

           Fortunately, in most instances when E. coli levels are
           found to be above the 235 CFU/100 mL water quality
           standard, the next day's sample results are
           below the water quality standard. Part of the reason
           is that the large open shoreline
           encourages water circulation between shore waters
           and deeper offshore waters. Thus, bacteria that enter
           most beach areas during and after storms are dispersed
           and flushed away from near-shore areas fairly rapidly.
           However, beaches that are sheltered in an embayment
           or protected by piers or seawalls often do not circulate
           their beach water as freely and sometimes experience
           more persistent high bacteria levels, with swimming
           advisories lasting multiple days.

The fact that high FIB levels at most Chicago beaches
only last a day underscores the problem of having at
least an 18-hour lag time between sample collection
and laboratory results. Beachgoers are unknowingly
swimming in water with high FIB levels the day the
water sample is collected, and are advised not to swim
the following day, when levels are usually safe based
on the analysis of that day's sample. This lag-time
problem caused CPD to explore the possibility of
developing a predictive mathematical model so that
beach management officials could make more timely
decisions concerning swim status and thus better
protect the health of the beach-going public.

  Model Development
  CPD began the predictive modeling project in 2011
  with the assistance of the U.S. Geological Survey
  (USGS) and a $243,000 Great Lakes Restoration
  Initiative (GLRI) grant. Together the agencies decided
  on a group of weather-related parameters that could
  potentially be incorporated into the model. They
  then developed and deployed buoys for in-water
  measurements and pole-mounted weather stations
  near the beaches to monitor atmospheric conditions.
Given resource limitations, CPD decided to initially
focus on a set of Chicago beaches that: (1) most
frequently exceeded the E. coli criteria and (2)
had the highest beach attendance. Eventually five
beaches were selected for the modeling exercise.
The list ranged from the largest beach in size (Montrose
Beach) to one of the city's most popular (Oak Street
Beach). The other three beaches were Foster Beach,
63rd Street Beach, and Calumet Beach. All the
beaches are primarily affected by nonpoint sources
of contamination and have a history of E. coli
exceedance rates between 8 and 15 percent (percent
of days when the mean of two samples exceeds 235
CFU/100 mL) over the last few years. Attendance
records for the  beaches ranged from approximately
100,000 visitors to several million visitors per
swimming season.
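
The exceedance rate defined above (percent of days when the mean of two samples exceeds 235 CFU/100 mL) can be computed as in the sketch below. The sample values are invented, and the use of an arithmetic mean is an assumption; the text does not say which type of mean CPD applied.

```python
STANDARD_CFU = 235  # E. coli CFU/100 mL, the state water quality standard cited above

def exceedance_rate(daily_pairs, standard=STANDARD_CFU):
    """Percent of sampling days on which the mean of two samples exceeds the standard."""
    if not daily_pairs:
        raise ValueError("at least one sampling day is required")
    exceed_days = sum(1 for a, b in daily_pairs if (a + b) / 2 > standard)
    return 100.0 * exceed_days / len(daily_pairs)

# Hypothetical season of four sampling days, two samples per day
season = [(100, 200), (300, 400), (200, 280), (50, 60)]
rate = exceedance_rate(season)
```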
[Image: Foster Beach.]

  Model Development Key Components
  Technical and Financial Resources
  The USGS was instrumental in getting the project
  off the ground. They helped select the monitoring
  equipment and trained staff to use and maintain
  it. USGS also provided guidance on developing
  the model and performed statistical analyses. The
  Lake County Health Department, which already
  has experience implementing a predictive model
  program, also provided expertise during model
  development. In addition, several presentations at the
  Great Lakes Beach Association Conferences provided
  CPD staff with options and a variety of potential
  methods for developing the model.

[Image: Installing monitoring equipment at Chicago beaches.]

           The models were developed using multivariate
           regression analysis. The USGS selected variables by
           identifying the ones that fit best statistically. USGS
           considered including gull counts, but found that this
           information was difficult to use and implement in the
           context of the model.

           CPD used its own resources to deploy the monitoring
           equipment, including scuba divers, electricians, and
           related heavy equipment such as boats and a bucket
           truck for installing the weather station on light poles.
           If CPD had contracted the installation, costs would
           have increased significantly.

           Currently CPD provides funding for data collection
           and equipment maintenance but continues to rely on
           the USGS to perform statistical analyses. CPD could
           possibly hire contractors to complete this work, but
           few would have the necessary depth of understanding
           of Lake Michigan ecology.

           CPD spent approximately a year and a half developing
           the first models and expects to change and improve
           them with additional data in the future. CPD initially
           anticipated the need for two years of data to have
           working models developed because results depend
           strongly on the weather. The Chicago area has very
           different beach seasons from year to year; therefore, a
           larger data set will help improve the model's accuracy.

           Data Resources
           When developing the model, CPD relied on daily
           weather and water quality data, along with water
           quality data collected as part of CPD's existing beach
           monitoring program.  CPD also considered data
           collected during daily sanitary surveys for model
           development purposes.
USGS explored whether other data sources, such as
those from the National Oceanic and Atmospheric
Administration (NOAA), might be useful. They did
not use NOAA or other external data because these
data sources did not work as well. For example,
NOAA data come from farther offshore; beaches in
Chicago are man-made and have many structures in
place, so they require detailed on-site data.

[Image: Chicago Park District Beach Notification website.]

  Model Implementation
  Public Involvement
  CPD did not involve the public during the
  initial phases of model development because the
  information was too technical. However, CPD
  conducted significant public outreach to inform
  people about implementation efforts. All data were
  made available to the public via the CPD website.
  CPD also posted information about how the model
  worked, how advisories work, what changes would
  occur, and how this would improve public health.
  There was a lot of media interest, which gave CPD the
  opportunity for interviews with numerous papers and
  news stations.

  CPD did not receive much feedback from the public
  even though the public could submit questions and
  comments via website or hotline. CPD received
  occasional feedback, however, when there were
  unusual data or equipment malfunctions.
[Image: Beach notification page for Calumet Beach showing swim status and water quality information (forecast for today and most recent test result).]

Model Output and Validation
The key variables CPD used for these models include
the following:
• Air temperature.
• 6-hour solar radiation.
• 4-hour wave period.
• Longshore (NNW) wind.
• 6-hour longshore (NW) wind.
• 6-hour rainfall.
• 48-hour rainfall.
• 4-hour log-wave period.
• Day of year.
• 4-hour onshore wind.
• 4-hour log turbidity.
• 4-hour log wave height.

  Each model used a different combination of these
  variables.

  CPD conducted routine sampling throughout the
  2012 beach season to collect data for validating the
  model. They compared actual sampled results with
  modeled results to ensure the model's accuracy. CPD
  reports predicted values and will continue to refine
  the models over the next several beach seasons.
  Although model accuracy fluctuated between years,
  CPD is confident that the advisories they issued
  on the basis of modeled results were more accurate
  than they would have been  without the model. With
  regard to confidence in model results, CPD remains
  "cautiously optimistic."

  CPD is assessing the effectiveness of the model by
  evaluating whether more Type I and Type II errors
  would have been generated relying only on traditional
  water testing and waiting 24 hours for results.
  Currently, if the model predicts a bacteria level over
  235 CFU/100 mL, CPD issues an advisory. CPD also
  posts the most recent lab results from traditional
  water testing at each beach. If the test results and
  the model do not agree, CPD then uses the model to
  determine the advisory  status.
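
The comparison described above can be sketched as a simple tally. The example assumes the common convention that a Type I error is an advisory issued when the lab result was under the threshold, and a Type II error is a missed exceedance. The daily values are made up for illustration, and `tally_errors` is a hypothetical helper, not part of CPD's system.

```python
THRESHOLD_CFU = 235  # CFU/100 mL, the advisory level cited in the text


def tally_errors(predictions, lab_results, threshold=THRESHOLD_CFU):
    """Count correct calls, Type I errors, and Type II errors.

    Assumed convention: Type I = advisory issued but the lab result was
    under the threshold; Type II = no advisory but the lab result was over.
    """
    counts = {"correct": 0, "type_1": 0, "type_2": 0}
    for predicted, observed in zip(predictions, lab_results):
        advisory = predicted > threshold
        exceeded = observed > threshold
        if advisory == exceeded:
            counts["correct"] += 1
        elif advisory:
            counts["type_1"] += 1
        else:
            counts["type_2"] += 1
    return counts


# Made-up values for six days (CFU/100 mL); not CPD data.
lab_today = [120, 300, 90, 410, 200, 260]      # same-day samples, known a day late
model_today = [150, 280, 240, 390, 180, 210]   # model forecast for the same day
lab_yesterday = [80, 120, 300, 90, 410, 200]   # what a 24-hour-old result would show

model_errors = tally_errors(model_today, lab_today)
lag_errors = tally_errors(lab_yesterday, lab_today)
print("model:", model_errors)
print("24-hour lag:", lag_errors)
```

Running the same tally on the model forecasts and on day-old lab results gives a like-for-like view of which approach would have produced fewer wrong advisories.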

  Implementation
  CPD began using the model in 2012 to make manage-
  ment decisions on notification actions. They monitor
  all beaches every weekday. They also monitor on
  weekend days following an  exceedance on a Friday, or
  if the model predicts an exceedance on the  weekend.
  CPD runs the models at 9:00 a.m. and issues advisories
  by 9:30 a.m. If the model shows no exceedance, CPD
  posts a green flag. The public can view both model
           results and sampling values by visiting the beach,
           viewing the website, or calling a hotline.

           Model Costs
           The $243,000 GLRI grant provided the bulk of the
           financial resources for the project. CPD also set aside
           $50,000 in their capital budget to help purchase the
           equipment in the first year (2011), and $25,000 in
           capital funds to increase the amount of equipment in
           2012.

           In addition, CPD spent about $120,000 in 2011 for
           water sampling at all the beaches in Chicago. Most
           of these costs would have been incurred without the
           modeling project. The extra sampling for modeling
           was about $15,000. The most costly aspects of the
           modeling process included the purchase of equipment
           and USGS support.

           Equipment costs were approximately $70,000.
           Monthly bills for cellular data were about $3,000—
           this covers data transmitted by eight cellular modems.
           Obtaining water quality data (FIB testing results)
           did not cost extra because this work would have
           been done regardless of development of the models.
           However, for reference, the lab costs for water quality
           sampling were  about $100,000, and the personnel
           costs for water  sampling were approximately $20,000
           annually.

           The grant was funded in the fall of 2010 and
           continued through 2013. A large portion of the funds
           was used to purchase and install the equipment
           and for USGS statistical analysis. Some grant funds
           remain; these will be used to offset ongoing costs
           (maintenance, statistical analyses, etc.). Currently,
           CPD relies on internal funding, which could decrease
           in the future.

           When determining overall cost effectiveness of the
           model, CPD concluded that they would save money
           only if sampling is reduced. CPD does not currently
           plan to reduce sampling; however, if the BEACH
           Act funding is cut, this would affect sampling
           significantly because fewer resources would be
           available. For CPD, the bottom line is, "How do you
           put a price on better information?"

           [Photo: Montrose Avenue dog beach.]

  Issues Encountered
  CPD had their share of issues with field equipment,
  including equipment getting damaged by rough
  weather. They adjusted the anchoring scheme for
  the buoys, which helped, and have eliminated some
  buoys, although some equipment issues continue and
  the buoys are expensive to maintain. Looking back,
  CPD might have selected a different anchoring system
  to ensure that the equipment remained in place.

  Moving  Forward
  CPD intends to keep moving forward with their
  models. CPD has already invested over $75,000 of
  department funding into the modeling program,
  which shows confidence in the model's effectiveness.
  CPD has expanded to other beaches since 2011,
  and for the  2015 season predictive models were
  used at all 24 of the city's beaches. In addition, they
have substantial resources going into mitigation
practices. They are also working on developing
better information and methods to address non-
anthropogenic sources of bacteria such as shore birds.

During the initial year of data collection, CPD
increased sampling frequency to twice per day at
the modeled beaches. USGS and Michigan State
University have helped validate and update the
models annually as of 2015.

Some beaches with higher exceedance rates have been
difficult to model. CPD is prioritizing rapid methods
at these beaches. CPD also conducts public outreach
about beach water  quality. They implemented a
new texting service that allows beachgoers to text
the name of their beach to a dedicated number and
receive an automatic response with the current
beach conditions. A public education campaign
encourages people not to litter or feed wildlife, since
waste from seagulls and geese has been shown to
be a major source of fecal bacteria in the water.
The campaign also includes signage on Chicago
  public transit, posters at beaches, and a large mural
  at one of Chicago's busiest beaches. A new Beach
  Ambassadors program with direct public outreach
  asks beachgoers not to litter or feed wildlife, and
  expanded programming for CPD's summer day camp
  program educates kids on what they can do to keep
  the water clean.

  Finally, CPD is working to reduce bacteria sources
  directly. New grooming equipment removes debris
  and exposes wet sand to sunlight, killing bacteria.
  At beaches with a history of problems from seagull
  waste, CPD is using dog handlers and trained border
  collies to chase the gulls from the beach. This project
  has significantly reduced the number of days where
  FIB levels  exceeded water quality standards.

  Advice and  Lessons Learned
  Sanitary survey data were tested in 2010, but it was
  determined that more accurate and timely
  (buoy-based) data were needed for the models. While
  daily sanitary survey data are helpful for monitoring
  operations such as garbage collection and beach
  grooming, and keeping track of pollution sources,
  survey parameters are not included in the models—
  the models are all based on data collected by sensors.

  The success of a model depends on a number of
  factors. For CPD, the most important factor was
  related to the  presence of nonpoint versus point
  sources of pollution. You need to have comprehensive
           knowledge about the beach before you can
           successfully develop a predictive model.

           Cathy Breitenbach of CPD noted they are a large
           jurisdiction with many resources. They were able
           to do all equipment maintenance and monitoring
           in-house and did not have to hire or rely on outside
           support. Without this, they would not have been
           as successful, especially considering that Chicago
           is a big city with a large beach-going population to
           protect. Other agencies who want to develop a model
           must have access to funding and technical resources
           necessary to collect data and conduct statistical
           analyses. If their jurisdiction is small, however, they
           can likely develop and implement a predictive model
           at a lower cost.

           References
           Cathy Breitenbach, Chicago Park District. Personal
              interview. 2012.

           Chicago Park District. 2012. Chicago Park District
              Improves Beach Monitoring for 2012 Season.
              http://www.chicagoparkdistrict.com/
              chicago-park-district-improves-beach-monitoring-
              for-2012-season.

           Fulton, Jeff. No date. Public Beaches in Chicago.
              USA Today, http://traveltips.usatoday.com/public-
              beaches-chicago-53741.html.

-------
Case Study
City of Racine Nowcast Model (Racine, Wisconsin)
Introduction
How does a small coastal Wisconsin city of about
79,000 citizens reel in a "Best Beach in the State" title?
One reason might be its cutting edge approach to
staying on top of water quality. The City of Racine,
on Lake Michigan between Milwaukee and Chicago,
manages two popular swimming beaches, North
Beach and Zoo Beach. At 50 acres, North Beach is
the larger of the two. In 2012, USA Today named
it the best beach in Wisconsin, joining 50 other
beaches similarly selected from each of the states and
the District of Columbia. This honor can be added
to a long list of accolades for North Beach, which
includes a Top 10 Family Friendly Beach designation
by Parents magazine in 2011 and the Midwest Living
magazine's Top City Beaches list in 2010.

North Beach has medium- to fine-grained sand and
is groomed to remove trash and aerate the sand. The
swim area has a fairly shallow slope (2 to 5 percent)
and the beach has a 1 to 1.5 percent slope toward
the water. A harbor break wall increases swimming
safety by keeping waves in check. The beach face is
kept at a steep grade to prevent waves from spilling
over the berm crest. The city maintains restrooms,
a bathhouse, a concession stand, and an adjacent
playground. The city hires lifeguards to ensure public
safety. Weekend visitor numbers can exceed 11,000;
daily visitors average up to 2,200 persons per day
during the swimming season.


[Map: North Beach Park, Douglas Park, and Lake View Park on Lake Michigan; scale bar in miles.]

               Zoo Beach, adjacent and north
               of North Beach, is smaller, less-
               developed, and attracts fewer
               beachgoers than North Beach. So
               named because of the adjacent
               Racine Zoo, it has fewer access
               points and amenities. Lifeguards
               are  on duty only on weekends. The
               swim area has a steep drop-off and
               no break wall, so the wave action
               is more intense. Because of these
               contrasts, Zoo Beach offers visitors
               a quality beach experience with
               beautiful views of Lake Michigan
               in a more peaceful, less-populated
               setting.

               Water Quality
               Racine's beachgoers have not always
               enjoyed the current levels of high
               water quality at their beaches. For
               example, in 2003 North Beach was
               under a no-swimming advisory
               for 34 days because of high fecal
               indicator bacteria counts. On
               several of these days the beach
               was closed entirely. That same
               year, Zoo Beach had notifications
               issued on 29 days. Since the swimming season in
               Wisconsin is approximately three months long, these
               problems resulted in a loss of almost 40 percent of
               Racine's potential beach days in 2003.

               [Photo: Sampling at North Beach.]
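
As a rough check on the "almost 40 percent" figure, the short calculation below divides the 2003 notification days by the length of the season. The 90-day season is an assumption standing in for "approximately three months."

```python
SEASON_DAYS = 90  # assumption: "approximately three months"

# 2003 notification days reported in the text.
advisory_days = {"North Beach": 34, "Zoo Beach": 29}

lost_share = {beach: days / SEASON_DAYS for beach, days in advisory_days.items()}

for beach, share in lost_share.items():
    print(f"{beach}: {advisory_days[beach]} of {SEASON_DAYS} days ({share:.0%})")
```

Under that assumption, North Beach lost a little under 40 percent of its season and Zoo Beach roughly a third.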

           In response, the city began a campaign to deal with
           the point and nonpoint sources of fecal pollution that
           were polluting their beaches. Sanitary surveys proved
           to be important tools in helping city officials identify
           pollution sources and plan mitigation projects such as
           wetland construction, dune restoration, and improved
           beach grooming practices. The results of these efforts
           were outstanding, especially in terms of reducing
           the number of beach advisories and closings.
           North Beach was closed or under a swimming
           advisory on only one day in 2010 and on only three
           days in 2011. Zoo Beach had four notifications in 2010 and
           five in 2011. This increase  in safe-swimming days
           provides clear evidence of the power of active beach
           management.
           Model Development
           With beach clean-up efforts underway, Racine
           focused on the lag time problem associated with
           the traditional culture-based method of beach
           monitoring.

           Racine explored two options for dealing with this
           lag-time dilemma. One was testing a new method of
           measuring Escherichia coli (E. coli) concentration—
           quantitative polymerase chain reaction (qPCR).
           Instead of growing and enumerating bacterial
           colonies in cultures, qPCR yields more timely results
           by identifying and quantifying genetic sequences
           of bacteria. qPCR results can be obtained from a
           laboratory on the same day the sample is taken, in
           most cases within three hours of sample collection,
           allowing more rapid determinations of beach water
           quality for swimmers' safety.

           Racine also explored using mathematical models
           to predict beach water quality. An accurate model
           would provide a basis for issuing preemptive notifi-
           cations in advance of water sampling, allowing city
           officials to take an even more conservative approach
  to swimmers' safety. Racine officials believe that
  the daily use of models, supported by daily beach
  survey data and verified by qPCR monitoring, will
  be the cornerstone of their future beach monitoring
  program.

  Statistical models were developed for Racine's two
  beaches using the U.S. Environmental Protection
  Agency's (EPA's) Virtual Beach (VB) software
  (v2.0-2.2). The Wisconsin Department of Natural
  Resources (WDNR) Science Services assisted
  throughout the model development process. WDNR
  coordinates Wisconsin's beach monitoring program
  and administers the BEACH Act grants for the state's
  193 public beaches along 55 miles of Lake Superior
  and Lake Michigan coastlines. Because WDNR staff
  had expertise in  model development for the state's
  many public beaches, they were well-equipped to
  offer guidance to the City of Racine. WDNR's support
  proved invaluable as they pulled together various
  data sources, including data from older and recently
  developed models. Identified as Nowcast models,
  the "real-time" predictive models developed for
  the project use multiple linear regression and other
  statistical procedures to evaluate the relationships
  between measured FIB concentrations in the water
and certain meteorological factors and onshore and
near-shore conditions associated with water quality.
The output of the current models developed by the
city, in conjunction with the WDNR, expresses two
values: predicted E. coli concentrations and predicted
probability of exceedance.

[Photo: Performing qPCR.]
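
To illustrate how a regression model can produce both of these outputs, the sketch below applies a fitted linear equation to one day's inputs and converts the result into an exceedance probability. It assumes the model predicts log10 E. coli and that prediction errors are roughly normal with a known standard error. The coefficients, inputs, and standard error are hypothetical, the 235 CFU/100 mL threshold is borrowed from the Chicago case study, and this is not a description of how Virtual Beach computes its values.

```python
import math
from statistics import NormalDist


def predict_log10(coefs, intercept, inputs):
    """Evaluate a fitted multiple linear regression for one set of inputs."""
    return intercept + sum(coefs[name] * inputs[name] for name in coefs)


def exceedance_probability(pred_log10, std_error_log10, threshold_cfu=235.0):
    """P(true log10 concentration > log10 threshold), assuming normal errors."""
    z = (math.log10(threshold_cfu) - pred_log10) / std_error_log10
    return 1.0 - NormalDist().cdf(z)


# Hypothetical fitted model (not Racine's coefficients).
coefs = {"rain_24h_mm": 0.03, "wave_height_m": 0.8, "turbidity_log10": 0.5}
intercept = 1.0
std_error = 0.4  # standard error of the regression, in log10 units

today = {"rain_24h_mm": 10.0, "wave_height_m": 0.5, "turbidity_log10": 1.0}

pred = predict_log10(coefs, intercept, today)
concentration = 10 ** pred
probability = exceedance_probability(pred, std_error)

print(f"predicted E. coli: {concentration:.0f} CFU/100 mL")
print(f"probability of exceedance: {probability:.0%}")
```

A prediction below the threshold can still carry a meaningful chance of exceedance, which is why the two outputs are reported together.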

Model Development Key
Components
The key for developing a good model is selecting
the proper set of component variables and ensuring
that staff have the necessary skills. In the initial
development phase, in 2010, Racine examined
a diverse set of variables for potential use in the
model. Variables included water temperature, air
temperature, seagull counts, dog counts, wildlife
counts, wave height and intensity, water clarity, sky
conditions (i.e., cloud cover), water color changes,
odor, algae amount, algae type, bather load (in, out,
and total), longshore current direction and speed,
wind direction and speed, stream discharge, pollution
discharge, rainfall (24-, 48-, and 72-hour) and other
precipitation records, day of year, season, lake levels,
and the previous day's E. coli values. Initially, all
variables were included because the majority could
have been considered factors that influence local
water quality. The project team reduced the initial
number of variables by conducting correlation
analyses. The model was developed using the
variables that had the strongest associations.
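
The screening step described above can be sketched as ranking candidate variables by their correlation with log-transformed E. coli and keeping the strongest. The data, variable names, and the 0.5 cutoff below are hypothetical; the text does not state what criteria the project team actually used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen(candidates, response, min_abs_r=0.5):
    """Return variables with |r| >= min_abs_r, strongest first, plus all scores."""
    scores = {name: pearson(values, response) for name, values in candidates.items()}
    ranked = sorted(scores.items(), key=lambda item: -abs(item[1]))
    kept = [name for name, r in ranked if abs(r) >= min_abs_r]
    return kept, scores


# Made-up observations for five sampling days (not Racine data).
log_ecoli = [1.0, 1.5, 2.0, 2.5, 3.0]
candidates = {
    "rain_24h_mm": [0.0, 2.0, 5.0, 9.0, 12.0],
    "water_clarity": [5.0, 4.0, 3.5, 2.0, 1.0],
    "dog_count": [3.0, 1.0, 2.0, 1.0, 3.0],
}

kept, scores = screen(candidates, log_ecoli)
print("kept:", kept)
```

Correlation screening looks at one variable at a time, so it can miss variables that matter only in combination with others.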

Important data sources for the model development
included the U.S. Geological Survey's (USGS') real-
time data viewer, Racine Water and Wastewater
Utilities, the Great Lakes Observing System (GLCFS
Nowcast 2D), local weather station data, and National
Oceanic and Atmospheric Administration (NOAA)
buoy data. Staff also obtained data from routine
sanitary surveys housed on the Wisconsin Beach
Health website (hosted by the USGS). Exploratory
data analyses revealed that the sanitary survey
data were especially valuable. The presence of algae
and water clarity, for example, proved to be good
predictors of high FIB levels at some locations.

  Racine's beaches proved to be good candidates
  for modeling because they have large, consistent
  databases of FIB concentrations and fairly predictable
  pollution incidents associated with storms which
  resulted in advisories. Because North Beach is
  sampled at least five times per week, model developers
  could more frequently compare model results with
  actual FIB concentrations.

  By 2011 the VB software (VB v2.1) was fully
  developed and the city built an operational Nowcast
  model for North Beach. Key variables selected for
  the model included rainfall, wave height, longshore
  current vectors, stream discharge, water clarity, and
  sky conditions. Racine conducted a pilot test using
  qPCR and a culture-based method for measuring
  E. coli concentrations. This preparatory step was
  important because it allowed the city to track model
  predictions with laboratory results and validate the
  model using real-time data.

  The results were very encouraging. The model
  predicted E. coli concentration with 91 percent
  accuracy for culture-based results and with 98
  percent accuracy for qPCR-based results.

  In 2012 Racine built new models (using VB v2.2) for
  North and Zoo Beaches. The new models included
  a greater proportion of Web-captured data than the
  2011 model, which relied heavily on beach sanitary
  data collected locally. By developing two different
  types of models, Racine was able to determine
  whether the number and types of field data could be
  reduced or eliminated (as a cost-saving measure). In
  the new model, wave height was found to be the most
  predictive variable at Zoo Beach. As developed, the
  Zoo Beach model required significantly less locally
  collected field data to run than the 2011 model and
  results have been encouraging. However, the city
  found that the 2011 North Beach model (which
  included several beach sanitary survey parameters)
  was more robust than the 2012 model construct.
           Model Implementation
           Before developing the Nowcast models, the City of
           Racine used the persistence model (i.e., the previous
           day's culture-based results) for issuing beach
           notifications. In 2011 the city used the Nowcast
           model in combination with the lab-based methods
           to support management decisions. Even when the
           model predicted exceedance of the E. coli water
           quality standard, the city did not use the model
           alone to make notification decisions. Instead,
           the city developed a set of guidelines for making
           notification decisions. For example, they issue a
           preemptive advisory in advance of results from the
           laboratory analyses if the probability of exceedance
           is greater than 10 percent and the predicted E. coli
           concentration exceeds 50 colony-forming units per
           100 milliliters of water.
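
The example guideline above combines two conditions. A minimal sketch, using the two cutoffs stated in the text and a hypothetical function name:

```python
def preemptive_advisory(prob_exceedance, predicted_cfu,
                        prob_cutoff=0.10, conc_cutoff_cfu=50.0):
    """Apply the example guideline: both conditions must hold.

    prob_exceedance is a fraction (0.10 = 10 percent); predicted_cfu is the
    predicted E. coli concentration in CFU per 100 mL.
    """
    return prob_exceedance > prob_cutoff and predicted_cfu > conc_cutoff_cfu


print(preemptive_advisory(0.25, 120.0))  # both cutoffs exceeded
print(preemptive_advisory(0.25, 40.0))   # concentration too low
print(preemptive_advisory(0.05, 120.0))  # probability too low
```

This covers only the one guideline quoted in the text; the city's full set of guidelines also draws on laboratory results and staff judgment.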

           Each beach monitoring component—sanitary survey,
           Nowcast model, culture-based testing, and qPCR—is
           designed and applied to complement and reinforce
           the others to generate timely, accurate results and a
           better understanding of the conditions and variables
           that accelerate FIB growth in water to create unsafe
           conditions for swimming. In June 2012 Racine,
           Wisconsin, became the first municipality in the
           nation to base notification decisions  on qPCR results.
           In conjunction with the qPCR assay, Racine also ran
           the model at North and Zoo Beaches daily (the city
           runs the model only on weekdays, unless there is an
           advisory or closure that extends their sampling and
           sanitary survey data collection into the weekend).
           The results of qPCR, along with sanitary survey
           information, model estimations, and staff judgment
           are all considered when determining whether to issue
           a beach advisory or closure.

           Model Costs
           The city did not incur additional costs for
           data collection for model development and
           implementation. The necessary data were already
           being routinely collected, using equipment already
           in place. The costs associated with the development
           of the Nowcast model were mostly for labor. The staff
           needed to have a basic understanding of statistics,
           intuitive ability to manipulate data, and a working
           knowledge of factors affecting local water quality.
           The development team usually consisted of two
           laboratory personnel, with support and guidance
           from staff at the WDNR Science Services division.
           In some cases, WDNR staff took a more impromptu
           role by developing models in coordination with
           laboratory personnel. Labor costs included the time
           it took staff to retrieve, format, and assess the data
           and build, train, and revise the operational version
           of the initial model. Staff needed several months to
           collect and format data to develop the initial model.
           Model costs were minimal but required one person
           to work through the modeling process. The newer VB
           software reduced the model development time, but
           data evaluation and model development still required
           a week or more.

           [Photo: Stormwater retention management practices at North Beach.]

The daily cost to run the model is minimal—most
of the cost is in data collection and processing (i.e.,
the initial effort required to build the model, run
correlations, and perform statistics), which occurred
over several days. Importantly, EPA's continued
improvement of the VB software allows for more
rapid statistical model development and simplifies the
application of the model for the end user. Newer ver-
sions of the model not only provide quantified results,
but also add an exceedance probability providing
another dimension to beach management decisions.

The time spent running the model is only a
fraction of the time spent on routine, culture-based
monitoring. Once all the routine sanitary survey
data are available, the model takes approximately five
minutes to run—significantly faster than laboratory
sample analyses, which require at least 2 hours  and
up to 18 hours, depending on the method used.


   Issues Encountered
   Although the overall model prediction results have
   been very accurate, the City of Racine encountered
   a few issues. During the model development phase,
   the city had trouble building a robust dataset because
   of the large amount of data required. They resolved
   this problem after they compared the electronic data
   against the original hardcopies and found that missing
   data and incorrectly entered data often caused issues
   with empty cells or incorrect predictions.
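
A simple automated check can catch some of these problems before a dataset is used for modeling. The sketch below flags empty cells, non-numeric entries, and values outside a plausible range. The field names, ranges, and rows are hypothetical, and a check like this supplements rather than replaces comparison against the original hardcopies.

```python
def find_problem_rows(rows, valid_ranges):
    """Return (row index, field, problem) for missing or implausible entries."""
    problems = []
    for index, row in enumerate(rows):
        for field, (low, high) in valid_ranges.items():
            value = row.get(field)
            if value is None or value == "":
                problems.append((index, field, "missing"))
                continue
            try:
                number = float(value)
            except (TypeError, ValueError):
                problems.append((index, field, "not numeric"))
                continue
            if not low <= number <= high:
                problems.append((index, field, "out of range"))
    return problems


# Hypothetical plausible ranges and hand-entered records (not Racine data).
valid_ranges = {"ecoli_cfu": (0.0, 100000.0), "water_temp_c": (0.0, 35.0)}
rows = [
    {"ecoli_cfu": "120", "water_temp_c": "18.5"},
    {"ecoli_cfu": "", "water_temp_c": "19.0"},     # empty cell
    {"ecoli_cfu": "95", "water_temp_c": "190"},    # likely a misplaced decimal
]

problems = find_problem_rows(rows, valid_ranges)
for problem in problems:
    print(problem)
```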

   Another issue encountered was with the estimation
   of E. coli data. North Beach typically has very few
   advisories; as a result, building a model to predict
   those exceedances was difficult. For example,
   since advisory dates were so few and far between,
   those dates could have possibly been identified as
   statistical outliers (i.e., sample results that were
   numerically distant from the rest of the data) and
   it was sometimes difficult to decide which data
   should be culled. Once these decisions were made,
   implementation was far less problematic. City of
   Racine laboratory staff noted, "Where there are few
   exceedances, we sometimes remove them as statistical
   outliers, but we have to be careful doing so because if
   we leave them out, we are essentially excluding event-
   based data."

           The city sometimes had issues with data retrieval.
           Occasionally, either online data were unavailable for
           running the model, or data were unusable because
           of a reporting error. In those cases, the city had
           to find a comparable data source. For example, if
           rainfall data were unavailable from the local airport,
           the city used precipitation records from the local
           wastewater treatment plant (a comparable distance
           from the beach) to make an initial estimation.
           Once precipitation records from the airport became
           available, they re-ran the model using the amended
           data from the original source.
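
The fallback described above can be sketched as an ordered list of sources, with a flag to re-run the model once the preferred source is available. The source names and values are hypothetical.

```python
def pick_rainfall(sources):
    """Use the first available source; flag a re-run if it was not the preferred one.

    `sources` is an ordered list of (name, value) pairs, preferred source
    first, with value set to None when that source is unavailable.
    """
    for rank, (name, value) in enumerate(sources):
        if value is not None:
            return {"source": name, "rain_mm": value, "rerun_needed": rank > 0}
    raise ValueError("no rainfall source available")


# Morning run: the airport record is missing, so the backup is used.
morning = pick_rainfall([("airport", None), ("wastewater_plant", 6.4)])
print(morning)

# Later run: the airport record has arrived, so the model is re-run with it.
later = pick_rainfall([("airport", 5.8), ("wastewater_plant", 6.4)])
print(later)
```

Recording which source was used makes it easier to trace a prediction back to its inputs when results are reviewed at the end of the season.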

           Moving Forward
           The City of Racine validates the model by comparing
           model results to monitoring (culture and qPCR)
           results. They consider the model to be successful
           because of the low number of Type I and Type II
           errors found after evaluating beach management
           decisions at the end of the beach season. The city ran
           the 2011 (VB v2.1) and 2012 (VB v2.2) models side-
           by-side to compare the results and verified which
           model was most appropriate for each beach. They will
           continue to evaluate their model every year to ensure
           that it is still predictive since major changes can occur
           to beaches and the weather varies significantly from
           year to year. Data collection methods and variables
           used in the model have not changed. The city
           compared the 2011 model to newer model iterations
           and has found that incorporating additional years of
           data has not made any significant improvements.

           [Photo: Waiting for fireworks.]

           [Photo: Lifeguard watching North Beach.]

  As of 2015, the city is using both the monitoring
  and modeling results to make beach notification
  decisions. Because the model results have shown high
  accuracy, monitoring could be reduced in the future.
  However, staff would still need to visit the beaches
  regularly to complete the routine beach sanitary
  survey form that includes data elements necessary to
  run the model. In the future, the city plans to focus
  more on  cost-efficiency; Nowcast models will likely
  play a large role in this endeavor. Eventually, model
  results might be the primary means for making beach
  notification decisions, restricting laboratory analyses
  to only those days when exceedances are predicted.
  Through the use of qPCR this can be accomplished
  in near real time, striking a balance between
  public health protection and maximum utility of
  recreational beaches.
Advice and Lessons Learned
To assist others planning to develop a predictive
model, the City of Racine shared these lessons
learned:
• Partner with agencies or universities that have
  software expertise and experience with predictive
  models.

• Build your model using easily retrievable data
  and collect data in a consistent manner and in
  sufficient quantity. You can't compare an apple
  to an orange (e.g., estimates of wave height
  beachside might not be equivalent to data retrieved
  at a NOAA buoy). It is often best to collect your
  own data and not rely on someone else's.

• Have a robust data set; at least 2 seasons' worth of
  data is preferable.

• Use sanitary surveys to identify pollution sources
  as well as gaps in model performance. One
  season may have a dominant variable that wasn't

     previously accounted for. Sanitary survey data
     should be consistently collected each day that
     sampling occurs.

     Evaluate your dataset before building the model.
     Sometimes modelers will expect an unreasonably
     high R-squared value without much knowledge
     of their data. As a result, modelers might spend
     unnecessary time finding, acquiring, compiling,
     formatting, and reviewing additional data,
     which might not significantly improve model
     performance. A flow chart of inputs and outputs
     for FIB  at your beach can help with this.

     Not all beaches will have a single driving force, but
     those that have unique situations might require
     evaluative criteria prior to model development to
     improve chances of a success. There can be a lot
     of background noise from the frequency of non-
     event related observations in predictive variables.
     The City of Racine improved model performance
     by implementing a rainfall threshold to reduce the
     size of the dataset.

     Examine the interaction between variables—not
     just variables as single elements. For example,
     wind direction  might not be predictive, but wind
     direction plus speed might be (e.g., onshore winds
     exceeding a velocity threshold).

     Determine how to best represent your data (i.e.,
     quantitative, qualitative, categorical, or binary).

     Discuss the threshold for exceedance probabilities
     during the implementation phase. Depending on
     the model, the probability of exceedance result
     might be less or more than expected, given the
     model estimate.

     Have comparable backup data sources for your
     model inputs. Be realistic about model outputs
     and combine the results with experience. Does
             the model output match what my experience tells
             me? How should I expect these environmental
             conditions to affect local water quality? (i.e., your
             model needs to make sense.)

           • Validate the model periodically because ambient
             conditions might change.
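
Two of the tips above, reducing the dataset with a rainfall threshold and combining wind direction with wind speed, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by the City of Racine: the record fields, the 5 mm rainfall cutoff, the 90-degree bearing of the beach, and the 5 m/s onshore threshold are hypothetical values chosen only for the example.

```python
import math

# Hypothetical daily records; field names and values are illustrative only.
records = [
    {"rain_mm": 0.0, "wind_dir_deg": 90.0, "wind_speed_ms": 6.0},
    {"rain_mm": 12.5, "wind_dir_deg": 270.0, "wind_speed_ms": 8.0},
    {"rain_mm": 3.0, "wind_dir_deg": 100.0, "wind_speed_ms": 2.0},
    {"rain_mm": 20.0, "wind_dir_deg": 80.0, "wind_speed_ms": 7.5},
]

RAIN_THRESHOLD_MM = 5.0   # assumed cutoff for a rain "event"
SEAWARD_NORMAL_DEG = 90.0  # assumed: the beach faces due east
ONSHORE_SPEED_MS = 5.0    # assumed velocity threshold


def onshore_component(wind_dir_deg, wind_speed_ms,
                      seaward_normal_deg=SEAWARD_NORMAL_DEG):
    """Wind speed blowing toward the beach (negative means offshore).

    wind_dir_deg is the direction the wind comes FROM, so a wind coming
    from the bearing the beach faces is fully onshore.
    """
    angle = math.radians(wind_dir_deg - seaward_normal_deg)
    return wind_speed_ms * math.cos(angle)


def strong_onshore(record):
    """Binary interaction variable: 1 if onshore wind exceeds the threshold."""
    speed = onshore_component(record["wind_dir_deg"], record["wind_speed_ms"])
    return 1 if speed > ONSHORE_SPEED_MS else 0


def rain_events(rows, threshold_mm=RAIN_THRESHOLD_MM):
    """Keep only the observations at or above the rainfall threshold."""
    return [row for row in rows if row["rain_mm"] >= threshold_mm]


flags = [strong_onshore(r) for r in records]
events = rain_events(records)
print(flags, len(events))
```

Neither wind direction nor wind speed alone separates the first and second records, but the combined binary variable does. Whether such a variable is useful at a given beach still has to be checked against local monitoring data.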

           References
           Cicero, K. The 10 Best Beaches for Families: 2011.
              Parents Magazine. June 2011. Accessed January 22,
              2013. http://www.parents.com.

           Clark, J., Hortobagyi, M., and Yancey, K.B. Just for
              Summer: 51 Great American Beaches. USA Today.
              March 27, 2012. Accessed January 22, 2013.
              http://travel.usatoday.com.

           Kinzelman, Julie. City of Racine. Personal
              communication.

           Kurdas, Stephan. City of Racine. Personal
              communication.

           Our 7 Top Midwest City Beaches. Midwest Living
              Magazine. July-August 2010. Accessed January 22,
              2013. http://www.midwestliving.com.

-------
Case Study
The Stormwater and NexRad Rainfall Models (Horry County, South Carolina)
Introduction
Horry County, South Carolina, has 180 miles of
coastline containing a series of beaches, the most
famous being Myrtle Beach, also known as the
"Grand Strand." Its beaches attract more than 13
million visitors each year.

Like all the public beaches in the state, Grand Strand
beaches are regularly monitored for fecal indicator
bacteria levels by the South Carolina Department of
Health and Environmental Control (SCDHEC), in
conjunction with local governments. The goal of the
monitoring program is to allow the public to make
informed decisions about their recreational activities
and any potential for swimming-associated health
effects.

Water Quality
The water quality of Grand Strand beaches is
typically very good. However, during and after heavy
rainstorms, stormwater discharges occasionally
cause bacteria levels to rise above state water quality
standards, prompting SCDHEC to issue swimming
advisories. To minimize the impact of stormwater
on these beaches, some Grand Strand communities
have extended their stormwater outfall structures
farther out into the ocean to discharge runoff into
deeper waters, away from swimming areas.

In 2011 Myrtle Beach completed a project at 4th
Avenue North that consolidated nine nearshore
stormwater drainage pipes into one large pipe that
runs underneath the seabed and empties into the
Atlantic Ocean more than 1,000 feet from shore.
Similar projects have been conducted at 7th Avenue
South in North Myrtle Beach and at Deep Head
Swash in Myrtle Beach. These and other infrastructure
investments have significantly reduced fecal indicator
bacteria levels at Grand Strand beaches.

Model Development
Stormwater Model
In 2007 SCDHEC developed a model as part of a
staffer's master's thesis project to predict fecal
indicator bacteria levels at South Carolina state
beaches. To be adopted and applied by SCDHEC, the
model needed to be simple to operate and provide
reliable results. The modeling effort evolved to include
a project team consisting of the local health department,
SCDHEC, and University of South Carolina (USC)
professors with geostatistical modeling, database
management, and geographic information system
(GIS) expertise. They chose to develop models using
information for the popular Grand Strand beaches.
These are Tier 1 beaches (the highest priority beaches
because of high risk, high use, or both) and were best
suited for modeling because they have direct
stormwater input and a high number of bathers.
Tier 2 beaches typically had very few exceedances and
bathers.

  SCDHEC and the project team used various statistical
  methods, a literature review, and professional
  judgment to determine which variables to include.
  Rainfall was found to be the primary predictive
  variable.

  The initial models were statistical models with
  rainfall as the most important variable. A
  Multiple Linear Regression (MLR) model and a
  Classification and Regression Tree (CART) model
  were developed and run separately for each sample
  site. To improve prediction, SCDHEC developed
  an ensemble forecast (a statistical approach using
  results from multiple models) by combining results
  from the MLR model (predicting estimated fecal
  indicator bacteria levels) and the CART model
  (estimating the range of expected fecal indicator
  bacteria levels) for each sample site (or section of
  beach). By combining these two results, SCDHEC
  could approximate a third possible fecal indicator
  bacteria level, called the Ensemble prediction. Beach
  managers could use all three model outputs to
  determine the advisory level needed to protect public
  health in different areas.

  NexRad Rainfall Model
  In 2011 SCDHEC began collaborating with USC and
  the University of Maryland to develop an updated
  version of the stormwater model (i.e., the NexRad
  rainfall model), one that would not require the use of
  expensive rainfall equipment. The project entailed
  enhancing a user application with new models and
  developing an automated, database-driven tool
  that would estimate bacteria levels and visualize
  model results, allowing SCDHEC to better predict
  and analyze bacteria-related public health threats.
  The project, led by Dr. Dwayne Porter of USC,
  built on previous efforts and incorporated new
  models that provide rainfall estimates using
  radar-based data. These radar data improved existing
  tools by (1) allowing spatial estimates to be averaged
  over a watershed area instead of applying point
  estimates and (2) allowing for automated integration
  of remotely sensed data, eliminating the need for
  SCDHEC's costly rain gauge network.

  The NexRad rainfall model essentially combined
  the MLR, CART, and ensemble techniques into
  one modeling user interface and added a new
  element: Next-Generation Radar (NexRad) data,
  compiled from a network of high-resolution Doppler
  weather radars operated by the National Oceanic and
  Atmospheric Administration's (NOAA's) National
  Weather Service (NWS). The goal of using NexRad
  is to have data that are as close to real time as
  possible. As of 2013, the NexRad data included only
  rainfall; however, the development team planned to
  consider other variables such as sunlight, temperature,
  salinity, and the number of preceding dry days. The
  development team is still determining the best data
  sources. As of 2015, USC still uses sanitary surveys
  although they can be time-consuming.

  The NexRad rainfall model also used GIS polygons
  of individual watersheds, which were created by
  overlaying piping diagrams of the stormwater systems
  provided by the area's individual municipalities. GIS
  polygons are overlaid to create mini-watersheds to
  determine how much rain falls on each beach site.
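
  The difference between a point estimate and a watershed average can be illustrated with a small worked example. The sketch below assumes that each radar grid cell overlapping a mini-watershed polygon reports a rainfall depth and that the area of overlap between each cell and the polygon is already known from the GIS. The function name, the cell values, and the units are hypothetical; this is not the SCDHEC or USC tool.

```python
def areal_mean_rainfall(cells):
    """Area-weighted mean rainfall over one watershed polygon.

    cells is a list of (rainfall_inches, overlap_area) pairs, one pair per
    radar grid cell that overlaps the polygon. Any area unit works as long
    as it is the same for every cell.
    """
    total_area = sum(area for _, area in cells)
    if total_area <= 0:
        raise ValueError("watershed polygon has no overlapping radar cells")
    weighted_sum = sum(rain * area for rain, area in cells)
    return weighted_sum / total_area


# Hypothetical mini-watershed covered by three radar cells.
# A single rain gauge located in the first cell would report 0.2 inches.
cells = [
    (0.2, 1.0),  # cell lies fully inside the watershed
    (0.8, 0.5),  # half of the cell overlaps the watershed
    (1.4, 0.5),  # half of the cell overlaps the watershed
]

watershed_rain = areal_mean_rainfall(cells)
print(round(watershed_rain, 2))
```

  In this made-up case the watershed average is more than three times the reading of the single gauge, which shows why averaging over the watershed can change the rainfall input to the model.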

  SCDHEC tested the NexRad rainfall model in several
  counties during the 2012 beach swimming season
  (May 15 through October 15) and used model results
  as one of several tools in deciding whether to issue
  swimming advisories. Exceedances of water quality
  standards are expressed as High, Medium, and Low
  (using the MLR and CART model predictions), but
  the model can also provide actual predicted FIB
  levels.

Data
SCDHEC used historical water quality data to
develop and validate the stormwater model in
2007. The data and variables considered included
cumulative rainfall, rainfall intensity, number of
preceding dry days, wind speed and direction, tides
and lunar phase data, water current, and salinity. The
water quality data were collected by the SCDHEC
beach monitoring program. Rainfall data were
collected by a system of gauges installed in several
locations. Wind speed and direction data were
obtained online from NOAA.

In 2011, USC began developing the NexRad rainfall
model based on the assimilation and integration of
multiple sources of data, including field programs
(bacteria density, salinity, air and water temperature,
tide, weather); observing systems (rainfall, currents,
salinity, wind); and remote sensing models (salinity,
air and water temperature, rainfall, currents, wave
activity). SCDHEC provided the bacteria density
data. All other data for the NexRad rainfall model
came from a variety of sources, including the NWS,
the National Estuarine Research Reserve System, and
the Southeast Coastal Ocean Observing Regional
Association's Integrated Ocean Observing System
(IOOS).

Model Output and Validation
To validate the stormwater model, USC compared
the predicted MLR calculations to actual sampled
values twice a month. In general, the stormwater
model expressed predicted exceedances of the water
quality standard with above-average accuracy;
however, SCDHEC did not sample after rain events
at sites where acceptable water quality was predicted.
Therefore, an unknown quantity of false positives
might have occurred.

In 2005 SCDHEC ended the data collection used to
validate the stormwater model. Officials felt that the
post-2005 changes (i.e., the offshore stormwater outfall
pipe [discussed above] and new infiltration pits and
ultraviolet disinfection systems) drastically changed
the environment; therefore, the model was no longer
relevant because it was based on data collected before
these changes occurred.

  The NexRad rainfall model worked better for some
  beaches than others. USC performed receiver
  operating characteristic (ROC) analyses to determine
  the frequency of Type I and Type II errors. USC staff
  assessed model effectiveness by cross-referencing
  samples taken against predicted MLR calculations.
  If the MLR model calculated fecal bacteria levels of
  greater than 103 colony-forming units (CFU) per
  100 milliliters (mL) of water, SCDHEC issued an
  advisory. If the CART model calculated High, SCDHEC
  issued an advisory. If the MLR model calculated
  a concentration of greater than 74 CFU/100 mL
  and the CART model calculated Medium at the same
  site, SCDHEC issued an advisory. USC validates the
  NexRad rainfall model using VB's toolbox for model
  development and validation, which has made model
  updates and validation fairly easy as long as data are
  available. This tool allows the user to compare model
  predictions against actual monitoring data.
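
  The advisory rules described above can be written as a short decision function. The sketch below uses the two thresholds given in the text (103 and 74 CFU/100 mL); the function names and sample values are hypothetical. The error tally assumes that a Type I error means an advisory was predicted without an observed exceedance and that a Type II error means an exceedance was observed without a predicted advisory. It illustrates the logic only and is not the SCDHEC or USC implementation.

```python
MLR_ADVISORY_CFU = 103  # MLR level above which an advisory is issued
MLR_COMBINED_CFU = 74   # MLR level that triggers an advisory with CART "Medium"
CART_CLASSES = {"Low", "Medium", "High"}


def advisory_needed(mlr_cfu_per_100ml, cart_class):
    """Return True if any of the three advisory rules in the text is met."""
    if cart_class not in CART_CLASSES:
        raise ValueError(f"unknown CART class: {cart_class!r}")
    if mlr_cfu_per_100ml > MLR_ADVISORY_CFU:
        return True
    if cart_class == "High":
        return True
    return mlr_cfu_per_100ml > MLR_COMBINED_CFU and cart_class == "Medium"


def error_counts(predicted, observed):
    """Count (Type I, Type II) errors under the assumption stated above."""
    pairs = list(zip(predicted, observed))
    type_1 = sum(1 for p, o in pairs if p and not o)  # advisory, no exceedance
    type_2 = sum(1 for p, o in pairs if o and not p)  # exceedance, no advisory
    return type_1, type_2


# Hypothetical model outputs for five site-days: (MLR estimate, CART class).
model_outputs = [(120, "Low"), (50, "High"), (80, "Medium"),
                 (80, "Low"), (30, "Medium")]
# Hypothetical sampling results: True means the standard was exceeded.
observed_exceedance = [True, False, True, True, False]

predicted_advisory = [advisory_needed(m, c) for m, c in model_outputs]
print(predicted_advisory)
print(error_counts(predicted_advisory, observed_exceedance))
```

  Comparing predictions with sampled values in this way only works for days on which samples were actually collected, so the tally depends on how sampling days are chosen.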

  Model Implementation
  When implementing the stormwater model (during
  the 2007-2009 beach seasons), SCDHEC discovered
  that the effect of rainfall and other variables differed
  by beach site. Consequently, the agency decided that
  each beach site should be modeled independently
  (i.e., using a different statistical model for each station
  or section of beach) to provide the most accurate
  information.

  SCDHEC applied the stormwater model to 10 beaches
  in Horry and Georgetown counties. The model was
  designed to extract rainfall data from rain gauges at
  each beach and independently input weather and tidal
  information. These data were continuously added
  to the model, which was constantly recalibrated,
  although a more intensive recalibration was needed to
  adjust to the infrastructure changes.

  When developing the NexRad rainfall model in 2011,
  USC found that combining the separate, sample
  site-specific models (MLR, CART, and Ensemble)
  into one user interface was fairly easy. As of 2015, USC
  makes the daily model results available via email, a
  Web interface, and a phone application. SCDHEC
  publishes advisory information at
  www.howsthebeach.org. In 2012, SCDHEC used the
  NexRad model to implement the initial suite of
  preemptive beach swimming advisory models as a tool
  to determine when an advisory should be issued in
  Horry County, South Carolina. Because of program
  management changes, SCDHEC did not continue using
  the model for advisory decisions in Horry County
  after 2012.

  Issues Encountered
  The stormwater model was used during the
  2007-2009 beach seasons to make beach management
  decisions. Little or no modeling was performed in
  2010-2011. In 2010, SCDHEC changed the advisory
  program and began placing permanent advisory
  signs at beach sites that routinely exceeded the state
  water quality standards (e.g., stormwater outfalls
  and swashes). Remaining sites were not modeled,
  either because of historically low enterococci
  counts or because they never exceeded water quality
  standards, even after a rainfall event (other than a
  tropical storm). This, coupled with drier seasons,
  meant that the stormwater model was not used very
  frequently, if at all, in most locations. Out of a total
  of 43 sites monitored, SCDHEC placed 29 permanent
  signs saying, "Caution, swimming not advised, high
  bacteria counts, refrain from fishing and wading, do
  not put head below water, no swimming within 200
  feet of sign." SCDHEC still monitored all 43 sites, but
  did not want to invest in new monitoring equipment
  to support the modeling when many sites already had
  permanent signs. In addition, the outdated equipment
  used to measure rainfall began failing and was not
  compatible with new computers. The entire system
  was expensive to replace, with an estimated cost of
  $20,000-$30,000.

  Learning from SCDHEC's experiences, beach
  managers should be aware of the limitations of
  hardware and equipment. For South Carolina, the
  tipping buckets used to gather rain data required a
  significant amount of maintenance and time to keep
  running and clean. Replacement parts for all types
  of equipment can be costly, and equipment can
  become obsolete as newer technology replaces it.
  Equipment can also be difficult to maintain with
  limited staff and resources.

Model Costs
The initial cost to develop the stormwater model
in 2007 was low: basically the cost of a graduate
student's time. Once the model was operational, costs
increased because a series of 11 rain gauges needed to
be installed and maintained. Unfortunately, SCDHEC
budget cuts reduced the resources and staff available
to perform maintenance. Using data collected during
the 2007-2009 beach seasons, SCDHEC was able to
target sites with frequent exceedances and reduce
monitoring and maintenance of the rain gauges at
other sites, further reducing costs. This continued
until permanent signs were put in place at beach sites
where the water quality standards were routinely
exceeded, and eventually SCDHEC stopped using the
rain gauges altogether.

The primary costs for the 2011 NexRad model are
development and continual model updates. There
were no costs associated with model implementation
because model data were obtained for free.

Moving Forward
The NexRad rainfall model eliminated the need for
updates and maintenance of the rain gauge network;
improved timeliness by providing robust decision
support well in advance of verification by biological
sample cultures; and improved accuracy by providing
reliable forecasts of beach hazards that would merit
closures, while reducing false positives. These models
are some of the first marine Enterococcus models,
and some of the first to use CART models. They
are transferable to other swimming beaches in the
southeast United States that experience similar
weather and water circulation patterns and have
stormwater runoff as the most significant pollution
source. In the future, the scientists who developed
the model hope to increase buoy and radar coverage
to provide improved spatial resolution of data and to
assess the use of the model for predicting salinity and
currents.

  USC's Dr. Dwayne Porter advises other beach
  programs that "you do not want to shortchange the
  modeling effort, but simpler is often better." Sean
  Torrens with SCDHEC encourages beach managers to
  collaborate with others, such as graduate students and
  universities, and to research what others are doing to
  avoid reinventing the wheel.

  References
  NRDC (Natural Resources Defense Council). Testing
     the Waters: South Carolina.
     http://www.nrdc.org/water/oceans/ttw/sc.asp.

  Porter, Dwayne, University of South Carolina.
     Personal communication.

  South Carolina Department of Health and
     Environmental Control. Beach Monitoring
     Program.
     http://www.scdhec.gov/HomeAndEnvironment/
     Water/SwimSafety/.

  Southeast Coastal Ocean Observing Regional
     Association. Water Quality Observations and
     Models Help Managers Make Decisions on Issuing
     Swim Advisories, www.secoora.org.

  Torrens, Sean, South Carolina Department of
     Health and Environmental Control. Personal
     Communication.

-------