&EPA
EPA 600/R-08/041B I October 2010 I www.epa.gov/ord
   United States
   Environmental Protection
   Agency
                     User's Manual TEVA-SPOT Toolkit
                     VERSION 2.4
   Office of Research and Development
   National Homeland Security Research Center

-------
                                                   EPA 600/R-08/041B I October 2010 I www.epa.gov/ord
        User's Manual TEVA-SPOT Toolkit
        Version 2.4
         by
         Jonathan Berry, Erik Boman, Lee Ann Riesen
         Scalable Algorithms Dept
         Sandia National Laboratories
         Albuquerque, NM 87185


         William E. Hart, Cynthia A. Phillips, Jean-Paul Watson
         Discrete Math and Complex Systems Dept
         Sandia National Laboratories
         Albuquerque, NM 87185


         Project Officer:
         Regan Murray
         National Homeland Security Research Center
         Office of Research and Development
         U.S. Environmental Protection Agency
         Cincinnati, OH 45256
Office of Research and Development
National Homeland Security Research Center, Water Infrastructure Protection Division

-------
    The U.S. Environmental Protection Agency (EPA) through its Office of Research and Develop-
    ment funded and collaborated in the research described here under an Inter-Agency Agreement
    with the Deparment of Energy's Sandia National Laboratories (IAG # DW8992192801). This
    document has been subjected to the Agency's review, and has been approved for publication as
    an EPA document. EPA does not endorese the purchase or sale of any commercial products or
    NOTICE: This report was prepared as an account of work sponsored by an agency of the United
    States Government. Accordingly, the United States Government retains a nonexclusive, royalty-
    free license to publish or reproduce the published form of this contribution, or allow others to
    do so for United States Government purposes.  Neither Sandia Corporation, the United States
    Government, nor any agency thereof, nor any of their employees makes any warranty, express
    or implied,  or assumes any legal liability or responsibility for the accuracy, completeness, or
    usefulness of any information, apparatus, product, or process disclosed, or represents that its use
    would not infringe privately-owned rights. Reference herein to any specific commercial product,
    process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily
    constitute or imply its endorsement, recommendation, or favoring by Sandia Corporation, the
    United States Government, or any agency thereof. The views and opinions expressed herein do
    not necessarily  state or reflect those of Sandia Corporation, the United States Government or
    any agency thereof.

    Questions concerning this document or its application should be addressed to:

        Regan Murray
        USEPA/NHSRC (NG 16)
        26 W Martin Luther King Drive
        Cincinnnati OH 45268
        (513) 569-7031
        Murray.Regan@epa.gov
Sandia is a multiprogram laboratory operated by Sandia Corpora-
tion, a Lockheed Martin Company, for the United States Depart-
ment of Energy's National Nuclear Security Administration under
Contract DE-AC04-94-AL85000.
Sandia National Laboratories

-------
Forward
Since its inception in 1970, EPA's mission has been to pursue a cleaner, healthier environment for the
American people.  The Agency was assigned the daunting task of repairing  the damage already done to
the natural environment and establishing new criteria to guide Americans in making a cleaner environment
a reality.  Since  1970, the EPA has worked with federal, state, tribal, and local partners  to advance its
mission to protect human health and the environment. In order to carry out its mission, EPA employs and
collaborates with some of the nation's best scientific minds.  EPA prides itself in applying sound science and
state of the art techniques and methods to develop and test innovations that will protect  both human health
and the environment.

Under existing laws and recent Homeland Security Presidential Directives, EPA has been  called upon to play
a vital role in helping to secure the nation against foreign and domestic enemies.  The National Homeland
Security  Research  Center (NHSRC) was formed in 2002 to conduct research in support of EPA's role in
homeland security. NHSRC research efforts  focus on five areas:  water infrastructure protection, threat and
consequence assessment, decontamination and consequence management, response capability enhancement,
and homeland security technology testing and evaluation. EPA is the lead federal agency for  drinking water
and wastewater systems and the NHSRC is working to reduce system vulnerabilities, prevent and prepare for
terrorist attacks, minimize public health impacts and infrastructure damage, and enhance recovery efforts.

This Users Manual for the TEVA-SPOT Toolkit software package is published and made available by EPA's
Office of Research and Development to assist the user community and to link researchers with their clients.
    Jonathan Herrmann, Director
    National Homeland Security Research Center
    Office of Research and Development
    U. S. Environmental Protection Agency

-------
License Notice
TEVA-SPOT Toolkit is Copyright 2008 Sandia Corporation.  Under the  terms of Contract DE-AC04-
94AL85000 with Sandia Corporation, the U.S. Government retains certain rights in this software.

The "library" refers to the TEVA-SPOT Toolkit software, both the executable and associated source code.
This library is free software; you can redistribute it and/or modify it under the terms of the BSD License as
published by the Free Software Foundation.

This library is distributed in the hope that it will be useful, but  WITHOUT ANY WARRANTY; without
even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
the GNU Lesser General Public License for more details.

The  TEVA-SPOT Toolkit utilizes a variety of external  executables that are distributed under separate
open-source licenses:

   • PICO - BSD and Common Public License

   • randomsample, sideconstraints - ATT Software  for noncommercial use.

   • ufl - Common Public License

-------
Acknowledgements


The National Homeland Security Research Center would like to acknowledge the following organizations and
individuals for their support in the development of the TEVA-SPOT Toolkit User's Manual and/or in the
development and testing of the TEVA-SPOT Toolkit Software.

Office of Research and Development -  National Homeland Security Research Center

   Robert Janke
   Regan Murray
   Terra Haxton

Sandia National Laboratories

   Jonathan Berry
   Erik Boman
   William Hart
   Lee Ann Riesen
   Cynthia Phillips
   Jean-Paul Watson

Argonne National Laboratory

   Thomas Taxon

University of Cincinnati

   James Uber

American Water Works Association Utility Users Group

   Kevin Morley (AWWA)

-------

-------
Contents









1  Introduction                                                                             1




   1.1   What is TEVA-SPOT?	   1




   1.2   About This Manual	   2






2  TEVA-SPOT Toolkit Basics                                                              3




   2.1   Approaches to Sensor Placement  	   3




   2.2   The Main Steps in Using SPOT	   4




        2.2.1  Simulating Contamination Incidents 	   4




        2.2.2  Computing Contamination Impacts	   4




        2.2.3  Performing Sensor Placement	   5




        2.2.4  Evaluating a Sensor Placement	   5




   2.3   Installation and Requirements for Using SPOT 	   6




   2.4   Reporting Bugs and Feature Requests	   7






3  Sensor Placement Formulations                                                          8




   3.1   The Standard SPOT Formulation  	   8




   3.2   Robust SPOT Formulations	   9




   3.3   Min-Cost Formulations	  10




   3.4   Formulations with  Multiple Objectives 	  10




   3.5   The SPOT Formulation with Imperfect Sensors	  10






4  Contamination Incidents and Impact Measures                                          12




   4.1   Simulating Contamination Incidents  	  12




   4.2   Using tso2Impact	  12




   4.3   Impact Measures	  13




   4.4   Advanced Tools for Large Sensor Placements Problems	  14






5  Sensor Placement Solvers                                                               16




   5.1   A Simple Example	  16




   5.2   Computing a Bound on the Best Sensor Placement Value	  20

-------
   5.3  Minimizing the Number of Sensors	  20



   5.4  Fixing Sensor Placement Locations 	  21




   5.5  Robust Optimization of Sensor Locations	  21



   5.6  Multi-Criteria Analysis	  22




   5.7  Sensor Placements without Penalties	  24



   5.8  Limited-Memory Sensor Placement Techniques	  26




   5.9  Evaluating a Sensor Placement	  27



   5.10 Sensor Placement with Imperfect Sensors	  29




   5.11 Summary of Solver Features	  30






6  File Formats                                                                                32




   6.1  TSO File	  32



   6.2  SDX File	  32




   6.3  TSG File	  32



   6.4  TAI File	  32




   6.5  Sensor Placement File	  33



   6.6  Impact File	  33




   6.7  LAG File	  34



   6.8  Scenario File	  34




   6.9  Node File	  34




   6.10 Sensor Placement Configuration File	  35



   6.11 Sensor Placement Costs File	  35




   6.12 Placement Locations File 	  36



   6.13 Sensor Class File	  36




   6.14 Junctions Class File	  37

-------
      Introduction
Public water distribution systems are inherently vulnerable to accidental or intentional contamination be-
cause of their distributed geography.  Further,  there are many challenges to detecting contaminants in
drinking water systems: municipal distribution systems are large,  consisting of hundreds or thousands of
miles of pipe; flow patterns are driven by time-varying demands placed on the system by customers; and dis-
tribution systems are looped, resulting in mixing and dilution of contaminants. The use of on-line, real-time
contaminant warning systems (CWSs) is a promising strategy for mitigating these risks. Online sensor data
can be combined with public health surveillance systems, physical security monitoring, customer complaint
surveillance, and routine sampling programs to effect a rapid response to contamination incidents [20].

A variety of technical challenges need to be addressed to make CWSs a practical, reliable element of water se-
curity systems. A key aspect of CWS design is the strategic placement of sensors throughout the distribution
network. Given a limited number of sensors, a desirable sensor placement minimizes the potential economic
and public health impacts of a contaminant incident. There are a wide range of important design objectives
for sensor placements (e.g., minimizing the cost of sensor installation and maintenance, the response time to
a contamination incident, and the extent of contamination). In addition, flexible sensor placement tools are
needed to analyze CWS designs in large scale networks.
1.1    What is TEVA-SPOT?
The Threat Ensemble Vulnerability Assessment and Sensor Placement Optimization Tool (TEVA-SPOT)
has been developed by the U. S. Environmental Protection Agency, Sandia National Laboratories, Argonne
National Laboratory, and the University of Cincinnati. TEVA-SPOT has been used to develop sensor network
designs for several large water utilities [11], including the pilot study for the EPA's Water Security Initiative.

TEVA-SPOT allows a user to specify a wide range of modeling inputs  and performance objectives for
contamination warning system design.   Further, TEVA-SPOT supports a flexible decision framework for
sensor placement that involves two major steps:  a modeling process and a decision-making  process [12].
The modeling process includes (1) describing sensor characteristics, (2) defining the design basis threat, (3)
selecting impact measures for the CWS, (4) planning utility response to sensor detection, and (5) identifying
feasible  sensor locations.

The design  basis threat  for a CWS is  the ensemble of contamination  incidents that a CWS should  be
designed to protect against. In the simplest  case, a design basis threat is a contamination scenario with
a single contaminant that is introduced at a  specific time and place.  Thus, a design basis threat consists
of a set of contamination incidents that can be simulated with standard water distribution  models [17].
TEVA-SPOT provides a convenient interface for defining and computing the impacts of design basis threats.
In particular, TEVA-SPOT can simulate many contamination incidents in parallel, which has reduced the
computation of very large design basis threats from weeks to hours on the EPAs high performance computing
system.

TEVA-SPOT was designed to model a wide range of sensor placement problems. For example, TEVA-SPOT
supports a number of impact  measures, including  the  number of people exposed to dangerous levels of a
contaminant, the volume of contaminated water used  by customers, the number of feet of contaminated
pipe, and the time to detection. Response delays can also be specified to account for the time a water utility
would need to verify a contamination incident before notifying the public.  Finally, the user can specify the
feasible  locations for sensors and fix sensor locations during optimization.  This flexibility allows a user to
evaluate how different factors impact the CWS performance and to iteratively refine a CWS design.

-------
1.2   About This Manual

The capabilities of TEVA-SPOT can be accessed either with a GUI or from command-line tools. This user
manual describes the TEVA-SPOT  Toolkit, which contains these command-line tools. The TEVA-SPOT
Toolkit can be used within either a MS Windows DOS shell or any standard Unix shell (e.g. the Bash shell).

The following sections describe the TEVA-SPOT Toolkit, which we refer to as SPOT throughout this manual:

   • TEVA-SPOT Toolkit Basics - An introduction to the process of sensor placement, the use of SPOT
     command-line tools, and installation of the SPOT executables.

   • Sensor Placement Formulations - The mathematical formulations used by the SPOT solvers.

   • Contamination Incidents and Impact Measures - A description of how contamination incidents
     are computed, and the impact  measures that can be used in SPOT to analyze them.

   • Sensor Placement Solvers - A description of how to apply the SPOT sensor placement solvers.

   • File Formats - Descriptions of the formats of files used by the SPOT solvers.

In addition, the appendices of this manual describe the syntax and usage  of the SPOT command-line exe-
cutables.

-------
      TEVA-SPOT Toolkit  Basics
This section provides an introduction to the process of sensor placement, the use of SPOT command-line
tools, and the installation of the SPOT executables.

2.1    Approaches to Sensor  Placement

Sensor placement strategies can be broadly characterized by the technical approach and the type of com-
putational model used. The following categories reflect important differences in proposed sensor placement
strategies:


   • Expert Opinion:  Although expertise with water distribution systems is always needed to design an
     effective CWS, here we refer to approaches that are solely guided by expert judgement. For  example,
     Berry et al. [4] and Trachman  [19] consider sensor placements developed by experts  with significant
     knowledge of water distribution systems. These experts did  not use computational models  to carefully
     analyze network dynamics. Instead, they used their experience to identify locations whose water quality
     is representative of water throughout the network.

   • Ranking Methods:  A related approach  is to use preference information to rank network  locations
     fl, 8].  In this approach, a user provides preference values for the properties of a "desirable" sensor
     location, such as proximity to critical facilities.  These preferences can then be used to  rank  the
     desirability of sensor locations throughout the network.  Further, spatial information can be integrated
     to ensure good coverage of the network.

   • Optimization: Sensor placement can be automated with optimization methods that computationally
     search for  a sensor configuration that minimizes contamination risks.  Optimization methods use  a
     computational model to estimate the performance of a sensor configuration. For example, a model
     might compute the expected impact  of an ensemble of contamination incidents, given sensors placed
     at strategic locations.
     Optimization methods can be further distinguished by the type of computational model that  they use.
     Early sensor placement research focused on models that used simplified network models derived from
     contaminant transport simulations.  For example, hydraulic simulations can be used to model stable
     network flows  [3], or to generate an averaged water network flow model [14].
     More recently, researchers have used models  that directly rely  on contaminant transport simulation
     results. Simulation tools, like EPANET [17], perform extended-period simulation of the hydraulic and
     water quality behavior within pressurized pipe networks. These models can evaluate the expected flow
     in water distribution systems, and they can model the transport of contaminants and related chemical
     interactions. Thus, the CWS design process can directly minimize contamination risks by  considering
     simulations of an ensemble of contamination  incidents, which reflect  the impact of contamination at
     different locations, times of the day, etc.


SPOT development has focused on optimization methods, and in particular on methods that use contaminant
transport simulation. Contaminant transport simulation models can directly model contamination risks, and
consequently optimization methods using these models have proven effective at minimizing risk. Comparisons
with expert opinion and ranking methods suggest that these approaches are not as effective in large, complex
networks [4, 15]. Further, optimization methods using simpler models can fail to capture important  transient
dynamics (see Berry et al. [6] for a comparison).

A key issue for the simulation-based optimization methods is that they require the simulation of a potentially
large number of contamination  incidents. Consequently,  it is very expensive to apply generic optimization

-------
methods like evolutionary algorithms [14]. However, Berry et al.  [5] have shown that these simulations can
be performed in an off-line preprocessing step that is done in advance of the optimization process.  Thus, the
time needed for simulation does not necessarily impact the time spent performing sensor placement.

2.2   The Main Steps in Using SPOT

The following example illustrates the main steps required to (1) simulate contamination incidents, (2) com-
pute contamination impacts,  (3) perform sensor placement,  and (4)  evaluate a sensor placement.  This
example places sensors in EPANET Example 3 (Net3), a small distribution system with 97 junctions.

The data used in this example is available in the C:\spot\examples\simple directory. In general, a user will
need to  use a variety of data sources to develop a sensor placement model. The file C:\spot\doc\SPOT_-
DataRequirements.doc discussess the type of data used in TEVA-SPOT in greater detail.

2.2.1   Simulating Contamination Incidents

Simulation of contamination incidents is performed with the tevasim  command, which iteratively calls
EPANET  to simulation an ensemble of contamination incidents.  The tevasim command has the following
inputs and outputs:

   • Inputs:

        —  TSG File: defines an ensemble of contamination scenarios

        -  INP File: the EPANET input file for the network

   • Outputs:

        —  TSO File: a binary file that stores the contamination results for all incidents

        -  SDX File: a binary index file into the TSO  file

        —  OUT File: a plain text log file

For example, the file C:\spot\examples\simple\Net3.tsg  defines an ensemble of contamination scenarios for
Net3. Contamination incidents are simulated  for all network junctions, one for each hour of the day, and
each contamination incident models an inject that continues for 24 hours. The tevasim command performs
these contaminant  transport simulations, using the following command line:

   tevasim --tsg HetS.tsg --tso HetS.tso HetS.inp HetS.out


2.2.2   Computing Contamination Impacts

A TSO  file contains raw data about  the extent of a contamination throughout a network. This data needs
to be post-processed to compute relevant impact statistics. The tso2Impact command processes a TSO file
and generates one or more IMPACT files. An IMPACT file  is a plain text file that summarizes the consequence
of each  contamination incident in a  manner that facilitates optimization.  The tso2Impact  command has
the following inputs and outputs:

   • Inputs:

        —  TSO File: a binary file of contamination result data generated by tevasim

        —  SDX File: a binary index file generated by tevasim

-------
          INP File: the EPANET input file for the network, which is used to compute impact measures like
          the extent of contamination

   • Outputs:

          IMPACT File(s): plain text files that summarize the observed impact at each location where a
          contamination incident could be observed by a potential sensor.

          NODEMAP File(s):  plain text files that map sensor placement ids to the network junction labels
          (defined by EPANET).

The tso2Impact command generates IMPACT files with the following command line:

   tso2Impact --me --vc --td --nfd --ec Het3 HetS.tso


This command generates IMPACT files for each of the five objectives specified: mass consumed (me), volume
consumed (vc), time to detection (td), number of failed detections (nfd) and extent of contamination (ec).
For each impact file (e.g. Net3_mc. impact ), a corresponding id file is generated (e.g.  Net3_mc. impact. id
).

2.2.3   Performing Sensor Placement

An IMPACT file can be used to define a sensor placement optimization problem. The standard problem
supported by SPOT is to minimize  the expected impact over an ensemble of incidents while  limiting the
number of potential sensors. By default, sensors can be placed at any junction in the network.  The sp
command coordinates the application of optimization solvers for sensor placement. The sp command has a
rich interface, but the simplest  use of it requires the following inputs and outputs:

   • Inputs:

          IMPACT File(s): plain text files that summarize the observed impact at each location

          NODEMAP File(s):  plain text files that map sensor placement ids to the network junction labels

   • Outputs:

          SENSORS File:  a plain text file that summarizes the sensor locations identified by the optimizer

For example, the command

   sp --print-log --netฅork="Het3"  --objective=mc  --solver=snl_grasp \
           --ub=ns,5 --seed=1234567


generates the file Net3.sensors, and prints a summary of the impacts for this sensor placement.

2.2.4   Evaluating a Sensor  Placement

The final output provided by the sp  command is  actually generated by the evalsensor command, and this
command can be directly used to evaluate a sensor placement for a wide variety of different objectives.  The
evalsensor command requires the following inputs:

   • Inputs:

-------
          IMPACT File(s): plain text files that summarize the observed impact at each location

          NODEMAP File(s): plain text files that map sensor placement ids to the network junction labels

          SENSORS File: a plain text file that defines a sensor placement

For example, the command

    evalsensor --nodemap=Net3.nodemap Net3.sensors Het3_ec.impact \
              Net3_mc.impact Het3_nfd.impact


will summarize the solution in the Net3. sensors file for the ec, me and nf d impact measures. No files are
generated by evalsensors.

2.3  Installation  and Requirements for  Using SPOT

Instructions for installing SPOT in Unix are  included in the first appendix.  Installation on MS Windows
platforms is considerably easier. An installer executable can be downloaded from

   https://software.sandia.gov/trac/spot/downloader


When run, this installer places the SPOT software in the directory

   C:\tevaspot


The installer places executables in the directory

   C:\tevaspot\bin


This directory should be added to the system PATH environment variable to allow SPOT commands to be
run within any DOS shell.

Some of the SPOT commands use the Python scripting language. Python is not commonly installed in MS
Windows machines, but an installer script can be downloaded from

   http://www.python.org/download/


Unfortunately,  the system  path needs to  be modified to include the Python  executable.  A nice  video
describing how to edit the system path is available at:

   http://showmedo.com/videos/video?name=960000&fromSeriesID=96


No other utilities need to be  installed to run the SPOT commands.  EPANET  is linked into the tevasim
executable. Detailed information about the SPOT commands is provided on the SPOT wiki:

   https://software.sandia.gov/trac/spot/wiki/Tools


Note that all SPOT commands need to be run from  the DOS command shell.  This can be launched from
the "Accessories/Command Prompt"  menu. Numerous online tutorials can provide information about DOS
commands. For example, see

-------
  http://en.wikipedia.org/wiki/List_of_DOSmcommands
  http://www.computerhope.com/msdos.htm


The plain text input files used by SPOT can be edited using standard text editors. For example, at a DOS
prompt you can type

  notepad NetS.tsg


to open up the NetS.tsg file with the MS Windows Notepad application.  The plain text output files can be
viewed in a similar manner. The binary files generated by SPOT cannot be viewed in this manner. Generally,
output files should not be modified manually since many are used as input to other programs.

2.4   Reporting Bugs and Feature Requests

The TEVA-SPOT development team uses  Trac tickets to communicate requests for features and bug fixes.
The TEVA-SPOT Trac site can can be accessed at:

  https://software.sandia.gov/trac/spot


External  users can insert a ticket, which will be moderated by the developers. Note that this is the only
mechanism for ensuring that bug fixes will be made a high priority by the development team.

-------
      Sensor Placement  Formulations
SPOT integrates solvers for sensor placement that  have been developed by Sandia National Laboratories
and the Environmental Protection Agency, along with a variety of academic collaborators [3, 5, 7, 9, 12, 13].
SPOT includes (1) general-purpose heuristic solvers that consistently locate optimal solutions in  minutes,
(2) integer- and linear-programming heuristics that find  solutions of provable quality, (3) exact solvers that
find globally optimal solutions, and (4) bounding techniques that can evaluate solution optimality.  These
solvers optimize a representation of the sensor placement problem that may be either an implicit or explicit.
However, in either case we can describe the mathematical formulation for this problem.

This section describes the mixed integer programming (MIP) formulations optimized by the SPOT solvers,
and this presentation assumes that the reader is familiar with MIP models. First, we describe the standard
SPOT formulation, eSP, which minimizes expected impact given a sensor budget. Subsequently, we describe
several other sensor placement formulations that SPOT  solvers can optimize. This discussion is limited to
a description of the mathematical structure of these sensor placement problems. In many cases, SPOT has
more than one optimizer that can optimize these formulations, and we describe these optimizers later in this
manual. However, the goal of this section is to describe the mathematical structure of these formulations.

3.1    The Standard SPOT  Formulation

The most widely  studied sensor placement formulation for CWS design is to minimize the expected impact
of an ensemble of contamination incidents given a  sensor budget.  This formulation has also  become the
standard formulation in SPOT,  since it can be effectively used to select sensor placements in large water
distribution networks.

A MIP formulation for expected-impact sensor placement is:


       (eSP)   min  Y,aeA aฐ- ^ieฃ0 daixai

                   xai
-------
has finished, which corresponds to the impact that would occur without an online CWS, or  (2) it has zero
impact.  The first approach treats detection by this dummy location as a penalty.  The second approach
simply ignores the detection by this  dummy, though this  does not really make  sense without additional
side-constraints on the number of failed  detections.

The eSP formulation is a slight generalization of the sensor placement model described by Berry et al.  [5].
Berry et al. treat the  impact of the dummy is treated  as a penalty, in which case the third constraint is
redundant.  The  impact of a dummy detection is larger than all  other impacts  for each incident, so the
witness variable xai for the dummy will  only be selected if no sensors have been placed that can detect this
incident.

Ignoring the constraint in this case, Berry  et al.  note  that  eSP is identical to the well-known p-median
facility location problem [10] when Cj = 1. In the p-median problem, p facilities (e.g., central  warehouses)
are to be located on  m potential sites  such that the sum of distances dai between each of n customers
(e.g.,  retail outlets) and the nearest  facility i is minimized.  In comparing eSP  and p-median problems,
we observe equivalence between (1) sensors  and facilities, (2) contamination incidents and customers, and
(3)  contamination impacts and distances.  While eSP  allows placement  of at most p sensors, p-median
formulations generally  enforce placement of  all p facilities; in practice, the distinction is irrelevant unless p
approaches the number of possible locations.

3.2   Robust  SPOT Formulations

The eSP model can be viewed as optimizing one particular statistic of the distribution of impacts defined
by the contaminant transport simulations. However, other  statistics may provide  more "robust" solutions,
that are less sensitive to changes in this  distribution [22]. Consider the following reformulation of eSP:

       (rSP)   min  Impact f (a, d,x)
              s.t.     ieฃa xai =  1   Va €  A
                   Xai < Si          Va €  A, i € Ca
                   xai < I - Si      Va €  A, i € Ca \ {q}
                      ieLCiSi 

/?}. Note that the distribution W changes with each sensor placement. Further, VaR can be computed using the a, d and x values. • TCE: The Tail-Conditioned Expectation (TCE) is a related metric which measures the conditional expectation of impact exceeding VaR at a given confidence level. Given a confidence level 1 — /?, TCE is the expectation of the worst impacts with probability /?. This value is between VaR and the worst-case value.


-------
     Mathematically, we have

           TCE(/3) = E [W | W > VaR(/3)] .

     The Conditional Value-at-Risk (CVaR) is a linearization of TCE investigated by Uryasev and Rock-
     afellar [16]. CVaR approximates TCE with a continuous, piecewise-linear function of /?, which enables
     the use of CVaR in a MIP models for rSP.

   • Worst:  The worst impact value  can  be easily computed,  since a finite number of contamination
     incidents are simulated.  Further, rSP  can be reworked to formulate a worst-case  MIP  formulation.
     However, this statistic is sensitive to changes in the number of contamination incidents that are mod-
     eled; adding additional contamination incidents may significantly impact this statistic.


3.3    Min-Cost Formulations

A standard variant of eSP and rSP is to minimize cost while constraining the impact to be below a specified
threshold, u . For example, the eSP MIP can be revised  to formulate a MIP to minimize  cost:

       (ceSP)  min     ieLCjSj
              s.t.      iฃฃa xai = l              Va € A
                                               Va € A, i € Ca
                                               Va € A, i € ฃ0 \ M
                       aฃ,4 ฐ'a  iฃฃa ฎaixai  S u
                    Si € {0, 1}                  Vซ € L
Minimal cost variants of rSP can also be easily formulated.

3.4    Formulations  with Multiple Objectives

CWS design generally requires the evaluation and optimization of a variety of performance objectives. Some
performance objectives cannot be simultaneously optimized, and thus a CWS design must be selected from
a trade-off between these objectives [21].

SPOT supports the analysis of these trade-offs with the specification of additional constraints on impact
measures.  For example, a user can minimize the expected extent  of contamination (ec) while constraining
the worst-case time to detection (td). SPOT allows for the specification of more than one impact constraint.
However, the SPOT solvers cannot reliably optimize formulations with more than one impact constraint.

3.5    The SPOT Formulation with Imperfect Sensors

The previous sensor placement formulations make the implicit assumption that sensors work perfectly. That
is, they never  fail to detect a contaminant  when it exists, and they never generate an erroneous detection
when no contaminant exists. In practice, sensors are imperfect, and they generate these types of errors.

SPOT addresses this issue by supporting a formulation that models simple sensor failures [2]. Each sensor,
Si, has an  associated probability of failure, Pi. With these probabilities,  we can easily assess the probability
that a contamination incident will be detected by a particular sensor. Thus, it is straightforward to compute
the expected impact of a contamination incident.

This formulation does not explicitly allow  for the specification of probabilities  of false detections.  These
probabilities do not  impact the performance of a CWS during  a contamination  incident.  Instead, they
impact the day-to-day maintenance and use of the  CWS; erroneous detections create work for the CWS
users, which is an ongoing cost. The overall likelihood of false detections is simply a function of the sensors


                                                10

-------
that are selected. In cases where every sensor has the same likelihoods, this implies a simple constraint on
the number of sensors.
                                                 11

-------
4    Contamination Incidents and Impact Measures


This section describes how to simulate contamination incidents and compute contamination impacts, which
are the first steps needed to setup and solve a sensor placement problem with SPOT. These two steps can
be viewed as preprocessing or data preparation for sensor placement optimization. Thus, these steps can be
performed prior to optimization, which is generally a more interactive, iterative process.

The following sections illustrate the capabilities of SPOT with the example in the C:\spot\examples\simple
directory.

4.1   Simulating Contamination Incidents

To simulate contamination incidents, the tevasim (p. ??) command is  utilized, which uses EPANET to
perform an ensemble of contaminant transport simulations defined by a TSG File (p. 32). An ensemble of
contamination scenarios for EPANET Example Net3 is defined in the file C:\spot\examples\simple\Net3.tsg.
Contamination incidents are simulated for all network junctions,  one for each hour of the day, and  each
contamination incident models an injection that continues for 24 hours. The tevasim command is run with
the following command line:

   tevasim  --tsg NetS.tsg --tso NetS.tso NetS.inp NetS.out


This command generates three files:  (a) NetS.tso, a binary TSO file that contains the contamination transport
data, (b) NetS.sdx,  a binary SDX file that provides an index into the TSO file, and (c) NetS.out, which
provides a textual summary of the EPANET simulations and is the same as the report file  (*.rpt)  from
EPANET.

4.2   Using tso2Impact

After running tevasim (p. ??) command, the output files, NetS.tso and Net3.sdx, can be used to compute
one or more IMPACT files. An IMPACT file summarizes the consequence of each  contamination incident in
a manner that facilitates optimization.  The tso2Impact (p. ??)  command generates these files with the
following command line:

   tso2Impact --me --vc --td --nfd --ec Net3 HetS.tso


This command generates IMPACT files for each of the five objectives specified: mass consumed (me),  volume
consumed (vc), time to detection (td), number of failed detections (nfd)  and extent of contamination  (ec).
For each IMPACT file  (e.g.  Net3_mc. impact ), a corresponding ID file is generated  to map  the sensor
placement ids back to the network junction labels (e.g. Net3_mc. impact. id ).

The  impact  measures computed by tso2Impact represent the amount of impact  that would occur up  until
the point where a contamination incident is detected. This computation assumes that sensors work perfectly
(i.e., there are no false positive  or false negative errors).  However, we can generalize the sensor behavior
in two ways. First, we can specify a detection threshold; contaminants are only detected  above  a specified
concentration limit (the default limit is zero). Second, we can specify a response time, which accounts for the
time needed to verify that a contamination has occurred and  then inform the public (the default response
time is zero). The contamination impact is  computed at the  time where the response has completed (the
detection time plus response time), which is called the effective response time. For undetected incidents, the
effective response time is simply the end of the contaminant transport simulation. The following illustrates
how  to specify these options:


                                                12

-------
    tso2Impact  --responseTime 60  --detectionLimit 0.1 --me Het3 NetS.tso


This computes impacts for a 60 minute response time, with a 0.1 detection threshold.  Note that the units
for  -detectionLimit are the same as for the MASS values that are specified in the TSG file.

Impacts from  multiple TSO files  can be combined to generate a single IMPACT file using the following
syntax:


    tso2Impact  --detectionLimit 30000000  --detectionLimit 0.0001 --me Het3 Het3_la.tso Het3_lb.tso


Note that the value  of 30000000 corresponds to the detection threshold for  the contaminant described in
Net3_la.tso and 0.0001 is the detection threshold for the  contaminant described in Net3_lb.tso.  For
example, this can be used to combine simulation results from different types  of contaminants, in which the
TSO files would have been generated from different TSG files.  Murray et  al. [12] use this technique to
combine data from different tyes of contamination incidents into a single impact metric.

4.3  Impact Measures

After running tevasim (p. ??) command, the output files, NetS.tso and NetS.sdx, can be used to compute
one or more IMPACT  files.  An IMPACT file summarizes the consequence of each contamination incident in a
manner that facilitates optimization. A variety of objective measures are supported by tso2Impact to reflect
the different criteria  that decision makers could use in CWS design.  For most of these criteria, there  is a
detected and undetected version of the  objective.  This difference concerns how undetected contamination
incidents are modeled.

For example, the default time-to-detection objective, td, uses the time at which the EPANET simulations
are terminated to define the time for incidents that are not detected.  By contrast, the detected time-to-
detection, dtd, simply ignores these incidents (they have impact zero).  Sensor placement with the detected
objective is somewhat more precise, but this objective needs to be optimized with a revised formulation that
explicitly limits the fraction of incidents that are not detected by the sensors.

The following objectives are currently supported by tso2Impact:

   • ec and dec - The extent of contaminated in the network. This is the total feet of pipes contaminated
     by the effective response time.  An entire pipe is considered contaminated if contaminant enters the
     pipe at a given time step. For ec, the extent of contamination of an undetected incident is the extent of
     contamination at the point when the simulation terminates, while undetected contamination incidents
     are ignored for  dec.

   • me and dmc -  The mass of contaminant consumed by junctions in the network with nonzero demand.
     For me,  the mass  of contaminant of an undetected incident is the mass of contaminant that has left
     the network via demand at the point when the simulation terminates, while undetected contamination
     incidents are ignored for dmc.  This objective is typically measured in milligrams (the units used in
     the TSG file are mg/L). However, concentrations may also be interpreted; for example, we can treat
     this measure as a count of cells for a biological contaminant, where the  TSG measurement is cells/L.

   • nfd - The  number of contamination incidents that are not detected by any sensor before  the con-
     taminant transport simulations terminate.  NOTE: this measure  is not  affected by the response time
     option.

   • pe and dpe - The number of individuals exposed to a contaminant. For pe, the population exposed for
     an undetected incident is the population exposed at the point when the simulation terminates, while
     undetected contamination incidents are ignored fo dpe.


                                                13

-------
   • pd and dpd - The number of individuals that receive a dose of contaminant above a specified threshold.
     For pd, the population dosed by an undetected incident is the population dosed at the point when the
     simulation terminates, while for dpd the undetected contamination incidents are ignored.

   • pk and dpk - The number of individuals killed by  a contaminant. For pk, the population killed by
     an undetected incident is the population killed at the point when the simulation terminates, while for
     dpk the undetected contamination incidents are ignored.

   • td and dtd -  The time, in minutes,  from the beginning of a  contamination  incident until the first
     sensor detects  it. For td, the time-to-detection of an undetected incident is the time from the start of
     the incident until the end of the simulation, while undetected contamination incidents are ignored for
     dtd. NOTE: this measure is not affected by the response time  option.

   • vc and dvc - The volume of contaminated water consumed by  junctions in the network with nonzero
     demand. For vc, the volume of contaminated water of an undetected incident is the volume of contam-
     inated water consumed at the point when the simulation terminates, while undetected contamination
     incidents are ignored for dvc.


These health impact measures are computed with an auxiliary input file, TAI, that specifies parameters for
a health impact model that predicts how a population is affected by  exposure to a contaminant.  The TAI
File (p. 32) bio.tai specifies  the nature of the contaminant and how it impacts human health. Further, this
file specifies the fraction of  the volume of water consumed at junctions that  is consumed by humans.  For
example, consider the command line:

   tso2Impact --pe Het3 NetS.tso bio.tai


4.4   Advanced Tools for Large Sensor Placements Problems

In some applications, the size of the IMPACT files is very  large, which can lead to optimization models that
cannot be solved on standard 32-bit workstations. SPOT includes several utilities that are not commonly used
to address this challenge: the scenarioAggr (p. ??) executable aggregates similar contamination  incidents,
and the filter_impacts (p. ??) script filters out contamination incidents that have low impacts.

The  scenarioAggr (p. ??)  executable reads an IMPACT file, finds  similar incidents, combines them, and
writes out another IMPACT file.  This aggregation technique combines two incidents that impact the same
locations in the same order, allowing for the possibility that one incident continues to impact other  locations.
For example, two contamination incidents should travel in the same pattern if they differ only in the nature of
the contaminant, though one may decay more quickly than the other.  Aggregated incidents can be  combined
by simply averaging the impacts that they observe and updating the  corresponding incident weight.

For example, consider the command:


   scenarioAggr --numEvents=236 Net3mmc.impact


This creates the files aggrNet3_mc.impact and aggrNet3_mc.impact.prob; where the  Net3_mc.impact
file has  236 events.  The file aggrNet3_mc. impact is the new IMPACT file, and the file aggrNet3_-
mc. impact. prob contains the  probabilities of the aggregated incidents.

The  filter_impacts (p. ??) script  reads an impact file,  filters out  the low-impact incidents, rescales the
impact values, and writes out  another IMPACT file. The  command:

   filter^impacts --percent=5 Net3mmc.impact filtered.impact


                                               14

-------
generates an IMPACT file that contains the incidents whose impacts (without sensors)  are the largest 5%
of the incidents in Net3_mc. impact.  Similarly, the -num=k option selects the k incidents with the largest
impacts, and the option -threshold=h selects the incidents with the impacts greater than or equal to h.

The f ilter_impacts command also includes options to rescale the impact values.  The -rescale option
rescales impact values with a log-scale and the -round option rescales impact values to rounded log-scale
values.
                                                15

-------
      Sensor Placement Solvers
The SPOT sensor placement solvers are launched with the sp (p. ??) command. The sp command reads in
one or more IMPACT files, and computes a sensor placement. Command-line options for sp can specify any
of a set of performance or cost goals as the objective to be optimized, as well as constraints on performance
and cost goals.

The sp command currently interfaces with three different sensor placement optimizers:

   • MIP solvers - Several different MIP solvers can be used by the sp command: the commercial CPLEX
     solver and the open-source PICO solver. These  optimizers use the MIP formulations to find globally
     optimal solutions. However,  this may be a computationally  expensive process  (especially for  large
     problems), and the size of the MIP formulation can become prohibitively large in some  cases.
     Two different MIP solvers can be  used: the public-domain PICO solver and  the commercial PICO
     solver. PICO is included in distributions of SPOT.

   • GRASP Heuristic - The GRASP heuristic performs sensor placement optimization without explicitly
     creating a MIP formulation. Thus, this solver uses much less memory, and it usually runs very quickly.
     Although the GRASP heuristic does not guarantee that a globally optimal solution is found, it has
     proven effective at finding optimal solutions to a variety of large-scale applications.
     Two different implementations of the GRASP solvers can be used: an ATT commercial solver (att_-
     grasp) and an open-source implementation of this solver (snl_grasp).

   • GRASP Heuristic - The GRASP heuristic performs sensor

   • Lagrangian Heuristic - The Lagrangian heuristic uses the structure of the p-median MIP formulation
     (eSP) to find near-optimal solutions while computing a lower bound on the best possible solution.

The following sections provide examples that illustrate the use of the sp command. A complete description
of sp is available in the Appendix. Note that this appendix includes a summary of the limitations of different
solvers.

The sp command has  many  different options.  The following examples show how  different sensor place-
ment  optimization problems can  be solved  with sp.   Note that these examples  can be run in the
C:\spot\examples\simple directory.  The  user needs to  generate IMPACT  files for these examples with
the following commands:

   tevasim --tsg NetS.tsg --tso NetS.tso NetS.inp NetS.out
   tso2Impact --me --vc --td --nfd --ec Net3  HetS.tso


5.1   A  Simple Example

The following simple example illustrates the way that SPOT has been most commonly used. In this example,
SPOT minimizes the extent of contamination (ec) while limiting the number of sensors (ns) to no more than
5. This problem formulation (eSP) can be efficiently solved with all  solvers for modest-size distribution
networks,  and heuristics can effectively perform sensor placement on very large networks.

We begin by using the PICO solver to solve this problem, with the following command line:

sp --netฅork=Het3 --objective=ec --ub=ns,5 --solver=pico


                                                16

-------
This specifies that network Net3 is analyzed.  The objective is to minimize ec, the extent of contamination,
and an upper-bound of 5 is placed on ns, the number of sensors. The solver selected is pico, the PICO MIP
solver.

This execution of the sp command uses the Net3_ec. impact file and creates the following files: Net3. log,
a logfile for the optimization solver, and Net3.sensors, a file with the sensor placement locations.  Also, sp
generates the following output:


read_impact_files:  C:\spot\examples\simple\Net3mec.impact

Number of Nodes               :   97
Number of Contamination Impacts:  9458

Running PICO...
PICO  --debug=l --lpType=clp   --table!nitFrac=0.05 --RRTrialsPerCall=8
--RRDepthThreshold=-l --usingCuts=true  NetS.mod
C:\spot\examples\simple\Net3.dat
... PICO done

Sensor placement id:       22971
Number of sensors:         5
Total cost:               0
Sensor node IDs:           19 28 54 63 75
Sensor junctions:           119 141 193 207  239

Impact File:              Net3mec.impact
Number of events:          236
Min impact:               0.0000
Mean  impact:              8478.9674
Lower quartile impact:     0.0000
Median impact:            6949.0000
Upper quartile impact:      12530.0000
Yalue at Risk (VaR) (  57.): 25960.0000
TCE               (  5'/J: 33323.2833
Max impact:               42994.8000


Greedy ordering of sensors

54     29124.8004
19     18687.7292
63     11471.6750
75     9951.8699
28     8478.9674
Done  with sp


The  initial information up to the statment "... PICO done" is simply output about what solver is run and
information from the solver output. The next information beginning with "Sensor placement id:" is generated
by evalsensor (p. ??). This  is a summary that describes the sensor placement and the performance of this
sensor placement with respect to the impact measure that was minimized. This includes the following data:


   • Sensor placement id - an integer ID used to distinguish this sensor placement

   • Number of sensors - the number of sensors in the sensor placement

   • Total cost: - the cost  of the sensor placement, which may be nonzero if cost data is provided

   • Sensor node IDs - the internal  node indexes used by sp

   • Sensor junctions - the EPANET junction labels for the sensor locations


The  performance of the sensor  placement  is summarized for  each IMPACT file  used  with sp.  The impact
statistics included are:


                                                  17

-------
   • min - The minimum impact over all contamination events. If we make the assumption that a sensor
     protects the  node at which it is placed, then this measure will generally be zero.

   • mean - The mean (or average) impact over all contamination events.

   • lower quartile - 25% of contamination events, weighted by their likelihood, have an impact value less
     than this quartile.

   • median - 50% of contamination events, weighted by their likelihood, have an impact value less than
     this quartile.

   • upper quartile - 75%  of contamination  events, weighted by their likelihood, have an impact value
     less than this quartile.

   • VaR - The  value at risk (VaR)  uses a user-defined  percentile.  Given 0.0 < /? <  1.0, VaR is the
     minimum value for which 100 * (1 — /?)% of contamination events have a smaller impact.

   • TCE - The  tailed-conditioned expectation (TCE) is the mean value of the impacts that are greater
     than or equal to VaR.

   • worst - The value of the worst impact.


Finally, a greedy sensor placement is described by evalsensor, which takes the five sensor placements and
places them one-at-a-time, minimizing the mean impact as each sensor is placed. This gives a sense of the
relative priorities for these sensors.

The  evalsensor  command can evaluate a sensor placement for a wide variety of different objectives. For
example, the command


   evalsensor --nodemap=Net3.nodemap Het3.sensors Net3mec.impact \
              Net3mmc.impact Net3mnfd.impact


will summarize the solution in the Net3.sensors file for the ec, me and nfd impact measures.

The  following example shows  how to solve this same problem  with the GRASP heuristic. This solver finds
the same  (optimal) solution as the MIP solver, though much more quickly.


sp --netฅork=Net3 --objective=ec --ub=ns,5 --solver=snl_grasp

read_impact_files:   C:\spot\examples\simple\Het3_ec.impact
Note: witness aggregation disabled for grasp

Number of Nodes               :  97
Number of Contamination Impacts: 9458

Number of sensors=5
Objective=ec
Stat ist ic=mean
Impact file=C:\spot\examples\simple\Net3mec. impact
Delay=0
Running  iterated descent heuristic for *perfect* sensor model
Iterated descent completed

Sensor placement id:        23009
Number of sensors:          5
Total cost:                0
Sensor node IDs:           19  28 54 63 75
Sensor junctions:           119 141 193 207 239

Impact File:                Net3mec.impact


                                                  18

-------
Number of events:            236
Min impact:                 0.0000
Mean impact:                 8478.9674
Lower quartile impact:       0.0000
Median impact:              6949.0000
Upper quartile impact:       12530.0000
Yalue at Risk (VaR)  (  57.):  25960.0000
TCE                 (  5'/J:  33323.2833
Max impact:                 42994.8000
Greedy ordering of sensors
54      29124.8004
19      18687.7292
63      11471.6750
75      9951.8699
28      8478.9674
Done with sp
Finally, the following example shows how to solve this problem with the Lagrangian heuristic. This solver
does not find as good a solution as the GRASP heuristic.
sp --network=Net3 --objective=ec  --ub=ns,5  --solver=lagrangian
read_impact_files:   C:\spot\examples\simple\Net3mec.impact

Number of Nodes                :   97
Number of Contamination Impacts:  9458

aggregateImpacts NetS.config 10000
Setting up Lagrangian data files...
Running UFL solver  ...
ufl Net3_ec_agg.lag  6 0

Sensor placement id:         27730
Number of sensors:           5
Total cost:                 0
Sensor node IDs:            15 17 19 21  66
Sensor junctions:           111 115  119  121  211

Impact File:                 C:\spot\examples\simple\Net3_ec.impact
Number of events:           236
Min impact:                 0.0000
Mean impact:                 12306.8229
Lower quartile impact:       0.0000
Median impact:              10410.0000
Upper quartile impact:       18100.0000
Yalue at Risk (VaR) (   5'/J : 41526.9000
TCE                 (   5'/J: 45057.9000
Max impact:                 49880.8000
Greedy ordering of sensors

19      32174.3110
66      17854.3458
15      13583.0559
21      12929.5602
17      12306.8229
Done with sp
                                                      19

-------
5.2   Computing a Bound on the  Best Sensor Placement Value

The following example shows how a lower bound can be computed on the best possible sensor placement.
That is, any sensor placement would have an  expected impact greater than this value.  A bound is computed
with the following syntax:


sp --network=Net3 --objective=ec --ub=ns,5  --solver=pico --compute-bound

read_impact_files:   C:\spot\examples\simple\Net3mec.impact

Number  of Nodes               :  97
Number  of Contamination Impacts: 9458

Running PICO...
PICO --debug=l --lpType=clp   --onlyRootLP=true NetS.mod
C:\spot\examples\simple\Net3.dat
...  PICO done
Computing a lower bound
Objective lower bound:  8478.96737288
Done with sp


5.3   Minimizing the  Number of Sensors


We can "invert" the sensor placement problem by minimizing the number of sensors subject to a constraint
on the extent of contamination.  Note that the following example finds a solution with a single sensor that
meets our goal of 40000 mean  extent of contamination.


sp --network=Net3 --objective=ns --ub=ec,40000 --solver=pico

read_impact_files:   C:\spot\examples\simple\Net3_ec.impact

Number  of Nodes               :  97
Number  of Contamination Impacts: 9458

WARNING: Location aggregation does not work with side constraints
WARNING: Turning off location aggregation
Running PICO...
PICO --debug=l --lpType=clp   --RRTrialsPerCall=8 --RRDepthThreshold=-l --usingCuts=true --absTolerance=.99999  NetS.mod C:\spot
...  PICO done
Sensor placement id:        27738
Number of sensors:          1
Total cost:                0
Sensor node IDs:           37
Sensor junctions:          161

Impact File:               C:\spot\examples\simple\Net3_ec.impact
Number of events:          236
Min impact:                0.0000
Mean impact:               26901.9572
Lower quartile impact:      3940.0000
Median impact:             22450.0000
Upper quartile impact:      38855.0000
Yalue at Risk (VaR)  (   5'/J : 71377.8000
TCE                (   5'/J: 81046.0667
Max impact:                103746.0000
Greedy ordering of sensors

37      26901.9572
Done with sp
                                                   20

-------
5.4   Fixing Sensor Placement Locations


Properties of the sensor locations can be specified with the -sensor-locations option. This options specifies
a Placement Locations  File (p. 36) that can control whether sensor locations are feasible or infeasible,
and fixed or unfixed.  For example, suppose the file locations contains


infeasible 193  119  141 207 239
fixed  161


The following example shows how these restrictions impact the solution.  Compared to the first  example
above, we have a less-optimal solution, since we cannot use the sensor locations above and we are  required
to include junction 161.


sp --netฅork=Net3 --objective=ec  --ub=ns,5 --solver=pico
       --sensor-locations=locations

read_impact_files:   C:\spot\examples\simple\Net3mec.impact

Number of Nodes               :  97
Number of Contamination Impacts:  9458

Running PICO...
PICO --debug=l  --lpType=clp  --table!nitFrac=0.05 --RRTrialsPerCall=8
--RRDepthThreshold=-l --usingCuts=true  NetS.mod
C:\spot\examples\simple\Net3.dat
...  PICO done

Sensor placement id:        22996
Number of sensors:          5
Total  cost:                0
Sensor node IDs:            17  33 37 50 66
Sensor junctions:           115 151 161 185  211

Impact File:                Net3_ec.impact
Number of events:           236
Min impact:                0.0000
Mean impact:                9338.7119
Lower  quartile  impact:      0.0000
Median impact:              7640.0000
Upper  quartile  impact:      14120.0000
Yalue  at Risk (VaR)  (  5'/J :  27335.0000
TCE                (  57.):  32282.3000
Max impact:                45300.0000


Greedy ordering of  sensors

37     26901.9572
66     18192.6581
33     13958.3119
17     11281.2907
50     9338.7119
Done with sp



5.5   Robust  Optimization of Sensor Locations


The following  example demonstrates the optimization of sensor placements using the TCE measure. TCE is
the mean value of the worst incidents in the ensemble being evaluated. Given a confidence level 1 — /?, TCE
is the expectation of the worst  impacts with probability  /?. Compared with our first example,  we see that
this finds a better solution in  terms of TCE,  although the mean performance is slightly worse.


                                                  21

-------
sp --netฅork=Net3 --objective=ec_tce --ub=ns,5 --solver=snl_grasp

read_impact_files:  C:\spot\examples\simple\Net3mec.impact
Note:  witness aggregation disabled for grasp

Number of  Nodes                :  97
Number of  Contamination Impacts: 9458

Number of  sensors=5
Objective=ec
Statistic=tce
Impact file=C:\spot\examples\simple\Net3_ec.impact
Delay=0
Running iterated descent heuristic for *perfect*  sensor model
Iterated descent completed

Sensor placement id:        23005
Number of  sensors:          5
Total  cost:                 0
Sensor node IDs:            17  19 24 65 88
Sensor junctions:           115 119 127 209 267

Impact File:                Net3_ec.impact
Number of  events:           236
Min impact:                 0.0000
Mean impact:                10266.1110
Lower  quartile impact:       0.0000
Median impact:              10400.0000
Upper  quartile impact:       16930.0000
Yalue  at Risk (VaR) (  5'/J :  24199.0000
TCE                (  5'/J:  26376.2167
Max impact:                 28564.8000


Greedy ordering of sensors

88      30803.8140
19      19369.7636
65      12568.0822
17      11130.4161
24      10266.1110
Done with  sp


Note  that the greedy ordering of sensors is less useful in this case. Although we optimized to minimize TCE,
the greedy ordering uses the mean value to select each sensor location.


5.6   Multi-Criteria  Analysis


We now illustrate  how sp  supports  multi-objective analysis through an iterative  process. SPOT  does not
have  a general "pareto search" optimizer.  Instead, users can specify constraints  with sp that ensure that
previously optimized  objectives are "close" to their previous  values.  In this way,  the user can explicitly
search for trade-offs between one-or-more performance objectives.

The examples  above  consider the extent-of-contamination objective.  We can assess how well the sensor
placements  generated above  minimize  other objectives  like the expected mass of contaminant consumed
using evalsensor.  Consider  the solution generated by the previous example (which minimized  ec_tce),
which we have copied into  the file Net3_ec.sensors.


evalsensor --nodemap=Net3.nodemap Net3mec.sensors Net3mmc.impact

Sensor placement id:        23112
Number of  sensors:          5
Total  cost:                 0


                                                   22

-------
Sensor node  IDs:
Sensor junctions:
     17 19  24 65 88
     115 119 127 209 267
Impact File:
Number of events:
Min impact :
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) ( 57.) :
TCE ( 5'/J :
Max impact :

Greedy ordering of sensors
65 71599.4274
88 71256.6780
24 71042.6323
17 70952.7213
19 70895.2854
Net3mmc . impact
236
0.0000
70895.2854
503.9170
83150.7000
144002.0000
144271.0000
144546.8333
144693.0000







The mean mass consumed is  70895, which is far from the optimal value of 21782  (which we computed
separately).  We revisit the robust optimization example in the previous section; we keep "extent of contam-
ination - tee" as our primary objective, but we now impose a "side constraint" that precludes  any solution
that admits  an average mass consumed of worse  than 30,000 units.  We do this as follows:
sp --network=Net3 --objective=ec_tce --ub=mc_mean,30000 --ub=ns,5 --solver=snl_grasp

read_impact_files:  C:\spot\examples\simple\Net3mec.impact
read_impact_files:  C:\spot\examples\simple\Het3_mc.impact
Note:  witness  aggregation disabled for grasp

Number of Nodes                :   97
Number of Contamination Impacts:  9458

WARNING:  Location aggregation does not work with side constraints
WARNING:  Turning off location aggregation
Number of sensors=5
Objective=ec
Statistic=tce
Impact file=C:\spot\examples\simple\Net3_ec. impact
Delay=0
Running iterated descent heuristic for *perfect* sensor model
Iterated descent completed
Sensor placement  id:
Number of sensors:
Total cost:
Sensor node  IDs:
Sensor junctions:

Impact File:
Number of events:
Min impact:
Mean impact:
Lower quartile  impact:
Median impact:
Upper quartile  impact:
Yalue at Risk  (VaR)  (
TCE                 (
Max impact:

Impact File:
Number of events:
5'/J
5'/J
23143
5
0
4 15 29 68  81
35 111 143  215 253

Net3_ec.impact
236
0.0000
14315.5322
1379.0000
10810.0000
21809.8000
37915.8000
48340.3667
71329.0000

Net3_mc.impact
236
                                                    23

-------
Min impact :
Mean impact:
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) ( 5'/J :
TCE ( 5'/J :
Max impact :

0.0000
29501.3226
139.0200
1766.1000
16476.3000
144271.0000
144276.3333
144335.0000

Greedy ordering of sensors
4      33437.5797
15     22324.5746
29     17100.4814
68     14912.0110
81     14315.5322
Greedy ordering of sensors
81     55207.4343
29     37684.3130
4      32849.0896
68     29738.0747
15     29501.3226
Done with sp


Note that the primary objective, minimizing the TCE of the "extent of contamination" measure, has gotten
worse: it is now 48340 rather than 26376. However, our side constraint has been honored, and the mean
mass consumed value is now 29501 rather than 70895.

5.7   Sensor Placements without Penalties

A fundamental issue for sensor placement is how to handle the fact that a limited budget of sensors will not be
able to cover all possible incidents. SPOT addresses this issue by providing impact measures that integrate
an impact  'penalty' for incidents that are not detected by a CWS  design. Thus, in the previous  examples
there was an implicit  trade-off between impact reduction and reduction in the number of contamination
incidents that are detected.

SPOT also includes impact measures that do not  contain these penalties, which allows  a user to more
directly assess the performance of a CWS design in the context where detections have occured. For example,
the time-to-detection measure (td) includes a penalty for undetected incidents, but the detected-time-to-
detection measure (dtd) has no penalty (or, more precisely, a zero penalty).

For example, consider the  simple example above, which minimizes the extent of contamination.  We apply
evalsensors to the final solution  to evaluate the ec, dec and nfd impact measures:

evalsensor --nodemap=Het3.nodemap Net3morig.sensors Net3mec.impact
Het3_dec.impact  Het3_nfd.impact
Sensor placement  id:
Number of sensors:
Total cost:
Sensor node IDs:
Sensor junctions:

Impact File:
Number of events:
Min impact:
Mean impact:
Lower quartile impact:
3789
5
0
19 28 54 63 75
119 141 193 207 239

Net3mec.impact
236
0.0000
8478.9674
0.0000
                                                 24

-------
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
Impact File:
Number of events:
Min impact :
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
Impact File:
Number of events:
Min impact :
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
6949.0000
12530.0000
57.): 25960.0000
57.): 33323.2833
42994.8000
Net3mdec . impact
236
0.0000
8184.5182
0.0000
6949.0000
12530.0000
57.): 25960.0000
57.): 33323.2833
42994.8000
Net3mnf d. impact
236
0.0000
0.2500
0.0000
0.0000
0.0000
57.): 1.0000
57.): 1.0000
1.0000
Greedy  ordering of sensors: Net3_ec.impact
54     29124.8004
19     18687.7292
63     11471.6750
75     9951.8699
28     8478.9674
Greedy  ordering of sensors: Net3_dec.impact
19     4845.7771
28     3613.4678
54     10489.1614
63     7097.6114
75     8184.5182
Greedy  ordering of sensors: Net3mnfd.impact
75     0.2712
28     0.2500
19     0.2500
54     0.2500
63     0.2500
In this example, the final sensor placement fails to detect 25% of the incidents. It is noteworthy that this
does not impact the mean performance very much, since the impact penalty has led to a final solution that
fails to detect few incidents with high penalties.

Note that minimizing dtd does not really make  sense. With zero sensors, you detect no incidents, which
means that  the final impact  measurement is zero! Thus, minimizing dtd requires the additional  constraint
on the number of failed detections (nfd) as well as a limit on the number of sensors (or total sensor costs).

Only the 'pico' SPOT solver  currently supports optimization with 'detected' impact measures. For example:

sp --network=Net3 --objective=dec --ub=ns,5 --ub=nfd,0.25 --solver=pico
                                                  25

-------
Number of Nodes                :  97
Number of Contamination Impacts: 9458

WARNING: Location aggregation does not work with side constraints
WARNING: Turning off location aggregation
Running PICO...
PICO --debug=l --lpType=clp   --RRTrialsPerCall=8
--RRDepthThreshold=-l --feasibilityPump=false  --usingCuts=true
Net3.mod C:\spot\examples\simple\Net3.dat
...  PICO done
Sensor placement id:
Number of  sensors:
Total cost:
Sensor node  IDs:
Sensor junctions:

Impact File:
Number of  events:
Min impact:
Mean impact:
Lower quartile impact:
Median impact:
Upper quartile impact:
Yalue at Risk (VaR) (
TCE                (
Max impact:
5'/J :
5'/J :
23466
5
0
19 28 54  63 75
119 141 193 207 239

Net3mdec.impact
236
0.0000
8184.5182
0.0000
6949.0000
12530.0000
25960.0000
33323.2833
42994.8000
Impact File:
Number of events:
Min impact :
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
Net3mnf d. impact
236
0.0000
0.2500
0.0000
0.0000
0.0000
5'/J: 1.0000
5'/J: 1.0000
1.0000
Greedy ordering of sensors:  Net3_dec.impact
19      4845.7771
28      3613.4678
54      10489.1614
63      7097.6114
75      8184.5182
Greedy ordering of sensors:  Net3mnfd.impact
75      0.2712
28      0.2500
19      0.2500
54      0.2500
63      0.2500
Done with  sp
5.8   Limited-Memory Sensor Placement Techniques

Controlling memory usage is a critical issue for solving large sensor placement  formulations.  This is a
particular challenge for MIP  methods, but both the GRASP and Lagrangian heuristics can have difficultly
solving very large problems on standard workstations. A variety of mechanisms have been integrated into sp
to reduce the problem representation size while preserving the structure of the sensor placement problem.

The scenarioAggr  (p. ??) method described in the previous section is  one possible strategy.  This tool
                                                   26

-------
compresses the impact file while preserving the fundamental structure of the impact file and it is appropriate
when optimizing for mean performance objectives.  Similarly, the filter_impacts (p. ??) script can limit
the sensor placement to only consider contamination incidents that are "sufficiently bad" in the worst-case.
Another strategy is to limit the number of sensor placements, using the -sensor-locations option described
above, since eliminating feasible locations reduces the problem representation used by the sp solvers.

Two other strategies are also supported  by sp. First, the GRASP heuristic has several options for controlling
how memory is managed. The -grasp-representation option can be used to control how the local search
steps are performed. By default, a dense matrix is precomputed to perform local search steps quickly, but
a sparse matrix can be used to perform local search with less memory.  Also, the GRASP heuristic can be
configured to use the local disk to store this matrix. It should be noted that the Lagrangian heuristic requires
less memory than the GRASP heuristic, and thus similar techniques have not been developed  for it.

Second, the witness aggregation technique can be used to limit the size of the sensor placement formulation.
This term refers to the variables in the  MIP formulation that "witness" a contamination event. By default,
variables that witness contamination events with the same impact are aggregated, and this typically reduces
the MIP constraint matrix by a significant amount. Further reductions can be performed with more aggressive
aggregations.

To illustrate the use of witness aggregation, we generated impact files with the C: \spot\etc\tsg\hourly. tsg
TSG  file.   The following table illustrates the use of the two witness aggregation options when optimiz-
ing the mean extent of contamination:  -aggregation-percent and -aggregation-ratio  (used with the
-distinguish-detection option, which helps this aggregation option).  The second line of data in this table
is the default aggregation, which has about half as many non-zero values in the MIP constraint matrix. Both
the percent and ratio aggregation strategies effectively  reduce the problem size while  finding  near-optimal
solutions.
Aggr Type
None
Percent
Percent
Ratio
Aggr Value
NA
0.0
0.125
0.125
Binary Vars
97
97
97
97
MIP Nonzeros
220736
119607
49576
12437
Solution Value
8525
8525
9513
10991
5.9    Evaluating a Sensor Placement

The evalsensor (p. ??) executable takes sensor placements in a Sensor Placement  File (p. 33) and eval-
uates them using data from an Impact File (p. 33) (or a list of impact files). This executable measures the
performance of each sensor placement with respect to the set of possible contamination locations. This anal-
ysis assumes that probabilities have been assigned to these contamination locations, and if no probabilities
are given then uniform probabilies are used by evalsensor.

The following example illustrates the use of evalsensor after running the first sensor placement optimization
example.
evalsensor --nodemap=Het3.nodemap Net3morig.sensors Net3mec.impact
Net3mmc.impact
Sensor placement id:
Number of sensors:
Total cost:
Sensor node IDs:
Sensor junctions:

Impact File:
Number of events:
Min impact:
5511
5
0
19 28 54 63 75
119 141 193 207 239

Net3_ec.impact
236
0.0000
                                                 27

-------
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
Impact File:
Number of events:
Min impact :
Mean impact:
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :

8478.9674
0.0000
6949.0000
12530.0000
57. ): 25960.0000
5'/J: 33323.2833
42994.8000
Het3_mc . impact
236
0.0000
43636.7076
220 . 0020
1909.9500
105031.0000
5'/J: 144271.0000
57.): 144345.0000
144477.0000

Greedy  ordering of sensors
54     29124.8004
19     18687.7292
63     11471.6750
75     9951.8699
28     8478.9674
Greedy  ordering of sensors
75     59403.2616
28     44478.4678
63     43854.6979
54     43659.7307
19     43636.7076
The evalsensors command can also evaluate a sensor placement in the case where sensors can fail, and
there is some small number of different classes of sensors (grouped by false negative probability).  Consider
the Net3. imperf ectsc file, which defines different categories of sensor failures:
1 0.25
2 0.50
3 0.75
4 1.0
Sensors of class "1" give false negative readings 25% of the time, sensors of class "2" give them 50% of the
time, etc.

Once failure classes have been defined, the junctions of the  network are assigned  to classes.  This is done
with another file (a "junction class" or jc file), like Net3. imperf ectjc.
1 i
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1
10 1
11 1
12 1
                                                  28

-------
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
Given the junction classes, we can run evalsensor to determine the expected impact of a sensor placement,
given that sensors may fail. Again, using the solution from the original example:


evalsensor  --nodemap=Net3.nodemap --sc-probabilities=Net3.imperfectsc
--scs=Net3.imperfectjc  Net3morig.sensors Net3mec.impact


Sensor placement id:        5511
Number of sensors:          5
Total cost:                0
Sensor node IDs:           19 28 54 63 75
Sensor junctions:          119 141 193 207 239

Impact File:               Het3_ec.impact
Number of events:          236
Min impact:                0.0000
Mean impact:               17161.8656
Lower quartile impact:      3940.0000
Median impact:             15307.2500
Upper quartile impact:      26537.2156
Yalue at Risk (VaR)  (   5'/J : 47637.7773
TCE                (   57.): 54977.0644
Max impact:                75509.9043


Greedy ordering of  sensors

54     36553.6321
63     25655.0376
28     21209.3084
75     18810.1629
19     17161.8656


Note that the mean impact of this "extent of contamination" changes dramatically if sensors are allowed to
fail. The original  solution, 8478 pipe feet, was misleading if sensors  fail according to the probabilities we
have assigned.  With sensor failures, the expected impact is  17161 pipe feet - more than twice the "perfect
sensor" impact.


5.10   Sensor  Placement with Imperfect  Sensors


The GRASP heuristics in SPOT can optimize sensor placements that take  into account sensor failures. For
example, we can perform sensor placement optimization with imperfect sensors using the Net3. imperf ectsc
and Net3.imperfectjc files defined in the previous section.


sp --network=Net3 --objective=ec --ub=ns,5 --imperfect-scfile=Net3.imperfectsc
--imperfect-jcfile=Net3.imperfectjc --solver=snl_grasp

read_impact_files:   C:\spot\examples\simple\Net3_ec.impact
Note:  witness aggregation disabled for grasp
                                                   29

-------
Number of Nodes               :   97
Number of Contamination  Impacts:  9458

Number of sensors=5
Objective=ec
Stat ist ic=mean
Impact file=C:\spot\examples\simple\Net3mec. impact
Delay=0
Running iterated descent heuristic for  *imperfect* sensor model
Iterated descent completed
Sensor placement id:
Number of sensors:
Total cost:
Sensor node IDs:
Sensor junctions:
9285
5
0
33 63 75 83 87
151 207 239 257 265
Impact File:
Number of events:
Min impact :
Mean impact :
Lower quartile impact:
Median impact :
Upper quartile impact:
Yalue at Risk (VaR) (
TCE (
Max impact :
Net3mec . impact
236
0.0000
13414.2479
3610.0000
11690.0000
20096.0000
57.): 36467.5500
5'/J: 44420.1990
63324.0500
Greedy ordering of sensors
87
63
83
75
33
27984.5508
22080.0939
16853.0790
14955.9915
13414.2479
Done with sp
After this optimization, the mean impact is 13414 pipe feet rather than the 17161 pipe feet value for the
solution optimized with perfect sensors. Thus, it is clear that the GRASP heuristic makes different choices
if the sensors are imperfect.
5.11   Summary of Solver Features
The following table highlights the capabilities of the  SPOT optimizers. The previous examples illustrate
SPOT's capabilities, but the advanced features in SPOT are not available for all optimizers.
                                                   30

-------
Solver Feature
Minimize mean impact
Minimize worst impact
Minimized number of
sensors
Robust objectives
Side-constraints
Fixed/Invalid locations
Witness aggregation
Incident probabilities
Incident aggregation
Sparse data
management
Imperfect sensor model
Computes lower bound
MIP
YES
YES
YES
YES
YES
YES
YES
YES
YES
NO
NO
YES
GRASP
YES
YES
NO
YES
YES
YES
NO
NO
NO
YES
YES
NO
Lagrangian
YES
NO
NO
NO
YES
YES
YES
YES
YES
NO
NO
YES
31

-------
6    File Formats


6.1   TSO File

   • Desription: provides a compact representation of simulation results.

   • Format: binary

   • Created By:  tevasim

   • Used By: tso2Impact

   • Details: The format of TSO files is described in C:\spot\doc\TEVAUtil.doc.


6.2   SDX File

   • Desription: provides an index into a TSO file.

   • Format: binary

   • Created By:  tevasim

   • Used By: tso2Impact

   • Details: SDX files provide an index file that contains information about at what byte offset in which
     TSO file a particular injection scenario's results are located.  The format of SDX files is described in
     C:\spot\doc\Threat\ Simulator.doc.


6.3   TSG File

   • Desription: specifies how an ensemble of EPANET simulations will be performed.

   • Format: ascii

   • Created By:  SPOT user

   • Used By: tevasim

   • Details: Each line of a TSG file specifies injection locations,  the  injection mass, and the injection
     time-frame:

     HZD MASS   

     This format is described in detail in C:\spot\doc\Threat\ Simulator.doc.


6.4   TAI File

   • Desription: describes the information needed for assessing health impacts

   • Format: ascii

   • Created By:  SPOT user


                                               32

-------
   • Used By: tevasim

   • Details: A TAI provides information needed for assessing health impacts. This file is only required
     for impact values  like pe  that involve health impacts.  The format of TAI files will be  described
     in C:\spot\doc\threatAssess_readme.txt when health impacts are  integrated into the TEVA-SPOT
     Toolkit release.


6.5    Sensor  Placement File

   • Desription: describes one or more sensor placements

   • Format: ascii

   • Created By: sp

   • Used By: evalsensor

   • Details:
     Lines in  a sensor  placement file that begin with the '$' character are  assumed to be comments.
     Otherwise, lines of this file have the format

           ...

     The sensor placement ID is used to identify sensor placements in the file. Sensor placements may have
     differing numbers of sensors, so each line contains this information. The node indices map to values in
     the Node File (p. 34).


6.6    Impact File

   • Desription: describes the impact of a contamination event at the point that it is witnessed through
     a water distribution network.

   • Format: ascii

   • Created By: tso2Impact

   • Used By: sp and evalsensor

   • Details:  An  IMPACT file  describes the impact of a contamination event at the  point  that it is
     witnessed throughout a water distribution network.  Specifically, the witness events are assumed to
     occur at junctions  in the network.
     The first line of an IMPACT file contains the number of events.  The next line specifies the types of
     delayed impacts provided in  this file,  with the format:

           ... 

     The delay times are in minutes. (Currently, the SPOT utilities only support a single delay time.)
     Subsequent lines have the format

           

     The node index is  the index of a witness  location for the attack.  A  scenario ID maps to a line in the
     network Scenario File (p. 34). A node index maps  to a line in the network Node File (p. 34). The
     time of detection is in minutes. The  value of impacts are in the corresponding units for each impact
     measure. The  different impact measures in each line  correspond to the different delays that have been
     computed.


                                                33

-------
6.7   LAG File

   • Desription: A sparse matrix file used by the UFL Lagrangian solver

   • Format: ascii

   • Created By: setupIPData

   • Used By:  lagrangian

   • Details: This is a variant of the IMPACT format. Conceptually, it can be viewed as a transpose of
     the matrix  specified in an IMPACT file. The first line specifies the number of locations, the number
     of events, and the impact values:

          

     These impact values differ from  the values in the IMPACT file, in that they are normalized by the
     probability of the event. Subsequent lines describe the impact  of each event:

          

     Note that the location and event indices are indexed starting at 1.


6.8   Scenario File

   • Desription: The scenario file provides auxiliary information about each contamination incident.

   • Format: ascii

   • Created By: tso2Impact

   • Used By:  evalsensor

   • Details: The scenario file provides auxiliary information about each contamination scenario. Each
     line of this  file has the format:

              

     The node index maps to the network Node File (p. 34), and the EPANET ID provides this information
     (redundantly). The scenario start and stop are in minutes, and these values are relative to the start
     of the EPANET simulation.  The source type is the injection mode for an attack, e.g., flow-paced or
     fixed-concentration. The source strength is the concentration of contaminant at the attack source.


6.9   Node File

   • Desription: provides a mapping from the indices used for sensor placement to the junction IDs used
     within EPANET

   • Format: ascii

   • Created By: tso2Impact

   • Used By:  evalsensor and  sensor placement solvers

   • Details: The node file provides a mapping from the indices used for sensor placement to the IDs used
     within EPANET. Each line of this file has the format:

                                                34

-------
         


     This mapping is generated by tso2Impact (p.??), and all sensor placement solvers subsequently use
     the node indices internally.


6.10   Sensor Placement Configuration File


   • Desription: a configuration file used to define a sensor placement problem

   • Format:  ascii

   • Created  By: sp

   • Used By: setupIPData

   • Details: The sensor placement configuration file is generated by the sp (p. ??) solver interface, and it
     contains all of the information that is needed to define a sensor placement problem.  The file has the
     following format:

      
     
          \
             \
      ...  \
      ... 
     
     
     


     The values in this file correspond to the command-line arguments of the sp (p. ??) solver.  Compression
     threshold  or percentage refers to node aggregation  values.  The attack-collapse-flag is a 0 or 1 value in
     the configuration file, indicating whether compression/aggregation can make an attack trivial (single
     supernode equivalent to no detection). The , 
     and   data sets are simply an import of the data from the corresponding  files that are
     specified within the sp (p. ??) solver interface.


6.11   Sensor Placement Costs File


   • Desription: specifies the costs for installing sensors at different junctions throughout a network

   • Format:  ascii

   • Created  By: SPOT user

   • Used By: sp

   • Details: Each line of this file has the format:

         


     Junctions not explicitly enumerated in this  file are assumed to have zero cost unless  the ID '_  -
     default^  ' is specified.  For example:

        .default  1.0


     This example would specify that all un-enumerated junctions have a cost of 1.0.


                                                 35

-------
6.12   Placement  Locations File

   • Desription: specifies whether sensor placements are fixed and whether locations are feasible sensor
     placement

   • Format: ascii

   • Created By: SPOT user

   • Used By:  sp

   • Details: Each line of this file has the format:

           ... 

     The valid keywords are feasible, infeasible, fixed and unfixed.  These keywords correspond to two
     semantic states for each location:  (1) the feasibility of sensor placement and (2) whether a sensor
     placement is forced. The semantics of these keywords are as follows:

          feasible - the specified locations are feasible and unfixed

          infeasible - the specified locations are infeasible and unfixed

          fixed  - the specified locations are constrained to contain a sensor (fixed and feasible)

          unfixed - the specified locations are not constrained to contain a sensor (unfixed and feasible)

     The locations are EPANET-IDs from the  network model. Additionally, the keyword ALL or * can be
     used to specify that all network locations  are to be processed.
     A location  file is processed sequentially. The initial state is that all locations are feasible and unfixed.
     Subsequently, each line updates the state of the locations, given the state defined by the previous lines
     of this file.  For example, the file:

     infeasible ALL
     feasible ABC

     makes all locations infeasible except for locations A,  B and C. Similarly

     fixed ALL
     feasible ABC

     makes all locations fixed except for locations A, B and C; the feasible keyword has the same semantics
     as the unfixed keyword.


6.13   Sensor Class File

   • Desription: contains false-negative probabilities for different types of sensors

   • Format: ascii

   • Created By: SPOT user

   • Used By:  sp

   • Details: The file has format:

          
-------
For example, the following file defines a failure class 1, with a false negative rate of 25 percent, and a failure
class 2 with a false negative rate of 50 percent:
   1  0.25
   2  0.5
6.14   Junctions Class File

   • Desription: provides the mapping from EPANET junction IDs to failure classes

   • Format: ascii

   • Created By: EPANET user

   • Used By: sp

   • Details: When a sensor class file is being used, the "junction  class" file provides the mapping from
     junction (node)  id's to failure classes. The format of this file is:

         
         



For example, supposing that junction 1 is of class 2, junction 2 is of class 1, and junction 3 is of class 1:
   1  i
   2  2
   3  1
                                                 37

-------
Bibliography


 fl] R. Bahadur, W. B.  Samuels,  W. Grayman, D. Amstutz, and J. Pickus.  Pipelinenet: A model for
    monitoring introduced contaminants in a distribution system. In Proc. World Water and Environmental
    Resources Congress. ASCE, 2003.

 [2] J. Berry, R.  Carr, W.  Hart, V.  Leung, C. Phillips, and J. Watson.  On the placement of imperfect
    sensors  in  municipal water networks.  In Proceedings  of the 8th Symposium on  Water Distribution
    Systems Analysis. ASCE, 2006.

 [3] J. Berry, L. Fleischer, W. E. Hart, C. A.  Phillips, and J.-P. Watson.  Sensor placement in municipal
    water networks.  J. Water Resources Planning and Management, 131(3):237^243, 2005.

 [4] J. Berry, W.  E. Hart, C. A. Phillips, J. Uber, and  T. Walski. Water quality sensor placement  in water
    networks with budget constraints. In Proc. World Water and Environment Resources Conference, 2005.

 [5] J. Berry, W. E. Hart, C. E. Phillips, J. G. . Uber, and J.-P. Watson. Sensor placement in municiple water
    networks with temporal integer prog ramming models.  J.  Water Resources Planning and Management,
    132(4):218^224, 2006.

 [6] J. W. Berry, W. E. Hart, and C. A. Phillips. Scalability of integer programming computations for sensor
    placement in municipal water networks. In Proc. World Water and Environmental Resources Congress.
    ASCE, 2005.

 [7] R. Carr, H. Greenberg, W. Hart, G. Konjevod, E. Lauer, H. Lin, T. Morrison, and  C. Phillips. Robust
    optimization of contaminant sensor placement  for community water systems.  Mathematical Program-
    ming, 107(1) :337^356, 2006.

 [8] J. R. Chastain, Jr. A Heuristic Methodology for Locating Monitoring Stations to Detect Contamination
    Events in Potable Water Distribution Systems.  PhD thesis, University of South Florida, 2004.

 [9] W. E. Hart,  J. Berry, R. Murray, C.  A. Phillips, L. A. Riesen, and J.-P. Watson.  SPOT: A sensor
    placement optimization toolkit  for drinking water contaminant warning system design.  Technical Report
    SAND2007-4393, Sandia National Laboratories, 2007.

[10] P. Mirchandani and R. Francis, editors. Discrete Location Theory.  John Wiley and Sons, 1990.

[11] K. Morley,  R. Janke, R. Murray, and  K. Fox. Drinking water contamination warning systems: Water
    utilities driving water research. J. AWWA, pages 40^46, 2007.

[12] R. Murray, J. Berry, and W. E. Hart. Sensor network design for contamination warning systems: Tools
    and applications.  In Proc. AWWA Water Security Congress, 2006.

[13] R. Murray, J. Uber, and R. Janke. Modeling acute health impacts resulting from ingestion of contami-
    nated drinking water. J. Water Resources Planning and Management: Special Issue on Drinking Water
    Distribution Systems Security,  2006.

[14] A. Ostfeld  and E. Salomons. Optimal  layout of early warning detection stations for water distribution
    systems security. J. Water Resources Planning and Management, 130(5):377^385, 2004.

[15] A. Ostfeld, J. G. Uber, E. Salomons, J. W. Berry, W. E. Hart, C. A. Phillips, J.-P. Watson, G. Dorini,
    P. Jonkergouw, Z. Kapelan, F.  di  Pierro, S.-T. Khu, D. Savic, D. Eliades, M. Polycarpou, S. R. Ghimire,
    B. D. Barkdoll, R. Gueli, J. J.  Huang, E. A. McBean, W. James, A. Krause,  J. Leskovec, S. Isovitsch,
    J. Xu, C. Guestrin, J. VanBriesen, M. Small, P. Fischbeck, A. Preis, M. Propato, O. Filler,  G. B. Tra-
    chtman, Z.  Y. Wu, and T. Walski. The battle of the water sensor networks (BWSN): A design challenge
    for engineers and algorithms. J Water Resources Planning and Management,  2007. (submitted).

                                                38

-------
[16] R. T. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distributions. J. of Banking
    and Finance, 26(7):1443-1471, 2002.

[17] L. A. Rossman.  The EPANET programmer's toolkit for analysis of water  distribution systems.  In
    Proceedings of the Annual Water Resources Planning and Management  Conference, 1999.  Available at
    http://www.epanet.gov/ORD/NRMRL/wswrd/epanet.html.

[18] N. Topaloglou,  H. Vladimirou, and S.  Zenios. CVaR models with selective  hedging for international
    asset allocation. J. of Banking and Finance, (26):1535-1561, 2002.

[19] G. B. Trachtman.  A "strawman" common sense approach  for water quality sensor site selection.  In
    Proc. 8th Annual Water Distribution System Analysis Symposium, 2006.

[20] USEPA. WaterSentinel System Architecture. Technical report, U.S. Environmental Protection Agency,
    2005.

[21] J.-P. Watson, H. J. Greenberg,  and W.  E.  Hart.  A multiple-objective analysis of sensor placement
    optimization in water networks.  In Proc.  World Water and Environment Resources  Conference, 2004.

[22] J.-P. Watson, W. E. Hart, and R. Murray. Formulation and optimization  of robust sensor placement
    problems for contaminant warning systems. In Proc.  Water Distribution System Symposium, 2006.
                                                39

-------
                                                                       EPA 600/R-08/041B I October 2010 I www.epa.gov/ord

&EPA
     United States
     Environmental Protection
     Agency
PRESORTED STANDARD
 POSTAGES FEES PAID
         EPA
   PERMIT NO. G-35
     Office of Research and Development
     National Homeland Security Research Center
     Cincinnati, OH 45268


     Official Business
     Penalty for Private Use
     $300
             Recycled/Recyclable
             Printed with vegetable-based ink on
             paper that contains a minimum of
             50% post-consumer fiber content
             processed chlorine free

-------