EPA/600/B-15/315
Microbial Source Module (MSM):
Documenting the Science and Software
for Discovery, Evaluation, and Integration
Gene Whelan
Rajbir Parmar
Gerard F. Laniak
11/3/15
U.S. Environmental Protection Agency
Office of Research and Development
National Exposure Research Laboratory
Ecosystems Research Division
Athens, GA 30605

-------
TABLE OF CONTENTS
1.	EXECUTIVE SUMMARY
2.	INTRODUCTION
3.	MICROBIAL SOURCE MODULE
4.	APPLICATION OF AN ONTOLOGY TO THE MICROBIAL SOURCE MODULE
4.1	COMPONENT SUPERCLASS
4.1.1	Key Words
4.1.2	Component Description
4.2	RESOURCES LAYER
4.2.1	Developer Class
4.2.2	Organization Class
4.2.3	Project Class
4.2.4	Development Level Class
4.2.5	Data Class
4.3	COUPLING LAYER
4.3.1	Modeling Framework Class
4.3.2	Computational Resolution Class
4.3.3	Standards Interface Class
4.3.4	Architecture Class
4.4	TECHNICAL LAYER
4.4.1	Operating System Class
4.4.2	Programming Language Class
4.4.3	Memory Requirements Class
4.4.4	Number of Processors
4.5	SCIENTIFIC LAYER
4.5.1	Domain Class
4.5.2	Mathematical Classification Class
4.5.3	Symbol Class
4.5.3.1	Ontological Metadata Format
4.5.3.2	Indices
4.5.4	Equation Class
4.5.4.1	Summary of Assumptions and Constraints
4.5.4.2	Domestic Animal [...] Rates
4.5.4.2.1	Domestic Animal Waste Available for Runoff
4.5.4.2.2	Wildlife Shedding Rates
4.5.4.3	Accumulated Microbial Loading Rates on Cropland
4.5.4.3.1	Wildlife
4.5.4.3.2	Dairy Cow
4.5.4.3.3	Beef Cattle
4.5.4.3.4	Poultry
4.5.4.3.5	Swine
4.5.4.4	Accumulated Microbial Loading Rates on Pasture
4.5.4.4.1	Shedding to Land Surface
4.5.4.4.1.1	Wildlife
4.5.4.4.1.2	Beef Cattle
4.5.4.4.1.3	Horses
4.5.4.4.1.4	Sheep
4.5.4.4.1.5	Other Agricultural Animals
4.5.4.4.2	Manure Application to Land Surface
4.5.4.4.2.1	Dairy Cow
4.5.4.4.2.2	Beef Cattle
4.5.4.4.2.3	Horses
4.5.4.5	Accumulated Microbial Loading Rates on Forest
4.5.4.7	Accumulated Overland Microbial Loading Rates to the Land Surface, Adjusted for Die-off
4.5.4.7.1	Die-off Adjustment
4.5.4.7.2	Accumulated Overland Microbial Loading Rates and Maximum Microbial Storage with Die-off
4.5.4.7.2.2	Pasture
4.5.4.7.2.3	Forest
4.5.4.7.2.4	Urbanized
4.5.4.7.3	Maximum Microbial Storage with Die-off
4.5.4.8	Microbial Point Source Loading Rates
4.5.4.8.1	Cattle in Streams
4.5.4.8.2	Septics
4.5.4.8.3	Point Source
5.	CONTEXT OF THE MICROBIAL SOURCE MODULE WITHIN A MULTI-COMPONENT WORKFLOW
5.1	TRANSFORMATION OF LATITUDE-LONGITUDE COORDINATES TO WATERSHED DESIGNATIONS
5.2	WORKFLOW
5.3	ONTOLOGICAL RELATIONSHIPS BETWEEN VARIABLES, EQUATIONS, AND COMPONENTS
6.	MAPPING THE MICROBIAL SOURCE MODULE ONTOLOGICAL METADATA DICTIONARIES TO THE EXTENSIBLE MARKUP LANGUAGE DOCUMENT
REFERENCES
APPENDIX A
MICROBIAL SOURCE MODULE XML DOCUMENT FOR INPUT PARAMETERS/VARIABLES
APPENDIX B
MICROBIAL SOURCE MODULE XML DOCUMENT FOR OUTPUT PARAMETERS/VARIABLES

-------
1. EXECUTIVE SUMMARY
The Microbial Source Module (MSM) estimates microbial loading rates to land surfaces from non-point
sources, and to streams from point sources for each subwatershed within a watershed. A subwatershed,
the smallest modeling unit, represents the common basis for information consumed and produced by
the MSM, which is based on the HSPF (Bicknell et al., 1997) Bacterial Indicator Tool (EPA, 2013b, 2013c).
Non-point sources include numbers, locations, and shedding rates of domestic agricultural animals
(dairy and beef cows, swine, poultry, etc.) and wildlife (deer, duck, raccoon, etc.). Monthly maximum
microbial storage and accumulation rates on the land surface, adjusted for die-off, are computed over
an entire season for four land-use types (cropland, pasture, forest, and urbanized/mixed-use) for each
subwatershed. Monthly point source microbial loadings to instream locations (i.e., stream segments
that drain individual subwatersheds) are combined and determined for septic systems, direct instream
shedding by cattle, and POTWs/WWTPs (Publicly Owned Treatment Works/Wastewater Treatment
Plants).
The MSM functions within a larger modeling system that characterizes human-health risk resulting from
ingestion of water contaminated with pathogens. The loading estimates produced by the MSM are input
to the HSPF model that simulates flow and microbial fate/transport within a watershed. Microbial
counts within recreational waters are then input to the MRA-IT model (Soller et al., 2008, 2004) to
estimate human exposure and risk.
A new approach has been taken in the design and implementation of MSM documentation and software
with the goal of enhancing the MSM's potential for reuse and interoperability with modeling systems.
Satisfying this goal requires the MSM to be easy to discover, understand, evaluate, access, and
integrate; therefore, the strategy is to 1) facilitate discovery, understanding, and evaluation by
documenting the module with an ontological framework, and 2) facilitate access and integration by
implementing the software as a web service.
The ontological framework is based on the Water Resources Component (WRC) ontology (Elag and
Goodall, 2013). The WRC is a structured way to describe the ontology of an environmental system
represented by a science software component such as the MSM. The MSM ontology is documented in
Protege (Protege, 2014), an editor that implements the Web Ontology Language (OWL; W3C, 2013). The
ontological framework also documents key aspects of the MSM including key words; module purpose,
assumptions, and constraints; inputs; outputs; and internal variables. Finally, this document represents a
traditional Theory Manual that accompanies the science; it has been structured to mirror the ontology,
thus facilitating development in Protege.
To facilitate access and integration, MSM software has been designed with object-oriented principles
and is "published" as a Representational State Transfer (REST, 2015) web service. The web service
consumes XML input and produces XML output which can be accessed directly via browser add-ons such
as Postman for Chrome. The most common way to consume the web service is through a custom
desktop or web client program. The web service is platform and programming language agnostic.
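The request/response exchange described above can be sketched with a small client. This is a minimal illustration, not the MSM's actual interface: the endpoint URL, the element names (`MSMInput`, `Subwatershed`, `Parameter`), and the parameter name `BeefCattleCount` are all hypothetical, since this section does not give the service address or schema. Only Python's standard library is used.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the real service URL is not given in this section.
MSM_URL = "https://example.org/msm/run"


def build_request_xml(subwatershed_id: str, inputs: dict[str, float]) -> bytes:
    """Serialize input parameters to an XML payload (element names are illustrative)."""
    root = ET.Element("MSMInput")
    sub = ET.SubElement(root, "Subwatershed", id=subwatershed_id)
    for name, value in inputs.items():
        param = ET.SubElement(sub, "Parameter", name=name)
        param.text = repr(value)
    return ET.tostring(root, encoding="utf-8")


def parse_parameters(payload: bytes) -> dict[str, float]:
    """Read every <Parameter name="...">value</Parameter> element from an XML payload."""
    root = ET.fromstring(payload)
    return {p.get("name"): float(p.text) for p in root.iter("Parameter")}


def run_msm(subwatershed_id: str, inputs: dict[str, float], url: str = MSM_URL) -> dict[str, float]:
    """POST the XML input to the web service and parse the XML it returns."""
    request = urllib.request.Request(
        url,
        data=build_request_xml(subwatershed_id, inputs),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return parse_parameters(response.read())
```

Because the client only builds and parses XML over HTTP, the same exchange could be written in any language, which is what makes the service platform and programming language agnostic.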

-------
2. INTRODUCTION
Many modeling frameworks compartmentalize science into individual models, linking sets of small
components to create larger modeling workflows. Development of
integrated watershed models increasingly requires coupling of multidisciplinary, independent models
and collaboration between scientific communities, since component-based modeling enables integration
of models from different disciplines (Elag and Goodall, 2013). Integrated Environmental Modeling (IEM)
systems focus on transferring information between components by capturing a conceptual site model
(CSM), establishing local metadata standards for input/output of models/databases, managing
data/information flow between models and throughout the system, facilitating quality control of
data/information exchanges (e.g., units checking, units conversion, inter-language transfers), handling
warnings/errors, and coordinating sensitivity/uncertainty analyses (Whelan et al., 2014a). Although
many computational software systems are designed to facilitate communication between, and
execution of, components (Whelan et al., 2014a; Laniak et al., 2013), there are no common approaches,
protocols, or standards for turn-key linkages between software systems and models, especially if the
intent is not to modify components.
While there has been a notable increase in component-based modeling frameworks in recent years
(Laniak et al., 2013; Whelan et al., 2014a), there has been less work on creating standard vocabularies,
metadata, semantics, and ontologies (see Table 1) to ensure proper technical and conceptual
assemblage, although work on ontologies is gaining traction. For example, Elag and Goodall (2012, 2013)
and Morsey et al. (2014) designed an ontology for the water resources community, using a skeletal
methodology described by Uschold and Gruninger (1996). Titled the Water Resources Component
(WRC) ontology, it was developed to advance application of component-based modeling frameworks
across water-related disciplines. Although their WRC ontology was designed for water resources, its
design can be extended to include other domains, such as microbial source-term modeling, to document
individual modeling components for eventual inclusion in larger, disparate systems. It advances the
conceptual integration of components from different, but related, disciplines by handling semantic and
syntactic heterogeneities to describe them, so they can be more easily reused, extended, and
maintained by a larger group of model developers and end users. The WRC has four ontological layers
(Elag and Goodall, 2012, 2013):
•	Resources: defines digital resources related to the component.
•	Coupling: defines coupling standards used by the component, the framework in which the
component can be coupled, and its computational resolution.
•	Scientific: describes the equations, symbols, mathematical classification, and component
purpose.
•	Technical: defines required computer architecture to employ and edit the component.
An overview of the WRC Ontology's four layers and their classes (Elag and Goodall, 2012, 2013) is
presented in Figure 1. Details of the layers are presented in Figure 2. The strength of this ontology, like
others, is its structure for capturing and documenting key information that defines a component's
vocabulary, metadata, semantics, and ontology to promote interoperability of components across
disciplines and modeling frameworks.
The purposes of this effort are to 1) enable construction of scientifically consistent, coherent
environmental software systems for multi-disciplinary data integration, decision and policy support, and
modeling; and 2) discover, access, and integrate components developed and published by different
scientists (Laniak, 2012). The objectives are to 1) describe a model, using a standard ontology (see Table
1), so that the module can be discovered, understood, evaluated, accessed, and implemented on the
cloud, and 2) place the model within the context of a workflow. A model called the Microbial Source
Module (MSM) is described using this ontology. A glossary of terms related to interoperability is
provided in Table 1. The ontology documents metadata, syntactics, and semantics of the model's
Input/Output (I/O) through expanded dictionaries (Whelan et al., 2014a), mathematical formulations
that define and/or use each I/O parameter/variable, constraints (i.e., assumptions) associated with each
I/O parameter/variable, and an Extensible Markup Language (XML) file that encodes the I/O dictionaries
for access and execution on the cloud. An in-depth discussion of applications of the WRC ontology
relative to the MSM is presented, followed by a description of the MSM within a more complex
modeling environment, where ontological relationships are captured within a more inclusive, multi-
component paradigm.
Table 1. Definition of Terms Related to Interoperability
TERM	DEFINITION
Data	Information that is consumed and produced
Vocabulary	Terminological dictionary, which contains designations (e.g., names) and definitions from one or more specific subject fields ( 2008)
Taxonomy	Science of classification according to a pre-determined system with the resulting catalog used to provide a conceptual framework for discussion, analysis, or information retrieval (i.e., identifies, names, and classifies data [1], so it can be standardized, shared, and re-used in multiple systems [2])
Metadata	Information about the data used to capture content (Kashyap and Sheth, 2000)
Syntactics	Data structure [i.e., how elements are sequenced to form valid conditions (e.g., keywords, object names, operators, delimiters, and so on are in the correct places)]
Semantics	Data and their relationship to other data [3] by relating content and representation of information resources to entities and concepts in the real world (Meersman and Mark, ) and including not only the metadata about data but also the intended use (i.e., application) of data (Sheth, 2001)
Ontology	Explicit specification of conceptualization, describing knowledge about the domain [4] and relationships between domain concepts [5]
[1] http://en.wikipedia.org/wiki/Taxonomy
[2] http://it.toolbox.com/blogs/irm-blog/the-benefits-of-a-data-taxonomy-4916
[3] http://en.wikipedia.org/wiki/Semantic_data_model
[4] http://www.obitko.com/tutorials/ontologies-semantic-web/ontologies.html
[5] http://en.wikipedia.org/wiki/Ontology_(information_science)

-------
[Figure 1 graphic: the Component sits at the center of four layers. Resources: Developer, Organization, Development Level, Project, Data. Coupling: Modeling Framework, Architecture, Computational Resolution, Standards Interface. Scientific: Symbols, Equation, Units, Mathematical Classification, Domain. Technical: Programming Language, Operating System, Number of Processors, Memory Requirements.]
Figure 1. Overview of the Water Resources Component Ontology, describing the Four Basic Layers and
their Classes (Elag and Goodall, 2012, 2013)

-------
[Figure 2 graphic: detailed view of the four WRC ontology layers around the Component. Recoverable class labels include Developer, Organization (University, Company), Project (Research, Teaching, Commercial), Development Level (Levels I through IV), and Data (Tabular, GeoSpatial, WaterML, Vector, Data Value) in the Resources layer; Modeling Framework (Concurrent, Sequential), Architecture, Standard Interface, and Computational Resolution (Spatial Extent, Spatial Resolution, Temporal Resolution) in the Coupling layer; Number of Processors, Operating System, Memory Requirements, and Programming Language in the Technical layer; and Symbol, Variable, and DataValue in the Scientific layer. Recoverable relationship labels include isUsedBy, isBuiltOn, has, uses, usedBy, isCalculatedIn, calculates, hasSymbolValue, hasIn/Output, and isIn/Output.]
-------
3. MICROBIAL SOURCE MODULE
A coupled software system is being developed that will connect IEM legacy technologies to support a
watershed-scale Quantitative Microbial Risk Assessment (QMRA) source-to-receptor assessment,
focusing on animal-impacted catchments, although point sources are also considered. A Quantitative
Microbial Risk Assessment (QMRA) is a modeling approach that integrates disparate data (including
fate/transport, exposure, and human health effect relationships) to characterize potential health
impacts/risks from exposure to pathogenic microorganisms (Soller et al., 2010; Whelan et al., 2014b;
Haas et al., 1999; Hunter et al., 2003). As Whelan et al. (2014b) note, a QMRA's conceptual design fits
well within an integrated, multi-disciplinary modeling perspective (illustrated in Figure 3), which
describes the problem statement; data access, retrieval, and processing [e.g., D4EM (EPA, 2013a; Whelan
et al., 2009; Wolfe et al., 2007)]; software frameworks for integrating models and databases [e.g.,
FRAMES (Johnston et al., 2011)]; infrastructures for performing sensitivity, variability, and uncertainty
analyses [e.g., SuperMUSE (Babendreier and Castleton, 2005)]; and risk quantification. Coupling
modeling results with epidemiology studies allows policy-related issues (e.g., EPA, 2010; EPA and USDA,
2012) to be explored (Figure 3).
[Figure 3 graphic: linked panels for Problem Definition (pathogens, scenarios); Data Access, Retrieval, and Processing (field and laboratory monitoring; meteorological data, soils and topography, land use/cover, watershed/stream delineations); Integrated Modeling Framework (source-term loadings; fate and transport through watershed hydrology, the water body network, and receiving waters), housed within Sensitivity/Uncertainty; Health Impacts (dose-response relationship, epidemiology studies); Exposure (intake); Risk Quantification (at receptor locations); QMRA Investigations; and Policy-related Issues (risk targets). The workflow iterates on sources, watersheds, water bodies, and receptor locations.]
Figure 3. One possible rendition of QMRA from an integrated, multi-disciplinary multimedia modeling
framework perspective that links problem definition; data access, retrieval, and processing; integrated
modeling framework with source-to-receptor environmental models, housed within a
sensitivity/uncertainty software structure; risk quantification linked to epidemiology studies and policy-
related uses (After Whelan et al., 2014b)

-------
An important piece of the IEM microbial workflow is the Microbial Source Module (MSM) that organizes,
analyzes, and supplies data necessary to determine microbial loading rates within a watershed to
support watershed modeling. The MSM makes this determination from sources correlated to four land-
use types (cropland, pasture, forest, and urbanized/mixed-use) for each subwatershed, the smallest
modeling unit within a watershed. Microbial sources include numbers and locations of domestic
agricultural animals (dairy and beef cows, swine, poultry, etc.) and wildlife (deer, duck, raccoon, etc.)
with estimated shedding rates due to grazing; manure application rates where the manure is directly
incorporated into a pasture's soil; and loading rates due to urbanized/mixed-use activities (commercial,
transportation, etc.). Manure contains microbes, and the monthly maximum microbial storage and
accumulation rates on the land surface, adjusted for die-off, are computed over an entire season to
represent the source for subsequent overland fate and transport to instream locations. Monthly point
source microbial loadings to instream locations are also determined for septic systems, instream
shedding by cattle, and POTWs/WWTPs (Publicly Owned Treatment Works/Wastewater Treatment
Plants).
The MSM is based on the HSPF (Bicknell et al., 1997) Bacterial Indicator Tool (EPA, 2013b,
2013c). The subwatershed is the basis for spatial data consumed and produced by the MSM. Although
microbial loadings may be determined by land use type (e.g., pasture, cropland, urbanized, and
residential), they are combined and assigned to the entire subwatershed. Attributes of the MSM,
captured within the ontological description, include the following:
•	The MSM considers only one microbe at a time and must be individually executed, if multiple
microbes are being assessed; the MSM, therefore, does not consume any information that
specifically identifies the microbe by name.
•	Overland microbial loading rates, accounting for die-off, are computed for each subwatershed
by land use type on a monthly basis.
•	The MSM considers microbial loadings from sources correlated to four land-use types for each
subwatershed, where a subwatershed is the smallest area associated with watershed modeling.
Correlated sources and land use types are pictorially illustrated in Figure 4 and summarized as
follows:
o Cropland:
	▪	Land application of some domestic animal waste (Beef Cow, Dairy Cow, Swine, and/or Poultry)
	▪	Wildlife shedding
o Pasture:
	▪	Some domestic animal grazing with shedding (Beef Cow, Horse, Sheep, and/or Other)
	▪	Land application of some domestic animal waste (Beef Cow, Dairy Cow, and/or Horse)
	▪	Wildlife shedding
o Forest: Wildlife shedding
o Built: Urban-related releases:
	▪	Commercial and Services
	▪	Residential
	▪	Mixed Urban
	▪	Transportation, Communication, Utilities
o Direct Loading to Streams (point source releases):
	▪	Septic systems
	▪	Wastewater Treatment Plants (WWTPs), Water Treatment Plants (WTPs), Publicly Owned Treatment Works (POTWs)
	▪	Instream Beef Cow shedding
•	Instream loading rates are identified with each subwatershed.
•	The MSM currently assumes the smallest time increment associated with nonpoint- and point-
source loadings is monthly, representing typical loadings for that month, regardless of the year.
[Figure 4 graphic: watershed schematic labeled with the land-use types and sources listed above: Cropland, Pasture, Forest, Built (urban-related releases), and Point Source Releases.]
Figure 4. Schematic correlating microbial sources and land use types considered by the Microbial Source
Module (After Whelan et al., 2014b)
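The correlation between land-use types and sources, and the monthly, per-subwatershed bookkeeping, can be pictured as a simple lookup plus a monthly roll-up. The sketch below is illustrative only: the names `SOURCES_BY_LAND_USE`, `POINT_SOURCES`, and `combine_land_uses` are hypothetical, loadings are in arbitrary units, and treating "combined" as a plain sum over land-use types is an assumption rather than the MSM's documented formulation.

```python
MONTHS = range(1, 13)  # monthly values represent a typical month, regardless of year

# Sources correlated to each land-use type, as listed in the text above.
SOURCES_BY_LAND_USE = {
    "Cropland": ["Land application of domestic animal waste", "Wildlife shedding"],
    "Pasture": [
        "Domestic animal grazing with shedding",
        "Land application of domestic animal waste",
        "Wildlife shedding",
    ],
    "Forest": ["Wildlife shedding"],
    "Built": [
        "Commercial and Services",
        "Residential",
        "Mixed Urban",
        "Transportation, Communication, Utilities",
    ],
}

# Direct loadings to streams are tracked separately from land-surface loadings.
POINT_SOURCES = ["Septic systems", "WWTPs, WTPs, POTWs", "Instream Beef Cow shedding"]


def combine_land_uses(monthly_by_land_use: dict[str, dict[int, float]]) -> dict[int, float]:
    """Sum monthly loadings over land-use types for one subwatershed (assumed roll-up)."""
    unknown = set(monthly_by_land_use) - set(SOURCES_BY_LAND_USE)
    if unknown:
        raise ValueError(f"Unknown land-use types: {sorted(unknown)}")
    return {
        month: sum(series[month] for series in monthly_by_land_use.values())
        for month in MONTHS
    }
```

For example, passing monthly series for Cropland and Forest returns twelve totals, one per month, assigned to the subwatershed as a whole.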

-------
4. APPLICATION OF AN ONTOLOGY TO THE MICROBIAL SOURCE MODULE
To demonstrate how ontologies like the WRC can help define a component's vocabulary, metadata,
semantics, and ontology, the Microbial Source Module has been singled out, and an ontological analysis
has been performed and documented. Using the WRC as a guide, this section provides an ontological
description of the MSM using the Component superclass and four ontology "layers": Resource, Coupling,
Technical, and Scientific. Because the Component superclass represents the central hub in each layer, it
is described first, followed by descriptions of the four layers.
4.1 COMPONENT SUPERCLASS
The Microbial Source Module (MSM) is the Component. Key words and a description of the MSM are
provided for the purpose of discovery.
4.1.1 Key Words
Source-term model, microbial modeling, microorganisms, microbial loading rates, watershed, watershed
modeling, microbial properties
4.1.2 Component Description
The Microbial Source Module (MSM) determines microbial loading rates within a watershed from
sources correlated to four land-use types (cropland, pasture, forest, and urbanized/mixed-use) for each
subwatershed, the smallest modeling unit within a watershed. Microbial sources include numbers and
locations of domestic agricultural animals (dairy and beef cows, swine, poultry, etc.) and wildlife (deer,
duck, raccoon, etc.), with estimated shedding rates due to grazing; manure application rates where the
manure is directly incorporated into a pasture's soil; and loading rates due to urbanized/mixed-use
activities (commercial, transportation, etc.). The monthly maximum microbial storage and accumulation
rates on the land surface, adjusted for die-off, are computed over an entire season, representing the
source for subsequent overland fate and transport to instream locations. Monthly point source
microbial loadings to instream locations are also determined for septic systems, instream shedding by
cattle, and POTWs/WWTPs.
4.2 RESOURCES LAYER
The Resources layer has five super classes that collectively describe the component's "digital
Resources" (as illustrated in Figures 1 and 2, Elag and Goodall, 2013), identifying the developers,
pertinent organization, projects supporting the component, its development level (Levels I through IV
which represent basic model research up to a fully deployable, vetted model), and information on data
used by the component.

-------
4.2.1	Developer Class
The Developer class stores information about the component's development team (Elag and Goodall,
2013):
Rajbir Parmar (Software)
Gene Whelan (Science)
Gerard F. Laniak (Ontology)
4.2.2	Organization Class
The Organization class is related to the Developer class and identifies the agency or institute where the
component is developed (Elag and Goodall, 2013):
U.S. Environmental Protection Agency
Office of Research and Development
National Exposure Research Laboratory
Ecosystems Research Division
960 College Station Road
Athens, GA 30605
4.2.3	Project Class
The Project class defines information about projects, where components are coupled to form a
workflow. When a component is part of a modeling workflow, it is necessary to know where and how it
is used within that project, including any specific project requirements (Elag and Goodall, 2013). Projects
related to and supporting this effort include:
•	Research
•	Sustainable and Healthy Communities Research Program (SHCRP)
o Task 1.1.2.2: Interoperability (2014)
o The purpose is to develop Guidelines for Designing and Implementing Environmental
Decision Support Software for Reuse and Interoperability
•	Safe and Sustainable Water Research Program (SSWR)
o Task 2.2.B.8: Integrated Public Health Evaluation of Pathogens (e.g., Occurrence,
Exposure, Effects and Treatment) (2012-2015)
o The purpose is to provide Quantitative Microbial Risk Assessment (QMRA) software
infrastructure to perform predictive modeling and microbial risk assessments in mixed
watersheds, using pathogen and indicator loadings and transport via models
Later, this document describes how the MSM component is coupled in a workflow containing multiple
models.
4.2.4	Development Level Class
The Development Level class defines the component's development stage according to a four-level
scheme. Babendreier (2010) adopted guidance from the U.S. EPA National Exposure Research
Laboratory's Modeling Workgroup to classify model development on four levels, as presented in Table 2.
The levels range from the most rigorous QA at Level I to the least rigorous at Level IV. Level I directly
and/or immediately supports specific Agency rule-making, enforcement, regulatory or policy decisions,
and Level IV documents basic, exploratory, or conceptual model-based research to study basic
phenomena or issues. The MSM is QA'ed at Level IV.
4.2.5 Data Class
The Data class has two subclasses: Data File and Data Value (Elag and Goodall, 2013). The Data File class has
four subclasses: Geospatial, Tabular, Time Series, and Extensible Markup Language (XML) data. The
Data Value class stores numerical or categorical values used by the MSM component. The relationship
between the MSM Component and Data class can be input, output, or associated data. Examples of
associated data include model parameters/variables or source code files (Elag and Goodall, 2013).
Identifying existing data resources and describing the exact format of the data document could enable
components to use remote data sources in an automated manner. The MSM uses XML to describe
input/output file content (XML schema) and to exchange data with the user as input/output data values.
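One way to picture the Resources layer is as a typed record holding the entries given in Sections 4.2.1 through 4.2.5. The sketch below is only an illustration: the class and field names (`ResourcesLayer`, `Developer`, `DevelopmentLevel`, `data_formats`) are hypothetical and are not taken from the WRC ontology or the MSM software; the values are those stated in this section.

```python
from dataclasses import dataclass, field
from enum import Enum


class DevelopmentLevel(Enum):
    """QA levels from Table 2; Level I is the most rigorous, Level IV the least."""
    LEVEL_I = "I"
    LEVEL_II = "II"
    LEVEL_III = "III"
    LEVEL_IV = "IV"


@dataclass(frozen=True)
class Developer:
    name: str
    role: str


@dataclass
class ResourcesLayer:
    """Hypothetical container for the five Resources layer classes."""
    developers: list[Developer]
    organization: str
    projects: list[str]
    development_level: DevelopmentLevel
    data_formats: list[str] = field(default_factory=list)


MSM_RESOURCES = ResourcesLayer(
    developers=[
        Developer("Rajbir Parmar", "Software"),
        Developer("Gene Whelan", "Science"),
        Developer("Gerard F. Laniak", "Ontology"),
    ],
    organization=(
        "U.S. Environmental Protection Agency, Office of Research and Development, "
        "National Exposure Research Laboratory, Ecosystems Research Division"
    ),
    projects=[
        "SHCRP Task 1.1.2.2: Interoperability",
        "SSWR Task 2.2.B.8: Integrated Public Health Evaluation of Pathogens",
    ],
    development_level=DevelopmentLevel.LEVEL_IV,
    data_formats=["XML"],
)
```

A record like this is easy to search and compare across components, which is the kind of discovery and evaluation the ontological description is meant to support.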
4.3 COUPLING LAYER
Elements of the Coupling Layer are presented in Figures 1 and 2. The Coupling Layer answers three
questions about component coupling (Elag and Goodall, 2013): What coupling standards are used by the
component? In which frameworks can components be coupled? What is the computational resolution of
the component? The Coupling Layer addresses these questions through four classes: 1) Modeling
Framework, 2) Standards Interface, 3) Architecture, and 4) Computational Resolution (Figure 2). Figure 5
presents the workflow relationships and interactions between the MSM and other components from
which it consumes and produces data.
4.3.1 Modeling Framework Class
A Modeling Framework provides an environment where components can be coupled (Elag and Goodall,
2013). In component-based modeling, it couples components that adopt a specific Standards Interface
and Architecture. A modeling Component can be used within a Modeling Framework if its design is
consistent with the Framework's Standards Interface and Architecture. Elag and Goodall (2013) classify
the Modeling Framework based on the level of interaction between components: 1) Concurrent, in
which the framework allows components to communicate during the time horizon of the simulation
(e.g., dynamic feedback during runtime) and 2) Sequential, in which the framework allows components
to communicate after the simulation time horizon concludes.
The MSM operates as a stand-alone module, where necessary input data are available for consumption,
or as a module integrated into a modeling framework. The MSM Component design accommodates the
specific Standards Interface and Architecture associated with the Framework for Risk Assessment in
Multimedia Environmental Systems (FRAMES) (Whelan et al., 2014). Data transfer protocols are
captured in ontological metadata dictionaries, extensions of the metadata described by Elag and Goodall
(2013) and Whelan et al. (2014). Either as a stand-alone or within a framework, the MSM operates
sequentially, where the module communicates after the conclusion of the time horizon.
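Sequential coupling, as defined above, can be sketched as a chain in which each component finishes its entire time horizon before any data are handed to the next one. The example is a toy: `run_sequential`, `toy_source_module`, and `toy_watershed_model` are hypothetical stand-ins and do not represent FRAMES, the MSM, or HSPF.

```python
from typing import Callable

# A component consumes a dictionary of named data and returns the data it produces.
Component = Callable[[dict], dict]


def run_sequential(components: list[Component], initial: dict) -> dict:
    """Run components one after another; each sees only completed upstream output."""
    data = dict(initial)
    for component in components:
        data.update(component(data))
    return data


def toy_source_module(data: dict) -> dict:
    """Stand-in source module: produces all 12 monthly loadings before returning."""
    return {"monthly_loading": [data["base_loading"] for _ in range(12)]}


def toy_watershed_model(data: dict) -> dict:
    """Stand-in downstream model: consumes the finished monthly series."""
    return {"annual_total": sum(data["monthly_loading"])}


result = run_sequential([toy_source_module, toy_watershed_model], {"base_loading": 2.0})
```

A concurrent framework would instead exchange values between components at each time step during the run, allowing dynamic feedback that this sequential pattern does not provide.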

-------
Table 2. Guidance for interpretation of QA level requirements for modeling projects (Babendreier, 2010)

Exam
pie Model Evaluation Tasking

Best Practice
Level
Description
Descriptors of Models as Technology and Science
O
«
-o
Corroboration


Recordkeeping
Best Practice Type
QA Level
Model-Based
Category Description
Models as Technology
(i.e., software)
Models as Science
(i.e., archives of data/science)
jde Verification
jnsitivity Analysis
at
3
&
2
m
w
6
SJ
a-
o'
3
Via Observations
Via Model Comparison
Via Other Means
>er Advice, Review
icertainty Analysis
Example Recordkeeping
Requirements

Project Specific - All
Types May Be Involved
Category IV
Basic, exploratory, or
conceptual model-
based research to
study basic
phenomena or issues.
Self-certification via lab notebook,
publishing, or other means.
Self-certification via lab notebook,
publishing, or other means.
Required (self)
As undertaken
As undertaken
As undertaken
As undertaken
As undertaken
Recommended
As undertaken
Lab notebook and peer-reviewed journal
articles as appropriate for project
needs. Source code, executables,
data, and results.
E
©
O
CL
£
a>
S
a>
sz
O)

Development
Category III
Demonstration or proof
of concept of model's
technology basis.
Certification is geared to a
reproducible demonstration of
software behavior by the developer
The software does what we think it
is supposed to do. Includes
planned unit-level (i.e. module)
testing. Includes an appropriate
level of systems testing if the
integrated system is being certified
at this QA level (e.g., stress
testing integrated components).
Certification at this level does not
necessarily require knowledge of
the model's accuracy/precision:
i.e.. its ability to represent or
predict the system of
interest/focus. As appropriate and
feasible for the project's needs,
model evaluation studies are
conducted and reported A
process of building overall
confidence in model output data for
specific uses.
Testable Document Required
As practical for project needs
As practical for project needs
As practical for project needs
As practical for project needs
As practical for project needs
Recommended
As practical for project needs
Source code, executables. data, and
results. Documentation describing in
some form software requirements,
design (approach), specifications (I/O),
test plan(s). and expected\actual test
results. Model corroboration studies,
sensitivity studies, and parameter
estimation\calibration methodologies
are increasingly best conducted with
some level of software verification in
place. Model evaluation studies are
pursued as practical via peer-reviewed
journal articles/presentations, and
other independent peer review venues.
>
o
CO
c
o
M
o
s=
5
5
a.
CO
E
a
o
cL
Model Evaluation

c
o
CO
Category II
Model-based research
of high programmatic
relevance which, in
conjunction with other
ongoing or planned
studies, is expected to
provide
complementary
support of Agency rule-
making. regulatory, or
policy decisions-
Meets Category III requirements for
code verification: systems level
testing would typically be expected
to be more thorough than for
Category III. Code verification is
conducted by person(s) other than
immediate code developer (e.g..
other teammate, etc).
Demonstration or proof of concept
of the model's science basis is
valued. Documentation of model
and model evaluation tasking are
sufficient to support the intended
purpose/use of the model. Implies
a minimal level of understanding by
users of relative reliability and
attendant uncertainties associated
with model data. Typically
includes peer-community support
of the science basis of the model.
Required (non-developer)
Recommended
As needed
As needed
As needed
As needed
Required (as needed)
Required (as needed)
Source code, executables, data, and
results. Supporting verification
document(s) and model evaluation
studies. Supporting and non-
supporting peer-reviews, and
responses to peer-reviews as
appropriate. An overall statement of
uncertainties involved relative to the
needs of the specific use; may involve
separating effects of natural variability
on model output from the effects of
sources of epistemic uncertainty.
Category I
Model-based research
which directly and/or
immediately supports
specific Agency rule-
making, enforcement,
regulatory, or policy
decisions.
Meets Category II requirements for
code verification: systems level
testing would typically be expected
to be more thorough than for
Category III or II. Code compilation
and verification is conducted by
person(s) independent of the
immediate software development
team. As appropriate and feasible,
engages best available or
practically achievable methods.
Documentation objective: the
model application is sufficient to
support the intended purpose/use.
Modelers focus their role on best
describing sources of uncertainty
in outputs, and the range and scale
of associated outcomes possible.
A practical level of understanding
by users of the model is expected,
with acceptable levels of
community wide agreement on
utility of use.
Required (independent)
Recommended
As appropriate, tends towards best
available or practically achievable
As appropriate; tends towards best
available or practically achievable
As appropriate, tends towards best
available or practically achievable
As appropriate; tends towards best
available or practically achievable
Required (as needed)
Required (as needed)
Source code, executables, data, and
results. Supporting verification
document(s) and model evaluation
studies. Supporting and non-
supporting peer-reviews, and
responses to peer-reviews as
appropriate. An overall statement of
uncertainties involved relative to the
needs of the specific use; may involve
separating effects of natural variability
on model output from the effects of
sources of epistemic uncertainty.

-------
[Figure 5 diagram. Recoverable box labels: Data Sources (NLCD, NHDPlus, BASINS, STORET, NWIS, ...); D4EM; SDMProjectBuilder; Microbial Source Module; Microbial Source Database; Microbe Properties Database; Execute HSPF; Execute FVCOM; Create DIC File for MRA-IT; Execute MRA-IT; Determine Pathogen Risks; BASINS; BASINS Viewers.]
Figure 5. Microbial Source Module Interaction within a Large Modeling Workflow between
SDMProjectBuilder, D4EM, MSM, HSPF, BASINS, FVCOM, and MRA-IT (After Whelan et al., 2014c)
4.3.2	Computational Resolution Class
The Computational Resolution class covers both temporal and spatial resolutions of the component
model (Elag and Goodall, 2013). The Temporal Resolution class introduces the order of permissible
operating time steps, and the Spatial Resolution class describes the space resolution. For numerically-
based models, descriptive information such as grid or mesh size and dimensionality (1-D, 2-D, 3-D), as
well as size of the time step to keep it numerically stable, are important to capture. For lumped-
parameter or reduced-form models, data needs are less onerous. For the spatial resolution, the MSM is
designed to work on polygon-shaped subwatershed elements, with no real minimum or maximum size
defined, although typical sizes range from HUC-8s to HUC-16s. For the temporal resolution, overland
microbial loading rates and direct loading to the stream are on a monthly basis.
4.3.3	Standards Interface Class
A Standards Interface is the way data, both input and output, are exchanged with the Component. From
a developer's perspective, accessing the software functionality is achieved via MSM's Application
Programming Interface (API) through a web service. An API is a set of routines, protocols, and tools for
building software applications; it expresses a software component in terms of operations, inputs,
outputs, and underlying types. An API defines functionalities that are independent of implementation,
which allows definitions and implementations to vary without compromising each other. A good API
makes it easier to develop a program by providing all the building blocks for a programmer (API, 2015).
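The separation between an API definition and its implementations can be illustrated with a minimal Python sketch. All names here (LoadingRateApi, ConstantRate, TableLookup, annual_total) are hypothetical and are not part of the MSM web service; the sketch only shows how a caller can depend on operations, inputs, outputs, and types while the implementation varies.

```python
from typing import Dict, Iterable, Protocol, Tuple


class LoadingRateApi(Protocol):
    """Hypothetical API definition: operation, inputs, output, and types only."""

    def monthly_loading(self, subwatershed: str, month: str) -> float:
        ...


class ConstantRate:
    """One implementation: returns the same rate for every request."""

    def __init__(self, rate: float) -> None:
        self._rate = rate

    def monthly_loading(self, subwatershed: str, month: str) -> float:
        return self._rate


class TableLookup:
    """Another implementation: looks the rate up by (subwatershed, month)."""

    def __init__(self, table: Dict[Tuple[str, str], float]) -> None:
        self._table = dict(table)

    def monthly_loading(self, subwatershed: str, month: str) -> float:
        return self._table[(subwatershed, month)]


def annual_total(api: LoadingRateApi, subwatershed: str, months: Iterable[str]) -> float:
    """Caller written against the API definition, not a specific implementation."""
    return sum(api.monthly_loading(subwatershed, m) for m in months)
```

Either implementation can be passed to annual_total without changing the caller, which is the property the paragraph above attributes to a well-defined API.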

-------
4.3.4 Architecture Class
Software architecture is the fundamental organization of a system embodied in its components, their
relationships to each other and the environment, and the principles guiding its design and evolution
(IBM,	, 2000). It represents the high-level structure of a software system which facilitates
communication (Wikipedia, 2014). The MSM is designed as a web service and reflects a Service-Oriented
Architecture (SOA) which is a design pattern based on distinct pieces of software providing application
functionality as services to other applications via a service-orientation. It is independent of any vendor,
product, or technology. A service is a self-contained unit of functionality such as retrieving an online
bank statement. Services can be combined by other software applications to provide the complete
functionality of a large software application. SOA makes it easy for computers connected over a network
to cooperate. Every computer can run an arbitrary number of services, and each service is built to
ensure it can exchange information with any other service in the network without human interaction
and without needing to change the underlying program itself (SOA, 2015).
4.4 TECHNICAL LAYER
The Technical Layer answers questions about the computer architecture required to 1) run a component
simulation, 2) edit or update the component code, 3) determine computational resources required by
the component, and 4) optimize simulation time, given available computational resources (Elag and
Goodall, 2013).
4.4.1	Operating System Class
The Operating System (OS) class defines the different systems that are compatible with the component
(Elag and Goodall, 2013). The MSM was developed under the Microsoft Windows OS.
4.4.2	Programming Language Class
The Programming Language class determines the language used in writing the component (Elag and
Goodall, 2013). The MSM software is written with a combination of C# and ASP.NET.
4.4.3	Memory Requirements Class
The Memory Requirements class describes required memory capacity to support a single component
simulation (Elag and Goodall, 2013). Since the MSM software does not consume large volumes of data,
it has no specific memory requirements.
4.4.4	Number of Processors
The Number of Processors class includes elements representing the number of processors the
component can leverage (Elag and Goodall, 2013). The MSM software runs as a web service, so from a
user's perspective (including software developers), the MSM is executed on a single processor.

-------
4.5 SCIENTIFIC LAYER
The Scientific Layer describes the component's equations, Input and Output (I/O) variables, parameters,
purpose, and mathematical classification (Elag and Goodall, 2013). Components of the Scientific Layer
are shown in Figures 1 and 2. The four Scientific Component classes (Domain, Mathematical
Classification, Symbol, and Equation) are described as follows.
4.5.1	Domain Class
The Domain describes the category with which the Microbial Source Module should be affiliated and is
designated as a Source-term model.
4.5.2	Mathematical Classification Class
The Mathematical Classification class defines how variables are treated in space and time and if they are
deterministic or stochastic. The MSM is classified as Deterministic, as it uses algebraic equations in a
Deterministic mode.
4.5.3	Symbol Class
The Symbol class classifies symbols as Independent or Dependent Variables, Parameters, or Constants,
where each must have a unique, unambiguous name, and where the names themselves can represent
the symbols (Elag and Goodall, 2013). A variable is an entity that changes with respect to another, and a
parameter is an entity that connects variables. A variable is a real world entity with a measureable
quantity, while a parameter is an entity that may or may not be measurable; therefore, the same set of
variables can be described by different parameters (e.g., indices) (Different	2012). For
example, in the equation of a straight line (y = mx + b), x and y are independent and dependent
variables, respectively, and m and b are parameters. When modeling this equation, x, m, and b are
typically inputs, and y is typically an output. The output of one model, which produces dependent
variables, could be classified as independent variables or parameters of a downstream model that
consumes the information as input.
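The straight-line example can be written out as a short Python sketch. The function names and the downstream scaling model are illustrative assumptions, not MSM code; the point is only that x, m, and b enter as inputs, y leaves as an output, and that output can serve as the independent variable of a second model.

```python
def line_model(x: float, m: float, b: float) -> float:
    """y = m*x + b: x is the independent variable, m and b are parameters."""
    return m * x + b


def downstream_model(y: float, scale: float) -> float:
    """Hypothetical downstream model that consumes y as its independent variable."""
    return scale * y


# x, m, and b are inputs; y is the dependent variable (output).
y = line_model(x=2.0, m=3.0, b=1.0)

# The upstream output becomes an input to the downstream model.
z = downstream_model(y, scale=0.5)
```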
4.5.3.1 Ontological Metadata Format
Tables 3 and 4 extend the variable names and definitions associated with Tables 5 and 6 to succinctly
capture the vocabulary, metadata, syntactics, semantics, and ontology associated with MSM input and
output variables, respectively. Tables 3 and 4 are ontological dictionaries describing each variable's
metadata, its relationship to other variables through indices, its use, mathematical expressions that
define or use it, and relevant assumptions that impact its use and/or value. Table 7 summarizes the
indices and provides their definitions. An ontological dictionary, as used here, groups like and related
parameters and provides a single naming convention for variables and parameters shared by modeling
components; specifically, each table provides the following information (Whelan et al., 2014a):
•	Parameter/Variable Name
•	Parameter/Variable Description (Definition of parameter/variable)

-------
Table 3. Microbial Source Module Component - Input: Relevant Vocabulary, Taxonomy, Metadata, Syntactics, Semantics, and Ontology
| Variable Name | Parameter Description | Measure | Unit | Indices |
| LandUse | Land Use Type, self-indexed (i.e., self-enumerated) | | | |
| Subwatershed | Subwatershed Identification designation, self-indexed (i.e., self-enumerated) | | | |
| Urbanized | Name of Built up a…, self-indexed (i.e., self-enumerated) | | | |
| SubUrbanized | Name of Sub-urban… | | | |
| Wildlife | Wildlife Name, self-indexed (i.e., self-enumerated) | | | |
| MonthID | Local ID for Month of the year (January, February, ..., December), self-indexed (i.e., self-enumerated) | | | |
| Area | Areas associated with each land use type (LandUse) per subwatershed (Subwatershed) | | | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse |
| AreaFraction | Fraction of the Urbanized Area contributed by the four urbanized types (Urbanized) per subwatershed (Subwatershed), land use type (LandUse), and urbanized type (Urbanized) (i.e., ratio of the area associated with each urbanized type and total urbanized area). Fractions must total 1.0. | | | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse |
| NumberOfAnimals | Number of domestic animals (Agricultural) per subwatershed (Subwatershed) | | | SDMPBOutput.Subwatershed; SDMPBOutput.Agricultural |
| PointFlow | Point source discharge per subwatershed (Subwatershed) by month (MonthID) | | | SDMPBOutput.Subwatershed |
| PointMicrobeRate | Microbial loadings to the stream associated with the point source discharge per month (MonthID) per subwatershed (Subwatershed) | Microbial Counts/Volume | Microbial Counts/L | SDMPBOutput.Subwatershed |
| SepticNumber | Number of septic systems per subwatershed (Subwatershed) | | | SDMPBOutput.Subwatershed |
| Application | Fraction of manure applied to soil each month (MonthID) per domestic animal (Agricultural) | | | SDMPBOutput.Agricultural |
| SubUrbanizedBuiltUpRate | General microbial loading rates by sub-urbanized (SubUrbanized) category (Microbial Counts/Time/Area) | Microbial Counts/Area/Time | Microbial Counts/Acre/d | SDMPBOutput.SubUrbanized |
| Density | Typical number of wildlife (Wildlife) per unit area by landuse (LandUse) | | | SDMPBOutput.Wildlife; SDMPBOutput.LandUse |
| DieOff | First-order microbial inactivation/die-off rate on the land surface defined per month (MonthID) | | | |
| GrazingDays | Number of grazing days per domestic animal (Agricultural) per month (MonthID) | | | SDMPBOutput.Agricultural |
| ManureIncorporatedIntoSoil | Fraction of amount of manure shed by domestic animal (Agricultural) incorporated into soil | | | SDMPBOutput.Agricultural |
| MicrobeAnimalProductionRates | Production or shedding rate of microbes from the domestic animal, which equals the multiple of the 1) Domestic animal shedding rate in mass of waste (ww) per time and 2) Microbial concentration based on mass of waste shed by domestic animal (Agricultural) | Microbial Counts/Time | Microbial Counts/d | SDMPBOutput.Agricultural |
| MicrobeWildlifeProductionRates | Typical microbial production or shedding rate per wildlife (Wildlife) | Microbial Counts/Time | Microbial Counts/d | SDMPBOutput.Wildlife |
| SepticNumberPeople | Average number of people per septic system across the study area | | | |
| SepticConc | Typical microbial concentration in septic system waste across the study area | Microbial Counts/Volume | Microbial Counts/L | |
| SepticFailureRate | Typical fraction of septic systems that fail across the study area | | | |
| SepticOvercharge | Typical septic overcharge flow rate per person (e.g., gal/d/person) | Volume/Time/Unit… | | |
| TimeSpentInStream | Fraction of the number of grazing days that a domestic animal (Agricultural) spends time in a stream per month (MonthID) | Ratio | | SDMPBOutput.Agricultural |
[All entries: Component = Microbial Source Module; Document = 1. Whelan et al. (2015). TimeSpentInStream also lists Dictionary Name = MSMInput; Primary Key = FALSE; Scaler = TRUE; Minimum = 0; Maximum = 1; Stochastic = TRUE; Parameter Type = Independent; Parameter Function = Input.]
[Equation references legible in this table, row assignment not recoverable: 1(1,2,6-17,…; 1(3-7, 9-23, 27-31, 41); 1(6,7,9-23, 27-40); 1(1,6-17,27-32,35,36,38,39); 1(6,7,9-17,23,41).]
1 Whelan, G., R. Parmar, G.F. Laniak. 2015. Microbial Source Module (MSM): Documenting the Science and Software for Discovery, Evaluation, and Integration. U.S. Environmental Protection Agency, Office of Research and Development,
Athens, GA.

-------
Table 4. Microbial Source Module Component - Output: Relevant Vocabulary, Taxonomy, Metadata, Syntactics, Semantics, and Ontology
| Field | AccumulationRateMonth | StorageLimitMonth | PointFlowToStream | PointMicrobeRateToStream |
| Dictionary Name | MSMOutput | MSMOutput | MSMOutput | MSMOutput |
| Parameter Description | Rate of monthly (MonthID) microbial accumulation on the land surface per land use type (LandUse) area per subwatershed (Subwatershed) without die-off (a.k.a. ACQOP-Month in HSPF) | Maximum microbial storage by month (MonthID) per land use type (LandUse) area per subwatershed (Subwatershed), summed across all domestic animals (Agricultural) and wildlife (Wildlife), adjusted for die-off (a.k.a. SQOLIM-Month in HSPF) | Flow rate to the stream from Point Sources per month (MonthID) per subwatershed (Subwatershed) | Microbial loading rate to the stream from Point Sources by Subwatershed (Subwatershed) by month (MonthID) |
| Cardinality | 3 | 3 | 2 | 2 |
| Data Type (Float, Integer, etc.) (Privilege: 0=Input, 1=BC) | FLOAT | FLOAT | FLOAT | FLOAT |
| Primary Key (i.e., used as a Universal parameter?) | FALSE | FALSE | FALSE | FALSE |
| Scaler [Not Self-Indexed (i.e., not self-enumerated) = True] | TRUE | TRUE | TRUE | TRUE |
| Minimum | 0 | 0 | 0 | 0 |
| Maximum | 1E+38 | 1E+38 | 1E+38 | 1E+38 |
| Measure | Microbial Counts/Area/Time | Microbial Counts/Area | Volume/Time | Microbial Counts/Time |
| Unit | Microbial Counts/Acre/d | Microbial Counts/Acre | ft^3/sec | Microbial Counts/hr |
| Stochastic (Is it allowed to change in a Monte Carlo analysis?) | FALSE | FALSE | FALSE | FALSE |
| Index 1 | SDMPBOutput.Subwatershed | SDMPBOutput.Subwatershed | SDMPBOutput.Subwatershed | SDMPBOutput.Subwatershed |
| Index 2 | SDMPBOutput.LandUse | SDMPBOutput.LandUse | MSMInput.MonthID | MSMInput.MonthID |
| Index 3 | MSMInput.MonthID | MSMInput.MonthID | | |
| Parameter Type [Independent, Dependent, Parameter (e.g., Index)] | Dependent | Dependent | Dependent | Dependent |
| Parameter Function (Input, Output, Internal) | Output | Output | Output | Output |
| Component | Microbial Source Module | Microbial Source Module | Microbial Source Module | Microbial Source Module |
| Document (Reference number with reference) | 1. Whelan et al. (2015) | 1. Whelan et al. (2015) | 1. Whelan et al. (2015) | 1. Whelan et al. (2015) |
| Equation in Document that Defines Variable (Reference numbers with relevant equations in parentheses) | 1(27-30) | 1(31) | 1(36) | 1(35) |
| Equations in Document that use Variable (Reference numbers with relevant equations in parentheses) | 1(31) | | | |
| Equation Type | Algebraic | Algebraic | Algebraic | Algebraic |
| Relevant Assumption (Reference number with relevant assumptions in parentheses) | | | 1(14) | 1(14) |
1 Whelan, G., R. Parmar, G.F. Laniak. 2015. Microbial Source Module (MSM): Documenting the Science and Software for Discovery, Evaluation, and Integration. U.S. Environmental Protection Agency, Office of Research and Development,
Athens, GA.

-------
Table 5. Glossary of Microbial Source Module Input Variables [Descriptors in parentheses refer to
indices outlined in Table 7.]
| Variable | Definition |
| Application | Fraction of manure applied to soil each month (MonthID) per domestic animal (Agricultural) |
| Area | Areas associated with each land use type (LandUse) per subwatershed (Subwatershed) |
| AreaFraction | Fraction of the Urbanized Area contributed by the four urbanized types (Urbanized) per subwatershed (Subwatershed), land use type (LandUse), and urbanized type (Urbanized) (i.e., ratio of the area associated with each urbanized type and total urbanized area). Fractions must total 1.0. |
| Density | Typical number of wildlife (Wildlife) per unit area by landuse (LandUse) pattern |
| DieOff | First-order microbial inactivation/die-off rate on the land surface defined per month (MonthID) |
| GrazingDays | Number of grazing days per domestic animal (Agricultural) per month (MonthID) |
| ManureIncorporatedIntoSoil | Fraction of amount of manure shed by domestic animal (Agricultural) incorporated into soil |
| MicrobeAnimalProductionRates | Production or shedding rate of microbes from the domestic animal, which equals the multiple of the 1) Domestic animal shedding rate in mass of waste (ww) per time and 2) Microbial concentration based on mass of waste shed by domestic animal (Agricultural) |
| MicrobeWildlifeProductionRates | Typical microbial production or shedding rate per wildlife (Wildlife) |
| NumberOfAnimals | Number of domestic animals (Agricultural) per subwatershed (Subwatershed) |
| PointFlow | Point source discharge per subwatershed (Subwatershed) by month (MonthID) |
| PointMicrobeRate | Microbial loadings to the stream associated with the point source discharge per month (MonthID) per subwatershed (Subwatershed) |
| SepticNumberPeople | Average number of people per septic system across the study area |
| SepticConc | Typical microbial concentration in septic system waste across the study area |
| SepticFailureRate | Typical fraction of septic systems that fail across the study area |
| SepticNumber | Number of septic systems per subwatershed (Subwatershed) |
| SepticOvercharge | Typical septic overcharge flow rate per person (e.g., gal/d/person) |
| SubUrbanizedBuiltUpRate | General microbial loading rates by sub-urbanized (SubUrbanized) category |
| TimeSpentInStreams | Fraction of the number of grazing days that a domestic animal (Agricultural) spends time in a stream per month (MonthID) |

-------
Table 6. Glossary of Microbial Source Module Output Variables [Descriptors in parentheses refer to
indices outlined in Table 7.]
| Variable | Definition |
| AccumulationRateMonth | Rate of monthly (MonthID) microbial accumulation on the land surface per land use type (LandUse) area per subwatershed (Subwatershed) without die-off (a.k.a. ACQOP-Month in HSPF) |
| PointFlowToStream | Flow rate to the stream from Point Sources per month (MonthID) per subwatershed (Subwatershed) |
| PointMicrobeRateToStream | Microbial loading rate to the stream from Point Sources by Subwatershed (Subwatershed) by month (MonthID) |
| StorageLimitMonth | Maximum microbial storage by month (MonthID) per land use type (LandUse) area per subwatershed (Subwatershed), summed across all domestic animals (Agricultural) and wildlife (Wildlife), adjusted for die-off (a.k.a. SQOLIM-Month in HSPF) |
•	Cardinality [Number of elements in a set or grouping, as a property of that parameter/variable
(dimensions). For example, if the variable "Area" (see Tables 3 and 5) is a function of its location
(Subwatershed) and land use type (LandUse) (see Table 3), it has a cardinality of 2, and
Subwatershed and LandUse (see Tables 3 and 7) will be classified as parameters (versus
variables).]
•	Data Type (String, Float, Integer, Logical)
•	Primary Key [Parameters/Variables that can be identified and defined only once in a workflow
ontology, so that the universal parameter/variable is equally recognized by all components
within a workflow (e.g., when all components use the same time reference)]
•	Scaler [If TRUE, the variable is not part of a list. If FALSE, it is part of a list and is considered self-
indexed (a function of itself) or self-enumerated (specified one after another). For example, a
time series is typically self-enumerated, so the first time is indexed to 1, the second time to 2,
etc. Self-indexing (i.e., being non-scaler) increases the parameter/variable cardinality by one.]
•	Parameter/Variable Range (Minimum and Maximum)
•	Measure (Categorizes a collection of units that inherit the same measuring properties; for
example, meter, foot, and yard are units for length.)
•	Parameter/Variable Units (Scaling properties within the same measure.)
•	Stochastic (Identifies parameters/variables available for statistical manipulation, such as Monte
Carlo)
•	Indices (Elements in a set or grouping, as a property of that parameter/variable; see Table 7)
•	Parameter/Variable Type (Independent, Dependent, Parameter, or Constant)
•	Parameter/Variable Function (Input, Output, Internal: whether the parameter/variable
represents input, output, or is associated with linking input to output)
•	Component (Identifies the component that defines the parameter/variable)
•	Document (Identifies the document related to the parameter's/variable's descriptions,
equations, and assumptions)
•	Equation in Document that Defines Parameter/Variable
•	Equations in Document that use Parameter/Variable
•	Equation Type (Algebraic, Differential, or Integral)
•	Relevant Assumption (Assumptions that impact the parameter's/variable's use and/or value)
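The relationship between indices, the Scaler flag, and cardinality described in the list above can be sketched in Python. The VariableMetadata class is a hypothetical container, not the MSM data model, and it assumes that cardinality equals the number of indices plus one when the entry is self-indexed (Scaler = FALSE), which matches the "Area" example and the cardinalities reported in Table 4.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VariableMetadata:
    """Hypothetical record holding a few of the metadata fields listed above."""

    name: str
    indices: Tuple[str, ...] = ()
    scaler: bool = True  # spelling follows the table column; True = not self-indexed

    @property
    def cardinality(self) -> int:
        # Assumption: one dimension per index, plus one if self-indexed.
        return len(self.indices) + (0 if self.scaler else 1)


area = VariableMetadata("Area", ("Subwatershed", "LandUse"))
accumulation = VariableMetadata(
    "AccumulationRateMonth", ("Subwatershed", "LandUse", "MonthID")
)
month_id = VariableMetadata("MonthID", scaler=False)  # self-indexed index
```

Under these assumptions, Area has a cardinality of 2, AccumulationRateMonth has a cardinality of 3, and the self-indexed MonthID has a cardinality of 1.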

-------
Table 7. List of indices associated with Parameters
| Index | Definition |
| Agricultural | Domestic Animal Name. There are seven domestic animal name designations: DairyCow (Dairy Cow); BeefCow (Beef Cattle); Swine; Poultry; Horse; Sheep; OtherAgAnimal (Other Agricultural Animal) |
| LandUse | Land use Type. There are four land use type designations: Forest; Cropland; Pasture; Urbanized (a.k.a. Builtup) |
| MonthID | Name of the Month: January, February, March, April, May, June, July, August, September, October, November, December |
| SubUrbanized | Sub-urbanized Built up area. There are five name designations: Commercial; SingleFamilyLowDensity (Single Family Low Density); SingleFamilyHighDensity (Single Family High Density); MultiFamilyResidential (Multi-family Residential); Road |
| Subwatershed | Subwatershed Identification designation |
| Urbanized | Urbanized or Builtup areas. There are four Urbanized designations: CommercialAndServices (Commercial and Services); Residential; MixedUrban (Mixed Urban); TransportationCommunicationUtilities (Transportation, Communication, Utilities) |
| Wildlife | Wildlife Name. There are six wildlife name designations: Duck; Goose; Deer; Beaver; Racoon; OtherWildlife (Other Wildlife) |
4.5.3.2 Indices
The first seven parameters listed in Table 3 correspond to the seven indices outlined in Table 7 (i.e.,
Agricultural, LandUse, MonthID, SubUrbanized, Subwatershed, Urbanized, and Wildlife), upon which other
parameters and variables are dependent. If a parameter/variable has an index, as illustrated in Tables 3 and 4, that
parameter/variable is a function of that index (i.e., another parameter). For example, microbial die-off
(DieOff in Tables 3 and 5) is a function of the month of the year (January, February,..., December; as
captured with MonthID in Tables 3 and 7); hence, DieOff has 12 associated values, one for each month
[i.e., DieOff(MonthID)]:
DieOff(January)
DieOff(February)
...
DieOff(December)
MonthID is a parameter but also an index. Each index may, therefore, be described by one or more
elements: MonthID has 12, LandUse has four (Forest, Cropland, Pasture, and Urbanized), etc. Indices
and their assigned elements are reported in Table 7.
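An indexed parameter such as DieOff(MonthID) can be represented as a mapping from index elements to values. The sketch below is illustrative only: the helper name index_parameter is hypothetical, and the die-off values are placeholders rather than MSM data.

```python
from typing import Dict, Sequence

# Elements of the MonthID index, as listed in Table 7.
MONTH_ID = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def index_parameter(values: Sequence[float],
                    elements: Sequence[str] = MONTH_ID) -> Dict[str, float]:
    """Attach one value to each element of an index (one value per month here)."""
    if len(values) != len(elements):
        raise ValueError("an indexed parameter needs exactly one value per element")
    return dict(zip(elements, values))


# Placeholder values only; real die-off rates would be supplied by the user.
die_off = index_parameter([0.1] * 12)
```

A lookup such as die_off["January"] then plays the role of DieOff(January).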
Indices organize the dimensionality of a system by providing hierarchical relationships (i.e., context)
between variables and parameters, supporting the concept of semantics (see Table 1). Whelan et al.
(2014a) note that semantics refers to the meaning of data and their relationship to other data, including
indices, by relating content and representation of information resources to entities and concepts in the
real world (Meersman and Mark, 1997; Wang et al., 2009).
Some parameters/variables may be a function of multiple indices, such as the variable "Area" (see
Tables 3 and 5), which is a function of its location (i.e., Subwatershed) and land use type (i.e., LandUse)
(see Tables 3 and 7). When a parameter/variable is a function of multiple indices, a hierarchical
relationship exists between multiple indices (i.e., one index is essentially contained within another). For
example, the variable "PointFlow" (see Table 3) is defined with the indices of Subwatershed and
MonthID; thus, there will be a value for "PointFlow" for each combination of Subwatershed and
MonthID; a relationship that can be expressed as:
PointFlow(Subwatershed, MonthID)
or
PointFlow(MonthID, Subwatershed)
In this case, the list of values remains the same, and the order in which they are referenced, using
indices, is simply reversed. Both expressions are valid, although it is desirable to establish a consistent
ordering of indices to facilitate software and documentation development. The following logic was used
to prioritize the order of indices for MSM parameters and variables: Subwatershed, Agricultural,
Wildlife, LandUse, Urbanized, SubUrbanized, and MonthID. All ontological metadata contained in tables,
such as Tables 3 and 4, prioritize their indices (Index 1 to Index 3) in this order.
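The index ordering described above can be applied mechanically. The following Python sketch is an illustration only: the function name canonical_order and the PointFlow values are hypothetical, while the priority list is taken from the paragraph above.

```python
from typing import Dict, Iterable, Tuple

# Priority order of indices stated in the text.
INDEX_PRIORITY = (
    "Subwatershed", "Agricultural", "Wildlife", "LandUse",
    "Urbanized", "SubUrbanized", "MonthID",
)


def canonical_order(indices: Iterable[str]) -> Tuple[str, ...]:
    """Sort index names into the documented priority order.

    Raises ValueError if a name is not one of the seven indices.
    """
    return tuple(sorted(indices, key=INDEX_PRIORITY.index))


# PointFlow is stored once, keyed in canonical order (Subwatershed, MonthID).
# Subwatershed IDs and values are placeholders.
point_flow: Dict[Tuple[str, str], float] = {
    ("SW-1", "January"): 0.5,
    ("SW-1", "February"): 0.7,
}
```

With this convention, PointFlow(MonthID, Subwatershed) and PointFlow(Subwatershed, MonthID) both resolve to the same stored key.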
Table 7 provides a glossary of the indices, which define associations between variables and parameters
and help to define the metadata associated with input and output variables. Tables 5 and 6 provide
glossaries of the MSM input and output parameters/variables, respectively. The glossaries are intended
to be easy look-up tables.

-------
4.5.4	Equation Class
The Equation Class describes all equations used by the MSM component, translating information from
input to output. The purpose is to cross-correlate input, output, and internal variables; equations using
or defining the variables; and associated assumptions. Internal variables refer to those used within the
mathematical formulations, not consumed as input or produced as output. This section is subdivided as
follows:
•	Summary of Assumptions and Constraints impacting the variables and their use within the MSM
•	Domestic Animal Waste Available for Land Application and Wildlife Shedding Rates
•	Accumulated Microbial Loading Rates on Cropland
•	Accumulated Microbial Loading Rates on Pasture
•	Accumulated Microbial Loading Rates on Forest
•	Accumulated Microbial Loading Rates on Urbanized Areas
•	Accumulated Overland Microbial Loading Rates to the Land Surface, Adjusted for Die-off
•	Microbial Point Source Loading Rates
4.5.4.1 Summary of Assumptions and Constraints
1.	The 22 or more land use types associated with the National Land Cover Database (NLCD) are
consolidated into Cropland, Pastureland, Forest, and Urbanized, providing a more manageable
modeling set when land use is used as an index, since supporting data for finer granularity are
not available.
2.	Urbanized land is subdivided into Commercial and Services; Mixed Urban or Built-Up;
Residential; and Transportation, Communications, and Utilities. A single, weighted Urbanized
loading rate is quantified for each subwatershed (all months) based on individual Urbanized land
uses present. Loading rates are calculated for each Urbanized land-use category by:
o Commercial and Services: Commercial
o Mixed Urban or Built-up: Average microbial accumulation rates for Road, Commercial,
Single family low density, Single family high density, and Multifamily residential
o Residential: Average microbial accumulation rates for Single family low density, Single
family high density, and Multifamily residential
o Transportation, Communications, and Utilities
3.	Fecal shedding from animals is used for loading estimates of all land uses except Urbanized.
4.	Manures from Swine and Poultry are assumed to be collected and applied to Cropland.
5.	Beef Cattle/Dairy Cow manure is assumed to be applied only to Cropland and Pastureland by the
same method.
6.	Dairy Cows are only kept in feedlots; therefore, all of their waste is used for manure application,
divided between Cropland and Pastureland.
7.	Beef Cattle are kept in feedlots or allowed to graze by month. During grazing, a specified
percentage of cattle also have direct access to streams; therefore, Beef Cattle waste is either
applied as manure to Cropland and Pastureland, contributed directly to Pasture (shedding) or
Streams (shedding).
8.	Horse manure not deposited in Pastureland during grazing is assumed to be collected and
applied to Pastureland.

-------
9.	Manures from Beef Cattle, Horses, Sheep, and Other domestic animals are assumed to
contribute to Pastureland in proportion to time spent grazing. Sheep and Other domestic animal
manures not deposited to Pastureland during grazing are assumed to be collected and treated
or transported out of the watershed.
10.	Wildlife densities are provided for all land uses except Built-up and assumed to be the same in
all subwatersheds. The wildlife population is the only microbial contributor considered for Forest.
11.	Fraction of annual domestic animal manure application available for runoff each month (EPA,
2013b, 2013c) = [Fraction of manure applied] * {1 - [Fraction of manure incorporated] / 2}
12.	Because Beef Cattle are allowed to graze, they are assumed to have access to streams; direct
contribution of microbes from Beef Cattle to a stream through shedding is thus represented as a
point source. Dairy Cows are not allowed to graze and, therefore, do not have access to streams.
13.	Direct contributions of microbes from Septics and Point sources to a stream are represented as
point sources.
14.	Only one Point Source is allowed per Subwatershed, so point sources are aggregated.
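Assumption 11 can be checked with a small worked example. The Python sketch below reads the expression as applied × (1 − incorporated / 2), which is the literal reading of the brackets as printed; the function name and the input values are illustrative assumptions, not MSM code or data.

```python
def fraction_available_for_runoff(fraction_applied: float,
                                  fraction_incorporated: float) -> float:
    """Fraction of annual manure application available for runoff in a month.

    Implements [Fraction applied] * {1 - [Fraction incorporated] / 2}.
    Both arguments are fractions and must lie in [0, 1].
    """
    for value in (fraction_applied, fraction_incorporated):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return fraction_applied * (1.0 - fraction_incorporated / 2.0)


# Illustrative values: half of the annual manure is applied this month.
nothing_incorporated = fraction_available_for_runoff(0.5, 0.0)
all_incorporated = fraction_available_for_runoff(0.5, 1.0)
```

Under this reading, incorporating all applied manure into the soil halves the fraction available for runoff rather than eliminating it.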
Assumptions and constraints that correlate manure application with land use type by domestic animal
and wildlife are summarized in Table 8. An index glossary that correlates subscripts used within the
mathematical formulas is provided in Table 9. Subscripts relate to the indices associated with the MSM
parameters, as summarized in Table 7; included is an index on the microbe, which accounts for indicator
bacteria, pathogen bacteria, protozoa, and viruses:
Microbial Name (Name)
•	Indicator Bacteria: E. coli, Enterococci, Clostridium perfringens, Fecal Coliforms, Bacteroides
•	Pathogen Bacteria: Salmonella spp., Campylobacter jejuni, E. coli O157:H7, Listeria,
Mycobacterium avium paratuberculosis
•	Pathogen Protozoa: Cryptosporidium parvum, Giardia lamblia, Toxoplasma gondii
•	Pathogen Viruses: Enterovirus, Rotavirus, Adenovirus, Norovirus
Table 8. Correlation of Manure Application with Land Use Type by Domestic Animal and Wildlife
(X = applies; - = does not apply)
Manure Application Correlated to Land Use | Beef Cow | Dairy Cow | Swine | Poultry | Horse | Sheep | Other | Wildlife
Cropland Grazing/Shedding | - | - | - | - | - | - | - | X
Pasture Grazing/Shedding | X | - | - | - | X | X | X | X
Forest Shedding | - | - | - | - | - | - | - | X
In Stream Shedding | X | - | - | - | - | - | - | -
Cropland Application | X | X | X | X | - | - | - | -
Pasture Application | X | X | - | - | X | - | - | -
Notes:
1.	Any domestic animal "Application" has a complementing value for "ManureIncorporatedIntoSoil."
2.	All domestic animals and wildlife have production rates associated with them (i.e.,
"MicrobeAnimalProductionRates" and "MicrobeWildlifeProductionRates," respectively).
Table 9. Index Glossary used in the Mathematical Formulations
Index | Description
i | Subwatershed ID
k | Microbe (1 = E. coli, 2 = Enterococci, 3 = Clostridium perfringens, 4 = Fecal Coliforms, 5 = Bacteroides, 6 = Salmonella spp., 7 = Campylobacter jejuni, 8 = E. coli O157:H7, 9 = Listeria, 10 = Mycobacterium avium paratuberculosis, 11 = Cryptosporidium parvum, 12 = Giardia lamblia, 13 = Toxoplasma gondii, 14 = Enterovirus, 15 = Rotavirus, 16 = Adenovirus, 17 = Norovirus)
ℓ | Land Use Type (1 = Cropland, 2 = Pasture, 3 = Forest, 4 = Urbanized)
m | Domestic Animal [1 = Dairy Cow (DairyCow), 2 = Beef Cattle (BeefCow), 3 = Swine, 4 = Poultry, 5 = Horse, 6 = Sheep, 7 = Other Agricultural Animal (OtherAgAnimal)]
n | Wildlife (1 = Duck, 2 = Goose, 3 = Deer, 4 = Beaver, 5 = Raccoon, 6 = Other Wildlife)
o | Septic ID
p | Point Source Name (i.e., ID)
q | Month of the year (January to December)
r | Urbanized category (1 = Commercial and Services; 2 = Mixed Urban or Built-Up; 3 = Residential; and 4 = Transportation, Communications and Utilities)
s | Animal Location ID
u | Sub-urbanized category (1 = Commercial, 2 = Single Family Low Density, 3 = Single Family High Density, 4 = Multi-family Residential, 5 = Road)
Although the microbial name is not needed by the MSM because it handles only one microbe at a time,
it is presented in the mathematical formulations for completeness and corresponds to a microbial
database (Whelan et al., 2014d) under development. Names are also included in the formulas because
other modules within a workflow may contain parameters/variables that are a function of the microbial
name. A glossary of internal variables used to link input and output variables is presented in Table 10,
and the corresponding ontological dictionary (similar to Tables 3 and 4) is presented in Table 11.
4.5.4.2 Domestic Animal Waste Available for Land Application and Wildlife Shedding Rates
4.5.4.2.1 Domestic Animal Waste Available for Runoff
The fraction of annual manure application available for runoff each month by domestic animal, based on
the monthly fraction applied and incorporated into the soil, is computed as follows:
FractionManureAvailableRunoff_{m,q} = (Application_{m,q}) [1 - (ManureIncorporatedIntoSoil_m) / 2]
(1)
in which
AnimalFractionAvailable_m = 1 - (ManureIncorporatedIntoSoil_m) / 2
(2)
where the terms are defined following Table 11.
Table 10. Glossary of Internal Variables (not including constants) [Descriptors in parentheses refer to
indices outlined in Table 7.]
Index
Definition
AccumBuiltUpRate
Accumulated microbial loading rate associated with the Urbanized land use type
(LandUse = Urbanized) per subwatershed (Subwatershed), weighted by the areas
associated with four Urbanized categories for all months (i.e., applicable throughout
the year)
AnimalFractionAvailable
Fraction of manure shed by domestic animal (Agricultural) that is applied to the land
and available for runoff
BeefCowMicrobeRateApply
The microbial loading rate due to manure application associated with the domestic
animal Beef Cattle (Agricultural = BeefCow) for land use types Cropland and Pasture
(LandUse = Cropland, LandUse = Pasture) by subwatershed (Subwatershed) by
month (MonthID)
BeefCowMicrobeRateShed
The microbial loading rate to land use type Pasture (LandUse = Pasture) due to
grazing of domestic animal Beef Cattle (Agricultural = BeefCow) by subwatershed
(Subwatershed) by month (MonthID)
BeefCowStreamMicrobeRate
Microbial loading rate of domestic animal Beef Cattle (Agricultural = BeefCow)
shedding into a stream by subwatershed (Subwatershed) by month
BuiltUpRate
Accumulation rates in median microbial counts by microbe per Urbanized land type
(LandUse = Urbanized) per area per time, indexed by the Urbanized subcategories
DairyCowMicrobeRateApply
The microbial loading rate due to manure application associated with the domestic
animal Dairy Cow (Agricultural = DairyCow) for land use types Cropland and Pasture
(LandUse = Cropland, LandUse = Pasture) by subwatershed (Subwatershed) by
month (MonthID)
FractionManureAvailableRunoff
Fraction of annual manure from domestic animal (Agricultural) applied to the land
surface that is available for runoff each month (MonthID)
HorsesMicrobeRateApply
The microbial loading rate due to manure application associated with the domestic
animal Horses (Agricultural = Horses) for land use type Pasture (LandUse = Pasture)
by subwatershed (Subwatershed) by month (MonthID)
HorsesMicrobeRateShed
The microbial loading rate to land use type Pasture (LandUse = Pasture) due to
grazing of domestic animal Horses (Agricultural = Horses) by
subwatershed (Subwatershed) by month (MonthID)
OtherAgAnimalMicrobeRateShed
The microbial loading rate to land use type Pasture (LandUse = Pasture) due to
grazing of Other Agricultural Animals (Agricultural = OtherAgAnimal) by
subwatershed (Subwatershed) by month (MonthID)
PoultryMicrobeRateApply
The microbial loading rate due to manure application associated with the domestic
animal Poultry (Agricultural = Poultry) for land use type Cropland (LandUse =
Cropland) by subwatershed (Subwatershed) by month (MonthID)
SepticStreamFlowRate
Average septic flow rate to the stream by subwatershed (Subwatershed)
SepticStreamLoadingRate
Microbial loading rate to the stream from leaking septic systems by subwatershed
(Subwatershed)
SheepMicrobeRateShed
The microbial loading rate to land use type Pasture (LandUse = Pasture) due to
grazing of domestic animal Sheep (Agricultural = Sheep) by
subwatershed (Subwatershed) by month (MonthID)
SwineMicrobeRateApply
The microbial loading rate to land use type Cropland (LandUse = Cropland) due to
manure application of domestic animal Swine (Agricultural = Swine) by
subwatershed (Subwatershed) by month (MonthID)
TotalGrazeDays
Total number of grazing days per year by agricultural domestic animal (Agricultural)
WildLifeMicrobeRateShed
Microbial shedding rate by wildlife (Wildlife) by land-use-type (LandUse) area
WildLifeMicrobeRateShedSum
Total microbial shedding rate per land-use-type (LandUse) area, summed across all
wildlife (Wildlife)
Table 11. Microbial Source Module Component - Internally Computed Variables: Relevant Vocabulary, Taxonomy, Metadata, Syntactics, Semantics, and Ontology
[All rows share Dictionary = MSMInternalVariables, Source = Microbial Source Module, and Reference = Whelan et al. (2015) (footnote 1). Variable descriptions are given in Table 10. A dash marks a field that is not legible in the source; the remaining columns of the original table are not legible.]
Variable | Measure | Units | Indices
AccumBuiltUpRate | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse
AnimalFractionAvailable | - | - | SDMPBOutput.Agricultural
BeefCowMicrobeRateApply | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
BeefCowMicrobeRateShed | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
BeefCowStreamMicrobeRate | Microbial Counts/Time | Microbial Counts/d | SDMPBOutput.Subwatershed; MSMInput.MonthID
BuiltUpRate | - | - | SDMPBOutput.LandUse; SDMPBOutput.Urbanized
DairyCowMicrobeRateApply | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
FractionManureAvailableRunoff | - | - | SDMPBOutput.Agricultural; MSMInput.MonthID
HorsesMicrobeRateApply | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
HorsesMicrobeRateShed | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
OtherAgAnimalMicrobeRateShed | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
PoultryMicrobeRateApply | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
SepticStreamFlowRate | - | - | SDMPBOutput.Subwatershed
SepticStreamLoadingRate | Microbial Counts/Time | - | SDMPBOutput.Subwatershed
SheepMicrobeRateShed | Microbial Counts/Area/Time | - | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
SwineMicrobeRateApply | Microbial Counts/Area/Time | - | SDMPBOutput.Subwatershed; SDMPBOutput.LandUse; MSMInput.MonthID
TotalGrazeDays | - | - | SDMPBOutput.Agricultural
WildLifeMicrobeRateShed | Microbial Counts/Area/Time | Microbial Counts/ac/d | SDMPBOutput.Wildlife; SDMPBOutput.LandUse
WildLifeMicrobeRateShedSum | Microbial Counts/Area/Time | - | SDMPBOutput.LandUse
1 Whelan, G., R. Parmar, G.F. Laniak. 2015. Microbial Source Module (MSM): Documenting the Science and Software for Discovery, Evaluation, and Integration. U.S. Environmental Protection Agency, Office of Research and Development, Athens, GA.
The terms in Equations (1) and (2) are defined as follows:
•	FractionManureAvailableRunoff_{m,q} = Fraction of annual manure application available for runoff each
month (q) by domestic animal (m), equivalent to the ratio of Counts available for runoff each month
to Counts available for runoff per year (Ratio)
•	Application_{m,q} = Fraction of annual manure applied each month (q) per domestic animal (m),
equivalent to the ratio of Counts applied each month to Counts applied per year (Ratio)
•	ManureIncorporatedIntoSoil_m = Fraction of applied manure incorporated into the soil per domestic
animal (m) (Ratio)
•	AnimalFractionAvailable_m = Fraction of domestic animal (m) manure available for runoff (Ratio)
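Equations (1) and (2) can be illustrated with a minimal Python sketch. The function names and the input values are hypothetical and are not part of the MSM software; the sketch only assumes that the monthly application fractions are held in a 12-element list.

```python
def animal_fraction_available(manure_incorporated):
    """Equation (2): fraction of applied manure that remains available for runoff."""
    return 1.0 - manure_incorporated / 2.0


def fraction_manure_available_runoff(application, manure_incorporated):
    """Equation (1): monthly fraction of annual manure available for runoff.

    application is a list of 12 monthly fractions of the annual manure applied.
    """
    available = animal_fraction_available(manure_incorporated)
    return [a * available for a in application]


# Illustrative values only (not from the report): manure applied in
# March, April, September, and October, with half of it incorporated.
application = [0.0, 0.0, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.25, 0.25, 0.0, 0.0]
monthly = fraction_manure_available_runoff(application, manure_incorporated=0.5)
print(monthly[2])  # March: 0.25 * (1 - 0.5 / 2)
```

Under these assumed inputs, one quarter of the incorporated half is treated as unavailable, so each application month contributes 0.1875 of the annual manure to runoff.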
4.5.4.2.2 Wildlife Shedding Rates
Wildlife shedding is the only manure contribution to Forest, although Wildlife also contributes to
Cropland and Pasture. The microbial shedding rate per Wildlife per land-use-type per microbe is:
WildLifeMicrobeRateShed_{k,ℓ,n} = (Density_{ℓ,n}) (MicrobeWildlifeProductionRates_{k,n})	for ℓ = 1, 2, 3
(3)
WildLifeMicrobeRateShed_{k,ℓ,n} = 0	for ℓ = 4
(4)
where
•	WildLifeMicrobeRateShed_{k,ℓ,n} = Shedding rate by Wildlife (n) by microbe (k) per land-use-type (ℓ)
area (Microbial Counts/Time/Area)
•	Density_{ℓ,n} = Number of wildlife (n) per unit area of land use type (ℓ) (Number/Area)
•	MicrobeWildlifeProductionRates_{k,n} = Microbial shedding rate per microbe (k) per wildlife (n)
(Microbial Counts/Time)
The total microbial shedding rate per land-use-type area by microbe, summed across all wildlife, is:
WildLifeMicrobeRateShedSum_{k,ℓ} = Σ_n WildLifeMicrobeRateShed_{k,ℓ,n}
(5)
where
•	WildLifeMicrobeRateShedSum_{k,ℓ} = Total microbial shedding rate per land-use-type (ℓ) area by
microbe (k), summed across all wildlife (n) (Microbial Counts/Time/Area)
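The wildlife shedding calculation above can be sketched in Python for a single microbe. The function names, the dictionary layout, and the numbers are assumptions made for illustration; only the land use index (4 = Urbanized) comes from Table 9.

```python
URBANIZED = 4  # land use index for Urbanized (Table 9)


def wildlife_rate_shed(density, production_rate, land_use):
    """Shedding rate for one wildlife type on one land use type.

    Returns density * production rate for Cropland, Pasture, and Forest,
    and zero for Urbanized land.
    """
    if land_use == URBANIZED:
        return 0.0
    return density * production_rate


def wildlife_rate_shed_sum(densities, production_rates, land_use):
    """Total shedding rate on one land use type, summed over wildlife types.

    densities and production_rates are dicts keyed by wildlife name.
    """
    return sum(
        wildlife_rate_shed(densities[n], production_rates[n], land_use)
        for n in densities
    )


# Placeholder values, not taken from the report.
densities = {"Deer": 2.0, "Goose": 0.5}        # animals per unit area
production = {"Deer": 1.0e6, "Goose": 4.0e6}   # counts per animal per day
forest_total = wildlife_rate_shed_sum(densities, production, land_use=3)
urban_total = wildlife_rate_shed_sum(densities, production, land_use=URBANIZED)
print(forest_total, urban_total)
```

Because wildlife densities are assumed to be the same in all subwatersheds, the sketch carries no subwatershed index.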
4.5.4.3 Accumulated Microbial Loading Rates on Cropland
This section describes calculations to determine the accumulated microbial loading rate on Cropland, by
month, by subwatershed area, by domestic animal, due to manure application (i.e., non-grazing) and
wildlife shedding to the land surface.
4.5.4.3.1 Wildlife
The microbial loading rate to Cropland (ℓ = 1) due to shedding per microbe (k), associated with all
Wildlife, is equal to WildLifeMicrobeRateShedSum_{k,ℓ=1}, with units of Microbial Counts/Time/Area.
4.5.4.3.2	Dairy Cow
The microbial loading rate to Cropland (ℓ = 1) due to manure application per microbe, associated with
Dairy Cow, by month, by Subwatershed, is the same as the loading rate to Pasture and is equal to:
DairyCowMicrobeRateApply_{k,i,ℓ=1,q} = (NumberOfAnimals_{i,m=DairyCow})
(MicrobeAnimalProductionRates_{k,m=DairyCow}) (FractionManureAvailableRunoff_{m=DairyCow,q}) (365 /
DayInMonth_q) / (Area_{i,ℓ=1} + Area_{i,ℓ=2})
(6)
where
•	DairyCowMicrobeRateApply_{k,i,ℓ=1,q} = The microbial loading rate to Cropland (ℓ = 1) due to manure
application per microbe (k), associated with Dairy Cow, by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
•	NumberOfAnimals_{i,m=DairyCow} = Number of Dairy Cows (m = DairyCow) associated with Subwatershed
indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=DairyCow} = Production rate of microbe (k) shed by domestic animal
(m = DairyCow), equals the product of 1) Domestic animal shedding rate in mass of wet waste (ww)
per time (year) and 2) Microbial concentration based on mass of waste shed by domestic animal
(Counts/Time)
•	365 = Conversion constant for days in a year
•	DayInMonth_q = Conversion constant by month for days per month with January = 31, February = 28,
..., December = 31, in which the months are indexed (q) as 1 = January, ..., 12 = December
•	Area_{i,ℓ=1} = Area associated with the Cropland (ℓ = 1) land-use type for Subwatershed (i) (Area)
•	Area_{i,ℓ=2} = Area associated with the Pasture (ℓ = 2) land-use type for Subwatershed (i) (Area)
4.5.4.3.3	Beef Cattle
The microbial loading rate to Cropland (ℓ = 1) due to manure application per microbe, associated with
Beef Cattle, by month, by Subwatershed, is the same as the loading rate to Pasture and is equal to:
BeefCowMicrobeRateApply_{k,i,ℓ=1,q} = (NumberOfAnimals_{i,m=BeefCow})
(MicrobeAnimalProductionRates_{k,m=BeefCow}) (FractionManureAvailableRunoff_{m=BeefCow,q}) [(365 -
TotalGrazeDays_{m=BeefCow}) / DayInMonth_q] / (Area_{i,ℓ=1} + Area_{i,ℓ=2})
(7)
in which
TotalGrazeDays_m = Σ_q GrazingDays_{m,q}
(8)
where
•	BeefCowMicrobeRateApply_{k,i,ℓ=1,q} = The microbial loading rate to Cropland (ℓ = 1) due to manure
application per microbe (k), associated with Beef Cattle, by month (q), by Subwatershed (i)
(Microbial Counts/Time/Area)
•	NumberOfAnimals_{i,m=BeefCow} = Number of Beef Cattle (m = BeefCow) associated with
Subwatershed indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=BeefCow} = Production rate of microbe (k) shed by domestic
animal (m = BeefCow), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
•	GrazingDays_{m=BeefCow,q} = Number of grazing days by Beef Cattle (m = BeefCow) by month (q)
(Number)
•	TotalGrazeDays_{m=BeefCow} = Total number of grazing days per year for Beef Cattle (m = BeefCow)
(Number)
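The manure application rates for Dairy Cow and Beef Cattle differ only in whether grazing days are subtracted from the 365-day year. A minimal Python sketch of Equations (6) through (8) follows; the function names, argument layout, and input values are hypothetical and chosen for illustration.

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def total_graze_days(grazing_days_by_month):
    """Equation (8): total grazing days per year for one domestic animal."""
    return sum(grazing_days_by_month)


def manure_apply_rate(number_of_animals, production_rate,
                      fraction_available_runoff, month, area,
                      graze_days_per_year=0.0):
    """Equations (6) and (7): loading rate from land-applied manure.

    month is 1-based (1 = January). area is the land area receiving the
    manure (Cropland plus Pasture for Dairy Cow and Beef Cattle). With
    graze_days_per_year = 0 the expression reduces to Equation (6).
    """
    confined_days = 365.0 - graze_days_per_year
    annual_to_monthly = confined_days / DAYS_IN_MONTH[month - 1]
    return (number_of_animals * production_rate * fraction_available_runoff
            * annual_to_monthly / area)


# Placeholder inputs, not taken from the report.
grazing = [0, 0, 0, 15, 31, 30, 31, 31, 30, 15, 0, 0]
beef_graze_days = total_graze_days(grazing)
dairy = manure_apply_rate(100, 1.0e10, 0.1, month=4, area=500.0)
beef = manure_apply_rate(100, 1.0e10, 0.1, month=4, area=500.0,
                         graze_days_per_year=beef_graze_days)
print(beef_graze_days, dairy, beef)
```

With identical herd sizes and production rates, the Beef Cattle rate is smaller than the Dairy Cow rate by the factor (365 - TotalGrazeDays) / 365, which reflects that manure shed while grazing is not collected for application.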
4.5.4.3.4 Poultry
The microbial loading rate to Cropland due to manure application per microbe, associated with Poultry,
by month, by Subwatershed is equal to:
PoultryMicrobeRateApply_{k,i,ℓ=1,q} = (NumberOfAnimals_{i,m=Poultry}) (MicrobeAnimalProductionRates_{k,m=Poultry})
(FractionManureAvailableRunoff_{m=Poultry,q}) (365 / DayInMonth_q) / (Area_{i,ℓ=1})
(9)
where
•	PoultryMicrobeRateApply_{k,i,ℓ=1,q} = The microbial loading rate to Cropland (ℓ = 1) due to manure
application per microbe (k), associated with Poultry (m = Poultry), by month (q), by
Subwatershed (i) (Microbial Counts/Time/Area)
•	NumberOfAnimals_{i,m=Poultry} = Number of Poultry (m = Poultry) associated with Subwatershed
indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=Poultry} = Production rate of microbe (k), shed by domestic
animal (m = Poultry), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
4.5.4.3.5 Swine
The microbial loading rate to Cropland due to manure application per microbe, associated with Swine,
by month, by Subwatershed is equal to:
SwineMicrobeRateApply_{k,i,ℓ=1,q} = (NumberOfAnimals_{i,m=Swine}) (MicrobeAnimalProductionRates_{k,m=Swine})
(FractionManureAvailableRunoff_{m=Swine,q}) (365 / DayInMonth_q) / (Area_{i,ℓ=1})
(10)
where
•	SwineMicrobeRateApply_{k,i,ℓ=1,q} = The microbial loading rate to Cropland (ℓ = 1) due to manure
application per microbe (k), associated with Swine, by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
•	NumberOfAnimals_{i,m=Swine} = Number of Swine (m = Swine) associated with Subwatershed indexed
by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=Swine} = Production rate of microbe (k), shed by domestic
animal (m = Swine), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
4.5.4.4 Accumulated Microbial Loading Rates on Pasture
This section describes calculations to determine accumulated microbial loading rate by month, by
subwatershed area, by animal or wildlife, for Pasture due to animal shedding (i.e., grazing) and manure
application to the land surface (i.e., non-grazing).
4.5.4.4.1 Shedding to Land Surface
4.5.4.4.1.1 Wildlife
The microbial loading rate to Pasture (ℓ = 2) per microbe (k), associated with all Wildlife, is equal to
WildLifeMicrobeRateShedSum_{k,ℓ=2}.
4.5.4.4.1.2 Beef Cattle
The microbial loading rate to Pasture due to grazing per microbe, associated with Beef Cattle, by month,
by Subwatershed is equal to:
BeefCowMicrobeRateShed_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=BeefCow})
(MicrobeAnimalProductionRates_{k,m=BeefCow}) (GrazingDays_{m=BeefCow,q}) (1 - TimeSpentInStreams_{m=BeefCow,q}) /
(Area_{i,ℓ=2})
(11)
where
•	BeefCowMicrobeRateShed_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) due to grazing
per microbe (k), associated with Beef Cattle, by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
•	NumberOfAnimals_{i,m=BeefCow} = Number of Beef Cattle (m = BeefCow) associated with
Subwatershed indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=BeefCow} = Production rate of microbe (k), shed by domestic
animal (m = BeefCow), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
•	GrazingDays_{m=BeefCow,q} = Number of grazing days by Beef Cattle (m = BeefCow), by month (q)
(Number)
•	TimeSpentInStreams_{m=BeefCow,q} = Fraction of grazing days of domestic animal Beef Cattle (m =
BeefCow) instream each month (q) (Ratio)
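A short Python sketch of Equation (11), evaluated exactly as written in the text, is given here. The function name and the input values are hypothetical; the only structural point it shows is that the in-stream fraction of grazing time is removed from the load reaching Pasture.

```python
def beef_cow_rate_shed(number_of_animals, production_rate, grazing_days,
                       time_spent_in_streams, pasture_area):
    """Equation (11): Beef Cattle loading rate to Pasture from grazing.

    time_spent_in_streams is the fraction of grazing days spent in-stream,
    so (1 - time_spent_in_streams) is the share shed onto the land surface.
    """
    on_land_fraction = 1.0 - time_spent_in_streams
    return (number_of_animals * production_rate * grazing_days
            * on_land_fraction / pasture_area)


# Placeholder inputs, not taken from the report.
rate = beef_cow_rate_shed(number_of_animals=50, production_rate=1.0e9,
                          grazing_days=30, time_spent_in_streams=0.1,
                          pasture_area=200.0)
print(rate)
```

The grazing expressions for Horses, Sheep, and Other Agricultural Animals have the same form without the in-stream term, which corresponds to calling this function with time_spent_in_streams = 0.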
4.5.4.4.1.3 Horses
The microbial loading rate to Pasture due to grazing per microbe, associated with Horses, by month, by
Subwatershed is equal to:
HorsesMicrobeRateShed_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=Horses}) (MicrobeAnimalProductionRates_{k,m=Horses})
(GrazingDays_{m=Horses,q}) / (Area_{i,ℓ=2})
(12)
where
•	HorsesMicrobeRateShed_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) due to grazing per
microbe (k), associated with Horses, by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
•	NumberOfAnimals_{i,m=Horses} = Number of Horses (m = Horses) associated with Subwatershed
indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=Horses} = Production rate of microbe (k), shed by domestic
animal (m = Horses), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
•	GrazingDays_{m=Horses,q} = Number of grazing days by Horses (m = Horses), by month (q) (Number)
4.5.4.4.1.4	Sheep
The microbial loading rate to Pasture due to grazing per microbe, associated with Sheep, by month, by
Subwatershed is equal to:
SheepMicrobeRateShed_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=Sheep}) (MicrobeAnimalProductionRates_{k,m=Sheep})
(GrazingDays_{m=Sheep,q}) / (Area_{i,ℓ=2})
(13)
where
•	SheepMicrobeRateShed_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) due to grazing per
microbe (k), associated with Sheep (m = Sheep), by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
•	NumberOfAnimals_{i,m=Sheep} = Number of Sheep (m = Sheep) associated with Subwatershed
indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=Sheep} = Production rate of microbe (k), shed by domestic
animal (m = Sheep), equals the product of 1) Domestic animal shedding rate in mass of wet
waste (ww) per time (year) and 2) Microbial concentration based on mass of waste shed by
domestic animal (Counts/Time)
•	GrazingDays_{m=Sheep,q} = Number of grazing days by Sheep (m = Sheep), by month (q) (Number)
4.5.4.4.1.5	Other Agricultural Animals
The microbial loading rate to Pasture due to grazing per microbe, associated with Other Agricultural
Animals, by month, by Subwatershed is equal to:
OtherAgAnimalMicrobeRateShed_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=OtherAgAnimal})
(MicrobeAnimalProductionRates_{k,m=OtherAgAnimal}) (GrazingDays_{m=OtherAgAnimal,q}) / (Area_{i,ℓ=2})
(14)
where
•	OtherAgAnimalMicrobeRateShed_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) due to
grazing per microbe (k), associated with Other Agricultural Animals, by month (q), by
Subwatershed (i) (Microbial Counts/Time/Area)
•	NumberOfAnimals_{i,m=OtherAgAnimal} = Number of Other Agricultural Animals (m = OtherAgAnimal)
associated with Subwatershed indexed by (i) (Number)
•	MicrobeAnimalProductionRates_{k,m=OtherAgAnimal} = Production rate of microbe (k), shed by domestic
animal (m = OtherAgAnimal), equals the product of 1) Domestic animal shedding rate in
mass of wet waste (ww) per time (year) and 2) Microbial concentration based on mass of waste
shed by domestic animal (Counts/Time)
•	GrazingDays_{m=OtherAgAnimal,q} = Number of grazing days by Other Agricultural Animals (m =
OtherAgAnimal), by month (q) (Number)
4.5.4.4.2 Manure Application to Land Surface
4.5.4.4.2.1 Dairy Cow
The microbial loading rate to Pasture (ℓ = 2) due to manure application per microbe, associated with
Dairy Cow, by month, by Subwatershed, is the same as the loading rate to Cropland and is equal to:
DairyCowMicrobeRateApply_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=DairyCow})
(MicrobeAnimalProductionRates_{k,m=DairyCow}) (FractionManureAvailableRunoff_{m=DairyCow,q}) (365 /
DayInMonth_q) / (Area_{i,ℓ=1} + Area_{i,ℓ=2})
(15)
where
•	DairyCowMicrobeRateApply_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) due to manure
application per microbe (k), associated with Dairy Cow, by month (q), by Subwatershed (i)
(Microbial Counts/Time/Area)
4.5.4.4.2.2	Beef Cattle
The microbial loading rate to Pasture (ℓ = 2) due to manure application per microbe, associated with
Beef Cattle, by month, by Subwatershed, is the same as the loading rate to Cropland and is equal to:
BeefCowMicrobeRateApply_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=BeefCow})
(MicrobeAnimalProductionRates_{k,m=BeefCow}) (FractionManureAvailableRunoff_{m=BeefCow,q}) [(365 -
TotalGrazeDays_{m=BeefCow}) / DayInMonth_q] / (Area_{i,ℓ=1} + Area_{i,ℓ=2})
(16)
where
•	BeefCowMicrobeRateApply_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) per
microbe (k), associated with Beef Cattle, by month (q), by Subwatershed (i) (Microbial
Counts/Time/Area)
4.5.4.4.2.3	Horses
The microbial loading rate due to manure application to Pasture, which is the same as the loading rate to
Cropland, per microbe, associated with Horses, by month, by Subwatershed is equal to:
HorsesMicrobeRateApply_{k,i,ℓ=2,q} = (NumberOfAnimals_{i,m=Horses}) (MicrobeAnimalProductionRates_{k,m=Horses})
(FractionManureAvailableRunoff_{m=Horses,q}) [(365 - TotalGrazeDays_{m=Horses}) / DayInMonth_q] / (Area_{i,ℓ=1} +
Area_{i,ℓ=2})
(17)
where
•	HorsesMicrobeRateApply_{k,i,ℓ=2,q} = The microbial loading rate to Pasture (ℓ = 2) per microbe (k),
associated with Horses by month (q), by Subwatershed (i) (Microbial Counts/Time/Area)
4.5.4.5	Accumulated Microbial Loading Rates on Forest
The microbial loading rate to Forest (ℓ = 3) due to shedding per microbe (k), associated with all
Wildlife, is equal to WildLifeMicrobeRateShedSum_{k,ℓ=3}.
4.5.4.6	Accumulated Microbial Loading Rates on Urbanized Areas
The Urbanized Land Use type category is divided into four Urbanized categories (r = 1 for Commercial
and Services; r = 2 for Mixed Urban or Built-Up; r = 3 for Residential; and r = 4 for Transportation,
Communications and Utilities) which are further divided into Sub-urbanized categories (u = 1 for
Commercial, u = 2 for SingleFamilyLowDensity, u = 3 for SingleFamilyHighDensity, u = 4 for
MultiFamilyResidential, and u = 5 for Road). Accumulation rates in median microbial counts by microbe,
per Urbanized land type area per time, indexed by the Urbanized subcategories are computed as
follows:
BuiltUpRate_{k,ℓ=4,r=1} = SubUrbanizedBuiltUpRate_{k,u=1}
(18)
BuiltUpRate_{k,ℓ=4,r=2} = {Σ_{u=1}^{5} [SubUrbanizedBuiltUpRate_{k,u}]} / 5
(19)
BuiltUpRate_{k,ℓ=4,r=3} = {Σ_{u=2}^{4} [SubUrbanizedBuiltUpRate_{k,u}]} / 3
(20)
BuiltUpRate_{k,ℓ=4,r=4} = SubUrbanizedBuiltUpRate_{k,u=5}
(21)
where
•	BuiltUpRate_{k,ℓ=4,r} = Accumulation rates in median microbial counts by microbe (k) per Urbanized
land type (ℓ = 4) per area per time, indexed by the Urbanized categories (r) (Microbial
Counts/Time/Area)
•	SubUrbanizedBuiltUpRate_{k,u} = General microbial loading rates, by microbe (k) by sub-urbanized
(SubUrbanized) category (u) (Microbial Counts/Time/Area)
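The mapping from the five Sub-urbanized rates to the four Urbanized categories in Equations (18) through (21) can be sketched in Python as follows. The function name, the dictionary layout, and the example rates are assumptions made for illustration.

```python
def built_up_rates(sub_rates):
    """Equations (18)-(21): Urbanized category rates from Sub-urbanized rates.

    sub_rates maps the Sub-urbanized index u (1..5) to a loading rate for one
    microbe. Returns a dict keyed by the Urbanized category index r (1..4).
    """
    return {
        1: sub_rates[1],                                   # Commercial and Services
        2: sum(sub_rates[u] for u in range(1, 6)) / 5.0,   # Mixed Urban: mean of u = 1..5
        3: sum(sub_rates[u] for u in range(2, 5)) / 3.0,   # Residential: mean of u = 2..4
        4: sub_rates[5],                                   # Transportation: Road
    }


# Placeholder rates, not taken from the report.
sub = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0, 5: 50.0}
rates = built_up_rates(sub)
print(rates)
```

In this reading, Mixed Urban is treated as the average of all five Sub-urbanized categories and Residential as the average of the three residential ones.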
Accumulated microbial loading rate by microbe, associated with the Urbanized land type per
subwatershed, weighted by the areas associated with four Urbanized categories (i.e., Commercial and
Service; Residential; Mixed Urban; Transportation, Communication, and Utilities), for all months (i.e.,
applicable throughout the year) is computed as follows:
AccumBuiltUpRate_{k,i,ℓ=1,2,3} = 0
(22)
AccumBuiltUpRate_{k,i,ℓ=4} = Σ_r [(AreaFraction_{i,ℓ=4,r}) (BuiltUpRate_{k,ℓ=4,r})]
(23)
where
•	AccumBuiltUpRate_{k,i,ℓ=4} = Accumulated microbial loading rate by microbe (k) associated with
Urbanized land type (ℓ = 4) per subwatershed (i), weighted by areas associated with four
Urbanized categories (r) (i.e., Commercial and Service; Residential; Mixed Urban;
Transportation, Communication, and Utilities), for all months (i.e., applicable throughout the
year) (Microbial Counts/Time/Area)
•	AreaFraction_{i,ℓ,r} = Fraction of Urbanized land type (ℓ = 4), which is a subset of the Land Use Type
(ℓ), indexed to the four subcategories of Urbanized (r) (i.e., Commercial and Service; Residential;
Mixed Urban; Transportation, Communication, and Utilities), by Subwatershed (i)
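The area weighting in Equations (22) and (23) is a weighted sum over the four Urbanized categories. The Python sketch below uses a hypothetical function name and made-up area fractions and rates.

```python
URBANIZED = 4  # land use index for Urbanized (Table 9)


def accum_built_up_rate(area_fractions, built_up, land_use):
    """Equations (22) and (23): area-weighted Urbanized loading rate.

    area_fractions and built_up are dicts keyed by Urbanized category r (1..4)
    for one subwatershed and one microbe. Non-Urbanized land uses return zero.
    """
    if land_use != URBANIZED:
        return 0.0
    return sum(area_fractions[r] * built_up[r] for r in built_up)


# Placeholder inputs, not taken from the report.
fractions = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
built_up = {1: 10.0, 2: 30.0, 3: 30.0, 4: 50.0}
urban_rate = accum_built_up_rate(fractions, built_up, land_use=URBANIZED)
crop_rate = accum_built_up_rate(fractions, built_up, land_use=1)
print(urban_rate, crop_rate)
```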
4.5.4.7 Accumulated Overland Microbial Loading Rates to the Land Surface, Adjusted for Die-off
4.5.4.7.1 Die-off Adjustment
Microbial accumulation on the land surface and maximum microbial storage accumulation calculations
are based on formulations associated with the HSPF watershed model. Die-off or decay on overland
surfaces is simulated as a function of the input accumulation rate and maximum storage of microbes
which represents accumulation without removal (e.g., die-off, runoff, etc.). The unit removal rate
represents processes such as die-off and wind erosion (Bicknell et al., 2005). The unit removal rate of
the stored microbes (number removed per day) is computed as the microbial accumulation rate
(Counts/ac/d), divided by the maximum microbial storage accumulation (storage limit) (Counts/ac). That
is, the removal rate = (accumulation rate) / (storage limit). The maximum microbial storage accumulation
on the land surface (N_t) is computed as the sum of storages for each day of the month:
N_t = N_0 ∫ 10^{-kt} dt = N_0 / [k ln(10)] = N_0 / (2.303 k), integrated from t = 0 to t = number of days in month
(24)
where
•	N_t = Maximum microbial storage accumulation on the land surface (Counts/Area)
•	N_0 = Initial uniform loading to the overland surface (Microbial Counts/Time/Area)
•	k = First-order microbial die-off rate (1/Time)
•	2.303 = Conversion constant for ln(10)
For a die-off rate of k = 0.36/d,
N_t = N_0 / 0.83 = 1.21 N_0
(25)
Likewise, for a die-off rate of k = 0.51/d,
N_t = N_0 / 1.17 = 0.85 N_0
(26)
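The two worked values above can be checked with a short Python sketch. The function name is hypothetical. The optional finite-limit branch is an added assumption, not part of the report: it evaluates the integral up to a given number of days to show that the closed form N_0 / (2.303 k) neglects only the term 10^{-k t} at the upper limit, which is negligible for these die-off rates over a month.

```python
import math


def storage_factor(k, days=None):
    """Ratio N_t / N_0 for a base-10 first-order die-off rate k (1/d).

    With days=None, returns the closed form 1 / (k ln 10) from Equation (24).
    With a finite number of days, returns the integral of 10**(-k t) from
    t = 0 to t = days, i.e. (1 - 10**(-k * days)) / (k ln 10).
    """
    limit = 1.0 / (k * math.log(10.0))
    if days is None:
        return limit
    return limit * (1.0 - 10.0 ** (-k * days))


print(round(storage_factor(0.36), 2))      # 1.21, as in Equation (25)
print(round(storage_factor(0.51), 2))      # 0.85, as in Equation (26)
print(round(storage_factor(0.36, 30), 2))  # 1.21 for a 30-day month as well
```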
4.5.4.7.2 Accumulated Overland Microbial Loading Rates and Maximum Microbial Storage with Die-off
The accumulated overland microbial loading rates and maximum microbial storage with die-off are
presented by microbe (k), by subwatershed (i), by land use type (ℓ), by month (q).
4.5.4.7.2.1 Cropland
The summation of microbial loading rate by microbe (k) per subwatershed (i) by month (q) across all
domestic animals (m) and wildlife (n) on Cropland (ℓ = 1) is computed as follows:
AccumulationRateMonth_{k,i,ℓ=1,q} = (WildLifeMicrobeRateShedSum_{k,ℓ=1}) +
(DairyCowMicrobeRateApply_{k,i,ℓ=1,q}) + (BeefCowMicrobeRateApply_{k,i,ℓ=1,q}) +
(PoultryMicrobeRateApply_{k,i,ℓ=1,q}) + (SwineMicrobeRateApply_{k,i,ℓ=1,q})
(27)
where
•	AccumulationRateMonth_{k,i,ℓ=1,q} = Microbial loading rate by microbe (k), per subwatershed (i), by
month (q), by Land Use Type for Cropland (ℓ = 1), across all domestic animals (m) and wildlife (n)
(Microbial Counts/Time/Area)
4.5.4.7.2.2 Pasture
The summation of microbial loading rate by microbe (k), per subwatershed (i), by month (q), across all
domestic animals (m) and wildlife (n) on Pasture (ℓ = 2) is computed as follows:
AccumulationRateMonth_{k,i,ℓ=2,q} = (WildLifeMicrobeRateShedSum_{k,ℓ=2}) + (BeefCowMicrobeRateShed_{k,i,ℓ=2,q})
+ (HorsesMicrobeRateShed_{k,i,ℓ=2,q}) + (SheepMicrobeRateShed_{k,i,ℓ=2,q}) +
(OtherAgAnimalMicrobeRateShed_{k,i,ℓ=2,q}) + (DairyCowMicrobeRateApply_{k,i,ℓ=2,q}) +
(BeefCowMicrobeRateApply_{k,i,ℓ=2,q}) + (HorsesMicrobeRateApply_{k,i,ℓ=2,q})
(28)
where
•	AccumulationRateMonth_{k,i,ℓ=2,q} = Microbial loading rate by microbe (k), per subwatershed (i), by
month (q), by Land Use Type for Pasture (ℓ = 2), across all domestic animals (m) and wildlife (n)
(Microbial Counts/Time/Area)
4.5.4.7.2.3 Forest
The summation of microbial loading rate by microbe (k), per subwatershed (i), by month (q), across all
domestic animals (m) and wildlife (n), on Forest (ℓ = 3), is computed as follows:
AccumulationRateMonth_{k,i,ℓ=3,q} = WildLifeMicrobeRateShedSum_{k,ℓ=3}
(29)
where
•	AccumulationRateMonth_{k,i,ℓ=3,q} = Microbial loading rate by microbe (k), per subwatershed (i), by
month (q), by Land Use Type for Forest (ℓ = 3), across all domestic animals (m) and wildlife (n)
(Microbial Counts/Time/Area)
4.5.4.7.2.4 Urbanized
The summation of microbial loading rate by microbe (k), per subwatershed (i), by month (q), across all
domestic animals (m) and wildlife (n), on Urbanized (ℓ = 4), is computed as follows:
AccumulationRateMonth_{k,i,ℓ=4,q} = AccumBuiltUpRate_{i,k,ℓ=4}
(30)
where
•	AccumulationRateMonth_{k,i,ℓ=4,q} = Microbial loading rate by microbe (k), per subwatershed (i), by
month (q), by Land Use Type for Urbanized (ℓ = 4), across all domestic animals (m) and wildlife (n)
(Microbial Counts/Time/Area)
4.5.4.7.3 Maximum Microbial Storage with Die-off
The maximum microbial storage by microbe (k), per subwatershed (i), by month (q), by Land Use Type (ℓ),
across all domestic animals (m) and wildlife (n), adjusted for die-off, is computed as follows:
StorageLimitMonth_{k,i,ℓ,q} = AccumulationRateMonth_{k,i,ℓ,q} / (2.303 DieOff_{k,q})
(31)
where
•	StorageLimitMonth_{k,i,ℓ,q} = Maximum microbial storage by microbe (k), per subwatershed (i), by
month (q), by Land Use Type (ℓ), across all domestic animals (m) and wildlife (n), adjusted for
die-off (Counts/Area)
•	DieOff_{k,q} = First-order microbial inactivation/die-off rate on the land surface defined by microbe
(k), by month (q), to account for warm and cold months (1/Time)
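Equations (27) through (31) can be illustrated together for a single microbe, subwatershed, and month. This is a rough sketch under stated assumptions, not the MSM implementation: the short dictionary keys are hypothetical stand-ins for the document's longer variable names, and all numerical inputs are invented for illustration.

```python
# Terms summed for each land use type, following Equations (27) through (30).
# The short keys are hypothetical abbreviations of the document's variable names.
TERMS_BY_LAND_USE = {
    "Cropland": ["WildLifeShedSum", "DairyCowApply", "BeefCowApply",
                 "PoultryApply", "SwineApply"],
    "Pasture": ["WildLifeShedSum", "BeefCowShed", "HorsesShed", "SheepShed",
                "OtherAgAnimalShed", "DairyCowApply", "BeefCowApply",
                "HorsesApply"],
    "Forest": ["WildLifeShedSum"],
    "Urbanized": ["AccumBuiltUpRate"],
}

def accumulation_rate_month(land_use, rates):
    """Sum the contributing loading rates (Counts/ac/d) for one land use type."""
    return sum(rates.get(term, 0.0) for term in TERMS_BY_LAND_USE[land_use])

def storage_limit_month(accumulation_rate, die_off):
    """Equation (31): storage limit (Counts/ac) for a die-off rate in 1/d."""
    return accumulation_rate / (2.303 * die_off)

# Invented example values (Counts/ac/d) for Cropland.
cropland_rates = {"WildLifeShedSum": 1.0e6, "DairyCowApply": 2.0e6,
                  "BeefCowApply": 3.0e6, "PoultryApply": 0.0,
                  "SwineApply": 4.0e6}
rate = accumulation_rate_month("Cropland", cropland_rates)
limit = storage_limit_month(rate, die_off=0.36)
print(f"accumulation rate = {rate:.3e}, storage limit = {limit:.3e}")
```

Keeping the term lists in one table makes it easy to see which sources contribute to each land use type; a missing term is treated as zero here, which is a design choice of the sketch rather than documented MSM behavior.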
4.5.4.8 Microbial Point Source Loading Rates
4.5.4.8.1 Cattle in Streams
The microbial loading rate of Beef Cattle (m = BeefCow) shedding into a stream by microbe (k), by
subwatershed (i), by month (q) is as follows:
BeefCowStreamMicrobeRate_{k,i,q} = (NumberOfAnimals_{i,m=BeefCow})
(MicrobeAnimalProductionRates_{k,m=BeefCow}) [(GrazingDays_{m=BeefCow,q}) / DayInMonth_{q}]
(TimeSpentInStreams_{m=BeefCow,q})
(32)
where
•	BeefCowStreamMicrobeRate_{k,i,q} = Microbial loading rate of Beef Cattle shedding into a stream
by microbe (k), by subwatershed (i), by month (q) (Microbial Counts/Time)
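A short numerical sketch of Equation (32) follows. The argument names are hypothetical, the input values are invented, and TimeSpentInStreams is assumed here to be a dimensionless fraction of grazing time; the source section does not state its units, so treat that as an assumption.

```python
def beef_cow_stream_rate(number_of_animals, production_rate,
                         grazing_days, days_in_month, time_in_stream_fraction):
    """Equation (32): loading rate (Counts/d) shed directly into the stream.

    time_in_stream_fraction is ASSUMED to be a fraction between 0 and 1.
    """
    return (number_of_animals * production_rate
            * (grazing_days / days_in_month) * time_in_stream_fraction)

# Invented inputs: 100 beef cattle, 1e10 Counts/animal/d,
# 15 grazing days in a 30-day month, 5 percent of that time in the stream.
rate = beef_cow_stream_rate(100, 1.0e10, 15, 30, 0.05)
print(f"{rate:.2e} Counts/d")
```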
4.5.4.8.2 Septics
The average septic flow rate to the stream by subwatershed is as follows:
SepticStreamFlowRate_{i} = (SepticNumber_{i}) (SepticNumberPeople) (SepticOvercharge) (SepticFailureRate)
(33)
where
•	SepticStreamFlowRate_{i} = Average septic flow rate to the stream by subwatershed (i)
(Volume/Time)
•	SepticNumber_{i} = Number of septic systems associated with Subwatershed (i) (Number)
•	SepticNumberPeople = Average number of people per septic system (Number)
•	SepticOvercharge = Typical septic overcharge flow rate (Volume/Time/Person)
•	SepticFailureRate = Typical fraction of septic systems that fail (Ratio)
The microbial loading rate associated with septic systems by microbe, by subwatershed, is as follows:
SepticStreamLoadingRate_{k,i} = (SepticStreamFlowRate_{i}) (SepticConc_{k})
(34)
where
•	SepticStreamLoadingRate_{k,i} = Microbial loading rate to the stream from leaking septic systems
by microbe (k), by subwatershed (i) (Microbial Counts/Time)
•	SepticConc_{k} = Typical microbial concentration in septic system waste by microbe (k)
(Counts/Volume)
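Equations (33) and (34) chain together, as the following sketch shows. Function and argument names are hypothetical, and every input value (including the choice of liters and days as units) is an invented example rather than a value recommended by the report.

```python
def septic_stream_flow_rate(septic_number, people_per_system,
                            overcharge, failure_rate):
    """Equation (33): average septic flow to the stream (Volume/Time)."""
    return septic_number * people_per_system * overcharge * failure_rate

def septic_stream_loading_rate(flow_rate, septic_conc):
    """Equation (34): microbial loading rate to the stream (Counts/Time)."""
    return flow_rate * septic_conc

# Invented inputs: 200 systems, 2.5 people each, 265 L/d/person overcharge,
# 10 percent failing, and 1e6 Counts/L in septic waste.
flow = septic_stream_flow_rate(200, 2.5, 265.0, 0.10)
load = septic_stream_loading_rate(flow, 1.0e6)
print(f"flow = {flow:.1f} L/d, load = {load:.3e} Counts/d")
```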
4.5.4.8.3 Point Source
Coupled with the time series associated with discharge from the point source, the time series of
microbial loadings to the stream associated with a point source by microbe (k), by subwatershed (i), by
month (q), is computed as follows, noting there is only one PointSource (p) per subwatershed:
PointMicrobeRateToStream_{k,i,q} = PointMicrobeRate_{k,i,q} + BeefCowStreamMicrobeRate_{k,i,q} +
SepticStreamLoadingRate_{k,i}
(35)
where
•	PointMicrobeRateToStream_{k,i,q} = Microbial loading rate time series to the stream from Point
Sources in the stream by microbe (k), by subwatershed (i), by month (q) (Microbial Counts/Time)
•	PointMicrobeRate_{k,i,q} = Microbial loading rate time series associated with the Point Source
discharge by microbe (k), by subwatershed (i), by month (q) (Microbial Counts/Time)
The flow rate, by month, that is consumed as input (PointFlow) is equal to the flow rate, by month,
produced for consumption by other models (input flow equals output flow).
PointFlowToStream_{i,q} = PointFlow_{i,q}
(36)
where
•	PointFlow_{i,q} = Point source discharge to the stream by subwatershed (i), by month (q)
(Volume/Time)
•	PointFlowToStream_{i,q} = Point source discharge to the stream by subwatershed (i), by
month (q) (Volume/Time)
Equation (36) could be construed as redundant, but it explicitly assigns a point source discharge to the
stream in this assessment.
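The monthly bookkeeping in Equation (35) can be sketched as follows for one microbe and one subwatershed. The function name and the dictionary-by-month layout are assumptions made for illustration; the septic term carries no month index in Equation (35), so it is added unchanged to every month.

```python
def point_microbe_rate_to_stream(point_rate_by_month,
                                 beef_stream_rate_by_month,
                                 septic_loading_rate):
    """Equation (35): total instream loading (Counts/Time) for each month q."""
    return {
        q: point_rate
        + beef_stream_rate_by_month.get(q, 0.0)  # no cattle term -> zero
        + septic_loading_rate                    # same value every month
        for q, point_rate in point_rate_by_month.items()
    }

# Invented example values (Counts/d) for months 1 and 2.
total = point_microbe_rate_to_stream({1: 1.0e9, 2: 2.0e9}, {1: 5.0e8}, 1.0e8)
print(total)
```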
5. MICROBIAL SOURCE MODULE WITHIN A MULTI-COMPONENT WORKFLOW
Although the focus here is to describe how to capture the ontology associated with a component (i.e.,
Microbial Source Module) for discovery, access, and execution on the web, it also provides context for
where the component fits into a larger workflow and modeling paradigm. As noted earlier, the MSM
provides microbial loading rates to overland areas (subwatersheds) and instream locations associated
with each subwatershed. To perform its calculations, the MSM needs microbial properties (e.g., die-off
rates) and the number of subwatersheds associated with the watershed delineation, since a delineation
is a function of the minimum subwatershed size and minimum allowable stream length. In addition, the
MSM needs to know where the sources are located, relative to the subwatershed delineation and
strength of each source (e.g., microbial loading rate). These data are supplied by other modules and
databases associated with the workflow, and their original form may not match input requirements of
the MSM; therefore, some transformation may be necessary. The MSM interacts directly with one
component (SDMProjectBuilder), but indirectly receives information from two additional components
(Microbe Properties Database and Microbial Source Database), as well as a number of other databases
accessed by D4EM (see Figure 5). Additional information on the SDMProjectBuilder, Microbe Properties
Database, and Microbial Source Database is as follows:
•	SDMProjectBuilder - The Site Data Manager Project Builder (SDMProjectBuilder) (Whelan et al.,
2013a, 2013b) leverages data-management tools [e.g., Data for Environmental Modeling
(D4EM)] to access, retrieve, analyze, and cache web-based environmental data (e.g., NHDPlus,
NLCD, NCDC, STORET, NLDAS, STATSGO/SSURGO, etc.); provides geographic information system
(GIS) capabilities using DotSpatial technology; converts DotSpatial-based project files to
MapWindow-based project files (MapWindow 2011, 2013; Watry and Ames, 2008); and
automatically pre-populates input files of fate and transport models. For clarification, the
ActiveX control-based MapWindow is being improved to a DotSpatial version using .NET (Ames,
2010). SDMPB automates the watershed delineation process, allowing HUC-8, HUC-12, or pour-
point analyses; assigns map-layer features (e.g., slope, soil, land use, microbial sources, and
NLDAS (Kim et al., 2014) radar meteorological stations) automatically; and accounts
automatically for snow accumulation/melt, microbial fate and transport, and user-defined
simulation time increments such as hourly, daily, monthly, and annually.
•	Microbe Properties Database - The Microbe Properties Database (Whelan et al., 2014d), which
is under development, captures physico-microbial properties of indicator and pathogen
microorganisms of interest, as well as data on the release of microorganisms associated with
fecal material. Typical properties include pre-determined microbes, domestic animals, wildlife,
and land-use types of interest (e.g., cropland, pasture, forest, urbanized), animal shedding rates,
microbial concentrations in waste manure for both animals and humans, microbial die-off rates,
and wildlife densities by land-use type. In conjunction with Table 9, which identifies microbial
names, Table 12 presents parameters/variables, their indices, and basic metadata associated
with the Microbe Properties Database. The SDMProjectBuilder extracts those properties from
the database needed for the MSM, then transforms and registers the information within the
MSMInput ontological metadata dictionary (Table 3). An example transformation is the
microbial name. Because the MSM considers only one microbe at a time, the index on microbe
names is not considered in its design and does not appear in the MSMInput ontological
metadata dictionary. A mapping of relevant names and indices of the Microbial Properties
Database output (MicrobeProperty, Table 12) to the SDMProjectBuilder output (SDMPBOutput,
Table 3) and Microbial Source Module input (MSMInput, Table 3) ontological metadata is
presented in Table 13.
Table 12. Microbial Properties Database Ontological Output (MicrobeProperty) Dictionary
Name | Description | Cardinality | Data Type | Primary Key | Not Self-Indexed? | Minimum | Maximum | Measure | Units | Stochastic | Index 1 | Index 2 | Index 3
Name | Name of microbe | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
DomesticAnimalName | Domestic Animal Name | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
WildLifeName | Wildlife Name | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
LandUseName | Land Use Type | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
UrbanizedName | Name of Mixed Urban or Built up area | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
MediumName | Environmental Medium associated with Microbe | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
ManureForm | Physical form of the manure (solid, slurry, dry litter) | 1 | STRING | FALSE | FALSE | | | | | FALSE | | |
AlphaM | Fitting parameter that controls the initial microbial release rate from the manure | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/hr | TRUE | ManureForm | |
Bman | The fitting parameter defining the shape of the microbe manure-release curve (Bman) | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | ManureForm | |
ReleaseRateEff | Microbial release rate efficiency from the manure (which is constant with time); it is the fraction of the microbes that are actually released from the manure (Er) | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Ratio | fraction | TRUE | Name | ManureForm |
ReleaseRateEffConstant | A constant parameter that reflects the microbial release rate efficiency from the manure; the release rate efficiency varies with time (b) | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/hr | TRUE | Name | ManureForm |
ExcretionDensity | Density of microbe in feces of infected animal per ww | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Mass | Microbial Counts/g | TRUE | Name | DomesticAnimalName |
ExcretionDensitySuperShedder | Density of microbe in feces of infected animal that is a super shedder per ww | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Mass | Microbial Counts/g | TRUE | Name | DomesticAnimalName |
FastDieOffManure | Fast phase (light) microbial inactivation rate by manure form | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/d | TRUE | Name | DomesticAnimalName | ManureForm
PartitionCoef | Instantaneous partition coefficient between liquid and solid phases (traditional distribution coefficient, Kd) | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Volume/Mass | mL/g | TRUE | Name | MediumName |
Prevalence | Fraction of the animals infected with microbe | 2 | FLOAT | FALSE | TRUE | 0 | 1.0 | Ratio | Fraction | TRUE | Name | DomesticAnimalName |
SlowDieOffManure | Slow phase (dark) microbial inactivation rate by manure form | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/d | TRUE | Name | DomesticAnimalName | ManureForm
BetaPoissonConst | Constant shape parameter in Beta Poisson Dose-Response Model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts | Microbial Counts | TRUE | Name | |
BetaPoissonExp | Exponent shape parameter in Beta Poisson Dose-Response Model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
ExpoDoseRespConst | Constant parameter (r) in Exponential Dose-Response model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Microbial Counts | 1/Microbial Counts | TRUE | Name | |
GompertzLogFirstConst | First parameter (a) in Gompertz-log Dose-Response model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
Table 12. Microbial Properties Database Ontological Output (MicrobeProperty) Dictionary (cont'd)
Name | Description | Cardinality | Data Type | Primary Key | Not Self-Indexed? | Minimum | Maximum | Measure | Units | Stochastic | Index 1 | Index 2 | Index 3
GompertzLogSecondConst | Second parameter (b) in Gompertz-log Dose-Response model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
HypergeometricFirstConst | First parameter (a) in Hypergeometric Dose-Response model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
HypergeometricSecondConst | Second parameter (b) in Hypergeometric Dose-Response model | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
AttachRate | Attachment rate of Microbe at the soil-solid phase (ka in KINEROS2/STWIR) | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/hr | TRUE | Name | MediumName | ManureForm
DetachRate | Detachment rate of Microbe at the soil-solid phase (lowercase kd in KINEROS2/STWIR) | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/hr | TRUE | Name | MediumName | ManureForm
ExchangeDepth | Thickness of top layer that actively interacts with overland flow (i.e., mixing zone) | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Length | m | TRUE | ManureForm | |
InfilFracKf | Fraction of infiltrated cells that have been filtered out by the soil mixing zone (i.e., staying in mixing zone) (kf in Model 2 KINEROS2/STWIR) | 2 | FLOAT | FALSE | TRUE | 0 | 1 | Ratio | fraction | TRUE | Name | ManureForm |
MassTransferRateK | Mass Transfer Rate of Microbe at the soil-solid phase interface (k in Model 1 KINEROS2/STWIR) | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Length/Time | cm/hr | TRUE | Name | MediumName | ManureForm
StrainCoef | Straining coefficient (kstr in KINEROS2/STWIR) | 2 | FLOAT | FALSE | TRUE | 0 | 1 | Ratio | fraction | TRUE | Name | ManureForm |
SepticConc | Typical microbial concentration in septic system waste | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Volume | Microbial Counts/L | TRUE | Name | |
Infect_asymtomatic | Duration of asymptomatic infection in days | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Time | days | TRUE | Name | |
Infect_endemic | Beta_end (endemic transmission rate) | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
Infect_incubation | Duration of incubation in days | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Time | days | TRUE | Name | |
Infect_p_to_p | Beta_pp (person-person transmission rate) | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
Infect_reinfect | Duration of protection from reinfection in days | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Time | days | TRUE | Name | |
Infect_response | Probability of symptomatic response, expressed as a fraction | 1 | FLOAT | FALSE | TRUE | 0 | 1 | Ratio | fraction | TRUE | Name | |
Infect_symtomatic | Duration of symptomatic infection in days | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Time | days | TRUE | Name | |
AnimalConcMass | Microbial concentration based on mass of waste shed by domestic animal | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Mass | Microbial Counts/g | TRUE | Name | DomesticAnimalName |
AnimalShedRateMass | Domestic animal shedding rate in mass of waste (ww) per time | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Mass/Time | kg/d | TRUE | | |
BuiltUpRate | Accumulation rates in median microbial counts per area per time by built up land use | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Area/Time | Microbial Counts/ac/d | TRUE | Name | LandUseName | UrbanizedName
DieOffManure | First-order microbial inactivation/die-off rate by manure form | 3 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/d | TRUE | Name | DomesticAnimalName | ManureForm
DieOffMedium | First-order microbial inactivation/die-off rate by medium | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | 1/Time | 1/d | TRUE | Name | MediumName |
DieOffTempCorr | Microbial inactivation/die-off rate temperature correction | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | | | TRUE | Name | |
Mass | Mass of a single microbe | 1 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Mass | g | TRUE | Name | |
WildLifeDensity | Typical number of wildlife per unit area by land-use pattern | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Unitless/Area | Number/ac | TRUE | WildlifeName | LandUseName |
WildLifeShedRate | Typical wildlife microbial shedding rate per wildlife | 2 | FLOAT | FALSE | TRUE | 0 | 1.0E+38 | Microbial Counts/Time | Microbial Counts/d | TRUE | Name | WildlifeName |
Table 13. Mapping of Names and Indices of Relevant Parameters/Variables from the Microbial Properties Database Output (MicrobeProperty)
Dictionary to the Microbial Source Module Input (MSMInput) Ontological Metadata Dictionary (refer to Tables 3 and 12)
Columns 1-4: Naming Convention for Names and Indices of Microbial Properties Database Output (MicrobeProperty) Ontological Metadata.
Columns 5-7: Naming Convention for Names and Indices of SDMProjectBuilder Output (SDMPBOutput) and Microbial Source Module Input (MSMInput) Ontological Metadata.
Name | Index 1 | Index 2 | Index 3 | Name | Index 1 | Index 2
Name | | | | | |
ManureForm | | | | | |
LandUseName | | | | Land Use | |
DomesticAnimalName | | | | Agricultural | |
WildLifeName | | | | Wildlife | |
UrbanizedName | | | | Urbanized | |
BuiltUpRate | Name | LandUseName | UrbanizedName | BuiltUpRate | Land Use | Urbanized
AnimalShedRateMass | DomesticAnimalName | | | MicrobeAnimalProductionRates | Agricultural |
AnimalConcMass | Name | DomesticAnimalName | | | |
SepticConc | Name | | | SepticConc | |
DieOffManure | Name | DomesticAnimalName | ManureForm | DieOff | |
WildLifeDensity | WildlifeName | LandUseName | | Density | Wildlife | Land Use
WildLifeShedRate | Name | WildlifeName | | MicrobeWildlifeProductionRates | Wildlife |
Note that the MSMInput ontological dictionary handles only one microbe; therefore, the microbe name is not required.
Selected Definitions with no one-to-one correspondence:
Name = Name of Microbe
ManureForm = Physical form of the manure (solid, liquid slurry, dry litter)
AnimalShedRateMass = Domestic animal shedding rate in mass of waste wet weight per time
AnimalConcMass = Microbial concentration based on mass of waste shed by domestic animal

•	Microbial Source Database - The Microbial Source Database contains a listing of microbial
source locations, animal numbers and types [agricultural animals (cattle, swine, poultry, etc.)
and wildlife (ducks, deer, etc.)], manure application schedules, grazing patterns, and point source
releases such as septic systems and treatment facilities, where applicable, by latitude-longitude,
and correlates them with land-use type (i.e., built-up/impervious, pasture, cropland, forest)
within subwatersheds delineated by the SDMProjectBuilder. A subwatershed is the smallest
area associated with watershed modeling. The SDMProjectBuilder extracts source-related
properties needed by the MSM from the Microbial Source Database, then transforms and
registers the information within the MSMInput ontological metadata dictionary (Table 3).
5.1 TRANSFORMATION OF LATITUDE-LONGITUDE COORDINATES TO SUBWATERSHED
DESIGNATIONS
A transformation of information is required because data produced and consumed by two components
(e.g., databases or modules) typically do not exactly align by name or indices. An example is the
conversion of Latitude-Longitude locations to subwatershed locations: Latitude-Longitude locations of
point sources, domestic animals, and septic systems must be correlated to specific subwatershed
locations (i.e., indexed to the subwatershed ID). A glossary of these external variables, used by the
SDMProjectBuilder to develop spatially based input for the MSM, is presented in Table 14. Table 15
presents a mapping of names and indices of the Microbial Source Database output, which is also the
SDMProjectBuilder input, to the SDMProjectBuilder output (SDMPBOutput, Table 3), which is also the
Microbial Source Module input (MSMInput, Table 3).
In the Microbial Source ontological metadata dictionary, each Latitude-Longitude coordinate has an
associated identifier; hence, Latitude-Longitude coordinates are indexed to these identifiers.
Spatial locations associated with input variables correlated to specific Latitude-Longitude
coordinates are therefore also indexed to the identifiers, not to the Latitude-Longitude coordinates.
A hypothetical example of locations of septic systems and locations and numbers of domestic
agricultural animals, based on Latitude and Longitude, is presented in Table 16. Latitude-Longitude
coordinates (i.e., AnimalLat-AnimalLong, PointLat-PointLong, SepticLat-SepticLong) are overlaid on
subwatershed polygons, and subsequently re-designated by Subwatershed (i) for the following input
variables.
The number of domestic agricultural animals, designated by Animal Location ID (s), corresponding to
latitude-longitude pairs (AnimalLat, AnimalLong) and domestic animal name (m) within a subwatershed
(i), is summed as follows:
NumberOfAnimals_{i,m} = \sum_{s within i} (AgriculturalAnimalNumber_{s,m})
(37)
where
•	NumberOfAnimals_{i,m} = Number of domestic agricultural animals (m) associated with
Subwatershed indexed by (i) (Number)
•	AgriculturalAnimalNumber_{s,m} = Number of domestic agricultural animals (m) associated with an
animal location ID (s), using designated latitude (AnimalLat) and longitude (AnimalLong)
(Number)

Table 14. Glossary of External Parameters/Variables that are used by the SDMProjectBuilder to develop
spatially based input for the Microbial Source Module [Alphabetic descriptors in parentheses refer to
the glossary presented in Table 9.]
Parameter/Variable Name | Parameter/Variable Description
AgriculturalAnimalNumber | Number of domestic agricultural animals associated with a designated latitude (AnimalLat) and longitude (AnimalLong)
AnimalLat | Latitude associated with a domestic animal location (AnimalLocationID) by Domestic Animal Name (Agricultural)
AnimalLong | Longitude associated with a domestic animal location (AnimalLocationID) by Domestic Animal Name (Agricultural)
BuiltUpArea | Area associated with each Urbanized land type (ℓ = 4), indexed by the Urbanized subcategories (r)
PointFlowInTime | Point source discharge time series to the stream associated with a designated latitude (PointLat) and longitude (PointLong) by Point Source Name (p) by time (PointTime) (Volume/Time)
PointLat | Latitude associated with a point source location by Point Source Name (p)
PointLong | Longitude associated with a point source location by Point Source Name (p)
PointMicrobeRateInTime | Microbial rate time series associated with the point source discharge by microbe (k) associated with a designated latitude (PointLat) and longitude (PointLong) by Point Source Name (p) by time (PointTime) (Microbial Counts/Time)
PointTime | Input time series associated with the point source flow and microbial concentration, expressed as Julian Days by Point Source Name (p) (Time)
SepticLat | Latitude associated with a septic system location by Septic ID (o)
SepticLong | Longitude associated with a septic system location by Septic ID (o)
SepticNumberByID | Number of septic systems by septic system ID (o)
Table 15. Mapping of Names and Indices of relevant parameters/variables in the Microbial Source Database output ontological metadata
dictionary to the Microbial Source Module Input (MSMInput) Ontological Metadata Dictionary
Columns 1-3: Naming Convention for Names and Indices of Microbial Source Database Output and SDMProjectBuilder Input Ontological Metadata.
Columns 4-7: Names and Indices of the SDMProjectBuilder Output (SDMPBOutput) and Microbial Source Module Input (MSMInput) Ontological Metadata.
Name | Index 1 | Index 2 | Name | Index 1 | Index 2 | Index 3
SepticID | | | | | |
PointSourceName | | | | | |
AnimalLocationID | | | | | |
AnimalLat | AnimalLocationID | DomesticAnimalName | Subwatershed | | |
AnimalLong | AnimalLocationID | DomesticAnimalName | | | |
AgriculturalAnimalNumber | AnimalLocationID | DomesticAnimalName | NumberOfAnimals | Subwatershed | Agricultural |
PointLat | PointSourceName | | Subwatershed | | |
PointLong | PointSourceName | | | | |
PointTime | PointSourceName | | (Time is indexed monthly) | | |
PointFlowInTime | PointSourceName | PointTime | PointFlow | Subwatershed | Agricultural | MonthID
PointMicrobeConc | PointSourceName | PointTime | PointMicrobeRate | Subwatershed | MonthID |
PointFlowInTime | PointSourceName | PointTime | | | |
SepticLat | SepticID | | Subwatershed | | |
SepticLong | SepticID | | | | |
SepticNumber | SepticID | | SepticNumber | Subwatershed | |
For a multiple name index containing a period (e.g., MicrobeProperty.DomesticAnimalName), the first name refers to the dictionary (e.g., MicrobeProperty), and the second is
the variable name (e.g., DomesticAnimalName), which means that the variable and its definition and contents are the same in both dictionaries (e.g., Microbial Source and
MicrobeProperty).
Selected Definitions with no one-to-one correspondence:
SepticID = Septic Identification
PointSourceName = Name associated with the point source.
AnimalLocationID = Identifier associated with the domestic animal location.
PointMicrobeConc = Microbial concentrations associated with the point source discharge to the stream each month, as a function of subwatershed (Subwatershed)
PointMicrobeRateInTime = Product of corresponding pairs of PointMicrobeConc and PointFlowInTime.

Table 16. Hypothetical Example of Locations and Numbers of Domestic Agricultural Animals, based on Latitude and Longitude
Columns 2-3: Source Location. Columns 4-10: Description and Number of Domestic Animals.
ID | Latitude | Longitude | Beef Cattle (BeefCow) | Dairy Cow (DairyCow) | Swine (Swine) | Horses (Horses) | Poultry (Poultry) | Sheep (Sheep) | Other Agricultural Animals (OtherAgAnimals)
137 | 44.22581737 | -88.03049124 | 21 | 0 | 0 | 5 | 0 | 0 | 0
1586 | 44.21140158 | -88.08726864 | 0 | 150 | 0 | 0 | 0 | 0 | 0
2134 | 44.09547271 | -88.02632371 | 0 | 60 | 0 | 0 | 0 | 0 | 0
26 | 44.23975806 | -88.09391664 | 0 | 2580 | 0 | 10 | 0 | 0 | 0
1880 | 44.19050723 | -88.12829326 | 0 | 0 | 0 | 0 | 0 | 50 | 0
543 | 44.22692777 | -88.18719257 | 0 | 0 | 0 | 0 | 0 | 0 | 65
20 | 44.19138897 | -88.13265345 | 0 | 0 | 0 | 0 | 4000 | 0 | 0
15 | 44.18322712 | -88.14603015 | 0 | 40 | 0 | 0 | 0 | 0 | 0
398 | 44.22670654 | -88.14615681 | 0 | 996 | 0 | 10 | 0 | 0 | 0
245 | 44.12400977 | -88.07510429 | 0 | 400 | 0 | 0 | 0 | 0 | 0
199 | 44.22709849 | -88.20934359 | 36 | 0 | 400 | 0 | 0 | 0 | 0
165 | 44.19682251 | -88.02294658 | 0 | 170 | 0 | 0 | 0 | 0 | 0
222 | 44.18450288 | -88.1445154 | 0 | 0 | 0 | 0 | 1500 | 0 | 0
333 | 44.21850831 | -88.00419665 | 0 | 0 | 1000 | 0 | 0 | 0 | 0
402 | 44.12405639 | -88.07181604 | 0 | 550 | 0 | 0 | 0 | 0 | 0
50986 | 44.23873521 | -88.1833881 | 150 | 0 | 0 | 0 | 0 | 0 | 0
35479 | 44.17716655 | -88.03138565 | 0 | 0 | 800 | 0 | 0 | 0 | 0
3456 | 44.21836008 | -88.1341596 | 0 | 0 | 0 | 0 | 3500 | 0 | 0
1234 | 44.20785446 | -88.06378785 | 0 | 0 | 0 | 0 | 0 | 250 | 0
85656 | 44.22696627 | -88.18968199 | 0 | 0 | 0 | 0 | 0 | 300 | 0
7864 | 44.21111207 | -88.0748307 | 0 | 0 | 0 | 0 | 0 | 0 | 50

•	AnimalLat = Latitude associated with a domestic animal location (s) by Domestic Animal Name
(m) (Coordinates)
•	AnimalLong = Longitude associated with a domestic animal location (s) by Domestic Animal
Name (m) (Coordinates)
•	i = index on Subwatershed
•	m = index on domestic animal name
•	s = index on Animal Location ID
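The overlay-and-sum step in Equation (37) can be sketched with a simple point-in-polygon test. This is only an illustration under stated assumptions: the SDMProjectBuilder performs the overlay with its own GIS tools, the two rectangular subwatershed polygons ("A" and "B") are hypothetical, and the four animal locations are taken from Table 16. The same pattern would apply to septic systems and point sources.

```python
from collections import defaultdict

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for j in range(n):
        x1, y1 = polygon[j]
        x2, y2 = polygon[(j + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def number_of_animals(locations, subwatersheds):
    """Equation (37): sum animal counts over locations (s) within each subwatershed (i)."""
    totals = defaultdict(lambda: defaultdict(int))
    for loc in locations:
        for sub_id, polygon in subwatersheds.items():
            if point_in_polygon(loc["lon"], loc["lat"], polygon):
                for animal, count in loc["animals"].items():
                    totals[sub_id][animal] += count
                break  # each location belongs to one subwatershed
    return totals

# Hypothetical rectangular subwatersheds as (longitude, latitude) vertices.
subwatersheds = {
    "A": [(-88.25, 44.05), (-88.10, 44.05), (-88.10, 44.30), (-88.25, 44.30)],
    "B": [(-88.10, 44.05), (-88.00, 44.05), (-88.00, 44.30), (-88.10, 44.30)],
}
# Four rows from Table 16 (IDs 137, 199, 50986, 1586).
locations = [
    {"lat": 44.22581737, "lon": -88.03049124, "animals": {"BeefCow": 21, "Horses": 5}},
    {"lat": 44.22709849, "lon": -88.20934359, "animals": {"BeefCow": 36, "Swine": 400}},
    {"lat": 44.23873521, "lon": -88.1833881, "animals": {"BeefCow": 150}},
    {"lat": 44.21140158, "lon": -88.08726864, "animals": {"DairyCow": 150}},
]
totals = number_of_animals(locations, subwatersheds)
print({sub: dict(animals) for sub, animals in totals.items()})
```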
The point sources, designated by Point Source Name (p) and corresponding to Latitude-Longitude pairs
(PointLat, PointLong), are mapped to a subwatershed (i), and the monthly average flow rates and
microbial flux rates to the stream are computed as follows:
PointFlow_{i,q} = MonthlyAverage [PointFlowInTime_{p,PointTime}]
(38)
PointMicrobeRate_{k,i,q} = MonthlyAverage [PointMicrobeRateInTime_{k,p,PointTime}]
(39)
where
•	PointFlowInTime_{p,PointTime} = Point source discharge time series to the stream associated with a
designated latitude (PointLat) and longitude (PointLong), as defined by Point Source Name (p)
by time (PointTime) (Volume/Time)
•	PointMicrobeRateInTime_{k,p,PointTime} = Microbial rate time series associated with the point source
discharge by microbe (k), associated with a designated latitude (PointLat) and longitude
(PointLong), as defined by Point Source Name (p) by time (PointTime) (Microbial
Counts/Time)
•	PointLat = Latitude associated with a point source location by Point Source Name (p)
(Coordinates)
•	PointLong = Longitude associated with a point source location by Point Source Name (p)
(Coordinates)
•	PointTime = Input time series associated with the point source flow and microbial
concentration, expressed as Julian Days, by Point Source Name (p) (Time)
•	k = index on Microbe Name (Name)
•	p = index on Point Source Name (PointSourceName)
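The monthly averaging in Equations (38) and (39) might look like the sketch below. Several things here are assumptions rather than documented behavior: PointTime is interpreted as a day-of-year number, a reference year is needed to map days to months, and the average is a simple arithmetic mean of the samples that fall in each month.

```python
from collections import defaultdict
from datetime import date, timedelta

def monthly_average(series, year=2015):
    """Average (day_of_year, value) samples by calendar month.

    `year` is a hypothetical reference year used only to map days to months.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day_of_year, value in series:
        month = (date(year, 1, 1) + timedelta(days=day_of_year - 1)).month
        sums[month] += value
        counts[month] += 1
    return {month: sums[month] / counts[month] for month in sums}

# Invented point source flow samples: (day of year, flow in Volume/Time).
point_flow_in_time = [(1, 10.0), (31, 20.0), (32, 40.0)]
point_flow = monthly_average(point_flow_in_time)
print(point_flow)
```

A time-weighted mean would be more appropriate if the samples were unevenly spaced; the arithmetic mean is used here only to keep the sketch short.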
The number of septic systems, designated by Septic ID (o), corresponding to latitude-longitude pairs
(SepticLat, SepticLong) within a subwatershed (i), is summed as follows:
SepticNumber_{i} = \sum_{o within i} (SepticNumberByID_{o})
(40)
where
•	SepticNumber_{i} = Number of septic systems associated with Subwatershed (i) (Number)
•	SepticNumberByID_{o} = Number of septic systems by septic system ID (o) (Number)
•	SepticLat = Latitude associated with a septic system location by Septic ID (o) (Coordinates)
•	SepticLong = Longitude associated with a septic system location by Septic ID (o) (Coordinates)
•	o = index on SepticID

AreaFraction_{i,ℓ=4,r} = (BuiltUpArea_{i,ℓ=4,r}) / (Area_{i,ℓ=4})
(41)
where
•	AreaFraction_{i,ℓ=4,r} = Fraction of Urbanized land type (ℓ = 4), which is a subset of the Land Use Type
(ℓ), indexed to the four Urbanized subcategories (r) (Commercial and Service; Residential; Mixed
Urban; Transportation, Communication, and Utilities), by Subwatershed (i) (Ratio)
•	BuiltUpArea_{i,ℓ=4,r} = Area associated with each Urbanized land type (ℓ = 4), indexed by the
Urbanized subcategories (r) (Area)
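A small sketch of Equation (41) for one subwatershed follows. The function name and the area values are hypothetical, and the total Urbanized area is passed in separately because the section does not state whether it is always the sum of the four subcategory areas.

```python
def area_fractions(built_up_area_by_subcategory, urbanized_area):
    """Equation (41): fraction of Urbanized area in each subcategory (r)."""
    return {r: area / urbanized_area
            for r, area in built_up_area_by_subcategory.items()}

# Invented areas (ac) for the four Urbanized subcategories in one subwatershed.
built_up_area = {
    "Commercial and Service": 50.0,
    "Residential": 120.0,
    "Mixed Urban": 20.0,
    "Transportation, Communication, and Utilities": 10.0,
}
fractions = area_fractions(built_up_area, urbanized_area=200.0)
print(fractions)
```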
5.2 WORKFLOW
SDMProjectBuilder/D4EM accesses, retrieves, analyzes, and caches web-based data; delineates the
basin into subwatersheds; consumes source location data and overlays source locations as a map layer
to identify subwatersheds which correspond to various source locations; consumes microbial properties
data; and automatically pre-populates input needs of the fate and transport watershed model. The
MSM develops microbial loadings, adjusted for die-off, to the overland (e.g., Counts/ha/d)
subwatershed areas by land use and instream (e.g., Counts/hr) locations within a watershed. HSPF
simulates flow and microbial fate/transport within a watershed. BASINS (EPA, 2013b, 2013c) provides a
user interface and visualization tool for HSPF. FVCOM (Chen et al., 2003, 2006a, 2006b) accounts for the
unsteady, wind-induced water surface changes within waterbody networks. MRA-IT (Soller et al., 2008,
2004) characterizes human-health risk from ingesting water contaminated with pathogens. Each
component within the workflow has its own input, output, and computed variable ontological metadata
dictionaries.
5.3 ONTOLOGICAL RELATIONSHIPS BETWEEN VARIABLES, EQUATIONS, AND COMPONENTS
Using the ontological metadata dictionaries, a definitive relationship can be established between
variables, components within the workflow, equations, metadata, and assumptions. Even when a
variable is shared between multiple components (e.g., SDMProjectBuilder and Microbial Source
Module), relationships can be established using an approach similar to a Resource Description
Framework (RDF) triple (Price, 2" ). An RDF format is the standard for encoding metadata and other
knowledge on the semantic web ;t b, 2014), and an RDF triple consists of the 1) subject that
identifies the object the triple is describing, 2) predicate that defines the piece of data in the object to
which we are giving a value, and 3) object that is the actual value. In other words, a subject and an
object are linked by a predicate. For our purposes, Elag and Goodall (2013) describe this as a "3-ary"
because neither the equation, variable (symbol), nor component can be considered the primary subject.
Expanded examples of a "3-ary" are presented in Figures 6 and 7; Figure 6 illustrates the relationships
between the variable "AreaFraction" and the components (SDMProjectBuilder and Microbial Source
Module) and equations that define and use it (Equations 41 and 23, respectively). The ontological
metadata associated with AreaFraction is provided in Table 6. Figure 6 illustrates that both equations
and components link to the same variable, through which the metadata and assumptions are described.
Interesting features that are captured include 1) linkage of two different modules (SDMProjectBuilder
and Microbial Source Module); 2) definition of the variable in one module which registers it as output
(SDMProjectBuilder), and consumption of the same variable in another module as boundary-condition input
(Microbial Source Module); and 3) demonstration of how one accounts for input as "module-specific"
(MSMInput ontological dictionary) and as a boundary condition (SDMPBOutput ontological dictionary),
where this is an example of the boundary condition case. Since multiple modules could be linked to this
variable, the relationships shown in this figure are not necessarily limited to two modules.
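The "3-ary" relationship described above can be sketched as a small set of subject-predicate-object triples. This is only an illustration under stated assumptions: the predicates `isDefinedInComponent`, `isDefinedByEquation`, `isUsedInComponent`, and `isUsedByEquation`, and the helper functions, are hypothetical names introduced here, whereas `isDefinedInDictionary` and `hasUnitsOf` are taken from Figure 6.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Facts about AreaFraction taken from the text and Figure 6.
# Predicates ending in "Component" or "Equation" are hypothetical names.
TRIPLES = [
    Triple("AreaFraction", "isDefinedInComponent", "SDMProjectBuilder"),
    Triple("AreaFraction", "isDefinedByEquation", "Equation 41"),
    Triple("AreaFraction", "isUsedInComponent", "Microbial Source Module"),
    Triple("AreaFraction", "isUsedByEquation", "Equation 23"),
    Triple("AreaFraction", "isDefinedInDictionary", "SDMPBOutput"),
    Triple("AreaFraction", "hasUnitsOf", "Fraction"),
]


def objects_of(subject: str, predicate: str, triples=TRIPLES) -> list[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


def subjects_linked_to(obj: str, triples=TRIPLES) -> set[str]:
    """Return every subject that any predicate links to `obj`."""
    return {t.subject for t in triples if t.obj == obj}


if __name__ == "__main__":
    print(objects_of("AreaFraction", "isUsedInComponent"))
    print(subjects_linked_to("SDMPBOutput"))
```

Because the variable is the shared node, the same list can be queried starting from a component, an equation, or a dictionary, which mirrors the point that none of the three is the primary subject.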
[Figure 6 diagram content]
Variable: AreaFraction
Variable Defined: SDMProjectBuilder, Equation (41): AreaFraction(i, l=4, r) = (BuiltUpArea(i, l=4, r)) / ...
Variable Used: Microbial Source Module, Equation (23): AccumBuiltUpRate = [(AreaFraction(i, l=4, r)) (BuiltUpRate ...
Ontological metadata:
isDefinedInDictionary = SDMPBOutput
isIndexedInDictionary = SDMPBOutput
isAFunctionOf = Subwatershed, Landuse, Urbanized
isDefinedAs = Fraction of subwatershed Urbanized area contributed by the four Urbanized area types
hasMeasureOf = Ratio
hasUnitsOf = Fraction
hasMinimumValueOf = 0.001
hasMaximumValueOf = 1.0
hasDimensionsOf = 3
hasDataTypeOf = Float
isAPrimaryKey = False
isScaler = True
IsDesignatedAsStochastic = True
isParameterType = Independent
isParameterFunction = Output
useInEquationType = Algebraic
isDocumentedIn = Whelan et al. (2015)
isImpactedByAssumptions = 1,3-9,11
Figure 6. Relationships between the Variable "AreaFraction" and Components, Equations, Metadata,
and Assumptions
[Figure 7 diagram content]
Variable: TotalGrazeDays
Variable Defined: Microbial Source Module, Equation (8)
Variable Used: Microbial Source Module, Equations (7, 16, 17)
Ontological metadata:
isDefinedInDictionary = SDMPBOutput
isIndexedInDictionary = SDMPBOutput
isAFunctionOf = Subwatershed, Agricultural
isDefinedAs = Number of domestic animals per subwatershed
hasMeasureOf = Unitless
hasUnitsOf = Number
hasMinimumValueOf = 0
hasMaximumValueOf = 1.0E+38
hasDimensionsOf = 1
hasDataTypeOf = Float
isAPrimaryKey = False
isScaler = True
IsDesignatedAsStochastic = True
isParameterType = Independent
isParameterFunction = Output
useInEquationType = Algebraic
isDocumentedIn = Whelan et al. (2015)
isImpactedByAssumptions = 3-9
Figure 7. Relationships between the Variable "TotalGrazeDays" and Components, Equations, Metadata,
and Assumptions
This type of "3-ary" can be applied to all registered variables. For example, Figure 7 presents a "3-ary"
for an internal variable which is defined and used wholly within the MSM (i.e., TotalGrazeDays). Its
interesting features include the variable being defined and used in the same module, and its indices
(SDMPBOutput ontological dictionary) being associated with a different ontological dictionary than where
the variable is registered (MSMInternalVariables ontological dictionary). In summary, model developers
are able to focus on the model code itself rather than linkages between components by expressing
model and variable descriptions and assumptions within the ontology.
5.4 MAPPING THE MICROBIAL SOURCE MODULE ONTOLOGICAL METADATA
DICTIONARY TO AN EXTENSIBLE MARKUP LANGUAGE (XML) DOCUMENT
The concept of an ontological framework for documenting science software "products" (e.g.,
components, models, databases, assessments, etc.) lends itself to describing knowledge about the
product and relationships between product concepts (see Table 1, for example). A science-based
software product communicates science theory and software usability. Traditionally this was done
through text-based documentation, but technology is moving toward digitized formats that facilitate
not only product understanding but also automated discovery, evaluation (for a purpose), and
integration (with other products). To achieve digitized documentation for communication, discovery,
evaluation, and integration, an ontological framework provides a structured and possibly standardized
way to combine data, taxonomy, and relationships among concepts and data. Hence, the WRC ontology
framework described by Elag and Goodall (2013, 2012) encompasses many elements (data, taxonomy,
concepts, and relationships) in one format such as the Web Ontology Language (OWL).
Spreadsheets (illustrated by Tables 3, 4, and 11) combine essential variables with metadata and intra-
and inter-parameter relationships between variables (see Figures 6 and 7). They have traditionally been
used because they are intuitive and most software developers are comfortable with them, although
they are not the only format that could capture the ontological metadata. By agreeing on a format to
express data exchange, tools can be developed that facilitate the transition to higher-level ontological
frameworks (e.g., OWL). With standardization, for example, user-friendly graphical user interfaces
(GUIs) can be developed from a spreadsheet to capture ontological metadata, as illustrated in Figure 8
with the FRAMES Dictionary Editor (Whelan et al., 2014a). Likewise, spreadsheet-based ontological
metadata can be easily converted to GUIs, as illustrated by the Dictionary Registration Tool (Pelton,
2009), resulting in interchangeable forms of the same ontological metadata (i.e., spreadsheet to GUI and
vice versa).
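The interchange between a spreadsheet and another representation of the same ontological metadata can be sketched as a simple round trip. The sketch below assumes a CSV export whose column headings reuse the predicate names shown in Figures 6 and 7; that column layout and the function names `load_dictionary` and `dump_dictionary` are assumptions made for illustration, not the format used by the FRAMES tools.

```python
import csv
import io

# A spreadsheet exported as CSV. The column layout is an assumption for this
# sketch; the values are those listed in Figures 6 and 7.
SHEET = """\
Variable,hasUnitsOf,hasMinimumValueOf,hasMaximumValueOf,hasDataTypeOf
AreaFraction,Fraction,0.001,1.0,Float
TotalGrazeDays,Number,0,1.0E+38,Float
"""

NUMERIC_COLUMNS = ("hasMinimumValueOf", "hasMaximumValueOf")


def load_dictionary(text: str) -> dict[str, dict]:
    """Read CSV text into {variable name: metadata}, converting bounds to float."""
    dictionary = {}
    for row in csv.DictReader(io.StringIO(text)):
        name = row.pop("Variable")
        for column in NUMERIC_COLUMNS:
            row[column] = float(row[column])
        dictionary[name] = row
    return dictionary


def dump_dictionary(dictionary: dict[str, dict]) -> str:
    """Write the metadata back out as CSV text (the reverse direction)."""
    fieldnames = ["Variable", *next(iter(dictionary.values())).keys()]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for name, metadata in dictionary.items():
        writer.writerow({"Variable": name, **metadata})
    return buffer.getvalue()


if __name__ == "__main__":
    loaded = load_dictionary(SHEET)
    print(loaded["AreaFraction"])
    print(dump_dictionary(loaded))
```

Once the metadata is held in a neutral in-memory form, a GUI or an ontology editor export could be generated from the same structure, which is the sense in which the forms are interchangeable.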
Moving toward controlled vocabularies to name data elements and associated metadata, standardized
tools can facilitate linking controlled vocabularies (and definitions) to individual software product
digitized formats. Coupled with taxonomy (classification of concepts) relative to science software, a
more complete ontology can be documented and tools developed to compare, merge, and produce
such files and formats, as illustrated by expression of the WRC ontology using Protege (Protege, 2014).
Figure 9a illustrates interchangeable forms describing ontological metadata (or schema) between
spreadsheets, GUIs, and ontology editors.
When coupled with input values, data transfer with metadata can ensure proper quality control within
and between components; not only is the value known, but its metadata (description, units, range,
relationships to other parameters/variables, etc.) accompanies it. Standardization facilitates multiple
formats for expressing values with their ontological metadata (Figure 9b). For example, it allows
standardized user interfaces to deliver input to or produce data from models. Figure 10 illustrates
the user interface for FRAMES's Data Client Editor (DCE, 2010), which captures three input
variables (TimePts, CumMass, and TotalFlux) with metadata and values associated with the
ChemAquiferTotalFlux dictionary. This information can be easily converted into a flat file (e.g., csv, txt)
or expressed electronically, as illustrated by the MSM output captured in the Extensible Markup
Language (XML) in Figure 11 (see bottom third of figure).
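The quality-control idea above, in which a value travels together with its metadata, can be sketched as a range and units check. The bounds and units below are those listed for AreaFraction in Figure 6; the class `VariableMetadata` and the function `check_value` are hypothetical names introduced for this sketch and are not part of the MSM or FRAMES software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableMetadata:
    name: str
    units: str
    minimum: float
    maximum: float


# Bounds and units as listed for AreaFraction in Figure 6.
AREA_FRACTION = VariableMetadata("AreaFraction", "Fraction", 0.001, 1.0)


def check_value(meta: VariableMetadata, value: float, units: str) -> list[str]:
    """Return a list of quality-control problems; an empty list means the value passes."""
    problems = []
    if units != meta.units:
        problems.append(f"{meta.name}: expected units {meta.units!r}, got {units!r}")
    if not meta.minimum <= value <= meta.maximum:
        problems.append(
            f"{meta.name}: {value} is outside [{meta.minimum}, {meta.maximum}]"
        )
    return problems


if __name__ == "__main__":
    print(check_value(AREA_FRACTION, 0.25, "Fraction"))
    print(check_value(AREA_FRACTION, 1.5, "Percent"))
```

A receiving component could run such a check on every incoming value before use, so that a units mismatch or out-of-range value is caught at the boundary between components.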
[Figure 8 is a screenshot of the FRAMES Dictionary Editor. The left pane lists registered dictionaries
(e.g., ChemSurfaceWaterDissolvedConc, ChemSurfaceWaterTotalFlux, ChemToxicity). The right pane shows the
properties of the variable "Conc": Dimension = 3, Data Type = Float, Maximum = 1E+38, Unit of Measure =
Mass/Volume (mg/L), Stochastic = True, and a list of indices that includes SurfaceWaterPoints.Feature
and ChemList.CASID.]
Figure 8. Example of Ontological Metadata captured by the FRAMES Dictionary (DIC) Editor (Whelan et
al., 2014a)
[Figure 9 is a diagram. Panel (a) links three interchangeable forms of the schema: Database (e.g., flat
file, spreadsheet), Graphical User Interface (e.g., DIC Editor), and Ontology Editor (e.g., Protege).
Panel (b) links three interchangeable forms of input/output values: Graphical User Interface (e.g., DCE),
Database (e.g., flat file, spreadsheet), and Electronic Format (e.g., XML).]
Figure 9. Interchangeable forms (a) describing ontological metadata (or schema) and (b) documenting
instances of a dataset related to the schema (After Whelan et al., 2015)
[Figure 10 is a screenshot of the DCE (Data Client Editor). The ChemAquiferTotalFlux dictionary is
expanded to show TimePts, CumMass, and TotalFlux. The property panel for "TimePts - Flux time point"
lists Minimum = 0, Maximum = 100000000, Unit = yr, Type = FLOAT, Scalar = No, and Dimension = 3. The
data grid shows editable values and units for each variable (e.g., yr for TimePts and g/yr for TotalFlux).]
Figure 10. Standardized Graphical User Interface for FRAMES's Data Client Editor (DCE, 2010), capturing
Ontological Metadata and Values for Three Variables (TimePts, CumMass, and TotalFlux) associated
with the ChemAquiferTotalFlux Dictionary
XML is 1) a standard or set of rules that governs encoding of documents into an electronic format
(Difference Between, 2014) that is human- and machine-readable; 2) a textual data format with strong
support via Unicode for different human languages; and 3) widely used to represent arbitrary data
structures such as those in web services (XML Wikipedia, 2014). An XML document captures rules in a
readable form and is compared against the XML schema definition (XSD) developed to execute the web
service. The purpose of the comparison is to verify the syntax and validate the structure of the XML
document. The
purpose of this section is to illustrate how an MSM ontological metadata dictionary maps to its
corresponding XML document, so the model can be executed as a web service.
There is a logical, natural mapping of MSM variables to an XML document. For example, Appendix A
captures the MSM input variables and metadata directly with the MSM XML document. The only
metadata directly captured by the XML document are names, units, and indices, including each
parameter's hierarchical relationships between indices. The XML document also captures the value for
each variable and lets the user include comments/explanations. Input parameter/variable names are
easily mapped from an ontological dictionary such as Table 3 to an XML document. The
parameter/variable name represents the lowest level in the hierarchy, telescoping from the highest
mapped index (Index 1) to the lowest index such as Index 3. Figure 12 illustrates how the metadata for
input variables "Area" and "AreaFraction" are mapped to the XML document.
[Figure 11 is a screenshot of the MSM graphical user interface. Top third: the input data files
(subwatershed areas, FC production rates, agricultural animal counts, wildlife densities, manure
application, grazing days, septics data, die-off rates, and point sources), with the option to run the
module from an XML input file. Middle third: a table of accumulated loadings by SubWatershed ID (e.g., P2
through P5) with monthly columns (JanAccum, FebAccum, MarAccum, and so on). Bottom third: the same output
expressed in XML.]
Figure 11. Microbial Source Module Graphical User Interface: 1) input data files (top third), 2) tabular
form of output results, values (middle third), and 3) XML-based output results, values and ontological
metadata (bottom third).
"Area" is a function of the Subwatershed, of which a watershed contains one or more subwatersheds,
and LandUse, of which there are four types (Cropland, Pasture, Forest, and Urbanized). For each
subwatershed, therefore, an area is assigned to each land use type (see Figure 12). In addition to being a
function of Subwatershed and LandUse, "AreaFraction" is a function of the Urbanized land use type, of
which there are four (CommercialAndServices, Residential, MixedUrban, and
TransportationCommunicationUtilities). "AreaFraction," therefore, is captured in the XML document
under Subwatershed, LandUse, and Urbanized for each Urbanized land use type, with Figure 12
illustrating the telescoping indices. A similar procedure can be followed when mapping the remaining
MSM input variables listed in Table 3, as illustrated by Appendix A. Using this template, the MSM
ontological metadata output in Table 4 can also be captured in an XML document, as illustrated in
Appendix B.
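The telescoping of indices described above can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names used here (`Subwatershed`, `LandUse`, `Urbanized`, `id`, `type`) are assumptions chosen to mirror the index names in the text, and the pairing of the Figure 12 area values with particular land use types is also assumed; the actual tag names are those of the MSM XML document in Appendix A.

```python
import xml.etree.ElementTree as ET

URBANIZED_TYPES = [
    "CommercialAndServices",
    "Residential",
    "MixedUrban",
    "TransportationCommunicationUtilities",
]


def build_subwatershed(sub_id: str, areas: dict[str, float],
                       fractions: dict[str, float]) -> ET.Element:
    """Nest Area under Subwatershed/LandUse and AreaFraction one level deeper."""
    sub = ET.Element("Subwatershed", {"id": sub_id})            # Index 1
    for land_use, area in areas.items():
        lu = ET.SubElement(sub, "LandUse", {"type": land_use})  # Index 2
        ET.SubElement(lu, "Area").text = str(area)
        if land_use == "Urbanized":
            for urban_type in URBANIZED_TYPES:                  # Index 3
                urb = ET.SubElement(lu, "Urbanized", {"type": urban_type})
                ET.SubElement(urb, "AreaFraction").text = str(fractions[urban_type])
    return sub


def area_fraction(sub: ET.Element, urban_type: str) -> float:
    """Walk the telescoping indices back down to one AreaFraction value."""
    path = f"LandUse[@type='Urbanized']/Urbanized[@type='{urban_type}']/AreaFraction"
    return float(sub.find(path).text)


# Illustrative values from Figure 12; which area belongs to which land use
# type is an assumption of this sketch.
example = build_subwatershed(
    "P1",
    {"Cropland": 40.8, "Pasture": 480.0, "Forest": 48.4, "Urbanized": 403.1},
    {urban_type: 0.25 for urban_type in URBANIZED_TYPES},
)

if __name__ == "__main__":
    print(ET.tostring(example, encoding="unicode"))
```

The variable name is always the innermost element, so a reader of the document can recover a value by following its indices from the highest level to the lowest, as `area_fraction` does.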
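[Figure 12 is an annotated excerpt of the XML document in Appendix A. Callouts mark Index 1
(Subwatershed, e.g., P1), Index 2 (LandUse), Index 3 (Urbanized), and the Comment field. "Area" appears
under Index 2 with the values 40.8, 480.0, 48.4, and 403.1, and "AreaFraction" appears under Index 3 with
the value 0.25 for each of its four entries.]
Figure 12. Mappings to the XML document (See Appendix A) of the Metadata for Input Variables "Area"
and "AreaFraction" (see Table 6)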
DISCLAIMER
This document has been reviewed in accordance with U.S. Environmental Protection Agency policy and
approved for publication.
REFERENCES
Ames, D., 2010. MapWindow: Moving MapWindow 6 to DotSpatial.
 (last accessed 21.09.14).
API, 2015. Application Programming Interface.
 (last accessed 04.02.15).
Babendreier, J.E. 2010. QA/QC Management Protocols for iemTechnologies Software Development:
Summary of Guiding Principles and Best Practices for Quality Assurance in Modeling. U.S. Environmental
Protection Agency, National Exposure Research Laboratory, Ecosystems Research Division, Athens, GA
(March).
Babendreier, J.E., Castleton, K.J., 2005. Investigating uncertainty and sensitivity in integrated,
multimedia environmental models: Tools for FRAMES-3MRA. Environ. Modell. Softw. 20 (8), 1043-1055.
Bicknell, B.R., Imhoff, J.C., Kittle, Jr., J.L., Jobes, T.H., Donigian, Jr., A.S. 2005. HSPF Version 12.2 User's
Manual. AQUA TERRA Consultants, Mountain View, CA.
Bicknell, B.R., Imhoff, J.C., Kittle, J.L., Donigian, A.S., Jr., Johanson, R.C., 1997. Hydrological simulation
program - FORTRAN, user's manual for version 11. EPA/600/R-97/080, U.S. Environmental Protection
Agency, Athens, GA, 755 p.
Chen, C., Beardsley, R.C., Cowles, G., 2006a. An unstructured grid, finite-volume coastal ocean model
(FVCOM) system, Special Issue entitled "Advances in Computational Oceanography". Oceanography 19
(1), 78-89.
Chen, C., Cowles, G., Beardsley, R.C., 2006b. An unstructured grid, finite-volume coastal ocean model:
FVCOM user manual, SMAST/UMASSD Technical Report-06-0602, Second edition. 315p.
Chen, C., Liu, H., Beardsley, R., 2003. An unstructured grid, finite-volume, three dimensional, primitive
equations ocean model: application to coastal ocean and estuaries. J. Atmos. Oceanic Tech. 20 (1), 159-
186.
DCE, 2010. FRAMES Downloads, EarthDomainDocumentation.zip. Data Client Editor
(DCE)_0710.ppt. http://iemhub.org/resources/133/supportingdocs (last accessed 21.03.15).
Difference Between, 2014. Difference Between XML and XSD. Difference Between.net

(last accessed 03.08.14).
Difference Between. 2012. Difference Between Variable and Parameter
 (last accessed
15.03.15).
Elag, M.M. and J.L. Goodall. 2013. An ontology for component-based models of water resource systems.
Water Resour Res, 49(8), 5077-5091.
Elag, M.M. and J.L. Goodall. 2012. Design and Application of an Ontology for Component-Based
Modeling of Water Systems, Abstract IN11B-1470. Presented at 2012 Fall Meeting, AGU, San Francisco,
Calif., December 3-7.
EPA (U.S. Environmental Protection Agency), 2010. Quantitative microbial risk assessment to estimate
illness in freshwater impacted by agricultural animal sources of fecal contamination, EPA 822-R-10-005,
Office of Water, Washington DC.
EPA (U.S. Environmental Protection Agency), 2013a. Data for Environmental Modeling (D4EM). Office of
Research and Development, Athens, GA. http://www.epa.gov/AthensR/research/d4em.html (last
accessed 10.02.14.).
EPA (U.S. Environmental Protection Agency), 2013b. BASINS/HSPF Training, Exercise 10 - Bacterial and
temperature modeling. http://water.epa.gov/scitech/datait/models/basins/upload/Exercise-10-
Bacteria-and-Temperature.pdf (last accessed 23.02.14.).
EPA (U.S. Environmental Protection Agency), 2013c. BASINS user information and guidance, BASINS
tutorials and training. http://water.epa.gov/scitech/datait/models/basins/userinfo.cfm#tutorials (last
accessed 23.02.14.).
EPA (U.S. Environmental Protection Agency) and USDA (U.S. Department of Agriculture/Food Safety and
Inspection Service), 2012. Microbial risk assessment guideline: Pathogenic microorganisms with focus on
food and water. EPA/100/J-12/001; USDA/FSIS/2012-001. Washington, DC.
GitHub. 2014. What is RDF and what is it good for?  (last accessed 02.08.14).
Haas, C.N., Rose, J.B., Gerba, C.P., 1999. Quantitative microbial risk assessment. John Wiley & Sons, Inc.
New York. 449 p.
Hunter, P.R., Payment, P., Ashbolt, N., Bartram. J., 2003. Chapter 3. Assessment of risk. In: Ronchi, E.,
Bartram, J. (Eds.). Assessing microbial safety of drinking water: Improving approaches and methods.
OECD/WHO guidance document. OECD/WHO, Paris, pp. 79-109.
IBM. 2006. What is a software architecture?
 (Last accessed 17.07.14).
IEEE. 2000. IEEE Recommended Practice for Architectural Description of Software-Intensive Systems:
IEEE Std 1471-2000. IEEE Computer Society.
JCGM. 2008. International vocabulary of metrology — Basic and general concepts and associated terms
(VIM), JCGM 200, Joint Committee for Guides in Metrology, Bureau International des Poids et Mesures,
Sevres Cedex, France,  (last accessed
22.03.15.)
Johnston, J. M., McGarvey, D.J., Barber, M.C., Laniak, G.F., Babendreier, J.E., Parmar, R., Wolfe, K.,
Kraemer, S.R., Cyterski, M., Knightes, C., Rashleigh, B., Suarez, L., Ambrose, R., 2011. An integrated
modeling framework for performing environmental assessments: Application to ecosystem services in
the Albemarle-Pamlico basins (NC and VA, USA). Ecol. Model. 222 (14), 2471-2484.
Kashyap, V. and A.P. Sheth. 2000. Information Brokering across Heterogeneous Digital Data: A
Metadata-based Approach, Kluwer Academic Publishers, Boston, MA
Kim, K., Price, K., Whelan, G., Galvin, M., Wolfe, K., Duda, P., Gray, M., Pachepsky, Y., 2014. Using
remote sensing and radar meteorological data to support watershed assessments comprising integrated
environmental modeling. In: Ames, D.P., Quinn, N. (Eds.), Proceedings of the 2014 International
Congress on Environmental Modelling and Software, San Diego, CA USA.
Laniak, G.F. 2012. Environmental Software Reuse and Interoperability: An Initial Review and
Recommendations for the Office of Research and Development. Internal Report, Ecosystems Research
Division, National Exposure Research Laboratory, Office of Research and Development, U.S.
Environmental Protection Agency, Athens, Georgia, December 31, 2012 (Final Draft).
Laniak, G.F., G. Olchin, J. Goodall, A. Voinov, M. Hill, P. Glynn, G. Whelan, G. Geller, N. Quinn, M. Blind, S.
Peckham, S. Reaney, N. Gaber, R. Kennedy, A. Hughes. 2013. Integrated Environmental Modeling: A
Vision and Roadmap for the Future. Environ Modell Softw, 39:3-23.
MapWindow, 2011. MapWindow 4.  (last accessed 15.04.13.).
MapWindow, 2013. MapWindow 6.  (last accessed 15.04.13.).
Meersman R. and L. Mark (eds). 1997. Database Application Semantics. Chapman and Hall.
Morsey, M.M., J.L. Goodall, C. Bandaragoda, A.M. Castronova, J. Greenberg. 2014. Metadata for
Describing Water Models. International Environmental Modelling and Software Society (iEMSs), 2014
International Congress on Environmental Modelling and Software, Bold Visions for Environmental
Modeling, Sixth Biennial Meeting, San Diego, CA USA.
Pelton, M.A. 2009. Requirements, Design, and Specifications for Excel Dictionary Registration Tool (DRT).
PNWD-4141, Battelle, Pacific Northwest Division, Richland, WA.
Price, R. 2004. What Is An RDF Triple?
 (last accessed
02.08.14).
Protege. 2014. Protege 4 User's Guide.  (last accessed
22.03.15.)
REST (Representational State Transfer). 2015.
 (Last accessed 01.05.15).
Sheth, A.P. 2001. Changing Focus on Interoperability in Information Systems: From System, Syntax,
Structure To Semantics. In: M.F. Goodchild, M.J. Egenhofer, R. Fegeas, and C.A. Kottman (eds).
Interoperating Geographic Information Systems, Kluwer Academic Publishers, 25 pp.
SOA, 2015. Service-oriented architecture 
(last accessed 04.02.15).
Soller, J.A., Schoen, M.E., Bartrand, T., Ravenscroft, J., Ashbolt, N.J., 2010. Estimated human health risks
from exposure to recreational waters impacted by human and non-human sources of faecal
contamination. Water. Res. 44 (16), 4674-4691.
Soller, J.A., Seto, E., Olivieri, A.W., 2008. Microbial risk assessment interface tool: User documentation.
Water Environmental Research Foundation, Alexandria, VA.
Soller, J.A., Olivieri, A.W., Eisenberg, J.N.S., Sakaji, R., Danielson, R., 2004. Evaluation of microbial risk
assessment techniques and applications. 00-PUM-3. Water Environmental Research Foundation,
Alexandria, VA.
Uschold, M., and M. Gruninger (1996), Ontologies: Principles, methods and applications, Knowl. Eng.
Rev., 11(2), 93-136.
W3C. 2013. Web Ontology Language (OWL)  (Last accessed
01.05.15).
Wang, W.G., Tolk, A., Wang, W.P., 2009. The levels of conceptual interoperability model: applying
systems engineering principles to modeling and simulation. In: Proceedings of the Spring Simulation
Multiconference. Society for Modeling & Simulation International (SCS), San Diego, CA.
 (last accessed 15.03.15).
Watry, G., Ames, D.P., 2008. A practical look at MapWindow GIS. First Edition, 316 p.
 (last accessed 15.04.13).
Whelan, G., Tenney, N.A., Pelton, M.A., Coleman, A.M., Ward, D.L., Droppo, J.G., Jr., Meyer, P.D., Dorow,
K.E., Taira, R.Y., 2009. Techniques to access databases and integrate data. PNNL-18244, Pacific
Northwest National Laboratory, Richland, WA.
 (last accessed
04.04.13).
Whelan, G., Kim, K., Pelton, M.A., Castleton, K.J., Laniak, G.F., Wolfe, K., Parmar, R., Galvin, M.,
Babendreier, J., 2014a. Design of a Component-based Integrated Environmental Modeling Framework.
Environ. Modell. Softw. 55, 1-24.
Whelan, G., Kim, K., Pelton, M.A., Soller, J.A., Castleton, K.J., Molina, M., Pachepsky, Y., Ravenscroft, J.,
Zepp, R., 2014b. An Integrated Environmental Modeling Framework for Performing Quantitative
Microbial Risk Assessments. Environ. Modell. Softw. 55, 77-91.
Whelan, G., K. Kim, R. Parmar, K. Wolfe, M. Galvin, P. Duda, M. Gray, M. Molina, R. Zepp, Y. Pachepsky, J.
Ravenscroft, L. Prieto, B. Kitchens. 2014c. Using IEM to Automate a Process-based QMRA. International
Environmental Modelling and Software Society (iEMSs), 2014 International Congress on Environmental
Modelling and Software, Bold Visions for Environmental Modeling, Sixth Biennial Meeting, San Diego, CA
USA.
Whelan, G., Pelton, M., Molina, M., J. Ravenscroft. 2014d. Microbial Properties Database Editor Tutorial.
U.S. Environmental Protection Agency, Office of Research and Development and Office of Water,
Athens, GA. (Draft)
Whelan, G., K. Kim, R. Parmar, K. Wolfe, M. Galvin, M. Gray, P. Duda, M. Molina, R. Zepp. 2013a.
SDMProjectBuilder Tutorial: Lesson 1 - Navigate the SDMPB and Identify a Watershed of Interest,
Delineate a 12-Digit HUC within a Watershed Perform an Assessment on a 12-Digit HUC within a
Watershed; Simulation with WinHSPF and Data Analysis/Viewing with BASINS (Alpha Version). U.S.
Environmental Protection Agency, Athens, GA.
Whelan, G., K. Kim, R. Parmar, K. Wolfe, M. Galvin, M. Gray, P. Duda, M. Molina, R. Zepp. 2013b.
SDMProjectBuilder Tutorial: Lesson 2 - Navigate the SDMPB and Identify a Watershed of Interest,
Delineate a Watershed Associated with a Pour Point, and Perform an Assessment of a Watershed
Associated with a Pour Point; Simulation with WinHSPF and Data Analysis/Viewing with BASINS (Alpha
Version). U.S. Environmental Protection Agency, Athens, GA.
Wikipedia. 2014. Software architecture.  (Last
accessed 17.07.14).
Wolfe, K.L., Parmar, R.S., Laniak, G.F., Parks, A.B., Wilson, L., Brandmeyer, J.E., Ames, D.P., Gray, M.H.,
2007. Data for environmental modeling (D4EM): Background and example applications of data
automation. Presented At International Symposium On Environmental Software Systems, Prague, Czech
Republic, May 22-25, 2007
 (last accessed 04.04.13).
XML Wikipedia. 2014. XML. (last accessed 03.08.14).
APPENDIX A
MICROBIAL SOURCE MODULE XML DOCUMENT FOR INPUT
PARAMETERS/VARIABLES


2.8 
0.025 
70.0 
1000.0 

6210000.0 
10300000.0

16600000.0

23300000.0 
200000.0 

 
 
0.36

 
0.36


0.36


0.51


0.51


0.51


0.51


0.51



0.51


0.36


0.36


0.36




P1

180000


4000.0
1.0


4000.0
2.0


4000.0
3.0


4000.0
4.0


4000.0
5


4000.0
6


4000.0
7


4000.0
8


4000.0
9


4000.0
10


4000.0
11


4000.0
12




40.8


480.0


48.4


403.1


0.25


0.25


0.25


0.25





180000.0


70.0


0.0


700.0


48.0


90.0


74.0




P2

100.0


4000.0
1.0


4000.0
2.0


4000.0
3.0


4000.0
4.0


4000.0
5.0


4000.0
6.0


4000.0
7.0


4000.0
8.0


4000.0
9.0


4000.0
10.0


4000.0
11.0


4000.0
12.0




40.8


480.0


48.4


403.1


0.25


0.25


0.25


0.25





0.0


70.0


0.0


700.0


48.0


90.0


74.0







0.75
104000000000.0



0.0
0


0.0375
0.0
0.0


0.0
0.0


0.05
20.0
0.1


0.05
31.0
0.15


0.0375
30.0
.10


0.0375
31.0
0.1


31.0
0.1


0.0375
30.0
0.1


0.3
31.0
0.1


0.3
15.0
0.05


0.0
0.0




0.75
104000000000.0



0.0375


0.0375


0.0375


0.0375


0.3


0.3


0.0375




0.75
420000000.0


31.0
0.0


28.0
0.0


31.0
0.0


30.0
31.0
0.1


30.0
0.0


31.0
0.0


31.0
0.0


30.0
31.0
30.0
0.4


31.0
0.0




12000000000.0


0.0


0.0


0.0


0.0


0.0


30.0


31.0


31.0


51.4


0.0


0.0


0.9




0.96
136000000.0



0.0


0.0


0.0


0.0


0.40


0.80
10800000000.0



0.0


0.0


0.0


0.0


0.40


104000000000.0


0.0


0.0


0.0


0.0


0.0


30.0


31.0


31.0


15.4


0.0


0.0


0.0






2430000000.0


0.1


1.4


0.4


0.1




49000000000.0


0.1


0.1


0.1


0.1




49000000000.0


0.1


0.05


0.1


0.1




49000000000.0


0.1


0.1


0.05


0.05




49000000000.0


0.1


3.0


0.1


0.05




49000000000.0


0.1


0.05


0.01


0.08





APPENDIX B
MICROBIAL SOURCE MODULE XML DOCUMENT FOR OUTPUT
PARAMETERS/VARIABLES



P2




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395




257971875
92869875


15465933829.531027
5567736178.63116


15465933829.531
5567736178.63116


8616333.333333334
3101880


1558.245
4000.0000598897395



