United States
Environmental Protection
Agency
Solid Waste and
Emergency Response
(5305W)
EPA530-R-99-002
May 1999
www.epa.gov/osw
EPA Industrial Waste
Management Evaluation
Model (IWEM):
Ground-water Model
DRAFT
Printed on paper that contains at least 30 percent postconsumer fiber
Guide for Industrial Waste Management:
Ground-Water Modeling Technical Background Document
April 13, 1999
TABLE OF CONTENTS
1.0 INTRODUCTION
    1.1 Objectives
    1.2 Relation to Other Documents
    1.3 Limitations, Caveats and Disclaimers
2.0 OVERALL MODELING STRATEGY
    2.1 Ground-Water Modeling Strategy
        2.1.1 Conceptual Model
        2.1.2 Waste Management Unit Liners
    2.2 Monte Carlo Analysis
        2.2.1 Basis for the Use of Monte Carlo Approach
        2.2.2 Basis for Use of the 90th Percentile
    2.3 Development of Neural Networks to Emulate Ground-Water Models
        2.3.1 Brief Overview of Neural Networks
        2.3.2 Development of a Neural Network to Emulate EPACMTP for Tier 2,
              Location-Adjusted Evaluations
    2.4 Tier 1
    2.5 Tier 2
3.0 MODELS USED TO DEVELOP THE TWO-TIERED APPROACH
    3.1 HELP
        3.1.1 Determining Infiltration from a Waste Management Unit
            3.1.1.1 Determination of Surface Impoundment Infiltration Rate
        3.1.2 Sensitivity of HELP Model to Input Parameter Values
    3.2 MINTEQA2-derived and Empirical Isotherms
        3.2.1 The Kd Values for Metals Used in Modeling Support for Tier 1
        3.2.2 The Kd Values for Metals Used in the Neural Network for Tier 2
        3.2.3 Linear and Non-linear Sorption Isotherms
    3.3 EPACMTP
        3.3.1 Analysis of Unsaturated Zone and Saturated Zone Flow and Transport
        3.3.2 Data Sources
        3.3.3 Model Sensitivities to Input Parameters
4.0 CALCULATION OF THE TIER 1 LEACHATE CONCENTRATION
    4.1 Assumptions and Parameters
    4.2 Determination of Dilution Attenuation Factors (DAFs)
5.0 DEVELOPMENT OF NEURAL NETWORKS FOR TIER 2 EVALUATION OF THRESHOLD VALUES
    5.1 Sensitivity Analysis to Identify Key Parameters
    5.2 Assumptions and Parameters Used to Develop Neural Networks
    5.3 Neural Network Development
    5.4 Neural Network Performance Evaluation
    5.5 Integration of the Neural Networks into a Single User-Friendly GUI
6.0 APPLICATION OF MODEL RESULTS TO WASTE MANAGEMENT
    6.1 Use and Interpretation of Tier 1 Evaluation
    6.2 Use and Interpretation of Tier 2 Evaluation
7.0 REFERENCES
APPENDIX A: DEVELOPMENT OF THE FOUR NON-LINEAR NEURAL NETWORKS FOR TIER 2
APPENDIX B: LINER INFILTRATION EVALUATIONS
APPENDIX C: HISTOGRAMS OF THE INPUT PARAMETERS USED IN THE NEURAL NETWORKS
APPENDIX D: GLOSSARY
FIGURES
2-1 Schematic Diagram of Ground-Water Modeling
2-2 Three Liner Scenarios Considered in the Two-Tiered Modeling Approach for
    Nonhazardous Solid Waste Guidelines
2-3 Schematic Diagram to Illustrate the Monte Carlo Modeling Approach
2-4 Schematic Diagram of Neural Network
3-1 Variation in log Kd with log of Total Cd Concentration
4-1 Well Location Parameters Used in the Tier 1 Analysis
4-2 Cumulative Probability Density Functions of Ground-water Well Concentrations and
    DAFs
TABLES
3-1 Empirical pH-dependent Kd Relationships from U.S. EPA (1990)
3-2 Regression Parameters of the pH-dependent Kd Relationships used in the Neural
    Network
3-3 EPACMTP Modeling Assumptions and Input Parameters
3-4 List of 28 EPACMTP Input Parameters Used in the Deterministic Sensitivity Analysis
3-5 Ranked Sensitivity of Parameters Evaluated in Probabilistic Sensitivity Analysis for (a)
    Landfills, (b) Surface Impoundments
3-6 Ranked Sensitivity of Parameters Evaluated in Probabilistic Sensitivity Analysis for (a)
    Waste Piles, and (b) Land Application Units
3-7 Most Sensitive EPACMTP Input Parameters
4-1 Assumptions Used to Compute Infiltration for Landfills
4-2 Assumptions Used to Compute Infiltration for Surface Impoundments
4-3 Assumptions Used to Compute Infiltration for Waste Piles
4-4 Toxicity Characteristic Regulatory Levels
5-1 Neural Network Summary Statistics
1.0 INTRODUCTION
EPA's Office of Solid Waste (OSW) is developing the Guide for Industrial Waste Management
to facilitate evaluation of non-hazardous industrial waste (U.S. EPA, 1999). This document
describes one aspect of the guidance: the technical basis for a ground-water model designed to
determine whether waste management unit (WMU) designs are protective of ground-water resources.
The degree of ground-water protection provided by a particular design is determined by modeling
the migration of waste constituents from the WMU through underlying soil to a monitoring point
in an aquifer. The methodology and the assumptions used to perform the ground-water modeling
and the application of the results of the modeling are described in this document.

This model incorporates two levels, or tiers, of analysis to provide facility managers, the public,
and state regulators flexibility in assessing the protectiveness of particular WMU designs. The
first level of analysis, Tier 1, is based on a conservative Monte Carlo probabilistic analysis that
accounts for the nationwide variability of ground-water modeling parameters. Tier 1 requires
minimal data about the facility and provides an analysis of the extent to which a WMU design or
waste concentration for land application units protects ground water. The second level, Tier 2,
allows users to change a subset of the facility-specific ground-water modeling parameters that
may be known with greater certainty at a specific location. The Tier 2 analysis is based on a
predictive neural network tool that incorporates the sophistication of a probabilistic ground-water
modeling analysis yet requires a minimal amount of site data.

The unique aspect of the two-tiered approach developed by the Agency is that it provides two
levels of analysis that require a minimum of data and provide instantaneous analysis and
recommendations for the type of liner that should be used in a WMU and/or whether land
application is appropriate. Both analyses are combined into a user-friendly Windows-based
software tool, the Industrial Waste Evaluation Model (IWEM), that will operate on any standard
MS Windows™-based PC platform.

It should be noted here that the guidance also recommends detailed site-specific Tier 3 analysis
in situations when a more thorough evaluation of site conditions is needed. This approach
applies appropriate ground-water models and a full array of detailed site climatologic and
hydrogeologic data. However, the Tier 3 analyses are beyond the scope of this background
document, and further information regarding their application to selecting appropriate liner
designs for industrial solid waste management units is described in the main Guidance
(U.S. EPA, 1999).

The development of this two-tiered approach is described in detail in this document. The
remainder of this section summarizes the objectives of the two-tiered approach and contains a list
of the limitations, caveats, and disclaimers associated with the two-tiered approach used to
evaluate industrial WMU designs.
1.1 Objectives of this Document
The objectives of this document are to describe in detail the methodology and assumptions used
to develop the two-tiered approach to evaluating WMU designs. Knowledge of the methodology
and assumptions used to develop the two-tiered approach will allow the user, decision makers,
and other stakeholders to determine if the approach is appropriate for the evaluation of specific
WMU designs at specific sites.

This background document contains a summary of the overall modeling strategy used, a
description of the models, the method used to calculate Tier 1 and Tier 2 leachate concentration
threshold values (LCTVs), and a summary of how to apply the model results for Tier 1 and Tier 2
to the evaluation of WMU designs and waste concentrations for land application.
1.2 Relation to Other Documents
As stated above, this document describes the ground-water modeling methodology used to
determine the protectiveness of a WMU design. Other aspects of the guidance, such as air
emissions modeling and ground-water monitoring, are described elsewhere in the main Guidance
(U.S. EPA, 1999).
1.3 Limitations, Caveats and Disclaimers
The two-tiered approach developed to evaluate WMU designs uses the latest available peer-reviewed
ground-water modeling methodology, incorporating state-of-the-art probabilistic
techniques to account for uncertainty. However, given the complex nature of the evaluations,
a number of limitations and caveats must be delineated. These are described in this section.

To perform the evaluations recommended by the Guide for Industrial Waste Management
(Guidance) (U.S. EPA, 1999), mathematical models are used. These models are based on a
number of simplifying assumptions to represent conditions that may be encountered at
waste management sites within the U.S. Efforts have been made to obtain representative
nationwide data and account for the uncertainty in the data. However, as with all modeling
evaluations, these simplifying assumptions might not apply, or may be inappropriate,
for evaluating a specific WMU design at a specific site with a unique combination of conditions
that might not be accounted for with the available data. Therefore, where appropriate, EPA used
conservative estimates of parameter values to ensure ground-water protection.

The two-tiered approach described in this document is designed to be used as guidance in the
selection of an appropriate WMU design or land application of waste. Given the number of
variables involved and the uncertainty of hydrogeologic characteristics of a specific site, the
application of this guidance is not intended to provide a guarantee that a specific design will be
appropriate and protective. The fate and transport model upon which this analysis is based uses a
national database. Evaluations using parameter values at the extremes of the distribution may
have significant error associated with them.

The user, decision makers, and other stakeholders who are evaluating the results of a two-tiered
analysis need to ensure that there is sufficiently documented and verifiable justification of site
parameter values and any potential uncertainty or data gaps that may exist. This is especially true
with respect to highly sensitive modeling parameters such as infiltration rate, WMU area,
sorption and hydrolysis rates, and the distance to a ground-water monitoring well. Additional
information about the uncertainty involved in the modeling and two-tiered approach is provided
in Section 5.4 of this document.

The two-tiered approach presented in this document was developed by EPA in consultation with
state regulatory agencies, representatives from industries, and environmental stakeholders. EPA
has provided this guidance as a tool for states, industry, the public, and environmental groups,
who may all have a role in making decisions regarding appropriateness of WMU designs.
2.0 OVERALL MODELING STRATEGY
This section describes the overall ground-water modeling strategy used to develop the two-tiered
evaluations of WMU designs (Section 2.1). Section 2.2 presents a summary of the Monte Carlo
analysis used in the ground-water modeling. Section 2.3 provides a summary of neural networks.
The technical bases for the Tier 1 and Tier 2 approaches are described in Sections 2.4 and 2.5.
2.1 Ground-Water Modeling Strategy

Figure 2-1 Schematic Diagram of Ground-Water Modeling Scenario (depicts the waste management unit)

The objective of the ground-water modeling performed for this guidance is to compute the
amount of dilution and attenuation a contaminant may undergo as it migrates from a WMU to a
ground-water well. The amount of dilution and attenuation is expressed as a
dilution/attenuation factor (DAF), which represents the ratio of the initial leachate
concentration ($C_L$) to the ground-water monitoring well concentration ($C_{MW}$):

$$\mathrm{DAF} = \frac{C_L}{C_{MW}}$$

The DAF is assumed to be independent of the initial leachate concentration for most chemicals.
For a given initial leachate concentration, there is a direct, one-to-one correspondence between
DAF and monitoring well concentration. Therefore, computing a DAF for specific WMU
scenarios allows for easy determination of the expected ground-water well concentration. The
initial leachate concentration is divided by the DAF to determine the expected ground-water well
concentration. Similarly, the toxicity reference level at the ground-water well represents an
acceptable threshold value for the concentration of chemicals in ground water. Multiplying the
toxicity reference level by the DAF produces an acceptable concentration level for the leachate
from a waste unit. This acceptable leachate concentration is known as the leachate concentration
threshold value (LCTV):
$$\mathrm{LCTV} = \mathrm{DAF} \times \mathrm{TRL}$$

where:
LCTV = Leachate Concentration Threshold Value
DAF = Dilution/Attenuation Factor
TRL = Toxicity Reference Level (e.g., MCL or HBN)

The LCTV computed for a specific chemical for a given ground-water modeling scenario is
compared to the leachate concentration of a waste of concern, as measured using the TCLP or
other appropriate leaching test. If the leachate concentration is below the LCTV, the scenario
modeled for the waste of concern will be protective of ground water.
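
The DAF and LCTV relationships above can be illustrated with a minimal sketch. The function
names and the numeric values in the usage note are hypothetical and chosen only for
illustration; they are not taken from the Guidance or from IWEM.

```python
def dilution_attenuation_factor(leachate_conc, well_conc):
    """DAF = C_L / C_MW, with both concentrations in the same units (e.g., mg/L)."""
    if well_conc <= 0:
        raise ValueError("monitoring well concentration must be positive")
    return leachate_conc / well_conc


def expected_well_concentration(leachate_conc, daf):
    """Expected ground-water well concentration: leachate concentration divided by DAF."""
    return leachate_conc / daf


def lctv(daf, toxicity_reference_level):
    """Leachate concentration threshold value: LCTV = DAF * TRL."""
    return daf * toxicity_reference_level


def is_protective(measured_leachate_conc, daf, toxicity_reference_level):
    """A scenario is treated as protective when the measured leachate
    concentration is below the LCTV."""
    return measured_leachate_conc < lctv(daf, toxicity_reference_level)
```

For example, with an assumed DAF of 20 and an assumed toxicity reference level of 0.005 mg/L,
`lctv(20.0, 0.005)` is about 0.1 mg/L, so a measured leachate concentration of 0.05 mg/L would
pass the comparison and 0.5 mg/L would not.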

The conceptual model described here is evaluated with a probabilistic ground-water model
developed by the Agency, EPACMTP, EPA's Composite Model for Leachate Migration with
Transformation Products (U.S. EPA, 1996a). The probabilistic aspects of EPACMTP are summarized
in Section 2.2. Additional details on specific EPACMTP modeling assumptions, parameter values,
and data sources are provided in Section 3.3.
2.1.2 Waste Management Unit Liners
The primary method of controlling the release of waste constituents to the subsurface is to install
a low permeability liner at the base of a WMU. A liner generally consists of a layer of clay or
other material with a low hydraulic conductivity that is used to prevent or mitigate the flow of
liquids from a WMU. However, the type of liner that is appropriate for a specific WMU is
highly dependent upon a number of location-specific parameters, such as climate and
hydrogeologic characteristics. In addition, the amount of infiltration of liquids from a WMU has
been shown to be a highly sensitive parameter in predicting the release of contaminants to ground
water. Therefore, one of the main objectives of the two-tiered modeling approach is to evaluate
the appropriateness of a proposed liner design in the context of other location-specific parameters
such as precipitation and evaporation and the characteristics of the subsurface beneath a facility.

EPA has chosen to evaluate three types of liner scenarios: the no-liner, single-liner, and
composite-liner scenarios. The no-liner scenario (Figure 2a) represents a WMU that is relying
upon location-specific conditions, such as low permeability native soils beneath the unit or low
annual precipitation rates, to mitigate the release of contaminants to ground water. The
single-liner scenario represents a 3-foot-thick clay liner with a low hydraulic conductivity
($1 \times 10^{-7}$ cm/sec) beneath a WMU (Figure 2b). The composite-liner scenario consists of
a 3-foot-thick clay liner beneath a 40-mil-thick high-density polyethylene (HDPE) flexible
membrane liner or FML (Figure 2c).

Figure 2-2 Three Liner Scenarios Considered in the Two-Tiered Modeling Approach for
Nonhazardous Solid Waste Guidelines: 2a) No-Liner Scenario; 2b) Single Liner Scenario (clay
liner); 2c) Composite Liner Scenario (membrane liner over clay liner).

While other liner scenarios may be proposed by a facility or considered by the State authorities,
such as double liners or liners with drainage layers and leachate collection systems, they are not
addressed in the two-tiered approach described in this document. For further information on
other liner scenarios, the reader is referred to the Guidance (U.S. EPA, 1999).

2.2 Monte Carlo Analysis
In its nationally applicable analyses of the fate and transport of waste constituents, the Agency
generally uses, for regulatory purposes, a Monte Carlo approach to compute probabilistic
estimates of constituent concentrations in hypothetical downgradient ground-water wells. The
Monte Carlo procedure randomly draws input parameter values from representative statistical
distributions for each parameter (Figure 2-3). A set of input parameter values is developed and
the model is run to compute the ground-water monitoring well concentration and the DAF. This
process is repeated thousands of times until a distribution of thousands of output values (DAFs)
is produced. The DAF values are ranked from high to low and, for the purposes of the current
Guidance, the 90th percentile DAF is determined. The 90th percentile DAF represents the amount
of dilution and attenuation that would occur in at least 90% of the cases modeled. Specifically,
90% of the DAF values would be greater than the 90th percentile DAF and 10% would be lower.
Thus, the DAF is protective in at least 90% of the modeled cases. The use of the 90th
percentile, therefore, provides a conservative measure of the amount of dilution and
attenuation that would likely occur.
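
The ranking step described above can be sketched as follows. This is a minimal illustration
only: `toy_daf` is a hypothetical stand-in for a single EPACMTP run, and the two input
distributions are assumptions made for the example, not the distributions used by the Agency.
Because the DAFs are ranked from high to low, the selected value is the one that at least 90%
of the runs meet or exceed.

```python
import math
import random


def toy_daf(infiltration_m_per_yr, well_distance_m):
    """Hypothetical stand-in for one EPACMTP run; NOT the real model."""
    return 1.0 + well_distance_m / (100.0 * infiltration_m_per_yr)


def monte_carlo_dafs(n_iterations=2000, seed=0):
    """Draw random input values and collect one DAF per iteration."""
    rng = random.Random(seed)
    dafs = []
    for _ in range(n_iterations):
        # Assumed, illustrative input distributions.
        infiltration = rng.lognormvariate(-1.0, 0.5)
        distance = rng.uniform(1.0, 1600.0)
        dafs.append(toy_daf(infiltration, distance))
    return dafs


def protective_daf(dafs, fraction=0.90):
    """Rank DAFs from high to low and return the value that at least
    `fraction` of the runs meet or exceed."""
    ranked = sorted(dafs, reverse=True)
    index = math.ceil(fraction * len(ranked)) - 1
    return ranked[index]
```

Calling `protective_daf(monte_carlo_dafs())` returns a low-end DAF from the simulated
distribution, which is why using it in the LCTV calculation is conservative.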

Figure 2-3 Schematic Diagram to Illustrate the Monte Carlo Modeling Approach (cumulative
frequency distributions of input parameter values are sampled and run through EPACMTP to
produce a cumulative frequency distribution of output DAF values)

2.2.1 Basis for the Use of Monte Carlo Approach
Given the number of input parameters used in EPACMTP (52) and the application of EPACMTP
to determine DAFs representative of waste management facilities nationwide, the output of the
model, the DAF, may be subject to a high degree of variability and uncertainty. The Monte
Carlo approach accounts for the variability (range of values) and uncertainty (level of confidence
that the values are representative) in the input parameters by computing distributions of DAFs
that are based on representative distributions of input parameters and reasonable and
representative combinations of these input parameters.

The Monte Carlo approach used in EPACMTP has been applied in various EPA regulatory
efforts, including the proposed Hazardous Waste Identification Rule (HWIR) of 1995 (U.S. EPA,
1995) and hazardous waste listing evaluations (for example, the Petroleum Refinery Waste
Listing determination, U.S. EPA, 1997). As such, the Monte Carlo procedure and its
applicability to national analyses have been extensively reviewed within EPA and by the Science
Advisory Board and have been subject to public review and comment.
2.2.2 Basis for Use of the 90th Percentile
The selection of a 90th percentile DAF (for DAFs ranked from highest to lowest) is based on the
need to choose a level of protection that is conservative and consistent with other EPA analyses,
including the proposed 1995 HWIR and hazardous waste listing efforts mentioned above. Also,
it is desirable to have a large degree of confidence that the results are adequately protective of
human health and the environment given the degree of uncertainty inherent in the data and the
analyses. Therefore, EPA has selected the 90th percentile output of the ground-water fate and
transport modeling for this guidance. The 90th percentile DAF implies that, of the modeled
scenarios (which are assumed to be representative of facilities nationwide), 90% result in DAFs
that are higher than the 90th percentile DAF and thus are considered protective.
2.3 Development of Neural Networks to Emulate Ground-Water Models
To enable users to perform a ground-water modeling analysis to evaluate a WMU design, EPA
developed a user-friendly method to obtain the results of other ground-water modeling analyses
conducted by the Agency. The Agency generally uses analyses from EPA's EPACMTP ground-water
fate and transport model. However, because EPACMTP is a mechanistic model that runs in
Monte Carlo mode, it requires up to several thousand iterations or realizations for each
simulation to account for changes in parameter values. Each iteration uses one set of input
parameter values that are randomly drawn from a representative statistical distribution. This
process is repeated 2,000 times for each simulation to produce a statistical distribution of output
values.

Therefore, even when EPACMTP is run on a current-generation PC, such as a Pentium II, it
requires up to several hours to complete a 2,000-iteration Monte Carlo simulation. In addition,
EPACMTP is a complex model that requires ground-water modeling expertise and a large
amount of facility and hydrogeologic data to be used effectively. Therefore, EPA developed a
simplified, easy-to-use version of EPACMTP for users with little or no ground-water modeling
expertise and a minimum amount of key location-specific data. After exploring options, EPA
chose to develop a neural network as a user-friendly predictive tool that simulates EPACMTP
and can compute a DAF based on user-defined values of 6 to 7 key EPACMTP input parameters.

A description of what neural networks are and how they are used to evaluate industrial WMU
designs is provided in the following sections.
2.3.1 Brief Overview of Neural Networks
A neural network is a computational tool that can be used to predict outcomes based on the
characteristics of multiple inputs to a function or a system. Neural networks generate a model to
predict values of a variable (outputs) as a function of input data and can provide accurate
predictions over wide ranges of input values for non-linear and non-Gaussian processes (SPSS,
1996). Many neural network models are similar or identical to popular statistical techniques such
as generalized linear models, polynomial regressions, and nonparametric regressions (Sarle,
1994). Neural networks have been used in a number of real-world applications, including
optimizing ground-water remediation strategies (Rogers et al., 1995; Johnson and Rogers, 1995).
Specifically, neural networks have been used in place of a mechanistic fate and transport model
to determine the remediation success of various configurations of pumping wells used to withdraw
and control contaminant plumes (Rogers et al., 1995). Neural networks also have been used to
evaluate "what if" scenarios with Monte Carlo-style simulation experiments (Kalos and
Whitlock, 1986). Other applications of neural networks to ground-water problems include
prediction of aquifer parameters from geophysical curves (Aziz and Wong, 1992) and geospatial
estimation of hydraulic conductivity (Rizzo and Daugherty, 1994).

Neural networks are developed with software that consists of a large number of simple
computational or processing units (referred to as neurons) that are interconnected in a net-like
structure. Software neural networks derive their terminology from their similarity to biological
neural networks, such as the brain, in which the processing units are "neurons" that are
interconnected. Both biological and software neurons receive input signals and send an output
signal to other neurons in the network. The strengths of the connections between neurons,
referred to as weights, can be changed in response to information provided to the neural network.
As such, the network can learn and/or be "trained" to produce an output based on a set of inputs.

Neural networks are analogous to regression analysis, a statistical technique used to develop a
best fit to observed data that can be used to predict an outcome based on a number of inputs. As
stated above, each layer in a neural network consists of "nodes" (also known as neurons) and
weights (Figure 2-4). The input nodes are the independent variables and the output nodes are the
dependent variables. The hidden node values are analogous to regression equation terms and the
weights are analogous to the coefficients in a regression equation. Neural networks are most
often trained using least squares methods (Sarle, 1994), which involve minimizing the sums of
squares of the error (similar to regression models) over all outputs.

Figure 2-4 Schematic Diagram of a Neural Network (Components of a Neural Network)

The neural network makes an initial prediction of what an outcome should be based on the inputs
it is given. It then computes the error, that is, the difference between the target (predicted) and
actual (observed) values. Weights are then adjusted to reduce the errors and the neural network
"backpropagates" to nodes within the network. Backpropagation involves working backward
from the output node to adjust the weights accordingly and reduce the average error across all
nodes. This process is repeated until the weights reach their optimal values and the error in the
output is minimized.
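
The training loop described above can be sketched with a very small multi-layer perceptron.
This is a generic, hand-written illustration under stated assumptions (one input, one tanh
hidden layer, one linear output, plain gradient descent on the squared error); it is not the
algorithm implemented in NNModel, and the function name and default settings are hypothetical.

```python
import math
import random


def train_mlp(samples, n_hidden=4, epochs=2000, rate=0.05, seed=0):
    """Train a one-input, one-output network with a tanh hidden layer.

    samples: list of (x, target) pairs.
    Returns (predict, history), where history holds the mean squared
    error observed during each pass over the samples.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # input-to-hidden weights
    b1 = [0.0] * n_hidden                                   # hidden biases
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]  # hidden-to-output weights
    b2 = 0.0                                                # output bias

    def forward(x):
        hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(n_hidden)]
        output = sum(w2[j] * hidden[j] for j in range(n_hidden)) + b2
        return hidden, output

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            hidden, output = forward(x)
            error = output - target          # prediction minus target
            total += error ** 2
            # Work backward from the output node to the hidden nodes.
            for j in range(n_hidden):
                grad_hidden = error * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= rate * error * hidden[j]
                w1[j] -= rate * grad_hidden * x
                b1[j] -= rate * grad_hidden
            b2 -= rate * error
        history.append(total / len(samples))

    return (lambda x: forward(x)[1]), history
```

Training on a handful of (input, target) pairs and inspecting `history` shows the error
shrinking as the weights are adjusted, which is the behavior the paragraph above describes.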
Components of Applying Neural Networks
The successful application of neural networks requires consideration of the following aspects,
which are discussed below:
• Selecting an appropriate neural network architecture,
• Data pre-processing,
• Training,
• Performance evaluation.
Many varieties of neural network architecture are available, including Multi-Layer Perceptrons
(MLP), Radial Basis Functions (RBF), and Self-Organizing Maps (Kohonen networks) (SPSS,
1996). MLP networks are general-purpose, flexible, nonlinear models, based on the original
models of neural computing (Sarle, 1994). MLPs (which are the architecture depicted in Figure
2-4) are suited to a wide range of applications, and currently are the most successful and widely
used neural computing model (SPSS, 1996). For these reasons, EPA selected the MLP as the
most appropriate architecture for developing neural networks for the Guidance.

Pre-processing applies to any operations performed on the raw data to extract features relevant to
a specific application, for example, natural log transformations and removal of outliers. For the
Tier 2 neural network, log transformations were made of all input parameters and the output
values (the DAF) to obtain values that were normally distributed.
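
A log transformation of this kind can be sketched as follows. The sketch assumes natural
logarithms and strictly positive values; the text does not state which logarithm base was used
for the Tier 2 network, and the function names are hypothetical.

```python
import math


def log_transform(values):
    """Natural-log transform; every value must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires strictly positive values")
    return [math.log(v) for v in values]


def back_transform(log_values):
    """Invert the transform to recover values on the original scale."""
    return [math.exp(v) for v in log_values]
```

Values that span several orders of magnitude (as DAFs can) become evenly spaced after the
transform, and a prediction made on the log scale is returned to the original scale with
`back_transform`.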

The training or learning process consists of presenting the neural network with example data and
then adjusting the network's internal weights until the desired response is obtained. The method
used to adjust the weights is known as the training algorithm. Generally, two types of training
algorithms are used to develop neural networks: "supervised training" and "unsupervised
training". In supervised training, the neural network learns to adjust its weights so that the
output coincides with a particular target value. In unsupervised training, the neural network is
not provided with a "target". Instead, it learns to recognize patterns inherent in the data. Given
that the purpose of the neural networks was to predict a target DAF value, EPA used supervised
training.

In general, increased numbers of training data sets and increased training iterations (the actual
number of counts or times the weights are adjusted) will increase the predictive capability of a
neural network. However, if a neural network is highly trained on data it has seen (training data
sets), it might not be able to predict well using input data it has not been trained on. A key
attribute of a trained neural network application is its ability to generalize (i.e., to give accurate
answers on data that it did not "see" as part of the training process). However, neural networks
can become "over-fitted". An over-fitted neural network is one that produces excellent results
with the training input data, but performs poorly when presented with data it has not seen before,
even if it comes from the same source (i.e., EPACMTP) as the training data. Such a neural
network results in poor generalization. The achievement of good generalization is a key design
aim, and it is achieved by careful choice of neural network size and the amount of training
applied to the neural network.
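
One common way to check generalization is to hold back part of the data and compare the error
on the held-back (test) records with the error on the training records. The sketch below is an
assumption-laden illustration: the function names, the 20% test fraction, and the tolerance
factor of 2 are all hypothetical choices, not values from the Guidance.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Shuffle (input, output) records and hold back a fraction for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(predict, records):
    """Average squared difference between predictions and known outputs."""
    return sum((predict(x) - y) ** 2 for x, y in records) / len(records)


def looks_overfitted(predict, train, test, tolerance=2.0):
    """Flag a model whose test error exceeds `tolerance` times its training error."""
    return mean_squared_error(predict, test) > tolerance * mean_squared_error(predict, train)
```

A model that has merely memorized its training records scores a near-zero training error but a
large test error, so `looks_overfitted` returns True for it.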
Performance evaluation of neural networks consists of evaluating the ability of the trained
network to predict an output based on inputs. A neural network's performance can be evaluated
using standard R-squared statistics and plots of the actual outputs versus the neural network
predicted outputs with 90% confidence intervals. Sensitivity analyses can be performed to
determine the sensitivity of an output variable to changes made in input values. The results can
be ranked in order of decreasing sensitivity.
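
The R-squared statistic mentioned above compares the neural network predictions with the
actual outputs. A minimal sketch, using the standard definition (one minus the ratio of the
residual sum of squares to the total sum of squares) and a hypothetical function name:

```python
def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted outputs."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total
```

A value of 1.0 means the predictions match the actual outputs exactly, while a value of 0.0
means the predictions do no better than always predicting the mean of the actual outputs.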
Selection of Appropriate Neural Network Training Software
Numerous neural network training software packages are available commercially and as
shareware and freeware. For the purposes of the current Guidance, a low-cost package called
NNModel (Neural Fusion, Inc., 1998) was evaluated. NNModel was demonstrated to be an
appropriate neural network modeling package for the purposes of developing the Guide for
Industrial Waste Management based on its ease of use, technical capabilities, low cost, and
ability to produce robust neural networks. Trial versions of the software are available on a
shareware basis (http://www.neuralfusion.com/).
2.3.2 Development of a Neural Network to Emulate EPACMTP for Tier 2, Location-
Adjusted Evaluations
To develop a neural network for the Tier 2 analysis for evaluations of Nonhazardous Solid Waste
WMUs, the following steps were performed:
• EPACMTP was run to create training and test data sets;
• the EPACMTP output was imported to spreadsheets of modeling inputs and results;
• the spreadsheets were imported into NNModel and a data matrix was created;
• the data were examined with graphical visualization features of NNModel and
appropriate transformations were performed to normalize the data;
• a neural network was created with the training data sets and tested with the test data sets; and
• the neural network was exported from the software as an exported neural network (ENN file),
which is then linked to the Windows-based graphical user interface.
For the Tier 2 analysis described in Section 2.6, a neural network was developed with the
NNModel software and trained using sets of EPACMTP input and output parameter values. The
trained network was used to predict the EPACMTP output value (the DAF) as a function of
combinations of EPACMTP input values. The EPACMTP input parameters used to develop the
trained neural networks are:
• infiltration;
• chemical hydrolysis rate;
• organic carbon partition coefficient;
• depth to water table;
• aquifer thickness; and
• distance to ground-water well.
Numerous combinations of various percentile values for each of these parameters were used to
train the neural network (i.e., provide the neural network with the information necessary to
predict the DAF based on changes in the values of these parameters). EPACMTP was run to
develop a number of "training data sets" (sets of input and output parameter values) for
combinations of these values, in which the DAF is the required output. Additional details of the
neural networks developed for the Tier 2 analysis are provided in Section 5 and Appendix A.
2.4 Tier 1
For the Tier 1 National Evaluation modeling, the ground-water modeling scenario uses the
conceptual model presented in Figure 2-1 (see Section 3.3 for details of the assumptions used).
The modeling assumptions used for three liner scenarios (shown in Figure 2-2) are the same for
all chemicals considered, except for two chemical-specific parameters. The chemical-specific
parameters, the organic carbon partition coefficient (Koc) and the decay rate (λ), are used to
determine the degree to which a chemical will sorb to subsurface soil particles and/or hydrolyze
or otherwise break down into daughter chemical products, respectively. EPACMTP modeling
was performed for the National Evaluation, and the results of this modeling are incorporated in
tabular form in the IWEM software, providing Leachate Concentration Threshold Values
(LCTVs) for all of the 190 Nonhazardous Solid Waste chemicals for liner scenarios for each type
of WMU (with the exception that only the no-liner scenario is considered for land application
units).
Given the large number of combinations of Koc and λ for organic chemicals and the amount of
time required to perform EPACMTP simulations for each of the organic constituents, EPA
developed an interpolation procedure to reduce the amount of effort required to develop DAFs
for organics. The process involved performing simulations for a range of representative
combinations of Koc and λ and interpolating between these values to obtain DAFs for the
majority of the organic chemicals. This interpolation method and tests used to verify it are
described in the EPACMTP Finite Source Methodology Background Document (U.S. EPA,
1996b).
The measure of how appropriate a liner design is for a particular WMU is determined by
comparing the estimated waste constituent leachate concentration (as determined by the TCLP or
other appropriate EPA test method) to the calculated LCTV in the appropriate Look-Up Table.
This look-up function is performed automatically in the IWEM software. The result of this
comparison determines the recommended liner system for the WMU or determines whether land
application of this waste is appropriate and will not exceed the toxicity reference levels (i.e.,
health-based numbers or maximum contaminant levels) at a down gradient ground-water well.
For example, if the estimated leachate concentrations for all constituents are lower than the
corresponding no-liner LCTVs, then the no-liner scenario is recommended as being sufficiently
protective of ground water. If any leachate concentration is higher than the corresponding no-
liner LCTV, then a minimum of a single clay liner is recommended. If any leachate
concentration is higher than the corresponding single-liner LCTV, then a minimum of a
composite liner (an FML geomembrane overlain by a clay liner) is recommended. For waste
streams with multiple constituents, the recommended liner design is based on the most protective
liner required for any one constituent.
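The Tier 1 comparison described above can be sketched as a small decision function. This is an illustration only, not the IWEM look-up code: the function name, the data layout, and the concentrations are hypothetical. It also assumes that a leachate concentration exactly equal to an LCTV is treated as an exceedance, which the text does not specify.

```python
LINER_ORDER = ["no liner", "single clay liner", "composite liner"]


def recommend_liner(constituents):
    """Return the most protective liner required by any one constituent.

    constituents maps a constituent name to a tuple
    (leachate_conc, lctv_no_liner, lctv_single_liner), all in mg/L.
    Assumption: a concentration equal to an LCTV counts as an exceedance.
    """
    most_protective = 0
    for conc, lctv_no_liner, lctv_single_liner in constituents.values():
        if conc < lctv_no_liner:
            level = 0   # below the no-liner LCTV
        elif conc < lctv_single_liner:
            level = 1   # exceeds no-liner LCTV only
        else:
            level = 2   # exceeds single-liner LCTV
        most_protective = max(most_protective, level)
    return LINER_ORDER[most_protective]


# Hypothetical waste stream; the names and values are illustrative only.
waste = {
    "constituent A": (0.01, 0.05, 0.5),
    "constituent B": (0.20, 0.10, 1.0),
}
recommendation = recommend_liner(waste)
```

Here constituent A alone would need no liner, but constituent B exceeds its no-liner LCTV, so the recommendation for the waste stream is a single clay liner.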
2.5 Tier 2
The Location-Adjusted Evaluation (Tier 2) is provided to assist the user in risk-based decision
making regarding WMU liners or land application of waste. The Tier 2 Evaluation uses a set of
four neural networks (one for each type of WMU) which enables the user to input certain
location-specific data from a proposed or existing WMU. These location-specific data are then
used to determine whether a liner is recommended as part of the WMU design for a given facility
or to determine whether or not land application of a waste is protective of ground water.
Using a set of several thousand EPACMTP simulations, a neural network was trained for each
type of WMU to estimate the DAFs that EPACMTP would have generated based on the values
of the input parameters. The user can vary input parameter values within the range of values in
EPACMTP's nationwide distributions, and the neural network then predicts the results that
EPACMTP would have generated. Thus, the Location-Adjusted Evaluation allows the user to
instantaneously evaluate a number of site-specific considerations without having to run
EPACMTP or another ground-water fate and transport model. The Tier 2 neural networks mimic
EPACMTP model results without requiring long simulation times. Additionally, the intuitive,
Windows-based user interface eliminates the need for the user to have training or extensive
knowledge of neural networks, the EPACMTP model, or statistics. As is done in the National
Evaluation, the expected waste leachate concentrations are compared to the Location-Adjusted
LCTVs for the constituents of concern to determine the recommended liner type for the WMU or
to determine whether land application of a waste is appropriate.
In developing and training the neural networks, problems were encountered when the extremes of
the distributions were used as input to the training. Training of the neural networks was then
limited to parameter values between the 10th and 90th percentiles. The composite-liner infiltration
rate assumed in Tier 1 (3 x 10^-5 m/yr for landfills) was outside the 10th to 90th percentile range
(0.024 to 0.45 m/yr for landfills), and thus the neural networks were not trained using this value.
Because the use of infiltration rates outside the range over which the neural networks were
trained will result in significant error, the Tier 2, Location-Adjusted Evaluation does not
explicitly address the composite-liner scenario.
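The restriction described above amounts to a range check on each input before the neural network is used. A minimal sketch follows; the function and constant names are hypothetical, while the numeric values are the landfill figures quoted in the text.

```python
# 10th to 90th percentile infiltration range for landfills (m/yr), from the text.
LANDFILL_INFILTRATION_RANGE_M_PER_YR = (0.024, 0.45)

# Composite-liner infiltration rate assumed in Tier 1 for landfills (m/yr).
TIER1_COMPOSITE_LINER_INFILTRATION_M_PER_YR = 3e-5


def within_trained_range(value, trained_range):
    """True if value lies inside the range the network was trained on."""
    low, high = trained_range
    return low <= value <= high


composite_ok = within_trained_range(
    TIER1_COMPOSITE_LINER_INFILTRATION_M_PER_YR,
    LANDFILL_INFILTRATION_RANGE_M_PER_YR,
)
```

The composite-liner rate is roughly three orders of magnitude below the lower bound, so `composite_ok` is false and a prediction at that rate would be an extrapolation.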
The proposed guidance still allows for assessment of the composite-liner scenario through the
Tier 1 and Tier 3 evaluations. EPA is seeking comment on how to address inclusion of the
composite-liner scenario in the Tier 2 Location-Adjusted Evaluation. One option currently under
consideration is the development of multiple networks: one for low infiltration rates and one
for high infiltration rates.
3.0 MODELS USED TO DEVELOP THE TWO-TIERED APPROACH
This section presents summary descriptions of the models used to develop the Tier 1 LCTVs and
the Tier 2 neural networks. The following models were used:
• the HELP model for determining WMU infiltration rates;
• the MINTEQA2 model for determining metals sorption; and
• EPACMTP for determining dilution/attenuation factors (DAFs) of waste constituents in
ground water.
Additional information can be found in the references cited in the description for each model.
The software used to develop the neural networks for Tier 2 is discussed briefly in Section 2.3.1
and additional detail on the development of neural networks is provided in Section 5.0.
3.1 HELP
The Hydrologic Evaluation of Landfill Performance (HELP) model is a quasi-two-dimensional
hydrologic model used to compute water balance analyses of landfills, cover systems, and other
solid waste management facilities (U.S. EPA, 1994). HELP uses weather, soil and design data
and computes a water balance for landfill systems accounting for the effects of surface storage,
snowmelt, runoff, infiltration, evapotranspiration, vegetative growth, soil moisture storage,
lateral subsurface drainage, leachate recirculation, unsaturated vertical drainage, and leakage
through soil, geomembrane or composite liners. HELP can model landfill systems consisting of
various combinations of vegetation, cover soils, waste cells, lateral drain layers, low permeability
barrier soils, and synthetic geomembrane liners. The model computes runoff, evapotranspiration,
drainage, leachate collection and liner leakage that may be expected to result from the operation
of a wide variety of landfill designs. The primary purpose of the model is to assist in the
comparison of design alternatives.
3.1.1 Determining Infiltration from a Waste Management Unit
The HELP model requires that the user: input climatic data, using historical data provided for up
to 10L cities in the U.S.; select the number of soil, waste and liner layers (e.g., liner, cover, waste
layer, drainage layer) and the characteristics of each layer (thickness, hydraulic conductivity,
etc.); and select surface characteristics (SCS Runoff Curve number, vegetative cover, and
whether the layer is compacted). HELP uses these inputs to compute a water balance for the
scenario selected by the user by computing the amount of precipitation that reaches the surface of
the unit, minus the amount of runoff and evaporation. HELP then computes the amount of water
that infiltrates through the surface layer (if applicable), into the waste layer, and through the
bottom soil or liner, based on the initial moisture content and the hydraulic conductivity of each
layer. The output from the model is the amount of water that infiltrates through the bottom of the
WMU. This value is then input to EPACMTP as the infiltration rate (in m/yr).
3.1.1.1 Determination of Surface Impoundment Infiltration Rate
The HELP model does not have the capability to estimate infiltration rates from surface
impoundments. However, this capability is incorporated into EPACMTP. EPACMTP estimates
the rate of infiltration through the base of the impoundment as a function of (1) the ponding
depth of liquid in the waste unit, (2) the thickness of a low permeability sediment layer, or liner,
at the base of the impoundment, and (3) the hydraulic conductivity of this impeding layer.
Additional information regarding characterization and modeling of liquid wastes managed in
surface impoundments is provided in the "EPACMTP Background Document and User's Guide"
(U.S. EPA, 1996a).
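To illustrate how the three quantities listed above can combine, the sketch below applies Darcy's law across a saturated impeding layer. This is an assumed simplification for illustration and is not taken from the EPACMTP documentation: it assumes atmospheric pressure at the base of the layer, and the function name and input values are hypothetical.

```python
def impoundment_infiltration(ponding_depth_m, layer_thickness_m,
                             layer_conductivity_m_per_yr):
    """Darcy flux (m/yr) through a saturated impeding layer.

    Assumption (not from the source): pressure at the base of the layer is
    atmospheric, so the head drop across the layer equals the ponding depth
    plus the layer thickness.
    """
    if layer_thickness_m <= 0:
        raise ValueError("layer thickness must be positive")
    hydraulic_gradient = (ponding_depth_m + layer_thickness_m) / layer_thickness_m
    return layer_conductivity_m_per_yr * hydraulic_gradient


# Hypothetical inputs: 2 m of ponded liquid over a 0.5 m layer with K = 0.01 m/yr.
rate = impoundment_infiltration(2.0, 0.5, 0.01)
```

Under these assumptions the rate increases with ponding depth and with conductivity, and decreases as the impeding layer thickens.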
3.1.2 Sensitivity of HELP Model to Input Parameter Values
As part of the evaluation of the applicability of the HELP model to the two-tiered approach, a
sensitivity analysis was performed on the HELP model to determine the most sensitive HELP
input parameters (Allison Geoscience Consultants, Inc., 1997a). While the industrial waste
guidance focuses its evaluation on the liner characteristics, this sensitivity analysis examined all
HELP model input parameters to confirm the sensitivity of the liner characteristics. The
sensitivity of the infiltration rate (computed with the HELP model) input to EPACMTP is
discussed in Section 5.1.
The sensitivity analysis quantitatively ranked HELP model input parameters based on their
influence in determining the model's prediction of the infiltration rate of liquid from the bottom
of a simulated landfill. The input parameters were restricted to a range of values expected in
actual field designs and the analysis used a Monte Carlo shell program developed for EPA
(Salhotra et al., 1988) linked with HELP. The Monte Carlo shell program linked to HELP
enabled the specification of many of the HELP input parameters as probabilistic distributions
rather than single, deterministic values. This allowed the parameter values to be random
variables that take on new values each time the model is executed. The Monte Carlo shell
supervises multiple executions of the HELP model, with each random variable assuming a new
value for each execution. This can be likened to running the HELP model many times, with new
values of the input parameters each time.
The probability distributions for each HELP input parameter limit the range of values the
parameter can assume and even statistical properties that the set of generated parameter values
should have, such as the mean and standard deviation. In this way, a large number of HELP
simulations, each made with input parameters that have been randomly generated but constrained
according to particular limits and statistical measures, are completed with a minimum of user
interaction. This large set of HELP simulations can be thought of as a set of deterministic model
outcomes (or outcomes given some particular set of inputs). Each of the randomly generated
model inputs was saved along with the model outcomes to determine which input parameters are
most important in determining model outcome and the parameters were ranked according to their
impact on the model output. This was accomplished by developing a correlation matrix that
shows the statistical correlation between each model input parameter and the model outcome of
interest.
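The Monte Carlo shell and correlation ranking described above can be sketched as follows. This is an illustration under stated assumptions, not the shell program of Salhotra et al. (1988): the toy model is a hypothetical stand-in for a HELP run, the parameter names are placeholders on a normalized 0 to 1 scale, and uniform distributions are assumed for simplicity.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def run_monte_carlo(model, distributions, n_runs, seed=0):
    """Execute the model n_runs times, drawing each input from its range."""
    rng = random.Random(seed)
    inputs = {name: [] for name in distributions}
    outputs = []
    for _ in range(n_runs):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in distributions.items()}
        for name, value in params.items():
            inputs[name].append(value)   # save inputs alongside outcomes
        outputs.append(model(params))
    return inputs, outputs


def rank_by_correlation(inputs, outputs):
    """Rank inputs by absolute correlation with the outcome, largest first."""
    corr = {name: pearson(values, outputs) for name, values in inputs.items()}
    return sorted(corr.items(), key=lambda item: abs(item[1]), reverse=True)


def toy_model(params):
    """Hypothetical stand-in for a HELP run; NOT the HELP water balance."""
    return 2.0 * params["surface_soil"] - 0.5 * params["curve_number"]


# Normalized, hypothetical input ranges; "liner_thickness" does not affect the toy model.
distributions = {
    "surface_soil": (0.0, 1.0),
    "curve_number": (0.0, 1.0),
    "liner_thickness": (0.0, 1.0),
}
inputs, outputs = run_monte_carlo(toy_model, distributions, n_runs=2000)
ranking = rank_by_correlation(inputs, outputs)
```

Because the toy outcome depends most strongly on the first input and not at all on the third, the ranking recovers that order, which is the kind of result the correlation matrix was used to obtain for HELP.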
The landfill scenario evaluated consisted of a closed landfill scenario with a
vegetative cover layer, a barrier layer, a waste layer, a lateral drainage layer with a leachate
collection system, and a clay liner. Analyses were conducted for two representative sites, one in
a humid region and one in an arid region of the U.S., based on climate data provided with the
HELP model.
The sensitivity analysis indicated that the HELP model prediction of the average annual
infiltration for a landfill located in a relatively humid region (Atlanta, Georgia) is most sensitive to the
following parameters:
1) the layer 1 (surface) soil textural and associated hydraulic properties,
2) the SCS runoff curve number,
3) the evaporative zone depth,
4) the quality and quantity of vegetation maintained on the cover, and
5) the hydraulic properties of the cap barrier layer.
For the same design in an arid region (Las Vegas, Nevada) the model outcome is most sensitive
to:
1) the evaporative zone depth,
2) the layer 1 soil textural and associated hydraulic properties, and
3) the quality and quantity of vegetation maintained on the surface.
The lack of rainfall in the model simulations for the arid case resulted in little or no moisture
reaching the liner. This may have confounded the sensitivity analysis to the extent that the liner
parameters for these scenarios have no impact on the results. This may suggest that in such dry
regions the characteristics of the surface component of the landfill cap are very important in
determining landfill performance.
Another conclusion of this sensitivity analysis applies not just to closed landfills, but to HELP's
prediction of the performance of the bottom liner and leachate collection system. The percent of
the infiltrate that passes through the cap of a landfill, or that migrates to the bottom liner in the
case of an open landfill, and then percolates through the bottom liner, as predicted by HELP, is
determined by: 1) the hydraulic properties of the bottom liner and 2) the maximum drainage
distance of the leachate collection system. Neither the thickness of the liner or of the drainage layer,
nor the slope to the drains, appears to be of much importance in HELP's estimate of this landfill
performance measure. This conclusion is derived from analysis of the humid region results only,
because of the lack of leachate that actually penetrated the cap barrier in the arid region case.
A related conclusion concerns the sensitivity of the model's prediction of the liquid head on the
bottom liner. That model outcome is most sensitive to: 1) the hydraulic properties of the bottom
liner, 2) the maximum drainage distance of the leachate collection system, 3) the soil textural and
hydraulic properties of the lateral drainage layer, and 4) the slope of the bottom layer in between
the drains. This conclusion also is based only on the humid region modeling results; however, it
would apply to the arid region scenario if a sufficient amount of liquid were to reach the bottom
liner of the landfill.
The EPA acknowledges the importance of landfill design characteristics, and the evaluation of
drainage layers and leachate collection systems may most appropriately be considered in detailed
Tier 3 analyses. However, the focus of the two-tiered approach is on the three liner scenarios
described in Section 2.0. The HELP model sensitivity analysis confirms the primary importance
of the hydrologic characteristics of the liners considered in Tier 1 and Tier 2.
3.2 MINTEQA2-derived and Empirical Isotherms
The process that most affects the transport of metals in the subsurface is sorption. The effect of
sorption on metals transport is determined with sorption distribution coefficients. The sorption
distribution coefficients (Kd) for metals used in the Tier 1 and Tier 2 scenarios are compatible
with those used in other U.S. EPA regulatory efforts. The Kd values used in the modeling on
which the Tier 1 approach is based are non-linear representations in which Kd is represented as a
function of the total metal concentration and several important geochemical variables. The
neural network used in the Tier 2 approach precluded the use of metal concentration-dependent
Kd values. Therefore, empirical representations in which Kd is represented as a function of pH
only were used. The methods used to model metals sorption are described in the following
sections.
3.2.1 The Kd Values for Metals Used in Modeling Support for Tier 1
The Kd values for metals used in the modeling support for the Tier 1 approach were identical to
those presented in U.S. EPA (1995). The Kd values were characterized in one of two ways,
depending on the metal: 1) using empirical relationships that express Kd as a function of pH or 2)
using the MINTEQA2 speciation model (Allison et al., 1991).
Empirical relationships that express K
chromium in the +3 oxidation state (Cr(III)),
copper (Cu),
lead (Pb),
mercury (Hg),
nickel (Ni),
silver (Ag),
vanadium (V), and
zinc (Zn).
The system-dependent, metal concentration-dependent Kd values computed with MINTEQA2
were then used in the unsaturated zone fate and transport modeling performed with EPACMTP.
The MINTEQA2 model and its use in determining the non-linear Kd relationships are described
below.
Description of MINTEQA2
MINTEQA2 is an equilibrium speciation model developed and distributed by the U.S.
EPA (Allison et al., 1991). For a particular metal, the model is used to calculate the
equilibrium mass distribution among dissolved, sorbed, and precipitated phases.
The input data for the model include the pH, the major ion composition of the ground
water, the expected concentration of dissolved and particulate organic carbon
(DOC and POC), and an estimate of the concentration of binding sites associated with
hydrous ferric oxide (HFO) of the solid aquifer matrix. The output from MINTEQA2
can be used to compute Kd. Specifically, for a particular metal, Kd is the ratio of total
sorbed metal concentration to total dissolved metal concentration at equilibrium. The key
assumption in using MINTEQA2 to compute Kd values for ground water is that the rate of
chemical reactions, including sorption reactions, is fast relative to the ground-water flow
velocity.
MINTEQA2
Input data:
• pH,
• the major ion composition of the ground water,
• the expected concentration of dissolved and particulate organic carbon (DOC and POC), and
• an estimate of the concentration of binding sites associated with hydrous ferric oxide (HFO) of the
solid aquifer matrix.
Output:
• used to compute Kd, the ratio of total sorbed metal concentration to total dissolved metal
concentration at equilibrium.
A typical Kd result computed by MINTEQA2 for Cd is shown in Figure 3-1. This result shows
the variation in log Kd with the total cadmium concentration (log scale) for the case where pH is
6.8 and the concentrations of Cd binding sites associated with the DOC, POC, and HFO are all
set to a mid-range value from a distribution of reasonable values. Details of the use of
MINTEQA2 in computing Kd values used in Tier 1 modeling support are presented elsewhere
(U.S. EPA, 1996c).
[Figure 3-1: plot of log Kd versus Log Total Cd (mg/L)]
Figure 3-1 Variation in log Kd with log of Total Cd Concentration.
3.2.2 The Kd Values for Metals Used in the Neural Network for Tier 2
The Kd values for metals used in the neural network for Tier 2 are determined on the basis of pH.
The pH-dependency for each metal is represented in the form of a regression equation of the
form:
$$\log K_d = a \cdot \mathrm{pH} + b$$
as presented in U.S. EPA (1990). The empirically derived a and b parameters
are shown in Table 3-2, as are representative values of Kd at pH 4.9, 6.8, and 8.0. The pH-
dependent relationships whose parameters are shown in the table were determined from
experimental analysis of aquifer material samples collected from across the U.S. The pH-
dependent Kd relationships for As(III), Cr(VI), Sb(V), Se(IV), and Tl(I) have been used in other
U.S. EPA regulatory efforts (e.g., U.S. EPA, 1995).
The metals Ag, Cr(III), and V were not included in the U.S. EPA (1990) study. For the Tier 2
neural network, the pH-dependent relationship presented in Table 3-2 for Ni is also used for Ag,
and the relationship for Cu is also used for Cr(III). Vanadium is assumed to be in the anionic
form vanadate, which is modeled with the As(V) relationship from Table 3-2.
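The regression and the surrogate assignments described above can be evaluated with a few lines of code. The sketch below is illustrative only: the function and dictionary names are hypothetical, and only a subset of the Table 3-2 parameters is included.

```python
# (a, b) regression parameters for selected metals, taken from Table 3-2.
REGRESSION = {
    "Cd": (0.397, -0.943),
    "Ni": (0.332, -0.471),
    "Cu": (0.355, 0.044),
    "As(V)": (-0.219, 3.340),
}

# Metals not in the U.S. EPA (1990) study borrow another metal's relationship.
SURROGATE = {"Ag": "Ni", "Cr(III)": "Cu", "V": "As(V)"}


def kd_from_ph(metal, ph):
    """Kd in L/kg from log Kd = a * pH + b."""
    a, b = REGRESSION[SURROGATE.get(metal, metal)]
    return 10 ** (a * ph + b)


kd_cd = kd_from_ph("Cd", 6.8)   # about 57 L/kg, matching the Table 3-2 entry
kd_ag = kd_from_ph("Ag", 6.8)   # uses the Ni relationship, about 61 L/kg
```

Metals with a positive slope a, such as Cd, sorb more strongly as pH rises, while those with a negative slope, such as As(V), sorb less strongly.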
Development of the Kd Values Used in the Tier 2 Neural Network
The pH-dependent Kd relationships presented in Table 3-2 were determined from analyses of
six aquifer material/ground-water samples. Aquifer material samples were added to ground-
water samples to which metal had been added at concentrations ranging from 3 to 10 mg/L.
After an equilibration period of 48 hours, the ground-water solution was extracted by
centrifugation, and the metal concentrations remaining in solution were analyzed by inductively-
coupled plasma spectrometry (ICP). In calculating Kd for a particular metal (M), the sorbed
metal concentration (Cs) is equivalent to the difference between the total metal [M]total (after
metal addition to the sample) and the metal remaining in the extracted supernatant solution
normalized by the concentration of solid aquifer material SC:
$$C_s\ (\mathrm{mg/kg}) = \frac{\left([M]_{total} - [M]_{solution}\right)\ \mathrm{mg/L}}{SC\ \mathrm{kg/L}}$$
The equilibrium dissolved metal concentration (Cd) is equal to [M]solution, so that Kd is determined
with:
$$K_d\ (\mathrm{L/kg}) = \frac{C_s\ (\mathrm{mg/kg})}{C_d\ (\mathrm{mg/L})} = \frac{\left([M]_{total} - [M]_{solution}\right)\ \mathrm{mg/L}}{[M]_{solution}\ \mathrm{mg/L} \times SC\ \mathrm{kg/L}} \qquad (3)$$
Experiments were conducted over a pH range of approximately 2 to 12. For each metal, the
values from Equation (3) were used in a linear regression analysis with pH as the independent
variable. The parameters defining the resulting linear regression equations are those presented in
Table 3-2.
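The batch calculation defined above (sorbed concentration from the difference between total and remaining metal, normalized by the solids concentration, then divided by the dissolved concentration) can be written as a short function. The function name and the sample values are hypothetical; they are not measurements from the U.S. EPA (1990) study.

```python
def batch_kd(m_total_mg_per_l, m_solution_mg_per_l, solids_kg_per_l):
    """Kd (L/kg) from a batch sorption experiment.

    m_total_mg_per_l:    total metal after addition to the sample (mg/L)
    m_solution_mg_per_l: metal remaining in the extracted solution (mg/L)
    solids_kg_per_l:     concentration of solid aquifer material, SC (kg/L)
    """
    sorbed_mg_per_kg = (m_total_mg_per_l - m_solution_mg_per_l) / solids_kg_per_l
    return sorbed_mg_per_kg / m_solution_mg_per_l


# Hypothetical experiment: 10 mg/L added, 2 mg/L left in solution, SC = 0.1 kg/L.
kd = batch_kd(10.0, 2.0, 0.1)   # sorbed = 80 mg/kg, so Kd is about 40 L/kg
```

Because the measured solution concentration appears in the denominator, any value substituted for it that is larger than the true concentration lowers the computed Kd.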
Limitations of Kd Values Used in the Tier 2 Neural Network
The empirical pH-dependent Kd values of the U.S. EPA (1990) study are subject to uncertainty
and to limitations and assumptions that arise from the experimental methods and the statistical
analysis of the data. Most of the limitations result in increased likelihood of under-estimating
sorption (and Kd) and increased uncertainty in the values. Certainly, the usual assumptions
regarding equilibrium partitioning apply: the rate of partitioning of metal between the solution
and solid phases was such that a close approach to equilibrium was achieved in the 48 hour
equilibration period. If this was not the case, the Kd values may under-estimate actual
partitioning. Also, the usual problems associated with filtering a sample to separate the solution
and solid phases after equilibration contribute to uncertainty in the results. Other factors that
should be taken into account when assessing these results include:
1. The [M]solution values used in Equation (3) did not always represent the actual
concentration of metal remaining in solution. This is because of limitations of the
ICP instrument used for this measurement. In cases where the concentration of
metal remaining in solution was below the instrument detection limit, the value of
[M]solution used in Equation (3) was set equal to the detection limit. This primarily
impacted the value of the denominator in Equation (3). In cases where [M]solution
was set to the detection limit, the resulting Kd was under-estimated. Different
metals have different detection limits in this instrument, as detailed in U.S. EPA
(1990).
Table 3-2 Regression Parameters of the pH-dependent Kd Relationships used in the
Neural Network (U.S. EPA, 1990).
                                       Kd (L/kg)
Metal     a          b                 pH 4.9     pH 6.8     pH 8.0
As(III)    0.0322     1.240±0.910       24.99      28.77      31.45
As(V)     -0.219      3.340±0.641      184.88      70.93      38.73
Ba         0.190      0.638±0.811       37.07      85.11     143.88
Be         0.378     -0.200±0.907       44.90     234.64     666.81
Cd         0.397     -0.943±0.820       10.05      57.10     171.00
Cr(VI)    -0.117      2.070±1.309       31.38      18.81      13.61
Cu         0.355      0.044±1.022       60.74     287.08     765.50
Hg         0.122      1.420±0.493      104.18     177.66     248.89
Ni         0.332     -0.471±0.797       14.32      61.18     153.11
Pb         0.0768     1.550±0.394       84.40     118.10     146.02
Sb(V)     -0.207      2.996±0.512       95.87      38.76      21.88
Se(IV)    -0.282      3.300±0.438       82.83      24.12      11.07
Se(VI)    -0.296      2.710±0.947       18.18       4.98       2.20
Tl(I)      0.110      1.102±0.542       43.75      70.79      95.94
Zn         0.378     -0.621±0.820       17.03      89.00     252.93
2. For a particular metal, the set of Kd values used in the statistical regression analysis
included values that were calculated using [M]solution set to the detection limit. Thus,
the resulting regression equations include the effects of under-estimates of Kd.
3. The ratio of solid aquifer material to solution was representative of a system with a
low concentration of sorbing sites relative to the metal concentration. For a given
metal concentration, such systems generally present a lower Kd than systems with a
higher concentration of sorbing sites. The degree to which the resulting Kd values
tend toward the conservative because of the ratio of sorbing sites to metal
concentration is difficult to quantify.
4. For a particular metal, these K' :?-*
3.2.3 Linear and Non-linear Sorption Isotherms
For a particular instance of transport modeling, the use of Kd values that vary with total metal
concentration corresponds to the assumption of a non-linear metal sorption isotherm. The Tier 1
approach reflects the non-linear isotherm assumption. The use of a single Kd value (not
dependent on metal concentration) in a particular instance of transport modeling, as is done in the
neural network for Tier 2, corresponds to the assumption of a linear sorption isotherm. The
assumption of a linear sorption isotherm is more realistic for metals that are not as strongly
sorbed and for instances where the maximum total metal concentration is low.
Metals are adsorbed by soils and sediments at reactive sites on mineral and organic surfaces. A
plot of sorbed metal concentration versus dissolved metal concentration at equilibrium is referred
to as an isotherm, and the instantaneous slope of the isotherm plot is Kd. At low total metal
concentration, the concentration of unreacted surface sites (i.e., those sites available to sorb
metals) is much greater than the concentration of metal. If the total concentration of metal is
increased, the system will respond by sorbing more metal. Moreover, the proportion of the added
metal that is sorbed is essentially constant. Thus, Kd changes little so long as the concentration
of unreacted sites is not significantly reduced. At higher metal concentrations, the concentration
of unreacted sites begins to be depleted, and Kd for the system decreases more noticeably. For a
particular metal at a particular total concentration of sorbing sites, the isotherm plot will show an
almost linear trend (almost constant Kd) if only the low metal concentration portion of the curve
is considered. The overall nature of the isotherm curve is non-linear, and this will be obvious if
the portion extending to high metal concentration is considered. The instantaneous slope (Kd)
can be treated as approximately constant at low metal concentrations, but at higher metal
concentrations, the slope changes considerably from the initial value.
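The behavior described above, a nearly constant slope at low concentration that falls as sorbing sites are depleted, can be illustrated with a Langmuir isotherm. The Langmuir form is an assumption chosen for illustration; it is not the isotherm computed by MINTEQA2, and the parameter values are hypothetical.

```python
def langmuir_sorbed(c_dissolved, s_max, k_l):
    """Sorbed concentration (mg/kg) for dissolved concentration c (mg/L).

    s_max is the site capacity (mg/kg) and k_l the affinity constant (L/mg).
    """
    return s_max * k_l * c_dissolved / (1.0 + k_l * c_dissolved)


def langmuir_slope(c_dissolved, s_max, k_l):
    """Instantaneous slope of the isotherm (L/kg) at dissolved concentration c."""
    return s_max * k_l / (1.0 + k_l * c_dissolved) ** 2


# Hypothetical parameters: capacity 100 mg/kg, affinity 1 L/mg.
S_MAX, K_L = 100.0, 1.0
slope_initial = langmuir_slope(0.0, S_MAX, K_L)    # slope of the pristine system
slope_low = langmuir_slope(0.001, S_MAX, K_L)      # nearly unchanged at low metal
slope_high = langmuir_slope(10.0, S_MAX, K_L)      # much smaller once sites deplete
```

With these values the slope at 0.001 mg/L is within about 0.2 percent of the initial slope, while at 10 mg/L it has dropped by more than a factor of 100, which is why the initial slope is the largest value the system can show.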
The use of a single Kd corresponding to a linear isotherm has been favored in fate and transport
modeling for metals because for most systems, too little data has been available to characterize
the change in Kd with metal concentration, and because transport modeling is simplified by
choosing a single value. The distinguishing feature of a non-linear isotherm is that it accounts
for the depletion of available surface sites as successive pore volumes bring more metal into the
system. This is important because, while the leachate concentration exiting a landfill or other
source may not be very high, the total metal concentration in an equilibrating mass of aquifer in
the subsurface will be equal to the leachate concentration only for the first pore volume. As
adsorption occurs, successive pore volumes continue to bring in metal until the source is
depleted. The total concentration of metal in the equilibrating mass of aquifer material will be the
sum of the newly arriving leachate metal and the previously adsorbed metal from prior pore
volumes. Depending on the initial Kd and the number of pore volumes of leachate that pass
through, the total concentration of metal can become high, and the Kd may decrease due to
depletion of unreacted sites.
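The pore-volume argument can be sketched numerically. The example below is an illustration under stated assumptions, not the procedure used in the modeling: it assumes a Langmuir isotherm, a single well-mixed cell of aquifer material, and hypothetical values for the water volume, solid mass, and isotherm parameters.

```python
def langmuir_sorbed(c, s_max, k_l):
    """Sorbed concentration (mg/kg) for an assumed Langmuir isotherm."""
    return s_max * k_l * c / (1.0 + k_l * c)


def equilibrate(total_mass, water_l, solid_kg, s_max, k_l):
    """Find dissolved concentration C (mg/L) that satisfies the mass balance
    C * water_l + solid_kg * S(C) = total_mass, using bisection."""
    lo, hi = 0.0, total_mass / water_l
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        mass = mid * water_l + solid_kg * langmuir_sorbed(mid, s_max, k_l)
        if mass > total_mass:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def flush_pore_volumes(n_volumes, c_leachate, water_l=1.0, solid_kg=5.0,
                       s_max=50.0, k_l=2.0):
    """Pass successive pore volumes of leachate through one mixed cell.

    Returns a list of (dissolved C, sorbed S, instantaneous Kd) per volume.
    All default values are hypothetical.
    """
    sorbed = 0.0
    history = []
    for _ in range(n_volumes):
        # Total metal = metal already sorbed + metal in the new pore volume.
        total = sorbed * solid_kg + c_leachate * water_l
        c = equilibrate(total, water_l, solid_kg, s_max, k_l)
        sorbed = langmuir_sorbed(c, s_max, k_l)
        slope = s_max * k_l / (1.0 + k_l * c) ** 2
        history.append((c, sorbed, slope))
    return history


if __name__ == "__main__":
    for i, (c, s, kd) in enumerate(flush_pore_volumes(200, 1.0), start=1):
        if i in (1, 10, 50, 100, 200):
            print(f"pore volume {i:3d}: C = {c:.4f} mg/L, Kd = {kd:.1f} L/kg")
```

In this sketch the dissolved concentration climbs toward the leachate concentration and the instantaneous Kd falls with each pore volume, so the Kd of the pristine cell is the largest value the cell ever shows.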
Given the possible decrease in Kd in the subsurface with continued leaching, when it is necessary
to use a single Kd for fate and transport modeling, a conservative value should be chosen. As can
be seen from the foregoing discussion, the initial Kd of the pristine system is not a conservative
value; rather, it is the maximum expected Kd. The single Kd values used in the Tier 2 approach
represent values that have a conservative bias due to several factors (see discussion of limitations
above). The ratio of solid sorbing sites to total metal concentration in the experiments from
which these Kd values were determined contributes to the inherent conservatism of the values in a
way that is difficult to assess.

3.3 EPACMTP
EPA's Composite Model for Leachate Migration with Transformation Products (EPACMTP) is a
fate and transport model used by EPA to establish regulatory levels for concentrations of
chemicals in wastes managed in land disposal units (landfills, surface impoundments, waste
piles, or land application units) for a number of EPA hazardous waste regulatory efforts.
EPACMTP simulates one-dimensional, vertically downward flow and transport of contaminants
in the unsaturated zone beneath a waste disposal unit as well as two-dimensional or three-
dimensional ground-water flow and contaminant transport in the underlying saturated zone. The
model accounts for the following processes affecting contaminant fate and transport: advection,
hydrodynamic dispersion, linear or nonlinear equilibrium sorption, chained first-order decay
reactions, and dilution from recharge in the saturated zone.
EPACMTP can be run in a probabilistic Monte Carlo mode (see Section 2.2) to randomly select
parameter values from their respective statistical distributions. The Monte Carlo procedure
allows assessment of the uncertainty associated with ground-water well concentrations that
results from variations in the model input parameters.

The subsurface modeled by EPACMTP consists of a WMU, an unsaturated zone, an underlying
water table aquifer, and a ground-water well at a downgradient location in the aquifer (refer back
to Figure 2-1). Contaminants move vertically downward from the base of the WMU through the
unsaturated zone to the water table. EPACMTP allows simulation of flow and transport in the
unsaturated zone and in the saturated zone, either separately or combined. For the two-tiered
approach, the combined scenario was used.

EPACMTP consists of four major components:
• analytical and numerical solutions for water flow and contaminant transport in the
unsaturated zone beneath a WMU;
• a numerical module for steady-state ground-water flow subject to recharge from the
unsaturated zone;
• a module of analytical and numerical solutions for contaminant transport in the
saturated zone; and
• a Monte Carlo module for randomly selecting input values to account for the effect of
variations in model parameters on predicted ground-water well concentrations.
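The Monte Carlo idea can be sketched as follows. This is only a minimal illustration: toy_well_concentration is a hypothetical stand-in and has no relation to the EPACMTP equations, and the input distributions are invented for the example.

```python
import random


def toy_well_concentration(leachate_conc, infiltration, gradient):
    """Hypothetical stand-in for a transport model (NOT EPACMTP).

    Returns a well concentration that is diluted more strongly when the
    gradient is high relative to the infiltration rate.
    """
    dilution = 1.0 + gradient * 1000.0 / infiltration
    return leachate_conc / dilution


def percentile(values, p):
    """Simple empirical percentile (no interpolation)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(p / 100.0 * len(ordered)))
    return ordered[index]


def monte_carlo(n_runs, seed=0):
    """Draw inputs from assumed distributions and collect model outputs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        infiltration = rng.lognormvariate(-1.5, 0.8)  # m/yr, assumed
        gradient = rng.uniform(0.001, 0.02)           # m/m, assumed
        results.append(toy_well_concentration(1.0, infiltration, gradient))
    return results


if __name__ == "__main__":
    outputs = monte_carlo(5000)
    print("50th percentile:", percentile(outputs, 50))
    print("90th percentile:", percentile(outputs, 90))
```

Each run uses one randomly drawn parameter set, and the collection of outputs forms the distribution from which a percentile such as the 90th can be read.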
The model accounts for the following mechanisms affecting leachate constituent migration:
• leachate constituent transport by advection and dispersion,
• leachate constituent retardation resulting from reversible linear or nonlinear
equilibrium adsorption onto the soil and aquifer solid phase, and
• chemical and biological degradation processes expressed as a first-order decay
rate.
The latter may involve chain decay reactions if the contaminant or contaminants of concern form
a decay chain. The assumptions and input parameters required to run EPACMTP for the
majority of EPA ground-water evaluations are listed in Table 3-3. Modeling assumptions and
input parameter values that are specific to computing infiltration rates for the Guidance are listed
in Tables 4-1 through 4-3.
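Two of the mechanisms listed above have simple textbook forms that can be sketched in code. The example assumes the standard retardation factor for linear sorption, R = 1 + (bulk density x Kd) / water content, and a two-member first-order decay chain. These are generic expressions, not a transcription of the EPACMTP equations, and the function names are hypothetical.

```python
import math


def retardation_factor(bulk_density, kd, water_content):
    """Retardation factor for linear equilibrium sorption.

    bulk_density in g/cm3, kd in mL/g, water_content as a volume fraction.
    """
    return 1.0 + bulk_density * kd / water_content


def chain_decay(parent0, lam_parent, lam_daughter, t, yield_factor=1.0):
    """Parent and daughter amounts after time t for a two-member chain.

    Both members decay by first-order kinetics; yield_factor is the amount
    of daughter formed per unit of parent decayed. Requires distinct rates.
    """
    if lam_parent == lam_daughter:
        raise ValueError("this closed form needs lam_parent != lam_daughter")
    parent = parent0 * math.exp(-lam_parent * t)
    daughter = (yield_factor * parent0 * lam_parent
                / (lam_daughter - lam_parent)
                * (math.exp(-lam_parent * t) - math.exp(-lam_daughter * t)))
    return parent, daughter


if __name__ == "__main__":
    print("R =", retardation_factor(1.65, 0.8, 0.3))
    print("parent, daughter =", chain_decay(1.0, 0.1, 0.01, 20.0))
```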
Table 3-3 EPACMTP Modeling Assumptions and Input Parameters

Overall Assumptions
Modeling Element               Description or Value
Management Scenario            Landfill
                               Surface impoundment
                               Waste pile
                               Land application unit
Modeling Scenario              Finite-source Monte Carlo; depleting source for organics,
                               constant concentration pulse source for metals
Exposure Evaluation            Downgradient ground-water monitoring well; peak well
                               concentrations for noncarcinogens, maximum 30-year
                               average well concentration for carcinogens; 10,000-year
                               exposure time limit
Regulatory Protection Level    90 percent

Source Parameters
Parameter                      Description or Value
Waste Unit Area                Derived from Industrial D Waste Survey Data
Waste Unit Volume              User-specified
Infiltration Rate
  Landfill                     Site-based, derived from water balance using HELP model
  Surface Impoundment          Site-based, derived from impoundment depth using Darcy's law
  Waste Pile                   Site-based, derived from water balance using HELP model
  Land Application Unit        Site-based, derived from water balance using HELP model
Leaching Duration
  Landfill                     Derived from total mass of waste, continues until all
                               constituents have leached out
  Surface Impoundment          20 years (operational life of unit)
  Waste Pile                   20 years (operational life of unit)
  Land Application Unit        40 years (operational life of unit)
Table 3-3 EPACMTP Modeling Assumptions and Input Parameters (cont'd)

Chemical-Specific Parameters
Parameter                      Description
Decay Rate
  Organic Constituents         Hydrolysis constants compiled by U.S. EPA ORD
  Metals                       No decay
Sorption
  Organic Constituents         Koc constants compiled by U.S. EPA ORD
  Metals                       MINTEQA2 sorption isotherm coefficients for [Pb, Hg (II),
                               Ni, Cr (III), Ba, Cd, Ag, Zn, Cu (II), Be]; pH-dependent
                               isotherm coefficients for [As (III), Cr (VI), Se (VI), Tl (I)]

Unsaturated Zone Parameters
Parameter                      Description
Depth to Ground Water          Site-based, from API/USGS hydrogeologic database
Soil Hydraulic Parameters      ORD data based on national distribution of three soil types
                               (sandy loam, silt loam, silty clay loam)
Fraction Organic Carbon        ORD data based on national distribution of three soil types
                               (sandy loam, silt loam, silty clay loam)
Bulk Density                   ORD data based on national distribution of three soil types
                               (sandy loam, silt loam, silty clay loam)

Saturated Zone Parameters
Parameter                      Description
Recharge Rate                  Site-based, derived from regional precipitation and
                               evaporation and soil type
Saturated Thickness            Site-based, from API and USGS hydrogeologic database
(depth to water table)
Hydraulic Conductivity         Site-based, from API and USGS hydrogeologic database
Hydraulic Gradient             Site-based, from API and USGS hydrogeologic database
Porosity                       Effective porosity derived from national distribution of
                               aquifer particle diameter
Bulk Density                   Derived from porosity
Dispersivity                   Derived from distance to ground-water well
Ground-water Temperature       Site-based, from USGS regional temperature map
Fraction Organic Carbon        National distribution, from U.S. EPA STORET database
pH                             National distribution, from U.S. EPA STORET database
Table 3-3 EPACMTP Modeling Assumptions and Input Parameters (cont'd)

Ground-Water Well Parameters
Well Element                   Description
Radial Distance From WMU       Nationwide distribution, from U.S. EPA screening survey
                               (U.S. EPA, 1987)
Angle Off-Center               Uniform within ±90° from plume center line (no restriction
                               within plume)
Depth of Intake Point          Uniform throughout saturated thickness of aquifer

Notes:
API    = American Petroleum Institute
HELP   = Hydrologic Evaluation of Landfill Performance
ORD    = Office of Research and Development
USGS   = United States Geological Survey
STORET = Data Base Utility for the Storage and Retrieval of Chemical, Physical, and
         Biological Data for Water Quality
For additional information on EPACMTP data sources, see U.S. EPA (1996a, 1996b, and 1996c).
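The two exposure measures named in the Exposure Evaluation entry of Table 3-3 (peak well concentration and maximum 30-year average well concentration) can be sketched as below. The sketch assumes the model output is a list of well concentrations at equal, one-year time steps; that format and the function names are assumptions made for illustration.

```python
def peak_concentration(series):
    """Highest concentration in the time series."""
    return max(series)


def max_moving_average(series, window):
    """Largest average over any run of `window` consecutive time steps."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    best = running
    for i in range(window, len(series)):
        # Slide the window forward by one step.
        running += series[i] - series[i - window]
        best = max(best, running)
    return best / window


if __name__ == "__main__":
    # Hypothetical breakthrough curve: rise, plateau, decline.
    curve = [0.0] * 20 + [0.5] * 10 + [1.0] * 15 + [0.5] * 10 + [0.0] * 45
    print("peak:", peak_concentration(curve))
    print("max 30-year average:", max_moving_average(curve, 30))
```

Because an average over any window cannot exceed the largest value in it, the maximum 30-year average is never higher than the peak.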
3.3.1 Analysis of Unsaturated Zone and Saturated Zone Flow and Transport
The method used in EPACMTP to analyze flow and transport in the unsaturated and saturated
zones is described as follows:

Flow in the Unsaturated Zone. Flow in the unsaturated zone is assumed to be steady-state,
one-dimensional, vertical flow from beneath the source toward the water table. The lower
boundary of the unsaturated zone is assumed to be the water table. The flow in the unsaturated
zone is predominantly gravity-driven, and therefore the vertical flow component accounts for most
of the fluid flux between the source and the water table. The flow rate is assumed to be
determined by the long-term average infiltration rate through the WMU. In surface
impoundments, the flow rate is assumed to be determined by the average depth of ponding.
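For surface impoundments, Table 3-3 states that the infiltration rate is derived from impoundment depth using Darcy's law. A minimal sketch of that idea follows. It assumes saturated flow through a single low-permeability layer at the base of the impoundment with zero pressure head beneath the layer; the actual formulation in EPACMTP may differ, and the function name and values are hypothetical.

```python
def darcy_infiltration(layer_conductivity, ponding_depth, layer_thickness):
    """Darcy flux (same length/time units as layer_conductivity).

    Assumes the head drop across the layer equals the ponding depth plus
    the layer thickness, with zero pressure head at the layer's base.
    """
    if layer_thickness <= 0:
        raise ValueError("layer_thickness must be positive")
    gradient = (ponding_depth + layer_thickness) / layer_thickness
    return layer_conductivity * gradient


if __name__ == "__main__":
    # Hypothetical values: K = 0.1 m/yr, 2 m of ponding, 0.5 m layer.
    print(darcy_infiltration(0.1, 2.0, 0.5), "m/yr")
```

Under these assumptions the flux grows with ponding depth and with the conductivity of the layer, and shrinks as the layer gets thicker.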
Transport in the Unsaturated Zone. Contaminant transport in the unsaturated zone is assumed
to occur by advection and dispersion. The unsaturated zone is assumed to be initially
contaminant-free, and contaminants are assumed to migrate vertically downward from the
disposal facility. EPACMTP can simulate both steady-state and transient transport in the
unsaturated zone with single-species or multiple-species chain decay reactions, and with linear or
nonlinear sorption.

Flow in the Saturated Zone. The saturated zone module of EPACMTP is designed to simulate
flow in an unconfined aquifer with constant saturated thickness. The model assumes regional
flow in a horizontal direction with vertical disturbance resulting from recharge and infiltration
from the overlying unsaturated zone and waste disposal facility. The lower boundary of the
aquifer is assumed to be impermeable. Flow in the saturated zone is assumed to be steady-state.
EPACMTP accounts for different recharge rates beneath and outside the source area. Ground-
water mounding beneath the source is represented in the flow system by increased head values at
the top of the aquifer. This approach is reasonable as long as the height of the mound is small
relative to the thickness of the saturated zone.

Transport in the Saturated Zone. Contaminant transport in the saturated zone is assumed to be
a result of advection and dispersion. The aquifer is assumed to be initially contaminant-free, and
contaminants are assumed to enter the aquifer only from the unsaturated zone immediately
underneath the waste disposal facility, which is modeled as a rectangular, horizontal plane
source. EPACMTP can simulate both steady-state and transient three-dimensional transport in
the aquifer. For steady-state transport, the contaminant mass flux entering at the water table must
be constant with time; for the transient case, the flux at the water table may be constant or may
vary as a function of time. EPACMTP can simulate the transport of a single species or multiple
species subject to chain decay reactions, and linear or non-linear sorption.

EPACMTP also accounts for chemical and biological transformation processes. All
transformation reactions are represented by first-order decay processes. An overall decay rate is
specified for the model; the model cannot explicitly consider the separate effects of multiple
degradation processes such as oxidation, hydrolysis, and biodegradation. The user must
determine the overall, effective decay rate when multiple decay processes are to be represented.
EPACMTP also has the capability of determining the overall decay rate from chemical-specific
hydrolysis constants using soil and aquifer temperature and pH values. EPACMTP assumes that
reaction stoichiometry is prescribed for scenarios involving chain decay reactions. The
speciation factors are specified as constants by the user (see the "EPACMTP Background
Document and User's Guide", U.S. EPA, 1996a). In reality, these coefficients may change as
functions of aquifer conditions (for example, temperature and pH), concentration levels of other
chemical components, or both.
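How a user might arrive at a single overall decay rate can be sketched as follows. The example assumes that parallel first-order processes add, and that hydrolysis follows the common acid-catalyzed, neutral, and base-catalyzed form with [H+] and [OH-] taken from pH at 25 °C. The temperature correction mentioned in the text is left out, and none of this is a transcription of the EPACMTP routine; names and values are hypothetical.

```python
def hydrolysis_rate(k_acid, k_neutral, k_base, ph):
    """First-order hydrolysis rate (1/yr) at a given pH.

    k_acid and k_base are second-order constants in L/(mol*yr);
    k_neutral is first-order in 1/yr. Assumes pKw = 14 (about 25 C).
    """
    h_conc = 10.0 ** (-ph)
    oh_conc = 10.0 ** (ph - 14.0)
    return k_acid * h_conc + k_neutral + k_base * oh_conc


def overall_decay_rate(rates):
    """Effective first-order rate for independent parallel processes."""
    return sum(rates)


if __name__ == "__main__":
    k_hyd = hydrolysis_rate(k_acid=1.0e3, k_neutral=0.02, k_base=1.0e5, ph=6.8)
    k_bio = 0.05  # hypothetical biodegradation rate, 1/yr
    print("hydrolysis rate:", k_hyd)
    print("overall rate:", overall_decay_rate([k_hyd, k_bio]))
```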
3.3.2 Data Sources
Data were obtained from a nationwide survey of industrial non-hazardous WMUs (landfills,
surface impoundments, waste piles, and land application units) (Westat, 1987) to characterize
WMUs and hydrogeologic characteristics at facilities nationwide. Parameters and assumptions
used to estimate infiltration of leachate from each type of WMU are provided in the "EPACMTP
Background Document and User's Guide" (U.S. EPA, 1996a).
3.3.3 Model Sensitivities to Input Parameters
Given the large number of EPACMTP input parameters (52) and the variability and wide ranges
of values for some input parameters, EPA's objective in developing the two-tiered approach was
to focus on the most important or most sensitive input parameters (i.e., those parameters whose
values would have the greatest impact on the output of EPACMTP). Focusing on the most
sensitive parameters minimizes the amount of data required to perform an analysis while
maintaining the robustness of the analysis.

The most sensitive input parameters were identified by performing sensitivity analyses which
consisted of running EPACMTP with high and low values of each input parameter and looking
at the resultant difference in the output (the DAF). Three separate sensitivity analyses were
conducted:
1) a deterministic analysis of high and low values of input parameters;
2) a probabilistic analysis of high and low values of input parameters; and
3) a probabilistic analysis of the linear and rank order correlation coefficients to determine
sensitivities of input parameters (Allison Geoscience Consultants, Inc., 1997b).
These three types of sensitivity analysis were performed and the combined results used to best
determine the list of most sensitive parameters. As is described in the summary of each analysis,
subtleties in the model sensitivity for probabilistic runs might only become apparent in a
deterministic sensitivity analysis. However, probabilistic sensitivity analyses were needed
because the modeling for Tier 1 and Tier 2 is probabilistic. The two different probabilistic
sensitivity analyses were performed to provide independent confirmation of the sensitivity of
input parameters for probabilistic modeling analyses.

The list of 52 EPACMTP input parameters to evaluate in the sensitivity analyses was reduced to
28 by identifying the independent variables that directly affect the EPACMTP output (well
concentration). The 24 parameters not evaluated are parameters included in the model to
consider additional fate and transport phenomena but not used for the scenarios of interest to
this analysis, dependent variables, or fixed (constant) values. These parameters were not
evaluated because they are not used for this analysis or are held constant and therefore have no
impact on the model output.

A deterministic sensitivity analysis, described in the following section, was performed to
determine the ranked sensitivity of each of these 28 parameters (Table 3-4).
Table 3-4 List of 28 EPACMTP Input Parameters Used in the Deterministic Sensitivity
Analysis.

Variable                                            Code Name   Value     Units
Infiltration                                        SINFIL      --        m/yr
Recharge                                            RECH        --        m/yr
WMU area                                            AREA        --        m2
Depth of landfill                                   DEPTH       --        m
Fraction of industrial waste in landfill            FRAC        --        unitless
Density of industrial waste                         --          --        g/cm3
pH of ground water                                  --          6.?       unitless
Temperature of ground water                         --          14        °C
Dissolved phase hydrolysis decay rate               --          0.06      1/yr
Sorbed phase hydrolysis decay rate                  --          0.06      1/yr
Average particle diameter of aquifer material       --          0.025     cm
Unsaturated zone thickness (depth to water table)   --          1.17      m
Residual water content of unsaturated zone          --          0.07      unitless
Soil bulk density                                   --          1.65      g/cm3
Soil moisture parameter                             --          0.0153    1/cm
Soil moisture parameter                             --          1.37      unitless
Unsaturated zone hydraulic conductivity             --          0.091     cm/hr
Aquifer thickness                                   --          1.182     m
Hydraulic conductivity of aquifer                   --          3.2       m/yr
Ground-water hydraulic gradient                     --          0.0057    m/m
Porosity                                            --          0.236     unitless
Longitudinal dispersivity of aquifer                --          7         m
Organic carbon partition coefficient                --          0.8       ml/g
Percent organic matter                              --          0.8       ml/g
Fraction organic carbon                             --          0.00043   ml/g
Angle of well off plume centerline                  --          45        degrees
Radial distance to downgradient well                --          427       m
Depth of well intake point (fraction of aquifer     --          0.5       unitless
thickness)
Deterministic Sensitivity Analysis
Deterministic sensitivity analyses of EPACMTP evaluated the effects of changes in single
values of input parameters on a single output value. The deterministic sensitivity analysis
evaluated the impact on predicted ground-water well concentration of changing each parameter
from its 10th to 90th percentile value while holding other parameters at their 50th
percentile values.
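The one-at-a-time procedure described above can be sketched as follows. The model here is a hypothetical toy function, not EPACMTP, the percentile values are invented, and the sensitivity score (absolute log10 ratio of the high-value output to the low-value output) is an assumed measure chosen for the illustration.

```python
import math


def toy_model(p):
    """Hypothetical stand-in for a fate and transport model."""
    return p["area"] * p["infiltration"] / (1.0 + p["distance"])


def one_at_a_time(model, percentiles):
    """Rank parameters by the change in output between p10 and p90 inputs,
    holding every other parameter at its p50 value."""
    base = {name: v["p50"] for name, v in percentiles.items()}
    scores = {}
    for name, v in percentiles.items():
        low = dict(base, **{name: v["p10"]})
        high = dict(base, **{name: v["p90"]})
        scores[name] = abs(math.log10(model(high) / model(low)))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented percentile values, for illustration only.
    inputs = {
        "area": {"p10": 100.0, "p50": 1000.0, "p90": 100000.0},
        "infiltration": {"p10": 0.05, "p50": 0.1, "p90": 0.5},
        "distance": {"p10": 99.0, "p50": 150.0, "p90": 199.0},
    }
    for name, score in one_at_a_time(toy_model, inputs):
        print(f"{name:12s} {score:.2f}")
```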
The advantage of the deterministic sensitivity analysis is that it allowed the isolation of sensitive
parameters that might be masked in Monte Carlo runs, which might not reveal how sensitivities
depend upon values of other parameters (e.g., whether or not contaminant degradation is
considered). The disadvantage of the deterministic sensitivity analysis is that
it might not reveal sensitivities or interdependencies that are representative of the probabilistic
modeling required to evaluate industrial waste management scenarios.

For example, the deterministic analysis of longitudinal hydraulic conductivity of the saturated
zone (XKX) indicated that it was not a sensitive parameter. However, closer examination reveals
that at low values of XKX, the contaminant travels so slowly that it does not reach the ground-
water well, resulting in a low well concentration (CMW). At high values of XKX, the contaminant
is transported rapidly but is highly diluted, also resulting in a low CMW. The highest CMW values
are actually reached at median values of XKX. Therefore, due to the design of the deterministic
sensitivity analysis, XKX was not determined to be a sensitive parameter (i.e., there was an
insignificant difference in the output for high and low values of XKX). However, based on other
modeling analyses and knowledge of the importance of advection in ground-water flow and
transport, hydraulic conductivity is generally recognized as a sensitive parameter.
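The XKX example shows how a high-versus-low comparison can miss a parameter whose response peaks in the middle of its range. The sketch below uses an invented bell-shaped response in log conductivity purely to illustrate that effect; it is not derived from EPACMTP.

```python
import math


def toy_well_response(conductivity):
    """Invented response that peaks at a conductivity of 1,000 m/yr.

    Low conductivity: the plume does not arrive; high conductivity: the
    plume is heavily diluted. Both ends give a low concentration.
    """
    return math.exp(-(math.log10(conductivity) - 3.0) ** 2)


if __name__ == "__main__":
    low, median, high = 10.0, 1000.0, 100000.0  # hypothetical p10, p50, p90
    c_low = toy_well_response(low)
    c_med = toy_well_response(median)
    c_high = toy_well_response(high)
    print("high-minus-low difference:", c_high - c_low)
    print("median-to-low ratio:", c_med / c_low)
```

The high-minus-low difference is essentially zero even though the output changes by more than a factor of 50 across the range, so an endpoint comparison alone would rank this parameter as insensitive.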
The results of the deterministic sensitivity analysis identified a ranking of parameters in terms of
model sensitivity. Those parameters that ranked at the bottom of the list were evaluated and
determined to be parameters that generally are not sensitive for the modeling scenarios
considered for the two-tiered analysis. Those parameters that are not sensitive for the scenarios
of interest to the two-tiered approach are:
• FRAC - the fraction of the waste of concern in the landfill;
• DEPTH - the depth of the landfill;
• CTDENS - the density of the waste of concern in the landfill;
• pH - pH of the ground water;
• TEMP - ground-water temperature;
• RECH - aquifer recharge rate; and
• RLAM2 - a secondary parameter used to express first-order decay.
Probabilistic Sensitivity Analyses
The intent of the Guidance is to incorporate the probabilistic aspect of EPACMTP while
focusing on those parameters that have the greatest impact on the model results. Therefore,
probabilistic sensitivity analyses also were performed to identify sensitive parameters when
EPACMTP is run in Monte Carlo mode. The probabilistic sensitivity analysis evaluated the
effects of changes in one input parameter by holding that parameter at one value while using
Monte Carlo values for the remaining parameters for thousands of iterations per simulation. The
probabilistic sensitivity analysis determined the effect of changes in a single input parameter
value on the statistical distribution of output ground-water well concentrations. For the purposes
of the Guide for Industrial Waste Management, which evaluates the 90th percentile output of
EPACMTP, the probabilistic sensitivity analyses evaluated the effects of changes in one
parameter on the 90th percentile ground-water well concentrations.

The sensitivity of EPACMTP to key input parameters for probabilistic simulations was evaluated
to identify sensitive parameters with which to develop a neural network. Monte Carlo
simulations were performed for the four types of WMUs: landfills, surface impoundments, waste
piles, and land application units. As a result of the large amount of time required to perform a
probabilistic sensitivity analysis on each of the 28 parameters for each of the four WMUs, the
analysis focused on a subset of the parameters identified as the most sensitive in the deterministic
sensitivity analysis:
• Source-specific parameters
  - area of disposal unit (AREA);
  - infiltration rate from disposal unit (SINFIL);
• Chemical-specific properties
  - hydrolysis rate (RLAM1);
• Unsaturated zone properties
  - thickness of unsaturated zone (DSOIL);
• Aquifer properties
  - longitudinal hydraulic conductivity (XKX);
  - hydraulic gradient (GRAD);
  - aquifer thickness (ZB);
• Ground-water well location
  - radial distance of the observation well from center of downstream edge of waste
    disposal unit (RADIUS); and
  - angle off-center of observation well measured counterclockwise (ANGLE).
The method used to perform probabilistic sensitivity analyses involved performing Monte Carlo
(probabilistic) simulations to compute the "relative difference" between the output for a low
input parameter value and the output for a high input parameter value. To examine the
sensitivity of one parameter, one series of probabilistic simulations was performed to determine
the "base case" (all input parameter values selected via Monte Carlo procedure) EPACMTP
output value. Another series of probabilistic simulations was performed to determine the output
value for a "low" (10th percentile) input value of each parameter. A third series of probabilistic
simulations was performed to determine the output value for a "high" (90th percentile) input
value of each parameter.
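The three series of simulations can be sketched as follows. This is an illustration under stated assumptions: toy_model and the input distributions are invented and are not EPACMTP, and because the text does not give the formula for the "relative difference", the sketch assumes it is the absolute difference between the high-value and low-value 90th percentile outputs divided by the base-case 90th percentile output.

```python
import random


def toy_model(area, infiltration, distance):
    """Hypothetical stand-in for a fate and transport model."""
    return area * infiltration / (1.0 + distance)


# Invented input distributions, for illustration only.
SAMPLERS = {
    "area": lambda rng: rng.lognormvariate(7.0, 1.5),
    "infiltration": lambda rng: rng.uniform(0.05, 0.5),
    "distance": lambda rng: rng.uniform(50.0, 1500.0),
}


def percentile(values, p):
    """Simple empirical percentile (no interpolation)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(p / 100.0 * len(ordered)))
    return ordered[index]


def p90_output(n_runs, seed, fixed=None):
    """90th percentile output; parameters in `fixed` are held constant and
    all others are drawn at random on every run."""
    rng = random.Random(seed)
    fixed = fixed or {}
    outputs = []
    for _ in range(n_runs):
        inputs = {name: fixed[name] if name in fixed else draw(rng)
                  for name, draw in SAMPLERS.items()}
        outputs.append(toy_model(**inputs))
    return percentile(outputs, 90)


def relative_difference(name, n_runs=2000, seed=0):
    """Assumed measure: |P90(high) - P90(low)| / P90(base case)."""
    rng = random.Random(seed)
    sample = [SAMPLERS[name](rng) for _ in range(n_runs)]
    low, high = percentile(sample, 10), percentile(sample, 90)
    base = p90_output(n_runs, seed)
    out_low = p90_output(n_runs, seed, {name: low})
    out_high = p90_output(n_runs, seed, {name: high})
    return abs(out_high - out_low) / base


if __name__ == "__main__":
    for name in SAMPLERS:
        print(f"{name:12s} {relative_difference(name):.2f}")
```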
The sensitivity of some of the EPACMTP input parameters is dependent upon the values of other
parameters. For example, if the angle of the ground-water well off of the plume centerline is held at
a high value (e.g., 90°), the ground-water well will more likely be located outside of the
contaminant plume. Therefore, the concentration in the ground-water well will likely be zero
regardless of changes in values of any other parameters. The sensitivity of parameters also is
dependent on whether or not contaminant degradation (sorption/hydrolysis) is considered. For
example, if adsorption and hydrolysis are not considered, changes in the depth of the unsaturated
zone (and other contaminant degradation-related parameters) will have no effect on the ground-
water well concentrations.

The results of the ranked probabilistic sensitivity analysis are shown in Tables 3-5 and 3-6. The
ranked order of the most sensitive parameters reflects both the spread in the input distribution of
each parameter and the intrinsic sensitivity of that parameter. In other words, input parameters
with a high degree of variation between their low and high values will tend to show a higher
sensitivity than parameters with a low degree of variation. WMU area consistently shows a high
ranking, because it has a wide frequency distribution and because changes in the source area will
directly affect the predicted ground-water well concentration.

Additional observations from the sensitivity analysis are summarized as follows.
• As a general trend for the four types of management units, the ground-water well
location (ANGLE and RADIUS) and the source area (AREA) most strongly influence
the results. The rate of water infiltration through the source (SINFIL) is among the
most sensitive parameters for landfills and land application units, while it is less so
for surface impoundments and waste piles. For surface impoundments, the rate of
leakage from the waste unit is controlled primarily by the operating depth of the
impoundment (HZERO) and the permeability and thickness of an impeding layer at the
base of the impoundment, but natural precipitation has little impact.
Table 3-5 Ranked Sensitivity of Parameters Evaluated in Probabilistic Sensitivity
Analysis for (a) Landfills, (b) Surface Impoundments.

(a) Landfills

Decay Rate (λ) = 0 1/yr
Parameter                              Sensitivity Rank
Area                                   6.85
Infiltration                           3.62
Radial Distance to Well                3.20
Angle of Well off of Plume Centerline  3.19
Hydraulic Conductivity                 1.69
Hydraulic Gradient                     1.22
Aquifer Thickness                      1.17
Depth of Unsaturated Zone              0.01
Depth of Landfill                      0.01

Decay Rate (λ) = 0.01 1/yr
Parameter                              Sensitivity Rank
Area                                   8.07
Infiltration                           4.02
Angle of Well off of Plume Centerline  3.59
Radial Distance to Well                3.54
Aquifer Thickness                      1.47
Hydraulic Conductivity                 1.39
Hydraulic Gradient                     1.05
Depth of Unsaturated Zone              0.61
Depth of Landfill                      0.61

Decay Rate (λ) = 0.1 1/yr
Parameter                              Sensitivity Rank
Area                                   22.80
Infiltration                           8.79
Angle of Well off of Plume Centerline  7.43
Radial Distance to Well                7.07
Aquifer Thickness                      2.78
Depth of Unsaturated Zone              2.67
Hydraulic Conductivity                 0.94
Hydraulic Gradient                     0.24
Depth of Landfill                      0.00
(b) Surface Impoundments

Decay Rate (λ) = 0 1/yr
Parameter                              Sensitivity Rank
Angle of Well off of Plume Centerline  6.61
Radial Distance to Well                6.30
Area                                   5.89
Hydraulic Conductivity                 2.48
Aquifer Thickness                      2.47
Ponding Depth                          2.34
Hydraulic Gradient                     2.01
Infiltration                           1.93
Depth of Unsaturated Zone              0.47

Decay Rate (λ) = 0.01 1/yr
Parameter                              Sensitivity Rank
Angle of Well off of Plume Centerline  7.84
Radial Distance to Well                7.75
Area                                   7.17
Aquifer Thickness                      2.84
Ponding Depth                          2.79
Hydraulic Conductivity                 2.20
Infiltration                           2.14
Hydraulic Gradient                     1.32
Depth of Unsaturated Zone              0.88

Decay Rate (λ) = 0.1 1/yr
Parameter                              Sensitivity Rank
Radial Distance to Well                20.19
Angle of Well off of Plume Centerline  15.94
Area                                   12.64
Ponding Depth                          5.93
Aquifer Thickness                      5.37
Depth of Unsaturated Zone              4.09
Infiltration                           4.05
Hydraulic Conductivity                 1.30
Hydraulic Gradient                     0.24
Table 3-6 Ranked Sensitivity of Parameters Evaluated in Probabilistic Sensitivity
Analysis for (a) Waste Piles, and (b) Land Application Units.

(a) Waste Piles

Decay Rate (λ) = 0 1/yr
Parameter                              Sensitivity Rank
Angle of Well off of Plume Centerline  34.2
Area                                   27.3
Radial Distance to Well                23.0
Hydraulic Conductivity                 6.09
Aquifer Thickness                      4.85
Hydraulic Gradient                     2.82
Infiltration                           1.86
Depth of Unsaturated Zone              0.40

Decay Rate (λ) = 0.01 1/yr
Parameter                              Sensitivity Rank
Angle of Well off of Plume Centerline  32.7
Area                                   25.5
Radial Distance to Well                24.0
Aquifer Thickness                      4.89
Hydraulic Conductivity                 4.79
Infiltration                           1.88
Hydraulic Gradient                     1.87
Depth of Unsaturated Zone              0.8

Decay Rate (λ) = 0.1 1/yr
Parameter                              Sensitivity Rank
Angle of Well off of Plume Centerline  5.46
Radial Distance to Well                5.27
Area                                   4.17
Aquifer Thickness                      1.34
Infiltration                           0.42
Depth of Unsaturated Zone              0.30
Hydraulic Conductivity                 0.30
Hydraulic Gradient                     0.12
(b) Land Application Units

Decay Rate (λ) = 0 1/yr
Parameter                              Sensitivity Rank
Area                                   2.55
Radial Distance to Well                2.37
Infiltration                           1.78
Aquifer Thickness                      1.49
Hydraulic Conductivity                 1.18
Hydraulic Gradient                     0.68
Angle of Well off of Plume Centerline  0.58
Depth of Unsaturated Zone              0.42

Decay Rate (λ) = 0.01 1/yr
Parameter                              Sensitivity Rank
Area                                   3.38
Radial Distance to Well                2.92
Infiltration                           2.75
Aquifer Thickness                      1.88
Depth of Unsaturated Zone              1.21
Hydraulic Conductivity                 1.05
Angle of Well off of Plume Centerline  0.97
Hydraulic Gradient                     0.40

Decay Rate (λ) = 0.1 1/yr
Parameter                              Sensitivity Rank
Infiltration                           11.6
Area                                   8.39
Radial Distance to Well                7.82
Angle of Well off of Plume Centerline  6.00
Depth of Unsaturated Zone              5.26
Aquifer Thickness                      2.54
Hydraulic Conductivity                 1.38
Hydraulic Gradient                     0.03
The relative ranking of the four most sensitive parameters (ANGLE, RADIUS, AREA,
and SINFIL) is not the same for all four types of WMUs because the source-specific
parameters (AREA and SINFIL) have different ranges of values for each of the different
WMUs. For example, 50th percentile values of area for the four WMUs are:
landfills                  18,500 m2
surface impoundments       2,685 m2
waste piles                405 m2
land application units     80,900 m2
Model results are highly sensitive to the location of the ground-water well, because the
choice of the well location affects the concentration that will be measured at that location.
Additionally, the ground-water well location increases in importance as the contaminant
degradation (i.e., sorption and/or hydrolysis) rate increases. For example, as the
hydrolysis rate increases, well location becomes even more critical because the
contaminant will hydrolyze and will not travel as far as a conservative (non-hydrolyzing)
contaminant.

As the hydrolysis rate increases, the thickness of the unsaturated zone (DSOIL) and
aquifer thickness (ZB) increase in importance, and the hydraulic conductivity (XKX) and
gradient (GRAD) decrease in importance.
For hydrolyzing constituents, the contaminant concentration reaching the ground-water
well is highly dependent on the travel time (in both the unsaturated zone and the aquifer).
The travel time, in turn, is controlled by the total travel distance (through the unsaturated
and saturated zones) and the travel velocity. Thus the unsaturated zone thickness
(DSOIL) and aquifer thickness (ZB) increase in relative importance to other parameters
as hydrolysis rate increases.
The travel velocity in the saturated zone is a function of the aquifer hydraulic
conductivity (XKX) and hydraulic gradient (GRAD). It may, therefore, seem inconsistent
that the sensitivity of these two parameters would decrease at higher hydrolysis rates.
This apparent inconsistency highlights the fact that very complex interactions in the
subsurface environment control the ground-water well concentration.
Increasing the ground-water flow rate (by increasing either the hydraulic
conductivity or the hydraulic gradient) results in a shorter travel time, and in the case of
degraders, higher ground-water well concentrations. On the other hand, an increased
ground-water flow rate also will cause a greater degree of dilution and dispersion of the
contaminant, which will tend to lower the ground-water well concentration. The effects
of dilution will be more important for non-degrading constituents, whereas the effects of
shorter travel time will become more important for hydrolyzing constituents.
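
The travel-time argument above can be illustrated with a minimal sketch. It assumes simple first-order decay along a single flow path (relative concentration = exp(-λt), with t = distance / velocity) and deliberately ignores dilution, dispersion, and sorption, which EPACMTP does account for. The function name and the velocities are hypothetical; the 0.1 per year decay rate and the 150 m distance are values that appear elsewhere in this document.

```python
import math

def relative_concentration(decay_rate_per_yr, distance_m, velocity_m_per_yr):
    """Fraction of the source concentration remaining after first-order decay in transit."""
    travel_time_yr = distance_m / velocity_m_per_yr
    return math.exp(-decay_rate_per_yr * travel_time_yr)

# Hypothetical velocities: doubling the velocity halves the travel time.
slow = relative_concentration(0.1, 150.0, 10.0)   # 15-year travel time
fast = relative_concentration(0.1, 150.0, 20.0)   # 7.5-year travel time
print(f"slow path: {slow:.3f}, fast path: {fast:.3f}")

# A non-degrading constituent (decay rate of zero) is unaffected by travel time.
conservative = relative_concentration(0.0, 150.0, 10.0)
```

For a hydrolyzing constituent the faster path delivers a larger fraction of the source concentration, which is the "shorter travel time" effect described above. The competing dilution effect is not represented in this sketch.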
Allison Geoscience Consultants (1997b) conducted an independent review of the probabilistic
sensitivity analysis for EPA OSW. This review reached the following conclusions:
• the sensitivity analysis met the objective of identifying the most important parameters in
determining EPACMTP model output.
• an independent sensitivity analysis using a method that relied on linear and rank order
correlation coefficients to estimate the performance in determining all of the model
responses in a Monte Carlo simulation indicated that this method of ranking model input
parameters yielded results similar to the probabilistic ranked difference method. The
linear correlation coefficient results identify approximately the same parameters as being
most important, although the relative importance is somewhat different.
• Spearman rank order sensitivity analysis results, which are probably a more appropriate
ranking because of the non-linear relation between many of the input parameters and the
model response, also indicated similar parameters as being most important. The exception
is the appearance of aquifer dispersivity in the ranking, which is not a parameter that is
commonly known and will not be requested of the user.
• the similarities in both rank order analyses confirmed the results of the probabilistic
differences method used to identify the most sensitive parameters.
Based on the combined results of each of the three sensitivity analyses, the Agency developed a
list of the twelve most sensitive EPACMTP parameters for the industrial waste management
scenarios considered for the Tier 2 analyses (Table 3-7). These twelve parameters were selected
to develop the neural networks for each WMU (see Section 5) for the Tier 2 analysis. As will be
described in Section 5, initial attempts to train on all twelve parameters were unsuccessful.
However, accurate neural networks were developed using six to seven EPACMTP input
parameters, as is described in Section 5.1.
Table 3-7 Most Sensitive EPACMTP Input Parameters Based on the Combined Results
of Deterministic and Probabilistic Sensitivity Analyses
Parameter Group                 EPACMTP Input Parameter
Source Parameters               area (AREA)
                                infiltration (SINFIL)
                                ponding depth for surface impoundments (HZERO)
Chemical-Specific Parameters    hydrolysis rate (RLAM1)
                                organic carbon partition coefficient (KOC)
Unsaturated Zone Parameters     unsaturated zone thickness (DSOIL); sensitive only for
                                hydrolyzing and adsorbing chemicals
                                percent organic matter in the unsaturated zone (POM)
Saturated Zone Parameters       aquifer thickness (ZB)
                                hydraulic conductivity of saturated zone (XKX)
                                fraction organic carbon in the saturated zone (FOC)
                                hydraulic gradient (GRADNT)
Well Parameters                 distance to well (RADIS)
                                angle of well off the contaminant plume centerline
4.0 CALCULATION OF THE TIER 1 LEACHATE CONCENTRATION
THRESHOLD VALUES (LCTVs)
This section describes the method used to compute the Tier 1 National Evaluation LCTVs,
including the assumptions and EPACMTP input parameters used (Section 4.1), the calculation of
DAFs that are used to develop Tier 1 LCTVs (Section 4.2), and the determination of Tier 1
LCTVs (Section 4.3). EPACMTP was run using nationwide distributions of input parameters
and some conservative assumptions to determine the 90th percentile DAF for each chemical on
the list of Industrial Waste chemicals of concern. This section includes a description of the
assumptions and input parameters used to compute Tier 1 LCTVs.
4.1 Assumptions and Parameters
Many of the assumptions used in other Agency applications of EPACMTP were used to develop
Tier 1 LCTVs, including the nationwide empirical data used to characterize WMUs and the
saturated and unsaturated zones at waste management facilities (U.S. EPA, 1987). The reader is
referred back to Section 3.3 of this document for an overview of the modeling assumptions and
to previously developed background documents for EPACMTP (U.S. EPA, 1996a, 1996b, and
1996c) for additional details on the development and application of EPACMTP. Additional
modeling assumptions that are similar to other EPA applications include:
• The time period during which contaminant migration was modeled was 10,000 years.
• The leachate pulse duration for landfills was modeled as a finite source controlled by the
ratio between the initial contaminant mass in the landfill and the annual mass removed by
leaching.
• In the case of surface impoundments, waste piles, and land application units, the leachate
pulse duration was taken to be the same as the unit's operating life, as follows:
- Surface impoundment: 20 years
- Waste pile: 20 years
- Land application unit: 40 years
It was assumed that any waste remaining at the time the units are closed is either
removed or has a negligible additional contribution to leaching.
• Rather than running EPACMTP for 2,000 simulations for each of 175 organic chemicals,
EPACMTP was run for a selected number of values of hydrolysis rate (λ) and retardation
factor (R). Then for chemicals whose λ and R fell in between these values, an
interpolation procedure was used to determine the constituent's DAF. This technique
reduced the computational effort required while maintaining the desired accuracy.
• A separate series of simulations was performed for each of the non-linear metals, given
the required difference in the modeling methodology used to model the monitoring well
concentration of each metal as a function of its leachate concentration.
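
The pulse-duration assumptions in the list above can be sketched as follows. This is a minimal illustration rather than the EPACMTP implementation: it assumes the leachate concentration stays constant until the contaminant mass is exhausted, and the function name, the initial mass, and the leachate concentration are hypothetical. The operating-life values are the ones quoted in the list.

```python
def landfill_pulse_duration_yr(initial_mass_kg, infiltration_m_per_yr,
                               area_m2, leachate_conc_mg_per_l):
    """Years needed for leaching at a constant rate to remove the initial contaminant mass."""
    # 1 mg/L equals 1 g/m^3, so divide by 1000 to express the concentration in kg/m^3.
    leachate_conc_kg_per_m3 = leachate_conc_mg_per_l / 1000.0
    annual_mass_removed_kg = infiltration_m_per_yr * area_m2 * leachate_conc_kg_per_m3
    return initial_mass_kg / annual_mass_removed_kg

# Pulse durations for the other unit types equal the operating lives quoted in the text.
OPERATING_LIFE_YR = {
    "surface impoundment": 20,
    "waste pile": 20,
    "land application unit": 40,
}

# Hypothetical landfill: 5,000 kg of contaminant, 0.13 m/yr infiltration,
# 18,500 m^2 footprint, and a leachate concentration of 10 mg/L.
duration = landfill_pulse_duration_yr(5000.0, 0.13, 18500.0, 10.0)
print(f"landfill pulse duration: {duration:.0f} years")
```

The ratio form makes the dependence explicit: a larger initial mass lengthens the pulse, while a higher infiltration rate, area, or leachate concentration shortens it.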
The major differences between other previous EPA applications of EPACMTP and the
assumptions used for the Tier 1 analyses are: 1) in addition to computing infiltration for the
no-liner scenario for all four WMUs, infiltration from the WMU was computed for single-liner and
composite-liner scenarios for landfills, waste piles, and surface impoundments and 2) rather than
allowing the monitoring wells to be located within plus or minus 90° of the plume centerline, the
monitoring wells were located 150 m downgradient on the plume centerline. These assumptions
were based on agreements reached by the Association of State and Territorial Solid Waste
Management Officials (ASTSWMO) Steering Committee, consisting of representatives from the
states and EPA. The details of the differences in these assumptions are discussed in the
following sections.
Assumptions Used to Calculate Infiltration Rates
The Tier 1 analyses are designed to allow a user to select the most appropriate type of WMU
liner from three liner scenarios for landfills, surface impoundments, and waste piles (land
application units consider only the no-liner scenario). The assumptions used to compute
infiltration rates for each of the three liner scenarios for all four WMUs are summarized in
Tables 4-1 through 4-4.
Infiltration rates were computed with the HELP model (see Section 3.1) for two liner scenarios:
the no-liner scenario, consisting of a WMU that is developed on underlying native soils, and the
single-liner scenario, consisting of a single two-foot thick clay liner (refer back to Figure 2-2).
Infiltration rates for the composite-liner scenario, consisting of a clay liner with a flexible
membrane liner (FML) on top of the clay layer, were computed using a liner leakage equation
developed by Bonaparte et al. (1989) to estimate leakage through pinholes in a geomembrane for
good contact conditions:
Q = 0.21 a^{0.1} h_w^{0.9} k_s^{0.74}
where: Q = rate of leakage through a circular hole in the geomembrane component of
the composite liner (m^3/s)
a = geomembrane hole area (m^2)
h_w = head of liquid on top of the geomembrane (m)
k_s = hydraulic conductivity of the low-permeability soil component of the
composite liner (m/s)
The specific assumptions regarding hole size and frequency of holes are listed in Tables 4-1
through 4-3. Further details of the liner leakage modeling are provided in Appendix B.
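
As a rough check on the composite-liner infiltration rate, the sketch below evaluates the leakage equation for the hole size, hole frequency, head, and soil conductivity listed in the liner tables. It assumes the commonly cited good-contact form Q = 0.21 a^0.1 h_w^0.9 k_s^0.74 in SI units. The function names and unit-conversion constants are illustrative and are not taken from Appendix B.

```python
IN2_TO_M2 = 0.0254 ** 2            # square inches to square meters
FT_TO_M = 0.3048                   # feet to meters
ACRE_TO_M2 = 4046.8564224          # acres to square meters
CM_TO_M = 0.01                     # centimeters to meters
SECONDS_PER_YEAR = 365.0 * 24.0 * 3600.0

def hole_leakage_m3_per_s(hole_area_m2, head_m, k_soil_m_per_s):
    """Leakage through one circular geomembrane hole, good-contact form (assumed)."""
    return 0.21 * hole_area_m2 ** 0.1 * head_m ** 0.9 * k_soil_m_per_s ** 0.74

def composite_liner_infiltration_m_per_yr(hole_area_m2, head_m,
                                          k_soil_m_per_s, holes_per_m2):
    """Areal infiltration rate obtained by spreading per-hole leakage over the liner."""
    q = hole_leakage_m3_per_s(hole_area_m2, head_m, k_soil_m_per_s)
    return q * holes_per_m2 * SECONDS_PER_YEAR

# Table inputs: 0.005 in^2 hole, 1 ft head, 1e-7 cm/s clay, one hole per acre.
rate = composite_liner_infiltration_m_per_yr(
    hole_area_m2=0.005 * IN2_TO_M2,
    head_m=1.0 * FT_TO_M,
    k_soil_m_per_s=1e-7 * CM_TO_M,
    holes_per_m2=1.0 / ACRE_TO_M2,
)
print(f"composite-liner infiltration: {rate:.2e} m/yr")
```

Under these assumptions the result is on the order of 10^-5 m/yr, the same order as the constant composite-liner value listed in the tables. Any small difference may reflect unit conversions or rounding, which the source does not document.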
Ground-Water Well Assumptions
Two parameters used to determine the location of the downgradient ground-water monitoring
well were set to constant values to approximate a conservative scenario for well location. The
downgradient distance to the well was set to a constant value of 150 m from the boundary of the
WMU and the angle of the monitoring well off of the leachate plume centerline was set to 0°
(Figure 4-1). Each state reserves the final authority on the location of ground-water wells.
However, for the national Tier 1 evaluation, the location was set to a conservative distance of
150 m.
4.2 Determination of Dilution Attenuation Factors (DAFs)
To calculate the Tier 1 LCTVs, the EPACMTP model is run for 2,000 iterations (2,000 sets of
random selections of parameter values) for each chemical. The output of each simulation, the
ground-water concentration, is converted to a DAF, and the 2,000 values are ranked from highest
to lowest to form a cumulative probability density function (PDF). The 90th percentile lowest
DAF is selected from this PDF (Figure 4-2).
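
The ranking step can be sketched as below. This is an illustration under stated assumptions, not the Agency's code: the DAF is taken as the leachate concentration divided by the well concentration, a zero well concentration is treated as an infinite DAF, and the selected value is the one that 90 percent of the ranked DAFs equal or exceed. The function name and the sample concentrations are hypothetical.

```python
import math

def protective_daf(well_concs, leachate_conc, protection=0.90):
    """Return the DAF that the given fraction of Monte Carlo results equal or exceed."""
    if not well_concs:
        raise ValueError("at least one simulation result is required")
    # Convert each simulated well concentration to a DAF.
    dafs = [math.inf if c <= 0 else leachate_conc / c for c in well_concs]
    dafs.sort(reverse=True)                      # rank from highest to lowest
    # 1-based rank of the selected DAF; the small offset guards against float round-up.
    rank = math.ceil(protection * len(dafs) - 1e-9)
    return dafs[rank - 1]

# Ten hypothetical simulations with well concentrations of 1 to 10 mg/L
# for a leachate concentration of 100 mg/L.
sample_concs = [float(i) for i in range(1, 11)]
daf_90 = protective_daf(sample_concs, leachate_conc=100.0)
print(f"90th percentile DAF: {daf_90:.2f}")
```

Selecting a low DAF from the ranked list is the conservative choice, because a low DAF corresponds to a high well concentration.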
DAFs were determined for all chemicals except for those metals that exhibit non-linear sorption
(see Section 3.2). As is described in Section 3.2, the ground-water well concentrations of the
non-linear metals are dependent upon the initial concentration of metals (leachate concentration)
and the total concentration of metal in the subsurface. A DAF is not calculated for non-linear
metals, per se. A back calculation is performed to determine the initial leachate concentration for
which 90% of the simulations result in a ground-water well concentration that does not exceed
the toxicity reference level (MCL or HBN). This is the equivalent of computing a 90th percentile
DAF.
4.3 Calculation of Leachate Concentration Threshold Values (LCTVs)
To develop liner-specific LCTVs for the chemicals of concern, the DAF is multiplied either by
the maximum contaminant level (MCL) or by the health-based number (HBN). For chemicals with
hydrolysis daughter products, the MCL or HBN of the most toxic daughter compound (the lowest
MCL or HBN) is used to compute the LCTV.
Table 4-1 Assumptions Used to Compute Infiltration for Landfills
No Liner
    Method: HELP model simulations to compute an empirical distribution of infiltration rates
        for a 2 ft thick cover of three native soil types for 97 nationwide climate stations
        represented by 30 climate classes. Infiltration rates for a specific site were obtained
        by using the infiltration rate for the nearest climate station. Monte Carlo selection
        from distribution of soil types.
    Final Cover: 2 ft thick native soil (1 of 3 soil types: silty clay loam, silt loam, and
        sandy loam) with a range of mean hydraulic conductivities (4.7 x 10^-6 cm/s to
        6.4 x 10^-4 cm/s).
    Liner Design: Assumes no liner beneath waste.
    Infiltration Rate: Monte Carlo selection from an empirical distribution of values:
        min=1 x 10^-5 m/yr, med=0.13 m/yr, max=1.08 m/yr
Single Liner
    Method: HELP model simulations to compute an empirical distribution of infiltration rates
        through a single clay liner for 97 nationwide climate stations represented by 25
        climate classes. Infiltration rates for a specific site were obtained by using the
        infiltration rate for the nearest climate station.
    Final Cover: 3 ft thick clay cover with a hydraulic conductivity of 1 x 10^-7 cm/sec and a
        10 ft thick waste layer. On top of the cover, a 1 ft layer of loam to support
        vegetation and a 1 ft drainage layer.
    Liner Design: 3 ft thick clay liner with a hydraulic conductivity of 1 x 10^-7 cm/sec.
        Assumes constant infiltration rate (assumes no increase in hydraulic conductivity of
        liner) over modeling period.
    Infiltration Rate: Monte Carlo selection from an empirical distribution of values:
        min=0.0 m/yr, med=0.043 m/yr, max=0.053 m/yr
Composite Liner
    Method: Bonaparte et al. (1989) Liner Leakage Equation (See Appendix B).
    Final Cover: No cover modeled. DAFs for composite liner are already high with a liner only
        and the geomembrane is limiting factor in determining infiltration.
    Liner Design: 3 ft thick clay liner with a hydraulic conductivity of 1 x 10^-7 cm/sec, 1 ft
        hydraulic head, 40 mil HDPE FML geomembrane w/ small holes (0.005 in^2), one hole per
        acre, "good" field conditions, and good contact between the liner and soil. Assumes
        same infiltration rate (i.e., no increase in hydraulic conductivity of liner) over
        modeling period.
    Infiltration Rate: Constant value: 3.41 x 10^-5 m/yr
Table 4-2 Assumptions Used to Compute Infiltration for Surface Impoundments
No Liner
    Method: Darcy's Equation for infiltration through a single sludge/native soil layer with a
        Monte Carlo selection of ponding depth from the nationwide OPPI survey of surface
        impoundments.
    Ponding Depth: Based on Monte Carlo selection from nationwide distribution of ponding
        depths.
    Liner Design: 1 ft to 3 ft thick layer of sludge/native soil in the bottom of the
        impoundment. Monte Carlo selection from distribution of hydraulic conductivity values
        for this layer ranging from 1 x 10^-7 cm/s to 1 x 10^-5 cm/s. Waste is removed after 20
        years and no final cover is installed.
    Infiltration Rate: Calculated based on Monte Carlo selection of ponding depth from an
        empirical distribution of values. The resulting distribution of infiltration rates is:
        min=1.8 x 10^-2 m/yr, med=3.94 m/yr, max=89.6 m/yr.
Single Liner
    Method: Darcy's Equation for infiltration through a single clay liner with a Monte Carlo
        selection of ponding depth from the nationwide OPPI survey of surface impoundments.
    Ponding Depth: Based on Monte Carlo selection from nationwide distribution of ponding
        depths.
Table 4-3 Assumptions Used to Compute Infiltration for Waste Piles
No Liner
    Method: HELP model simulations to compute distribution of infiltration rates for a 2 ft
        thick layer of three native soil types for 97 nationwide climate stations represented
        by 30 climate classes. Infiltration rates for a specific site were obtained by using
        the infiltration rate for the nearest climate station.
    Liner Design: Assumes no liner. Infiltration computed for three native soil types with
        hydraulic conductivity similar to range of waste hydraulic conductivities
        (4.7 x 10^-6 cm/s to 6.4 x 10^-4 cm/s). Assumes waste is removed after 20 yrs and no
        final cover is installed.
    Infiltration Rate: Monte Carlo selection from an empirical distribution of values:
        min=1 x 10^-[illegible] m/yr, med=0.17 m/yr, max=1.08 m/yr
Single Liner
    Method: HELP model simulations to compute distribution of infiltration rates through a
        single clay liner for 97 nationwide climate stations represented by 25 climate
        classes. Infiltration rates for a specific site were obtained by using the
        infiltration rate for the nearest climate station.
    Liner Design: 3 ft thick clay liner with a hydraulic conductivity of 1 x 10^-7 cm/sec, no
        leachate collection system, and a 10 ft thick waste layer. Assumes no increase in
        hydraulic conductivity of liner over modeling period. Assumes waste is removed after
        20 yrs and no final cover is installed.
    Infiltration Rate: Monte Carlo selection from an empirical distribution of values:
        min=0.0 m/yr, med=0.126 m/yr, max=0.135 m/yr
Composite Liner
    Method: Bonaparte et al. (1989) Liner Leakage Equation (See Appendix B).
    Liner Design: 3 ft thick clay liner with a hydraulic conductivity of 1 x 10^-7 cm/sec, 1 ft
        hydraulic head, 40 mil HDPE FML geomembrane w/ small holes (0.005 in^2), one hole per
        acre, "good" field conditions, and good contact between the liner and soil.
        Geomembrane is limiting factor in determining infiltration rate. Assumes waste is
        removed after 20 yrs and no final cover is installed.
    Infiltration Rate: Constant value: 3.41 x 10^-5 m/yr
[Figure 4-1: plan view and sectional view showing the monitoring well location relative to the
contaminant plume and its centerline, the ground-water flow direction, the land surface, the
unsaturated zone, and the water table. RADIS = radial distance to monitoring well.]
Figure 4-1 Well Location Parameters Used in the Tier 1 Analysis.
[Figure 4-2: cumulative probability plots on logarithmic axes. Recoverable labels: relative
concentration (CRW/CL) and the DAF corresponding to the selected protection level.]
Figure 4-2 Cumulative Probability Density Functions of Ground-water Well
Concentrations and DAFs
The LCTV for any constituent is capped at 1,000 mg/L. Therefore, if the LCTV calculated based
on the DAF and the toxicity reference level is greater than 1,000 mg/L, the LCTV will be set to
1,000 mg/L. The Agency invites comments on whether leachate concentrations from industrial
WMUs are likely to exceed 1,000 mg/L and if so, under what circumstances.
For the 39 constituents that determine whether a waste is characteristically toxic under 40 CFR
261.24 (U.S. EPA, 1990a), the LCTV is capped at the regulatory level for each constituent (Table
4-4). Any waste with leachate concentrations equal to or greater than the regulatory level is a
hazardous waste under RCRA and state laws.
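
The LCTV calculation and its two caps can be summarized in a short sketch. The function and dictionary names are hypothetical, the DAF values are invented for illustration, and the 0.005 mg/L benzene MCL is supplied here as an assumption rather than taken from this section. Only a few Toxicity Characteristic levels are included, using values from Table 4-4.

```python
GENERAL_CAP_MG_L = 1000.0

# Subset of Toxicity Characteristic regulatory levels (mg/L) from Table 4-4.
TC_LEVEL_MG_L = {"benzene": 0.5, "chloroform": 6.0, "lead": 5.0}

def lctv_mg_per_l(daf, reference_level_mg_l, chemical=None):
    """LCTV = DAF x toxicity reference level (MCL or HBN), subject to both caps."""
    value = min(daf * reference_level_mg_l, GENERAL_CAP_MG_L)
    tc_level = TC_LEVEL_MG_L.get(chemical)
    if tc_level is not None:
        # Toxicity Characteristic constituents are capped at the regulatory level.
        value = min(value, tc_level)
    return value

# Benzene with an assumed MCL of 0.005 mg/L and two hypothetical DAFs.
uncapped = lctv_mg_per_l(50.0, 0.005, "benzene")     # 0.25 mg/L, below the TC level
tc_capped = lctv_mg_per_l(200.0, 0.005, "benzene")   # 1.0 mg/L, capped at 0.5 mg/L

# A hypothetical non-TC constituent whose product exceeds the general cap.
general_capped = lctv_mg_per_l(1.0e4, 2.0, "hypothetical chemical")
```

For chemicals with hydrolysis daughter products, the reference level passed in would be the lowest MCL or HBN among the daughter compounds, as described in Section 4.3.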
Table 4-4 Toxicity Characteristic Regulatory Levels (U.S. EPA, 1990)
Chemical                        Toxicity Characteristic Regulatory Level (mg/L)
Arsenic                         5.0
Barium                          100
Benzene                         0.5
Cadmium                         1.0
Carbon Tetrachloride            0.5
Chlordane                       0.03
Chlorobenzene                   100
Chloroform                      6.0
Chromium                        5.0
o-Cresol                        200
m-Cresol                        200
p-Cresol                        200
Cresol                          200
2,4-D                           10.0
1,4-Dichlorobenzene             7.5
1,2-Dichloroethane              0.5
1,1-Dichloroethylene            0.7
2,4-Dinitrotoluene              0.13
Endrin                          0.02
Heptachlor                      0.008
Hexachlorobenzene               0.13
Hexachloro-1,3-butadiene        0.5
Hexachloroethane                3.0
Lead                            5.0
Lindane                         0.4
Mercury                         0.2
Methoxychlor                    10.0
Methyl ethyl ketone             200.0
Nitrobenzene                    2.0
Pentachlorophenol               100.0
Pyridine                        5.0
Selenium                        1.0
Silver                          5.0
Tetrachloroethylene             0.7
Toxaphene                       0.5
Trichloroethylene               0.5
2,4,5-Trichlorophenol           400
2,4,6-Trichlorophenol           2.0
2,4,5-TP Acid (Silvex)          1.0
Vinyl chloride                  0.2
5.0 DEVELOPMENT OF NEURAL NETWORKS FOR TIER 2 EVALUATION OF
THRESHOLD VALUES
As stated in Section 1 of this document, to support the Tier 2 analysis, the Agency developed a
user-friendly tool that is midway between a Tier 1 analysis and a Tier 3 comprehensive
site-specific, data-intensive analysis. The Agency evaluated methods to develop a simplified
approximation of a ground-water model that focuses on the most sensitive ground-water
modeling parameters, allows [...] parameters, produces instantaneous [...]
modeling expertise to use. During the course of its evaluation, the Agency determined that
neural networks (described in Section 2.3) can and have been successfully used in ground-water
modeling problems (Johnson and Rogers, 1995) and would be the most appropriate predictive
tool for the Tier 2 analysis.
This section describes how neural networks were used to develop a simplified tool that would
emulate the EPACMTP ground-water analysis of liner designs using a subset of the most
sensitive parameters and how they have been developed and applied to Tier 2 analysis of WMU
liner designs. Section 5.1 describes how the key EPACMTP input parameters were selected for
training neural networks. The assumptions and EPACMTP parameters included in the neural
network training are presented in Section 5.2. Section 5.3 describes the process of training the
neural networks and how they were developed. Section 5.4 demonstrates how well the neural
networks predict EPACMTP modeling results and describes the uncertainty involved in the use
of the Tier 2 neural networks. Finally, Section 5.5 describes how neural networks developed for
each of the four WMUs were integrated into one user-friendly Windows-based graphical user
interface (GUI).
5.1 Sensitivity Analysis to Identify Key Parameters
As explained in Section 2.3 of this document, neural networks are used to approximate processes
in the same manner as regression analysis, in which a number of independent parameters (inputs)
are used to predict values of one or more observed dependent parameters (outputs). Therefore,
the first step in developing neural networks to approximate EPACMTP was to determine an
appropriate set of EPACMTP input parameters that would be used to predict the outcome of
EPACMTP, the concentration of a chemical in a downgradient ground-water well used for
drinking water, and its inverse, the DAF.
As was stated in Section 3.3.3, EPACMTP uses 52 input parameters to perform a ground-water
modeling simulation. However, 24 of these parameters are either constant values, internally
derived from other parameters, or are inactive for the scenarios of interest to this analysis.
Therefore, the desired number of parameters on which to train the neural networks was a subset
of the most sensitive of the remaining 28 parameters, preferably parameters that generally would
be known with a high degree of certainty at a particular facility and would have the most impact
on the expected concentration of chemicals in a downgradient ground-water well.
To determine the parameters on which to train the neural network, sensitivity analyses
evaluated the change in EPACMTP output, the ground-water concentration, as a function of
changes in each of EPACMTP's input parameters for a landfill scenario (see Section 3.3.3).
The deterministic and probabilistic sensitivity analyses identified the ten to twelve (depending
on the WMU of interest) parameters that were ranked as the most sensitive parameters for the
majority of modeling scenarios for the two-tiered approach (refer back to Table 3-7).
• EPACMTP has 52 input parameters.
• 24 parameters are constant, dependent, or not used.
• Of the 28 remaining parameters, the Agency identified the top 12 most sensitive parameters
for industrial solid waste modeling scenarios.
• Of the top 12 parameters, 7 were identified that represent the most sensitive parameters that
can be used to develop a robust neural network.
Initially, the following 10 EPACMTP parameters were selected to develop neural networks:
• waste area,
• infiltration rate,
• chemical-specific organic carbon partition coefficient,
• chemical-specific decay rate,
• the product of the percent organic matter in the unsaturated zone times fraction organic
carbon in the saturated zone,
• the product of hydraulic conductivity (m/yr) and hydraulic gradient (m/m),
• depth to water table,
• aquifer thickness,
• angle of the monitoring well off of the plume centerline, and
• distance to the monitoring well.
Several attempts to develop neural networks with this number of parameters indicated that it was
possible to develop neural networks with a high coefficient of determination (R^2). However,
because of the combined non-linearities in EPACMTP and complexities in the resulting output of
the EPACMTP (i.e., the response surface), the numbers predicted by the Tier 2 neural networks
were often inaccurate. They would differ from the actual EPACMTP result by at least an order
of magnitude in many cases. This was most likely a result of the wide variation in output values,
up to several orders of magnitude. Therefore, an evaluation was performed to determine the
optimum number of EPACMTP input parameters that could be used to train neural networks
with the required accuracy. This evaluation indicated that seven EPACMTP input parameters
could be used to develop accurate neural networks for each of the four WMUs for the Tier 2
analyses:
• waste area,
• infiltration rate,
• chemical-specific KOC (for surface impoundments, waste piles, and land application
units),
• chemical-specific decay rate,
• depth to water table,
• aquifer thickness, and
• distance to the monitoring well.
The landfill modeling analyses for Tier 2 consider large waste volumes and assume an essentially
steady-state source scenario. As a result, KOC becomes an insensitive parameter (KOC has no
effect on the well concentration under steady-state conditions). Modeling the other three WMUs
considered each of the seven parameters listed.
5.2 Assumptions and Parameters Used to Develop Neural Networks
Once the most sensitive parameters were identified, the assumptions and the modeling scenarios
that would be considered in the Tier 2 analyses were delineated. The modeling scenarios were
essentially similar to those used to develop the Tier 1 LCTVs (see Section 4.0). EPA used the
neural network training software package NNModel version 3.2 to develop neural networks for
the Tier 2 evaluation. Feed-forward, backpropagation neural networks with one hidden layer
were designed for this investigation. The input layer consisted of six units for landfills and seven
units for the other three WMUs, with each unit representing one input parameter (area,
infiltration, KOC, decay rate, depth to water table). The output layer consisted of the
ground-water well concentrations, which are then converted to a DAF. See Figure 5-1 for a
schematic diagram of the neural network structure for surface impoundments, waste piles, and
land application units.
The hidden layer initially consisted of one neuron, which was automatically incremented by the
neural network training software to a maximum of 35 hidden neurons. Training was performed
for 10,000 to 20,000 iterations, and a conjugate gradient optimization, a method of adjusting the
weights, was performed to minimize the errors. The neural network training software provided
summaries of the training statistics at the end of the training session.
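
The network structure described above can be sketched as a forward pass through one hidden layer. This is only an illustration of the architecture: the weights are random and untrained, the sigmoid hidden activation and linear outputs are assumptions (the internals of NNModel are not described here), and all function names are hypothetical. The layer sizes (seven inputs, up to 35 hidden neurons, two log10 concentration outputs) follow the text and Figure 5-1.

```python
import math
import random

def make_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Build random (untrained) weights; each row holds the weights plus one bias term."""
    rng = random.Random(seed)
    def layer(n_in, n_out):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
    return layer(n_inputs, n_hidden), layer(n_hidden, n_outputs)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(network, inputs):
    """Feed-forward pass: sigmoid hidden layer followed by a linear output layer."""
    hidden_weights, output_weights = network
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
              for row in hidden_weights]
    return [sum(w * h for w, h in zip(row, hidden)) + row[-1]
            for row in output_weights]

# Seven inputs (non-landfill WMUs), 35 hidden neurons, two outputs:
# log10 of the average and of the peak well concentration.
network = make_network(n_inputs=7, n_hidden=35, n_outputs=2)
outputs = forward(network, [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(outputs)
```

In the actual procedure the weights would be fitted to EPACMTP results by the training software. The sketch shows only how a trained network of this shape maps an input vector to the two log-scale concentration outputs.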
[Figure 5-1: schematic of the neural network. Input layer: decay rate, WMU area, infiltration
rate, depth to water, aquifer thickness, and radial distance. One hidden layer. Output layer:
LOG10 (Avg. Conc.) and LOG10 (Peak Conc.).]
Figure 5-1 Diagram of Neural Network for EPACMTP Model (Parameters)
The trained network was first tested by evaluating its performance against the training set, to see
how well it learned the training data matrix. The network was then tested against data it had not
seen before, to evaluate its generalized performance. The trained and tested neural network was
then converted into a simple computer code in the C programming language. This program was
then compiled into an executable file that can be run as a user-friendly stand-alone executable
program.
5.3 Neural Network Development
The results of EPACMTP simulations were collected in spreadsheets in one initial data matrix.
To follow the network learning progress from the beginning of the training, a training data set
was coupled with a test data set. Additional data sets were created parallel to the network
training and were appended to or replaced with the current training and/or test data set depending
on training requirements. The details of this process for the neural networks developed for each
WMU are described in Appendix A.
Each data set consisted of a series of EPACMTP input and output parameter values, with each
row consisting of a combination of input values and the resulting EPACMTP output values. Each
column represented one input parameter, each row represented an input parameter combination,
and each cell contained an input parameter value. Input parameters with values varying over
several orders of magnitude (e.g., [...] percentile of area = [...], and the 90th percentile of
area = 40500) were transformed to logarithmic values. This conversion was made for the
following inputs: LAMBDA, KOC, AREA, and RADIUS. The other input parameters (SINFIL,
DSOIL, and ZB) remained in their linear form. The values of the output parameters also varied
over more than one order of magnitude and were converted to logarithmic values.
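
The input scaling described above can be sketched as follows. The parameter names are those listed in the text, but the function name, the base-10 logarithm, and the sample values are assumptions made for illustration. The text does not say how a zero decay rate was handled, so this sketch simply requires positive values for the log-transformed inputs.

```python
import math

LOG_INPUTS = {"LAMBDA", "KOC", "AREA", "RADIUS"}   # transformed to logarithmic values
LINEAR_INPUTS = {"SINFIL", "DSOIL", "ZB"}          # kept in linear form

def transform_inputs(raw):
    """Apply the log10 transform to wide-ranging inputs and pass the others through."""
    transformed = {}
    for name, value in raw.items():
        if name in LOG_INPUTS:
            if value <= 0:
                raise ValueError(f"{name} must be positive for a log transform")
            transformed[name] = math.log10(value)
        elif name in LINEAR_INPUTS:
            transformed[name] = value
        else:
            raise KeyError(f"unknown input parameter: {name}")
    return transformed

# Hypothetical input row; AREA uses the 90th percentile value quoted in the text.
row = {"LAMBDA": 0.1, "KOC": 100.0, "AREA": 40500.0, "RADIUS": 150.0,
       "SINFIL": 0.13, "DSOIL": 5.0, "ZB": 10.0}
scaled = transform_inputs(row)
print(scaled)
```

Because the outputs are also on a logarithmic scale, a predicted value y would be converted back to a concentration as 10**y before a DAF is computed.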
Development of each of the four neural networks required the use of three software systems. The
training and test data were produced with EPACMTP, the data matrices were collated into
spreadsheets, which were then imported into the neural network software package NNModel
version 3.2 (Neural Fusion, 1998). Figure 5-2 gives an overview of this process. EPACMTP
was run for 2000 iterations for each input file consisting of a unique combination of input values.
The model simulated the leaching from each WMU type and calculated the corresponding peak
and average 90th percentile monitoring well concentrations, which were then imported to
spreadsheets. The data sets in the spreadsheets were then imported into the neural network
software. Internal neural network training parameters including the training method, number of
training iterations, and the input parameter descriptors were defined and the neural network
training was started. The process of training and testing the neural networks for each WMU is
described in detail in Appendix A.
Figure 5-2 Overview of Neural Network Training Procedure
5.4 Neural Network Performance Evaluation
Following the iterative process of training and testing the neural networks for each WMU, their
predictive ability was evaluated through an independent validation test. The final networks were
used to predict monitoring well concentrations and DAFs for a comprehensive set of input
parameters, which was generated independently from the data on which the networks were
trained. The validation data sets were intended to resemble the site data that a user might enter
into the neural network model. The validation data sets reflect values representative of real site
data and were randomly selected within the domain of values used to train the neural networks.
The same data were also used as input to EPACMTP and the performance evaluation consisted
of comparing the neural network predictions to the actual EPACMTP results.
The results of the neural network validation are summarized graphically in Figures 5-3 through
5-6. These figures show the performance of the neural networks for each of the four WMUs.
The performance of the neural networks is summarized in Table 5-7. This table provides, for
each network, the number of training and testing data sets used, the number of data points in the
final validation test, and the linear correlation, expressed as the coefficient of determination (R²),
for actual (EPACMTP) and predicted concentration values.
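
The coefficient of determination referred to above can be computed as the squared linear (Pearson) correlation between the two series. The sketch below assumes that definition and is applied to log10 concentrations; the function name and the six sample values are hypothetical and are not taken from the validation data.

```python
def r_squared(actual, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    s_aa = sum((a - mean_a) ** 2 for a in actual)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    if s_aa == 0 or s_pp == 0:
        raise ValueError("R^2 is undefined when either sequence is constant")
    return (s_ap * s_ap) / (s_aa * s_pp)

# Hypothetical log10 peak concentrations (EPACMTP vs. neural network).
log_epacmtp = [-3.0, -1.5, 0.0, 2.0, 4.5, 6.0]
log_nn = [-2.2, -1.9, 0.3, 1.8, 4.4, 6.0]
r2 = r_squared(log_epacmtp, log_nn)
```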

The validation results for the landfill neural network are shown in Figure 5-3. The figure shows
the comparison of EPACMTP and neural network-generated peak monitoring well
concentrations for a total of 115 validation cases. The monitoring well concentrations are shown
on a logarithmic scale. The EPACMTP model simulations were all performed using an
(arbitrary) initial leachate concentration of 10⁶ mg/L. The highest monitoring well concentration
obtained also approaches this same value, i.e., it corresponds to the case where the concentration
in the monitoring well is the same as the concentration in the leachate. For this case, the DAF is
equal to 1.0.

For clarity in the presentation of results, the results are shown in ranked order, from the cases
with the lowest monitoring well concentration to the highest. The figure demonstrates that the
neural network performs best in the high range of monitoring well concentrations (i.e., low
DAFs); the prediction of the neural network is poorest for those cases where the monitoring well
concentration was the lowest.
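
The inverse relationship between monitoring well concentration and DAF can be sketched as follows. This is a minimal illustration that assumes the usual definition of the DAF as the leachate concentration divided by the monitoring well concentration; the function name and the sample well concentrations are hypothetical.

```python
LEACHATE_CONC_MG_L = 1.0e6  # arbitrary initial leachate concentration used in the simulations

def daf(well_conc_mg_l, leachate_conc_mg_l=LEACHATE_CONC_MG_L):
    """Dilution attenuation factor: leachate concentration / monitoring well concentration."""
    if well_conc_mg_l <= 0:
        raise ValueError("DAF is undefined for a non-positive well concentration")
    return leachate_conc_mg_l / well_conc_mg_l

# Hypothetical well concentrations, ranked from lowest to highest.
well_concs = [1.0e-2, 1.0, 1.0e3, 1.0e6]
dafs = [daf(c) for c in well_concs]
```

Because the leachate concentration is fixed, the DAFs fall as the well concentration rises, and the case in which the two concentrations are equal gives a DAF of 1.0.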

As will be discussed below, generally better neural network performance was obtained for the
other WMU neural networks, which had gone through more extensive training/testing than the
landfill neural network. Nevertheless, the landfill neural network met the criterion that the
coefficient of determination, R², should equal or exceed 0.9 for both the training data set and the
validation data set (Table 5-7). The relatively poorer prediction in cases of low monitoring well
concentration (high DAF) was deemed acceptable because these cases correspond to situations
where leachate from the waste unit will have relatively little impact on ground-water resources.
Conversely, simulation cases of high monitoring well concentration correspond to situations
where the WMU leachate may have a significant ground-water impact, and accuracy of the neural
network is of greater concern.
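
One way to examine prediction accuracy separately in the low and high concentration ranges is to rank the validation cases by EPACMTP concentration and compare the average error in each half. The sketch below is an illustration only; the split into halves, the function names, and the sample values are assumptions and do not reproduce the evaluation in this document.

```python
def rank_cases(log_epacmtp, log_nn):
    """Pair the two series and sort cases from lowest to highest EPACMTP concentration."""
    return sorted(zip(log_epacmtp, log_nn), key=lambda pair: pair[0])

def mean_abs_error(pairs):
    """Mean absolute difference between the paired log10 values."""
    return sum(abs(a - p) for a, p in pairs) / len(pairs)

# Hypothetical log10 peak concentrations, in arbitrary case order.
log_epacmtp = [4.5, -3.0, 6.0, 0.0, -1.5, 2.0]
log_nn = [4.4, -2.2, 6.0, 0.3, -1.9, 1.8]

ranked = rank_cases(log_epacmtp, log_nn)
half = len(ranked) // 2
low_err = mean_abs_error(ranked[:half])    # low concentration, high DAF
high_err = mean_abs_error(ranked[half:])   # high concentration, low DAF
```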

Figure 5-4 depicts the validation results for surface impoundments. The same trends in
predictive accuracy of the neural network as a function of monitoring well concentration that
were observed in the landfill case are also evident here. Interestingly, although the surface
impoundment neural network visually appears to predict better overall than the landfill
network, the R² value for the validation test is actually somewhat lower (Table 5-7). This may
be attributed to a small number of poor predictions in the
low concentration range.
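
The effect described above, in which a few poor predictions lower R² even though most cases are predicted well, can be reproduced with synthetic numbers. In the sketch below, all values are hypothetical: network A has a moderate error in every case, while network B is exact except for two large misses at the lowest concentrations. R² is taken as the squared Pearson correlation, which is an assumption about the statistic used.

```python
import statistics

def r_squared(actual, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    mean_a = statistics.fmean(actual)
    mean_p = statistics.fmean(predicted)
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    s_aa = sum((a - mean_a) ** 2 for a in actual)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_ap ** 2 / (s_aa * s_pp)

def median_abs_error(actual, predicted):
    """Median absolute difference, a measure of typical (not worst-case) error."""
    return statistics.median(abs(a - p) for a, p in zip(actual, predicted))

# Hypothetical log10 concentrations for 13 validation cases, lowest to highest.
actual = [float(v) for v in range(-6, 7)]

# Network A: an error of 0.5 log units in every case, alternating in sign.
net_a = [a + (0.5 if i % 2 == 0 else -0.5) for i, a in enumerate(actual)]

# Network B: exact everywhere except two large misses in the low range.
net_b = list(actual)
net_b[0] = 0.0    # true value is -6
net_b[1] = -1.0   # true value is -5

r2_a, r2_b = r_squared(actual, net_a), r_squared(actual, net_b)
typical_a = median_abs_error(actual, net_a)
typical_b = median_abs_error(actual, net_b)
```

Network B has the smaller typical error but the lower R², because the two outliers dominate the correlation.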
Results for the waste pile and land application unit validation tests are shown in Figures 5-5 and
5-6, respectively. In both cases, the neural network shows very good prediction accuracy, and
[Figure 5-3: plot of EPACMTP and NN-Predicted series; x-axis: Realization (Nmax=115)]
Figure 5-3 Comparison of EPACMTP Results and Neural Network Predictions of Peak
Monitoring Well Concentration for Landfills.
[Figure 5-4: plot of EPACMTP and NN-Predicted series; x-axis: Realization (Nmax=258)]
Figure 5-4 Comparison of EPACMTP Results and Neural Network Predictions of Peak
Monitoring Well Concentration for Surface Impoundments.
[Figure 5-5: plot of EPACMTP and NN-Predicted series; x-axis: Realization (Nmax=291)]
Figure 5-5 Comparison of EPACMTP Results and Neural Network Predictions of Peak
Monitoring Well Concentration for Waste Piles.
[Figure 5-6: plot of EPACMTP and NN-Predicted series; x-axis: Realization (Nmax=138)]
Figure 5-6 Comparison of EPACMTP Results and Neural Network Predictions of Peak
Monitoring Well Concentration for Land Application Units.
Table 5-1 Neural Network Summary Statistics

Parameter                              Landfill   Surface        Waste Pile   Land Application
                                                  Impoundment                 Unit
Number of training/testing data sets   712        2,747          4,668        4,870
Maximum number of hidden neurons       10         20             32           34
Maximum number of NN iterations        31,000     388,399        160,082      120,073
Training R² (log peak conc.)           0.992      0.992          0.995        0.997
Number of validation data sets         115        253            291          138
Validation R² (log peak conc.)         0.902      0.825          0.978        0.994
correspondingly high R² values (Table 5-7). It may also be noted from this table that the number
of training/testing data sets for the waste pile and land application unit networks was
significantly greater than that used for landfills and surface impoundments.

Collectively, the validation tests show good predictive ability for all four networks, with R²
values equal to or greater than 0.992. The comparison of landfill versus land application
network results also indicates the incremental gains in predictive ability that may be realized
through additional neural network training. The trained neural networks seem to give good
generalized performance (i.e., they are able to predict the DAF to a high degree of accuracy for
input values on which they have not been trained).

Uncertainty Assessment
Because the neural network is an approximation of EPACMTP, which is itself an approximation
of real-world systems, the DAF values predicted by the neural network are inevitably subject to
various sources of uncertainty. It is difficult to quantify the impacts of the various sources of
uncertainty. The selection of a conservatively chosen DAF (i.e., 90th percentile) is one way to
reasonably ensure that the model predictions are appropriately protective in spite of uncertainty.
A preliminary evaluation of the uncertainty associated with the neural network approximation of
EPACMTP was done and is described in the Draft Uncertainty/Sensitivity Analysis Background
Document (U.S. EPA, 1998).

Further investigation into the uncertainty associated with use of infiltration rates less than 0.001
m/yr revealed that constituents with a degradation rate or KOC greater than zero gave monitoring
well concentrations of zero, and were insensitive to infiltration rate. This suggests that the
response surface that the neural network is intended to simulate has a discontinuity in it where
the infiltration rates are low. This affirmed the difficulties in training the networks using low
infiltration rates.
5.5 Integration of the Neural Networks into a Single User-Friendly GUI
Following the development of neural networks for each of the four waste management units, the
predictive coefficients and software that make up each neural network were incorporated into one
user-friendly Windows-based graphical user interface (GUI) that is used to perform both Tier 1
and Tier 2 analyses of WMU liner designs.
To verify that the GUI was passing the correct parameter values and computing DAFs in the
same manner as each individual neural network was demonstrated to do, verification tests were
performed on the GUI using sample data sets of input and output for each of the four neural
networks. The results indicated that the neural networks in the GUI performed as expected.
6.0 APPLICATION OF MODEL RESULTS TO WASTE MANAGEMENT
The Tier 1 LCTV Look-Up tables discussed in Section 4.0 and the neural network approximation
of EPACMTP have both been linked to a Microsoft Windows™-based graphical user interface.
This user-friendly software package (IWEM) allows the user to enter chemical and facility
information and determine if a proposed WMU will be considered protective of ground water.
The use and interpretation of the Tier 1 and Tier 2 evaluations are described in this section
(Sections 6.1 and 6.2, respectively).
then the user can proceed to the Location-Adjusted Evaluation (Tier 2) or conduct a site-specific
ground-water fate and transport analysis (Tier 3).

6.2 Use and Interpretation of Tier 2 Evaluation
The neural networks developed for Tier 2 have been incorporated into the Industrial Waste
Evaluation Model (IWEM) graphical user interface and are designed to provide the user
flexibility in evaluating WMU designs. IWEM incorporates the neural networks for each of the
four WMUs and is designed to provide a user-friendly software tool that allows users to input
location-specific data for up to seven EPACMTP input parameters and quickly determine if a
proposed WMU design will be protective of human health and the environment.

As with Tier 1, a list of chemicals commonly encountered in industrial waste is provided along
with necessary chemical-specific data such as decay rate and sorption coefficients, as well as
HBNs and/or MCLs. IWEM also allows the user to enter user-specified chemicals and required
chemical property data, including user-specified TRLs. The user is requested to input
location-specific data, where available, and to document the source of these data. If
location-specific data are not available, the Tier 2 evaluation provides default values that are
based on representative nationwide data. Upon entering the required data, the user is provided
with recommendations regarding whether or not a specific liner type for a WMU is protective
based on the location-specific data, the modeling results, and on the TRLs for the chemicals of
concern.

Similar to the Tier 1 analysis, the leachate concentration for each waste constituent is compared
with the LCTV determined by the neural networks. If the user chooses to have the infiltration
rate estimated by the model, then the results are presented in terms of two types of liners: no liner
(in-situ soil) and single clay liner. These results are calculated from both MCLs and HBNs. If
the leachate concentrations for all constituents are lower than the no-liner LCTVs, then no liner
is recommended as being sufficiently protective of ground water. If any leachate concentration is
higher than the no-liner LCTV, then at least a single clay liner is recommended. For waste
streams with multiple constituents, the most protective liner that is specified for any one
constituent is the recommended liner design.
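
The decision rule described above can be sketched as follows. This is a minimal illustration and not the IWEM implementation: the function name, constituent names, and concentrations are hypothetical, and the treatment of a leachate concentration exactly equal to the LCTV is an assumption, since the text addresses only the lower and higher cases.

```python
NO_LINER = "no liner"
CLAY_LINER = "single clay liner"

def recommend_liner(leachate_conc, no_liner_lctv):
    """Recommend a liner from per-constituent leachate concentrations and no-liner LCTVs (mg/L)."""
    missing = set(leachate_conc) - set(no_liner_lctv)
    if missing:
        raise KeyError(f"no LCTV for: {sorted(missing)}")
    for name, conc in leachate_conc.items():
        # Assumption: a concentration equal to the LCTV is treated as exceeding it.
        if conc >= no_liner_lctv[name]:
            # One exceeding constituent is enough: the most protective liner governs.
            return CLAY_LINER
    return NO_LINER

# Hypothetical constituents and values, for illustration only.
lctv = {"constituent_a": 0.5, "constituent_b": 2.0}
stream_1 = {"constituent_a": 0.1, "constituent_b": 1.5}  # all below the LCTVs
stream_2 = {"constituent_a": 0.1, "constituent_b": 3.0}  # one above its LCTV
result_1 = recommend_liner(stream_1, lctv)
result_2 = recommend_liner(stream_2, lctv)
```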

If the user has a measured or calculated value for infiltration rate, that value can be directly
input for Tier 2. In this case, the user's expected leachate concentrations are compared
with the LCTVs calculated for this scenario. The modeling results are then presented to the user
detailing whether the given scenario is recommended as being sufficiently protective of the
ground water. These results are calculated for both MCLs and HBNs.
7.0 REFERENCES
Allison, J.D., D.S. Brown, and K.J. Novo-Gradac, 1991. "MINTEQA2/PRODEFA2: A
    Geochemical Assessment Model for Environmental Systems. Version 3.0 User's Manual."
    EPA/600/3-91/021. March 1991. U.S. EPA. Athens, GA.
Allison Geoscience Consultants, Inc., 1997a, A Sensitivity Analysis of the Hydrologic
    Evaluation of Landfill Performance Model (HELP) Using a Monte Carlo Shell,
    Submitted to EPA Office of Solid Waste under EPA Contract No. 68-W6-0053, Work
    Assignment B-20.
Allison Geoscience Consultants, Inc., 1997b, Review of EPACMTP Sensitivity Analysis,
    Submitted to HydroGeoLogic, Inc. under EPA Contract No. 68-W6-0053, Work
    Assignment B-20, December, 1997.
Aziz, A.R.A., and Wong, K.V., 1992, A Neural-Network Approach to the Determination of
    Aquifer Parameters, Ground Water, V. 30, No. 2, pp. 164-166.
Bonaparte, R., J.P. Giroud, B.A. Gross, 1989, Rates of Leakage Through Landfill Liners,
    Conference Proceedings, Volume I, Geosynthetics '89 Conference, San Diego,
    California, February 23, 1989.
Davis, J.C., 1986, Statistics and Data Analysis in Geology, 2nd Edition, John Wiley & Sons, Inc.
Johnson, V.M. and Rogers, L.L., 1995, Location Analysis in Ground-Water Remediation Using
    Neural Networks, Ground Water, V. 33, No. 5, pp. 749-758, September-October, 1995.
Kalos, M.H. and Whitlock, P.A., 1986, Monte Carlo Methods, Wiley, New York.
Neural Fusion, 1998. NNModel Version 3.2 User's Guide. Neural Fusion, Inc.
Rogers, L.L., Dowla, F.U., and Johnson, V.M., 1995, Optimal Field-Scale Groundwater
    Remediation Using Neural Networks and the Genetic Algorithm, Environmental Science
    and Technology, V. 29, No. 5, pp. 1145-1155.
Rizzo, D.M. and Daugherty, D.E., 1994, Characterization of Aquifer Properties Using Artificial
    Neural Networks: Neural Kriging, Water Resources Research, V. 29, No. 3, pp. 483-497.
Salhotra, A., Schanz, R., and Mineart, P., 1988. A Generic Monte Carlo Simulation Shell for
    Uncertainty Analysis of Contaminant Transport Models. Woodward-Clyde Consultants,
    Oakland, California, for the U.S. Environmental Protection Agency, Environmental
    Research Laboratory, Athens, Georgia.
Sarle, W.S., 1994, Neural Networks and Statistical Models, Proceedings of the 19th Annual SAS
    Users Group International Conference, April, 1994.
Schroeder, P.R., et al., 1984. The hydrologic evaluation of landfill performance model (HELP):
    Volume I - Users guide for version I and Volume II - Documentation for version I.
    EPA/530-SW-84-009, U.S. EPA, Washington, D.C. 20460.
Smith, M.S., 1996, Neural Networks for Statistical Modeling, International Thompson Computer
    Press, Boston, MA.
SPSS, 1996, Information obtained from the SPSS Home Page on the World Wide Web. Home
    Page Address: http://www.spss.com/
U.S. EPA. 1987. "EPA Screening Survey of Industrial Subtitle Establishments." Docket No.
    CMLP-50030. December 29, 1987.
U.S. EPA. 1990a. "Toxicity Characteristic Final Rule." 55 FR 11796. March 29, 1990.
U.S. EPA. 1990b. "Statistics of Aquifer Material Properties and Empirical pH-dependent
    Partitioning Relationships for As(III), As(V), Ba(II), Be(II), Cd(II), Cr(VI), Cu(II), Hg(II),
    Ni(II), Pb(II), Sb(V), Se(IV), Se(VI), Tl(I), and Zn(II)." U.S. Environmental Protection
    Agency, Office of Research and Development, Athens, GA, Internal report.
U.S. EPA, 1994. The Hydrologic Evaluation of Landfill Performance (HELP) Model, User's
    Guide for Version 3, U.S. EPA ORD, Washington, DC 20460, EPA/600/R-94/168a,
    September, 1994.
U.S. EPA, 1995. "Hazardous Waste Management System: Identification and Listing of
    Hazardous Waste: Hazardous Waste Identification Rule (HWIR)." 60 Federal Register
    66344.
U.S. EPA. 1996a. "EPACMTP Background Document and User's Guide." Office of Solid
    Waste. Washington, DC.
U.S. EPA. 1996b. "EPACMTP Background Document for the Finite Source Methodology."
    Office of Solid Waste. Washington, DC.
U.S. EPA. 1996c. "EPACMTP Metals Background Document." Office of Solid Waste.
    Washington, DC.
U.S. EPA, 1996d. Location and Risk in the Choice of Industrial Non-Hazardous Solid Waste
    Unit Liner Design. Issue Paper developed by US EPA Office of Solid Waste, Washington,
    D.C.
U.S. EPA. 1997. "Supplemental Background Document; NonGroundwater Pathway Risk
    Assessment; Petroleum Process Waste Listing Determination." 68-W6-0053. Office of
    Solid Waste. Research Triangle Park, North Carolina. March 20.
U.S. EPA. 1999. "Guide for Industrial Waste Management." EPA530-R-99-001. Office of
    Solid Waste. Washington DC.
U.S. EPA. 1998. "Draft Uncertainty/Sensitivity Analysis Background Document." U.S. EPA
    Office of Solid Waste.
APPENDIX A
INDUSTRIAL WASTE MANAGEMENT EVALUATION MODEL (IWEM):
GROUND WATER MODEL TO SUPPORT THE GUIDE FOR
INDUSTRIAL WASTE MANAGEMENT
TABLE OF CONTENTS
A.1 INTRODUCTION
A.2 PROCESS USED TO TRAIN NEURAL NETWORKS
  A.2.1 Development of Training and Test Data Samples
    A.2.1.1 Initial Training and Test Data
    A.2.1.2 Secondary Training and Test Data
  A.2.2 Selection of Neural Network Training Method and Other Training Parameters
    A.2.2.1 Background on Neural Network Training Methods and Parameters
    A.2.2.2 Specific Training Parameters Used to Develop Tier 2
  A.2.3 Neural Network Quality Criteria
  A.2.4 Creation of Validation Data Sets and Evaluation of Final Neural Network Performance
  A.2.5 Performance of Neural Network With Input Data Outside the 10th to 90th Percentile Range
A.3 DETAILS OF NEURAL NETWORK TRAINING FOR EACH WASTE MANAGEMENT UNIT (WMU)
  A.3.1 Landfill Neural Network
    A.3.1.1 Details of the Neural Network Training Process
    A.3.1.2 Performance of the Final Landfill Neural Network
  A.3.2 Surface Impoundment Neural Network
    A.3.2.1 Details of Training Process
    A.3.2.2 Performance of Final Surface Impoundment Neural Network
  A.3.3 Waste Pile Neural Network
    A.3.3.1 Details of Training Process
    A.3.3.2 Performance of the Final Waste Pile Neural Network
  A.3.4 Land Application Unit Neural Network
    A.3.4.1 Details of LAUNN Training Process
    A.3.4.2 Performance of Final Land Application Unit Neural Network
  A.3.5 Recommendations for Further Work
A.4 REFERENCES
LIST OF FIGURES
Figure A.2.1 Overview of Neural Network Training and Testing Process
Figure A.2.2 Overview of Steps in Neural Network Training Process
Figure A.2.3 How to Read the M&P Graph
Figure A.2.4 Comparison of Neural Network Training Progress
Figure A.2.5 Measured vs. Predicted Log of Peak Concentration (logpkconc)
    Graph, Training Matrix, for the Landfill Neural Network
Figure A.2.6 Frequency Distributions of Log (AREA), LFNN, Training Matrix
    (at the Start of Training on the Left, After Training on the Right)
Figure A.2.7 Frequency Distributions of log (λ), LAUNN, Training Matrix
    (at the Start of Training-Right, and After Training-Left)
Figure A.2.8 Input (log KOC) vs. Output (log pk conc). EPACMTP Data
    (left), and Trained Neural Network (right)
Figure A.2.9 Histogram of the Difference Between Predicted and Observed
    (Residual) Log Peak Concentration
Figure A.3.1 M&P Graph for LFNN, Training Data (Top) and Test Data (Bottom)
Figure A.3.2 Histogram of the Difference Between Predicted and Observed
    (Residual) Log Peak Concentration for the LFNN
Figure A.3.3 95% Confidence Interval Graph for LFNN, Training Set
Figure A.3.4 95% Confidence Interval Graph for LFNN, Test Set
Figure A.3.5 Comparison of EPACMTP-Generated and Neural Network
    Predicted Monitoring Well Concentration for Landfills
Figure A.3.6 Interpretation of Predictive Neural Network Capability
Figure A.3.7 M&P-Graph for SINN, Training Section (Top) and Test Section (Bottom)
Figure A.3.8 Histogram of the Difference Between Predicted and Observed
    (Residual) Log Peak Concentration for SINN
Figure A.3.9 95% Confidence Interval Graph for SINN, Training Matrix
Figure A.3.10 95% Confidence Interval Graph for SINN, Test Matrix
Figure A.3.11 Comparison of EPACMTP-Generated and Neural Network
    Predicted Monitoring Well Concentrations for Surface Impoundments
Figure A.3.12 Comparison of EPACMTP-Generated and Neural Network
    Predicted Monitoring Well Concentrations for Waste Pile
Figure A.3.13 M&P-Graph for WPNN, Training Section (Top) and Test Section (Bottom)
Figure A.3.14 Histogram of the Difference Between Predicted and Observed
    (Residual) Log Peak Concentration for WPNN
Figure A.3.15 95% Confidence Interval Graph for WPNN, Training Matrix
Figure A.3.16 95% Confidence Interval Graph for WPNN, Test Matrix
Figure A.3.17 M&P-Graph for LAUNN, Training (Top) and Test (Bottom)
Figure A.3.18 Histogram of the Difference Between Predicted and
    Observed (Residual) Log Peak Concentration for LAUNN
Figure A.3.19 95% Confidence Interval Graph for LAUNN, Training Matrix
Figure A.3.20 95% Confidence Interval Graph for LAUNN, Test Matrix
Figure A.3.21 Comparison of EPACMTP-Generated and Neural Network
    Predicted Monitoring Well Concentrations for Land Application Units
LIST OF TABLES
Table A.2.1 Selected Percentiles of Landfill Parameters
Table A.2.2 Selected Percentiles of Surface Impoundment Parameters
Table A.2.3 Selected Percentiles of Waste Pile Parameters
Table A.2.4 Selected Percentiles of Land Application Unit Parameters
Table A.2.5 Selected Percentiles of Chemical-Specific Hydrolysis (λ)
    and Organic Carbon Partition Coefficients (KOC) Unit Parameter
Table A.2.6 Range of EPACMTP Input Values Used to Train Neural Networks
Table A.2.7 Test Results of Neural Network Classification Skills
Table A.3.1 Summary of Neural Network Data Sets for Landfills
Table A.3.2 Summary of Final LFNN Features
Table A.3.3 Comparison of Observed and Predicted DAF Values, LF-Log
    Peak Well Concentrations
Table A.3.4 Summary of Neural Network Data Sets for Surface Impoundments
Table A.3.5 Summary of Final SINN Features
Table A.3.6 Comparison of Observed and Predicted DAF Values Based on
    Peak Well Concentrations
Table A.3.7 Comparison of Observed and Predicted DAF Values Based on
    Average Well Concentrations
Table A.3.8 Summary of Neural Network Data Sets for Waste Piles
Table A.3.9 Characteristics of WPNN-A and WPNN-B
Table A.3.10 Summary of Final WPNN Features
Table A.3.11 Comparison of Observed and Predicted DAF Values Based
    on WP-Peak Well Concentrations
Table A.3.12 Comparison of Observed and Predicted DAF Values Based
    on WP-Average Well Concentration
Table A.3.13 Summary of Neural Network Data Sets for Waste Piles
Table A.3.14 Summary of Final Neural Networks for Land Application Units
Table A.3.15 Summary of Final LAUNN Features
Table A.3.16 Comparison of Observed and Predicted DAF Values Based on
    LAU-Peak Well Concentrations
Table A.3.17 Comparison of Observed and Predicted DAF Values Based on
    LAU-Average Well Concentrations
LIST OF ACRONYMS
AREA: Area of waste management unit
BEP: Back Error Propagation training method for neural networks
CG: Conjugate Gradient training method for neural networks
DAF: Dilution Attenuation Factor
DSOIL: Depth to water table from the base of a waste management unit
EPACMTP: Environmental Protection Agency-Composite Model for Leachate Migration with
Transformation Products
IWEM: Industrial Waste Evaluation Model
IOG: Input versus Output Graph
IVH: Input Value Histogram (frequency distribution)
KOC: Waste constituent organic carbon partition coefficient
LAMBDA: Waste constituent decay coefficient
λ: Waste constituent decay coefficient
LAU: Land Application Unit
LAUNN: Land Application Unit Neural Network
LCTV: Leachate Concentration Threshold Value
LF: Landfill
LFNN: Landfill Neural Network
log: logarithmic
M&PG: Graph of Measured (observed data) and Predicted (neural network output) values
MvPG: Measured versus Predicted Graph for a trained neural network
NN: Neural Network
OSW: Office of Solid Waste
R²: Coefficient of Determination
RADIS: Radial distance from the waste management unit to ground-water monitoring well
SI: Surface Impoundment
SINFIL: Rate of infiltration of waste leachate from the base of a waste management unit
SINN: Surface Impoundment Neural Network
TBD: Technical Background Document
WMU: Waste Management Unit
WP: Waste Pile
WPNN: Waste Pile Neural Network
ZB: Aquifer thickness
Guide for Industrial Waste Management:
Ground-Water Modeling Technical Background Document
Section A.1 Introduction
December 16,1998
A.1 INTRODUCTION
This document is an appendix to the Technical Background Document (TBD) developed for the
two-tiered approach for the evaluation of waste management unit liner designs (U.S. EPA, 1998).
The discussion of neural networks in Section 5 of the TBD summarizes neural network theory,
and general aspects of neural network modeling performed to develop neural networks for the
two-tiered approach. This appendix describes additional details of the development of non-linear
neural networks that are the basis of EPA Industrial Waste Management Evaluation Model
(EPAIWEM).
This appendix consists of two parts. The first part, Section A.2, describes the development (i.e.
training) and testing of the EPAIWEM neural networks and provides details of the following
aspects of neural network training:
• an overview of the training process,
• selection of data values for the data sets required to train neural networks,
• selection of neural network training parameters, and
• selection of neural network quality criteria.
The second part of this appendix, Section A.3, summarizes the assumptions and methods used to
train neural networks for the four different waste management units (WMU) of concern:
• Landfills (LF),
• Surface Impoundments (SI),
• Waste Piles (WP), and
• Land Application Units (LAU), also known as Land Treatment Units.
Section A.3 describes how the issues outlined in Section A.2 were addressed in the development
of neural networks for each of the four WMUs. Section A.3 highlights the differences in the
training methods used for each WMU based on lessons learned during the sequential
development of each WMU. Tables and figures are provided to summarize important aspects of
the training methods and highlight the advantages and disadvantages of the training process used
for each neural network. Section A.3 concludes with an overview of the performance of the final
neural network produced for each WMU.
Guide for Industrial Waste Management:
Ground-Water Modeling Technical Background Document
Section A.2 Process Used to Train Neural Networks
April 13,1999
A.2 PROCESS USED TO TRAIN NEURAL NETWORKS
This section provides a general overview of the neural network training process, highlighting the
similarities and differences in the development of neural networks for each of the four waste
management units (WMU) of concern: landfills (LF), surface impoundments (SI), waste piles
(WP), and land application units (LAU). In general, the neural network training process
consisted of the following steps:
1. Selection of input values for initial and subsequent neural network training and test data
sets (as described in Sections A.2.1.1 and A.2.1.2) and their use in data samples
(combinations of input values),
2. Selection of neural network training parameters (Section A.2.2),
3. Identification of neural network quality criteria (Section A.2.3), and
4. Creation of validation data sets and evaluation of final neural network performance
(Section A.2.4).
These steps were used to develop neural networks for each of the four WMUs, as outlined in the
flow chart in Figure A.2.1. Neural networks were developed sequentially, beginning with
landfills and progressing through surface impoundments, waste piles, and land application units.
Lessons learned during the training of one neural network were applied to training the next
WMU. In addition, the behavior and the output of EPACMTP varied as a function of WMU.
Therefore, the methods used and the details of the training process varied for each WMU, as
explained further in Section A.3. The evaluation of the performance of each neural network was
based on quality criteria that also evolved during the development of neural networks for each
WMU. During the training processes, a variety of training techniques were applied
simultaneously for each WMU to develop a neural network with the best predictive capabilities,
resulting in multiple neural networks for each WMU. The best performing neural network
developed for each WMU was selected based on its performance using an independent validation
test set consisting of unbiased randomly chosen test data sets.
A.2.1 Development of Training and Test Data Samples
In general, neural network development requires training data (input to the neural network
development software to develop the neural network's predictive capability) and data to test the
trained neural network (test data). The general approach used to develop neural networks for
the Industrial Waste Management Guidance used a training data set, a test data set consisting of
predetermined input parameter combinations that was used to optimize the generalization
capability of the neural network, and an unbiased final test set or validation data set consisting of
randomly selected data samples. After data from the test set were selectively input to the training
data set, additional secondary test data were developed. The purpose of the test data set is to
serve as an intermediate test to assess when the network is sufficiently trained. As indicated in
Figure A.2.1, the training/testing process is an iterative process. The purpose of the unbiased
final validation set is to provide an independent measure of the predictive capability of each of
the four final neural networks (Schwingler, 1996).

[Figure A.2.1: flow chart. Elements: raw data from EPACMTP; creation of initial data set;
sequential/random selection (30%) of validation data set; validation data matrix; neural network
quality criteria (coefficient of determination, R²; XY-graphs of input vs. output; measured vs.
predicted graphs; histograms of input value distribution per input parameter; interrogation of
trained network); IWEM graphical user interface]
Figure A.2.1 Overview of Neural Network Training and Testing Process
Range of Input Values Used to Train Neural Networks
The full range of values for each input parameter is bounded by the 0th and 100th percentile values
obtained from the OSW survey data (U. S. EPA, 1998). The input parameter value ranges for
each of the four WMUs are provided in Tables A.2.1 to A.2.4. As described in Section 5 of the
Technical Background Document (TBD), initial attempts to train on data sets that included the
full 0th to 100th percentile input parameter value ranges were unsuccessful. The reason is that the
tails of the distribution represent extreme values whose probability of occurrence is very low.
Therefore, to produce the best possible predictive tool with broad applicability and acceptable
accuracy, the decision was made to generally train and validate the neural networks using input
values in the range of the 10th to the 90th percentile. The impact of omitting input values between
the 0th and 10th percentiles and between the 90th and 100th percentiles is an issue that may
require further evaluation, as discussed in Section A.2.4.
Range of Output Values Used to Train Neural Networks
Running EPACMTP for the full range of possible input parameter values results in
dilution/attenuation factors or DAFs (see Section 1.0 of the TBD) that range from 1.0 up to
10^30. The highest DAFs may be indicative of extensive contaminant dilution or degradation
(hydrolysis or sorption), or may be the result of placement of the monitoring well outside of the
contaminant plume. However, in general, given the various sources of uncertainty inherent in the
model and in the input data, this analysis focuses on a range of DAFs from 1.0 to 10^6. In
addition, initial attempts to develop neural networks that predicted over a range of DAFs from
1.0 to 10^30 were not successful. Therefore, it was determined that to develop neural networks
with the best overall predictive capability, while still addressing a wide range of DAF variability,
the DAF values used to train the neural networks would focus on the same range of DAF values,
from 1.0 to 10^6.
Table A.2.1 Selected Percentiles of Landfill Parameters
Input Parameter (1)  Units   0%        10%       50%       90%       100%
AREA                 m^2     40.5      979.7     18474.3   150350    2024000
SINFIL               m/yr    0.00001   0.0239    0.13074   0.4511    1.076
DSOIL                m       0.305     1.58984   5.3384    31.9      609.8
ZB                   m       0.305     4.04      10.708    81.49     914
RADIS                m       0         104       427       1220      1610
Table A.2.2 Selected Percentiles of Surface Impoundment Parameters
Input Parameter      Units   0%        10%       50%       90%       100%
AREA                 m^2     13.5      107.9     2700      40500     4047000
SINFIL               m/yr    0.006777  0.08953   0.1614    0.3571    1.981
DSOIL                m       0.305     1.53      6.11      31.3      609.8
ZB                   m       0.305     3.301     10.9      84.1      914
RADIS                m       0         104       427       1220      1610
(1) AREA=area of WMU; SINFIL=regional infiltration rate; DSOIL=depth to groundwater table; ZB=aquifer
thickness; RADIS=distance to receptor well
Table A.2.3 Selected Percentiles of Waste Pile Parameters
Input Parameter      Units   0%        10%       50%       90%       100%
AREA                 m^2     6.75      31        425       8850      1470000
SINFIL               m/yr    0.000254  0.133     0.265     0.548     1.21
DSOIL                m       0.305     1.53      6.51      33.6      609.8
ZB                   m       0.305     3.42      12.06     84.2      914
RADIS                m       0         104       427       1220      1610
Table A.2.4 Selected Percentiles of Land Application Unit Parameters
Input Parameter      Units   0%        10%       50%       90%       100%
AREA                 m^2     20.2      2160      83800     916000    80900000
SINFIL               m/yr    0.00001   0.03397   0.142     0.355     0.745
DSOIL                m       0.305     1.9       7.02      34.1      609.8
ZB                   m       0.305     3.697     18.32     91.55     914
RADIS                m       0         104       427       1220      1610
Table A.2.5 Selected Percentiles of Chemical-Specific Hydrolysis (λ) and Organic
Carbon Partition Coefficients (Koc) Unit Parameters
Input Parameter      Units   0%     10%    50%            90%           100%
λ                    1/yr    0.0    0.0    3.3 x 10^-10   0.039         1.98 x 10^8
Koc                  ml/g    0.0    0.0    98             2.19 x 10^5   3.98 x 10^7
A.2.1.1 Initial Training and Test Data
The training process used to develop neural networks for the four WMUs focused on the
following seven EPACMTP input parameters:
• area (logAREA),
• infiltration rate (SINFIL),
• organic carbon partition coefficient (logKOC),
• decay rate (logLAMBDA),
• depth to the water table (DSOIL),
• aquifer thickness (ZB), and
• distance to receptor well (logRADIUS).
The two EPACMTP output parameters used to train the neural networks were:
• the peak well concentration (log pk conc) and
• the maximum 30-yr average well concentration (log avg conc).
The peak well concentration was used to calculate the Leachate Concentration Threshold Value
(LCTV) for non-carcinogens and the maximum 30-yr average well concentration was used to
develop LCTVs for carcinogens. Logarithmic rather than linear values were used for some of the
input parameters and for the output parameters to transform the parameter ranges to a more
nearly normal distribution and improve the predictive capabilities of each neural network. The
landfill neural network was developed with six input parameters and one output parameter,
whereas the neural networks for surface impoundments, waste piles, and land application units
were developed with seven input parameters and two output parameters. Modeling the landfill
scenario with EPACMTP assumes an essentially steady-state scenario in which the organic
carbon partition coefficient (KOC) has little or no effect on the output. Therefore, the landfill
neural network did not consider KOC as an input parameter and did not use the maximum 30-yr
average well concentration as an output parameter.
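The logarithmic transformation described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the study: the function name, the dictionary layout, and the floor applied to zero values are assumptions, since the text does not state how zero values were handled before taking logarithms.

```python
import math

# Inputs that the text lists with a "log" prefix (logAREA, logKOC, logLAMBDA, logRADIUS).
LOG_INPUTS = ("AREA", "KOC", "LAMBDA", "RADIUS")


def to_network_inputs(sample, floor=1e-9):
    """Return a copy of `sample` with log10 applied to the log-scaled inputs.

    `floor` is a hypothetical guard that keeps log10 defined for zero values.
    """
    transformed = {}
    for name, value in sample.items():
        if name in LOG_INPUTS:
            transformed["log" + name] = math.log10(max(value, floor))
        else:
            transformed[name] = value  # SINFIL, DSOIL, and ZB stay on a linear scale
    return transformed


# Illustrative sample; the values are not taken from any EPACMTP run.
sample = {"AREA": 100.0, "SINFIL": 0.08, "KOC": 1000.0, "LAMBDA": 0.01,
          "DSOIL": 1.4, "ZB": 3.0, "RADIUS": 1000.0}
transformed = to_network_inputs(sample)
```

Working in log space compresses parameters such as AREA and KOC, which span several orders of magnitude, into ranges of only a few units.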
To develop the initial training sets for each of the neural networks, data samples were created by
running EPACMTP for all combinations of input values at their respective minimum, median,
and maximum values (that is, the 10th, 50th, and 90th percentile values). Collectively, these
combinations of input values can be visualized as a grid of points in space, which consists of as
many dimensions as there are input parameters. Visualizing the points as being connected by
lines produces a volumetric star or heptagon (polygon with seven corners), representing the seven
input parameters. These data points are referred to as star points. The combination of all input
parameters with values set to their 10th percentile establishes the inner boundary of the star, and
all input parameter values set to their 90th percentile establish the outer boundary. All other
variations of parameter combinations form the inner structure of this complex geometry. These
star points establish a fundamental conceptual foundation for successful neural network training
by defining the limits of the input variable space. The training data set that contains these star
point values, consisting of all combinations of parameters set to their 10th, 50th, and 90th
percentile values, is called the initial training data set.
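As a minimal sketch of how such a set of star points can be enumerated, the following Python snippet builds every combination of three levels for seven inputs. It assumes that the waste pile values in Tables A.2.3 and A.2.5 are the levels of interest (the exponent sign of the median LAMBDA value is read as negative); the function and variable names are illustrative only, and each resulting combination would correspond to one EPACMTP run.

```python
from itertools import product

# 10th, 50th, and 90th percentile values as read from Tables A.2.3 and A.2.5.
PERCENTILES = {
    "AREA":   (31.0, 425.0, 8850.0),    # m^2
    "SINFIL": (0.133, 0.265, 0.548),    # m/yr
    "DSOIL":  (1.53, 6.51, 33.6),       # m
    "ZB":     (3.42, 12.06, 84.2),      # m
    "RADIS":  (104.0, 427.0, 1220.0),   # m
    "LAMBDA": (0.0, 3.3e-10, 0.039),    # 1/yr
    "KOC":    (0.0, 98.0, 2.19e5),      # ml/g
}


def star_points(levels):
    """Return one dict per combination of the given levels (full factorial grid)."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


points = star_points(PERCENTILES)
inner_corner = points[0]    # every input at its 10th percentile
outer_corner = points[-1]   # every input at its 90th percentile
```

With three levels and seven inputs the grid holds 3^7 = 2187 samples, which illustrates why a finer grid over all inputs quickly becomes impractical.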
After a comprehensive initial training data set has been built, an independent and similarly
comprehensive test data set is developed to measure the performance of the trained neural
network. The results of the testing process provide a measure of the completeness of the training
process and thus, the accuracy and predictive capability of the neural network. The training set
should ideally cover the general range of the response surface of EPACMTP's behavior, and the test
set should address other more subtle variations in the response surface. The test data set should
not be derived from the training data but should contain values within the same data range, to
increase the predictive capability of the neural network with data on which it has not been trained
(Smith, 1996).
The NNModel software (Neural Fusion, 1998) used to develop the four neural networks
provides two options for specifying which data should be used for training and which should be
used for testing. The first option is to have the test set sequentially defined as a subset of data
samples in the overall data set imported into NNModel. The second option is to randomly assign
a given percentage of all data (at least 20%) to the test data set. This second option helps to
ensure that the data set used for testing is similar to but sufficiently different from the training
data. For the initial test data set, values corresponding to the 20th, 40th, 70th, and/or 80th
percentile values of the OSW survey distributions for each input parameter were chosen. These
test data points fall in between the points that were established as star-points. Note that the
neural network was not initially trained on the test set. Test data are data which the model has
not "seen" before. The test set is used to measure how well the model can generalize from the
trends seen in the training data. Based on the results of testing the neural networks, in some
cases test data were transferred into the training data sets.
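The two ways of separating test data from training data described above can be sketched generically in Python. This sketch does not use or reproduce the NNModel interface; the function names, the choice of taking the sequential test set from the end of the data, and the fixed random seed are assumptions made for illustration.

```python
import random


def sequential_split(samples, n_test):
    """Option 1: a contiguous block of samples (here, the last n_test) is the test set."""
    if not 0 < n_test < len(samples):
        raise ValueError("n_test must be between 1 and len(samples) - 1")
    cut = len(samples) - n_test
    return samples[:cut], samples[cut:]


def random_split(samples, test_fraction=0.2, seed=0):
    """Option 2: a given fraction of all samples (at least 20%) is drawn at random."""
    if not 0.2 <= test_fraction < 1.0:
        raise ValueError("test_fraction must be at least 0.2 and below 1.0")
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    test_indices = set(indices[:round(test_fraction * len(samples))])
    train = [s for i, s in enumerate(samples) if i not in test_indices]
    test = [s for i, s in enumerate(samples) if i in test_indices]
    return train, test


samples = list(range(100))  # stand-ins for EPACMTP data samples
seq_train, seq_test = sequential_split(samples, 20)
rnd_train, rnd_test = random_split(samples, 0.2)
```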
Another important aspect of the development of training data sets is the balanced content of a
data set. A training data set should contain a similar number of samples from different "classes"
of data that the neural network should learn, especially if one of the classes has a relatively small
number of samples compared to the others (Swingler, 1996). Classes of data samples are based
on the combinations of input values. For example, a data sample that combines the 10th and 50th
percentile values of the OSW Survey Data as input can be considered as a different class from a
data sample combining the 60th and 80th percentile as input values.
A.2.1.2 Secondary Training and Test Data
Generally, coverage of the input parameter space with data samples using only the 10th, 50th, and
90th percentile input values in the initial data sets created a good generalized data matrix with
some data gaps, given that the neural network model does interpolate between the data points.
Filling the data space between the 10th and 50th percentile range with 20th and 40th percentile
values, for example, decreased the size of the data gaps. Filling in data samples with
combinations of input values between these percentile ranges will minimize the data gaps further.
Continuing this process will eventually cover all possible data space in the model response
surface. However, from a practical point of view, it was not feasible to develop training and test
data sets of all possible combinations. Therefore, methods were developed to target these data
gaps in order to augment the initial training data set. This serves two purposes: it provides more
data samples for neural network training and also strengthens the model's ability to generalize
(Swingler, 1996).
Histograms of input values used in the initial training were used to identify input value ranges
that were sufficiently and insufficiently covered with training data samples. Equal frequencies
of values over regular increments of input values would represent a balanced input value
distribution, the ideal scenario for training a neural network. However, using the
star point method resulted in frequency distributions that were not uniformly distributed, not only
within the range of each individual input parameter, but also between the histograms for each
input parameter for training and test data sets.
Therefore, during the course of training neural networks for each WMU, four different
approaches were used to develop additional or secondary training and test data sets, three of
which required the production of new test data samples. The four methods to develop additional
data are:
1) data samples developed from existing test data using a threshold value of the
prediction residual: test data samples for which the neural network performed poorly
were extracted from the test set and put into the training data set (Group 1 Data
Samples),
2) data samples that were randomly selected from values in the training set for each input
parameter into new combinations (Group 2 Data Samples),
3) data samples selected to correct imbalances in the parameter frequency distributions
(Group 3 Data Samples), and
4) data samples constructed by combining two strategically selected input parameters in
all possible combinations of values (Group 4 Data Samples).
Each of these methods of developing additional training data samples is described in the
following sections.
Group 1 Data Samples
The first group of additional data samples was developed by transferring existing test data to the
training data set, based on a threshold value of the residual (i.e., the difference between the actual
output value (EPACMTP log peak concentration) and the predicted output value (neural network
log peak concentration)). To reduce the size of the residuals, test data samples with residual
values higher than an arbitrarily chosen threshold value of between 0.1 and 0.5 were transferred
from the test data set to the training data set. To maintain balance for the different classes of data
samples (i.e., specific combinations of percentile values) within the training data set, test data
samples in some cases were duplicated and then transferred to the training data set (i.e., used
more than once in the training set).
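The residual-based transfer described above can be sketched as follows. This is an illustrative Python outline under stated assumptions: each sample is an (inputs, actual log output) pair, `predict` stands in for the trained neural network, and the toy predictor and data are invented for the example. Duplication of transferred samples is not shown.

```python
def transfer_by_residual(train, test, predict, threshold=0.5):
    """Move test samples whose absolute residual exceeds `threshold` into training.

    Returns the enlarged training set and the reduced test set.
    """
    kept, moved = [], []
    for inputs, actual in test:
        residual = abs(actual - predict(inputs))
        if residual > threshold:
            moved.append((inputs, actual))
        else:
            kept.append((inputs, actual))
    return train + moved, kept


def toy_predict(x):
    """Hypothetical stand-in for the trained network."""
    return 2.0 * x


train = [(1.0, 2.0)]
test = [(2.0, 4.1), (3.0, 7.0), (4.0, 8.6)]
new_train, new_test = transfer_by_residual(train, test, toy_predict, threshold=0.5)
```

In this toy case the second and third test samples have residuals of about 1.0 and 0.6, so both move to the training set and only the first sample remains for testing.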
Group 2 Data Samples
Group 2 Data Samples contained "new" input values that were not yet used to train the neural
network. However, developing a neural network with seven input parameters requires a large
input data space (i.e. all possible combinations of seven input parameter values) to be filled with
values on which to train. Testing the trained neural network with input values in new
combinations generally did not lead to major increases in the predictive capabilities of the neural
networks. The neural network was not able to interpolate from existing reference values on
which it had been trained. Therefore, it became necessary to target specific ranges of input values
that resulted in better or worse predictive capability of the neural network rather than testing with
randomly selected values.
Group 3 Data Samples
A third method was developed, Group 3 Data Samples, in which data samples were moved from
test to training data sets, again using residual threshold values (as with Group 1 Data Samples),
but with the added goal of targeting the transfer of test data to the training data set to fill specific
data gaps. Data samples were moved from the test to the training data set if the neural network
produced poor predictions (high residuals) for these test data samples. To better target the
existing data gaps in the training data, three classes of data gaps were identified (see Figure
A.2.2):
1) an area of the test data histogram with a high frequency of an input value but lower
frequency in the training data set;
2) conversely, a low frequency of an input value in a test data set histogram but larger
frequency in the training data; and
3) an input value missing in the test data histogram, but present in the
training histogram.
These three classes of data gaps were used as a guide for transferring data samples between
training and test data sets to improve the waste pile neural network and, in slightly modified
form, the surface impoundment neural network (see Section A.3.2).
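The three classes of data gaps listed above can be illustrated with a short Python sketch that compares how often each input value occurs in the training and test sets. Comparing relative frequencies of discrete input values is an assumption made here; the text does not specify the bin widths or the normalization used for the histograms, and the example values are invented.

```python
from collections import Counter


def classify_gaps(train_values, test_values):
    """Label each input value with the data-gap class (1, 2, or 3) it falls into.

    Class 1: more frequent in the test data than in the training data.
    Class 2: less frequent in the test data than in the training data.
    Class 3: absent from the test data but present in the training data.
    A label of 0 marks a value with equal relative frequency in both sets.
    """
    train_counts, test_counts = Counter(train_values), Counter(test_values)
    classes = {}
    for value in sorted(set(train_counts) | set(test_counts)):
        train_freq = train_counts[value] / len(train_values)
        test_freq = test_counts[value] / len(test_values)
        if test_freq == 0 and train_freq > 0:
            classes[value] = 3
        elif test_freq > train_freq:
            classes[value] = 1
        elif test_freq < train_freq:
            classes[value] = 2
        else:
            classes[value] = 0
    return classes


# Hypothetical values of a single input parameter in each data set.
train_values = [1, 1, 1, 2, 3, 5]
test_values = [1, 2, 2, 2, 5, 5]
gap_classes = classify_gaps(train_values, test_values)
```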
The assumption that input values with a high frequency of occurrence in the test data set will lead
to a better prediction of the output than a low frequency input value was confirmed. It was also
apparent that not only does the input value itself have a strong influence on the output result, but
the relationship of this value to other input values is important as well. Specifically, the
distribution and use of the input values (especially the most sensitive ones, such as KOC and
LAMBDA) strongly impacts the ability of the neural network to make accurate predictions. In
addition, understanding the manner in which these values are combined with each other is critical
for improving the predictive capability of a neural network. Thus, the emphasis was to evaluate
the combinations of input values rather than just looking at single input values in isolation.
These observations led to the next type of secondary training and test data sets, Group 4 Data
Samples.
Group 4 Data Samples
Development of this type of additional data set involved the development of strategically selected
pairs of input parameters and combinations of these input values. Three sets of seven to eight
values for each input parameter were selected. The first set contained values that were already
used in the test and/or training data set, as well as new values to fill the data space in between.
The second and third set used similar values, but the number of values already used in training
was decreased. These eight input values for area and infiltration each were combined with each
other in all possible combinations. The five other input parameters were selected from any one
of the three data sets.
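A minimal Python sketch of the pairing step follows. The eight area values, the eight infiltration values, and the fixed values of the five remaining inputs are hypothetical placeholders chosen for illustration; the actual value sets are not given in this section.

```python
from itertools import product

# Hypothetical sets of eight values each for the two strategically selected inputs.
area_values = [10.0, 31.0, 100.0, 425.0, 1000.0, 3000.0, 8850.0, 20000.0]   # m^2
infil_values = [0.02, 0.05, 0.133, 0.2, 0.265, 0.4, 0.548, 0.6]             # m/yr

# Hypothetical fixed values for the five remaining inputs.
other_inputs = {"DSOIL": 6.51, "ZB": 12.06, "RADIS": 427.0,
                "LAMBDA": 3.3e-10, "KOC": 98.0}

# Combine area and infiltration in all possible combinations (8 x 8 = 64 samples).
group4_samples = [dict(other_inputs, AREA=area, SINFIL=infil)
                  for area, infil in product(area_values, infil_values)]
```

Varying only one pair of inputs at a time keeps the number of additional EPACMTP runs small while still resolving how those two inputs interact.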
Summary of Methods Used to Develop Additional Training Data
The method used to augment the initial training data samples varied for each of the four WMUs.
Figure A.2.2 summarizes the steps used in the neural network training process, including the
steps used to improve the neural network training after the initial training and test data sets were
constructed and initial neural network training was completed. It should be noted that some of
these steps were used more than once, whereas other steps may not have been used for one or
more WMUs, depending on the behavior and performance of each neural network. It is worth
noting that as training progressed through each of the four WMUs, we learned which training
approaches were effective and were thus able to streamline training and improve predictive
capabilities for the last two WMUs (waste piles and land application units).
Figure A.2.2 Overview of Steps in Neural Network Training Process
(Flowchart: neural network training; neural network criteria check (yes/no); switch training
methods (CG or BEP); increase number of hidden neurons; check residuals; transfer data from
test to training set; weight data samples in training; develop additional test data; append new
data samples to test set; neural network validation; review of measured vs. predicted graphs and
R values; interrogate with random validation data set; final neural network.)
A.2.2 Selection of Neural Network Training Method and Other Training Parameters
Section 5.0 of the TBD presented an overview of neural networks and how they were developed
for the two-tiered modeling approach. This section provides additional details on the options and
parameters that are used in the neural network modeling software NNModel, to determine how a
neural network is trained. These parameters are used to optimize the training of a neural network
for a given set of independent and dependent variables in a system (i.e. the EPACMTP model).
A brief background on the parameters required to train a neural network is presented in this
section, followed by a description of the specific training methods used and finally, a summary of
the training parameter settings used to train the four neural networks.
A.2.2.1 Background on Neural Network Training Methods and Parameters
A number of parameters and training options are available to the user of NNModel to optimize
the development of neural networks. Values for these "training parameters" are selected based
on the type and characteristics of the system being modeled and on the results of empirical
evaluations of the effects of these training parameters on the predictive capability of the neural
network. While a number of parameters are available, the following were determined to be most
significant with respect to the impact on the training results:
• Initialization
• Maximum number of hidden neurons and hidden layer addition
• Number of eons
• Maximum number of total counts
• Learning Rate and Hlearning Rate
• Training method: Conjugate Gradient Optimization (CG) or Back Error
Propagation (BEP)
Each of these parameters is discussed briefly in this section. The remaining parameters were set
to default values provided in the software and can be found in the software or in the NNModel
User's Guide (Neural Fusion, 1998).
Initialization
Initialization involves setting the weights on the connections between input, hidden and output
layers to random values at the beginning of the training process. These weights will then be
adjusted while training the neural network to minimize the error between the predicted and
observed data. We recognize that initial settings may have a significant effect on overall
performance; however, we did not perform an analysis to determine the optimal initial settings.
Maximum number of hidden neurons
Hidden neurons (also known as hidden nodes) connect the inputs and outputs. The value of
each hidden neuron is the sum of the product of each input times the associated weight to that
neuron. The hidden layer contains the hidden neurons. This layer is called "hidden" because its
nodes make contact only with nodes in the input and output layers; they are "hidden" from the
user of the neural network. Although it is possible for the network to have more than one hidden
layer, a single hidden layer was used to develop Tier 2 neural networks. The content of the
hidden layer (number of hidden neurons) must be determined by the modeler. The number of
hidden neurons determines the complexity of the mapping function carried out by the neural
network. A network can approximate a target function of any complexity, if it has enough hidden
nodes (Smith, 1996).
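The weighted-sum calculation described above can be sketched for a network with a single hidden layer. The tanh activation function and the omission of bias terms are assumptions made to keep the example short; the text does not state which activation function NNModel applies, and the weights shown are arbitrary.

```python
import math


def forward(inputs, hidden_weights, output_weights):
    """Propagate `inputs` through one hidden layer and return the output values.

    hidden_weights[j][i] is the weight from input i to hidden neuron j;
    output_weights[k][j] is the weight from hidden neuron j to output k.
    """
    hidden = []
    for weights in hidden_weights:
        # Sum of the products of each input and its associated weight.
        total = sum(w * x for w, x in zip(weights, inputs))
        hidden.append(math.tanh(total))  # activation function is an assumption
    return [sum(w * h for w, h in zip(weights, hidden))
            for weights in output_weights]


# Two inputs, two hidden neurons, one output; all weights are arbitrary.
outputs = forward(inputs=[0.5, 3.0],
                  hidden_weights=[[1.0, 0.0], [0.0, 0.0]],
                  output_weights=[[2.0, 1.0]])
```

Adding hidden neurons adds rows to `hidden_weights`, which is how the number of hidden neurons controls the complexity of the mapping the network can represent.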
Number of Eons
An eon is the number of presentations of the training data set to the network before checking the
statistics or updating the training progress graph. Continually updating the statistics and the
training progress graph could slow the training process. Therefore an optimum number of eons is
selected that will optimize the time spent training versus the time spent updating and presenting
the statistics of the training process.
Maximum Number of Iterations or Total Counts
The maximum number of iterations or total counts is the number of times the NNModel software
adjusts the weights of the connections between the input nodes and the output nodes.
Learning Rate and Hlearning Rate
A neural network has at least three interconnected components: a layer of input parameters, a
layer of hidden neurons, and a layer of output parameters. The number of input and output
parameters (also known as nodes) is determined by the number of independent and dependent
variables. The input and output parameters are interconnected with each hidden neuron. The
interconnection and the degree to which a neural network is trained are characterized with either a
Learning rate or an Hlearning rate. The Learning rate is the rate of learning (i.e., training)
measured while the NNModel software adjusts the weights of the connections of the hidden layer
to the output layer. The Hlearning rate is the initial learning rate for the input to hidden layer
connections. For additional information refer to the NNModel User's Guide (Neural Fusion,
1998) or Smith (1996). These two learning rates were used for the entire training session for
each WMU neural network. Each was set by initializing the model and was not modified during
training.
Training Methods
Two training methods were used to develop the Tier 2 neural networks: BackErrorPropagation
and the conjugate gradient method. Each is described briefly here.
BackErrorPropagation (also known as backpropagation or BEP) is the method of finding the
optimum values of the weights between the hidden nodes and the input/output nodes. The name
backpropagation is derived from the process of propagating the error information backward from
the output nodes to the hidden nodes (Smith, 1996). For each data sample there is a forward pass
from the input nodes to the output nodes to determine the neural network output values, followed
by a backward pass from the output nodes to the hidden neurons to determine (based on the
difference between the predicted output values and the observed values) how and in which
direction the weights of each connection need to change. The back propagation algorithm adjusts
the weights in the direction in which the slope of the error decreases most rapidly (also known as
the steepest descent method).
The conjugate gradient training method is a neural network training algorithm that performs a
search for the minimum residuals using the first- and second-derivative of the error surface. This
method generally produces faster convergence (in the reduction of the error) than steepest
descent directions (Demuth and Beale, 1997). The CG-technique is a second-order method
which involves calculating an approximation of the second derivative of the error (difference
between predicted and desired output value) with respect to a weight and using this quantity, in
conjunction with the first derivative, to adjust the values of the weights. The first derivative of
the error represents the speed of decrease and the second derivative of the error represents the
rate of decrease of the error (Smith, 1996). This method is more computationally intensive than
BEP and requires more computational time to train. However, it is generally a more accurate
method for optimizing the weights.
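The difference in convergence between steepest descent and the conjugate gradient method can be illustrated on a toy problem. The Python sketch below minimizes a two-weight quadratic error surface, E(w) = 0.5 w'Aw - b'w, rather than a neural network error, and it uses the exact second derivative A instead of an approximation; the matrix, starting point, and tolerance are assumptions chosen for illustration.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def steepest_descent(A, b, w, tol=1e-10, max_iter=10000):
    """Step along the negative gradient with an exact line search."""
    for iteration in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, w))]  # negative gradient
        if dot(r, r) ** 0.5 < tol:
            return w, iteration
        alpha = dot(r, r) / dot(r, matvec(A, r))
        w = [wi + alpha * ri for wi, ri in zip(w, r)]
    return w, max_iter


def conjugate_gradient(A, b, w, tol=1e-10, max_iter=10000):
    """Step along mutually conjugate directions built from successive gradients."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, w))]
    d = list(r)
    for iteration in range(max_iter):
        rr = dot(r, r)
        if rr ** 0.5 < tol:
            return w, iteration
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)
        w = [wi + alpha * di for wi, di in zip(w, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r, r) / rr
        d = [ri + beta * di for ri, di in zip(r, d)]
    return w, max_iter


A = [[10.0, 0.0], [0.0, 1.0]]   # curvature of the toy error surface
b = [10.0, 1.0]                 # the minimum lies at w = [1, 1]
w_sd, n_sd = steepest_descent(A, b, [0.0, 0.0])
w_cg, n_cg = conjugate_gradient(A, b, [0.0, 0.0])
```

On this elongated error surface, steepest descent zigzags toward the minimum over many iterations, whereas the conjugate gradient search needs far fewer, at the cost of more arithmetic per iteration.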
A.2.2.2 Specific Training Parameters Used to Develop Tier 2 Neural Networks
In general, the same neural network training and method parameters were selected for each of the
four WMUs. Specific training methods and training parameter values are described in this
section.
Training parameters
NNModel provides default values for each of the neural network training parameters. However,
to optimize the training of the four neural networks for the Tier 2 analyses, the following
parameter values were used:
• Maximum number of hidden neurons = 10 to 32
• Number of eons: 1 for CG and 20 or 100 for BEP
• Maximum number of total counts = 31,000 to 120,000
• Learning Rate: 0.5
• Hlearning Rate: 1.0
To optimize the adjustment of the weights, the Learning Rate and the Hlearning Rate were set to
a ratio of 0.5:1.0, for all WMUs except for landfills. The landfill neural network was trained with
a ratio of 0.75:1.5 (NNModel default values).
TrainingMethod parameters
Several additional parameters are set to optimize the training process. The following training
method parameters were selected for the neural network development:
• Hidden layer addition: Fixed # of hidden neurons
• Stop network on criteria: None
• Training method: Interchanging use of Conjugate Gradient Optimization (CG)
and Back Error Propagation (BEP)
• All other NNModel training method parameters set to NNModel default values.
EPACMTP Input Parameter Bounds
NNModel also requires a maximum and minimum value for each input parameter (i.e.
independent variable used to predict the dependent variable). These values are provided for each
of the four neural networks in Table A.2.6.
Table A.2.6 Range of EPACMTP Input Values Used to Train Neural Networks
                              Landfills             Surface Impoundments  Waste Piles           Land Application Units
Type    Variable       Unit   Min        Max        Min        Max        Min        Max        Min        Max
Input   KOC                   -          -          0          50118.72   0          50118.72   0          63095.73
Input   logKOC                -          -          -2         4.7        -2         4.7        -2.1       4.8
Input   LAMBDA                0          1          0          0.631      0          0.631      0          1
Input   log LAMBDA            -9         0          -4         -0.2       -4         0.2        -4.1       0
Input   AREA           m^2    39.81      1.99E+06   100        5.01E+04   10         2.00E+04   1000       1.00E+06
Input   log AREA              1.6        6.3        2          4.7        1          4.3        3          6
Input   SINFIL         m/yr   1.00E-05   1.1        0.08       0.6        0.02       0.6        0.10       0.6
Input   DSOIL          m      0.305      610        1.4        32         1.4        34         1.5        35
Input   ZB             m      0.305      914        3          85         3          85         3          95
Input   RADIUS         m      0          1584.9     39.81      1000       39.81      1000       39.81      1000
Input   log RADIUS            -9         3.2        1.6        3          1.6        3          1.6        3
Output  log peak conc         -6         6          0          6          0          6          0          6
Output  log avg conc          -          -          0          6          0          6          0          6
A.2.3 Neural Network Quality Criteria
Network quality criteria are used to judge the performance of the neural networks developed for
each WMU. The criteria used to evaluate the neural networks developed for each of the four
WMUs are:
• the coefficient of determination, R^2,
• the magnitude and distribution of the residuals (difference between predicted and
desired output value),
• graphical displays of the relationship of measured and predicted output values,
and
• results of neural network interrogation
Each of these criteria, which were used to evaluate and improve the performance of neural
networks during the training and testing process, is described in this section.
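Coefficient of Determination, R^2
The coefficient of determination, R^2, is defined as:
$$R^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$$
where $Y_i$ is the desired output value, $\hat{Y}_i$ is the output value predicted by the neural
network, and $\bar{Y}$ is the mean of the desired output values.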
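As a minimal sketch, the ratio of explained to total variation can be computed in Python as shown below. The function name and sample values are illustrative, and the source does not state how NNModel computes its reported R^2; some tools report one minus the ratio of residual to total variation instead, which can differ from this ratio for nonlinear models.

```python
def r_squared(observed, predicted):
    """Return explained variation divided by total variation.

    Both variations are measured about the mean of the observed values.
    """
    mean_observed = sum(observed) / len(observed)
    explained = sum((p - mean_observed) ** 2 for p in predicted)
    total = sum((o - mean_observed) ** 2 for o in observed)
    return explained / total


observed = [1.0, 2.0, 3.0, 4.0]          # e.g., desired log peak concentrations
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])
flat = r_squared(observed, [2.5, 2.5, 2.5, 2.5])
```

A prediction that reproduces every observed value gives a ratio of one, while a prediction that is always equal to the observed mean explains none of the variation and gives zero.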
The goal for each neural network was to reach an R^2-value greater than or equal to 0.9 for both
the training data set and the test/validation set. Neural network training was halted if one of the
following symptoms appeared, which generally demanded a change in training procedure:
1. R^2-values for training and testing stopped increasing,
2. R^2-value for the test data set decreased rapidly while the R^2-value for the training
data set remained constant or continued to increase (a symptom of overtraining),
3. the R^2-value for the training data set decreased,
4. the R^2-value for the training and validation set decreased.
When the training was halted following one of these four symptoms, changes were generally
made to the training parameters or the training method.
Network Error and Residuals
The neural network error, or the individual output residuals, measures the accuracy of the
neural network predictions after a neural network has been trained. The residuals therefore
indicate how much confidence can be placed in a prediction. It is important to distinguish between
predictive errors and uncertainty in the predicted output that arises from uncertainty in the model
inputs. Neural network accuracy may be measured for the neural network as a whole
(R2) or on a case-by-case basis; the neural network may be more accurate for some inputs than
for others (Swingler, 1996). To optimize the accuracy of a neural network, two neural
network quality criteria were monitored. First, the overall neural network accuracy, measured
as R2, was always recorded. Second, the individual residuals, obtained by taking the absolute
difference between the desired output and the predicted output of a training or test data sample,
were investigated. Residual threshold values were used to decide how to proceed in neural
network training, e.g., by duplicating, transferring, or creating additional data samples (see
Section A.2.1.2).
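The residual check described above can be illustrated as follows. This is a minimal sketch: the threshold of 0.5 log units and the sample values are assumptions made for the example, not values reported in this document.

```python
def absolute_residuals(desired, predicted):
    """Absolute difference between desired and predicted output per sample."""
    return [abs(d - p) for d, p in zip(desired, predicted)]


def flag_large_residuals(desired, predicted, threshold):
    """Indices of data samples whose residual exceeds the threshold."""
    residuals = absolute_residuals(desired, predicted)
    return [i for i, r in enumerate(residuals) if r > threshold]


# Hypothetical log peak concentrations; 0.5 is an assumed threshold.
flagged = flag_large_residuals([1.0, 2.0, 3.0], [1.1, 2.9, 3.0], threshold=0.5)
```

Samples flagged this way are the candidates for duplication, transfer between data sets, or the creation of new neighboring samples.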
Overtraining
The neural network was trained on the training set, and training was periodically stopped to
measure the error on the test data set. The smaller the error, the closer the coefficient of
determination, R2, is to one. Both the error and R2 were monitored, but in most cases only R2 was
recorded. Each time the training process was stopped, the neural network weights were saved.
When the error on the test data increases (i.e., R2 decreases), overtraining, also known as
overfitting, has begun. Consequently, training was stopped and resumed from the previously saved
state that produced the lowest error on the test data set (Smith, 1996). The weights were also
saved between lengthy training sessions so that undesired training developments could be corrected
quickly by returning to an earlier training stage without repeating many training steps.
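The stop-and-restore procedure can be sketched as a loop that keeps the best weights seen so far. The callables `train_step` and `test_error` are hypothetical stand-ins for one training interval and one test-set evaluation; they are not NNModel functions.

```python
import copy


def train_with_checkpoints(weights, train_step, test_error, max_rounds):
    """Train in intervals, saving weights whenever the test error improves.

    Training halts as soon as the test error rises above the best value
    seen, and the weights with the lowest test error are returned.
    """
    best_error = test_error(weights)
    best_weights = copy.deepcopy(weights)
    for _ in range(max_rounds):
        weights = train_step(weights)
        error = test_error(weights)
        if error < best_error:
            best_error, best_weights = error, copy.deepcopy(weights)
        elif error > best_error:
            break  # test error is rising: overtraining has begun
    return best_weights, best_error


# Toy example: a single "weight" whose test error is lowest at w = 3.
best_w, best_err = train_with_checkpoints(
    0, train_step=lambda w: w + 1, test_error=lambda w: (w - 3) ** 2, max_rounds=10
)
```

In the toy run the loop stops at the first interval where the test error increases and returns the weights saved one interval earlier.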
If overtraining occurs too early in the development of a neural network, it indicates
problems with the neural network training, which may include too rapid or too large an increase
in a) the number of hidden neurons or b) the number of neural network iterations.
If all other neural network quality criteria, such as high R2-values, have been achieved,
however, overtraining indicates that the neural network training has reached a plateau.
Overtraining generally results from too long a training period for the neural network. It is
indicated by an increasing training data R2 with a simultaneous decrease in the test data R2.
Overtraining can be remedied by changing one or more of the following training parameters:
• the training method (BEP, CG),
• the number of hidden neurons, and/or
• the number and frequency of data samples present in the training data matrix.
Changes can be made in single steps or in combination, although the result of a single change is
easier to analyze than that of a combined alteration. The number of hidden neurons generally was
increased. The number of data samples can be manipulated in several ways, including:
• exchanging test and training data samples, and
• adding new training and test data samples (extension of the overall data samples for
neural network learning).
Each of these methods is described in Section A.2.1.
Swingler (1996) and Demuth & Beale (1997) provide equations for approximating the size of a
neural network (i.e., the number of hidden neurons). However, they emphasize that there is no
single formula for deriving the "right" number of hidden neurons or data samples. The optimum
number of hidden neurons or data samples depends on the system modeled and on the structure
and quality of the training data.
Visual Analysis
Graphical measures of the neural network's characteristics were used to investigate the neural
network structure and the training progress and to characterize the final neural networks. The
graphs used for these purposes are listed below and briefly described in the following paragraphs:
1) Measured & Predicted Graphs (M&P) for log peak concentration,
2) Measured versus Predicted Graphs (MvP) for log peak concentration,
3) Histograms of the input value frequency distributions,
4) Input versus Output Graphs,
5) Residual histogram, and
6) 95% Confidence Interval Graph (CIG).
Graph types 2, 3, and 4 generally were used to document the progress of the training process,
whereas graph types 1, 5, and 6 were used to describe the final neural network performance. The
latter graphs are displayed twice by the NNModel software: once to characterize the final training
data set and once to characterize the final test data set. Each graph is described briefly in the
following sections. Although three of the neural networks were trained on two output parameters,
i.e., the log peak and the log average concentrations, for simplicity the graphs shown here
display only the log peak concentration.
1) Measured & Predicted Graph (M&P) for Log of Peak Concentration
This type of graph (Figure A.2.3) shows the measured and predicted output values for each
observed data sample. The dashed line represents the "measured" value of the selected variable
(i.e., the EPACMTP result) and the solid line represents the predicted value. The closer the
curves for the measured and predicted output, the higher the accuracy of the neural network.
Figure A.2.3 How to Read the M&P Graph
[Figure: "LOGPKCON Measured & Predicted"; x-axis: Observation; annotation: Measured log peak concentration]
Figure A.2.4 Comparison of Neural Network Training Progress
[Figure: two "LOGPKCON Measured & Predicted" panels (left panel N=559); x-axis: Observation]
The left panel of Figure A.2.4 represents a neural network in the early stages of training,
with 559 training samples, 10 hidden neurons, and 2871 neural network iterations; the right
panel displays a trained neural network that was built up through many additions of training data
and has 34 hidden neurons and 120,073 neural network iterations. For purposes of illustration, a
subset of 563 out of a total of 4504 training samples is displayed. The comparison of these figures
demonstrates the accuracy gained during the training process.
2) Measured versus Predicted Graph (MvP) for Log Peak Concentration
The neural network training software, NNModel, provides graphs of the measured output
parameter (EPACMTP log peak concentration) versus the predicted output parameter (neural
network predicted log peak concentration). As seen in Figure A.2.5, for every measured value, a
predicted value is calculated, and a symbol is plotted on the graph. The heading of the graph
indicates the total number of data samples, N, as well as the neural network tolerance, T. The
tolerance is the acceptable error in percent of the total error, and represents a band around the 1:1
fit line (default of ± 5% of total error). The 1:1 fit line is an idealized line with a 45 degree
slope dividing the graph diagonally. The tolerance can generally be used as a training stopping
point. However, this option was not applied in developing the four neural networks.
Ideally, all data points fall within the tolerance boundaries and group as close as possible around
the 1:1 fit line. This ideal scenario was the goal for all four neural networks, and was achieved
most closely for the waste pile and land application unit neural networks. Outliers, data samples
with predicted values that fall obviously outside the tolerance band, are characterized by high
residuals. Preliminary analyses were performed to determine the source of the deviation for
some of these outliers; however, no systematic source of error was identified.
Figure A.2.5 Measured vs. Predicted Log of Peak Concentration
(logpkconc) Graph, Training Matrix, for the Landfill
Neural Network
[Figure: "Measured vs Predicted (LOGPKCON) N=455 T=0.050000"; x-axis: Measured LOGPKCON]
3) Histogram of the Input Values Frequency Distributions (IVH)
The input value histogram (IVH) (see Figure A.2.6) was used to plot all input values for a
selected input parameter. The graph heading indicates the total number of data samples, N, as well
as the name of the input parameter displayed. The x-axis represents the range of the values of the
selected input parameter, and the y-axis represents the percent frequency of each value present
within the total data set. Input values were chosen specifically to provide an even distribution of
values over the 10th to 90th percentile OSW data range with a similar frequency for all values. The
selection of these input values was also based on the frequency distribution of the EPACMTP
Monte Carlo input parameters to assure the coverage of input values most likely to be entered by a
user.
Figure A.2.6 indicates how the frequency distribution of the input parameters improved during
the training process. It also illustrates the concept of balanced data sets. The ideal histogram
would show equal frequencies for each value over the range of values. The distance between the
columns represents uncovered data space (i.e., data gaps). The three columns in the left hand
chart represent the pure "star-point"-distribution (10th, 50th, 90th percentile values from the OSW
survey data). The creation of test data samples starts the process of filling data space between the
values with a high frequency of occurrence. The final result is graphed in the right hand graph.
For the landfill neural network (LFNN), the gaps in between these columns were only partially
filled, indicating the potential for further improvement. Figure A.2.6 shows the log of AREA as
the selected input variable for the landfill neural network at the start of the training session (left)
and at the final training stage (right).
Figure A.2.6 Frequency Distributions of log(AREA), LFNN, Training Matrix (at the start
of training on the left, after training on the right)
[Figure: two histograms titled "Distribution of LOGAREA", N=387 and N=455; x-axis: LOGAREA; y-axis: frequency]
The goal is to reduce the number and size of data gaps. This was best accomplished in the second
example for land application units (LAUNN), which indicates that neural network training can be
extended with additional input values until the data gaps are filled (Figure A.2.7).
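The notion of data gaps in an input value histogram can be made concrete with a short sketch. The functions and the log(AREA) values below are hypothetical; they only illustrate how the frequency of each input value and the widest uncovered interval might be computed.

```python
from collections import Counter


def relative_frequencies(values, decimals=2):
    """Relative frequency of each distinct (rounded) input value."""
    counts = Counter(round(v, decimals) for v in values)
    n = len(values)
    return {value: count / n for value, count in sorted(counts.items())}


def largest_gap(values):
    """Widest interval between neighboring distinct input values."""
    distinct = sorted(set(values))
    return max((b - a for a, b in zip(distinct, distinct[1:])), default=0.0)


# Hypothetical star-point distribution: three values with equal frequency.
star_points = [3.0] * 3 + [4.0] * 3 + [5.0] * 3
# Adding intermediate values fills data space between the columns.
filled = star_points + [3.5, 4.5]
```

Comparing `largest_gap(star_points)` with `largest_gap(filled)` shows the gap shrinking as intermediate samples are added, which is the effect the two histograms are meant to display.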
Figure A.2.7 Frequency Distribution of log(λ), LAUNN, Training Matrix (at the start of
training, right, and after training, left)
[Figure: two histograms titled "Distribution of LOGLAMDA", N=512 and N=1501; x-axis: LOGLAMDA; y-axis: frequency]
4) Input versus Output Graph (IOG or XY-Graph)
After initial training, the neural network will have established patterns between the input and
output parameters, as shown in XY-graphs. Ideally, these trends reflect the observed curves of the
original EPACMTP output as a function of input parameter values. These curves plot the
relationship between a single input parameter and the output parameter.
The graphs provide an opportunity to check the predicted neural network curves for
expected trends. While they might not be identical, they should have the same slopes and
patterns, as is shown in Figure A.2.8. The "Input versus Output Graphs" should not only agree
intuitively with the expected patterns, but their curves should also confirm hydrogeological
realities. For example, the plot of KOC in Figure A.2.8 confirms that with increasing adsorption
(log KOC), the monitoring well concentration (log peak conc) decreases.
Figure A.2.8 Input (log KOC) vs. Output (log pk conc). EPACMTP data (left), and
the trained neural network (right)
[Figure: panels titled "log KOC vs. log pkconc" and "LOGKOC vs LOGPKCON"; x-axes: log KOC and LOGKOC]
5) Histogram of Residuals
The residual histogram depicts the difference between the measured and the predicted output
values for the individual data samples for the final neural network. Two graphs were used, one
for the training data matrix and one for the test data matrix. Frequency histograms
were used to display residuals. The left y-axis represents the frequency distribution of the
residual values, and the right y-axis shows the cumulative distribution in percent. The outline of
the residual columns ideally follows a well-proportioned normal distribution around zero,
indicating that the neural network has potential for high quality predictions (Figure A.2.9).
Figure A.2.9 Histogram of the Difference between Predicted and
Observed (Residual) log peak concentration
[Figure: residual histogram; x-axis: Residual log peak concentration (-0.5 to 0.5); left y-axis: frequency; right y-axis: cumulative percent]
6) 95% Confidence Interval Graph (CIG)
The 95% confidence interval graph is similar to the "Measured versus Predicted Graph" (MvP)
provided in the neural network training software. The confidence interval is a range of values
with a lower and upper limit around a point of interest. The graph also includes the R2-value and
the best fit regression equation. Confidence bands generally run parallel to this best fit line.
Ideally, most data points should fall within the confidence interval and group as close as possible
around the best fit line.
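A simplified version of such a graph's underlying numbers can be sketched as follows. The sketch fits a least-squares line through measured and predicted values and builds a constant-width band of 1.96 residual standard errors around it. The constant width and the normal quantile 1.96 are simplifying assumptions made here; the exact band computed for the graphs in this appendix is not documented in this section, and the sample values are hypothetical.

```python
import math


def best_fit(measured, predicted):
    """Least-squares slope and intercept of predicted against measured."""
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, predicted))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def band_half_width(measured, predicted, z=1.96):
    """Approximate 95% band half-width, parallel to the best-fit line."""
    slope, intercept = best_fit(measured, predicted)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(measured, predicted))
    return z * math.sqrt(sse / (len(measured) - 2))


def fraction_inside(measured, predicted, z=1.96):
    """Share of data points that fall within the approximate band."""
    slope, intercept = best_fit(measured, predicted)
    half = band_half_width(measured, predicted, z)
    inside = sum(
        1 for x, y in zip(measured, predicted) if abs(y - (slope * x + intercept)) <= half
    )
    return inside / len(measured)


# Hypothetical log(DAF) values.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
slope, intercept = best_fit(measured, predicted)
```

A slope near 1, an intercept near 0, and a high fraction of points inside the band correspond to the ideal case described above.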
A.2.4 Creation of Validation Data Sets and Evaluation of Final Neural Network
Performance
Unlike the test data sets, which were created strategically using predetermined combinations of
input values, validation data sets were developed in an unbiased manner to ensure a thorough
evaluation of the general predictive capability of each neural network. The validation data set is
intended to resemble data that a user would enter into the neural network model. The validation
data set used to test the neural network should reflect values representative of real site data and
should consist of randomly selected numbers within the domain of values used to train the neural
network. The purpose of interrogating the neural network with a validation data set is to
determine when to stop training, to determine if overtraining has occurred (i.e., if the neural
network predicts well with data it has seen but not with data it has not seen), and to determine the
reliability of neural network predictions. The neural network with the best
generalization is better identified with a measure of the test-sample error (residuals of the test
or validation data set) than with the training-sample error (residuals of the training data set).
Ideally, the validation data set should contain values that are selected randomly within a
determined range. Input data for the waste area (AREA), infiltration rate (SINFIL), depth to
water table (DSOIL), aquifer thickness (ZB), and distance to receptor well (RADIUS)
were randomly picked by running EPACMTP in Monte Carlo mode to produce up to 200 sets of
input values. The chemical-specific input parameters, KOC and lambda (λ), were randomly
selected from the list of values for 191 constituents listed in the Industrial D Guide. Both random
sets (KOC, λ and the other five inputs) were then randomly matched with each other. The
resulting data set contained data samples with input values in the 10th to 90th percentile of the
OSW Survey Data. These combinations of input values were then entered into EPACMTP input
files and run for 2000 iterations. The resulting data set, including the EPACMTP DAF outputs,
was then input as a validation data set to the trained neural network.
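The random matching of site inputs with chemical-specific inputs can be sketched as below. The field names follow the parameter names used in the text, but the numeric values are placeholders, and keeping KOC and λ together as one pair per constituent is an assumption about how the matching was done.

```python
import random


def build_validation_inputs(site_sets, chemicals, seed=None):
    """Randomly match each set of five site inputs with one (KOC, lambda) pair.

    'site_sets' is a list of dicts holding AREA, SINFIL, DSOIL, ZB and RADIUS;
    'chemicals' is a list of (KOC, lambda) pairs. Both are illustrative inputs.
    """
    rng = random.Random(seed)
    matched = []
    for site in site_sets:
        koc, lam = rng.choice(chemicals)
        matched.append({**site, "KOC": koc, "LAMBDA": lam})
    return matched


# Placeholder values, not data from the OSW survey or the constituent list.
sites = [
    {"AREA": 1.0e4, "SINFIL": 0.10, "DSOIL": 5.0, "ZB": 10.0, "RADIUS": 150.0},
    {"AREA": 5.0e4, "SINFIL": 0.25, "DSOIL": 8.0, "ZB": 20.0, "RADIUS": 300.0},
]
chems = [(10.0, 0.0), (500.0, 0.05), (2000.0, 0.5)]
samples = build_validation_inputs(sites, chems, seed=42)
```

Each resulting sample holds all seven inputs and would then be run through EPACMTP to obtain the target output for validation.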
A.2.5 Performance of Neural Network With Input Data Outside the 10th to 90th Percentile
Range
The validation data set was also used to evaluate the neural network's ability to handle input
values outside the neural network training range. The final neural networks were expected to
deliver accurate predictions for unbiased data samples within the neural network training range
(10th to 90th percentile OSW data). This expectation was met for each WMU, as described in
Sections A.3.1 to A.3.4. While preliminary efforts were made to evaluate the neural network's
performance outside the 10th to 90th percentile training range, additional work is necessary. In
general, the neural network predictions for input parameters outside the 10th to 90th percentile
value range will likely be less accurate than for input values within this range. Also,
combinations of input values outside the 10th to 90th percentile range may produce output values
outside the desired output value range (l
Table A.2.7 Test Results of Neural Network Classification Skills
Input values for each input parameter within a data sample:
• a) all 10th to 90th or b) all 0th to 10th / 90th to 100th
• some 10th to 90th combined with some 0th to 10th / 90th to 100th
EPACMTP DAF   Neural Network DAF   Neural Network Classification correct?
1x10^10       1x10^10              yes (1x10^10 > 1x10^6 and
1x10^10       1x10^13              1x10^13 > 1x10^6)
1x10^20       1x10^20              yes (1x10^20 > 1x10^6
1x10^20       1x10^16              and 1x10^16 > 1x10^6)
1x10^9        1x10^2               no, 1x10^9 > 1x10^6 (outside range) and 1x10^2
Section A3 Details of Neural Network Training for Each WMU
April 13,1999
A.3 DETAILS OF NEURAL NETWORK TRAINING FOR EACH WASTE
MANAGEMENT UNIT (WMU)
This section describes the specifics of the approaches used to train each of the four neural
networks. The neural network training approach is described for landfills, surface
impoundments, waste piles, and land application units in Sections A.3.1 through A.3.4,
respectively. The step-by-step description highlights the difference in training techniques used
for each of the four WMUs, followed by a summary of the characteristics of the final neural
network chosen for each WMU.
A.3.1 Landfill Neural Network
The landfill neural network (LFNN) was the first of the four neural networks that was developed
for the two-tiered approach.
A.3.1.1 Details of the Neural Network Training Process
As described in Section A.2.1 of this appendix, the LFNN was developed for six of the
seven desired EPACMTP input parameters and one output parameter. Landfills are modeled as a
permanent WMU with an essentially steady-state waste constituent mass source, which means
that the chemical-specific sorption coefficient (KOC) is an insensitive input parameter. In
addition, the steady-state contaminant source results in an essentially steady-state contaminant
concentration in the ground-water monitoring well. Therefore, the peak concentration will be
equal to the 30-year average concentration, so that only one output parameter needs to be
incorporated into the neural network. The 30-year average concentration was chosen as the only
output parameter of concern.
The landfill neural network was developed using a training data set and a test data set, a second
training data set to cover more of the input variable space, and finally, a validation data set to
evaluate the performance of the final neural network. Table A.3.1 summarizes how data were
sampled based on the fundamental principles described in Section A.2.1 of this appendix.
Table A.3.1 Summary of Neural Network Data Sets for Landfills
Number of values picked per parameter, LFNN
Run set name      log(AREA)  SINFIL  DSOIL  ZB  log(RADIS)  log(λ)
Train             3          3       3      3   3           3
Test              3          3       3      2   2           3
Train 2           sub-set of Train with 0
Validation Set
rate, depth to water table, log(RADIS), log(AREA), and aquifer thickness (ZB).
Table A.3.2 summarizes important characteristics of the final LFNN. The following section
elaborates on the final performance of the LFNN, including graphical and numerical results.
Table A.3.2 Summary of Final LFNN Features
Neural Network Parameter               Initial                        Final
Number of Training Data Samples        644 (387 training / 257 test)  712 (455 training, 257 test)
Number of Hidden Neurons               1                              10
Number of Maximum NN-Iterations        1                              31000
R2 (log peak conc, training data set)  0                              0.992
R2 (log peak conc, test data set)      0                              0.976 (not including val.-set);
                                                                      0.951 (including val.-set)
A.3.1.2 Performance of the Final Landfill Neural Network
The final LFNN performance is summarized here briefly, followed by some recommendations
for further improvements. The summary is divided into a graphical and a numerical description.
Graphical Results
As discussed in Section A.2.3, several graphical tools are used to evaluate the final performance
of the neural network. As indicated below in Figure A.3.1, the close match of the predicted and
"measured" (i.e., EPACMTP generated) curves confirms the good generalization properties of the
LFNN. The uneven course of the curve, however, indicates that the model can be improved by
providing more data patterns in the response surface, which will extend its generalization skills as
well. The overall ability of the neural network to predict a correct output value, however, is
acceptable, as evidenced by an R2 value >0.9.
In general, the landfill neural network was supplied with fewer data samples than it would need
to establish an ideal model. The test data curve shows less than perfect predictions. The
mismatch is greater than in the M&P graph for the training data set. This means that the neural
network is capable of predicting useful output values for data samples it was trained on, but it
probably predicts with less accuracy for data samples it was not trained on. Therefore, to improve
the predictive capabilities of the LF neural network, further training with additional data is
desirable.
The M&P graph is followed by the histograms of the individual residuals for the data samples of
the final neural network (Figure A.3.2). The figure shows the residual histograms for the training
and test data sets. The training residuals are normally distributed with a mean of zero.
The distribution of residuals for the test set shows a somewhat normal distribution with a mean
that is greater than zero, indicating a bias of the model to slightly underpredict monitoring well
concentrations. This can also be seen in Figure A.3.1.
One more graphical tool, the "95% Confidence Interval Graph" (CI-Graph), was used to confirm
the performance of the LFNN. As shown in Figure A.3.3, a high percentage of all data points fall
within the prediction band of the best fit line, which is a linear regression line through all data
points. That is true for the data samples of both the final training and test data sets. The R2-values
meet the criterion of being greater than 0.9, and the slope of the best fit line is around 1, as can be
read from the regression equation. Each of these measures confirms the accuracy of the LF neural
network performance.
Figure A.3.1 M&P Graph for LFNN, Training Data (Top) and Test Data (Bottom)
[Figure: two "LOGPKCON Measured & Predicted" panels, N=455 (top) and N=257 (bottom); x-axis: Observation]
Figure A.3.2 Histogram of the Difference between Predicted and Observed (Residual) log
Peak Concentration for the LFNN
[Figure: residual histograms, "Training" and "Test" panels; x-axes: Residual LF-log peak concentration, Training Matrix and Test Matrix]
Figure A.3.3 95% Confidence Interval Graph for LFNN, Training Set
[Figure: "95% Confidence Interval for Training Data Examples of Final LF-Neural Network"; best-fit line y = 0.9903x + 0.0134, R2 = 0.9924; legend: Lower 95% CI-Limit, Upper 95% CI-Limit, EPACMTP-Datapoints, Linear (EPACMTP-Datapoints); x-axis: EPACMTP-log(DAF) (measured)]
[Figure: rotated graph; y-axis: NN-log(DAF) (predicted)]
Validation Results
Interrogation of the landfill neural network with the independent validation data samples showed
the neural network's ability to predict output values close to the desired EPACMTP output
values for most of the data samples, especially for the low range of DAF values (from 1.0
through 1,000). Table A.3.3 presents a small sample (16 out of 115) of the results of the
validation test. The examples presented in Table A.3.3 for comparison were chosen randomly
from the validation data set. Note that these test data samples were not included in the neural
network training.
The tabulated results demonstrate the features of the validation test discussed in Section 5.4 of
this report, namely that the neural network produces accurate predictions in the most relevant
region of lower DAF values, e.g., DAF ≤ 100, while the larger deviations occur when the DAF
values are high and concentrations are low.
The validation test results are presented graphically in Figure A.3.5. These graphs have been
discussed in Section 5.4 of this report.
Figure A.3.5 Comparison of EPACMTP-generated and Neural Network Predicted
Monitoring Well Concentration for Landfills
[Figure: legend: EPACMTP, NN-Predicted; x-axis: Realization (0 to 120)]
[Table A.3.3: rotated table of validation results; contents not legible in this copy]
A.3.2 Surface Impoundment Neural Network
This section explains in detail the development of the surface impoundment neural network
(SINN) and is divided into two parts: Section A.3.2.1 describes the training process, and Section
A.3.2.2 explains the characteristics of the final SINN and its predictive capabilities.
A.3.2.1 Details of Training Process
The development of the surface impoundment neural network (SINN) was not as simple as that of
the landfill neural network. SINN training was accomplished through the use of an initial training
data set, an initial test data set, an additional training data set, five additional test data sets, and
finally, a validation data set. In a departure from the training method used for the LFNN, all data
samples were imported into the NNModel software and initially designated as training data.
Immediately after importing the data, 30% of the available data samples were randomly chosen
and assigned to the test set.
Table A.3.4 summarizes how the data were sampled for the individual data set types. As
described in Section A.2.1, the initial training and testing sets were constructed according to the
star-point theory. That is, the "Train1" data set contains combinations of the 10th, 50th, and 90th
percentile values from the OSW survey data; the "Test" data set contains
intermediate percentile values, such as the 20th or 70th percentile; and all possible combinations
of these input parameter values were used. The "Train2" data set is composed of values that were
not chosen according to the star-point theory; rather, they were randomly chosen within the 10th
to 90th percentile range of the OSW survey data. Data values and combinations of values for the
data samples in "Additional Test Set 1" through "Additional Test Set 5" were chosen in order to
fill in gaps within the distributions of values for each input parameter.
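The phrase "all possible combinations" can be illustrated with a full-factorial enumeration. In this sketch the numbers in each list are percentile labels standing in for actual parameter values, and the choice of which percentiles are used for the two-value parameters is an assumption. With the per-parameter counts listed for Train1 in Table A.3.4 (3, 3, 2, 2, 2, 3, 3), the enumeration yields 648 combinations, which matches the number of CMTP runs listed for that set.

```python
from itertools import product


def full_factorial(levels):
    """All combinations of the listed values, one dict per data sample."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


# Percentile labels as placeholders for the actual star-point values.
train1_levels = {
    "AREA": [10, 50, 90],
    "SINFIL": [10, 50, 90],
    "DSOIL": [10, 90],
    "ZB": [10, 90],
    "RADIUS": [10, 90],
    "KOC": [10, 50, 90],
    "LAMBDA": [10, 50, 90],
}
train1_samples = full_factorial(train1_levels)
n_runs = len(train1_samples)  # 3 * 3 * 2 * 2 * 2 * 3 * 3
```

The same function applied to the counts listed for the Test set (2, 3, 2, 2, 2, 3, 3) gives 432 combinations.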
The training, testing, and validation procedure is described in detail following Table A.3.4.
Table A.3.4 Summary of Neural Network Data Sets for Surface Impoundments
Number of values picked per parameter, SINN
Set Name             AREA  SINFIL  DSOIL  ZB  RADIUS  KOC  λ    CMTP runs  Runs used
Train1               3     3       2      2   2       3    3    648        615
Test                 2     3       2      2   2       3    3    432        321
Train1 and Test                                                 1080       936
Train2               2     2       2      2   2       4    4    512        397
Add. Test1           6     6       5      5   6       5    5    160        113
Add. Test2           4     4       4      4   4       4    4    256        183
Add. Test3           3     3       3      3   3       9    9    27         27
Add. Test4           7     7       7      7   7       21   21   63         60
Add. Test5           9     9       9      9   9       27   27   219        219
Validation data set  Each run has its values randomly           281        258
                     selected from the OSW Survey Data
                     10th to 90th percentile or the
                     chemical list (KOC, λ)
Total number of CMTP runs                                       2598       2193
Initial Training (Train1, Test, Train2)
The SINN training was begun by creating several neural networks using the same sets of data
samples (Trainl and Test). We then began training each neural network using different options
in the NNModel software to discover which ones resulted in the most successful training. This
analysis showed that the best training initiation was achieved by importing both the training and
the test data as one data matrix, with all data samples assigned as training data. Before any
training began, 30% of the data were randomly designated as test data. Using this method, the
neural network development evolved slowly but with steady improvement. Thus, this version of
the SINN was chosen for continued training.
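The random designation of 30% of the imported samples as test data can be illustrated as follows. This is a minimal sketch, not the NNModel procedure itself: the function name, the fixed seed, and the rounding rule are assumptions made for the example.

```python
import random


def split_train_test(samples, test_fraction=0.30, seed=0):
    """Randomly designate a fraction of the samples as test data."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = round(test_fraction * len(samples))
    test_idx = set(indices[:n_test])
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Hypothetical data matrix of 1000 samples, identified here only by index.
train, test = split_train_test(list(range(1000)))
print(len(train), len(test))
```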
After initial training, the Train2 data set was also imported, with a random 30% of the data
designated as test data and the rest designated as training data. The training process continued
and was only interrupted for several residual checks (see Section A.2.3 for more details). Based
on the residuals, some test data samples were moved to the training data set. Additionally, these
examinations of the residuals provided a measure of the progression of neural network training
and of the predictive properties of the neural network.
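A residual check of this kind can be sketched as below. The sketch assumes that a residual is the EPACMTP value minus the network prediction (the convention used in the validation tables) and that a sample is moved when the absolute residual exceeds a tolerance; the tolerance value, the function name, and the stand-in predictor are hypothetical, since the criterion actually applied is described in Section A.2.3 rather than here.

```python
def move_poorly_predicted(train, test, predict, tolerance=0.5):
    """Move test samples with large residuals into the training set.

    Each sample is an (inputs, target) pair; predict maps inputs to an output.
    """
    kept, moved = [], []
    for inputs, target in test:
        residual = target - predict(inputs)   # measured minus predicted
        if abs(residual) > tolerance:
            moved.append((inputs, target))
        else:
            kept.append((inputs, target))
    return train + moved, kept


# Stand-in predictor and samples, for illustration only.
def predict(inputs):
    return inputs[0]


train = [((0.0,), 0.0)]
test = [((1.0,), 1.1), ((2.0,), 3.0)]
new_train, new_test = move_poorly_predicted(train, test, predict)
print(len(new_train), len(new_test))
```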
During this stage of network training, two additional test data sets were produced (see Section
A.2.1 for more details). Assuming that the network was moderately well-trained at this point, we
tested the SINN with this test data based on Group 2-data samples.
Training with 2 Test Data Sets based on Group 2 - Data samples (Additional Test Sets 1 and 2)
The input values for the first two additional test data sets (Add. Test Set 1 and 2) were essentially
randomly chosen. This method was used in an attempt to produce an unbiased test of the
network — to mimic the inputs to the neural network that an end-user of the software might
choose. The established data samples consisted mainly of up to two new input values per
parameter and their combinations, whereas each set and sub-set contained four different values
(Table A.3.4). The impact of having only new input values and combinations was significant: the
coefficient of determination, R2, decreased by approximately 0.1 after the new data samples were
appended to the existing test data matrix. This decrease in R2
indicated that the SINN still was not fully trained. We then developed more additional test data
sets (the Group 3 data samples), which used strategically designed input values.
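For reference, one common definition of the coefficient of determination is sketched below. The exact formula used by the NNModel software is not given in this section, so the definition (one minus the ratio of residual to total sum of squares) is an assumption. The sample values are the first five log peak concentrations from the SINN validation table.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# EPACMTP results and network predictions (log peak concentration).
measured = [2.36, 4.14, 4.49, 4.61, 4.67]
predicted = [2.31, 4.28, 4.24, 4.51, 4.71]
print(r_squared(measured, predicted))
```

Adding test samples that the network predicts poorly increases the residual sum of squares, which is why appending new input combinations can lower R2 even when the network itself is unchanged.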
Training with 3 Additional Test Data Sets Based on Group 3 Data Samples (Add. Test Sets 3, 4, and 5)
These additional test data sets were explicitly built for and applied to the SINN using the strategy
for Group 3 data samples (for more detailed information, see Section A.2.1.2). In creating the
Group 3 data samples, we tried to identify regions of the response surface in which the neural
network gave good predictions and regions in which the neural network gave poor predictions.
This information would then be used to select additional values of input parameters to be used in
the Group 3 data samples.
Values for input parameters for the data samples in these additional test data sets were chosen by
examining the histograms of input values for the existing test and training data sets and
performing the following procedure for each input parameter:
1) We looked for input values that contributed to accurate predictions. These values are
represented on the input value histograms as a tall column or a group of columns. That
is, these are input values that are used with a high frequency in the training data set and
tend to be used in data samples for which the neural network gave satisfactory
prediction. Thus, we refer to these values as "good".
2) We looked for input values that resulted in minor contributions to accurate predictions.
These are input values that are used with a low frequency (or are missing) in the
training data set and tend to be used in data samples for which the neural network did
not give satisfactory prediction. Thus, we refer to these values as "bad".
3) We incrementally stepped into the data gap adjacent to "bad" values (either increasing
or decreasing the value, depending on the direction in which the gap was largest or the
columns shortest). The magnitude of these steps was determined as incremental
percentages of the "bad" value (or log value, as appropriate). We refer to these
incremental values as "daughter" values.
4) The input value with the tallest neighboring columns on the histogram was picked as
"the best" value. The input value with the shortest neighboring columns was selected
as "the worst" input value.
5) We constructed combinations of the good/best, bad/worst, and daughter input values,
performed these EPACMTP modeling runs, and used this data to continue the training
of the SINN. Our expectation was that continuing the neural network training using
data which increased the frequency of input values from under-sampled ranges would
improve the neural network's predictive capability.
The process of identifying areas in the input parameters' data space which are insufficiently
sampled in the training data set ("bad" values) and areas in the input parameters' data space
which are sufficiently sampled in the training data set ("good" values) is illustrated in Figure
A.3.6.
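The histogram-based procedure in steps 1 through 5 can be sketched as follows. This is an illustration under stated assumptions, not the procedure as implemented: the bin layout, the sample values, and the helper names are hypothetical, and the daughter values are computed here as fractions of the gap width, which is one of the two step definitions mentioned in this section.

```python
def histogram(values, lo, hi, n_bins):
    """Count how many values fall in each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == hi lands in the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts, width


def daughter_values(bad_value, gap_edge, fractions):
    """Step from a sparsely sampled ("bad") value toward the far edge of the gap."""
    gap = gap_edge - bad_value
    return [bad_value + f * gap for f in fractions]


# Hypothetical LOGKOC values already present in the training data.
log_koc = [0.1, 0.2, 0.3, 0.4, 2.5]
counts, width = histogram(log_koc, lo=0.0, hi=3.0, n_bins=3)
best_bin = counts.index(max(counts))    # tallest column: well sampled
worst_bin = counts.index(min(counts))   # shortest column: under-sampled
daughters = daughter_values(bad_value=1.0, gap_edge=2.0, fractions=[0.05, 0.20])
print(counts, best_bin, worst_bin, daughters)
```

Combinations of the best, worst, and daughter values would then be run through EPACMTP and appended to the network's data matrix, as described in step 5.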
[Figure not reproduced. Histogram titled "Distribution of LOGKOC" (N = 2454), x-axis LOGKOC, annotated with a region of assumed poor predictive capability and a region of assumed good predictive capability.]
Figure A.3.6 Interpretation of Predictive Neural Network Capability
As documented in Table A.3.4, five additional test data sets were used in training the SINN. The
purpose of additional test set 3 was to locate zones in the response surface that were already
sufficiently covered with data samples. Therefore, the data samples in additional test set 3 were
created using the best value of each input parameter and daughter values created using
increments of 5% and 20% of the gap. Adding these additional data samples to the existing
SINN test data matrix did not produce a significant improvement in neural network prediction.
That is, an increment size of 20% was too small to produce the desired improvement. Keeping
this observation in mind, we created additional test set 4.
The purpose of the fourth test data set was to locate areas in the response surface that were
insufficiently covered with data samples. Thus, the data samples in additional test set 4 were
created using the worst value of each input parameter and daughter values created using
increments ranging from 15% to 70% (instead of the maximum value of 20% which was used in
additional test set 3).
Graphical analysis of these additional data samples (appended to the network's test data matrix)
and the neural network predictions indicated good predictive capability. We then created an
additional test set 5, to evaluate the importance of the combinations of values used for a given
input within a testing data set, instead of choosing only single values.
Analyzing the behavior of the network after appending the Additional test set 5 to the existing
neural network indicated that the neural network predictions were reasonably accurate for most
of these data samples.
Validation Test
The neural network training and testing were stopped when the R2 value for the SINN reached
0.980. For a final evaluation of the performance of the SINN, the final neural network was tested
using a validation data set constructed of unbiased data samples (see Section A.2.1.3
for a more detailed discussion of how this type of data set was created).
An evaluation of the graphs of neural network performance indicated that the SINN performed
well for these unbiased data samples when the input values fall within the neural network's
training range (generally values corresponding to the 10th to 90th percentile values from the OSW
Survey Data). This analysis also showed that the neural network's classification skills are well
developed (see Section A.2.1.3 for a detailed discussion of classification).
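Because the network is reliable only within its training range, a user-supplied input can be screened against the 10th and 90th percentile values before the network is queried. The sketch below assumes the percentiles are computed with Python's `statistics.quantiles` using its default method; the function names and the stand-in survey data are hypothetical, and the percentile method applied to the OSW Survey Data is not stated in this section.

```python
import statistics


def training_range(survey_values):
    """Return the (10th percentile, 90th percentile) bounds of the survey data."""
    deciles = statistics.quantiles(survey_values, n=10)  # nine cut points
    return deciles[0], deciles[-1]


def within_training_range(value, bounds):
    """True when the input lies inside the range the network was trained on."""
    lower, upper = bounds
    return lower <= value <= upper


# Stand-in survey data: the integers 1 through 99.
bounds = training_range(list(range(1, 100)))
print(bounds, within_training_range(50, bounds), within_training_range(95, bounds))
```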
Summary
The following observations and conclusions were gained from the training of the SINN:
• The neural network performance is highly dependent on both the input values used in
training data and, perhaps even more importantly, on the combinations of these input
values which are chosen for the training data.
• The two neural network inputs, logKOC and logλ, are the most sensitive inputs for the
SINN. However, the correlation of logKOC and logλ values to the neural network
output decreased dramatically when data samples were used which combine
"good" values of the other five input parameters.
The following table (Table A.3.5) summarizes the properties of the final surface impoundment
neural network.
Table A.3.5 Summary of Final SINN Features

Neural Network Parameter               Initial                    Final
Number of Training Data Samples        936                        2747
                                       (615 training, 321 test)   (2454 training, 293 test)
Number of Hidden Neurons               10                         28
Maximum Number of NN-Iterations        1                          388399
R2 (log peak conc, training data set)  0                          0.992
R2 (log peak conc, test data set)      0                          0.980 (not including Validation Set)
                                                                  0.917 (including Validation Set)
A.3.2.2 Performance of Final Surface Impoundment Neural Network
Graphical Results
As discussed in Section A.2.3, several types of graphs were used to evaluate neural network
performance at intermediate stages of training and to document the performance of the final
network. Figure A.3.7 is the "Measured & Predicted Graph" (M&P-Graph) for the SINN. The
close match of the predicted and measured curves confirms that the SINN has good
generalization properties. Comparison of Figure A.3.7 with Figure A.3.1 shows that: 1) the
predicted curve of the SINN is smoother than that of LFNN, and 2) there is a closer match
between measured and predicted values for the SINN than for the LFNN. These differences are
due in large part to a longer and more intense overall training session for SINN than was used for
the LFNN. The close match shown for the training data on the SINN M&P Graph suggests that
the neural network successfully learned the majority of the data patterns in the training data. This
observation is supported by good fit between the measured and predicted values for the test data
sets, indicating that SINN is generalizing well.
Figure A.3.8 presents two histograms of the individual residuals for the data samples comprising
the final neural network; the first graph in this figure is a histogram for the final training data set,
and the second graph is a histogram of the final test data set. These histograms show the desired
normal distribution of the residuals around zero.
The "95% Confidence Interval Graph" (CIG) was used to confirm the SINN performance. As
shown in the CIGs (Figures A.3.9 and A.3.10), a high percentage of all data points fall within the
confidence band of the best fit line, for both the final training and test data sets. The R2 values
are greater than 0.9, and the slope of the best fit line is approximately equal to 1, as can be read
from the regression equation.

Figure A.3.7 M&P-Graph for SINN, Training Section (Top) and Test Section (Bottom)
[Plots not reproduced. Both panels are titled "LOGPEAKC Measured & Predicted" and are plotted against observation number.]

Figure A.3.8 Histogram of the Difference between Predicted and Observed (Residual)
Log Peak Concentration for SINN
[Histograms not reproduced. Panels: Training and Test. x-axes: residual SI log peak concentration for the training matrix and for the test matrix.]

[Plot not reproduced. Title: "95% Confidence Interval for Training Data Examples of Final SI-Neural Network". x-axis: EPACMTP log(DAF) (measured). Legend: lower 95% CI limit, upper 95% CI limit, EPACMTP data points, linear fit to the EPACMTP data points. Regression line: y = 0.9907x + 0.0165, R2 = 0.992.]
Figure A.3.10 95% Confidence Interval Graph for SINN, Test Matrix
Validation Results
Interrogation of the SINN showed the network's ability to predict output values close to the
desired EPACMTP output values for most of the validation data samples. Tables A.3.6 and
A.3.7 contain a subset (19 out of 258) of data samples randomly chosen from the validation set,
which represents the 10th to 90th percentile OSW-Survey data range, with an emphasis on DAFs
in the range of 1.0 to 1,000. However, DAFs above 1,000 are also predicted well by the SINN
(Figures A.3.9 and A.3.10). The results indicate that the predictions of the SINN are generally
within one order of magnitude of the EPACMTP modeled output value. The results of the
validation test are presented graphically in Figure A.3.11. A discussion of these results is
provided in Section 5.4 of this report.
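The "within one order of magnitude" criterion can be checked directly from the tabulated values. The sketch below rests on an inference rather than on a statement in this section: the DAF and log concentration columns of the validation tables are consistent with DAF = 10^(6 - log C), so a log leachate concentration of 6 is assumed. The function names are hypothetical.

```python
import math

# Assumption inferred from the validation tables: DAF = 10 ** (6 - log C).
LOG_LEACHATE_CONC = 6.0


def daf_from_log_conc(log_conc):
    """Convert a log well concentration to a dilution-attenuation factor."""
    return 10.0 ** (LOG_LEACHATE_CONC - log_conc)


def within_one_order(daf_model, daf_network):
    """True when the two DAFs differ by no more than a factor of ten."""
    return abs(math.log10(daf_model) - math.log10(daf_network)) <= 1.0


# First validation sample: EPACMTP log C = 2.36, network prediction = 2.31.
daf_cmtp = daf_from_log_conc(2.36)
daf_nn = daf_from_log_conc(2.31)
print(round(daf_cmtp, 1), round(daf_nn, 1), within_one_order(daf_cmtp, daf_nn))
```

Because the DAF is an exponential function of the log concentration, a small residual in log units (0.05 here) corresponds to a large absolute DAF residual at high DAFs, while the two values remain well within one order of magnitude.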
[Plot not reproduced. Series: EPACMTP and NN-Predicted. x-axis: Realization (Nmax = 258).]
Figure A.3.11 Comparison of EPACMTP-generated and Neural Network Predicted
Monitoring Well Concentrations for Surface Impoundments.
Table A.3.6 Comparison of Observed and Predicted DAF Values based on Peak Well Concentrations
Validation of SINN

Peak Well Concentration (Non-Carcinogens)

              Neural Network Inputs                        LOG (Concentration)                 DAF
 LOG    LOG     LOG                           LOG      CMTP    NN                      CMTP     NN
 KOC    LAMBDA  AREA   SINFIL  DSOIL   ZB     Radius   Result  Prediction  Residual    Result   Prediction  Residual
 0.13   -0.24   2.00   0.367    6.1     7.3   2.25     2.36    2.31         0.05       4365.2   4897.8      -532.6
 1.66   -2.30   2.31   0.086    9.1     3.0   2.39     4.14    4.28        -0.14         72.4     52.5        20.0
 0.35   -1.20   3.70   0.083    2.1    12.2   2.26     4.49    4.24         0.25         32.4     57.5       -25.2
 1.76   -4.00   2.75   0.137   21.3     3.0   2.42     4.61    4.51         0.1          24.5     30.9        -6.4
 1.26   -4.00   2.08   0.129    4.0     3.0   2.14     4.67    4.71        -0.04         21.4     19.5         1.9
 1.73   -3.10   2.31   0.169    9.1     3.0   2.29     4.73    4.77        -0.04         18.6     17.0         1.6
 1.68   -4.00   3.02   0.083    1.5     4.9   2.37     4.79    4.75         0.04         16.2     17.8        -1.6
 2.71   -2.39   3.93   0.209   21.3     3.0   2.40     4.98    5.04        -0.06         10.5      9.1         1.4
 2.97   -4.00   2.31   0.208    1.5     3.0   2.00     4.99    5.19        -0.2          10.2      6.5         3.8
 0.44   -4.00   3.56   0.160   22.9     3.0   2.48     5.07    5.21        -0.14          8.5      6.2         2.3
 3.06   -4.00   2.89   0.186    1.5     3.0   2.08     5.11    5.34        -0.23          7.8      4.6         3.2
 1.99   -2.35   2.92   0.369    2.4     3.0   2.54     5.12    5.25        -0.13          7.6      5.6         2.0
 2.34   -4.00   3.91   0.116    9.1     3.0   2.43     5.13    5.07         0.06          7.4      8.5        -1.1
 1.70   -4.00   3.68   0.194   16.8     3.0   2.40     5.29    5.21         0.08          5.1      6.2        -1.0
 3.11   -4.00   4.00   0.317   21.3     3.0   1.87     5.38    5.44        -0.06          4.2      3.6         0.5
 0.45   -4.00   2.84   0.295    9.1     3.0   2.13     5.39    5.41        -0.02          4.1      3.9         0.2
 1.05   -4.00   4.65   0.144   18.3     3.7   2.46     5.44    5.44         0             3.6      3.6         0.0
 1.61   -4.00   3.73   0.280    4.0     3.0   2.48     5.46    5.45         0.01          3.5      3.5        -0.1
 1.04   -4.00   1.69   0.247    0.1     3.0   1.88     5.70    5.68         0.02          2.0      2.1        -0.1
Table A.3.7 Comparison of Observed and Predicted DAF Values based on Average Well Concentrations
Validation of SINN

Average Well Concentration (Carcinogens)

              Neural Network Inputs                        LOG (Concentration)                 DAF
 LOG    LOG     LOG                           LOG      CMTP    NN                      CMTP     NN
 KOC    LAMBDA  AREA   SINFIL  DSOIL   ZB     Radius   Result  Prediction  Residual    Result   Prediction  Residual
 0.13   -0.24   2.00   0.367    6.1     7.3   2.25     2.18    2.14         0.04       6606.9   7276.8      -669.9
 1.66   -2.30   2.31   0.086    9.1     3.0   2.39     4.01    4.16        -0.15         97.7     69.3        28.4
 0.35   -1.20   3.70   0.083    2.1    12.2   2.26     4.31    4.11         0.20         49.0     78.3       -29.3
 1.76   -4.00   2.75   0.137   21.3     3.0   2.42     4.54    4.42         0.12         28.8     38.2        -9.4
 1.26   -4.00   2.08   0.129    4.0     3.0   2.14     4.49    4.57        -0.08         32.4     27.1         5.3
 1.73   -3.10   2.31   0.169    9.1     3.0   2.29     4.58    4.65        -0.07         26.3     22.4         3.9
 1.68   -4.00   3.02   0.083    1.5     4.9   2.37     4.63    4.63         0.00         23.4     23.4         0
 2.71   -2.39   3.93   0.209   21.3     3.0   2.40     4.92    4.95        -0.03         12.0     11.3         0.7
 2.97   -4.00   2.31   0.208    1.5     3.0   2.00     4.84    5.03        -0.19         14.5      9.3         5.2
 0.44   -4.00   3.56   0.160   22.9     3.0   2.48     4.97    5.08        -0.11         10.7      8.4         2.3
 3.06   -4.00   2.89   0.186    1.5     3.0   2.08     4.98    5.18        -0.20         10.5      6.6         3.9
 1.99   -2.35   2.92   0.369    2.4     3.0   2.54     4.95    5.11        -0.16         11.2      7.7         3.5
 2.34   -4.00   3.91   0.116    9.1     3.0   2.43     5.03    4.95         0.08          9.3     11.1        -1.8
 1.7    -4.00   3.68   0.194   16.8     3.0   2.40     5.16    5.09         0.07          6.9      8.2        -1.3
 3.11   -4.00   4.00   0.317   21.3     3.0   1.87     5.33    5.31         0.02          4.7      4.9        -0.2
 0.45   -4.00   2.84   0.295    9.1     3.0   2.13     5.21    5.25        -0.04          6.2      5.6         0.6
 1.05   -4.00   4.65   0.144   18.3     3.7   2.46     5.33    5.31         0.02          4.7      4.9        -0.2
 1.61   -4.00   3.73   0.280    4.0     3.0   2.48     5.29    5.29         0.00          5.1      5.1         0
 1.03   -4.00   3.15   0.246    9.1     3.0   2.02     5.32    5.32         0.00          4.8      4.7         0.1
A.3.3 Waste Pile Neural Network
This section explains in detail the development of the waste pile neural network (WPNN) and is
divided into two parts: Section A.3.3.1 describes the training process, and Section A.3.3.2 explains
the characteristics of the final neural network and its predictive capabilities.
A.3.3.1 Details of Training Process
The development of the waste pile neural network (WPNN) benefitted from lessons learned
during the training of the surface impoundment neural network. The selection of values for the
training data sets and the training process itself were significantly streamlined. The WPNN was
developed with an initial training data set, an initial test data set, an additional training data set,
25 additional test data sets, a chemical test data set, and finally a validation set. As was done for
surface impoundments, the initial training and test data were combined in one data set. Before
training began, 30% of the available data samples were randomly chosen and assigned to the test
data set.
Table A.3.8 summarizes how the data were sampled for the individual data sets. As described in
Section A.2.1, the initial training and testing sets were constructed according to the star point
theory. That is, the "Train1" data set contains combinations of the 10th, 50th, and 90th percentile
of the OSW-survey data, and the "Test" data set contains intermediate percentile values such as
the 20th or 70th percentile. All possible combinations of these input parameters were used.
The "Train2" data set is composed of values that were not chosen according to the star-point
theory; rather, they were randomly chosen within the 10th to 90th percentile range of the OSW
survey data. Additionally, these data samples were chosen in an attempt to provide an optimal
coverage of values in the input data space. Data values and combinations for the additional test
sets were chosen to fill gaps within the distributions of values for each input parameter.
The training, testing, and validation procedure for the WPNN is summarized in Table A.3.8.
Table A.3.8 Summary of Neural Network Data Sets for Waste Piles

Number of values picked per parameter, WPNN

Set Name                    AREA    SINFIL  DSOIL   ZB      RADIUS     KOC     λ       CMTP Runs  Runs Used
Train                       2       3       3       2       2          3       3       648        358
Test                        2       3       3       2       2          3       3       648        311
Train + Test                                                                           1296       669
Train2                      3       3       2       2       2          3       2       432        430
Add.Test1                   2       3       2       2       2          3       3       432        432
Add.Test2                   2       2       2       2       2          2       2       128        128
Add.Test3                   2       2       2       2       2          2       2       128        126
Add.Test4                   1       1       1       1       1          8       8       64         64
Add.Test5                   1       1       1       1       1          8       9       24         15
Add.Test sets 6 to 25       1 or 8  1 or 8  1 or 8  1 or 8  1, 8 or 7  1 or 8  1 or 8  1248       1129
Additional Test Sets 26-30  10      15      15      10      10         5       5       360        227
Validation Set              (see note)                                                 140        64
Total number of CMTP runs                                                              4254       3284

Note: For the validation set, each run has its values randomly selected from the OSW Survey data
10th - 90th percentile range or from the chemical list (KOC, λ).
Initial Training (Train, Test, Train2)
As was done for the surface impoundment neural network training, the WPNN training was
started with several neural networks using the same data sets. Each of these neural networks
was trained using different options in the NNModel software to discover which ones resulted in
the most successful training. The best initial training was achieved with a mixed training and test
data set. Therefore, the neural network chosen for continued training was the one in which the
test data were randomly assigned as 30% of the total number of data samples.
During neural network development, the second training data set (Train2) was added to the
neural network. The training process was interrupted to perform several residual checks (see
Section A.2.3 for details of this process). Based on the residuals, some test data samples were
moved to the training data set. Additionally, examination of the residuals provided a measure of
the progress of the neural network training and of the predictive properties of the neural network.
During this stage of neural network training, additional test data sets were produced (see Section
A.2.1 for more details). Assuming that the network was moderately well-trained at this point, we
tested the WPNN with these test data, which are based on Group 2 data samples.
Training with 3 Additional Test Data Sets based on Group 3 Data Samples
Using the initial distribution of the input values for each input parameter, these data sets were
created using the theory for Group 3 data samples (see Section A.2.1.2 for more details). New
and existing input values were selected (generally two or three values per parameter) in all
possible combinations. The three additional test sets constructed in this manner were then
appended to the test data matrix of the WPNN. The goal of constructing these data sets was to
more completely sample areas of the input parameters' data space that are insufficiently sampled
in the existing training data set. This approach worked well, as was demonstrated by the
WPNN's rapidly improved R2 values.
However, some ranges of input values still did not produce accurate predictions. Therefore, a
new approach was used to develop the next set of additional testing data.
Training with Additional Test Data - Test Set 4
This validation data set was created to determine the importance of combinations of input values
within the data set. Thus, eight existing input values for logKOC and logλ (the most sensitive
input parameters) were selected and combined with one value for each of the other five input
parameters, a value which was already used in the training data (e.g., the 10th, 20th, or 50th percentile
value from the OSW Survey data). Neural network training continued after these additional data
sets were appended three times to the training data matrix. This duplication of data sets
"emphasized" to the network the importance of these new data. The R2 values of the WPNN
continued to improve.
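The construction of Test Set 4 can be sketched as a two-parameter factorial with the remaining inputs held fixed. The numerical values and key names below are hypothetical placeholders; only the structure (eight values each for logKOC and logλ, one value for each of the other five inputs) is taken from the text and from Table A.3.8.

```python
import itertools


def sensitive_pair_design(log_koc_values, log_lambda_values, fixed_inputs):
    """Combine every (logKOC, loglambda) pair with one fixed set of other inputs."""
    runs = []
    for log_koc, log_lambda in itertools.product(log_koc_values, log_lambda_values):
        run = dict(fixed_inputs)        # copy so each run is independent
        run["LOGKOC"] = log_koc
        run["LOGLAMBDA"] = log_lambda
        runs.append(run)
    return runs


# Placeholder values, not taken from the OSW Survey data.
log_koc_values = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
log_lambda_values = [-4.0, -3.5, -3.0, -2.5, -2.0, -1.5, -1.0, -0.5]
fixed_inputs = {"AREA": 2.5, "SINFIL": 0.2, "DSOIL": 9.1, "ZB": 3.0, "RADIUS": 2.3}

runs = sensitive_pair_design(log_koc_values, log_lambda_values, fixed_inputs)
print(len(runs))
```

Eight values for each of the two sensitive inputs give 8 x 8 = 64 runs, which matches the number of CMTP runs listed for Add.Test4.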
Training with Additional Test Data - Test Set 5
Test Set 5 was designed and used to test how well the neural network generalized with the
appended data in Additional Test Set 4. Test Set 5 used some of the same input values as Add.
Test Set 4, but these values were used in different combinations than had been used in Add.
Test Set 4. How these data samples were used in further training is described in the next section,
which elaborates on Group 4 Data Samples.
Training with Group 4 Data Samples
The R2 values at this point of training were extraordinarily high and the graphs indicated a well
trained neural network. Therefore, further network testing was performed to evaluate the
performance of the WPNN. Twenty-one additional test sets composed of Group 4 Data Samples
were created to test the input value combinations (see Section A.2.1.2 for more details on Group
4 Data Samples). These 21 additional test sets and Test Set 5 were utilized in further training of
the network together as a group (1144 data samples total). Neural network training reached a
plateau after which the R2 values did not increase.
To produce an acceptably trained WPNN and to identify the best and the most time-efficient
network training methods, a copy of the most recent WPNN was created and the training of these
two neural networks (WPNN-A and WPNN-B) continued using different methods for adding
data to the training set. In WPNN-A, only the training data matrix was appended. All new 1144
validation data examples were appended three times to the training matrix. Note that data
duplication helps to balance data combinations in a data set (see Section 2.1.1 for more details).
In WPNN-B, all 1144 data samples were initially appended once to the test data set. Then, using
a check of the residuals on this expanded test set, some data samples were moved from the test
data set to the training data set. These transferred data samples were appended three times to the
training data set to ensure the neural network trained well on these values. After training both
networks to similar conditions (e.g., similar number of hidden neurons and number of learning
iterations), both networks produced acceptable results. The following table (Table A.3.9)
summarizes the characteristics of the two trained neural networks.
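The two strategies for adding the new data samples can be contrasted in a short sketch. The threefold replication follows the text; the residual tolerance, the function names, and the stand-in predictor are assumptions made for illustration.

```python
REPLICATION = 3  # new samples are appended three times to the training matrix


def append_strategy_a(train, test, new_samples):
    """WPNN-A: append all new samples, replicated, to the training matrix only."""
    return train + new_samples * REPLICATION, list(test)


def append_strategy_b(train, test, new_samples, predict, tolerance=0.5):
    """WPNN-B: append to the test set, then move poorly predicted samples."""
    moved, kept = [], []
    for inputs, target in new_samples:
        if abs(target - predict(inputs)) > tolerance:
            moved.append((inputs, target))
        else:
            kept.append((inputs, target))
    return train + moved * REPLICATION, test + kept


# Stand-in predictor that always returns zero, for illustration only.
def predict(inputs):
    return 0.0


new_samples = [((0,), 0.0), ((1,), 5.0)]
train_a, test_a = append_strategy_a([], [], new_samples)
train_b, test_b = append_strategy_b([], [], new_samples, predict)
print(len(train_a), len(test_a), len(train_b), len(test_b))
```

Strategy B leaves well-predicted samples in the test set, so it yields a smaller training matrix and a larger test matrix than strategy A, which is the pattern seen in the sample counts reported for WPNN-A and WPNN-B.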
Table A.3.9 Characteristics of WPNN-A and WPNN-B

          Hidden   Total Number   R2 log pk conc    R2 log avg conc   Number of Data Samples
Network   Neurons  of Counts      Train    Test     Train    Test     Train    Test
A         32       160050         0.996    0.995    0.996    0.996    5403     464
B         32       160082         0.995    0.996    0.996    0.996    3795     873

R2 log pk conc = Linear Coefficient of Determination of receptor well peak concentration (log)
R2 log avg conc = Linear Coefficient of Determination of receptor well average concentration (log)
Validation of Neural Network
The neural network training and testing was stopped at this point, given that there were now two
neural networks which produced acceptable predictions. The two neural networks were validated
using two types of validation data sets: a chemical validation set and an unbiased validation set.
These two sets were appended to the test matrices of WPNN-A and WPNN-B.
These validation data were designed to provide the data necessary to decide which WPNN should
be used as the final neural network. The unbiased validation set was constructed using random
values for the input parameters, as described in Section A.2.1.3. The logKOC and logλ values
for the chemical validation set were determined by choosing five representative waste
constituents from among those considered for this guidance. The other input values for the
chemical validation set were chosen randomly, as described in Section A.2.1.3.
The measured versus predicted graphs for WPNN-A and WPNN-B indicated that both neural
networks are able to deliver accurate results for random data entries when the input values fall within
the neural network's training range (10th to 90th percentile OSW-Survey Data). However, the R2
values of WPNN-B are slightly better than those of WPNN-A. For this reason, WPNN-B was
chosen as the final neural network. Further analysis of WPNN-B showed that this network had
the best classification capability and that this network generally produces good predictions when
interrogated (see Tables A.3.11 and A.3.12). The results of the validation test are shown in Figure
A.3.12.
[Plot not reproduced. Series: EPACMTP and NN-Predicted. x-axis: Realization (Nmax = 291).]
Figure A.3.12 Comparison of EPACMTP-Generated and Neural
Network-Predicted Monitoring Well Concentrations for
Waste Piles.
Summary
A neural network that met the neural network quality criteria was developed. WPNN-B was
chosen as the final neural network for waste piles. Table A.3.10 summarizes the properties of
the final waste pile neural network.
Table A.3.10 Summary of Final WPNN Features

Neural Network Parameter                  Initial                    Final
Number of Training Data Examples          669                        4668
                                          (358 training, 311 test)   (3795 training, 873 test)
Number of Hidden Neurons                  10                         32
Maximum Number of NN-Iterations           1                          160082
R2 (log peak conc, training data set)     0                          0.995
R2 (log peak conc, test data set)         0                          0.996 (not including validation sets)
                                                                     0.990 (including validation sets)
R2 (log average conc, training data set)  0                          0.996
R2 (log average conc, test data set)      0                          0.996 (not including validation sets)
                                                                     0.992 (including validation sets)
The most sensitive input parameters for the WPNN are logKOC and logλ. Using Group 3 and 4
data samples led to quick success in the WPNN training. Using a large number of Group 4 data
samples was especially helpful in efficiently covering the input data space. The extremely high
R2 values for the WPNN attest to the high accuracy of this network's predictions.
The training methodology employed for the WPNN was a result of lessons learned during the
training of the SINN. However, WPNN training could be continued, if a further improvement of
the network's performance is desired.
A.3.3.2 Performance of the Final Waste Pile Neural Network
Graphical Results
As discussed in Section A.2.3, several types of graphs were used to evaluate the final performance
of the network. Figure A.3.13 shows a part of the "Measured & Predicted Graph"
(M&P-Graph) for the WPNN in which the network output is log peak concentration. The close
match of the predicted and measured curves confirms that the WPNN has good generalization
properties. This graph suggests that the neural network successfully learned the majority of the
data patterns in the training data. This observation is supported by the good fit between the
measured and predicted values for the test data sets (data samples that the neural network was
not trained on), indicating that the WPNN is generalizing well.

Figure A.3.13 M&P-Graph for WPNN, Training Section (Top) and Test Section
(Bottom)
[Plots not reproduced. Both panels are titled "LOGPEAKC Measured & Predicted" and are plotted against observation number.]
Figure A.3.14 presents two histograms of the residuals for the data samples comprising the final
neural network; the first graph in this figure is the histogram for the training data set, and the
second graph is the histogram for the test data set. These histograms show the desired normal
distribution of the residuals around zero.
The "95th Confidence Interval Graph" (CIG) was used to confirm the WPNN performance. As
shown in the CIGs (Figures A.3.15 and A.3.16), a high percentage of all data points fall within
the confidence band of the best fit line, for both the final training and test data sets. The
R² values are greater than 0.9, and the slope of the best fit line is approximately equal to 1,
as can be read from the regression equation.
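The two quantities read from a CIG, the slope of the best fit line and R², can be sketched in a
few lines. The block below is only an illustration: it uses the first five rows of the log
concentration columns of Table A.3.12 rather than the full training or test matrices, and the
function names are hypothetical, not part of the neural network software used for this work.

```python
# Illustrative least-squares check of predicted vs. observed log concentrations.
# Data: first five rows of Table A.3.12 (CMTP result, NN prediction).
observed = [0.40, 2.10, 4.12, 4.14, 4.09]   # EPACMTP log concentration
predicted = [0.55, 2.40, 4.20, 4.13, 4.02]  # neural network log concentration


def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Coefficient of determination of the least-squares line fitted to (x, y)."""
    slope, intercept = least_squares_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


slope, intercept = least_squares_line(observed, predicted)
print(f"slope = {slope:.3f}, R2 = {r_squared(observed, predicted):.4f}")
```

A slope near 1 together with an R² above 0.9 corresponds to the acceptance pattern described
in the paragraph above.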
Numerical Results
Interrogation of the WPNN showed the network's ability to predict output values close to the
desired EPACMTP output values for most of the validation data samples. Tables A.3.11 and
A.3.12 contain data examples randomly chosen from the validation set, which represents the range
of 10th to 90th percentile values from the OSW Survey data, with an emphasis on DAFs in the
range of 1 to 1,000. However, DAFs above 1,000 are also predicted well by the WPNN. The
results indicate that the predictions of the WPNN are generally within one order of magnitude of
the EPACMTP modeled output value.
Figure A.3.14 Histogram of the Difference between Predicted and Observed (Residual) log peak
concentration for WPNN
(Two panels, Training and Test; horizontal axis: residual WP-log peak concentration.)
Table A.3.12 Comparison of Observed and Predicted DAF Values Based on WP-Average Well Concentrations
Validation of WPNN; Average Well Concentration (Carcinogens)

Neural Network Inputs                               LOG (Concentration)           DAF
   LOG     LOG    LOG                          LOG    CMTP          NN               CMTP          NN
   KOC  LAMBDA   AREA  SINFIL  DSOIL     ZB  Radius  Result  Prediction  Residual     Result  Prediction   Residual
  4.30   -1.70   2.75   0.227    6.1    8.3    2.15    0.40        0.55     -0.15   398107.2    281838.3   116268.9
  4.00   -2.80   2.97   0.168   18.3   37.4    2.38    2.10        2.40     -0.30     7943.3      3981.1     3962.2
  3.40   -4.00   2.93   0.172   15.2   18.3    2.19    4.12        4.20     -0.08       75.9        63.1       12.8
 -0.50   -4.00   1.91   0.168   18.3   37.4    2.17    4.14        4.13      0.01       72.4        74.1       -1.7
  0.70   -1.50   3.57   0.152    9.1   34.1    2.51    4.09        4.02      0.07       81.3        95.5      -14.2
  1.20   -4.00   1.61   0.172    2.1   12.2    2.18    4.12        4.13     -0.01       75.9        74.1        1.8
  4.00   -4.00   2.61   0.353    4.9   22.0    1.94    4.27        4.36     -0.09       53.7        43.7         10
  1.80   -4.00   1.61   0.136    5.2   10.1    1.84    4.37        4.37      0.00       42.7        42.7          0
  1.00   -4.00   2.56   0.136    6.1   21.3    2.29    4.43        4.42      0.01       37.2          38       -0.8
  2.00   -4.00   2.08   0.531    1.8     61    2.24    4.52        4.52      0.00       30.2        30.2          0
  1.20   -4.00   2.65   0.269    9.1   64.7    2.31    4.58        4.57      0.01       26.3        26.9       -0.6
  1.80   -3.10   1.91   0.531    7.6   30.5    2.09    4.67        4.57      0.10       21.4        26.9       -5.5
  3.10   -4.00   3.00   0.152    3.1   18.3    1.88    4.78        4.79     -0.01       16.6        16.2        0.4
  1.20   -4.00   2.67   0.251    5.2   10.1    2.49    4.71        4.65      0.06       19.5        22.4       -2.9
  1.10   -2.50   1.61   0.423    1.7    7.3    2.01    4.73        4.78     -0.05       18.6        16.6          2
  1.10   -4.00   3.14   0.310    3.7   18.3    2.54    4.93        4.89      0.04       11.7        12.9       -1.2
  0.30   -4.00   3.08   0.325    6.1    6.1    2.54    5.04        4.97      0.07        9.1        10.7       -1.6
 -0.80   -4.00   3.61   0.467   15.2   76.2    2.44    5.11        5.14     -0.03        7.8         7.2        0.6
  0.70   -1.50   3.31   0.353     49    122    2.04    5.15        5.16     -0.01        7.1         6.9        0.2
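The DAF columns of Table A.3.12 can be reproduced from the log concentration columns. The
tabulated values are consistent with DAF = 10^(6 - log C); the constant 6 is inferred from the
table itself and is treated here as an assumption (for example, a leachate concentration of
10^6 in the same units as the well concentration). The function name is hypothetical.

```python
# Sketch: convert log10 well concentration to a dilution-attenuation factor (DAF).
# Assumption (inferred from Table A.3.12, not stated in the text): the source
# concentration corresponds to log10 = 6 in the same units as the well concentration.
LOG_SOURCE_CONC = 6.0


def daf_from_log_conc(log_conc, log_source=LOG_SOURCE_CONC):
    """DAF = source concentration / well concentration, computed in log10 space."""
    return 10.0 ** (log_source - log_conc)


# First three rows of Table A.3.12: (CMTP log conc, NN log conc)
rows = [(0.40, 0.55), (2.10, 2.40), (4.12, 4.20)]
for cmtp_log, nn_log in rows:
    daf_cmtp = daf_from_log_conc(cmtp_log)
    daf_nn = daf_from_log_conc(nn_log)
    print(f"CMTP DAF {daf_cmtp:10.1f}   NN DAF {daf_nn:10.1f}   residual {daf_cmtp - daf_nn:10.1f}")
```

Because the DAF is an exponential function of the log concentration, a small log residual
(for example -0.30 in the second row) becomes a large absolute DAF residual at high DAF values.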
A.3.4 Land Application Unit Neural Network
This section describes the process of training the land application unit neural network (LAUNN)
and consists of two parts: the first describes the training process, and the second summarizes the
final neural network performance.
A.3.4.1 Details of LAUNN Training Process
The land application unit neural network (LAUNN) training benefitted from observations and
techniques learned while training the landfill, surface impoundment, and waste pile neural
networks. The training and testing process is summarized in Table A.3.13. The LAUNN was
built from an initial training data set (Train), an initial test data set (Test), an additional
training data set (Train2), 23 additional test data sets, and finally a validation set. As was done
for surface impoundments and waste piles, the initial training and test data were combined into
one training data set. Thirty percent of the LAUNN-test data were then randomly assigned to the
initial training data. Table A.3.13 summarizes how the data were sampled for the individual data
sets.
As described in Section A.2.1, the initial data sets contained combinations of the 10th, 50th, and
90th percentile values of the OSW Survey data (based on the star-point theory), as well as
intermediate percentile values of the OSW Survey data, such as the 20th and 70th percentile
values.
The second training data set was composed of values independent of the star-point theory, but
within the 10th to 90th percentile values of the OSW Survey data, oriented towards an optimal
coverage of the data space in the response surface. Data values and combinations in the test data
sets were chosen to fill data gaps (mainly Group 4 Data Samples; see Section A.2.12 of this
appendix).
Table A.3.13 Summary of Neural Network Data Sets for Waste Piles

Number of values picked per parameter

Run Set name      AREA     SINFIL   DSOIL    ZB       RADIS    KOC      λ
Train             3        3        2        2        2        3        3
Test              2        3        2        2        2        3        3
Train2            3        3        2        2        2        3        3
Test-sets 1-22    1 or 8   1 or 8   1 or 8   1 or 8   1 or 8   1 or 8   1 or 8
Test-test set     1        8        1        1        8        1        1
Validation Set    Each run has its values randomly selected from the OSW Survey data
                  (10th - 90th percentile) or the chemical list

Total number of EPACMTP runs (values listed in the order printed)

EPACMTP Runs        648   432   108   0   648   1344   64   232   3368
Runs Used for NN    818   512   306   648   1227   64   196   2952
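The run counts for the factorial data sets follow directly from the number of values picked per
parameter: the number of EPACMTP runs is the product of the per-parameter counts. The sketch
below checks this for the Train, Test, and Test-test rows of Table A.3.13; the dictionary
layout and function names are illustrative assumptions, not the tooling actually used.

```python
# Sketch: number of runs in a full factorial design = product of levels per parameter.
from itertools import product
from math import prod

PARAMETERS = ["AREA", "SINFIL", "DSOIL", "ZB", "RADIS", "KOC", "LAMBDA"]

# Levels per parameter, taken from Table A.3.13.
train_levels = dict(zip(PARAMETERS, [3, 3, 2, 2, 2, 3, 3]))
test_levels = dict(zip(PARAMETERS, [2, 3, 2, 2, 2, 3, 3]))
test_test_levels = dict(zip(PARAMETERS, [1, 8, 1, 1, 8, 1, 1]))


def count_runs(levels):
    """Number of parameter combinations in a full factorial design."""
    return prod(levels.values())


def enumerate_runs(levels):
    """All combinations as tuples of level indices, one index per parameter."""
    return list(product(*(range(n) for n in levels.values())))


for name, levels in [("Train", train_levels), ("Test", test_levels),
                     ("Test-test set", test_test_levels)]:
    print(name, count_runs(levels))
```

These products match the 648, 432, and 64 runs listed for the corresponding sets in the table.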
Initial Training
As was done for the other three WMU neural networks, the initial training process utilized the
Train, Test, and Train2 data matrices; the second training data set (Train2) was added to
the neural network during early neural network development. The training process progressed
well and was only interrupted for residual value checks (see Section A.2.3 for more details). The
residual threshold values were used to transfer data samples between the training and test data
sets. The residual diagnosis influenced the ratio of training to test samples used to measure the
learning progress and improve the predictive properties of the neural network. While neural
network training progressed, three LAUNNs were derived using different methods. Five test
data sets were produced simultaneously (see Section A.2.1), and 21 additional test sets were
developed using Group 4 Data Samples and utilized in the training process. A validation data
set was used for neural network analysis and evaluation.
The twenty-one test data sets containing Group 4 Data Samples were appended to the test data
set. The results were graphically evaluated within the network model, and it was shown that the
LAUNN was able to predict output values close to the EPACMTP output values for most of
these samples.
From this point, three neural networks were trained simultaneously: LAUNN-A, LAUNN-B, and
LAUNN-C.
LAUNN-A had all test sets appended three times to the training matrix, to balance the data set.
For LAUNN-B, all test sets were appended twice to the training matrix. Analysis of the neural
network's "Measured & Predicted Graphs" showed large errors for five of the test data sets.
These test data sets were added one more time to the training data set.
In LAUNN-C, all 1227 test samples were appended once to the test data matrix. Two different
residual checks helped to extract samples with high residuals. The first check addressed residuals
greater than 0.1 or less than -0.1, and these data samples were transferred from the test to the
training matrix twice. Based on the second residual check (residuals > 0.25 or < -0.25),
samples were transferred one more time from the test to the training matrix. Thus,
samples transferred after the first and second residual check were presented three times in the
extended training set.
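The LAUNN-C transfer rule can be sketched as follows. This is a simplified illustration: it
applies both thresholds to a single set of residuals, whereas in the process described above
the second check was made after further training. The function names and the sample
representation are hypothetical.

```python
# Sketch of the two-stage residual check used to move test samples into training.
FIRST_THRESHOLD = 0.1    # first residual check (absolute value)
SECOND_THRESHOLD = 0.25  # second residual check (absolute value)


def copies_to_transfer(residual, first=FIRST_THRESHOLD, second=SECOND_THRESHOLD):
    """Number of times a test sample is appended to the training matrix."""
    copies = 0
    if abs(residual) > first:
        copies += 2          # first check: transferred twice
    if abs(residual) > second:
        copies += 1          # second check: transferred one more time
    return copies


def extend_training_set(training, test, residuals):
    """Return (extended training list, test samples that stay in the test set)."""
    extended = list(training)
    remaining_test = []
    for sample, residual in zip(test, residuals):
        n = copies_to_transfer(residual)
        if n:
            extended.extend([sample] * n)
        else:
            remaining_test.append(sample)
    return extended, remaining_test


train, test = ["t1", "t2"], ["a", "b", "c"]
extended, remaining = extend_training_set(train, test, [0.05, -0.15, 0.30])
print(extended, remaining)
```

Repeating a sample in the training matrix increases its weight in the fit, which is the
balancing effect the residual checks were intended to achieve.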
The simultaneous development of parallel LAUNNs using different training methods enabled the
identification of the best-trained neural network. After training all three neural networks to
similar conditions (e.g., the same number of hidden neurons and network iterations), and after
an analysis of the coefficient of determination R² as well as the conceptual neural network
graphs, the NNs seemed to be of a similar high quality. The following table (Table A.3.14)
summarizes the characteristics of the three LAUNNs.
Table A.3.14 Summary of Final Neural Networks for Land Application Units

          Hidden    Total Number   R² log pk conc    R² log avg conc   Number of Samples
Network   neurons   of Counts      train    test     train    test     train    test
A         34        116160         0.997    0.992    0.997    0.991    5454     441
B         34        120073         0.997    0.994    0.997    0.994    4504     366
C         35        125779         0.997    0.996    0.997    0.995    3688     724
Training Completion and Validation
Neural network training and testing were stopped at this point, with three final neural networks
having similar properties. The three neural networks were evaluated and analyzed with a
validation set, which was appended to the test data matrix of each of the networks. Overall, all
three training approaches produced neural networks that met the quality criteria and produced
DAF values close to the desired output values, with only minor differences in the R² values and
in the graphical outputs.
Summary
LAUNN-B was selected as the final neural network for land application units based on graphical
and numerical results of the validation set. Although LAUNN-B had the smallest number of test
samples, it produced the best predictions when interrogated with random samples (see Tables
A.3.16 and A.3.17 at the end of this section). The performance of LAUNN-B was confirmed with
numerical results of interrogation with unbiased validation samples (Tables A.3.16 and A.3.17).
Test sets of Group 4 Samples efficiently covered the data space in the response surface of the
LAUNN, as evidenced by the extremely high R² values and highly accurate neural network
predictions. The following table (Table A.3.15) summarizes the characteristics of the final
LAUNN-B.
Table A.3.15 Summary of Final LAUNN Features

Neural Network Parameter                   Initial                         Final
Number of Training Samples                 1080 (512 training/306 test)    4870 (4504 training, 366 test)
Number of Hidden Neurons                   10                              34
Number of Maximum NN-Iterations            1                               120073
R² (log peak conc, training data set)      0                               0.997
R² (log peak conc, test data set)          0                               0.994 (not including Validation Set)
                                                                           0.994 (including Validation Set)
R² (log average conc, training data set)   0                               0.997
R² (log average conc, test data set)       0                               0.994 (not including Validation Set)
                                                                           0.994 (including Validation Set)
A.3.4.2 Performance of Final Land Application Unit Neural Network
Graphical Results
As discussed in Section A.2.3, several graphical tools help to evaluate the final performance of
the network. As shown in the Measured & Predicted Graph (M&P-Graph) for the LAU-log peak
concentration (Figure A.3.17), the close match of the predicted and measured values confirms the
good generalization properties of the LAUNN. As can be concluded from the close fit of the
curves, the model learned the vast majority of the data patterns in the response surface, and its
overall ability to predict accurate output values is good. The test data indicate that the network
performs well with data samples it was not trained on, and therefore has good predictive
capabilities within the desired input data range.
Figure A.3.17 M&P-Graph for LAUNN, Training (Top) and Test (Bottom)
(Both panels plot measured and predicted LOGPKCON against observation number; the test panel
is labeled N = 366.)
Histograms of the residuals for the data samples of the final neural network (Figure A.3.18) show
the desired normal distribution of the residuals around zero. The "95th Confidence Interval
Graph" (CIG) was used to confirm the LAUNN performance. As shown in the CIGs (Figures
A.3.19 and A.3.20), a high percentage of all data points fall within the confidence band of the
best fit line, for both the final training and test data sets. The R² values are greater than
0.994, and the slope of the best fit line is close to 1.0, as can be read from the regression
equation.
Validation Results
Interrogation of the LAUNN showed the network's ability to predict output values close to the
desired EPACMTP output values for most of the validation data samples. Tables A.3.16 and
A.3.17 contain samples randomly chosen from the validation set, which represents the 10th to
90th percentile OSW Survey data range, with emphasis on DAFs from 1.0 to 1,000. However, DAFs
in the higher ranges are also predicted well. The results indicate predictions within one order of
magnitude of the desired output value. The results of the validation test are presented graphically
in Figure A.3.21.
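The "within one order of magnitude" criterion can be checked numerically by comparing the
base-10 logarithm of the ratio of predicted to observed DAF against 1. The sketch below uses
four rows of the DAF columns of Table A.3.16; the function name and the choice of rows are
illustrative assumptions.

```python
# Sketch: check whether NN-predicted DAFs lie within one order of magnitude
# of the EPACMTP DAFs. Rows taken from the DAF columns of Table A.3.16.
from math import log10

# (CMTP-DAF, NN pred-DAF)
daf_pairs = [
    (398107.17, 371809.09),
    (1.10, 1.24),
    (7.09, 14.06),
    (7762.47, 9157.35),
]


def within_one_order(observed_daf, predicted_daf):
    """True if the two DAFs differ by less than a factor of 10."""
    return abs(log10(predicted_daf / observed_daf)) < 1.0


for observed, predicted in daf_pairs:
    log_ratio = log10(predicted / observed)
    print(f"{observed:12.2f} {predicted:12.2f} log10 ratio = {log_ratio:+.3f} "
          f"within one order: {within_one_order(observed, predicted)}")
```

The largest discrepancy among these rows (7.09 versus 14.06) is about a factor of two, which
is well inside the one-order-of-magnitude band.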
Figure A.3.18 Histogram of the Difference between Predicted and Observed (Residual) log peak
concentration for LAUNN
(Two panels: Residual LAU-log peak concentration, Training Matrix; Residual LAU-log peak
concentration, Test Matrix.)
Figure A.3.21 Comparison of EPACMTP-generated and Neural Network Predicted
Monitoring Well Concentrations for Land Application Units.
(Series: EPACMTP and NN-Predicted; horizontal axis: Realization, Nmax = 138.)
Table A.3.16 Comparison of Observed and Predicted DAF Values Based on LAU-Peak Well Concentrations
Input Values, Output Values and Predicted Output for Random Runs in the 0%-100% Range

                                                 Log Peak Well Concentration   DAF
  Log     LOG    LOG                        Log
  KOC  LAMBDA   AREA  SINFIL  DSOIL     ZB  RADIS    CMTP   NNpred  Residual   CMTP-DAF  NN pred-DAF
    3      -4   5.13  0.1842   12.2    4.6  2.972     5.6  5.57032   0.02968  398107.17    371809.09
 2.58      -4      6  0.1156    7.6    7.6  2.994    5.82  5.78002   0.03998  660693.45    602587.34
 1.99   -2.35   5.61   0.102   14.6   24.4  2.913    3.15  3.07082   0.07918    1412.54      1177.12
 1.04      -4   5.08  0.1554    1.5   10.8  3.142    5.84  5.84939  -0.00938  691830.97    706952.12
  0.8   -1.18   5.96  0.0765   15.2   21.3   1.81    5.63  5.60005   0.02995  426579.52    398153.01
 3.75      -4   4.91  0.1676   30.5    111  2.248  0.0426  0.09292  -0.05032       1.10         1.24
 2.39      -4   5.43  0.3256    0.6      4  2.461    3.89  3.96177  -0.07177    7762.47      9157.35
 5.39      -4   4.78  0.3256    1.5      3  1.139    5.74  5.68119   0.05881  549540.87    479943.37
 1.03      -4   6.91  0.1024    4.6    6.5  3.198    5.51  5.47208   0.03792  323593.66    296537.76
-0.09      -4   4.08  0.1651    8.5     19  1.172   0.851  1.14789  -0.29689       7.09        14.06
  0.6      -4   4.31   0.438    3.1   18.3  2.955    5.95  5.96224  -0.01224  891250.94    916726.95
 0.78      -4   5.51   0.109    9.1   39.6  1.835    5.94  5.89949   0.04051  870963.59    793395.99
 3.75      -4   5.39   0.109    3.4   15.2  2.327    5.51  5.47208   0.03792  323593.66    296537.76
 3.08      -4   1.31  0.0798   30.5   52.1  2.596    5.68  5.67396   0.00604  478630.09    472019.56
 1.27   -1.72   5.81  0.0894   15.2    6.1  0.089    5.83  5.78796   0.04204  676082.98    613705.48
 4.15      -4   3.51  0.1684   12.2    4.6  1.081     5.9  5.85656   0.04344  794328.23    718720.45
 -2.1      -4   3.08  0.1641    9.1   58.5  2.579   4.139  4.18807  -0.04807   13772.09     15419.49
 1.13   -2.55   3.61  0.0947    4.9    9.1  2.714    5.95  5.96224  -0.01224  891250.94    916726.95
 1.82      -4   4.21    0.19    1.5      3  2.477    5.68  5.67396   0.00604  478630.09    472019.56
    1      --     --      --     --     --     --    5.65  5.62192   0.02808  446683.59    418716.41

-- : value not legible in the source.
[Table A.3.17: contents not recoverable from the source.]
A.3.5 Recommendations for Further Work
The effort to develop neural networks to approximate probabilistic EPACMTP simulations for
the Guide for Industrial Waste Management was the first of its kind. Overall, the quality of the
neural network performance was judged to be good, based upon the various criteria outlined in
Section A.2.3. As indicated by Figures 5-3 through 5-6, the performance of the neural networks
could be further improved with additional training. The following is a list of lessons learned
regarding the training of the neural networks:
• A combination of several quality criteria can be used to judge the overall performance of
neural networks including: R² values; histograms of input parameter values; predicted
trends of DAFs as a function of single input parameter compared to EPACMTP-computed
trends; plots of EPACMTP versus neural network-predicted output; numerical
comparison of EPACMTP and neural network output.
• Appending test data to the training data set is an important first step to improve the
quality of the neural network. Determining which test data should be appended to the
training data set can be aided with the use of a residual threshold value (e.g., 0.1 or 0.5)
and transferring values from the test data to the training if they exceed this threshold
value.
• Strategically selecting input parameter values to fill in data gaps can significantly
improve the quality of the neural network. Data gaps can be identified simply by
examining histograms of input parameter values used in the training data.
• Testing and training with combinations of parameters can improve the fit of the neural
network predictions.
• Test and validation data sets consisting of randomly selected input values may provide
more information about the overall predictive quality of the neural networks. These
additional random data samples may help to determine if the neural network has good
generalization skills and can identify portions of the input data space which have been
insufficiently sampled in the training data.
If this work continues and it is determined that the neural networks should be further developed
and/or their capabilities improved, the following recommendations are provided:
• The actual values of the residual thresholds should be evaluated to determine the
optimum value for transferring test to training data sets.
• Further evaluate how to most efficiently train on specific combinations of 2 and 3 input
parameter values.
• Further evaluate how to best use randomly selected data samples to identify gaps in the
neural network response surface.
• Determine if there is a systematic source of error in outlier values predicted with the
neural networks.
• Further evaluate the error associated with neural network predictions using the 0th - 10th
percentile and the 90th - 100th percentile input parameter values.
A.4 REFERENCES
Demuth, H. and Beale, M. 1997. Neural Network Toolbox: for use with Matlab, User's Guide
version 3.0, The MathWorks, July, 1997.

Kreyszig, E. 1988. Advanced Engineering Mathematics, Sixth Edition, John Wiley and Sons
Inc., New York, New York.

Smith, M. 1996. Neural Networks for Statistical Modeling. International Thomson Computer
Press. Boston, MA.

Spiegel, M.R. 1961. Theory and Problems of Statistics. Schaum's Outline Series in
Mathematics. McGraw-Hill Book Company.

Swingler, K. 1996. Applying Neural Networks: A Practical Guide, Academic Press, San Diego,
CA.

Neural Fusion, 1998. Neural Model User's Guide, Neural Fusion.

U.S. EPA, 1998a. Guide for Industrial Waste Management, U.S. EPA Office of Solid Waste,
December, 1998.

U.S. EPA, 1998b. Technical Background Document for the Development of a Two-Tiered
Approach for Evaluating Waste Management Unit Liner Designs, Office of Solid Waste,
December, 1998.
APPENDIX B
SENSITIVITY ANALYSIS OF COMPOSITE LINER
LEAKAGE RATES
B.1 INTRODUCTION
An implicit assumption in the analysis of leakage from various liners as presented in this
Guidance is that liner performance does not change with time; i.e., leakage through a particular
liner system remains constant throughout the 10,000-yr period of performance. Also implicit in
the analysis of leakage is the assumption that the nature of quality control during installation
results in minimal defects in the liner system. In order to initiate an understanding of how the
leakage rate may change with time as the liner degrades, and to account for the
"less-than-perfect" liner, this sensitivity analysis was undertaken to evaluate how the number
and size of defects in a geomembrane affect the leakage rate of industrial waste leachate from a
landfill through a composite liner. As outlined in Section 4.1 of the Technical Background
Document, the infiltration rates to the unsaturated zone as a result of leakage from waste
management units (WMU) through native soil, a single clay liner, and a composite liner were
determined by three different methods. The assumptions for calculating leakage from the three
liner types are outlined in Tables 4-1 to 4-3 and summarized below.
The infiltration rates for the no-liner scenario were calculated with the HELP model (Schroeder
et al., 1994) for a range of soil types and a range of precipitation rates that are representative of
rates throughout the United States. The no-liner infiltration rate is effectively the same as
percolation through the native soil. Because soil type and precipitation rates vary across the
nation, the infiltration rates into the unsaturated zone from a landfill range from 1 x 10⁻⁵ m/yr
to 1.08 m/yr.
Single-liner infiltration rates were calculated with the HELP model (Schroeder et al., 1994),
based upon Darcy's law, using a range of precipitation rates from across the United States.
Similar to the no-liner scenario, a range of infiltration rates to the unsaturated zone was
determined: 0.0 m/yr to 0.53 m/yr for landfills.
The composite-liner leakage rate was calculated as a single value, using an equation from
Bonaparte et al. (1989), assuming a constant 1-ft hydraulic head and three feet of
low-permeability (10⁻⁹ m/s hydraulic conductivity) soil underlying the geomembrane.
Because the analyses for the no-liner and single-liner scenarios are based upon a range of
infiltration rates, the question was raised concerning why a single infiltration/leakage rate was
used for evaluation of the composite liner scenario. The singular value for the composite liner is
presented in the Guidance as a design and performance goal. This sensitivity analysis is based
upon the recognition that a range of performance values might be expected. However, in order
to assess what that range might be, there is a need to first evaluate how the type, number, and size
of defects, the hydraulic head, and the effectiveness of the underlying low-permeability soil
beneath the geomembrane affect the infiltration to the unsaturated zone.
This analysis illustrates that the hydraulic head and the contact between the geomembrane and
the underlying clay have a significant effect on the rate of leakage through a composite liner.
B.2 COMPOSITE LINER LEAKAGE RATE DETERMINATION
The equations used to calculate the leakage from a waste management unit (WMU) through a
composite liner depend on the type of defect and the contact between the geomembrane material
and the underlying low-permeability soil. The equations used in this analysis were empirically
derived by Bonaparte, Giroud and others (1989, 1992). The following discussion outlines the
equations and the assumptions used in this analysis.
B.2.1 Evaluation of Leakage Through Holes
Bonaparte et al. (1989) described the leakage rate through a single hole in the geomembrane as:
$$Q = 0.21\, a^{0.1}\, h^{0.9}\, k_s^{0.74} \qquad (1)$$
where
Q = steady-state flux from one hole in the geomembrane component of a composite
liner (m³/s);
a = area of hole (m²);
h = head of liquid on geomembrane (m);
k_s = hydraulic conductivity of the low-permeability soil underlying the geomembrane
(m/s).
Equation (1) computes leachate flux through a hole in the geomembrane for which there is good
contact between the geomembrane and the underlying low-permeability soil. Similarly, if there is
poor contact between the geomembrane material and the low-permeability soil, the flux may be
described as:
$$Q = 1.15\, a^{0.1}\, h^{0.9}\, k_s^{0.74} \qquad (2)$$
Two equations were developed by Giroud and Bonaparte (1989) to estimate the leakage through
a single hole in a geomembrane with perfect contact between the synthetic and natural materials.
Giroud and Bonaparte (1989) described the leakage as:
(3)
where
Q = leakage rate (m³/s);
k_s = hydraulic conductivity of the low-permeability soil (m/s);
d = diameter of the circular hole (m);
h_w = head of liquid on top of the geomembrane (m);
H_s = thickness of the low-permeability soil (m).
Because the ratio of the size of the hole to the thickness of the low-permeability soil layer is
small, equation (3) reduces to:
(4)
The leakage rate from a composite liner used in developing the Guidance was determined with
equation (1). The parameter values used are listed below:
• one hole per acre;
• 0.05 in² hole (3E-06 m²);
• 1-ft head (0.305 m) for landfills and waste piles; 10-ft head for surface
impoundments;
• underlying clay layer with 10⁻⁹ m/s hydraulic conductivity.
Using these parameter values, the leakage rate used in the Industrial Waste Guidance
ground-water analysis is 3.41 x 10⁻⁵ m/yr for landfills and waste piles, and 3.1 x 10⁻⁴ m/yr
for surface impoundments.
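The landfill calculation can be sketched by evaluating equation (1) for the listed parameter
values and spreading the flow from one hole over one acre. The unit conversions (acre area,
365-day year) and the function names are assumptions made for illustration; with these
choices the result is on the order of 3 x 10⁻⁵ m/yr, and small differences from the value
quoted above may reflect rounding of the inputs.

```python
# Sketch: composite-liner leakage per unit area from equation (1) or (2).
# Assumptions (not stated in the text): 1 acre = 4046.86 m^2, 365-day year,
# and leakage from each hole averaged uniformly over the area it serves.
ACRE_M2 = 4046.86
SECONDS_PER_YEAR = 365 * 24 * 3600

# Leading coefficients of equations (1) and (2).
CONTACT_COEFFICIENT = {"good": 0.21, "poor": 1.15}


def hole_leakage_m3_per_s(area_m2, head_m, k_s, contact="good"):
    """Steady-state flow through one hole: Q = C * a^0.1 * h^0.9 * k_s^0.74."""
    coefficient = CONTACT_COEFFICIENT[contact]
    return coefficient * area_m2 ** 0.1 * head_m ** 0.9 * k_s ** 0.74


def leakage_rate_m_per_yr(q_per_hole, holes_per_acre=1.0):
    """Convert flow per hole (m3/s) to an areal leakage rate (m/yr)."""
    return q_per_hole * holes_per_acre / ACRE_M2 * SECONDS_PER_YEAR


# Landfill / waste pile case: 3E-06 m2 hole, 0.305 m head, k_s = 1e-9 m/s.
q_good = hole_leakage_m3_per_s(3e-6, 0.305, 1e-9, contact="good")
rate_good = leakage_rate_m_per_yr(q_good)
print(f"good contact: {rate_good:.2e} m/yr")

# Same hole with poor contact, using the coefficient of equation (2).
q_poor = hole_leakage_m3_per_s(3e-6, 0.305, 1e-9, contact="poor")
print(f"poor contact: {leakage_rate_m_per_yr(q_poor):.2e} m/yr")
```

Because the head enters with an exponent of 0.9 and the hole area with an exponent of only
0.1, the computed rate is far more sensitive to head than to hole size.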
Bonaparte et al. (1989) stated that the use of the above empirically based equations should be
restricted to cases where the underlying low-permeability soil has a hydraulic conductivity less
than 10^-6 m/s and the head of liquid on top of the geomembrane is less than the thickness of the underlying
low-permeability soil. It should also be emphasized that the above equations describe leakage
through a single defect. In addition to evaluating the sensitivity of leakage through a single hole,
this sensitivity analysis also considered the effects of 1000 holes per acre.
B.2.2 Evaluation of Linear Defects
The following four equations were developed by Giroud and Badu-Tweneboah (1992) to
estimate the leakage through long defects in a geomembrane. This analysis considered two
lengths of long defects (i.e., tears): 1-m and 63-m. A roll of geomembrane material is
approximately 63 m long and this length was considered analogous to the infinitely long defect
described by Giroud and Badu-Tweneboah (1992).
Leakage through a 1-m tear where there are good contact conditions was determined with:
(5)
where:
Q = rate of leakage through a tear in the geomembrane component of the composite
liner (m^3/s);
B = geomembrane tear length (m);
b = geomembrane tear width (m);
h_w = head of liquid on top of the geomembrane (m);
k_s = hydraulic conductivity of the low-permeability soil component of the composite
liner (m/s);
i*me = average hydraulic gradient beneath the rectangular portion of the wetted area;
ime = average hydraulic gradient of soil beneath the circular portion of the wetted area.
i*me and ime are described by Giroud and Badu-Tweneboah (1992) in terms of the radius of the
wetted area (R) and the thickness of the low-permeability soil, H_s:
(6)
(7)
The radius of the wetted area may be determined with Equations (8) and (9) for good and poor
contact conditions:
$$R = 0.26\,b^{0.1}\,h_w^{0.45}\,k_s^{-0.13} \qquad (8)$$
Similarly, the leakage through a linear defect for which there is poor contact between the
geomembrane and low-permeability soil is described as:
(9)
where:
(10)
Leakage through tears of infinite length for good contact conditions is described as:
(11)
and for poor contact conditions
(12)
where: Q* = rate of leakage per unit length of the tear;
im* = hydraulic gradient beneath the rectangular portion of the wetted area.
B.3 THE APPROACH
The objective of this sensitivity analysis was to determine whether there is a range of leakage
rates given various defects in liners. Consequently, the analysis focused on properties of the
defects rather than on the overall design of the liner system. The parameters and the values used
in this analysis are listed in Table B.1. Leakage rates for all possible combinations of the parameter values were
determined using Equations (1) - (4) for holes and Equations (5) - (12) for rips/tears.
Table B.1 Parameters for Composite Liner Infiltration Sensitivity

Defect            Defect Size                Density          FML/Clay    Clay            Head
(Equations Used)                                              Contact     Conductivity
Hole              0.0001 m^2                 1 per acre       poor        1E-06 m/s       3.05E-01 m
(1)-(4)           0.00003 m^2                1000 per acre    good        1E-09 m/s       3.05 m
                                                              perfect
Rip/tear          length: 1 m                1 per acre       poor        1E-06 m/s       3.05E-01 m
(5)-(12)          length: infinitely long    10 per acre b    good        1E-09 m/s       3.05 m
                    (63 m) a
                  width: 0.01 m
                  width: 0.03 m
a approximate length of roll of geomembrane material
b number of widths of geomembrane per acre
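The "all possible combinations" approach can be sketched for the hole rows of Table B.1. This example is restricted to the poor-contact case because Equation (2) is the only hole equation reproduced in full in this part of the appendix. The hole areas are used as printed in the table; the year length, acre conversion, and all names are assumptions made for illustration.

```python
# Sketch: enumerate the hole-defect combinations of Table B.1 for the
# poor-contact case only, using Equation (2): Q = 1.15 a^0.1 h^0.9 k^0.74.
from itertools import product

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length
SQUARE_METERS_PER_ACRE = 4046.86        # assumed conversion factor

hole_areas_m2 = [1e-4, 3e-5]            # "Defect Size" column as printed
holes_per_acre = [1, 1000]              # "Density" column
clay_conductivities = [1e-6, 1e-9]      # m/s
heads_m = [0.305, 3.05]                 # 1-ft and 10-ft heads


def poor_contact_flux(area, density, k_soil, head):
    """Areal leakage flux (m/yr) for holes with poor geomembrane-clay contact."""
    q = 1.15 * area ** 0.1 * head ** 0.9 * k_soil ** 0.74   # m^3/s per hole
    return q * density * SECONDS_PER_YEAR / SQUARE_METERS_PER_ACRE


# One entry per combination of (area, density, conductivity, head).
results = {
    combo: poor_contact_flux(*combo)
    for combo in product(hole_areas_m2, holes_per_acre, clay_conductivities, heads_m)
}

worst = max(results, key=results.get)
print(f"{len(results)} combinations evaluated")
print(f"largest flux: {results[worst]:.1f} m/yr for {worst}")
```

Under these assumptions the largest flux comes from 1000 large holes per acre over 10^-6 m/s clay with a 3.05-m head and is on the order of 350 m/yr.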
B.4 RESULTS
The steady-state leakage rates calculated using Equations (1) - (12) for the assumed defect
dimensions range over many orders of magnitude. Specifically, the
minimum and maximum values for leakage through a hole are 1.5 x 10^-8 m/yr and 353 m/yr,
respectively. The minimum and maximum leakage rates through a linear defect are 7.45 x 10^-5
m/yr and 67.4 m/yr, respectively. These rates overlap with the ranges observed for the no-liner and
single-liner scenarios described earlier.
B.4.1 Leakage Through Holes
Figure B-1 presents the results for the case of leakage through holes in the geomembrane. The
leakage rates through a single hole with 1-ft hydraulic head are less than the leakage through a 3-ft
layer of clay with a hydraulic conductivity of 10^-7 cm/s for all contacts and both hole sizes.
The leakage rates for the single-clay liners, as determined with Darcy's law, are given for
comparison. When there are 1000 holes per acre and a 1-ft hydraulic head, leakage through the
geomembrane with poor contact exceeds that of the clay. When there is a 10-ft head of liquid on
the geomembrane, leakage through the geomembrane with good and poor contact and 1000
holes per acre exceeds that of the clay liner. These latter results appear to be counter-intuitive
because they suggest that the clay and geomembrane, together, do not perform as well as the
single-clay liner.
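The clay-only comparison values can be approximated with a one-dimensional Darcy calculation. The sketch below assumes steady vertical flow through a saturated clay layer with ponded head on top and free drainage at the base, so the hydraulic gradient is (head + thickness) / thickness. That gradient form, the 0.9144-m (3-ft) thickness, and the function name are assumptions; the document states only that Darcy's law was used.

```python
# Sketch: Darcy flux through a clay liner alone, for comparison with
# composite-liner leakage rates. Assumes free drainage below the clay.
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length
CLAY_THICKNESS_M = 0.9144               # 3-ft clay layer


def darcy_flux_m_per_yr(k_soil_m_per_s, head_m, thickness_m=CLAY_THICKNESS_M):
    """Vertical Darcy flux q = k * (h + H) / H, converted to m/yr."""
    gradient = (head_m + thickness_m) / thickness_m
    return k_soil_m_per_s * gradient * SECONDS_PER_YEAR


for k in (1e-9, 1e-6):
    for head in (0.305, 3.05):
        print(f"k = {k:.0e} m/s, head = {head} m: "
              f"{darcy_flux_m_per_yr(k, head):.3g} m/yr")
```

With these assumptions the 10^-9 m/s clay yields a few hundredths of a meter per year at a 1-ft head, which is the scale against which the 1000-hole cases are being compared.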
Figure B-1. Leakage Rates from a Composite Liner with Holes: 10^-9 m/s Hydraulic
Conductivity Clay (geomembrane-clay contact quality denoted as poor, good,
and perfect; "big" holes have an area of 1 x 10^-4 m^2 and "little" holes have an
area of 3 x 10^-6 m^2). Panels: 1-ft head and 10-ft head; horizontal axis: number of holes
(0 to 1000); the rate for the 10^-9 m/s clay liner only is plotted for comparison.
Figure B-2 presents the results for leakage through holes in the geomembrane with an underlying
clay of 10^-4 cm/s hydraulic conductivity. With the exception of the "poor" contact values, the
10-ft head leakage rates indicate that the geomembrane and clay perform at least as well as the
clay itself. Of note are the high leakage rates (>100 m/yr) for the clay when there is a 10-ft head.
Figure B-2. Leakage Rates from a Composite Liner with Holes: 10^-6
m/s Hydraulic Conductivity Clay (geomembrane-clay
contact quality denoted as poor, good, and perfect; "big"
holes have an area of 1 x 10^-4 m^2 and "little" holes have an
area of 3 x 10^-6 m^2). Panels: 1-ft head and 10-ft head; horizontal axis: number of holes.
B.4.2 Leakage Through Linear Defects
Leakage through the composite liner when the geomembrane exhibits tears or rips is presented
in Figures B-3 and B-4. With the exception of the 10 tears with poor contact conditions and a
10-ft hydraulic head, the leakage rates from the defective geomembrane are less than those from
the underlying clays with permeabilities of 10^-4 cm/s and 10^-7 cm/s. These figures suggest that
even with defects, the geomembrane affords more protection than the clay liner alone.
One result of note: when there are many defects, the holes generally leak more than the tears.
Figure B-5 illustrates how 1000 holes and a 1-ft hydraulic head have a higher leakage rate than
10 63-m tears. The area covered by 10 1-m rips of 0.03 m width is 6.3 m^2, whereas the area of
1000 3 x 10^-6 m^2 holes is 0.003 m^2. The tears would be expected to leak more than the holes
because the defect has more area. The result presented in Figure B-5 is therefore counter-intuitive.
Giroud and Badu-Tweneboah noted that for a large hydraulic head, it takes fewer holes to
approximate the same wetted area as the tear than when there is a small hydraulic head. Perhaps
this effect is due to fluids spreading laterally between the geomembrane and the underlying clay:
while the actual area of the defect represented by the holes is smaller than that of the tears, the
affected/wetted area beneath the geomembrane is actually bigger for many small holes than for a
single tear.
Figure B-3. Leakage from Linear Defects in a Composite Liner: 10^-9 m/s Hydraulic
Conductivity Clay (geomembrane contact quality denoted as good and poor;
number of linear defects per acre noted as 1 and 10). Panels: 10^-9 m/s clay liner with
1-ft head and with 10-ft head; horizontal axis: length of tear (m).
Figure B-4. [Caption not recovered.] Panel title: 10^-6 m/s clay liner; 1-ft head. Horizontal axis: length of tear (m).
Figure B-5. Summary of Leakage Rates from a Composite Liner with Good
Geomembrane-Clay Contact (holes noted as "O"; 1-m tears and 63-m tears
noted as "t" and "T", respectively).
B.5 DISCUSSION
The results of this sensitivity analysis of composite liner leakage rates are presented in
comparison to the leakage through a 3-foot clay layer with hydraulic conductivities of 10^-9 m/s
and 10^-6 m/s. The leakage of a single clay liner can be expected to be the limiting condition for the
leakage, since the clay and geomembrane together should afford a higher level of protection than
the clay liner alone. In general, the infiltration rates do not exceed the limiting leakage rate for
the underlying clay except when there is poor contact and the hydraulic head is high.

There is an exception to this general conclusion: leakage from 1000 holes per acre, 10^-9 m/s clay,
good contact, and 10-ft hydraulic head also exceeds that of the clay alone. This result calls into
question the validity of the equation for high heads or a large number of defects.

The equations presented by Bonaparte, Giroud and others (1989, 1992) are based on one defect.
The equations do not take into account interaction of leakage from many defects. It was
mentioned above that these authors qualified the use of the equations such that their application
should be limited to those cases where the hydraulic head on top of the geomembrane is less than
the thickness of the underlying low-permeability material. The results for the 10-ft hydraulic
head on top of the good contact and poor contact liner with 1000 holes per acre support the
validity of the Bonaparte and others' caveat: the equation for flow through holes is valid when
the hydraulic head is less than the thickness of the underlying clay. The results also highlight the
issue of how to determine leakage through composite liners for which the conditions defined by
Bonaparte et al. are not appropriate.

There are circumstances considered in the design of this sensitivity analysis which implicitly
require parameter values outside the range of values defined by Bonaparte et al. Specifically, it is
doubtful that a geomembrane would have only one hole per acre. Surface impoundments often
have hydraulic heads several fold greater than the thickness of the underlying clay. There are
also conceivable circumstances for which the hydraulic conductivity of the underlying clay
exceeds 10^-6 m/s; e.g., when the clay has become saturated with organics, or desiccates.

The leakage rates calculated in this analysis range over many orders of magnitude. If modeling
of leakage from a composite liner were to be done in a Monte Carlo fashion with a range of
values, the criteria for defining a conceivable range of leakage rates must be considered. For leakage
rates in excess of 10 m/yr, there is the need to consider whether such a leakage rate could be
maintained.
The Florida Department of Environmental Protection studied the performance of 24 active
double-lined landfill cells for the purpose of comparing predicted leakage rates with actual
leakage rates through the liner components (Teller, 1997). The observed leakage rates were
generally less than those predicted using equations from Bonaparte et al. (1989). The observed
leakage through a primary liner that consisted of a 60-mil HDPE membrane ranged from 5 x 10^-4
m/yr to 0.2 m/yr. Leakage through the HDPE membrane and geosynthetic clay liner with
hydraulic conductivity of 2 x 10^-9 cm/s ranged from 7 x 10^[exponent illegible] m/yr to 5 x 10^-5 m/yr. Given these
leakage rates, an HDPE liner underlain by a clay liner with hydraulic conductivity of 10^-7 cm/s
might exhibit a leakage rate on the order of 10^-5 m/yr to 10^-2 m/yr. The leakage rate assumed for
the composite liner in development of the Guidance (3 x 10^-5 m/yr) is at the low end of this
range.

The leakage rates for the various liner scenarios used in developing the Guidance do not account
for time-dependent changes in liner competence. The analysis presented here only assumes the
existence of the defects. It does not allow for the development of defects as a function of stress
due to loading, chaotic events such as earthquakes, or chemical interactions with the waste.
There have been many studies of the effects of various stresses on the competence of
geomembrane liner materials. Further work is needed to evaluate how liner systems degrade
with time and the effect of such on leakage rates.
The EPA welcomes comments concerning the use of the Bonaparte and Giroud equations for
estimating leakage through a composite liner. The Agency is also interested in comments
concerning the use of a single leakage rate or a set of leakage rates, such as those sampled for a
Monte Carlo-style analysis, or specifically chosen to represent degradation of the liner system
with time.
B.6 CONCLUSIONS
The results of a parametric sensitivity analysis indicate that the leakage rate used for the
composite-liner scenario in the Industrial Waste Guidance is at the low end of the range of
leakage rates determined in this analysis. This suggests that while the leakage rate of 3.4 x 10^-5
m/yr is a good performance goal, it is not conservative in that it does not result in a higher
estimated risk. Data from the Florida study indicate higher leakage rates with similar designs.
The results of this analysis need to be evaluated in terms of which scenarios are plausible before
a range of leakage rates can be defined for a Monte Carlo style analysis.

The leakage rates calculated in the course of this sensitivity analysis raise questions concerning
the general applicability of the equations developed by Bonaparte, Giroud, and others. While
these authors restrict the use of the equations to certain conditions, it is unlikely that these
conditions would always exist, particularly a low hydraulic head in a surface impoundment.

Given the uncertainties associated with this analysis, there is a need to verify the equations with
more data. In order to better define a range of leakage rates through composite liners for the
purpose of including the uncertainty associated with leakage rates in a Monte Carlo analysis
of the composite liner scenario, there is a need to better understand the nature of defects in
composite liners, how defects develop with time, and how leakage rates vary with time.
B.7 REFERENCES
Bonaparte, R., J.P. Giroud, and B.A. Gross (1989). Rates of Leakage Through Landfill Liners;
Geosynthetics '89 Conference Proceedings, vol. 2, pp. 18-29; Amer. Soc. Civil Eng.
Giroud, J.P., and K. Badu-Tweneboah (1992). Rate of Leakage Through a Composite Liner Due
to Geomembrane Defects; Geotextiles and Geomembranes, vol. 11, pp. 1-28.
Giroud, J.P., and R. Bonaparte (1989). Leakage through Liners Constructed with
Geomembranes - Part I. Geomembrane Liners; Geotextiles and Geomembranes, vol. 8,
pp. 27-67.
Giroud, J.P., and R. Bonaparte (1992). Leakage through Liners Constructed with
Geomembranes - Part II. Composite Liners; Geotextiles and Geomembranes, vol. 8, pp. 71-111.
Schroeder, P.R., T.S. Dozier, P.A. Zappi, B.M. McEnroe, J.W. Sjostrom, and R.L. Peyton (1994).
The Hydrologic Evaluation of Landfill Performance (HELP) Model; EPA/600/R-94-168b.
Teller, R.B. (1997). Evaluating the Performance of Florida Double-lined Landfills;
Geosynthetics '97 Conference Proceedings, vol. 1, pp. 425-438.
-------
APPENDIX C
HISTOGRAMS OF EPACMTP INPUT PARAMETERS
USED IN THE NEURAL NETWORKS
-------
-------
Figure C-1. Infiltration Rate (SINFIL/No Liner) in (m/yr) for Landfills
Figure C-2. Infiltration Rate (SINFIL/No Liner) in (m/yr) for Waste Piles
Figure C-3. Area (AREA) in (m^2) for Landfills. Panel title: 2000 EPACMTP iterations.
Figure C-4. Unsaturated Zone Thickness (DSOIL) in (m) for Landfills
Figure C-5. Aquifer Thickness (ZB) in (m) for Landfills. Panel title: 2000 EPACMTP iterations.
Figure C-6. Infiltration Rate (SINFIL/No Liner) in (m/yr) for Surface Impoundments
Figure C-7. Infiltration Rate (SINFIL/Single Liner) in (m/yr) for Surface Impoundments. Panel title: 2000 EPACMTP iterations.
Figure C-8. Area (AREA) in (m^2) for Surface Impoundments. Panel title: 2000 EPACMTP iterations.
Figure C-9. Unsaturated Zone Thickness (DSOIL) in (m) for Surface Impoundments
Figure C-10. Aquifer Thickness (ZB) in (m) for Surface Impoundments. Panel title: 2000 EPACMTP iterations.
Figure C-11. Infiltration Rate (SINFIL/No Liner) in (m/yr) for Waste Piles
Figure C-12. Infiltration Rate (SINFIL/Single Liner) in (m/yr) for Waste Piles. Panel title: 10000 EPACMTP iterations.
Figure C-13. Area (AREA) in (m^2) for Waste Piles
Figure C-14. Unsaturated Zone Thickness (DSOIL) in (m) for Waste Piles. Panel title: 2000 EPACMTP iterations.
Figure C-15. Aquifer Thickness (ZB) in (m) for Waste Piles. Panel titles: 2000 EPACMTP iterations; 10000 EPACMTP iterations.
Figure C-16. Infiltration Rate (SINFIL/No Liner) in (m/yr) for Land Application Units. Panel titles: 2000 EPACMTP iterations; 10000 EPACMTP iterations.
Figure C-17. Area (AREA) in (m^2) for Land Application Units
Figure C-18. Unsaturated Zone Thickness (DSOIL) in (m) for Land Application Units
Figure C-19. Aquifer Thickness (ZB) in (m) for Land Application Units. Panel title: 2000 EPACMTP iterations.
-------
-------
APPENDIX D
GLOSSARY
-------
-------
Glossary to the Neural Network Appendix
BackErrorPropagation Method - neural network training method to find the optimum values
of the connection strength between hidden nodes and input/output nodes (weights). For each data
example there is a forward pass from the input nodes to the output nodes to determine the current
neural network's output values, followed by a backward pass from the output nodes to the hidden
neurons to determine (based on the difference between the predicted output values and the
desired values) how and in which direction the weights need to change. The name
backpropagation is derived from the process of propagating the error information backward from
the output nodes to the hidden nodes (Smith, 1996); also known as the steepest descent method.
Batch Learning - calculation of the residual (i.e., the difference between predicted and measured
output value) for each data example within the entire training matrix during feed forward
propagation to the neural network, summation of the individual residuals and calculation of an
overall average residual which then is fed backward once to adjust the layer weights of the neural
network.
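The feed-forward pass, back-error propagation, and batch learning entries above can be illustrated with a very small network. The sketch below is not the network used in the study: it assumes one input, a few sigmoid hidden neurons, a linear output, and a squared-error objective, and it averages the per-example gradients over the whole training set before adjusting the weights once per pass. The hidden-layer rate is set to twice the output-layer rate, following the Hlearning Rate entry in this glossary. All names and the toy data are hypothetical.

```python
# Sketch: batch learning with back-error propagation for a tiny network
# (1 input -> n_hidden sigmoid neurons -> 1 linear output). Illustrative only.
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_h, b_h, w_o, b_o):
    """Feed-forward pass: returns hidden activations and the predicted output."""
    hidden = [sigmoid(w * x + b) for w, b in zip(w_h, b_h)]
    y_hat = sum(w * h for w, h in zip(w_o, hidden)) + b_o
    return hidden, y_hat


def mean_squared_error(data, w_h, b_h, w_o, b_o):
    return sum((forward(x, w_h, b_h, w_o, b_o)[1] - y) ** 2 for x, y in data) / len(data)


def train_batch(data, n_hidden=3, epochs=200, rate=0.1, seed=0):
    rng = random.Random(seed)
    w_h = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    h_rate = 2.0 * rate   # hidden-layer rate assumed to be twice the output rate
    n = len(data)
    for _ in range(epochs):
        g_wh, g_bh, g_wo, g_bo = [0.0] * n_hidden, [0.0] * n_hidden, [0.0] * n_hidden, 0.0
        for x, y in data:                      # forward pass for every data example
            hidden, y_hat = forward(x, w_h, b_h, w_o, b_o)
            residual = y_hat - y               # predicted minus measured
            g_bo += residual
            for j, h in enumerate(hidden):
                g_wo[j] += residual * h
                delta = residual * w_o[j] * h * (1.0 - h)   # error propagated backward
                g_wh[j] += delta * x
                g_bh[j] += delta
        for j in range(n_hidden):              # one weight adjustment per batch
            w_o[j] -= rate * g_wo[j] / n
            w_h[j] -= h_rate * g_wh[j] / n
            b_h[j] -= h_rate * g_bh[j] / n
        b_o -= rate * g_bo / n
    return w_h, b_h, w_o, b_o


data = [(i / 10, (i / 10) ** 2) for i in range(11)]   # toy training matrix
untrained = mean_squared_error(data, *train_batch(data, epochs=0))
trained = mean_squared_error(data, *train_batch(data, epochs=200))
print(f"mean squared residual before: {untrained:.4f}, after: {trained:.4f}")
```

Averaging gradients rather than raw residuals is a common way to implement the "once per batch" adjustment; whether the original software did exactly this is not stated in the glossary.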
Conjugate Gradient Method - neural network training method that performs a search along two
directions. This neural network training method generally produces faster convergence than
back-error-propagation. It is a second-order weight optimization method that involves
calculating an approximation of the second derivative of the error with respect to a weight
change and can be computationally expensive.
Coefficient of Determination, R^2 - estimation of the degree of interrelation between two
variables in a manner not influenced by measurement units. It is the ratio of the explained
variation to the total variation of a population (Spiegel, M.R., 1961).
$$R^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\text{explained variation}}{\text{unexplained variation} + \text{explained variation}}$$
In the neural network study, R^2 was used as a linear measure of the correlation between input and
output values, reflecting the goodness of the linear regression fit. R^2 values near zero mean
almost no linear relation between the variables (pointing to a highly non-linear relationship); R^2
values close to one indicate a highly linear relationship. R^2 is dimensionless and always
non-negative.
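The ratio in this definition can be computed directly for a least-squares straight-line fit. The sketch below is a minimal illustration with a hypothetical function name; it fits y = intercept + slope * x and returns the explained variation divided by the total variation.

```python
# Sketch: coefficient of determination for a simple linear regression,
# computed as explained variation / total variation.
def r_squared(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    explained = sum((fi - mean_y) ** 2 for fi in fitted)   # variation captured by the line
    total = sum((yi - mean_y) ** 2 for yi in y)            # variation about the mean
    return explained / total


linear = r_squared([0, 1, 2, 3], [1, 3, 5, 7])             # exactly linear data
parabola = r_squared([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # symmetric, non-linear data
print(linear, parabola)
```

The second call shows the situation described above: the variables are strongly related, but because the relationship is not linear the straight-line fit explains none of the variation.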
DAF - (Dilution Attenuation Factor), dimensionless ratio of the leachate concentration to the
receptor well concentration. The DAF represents the combined effects of the site characteristics,
hydrogeologic settings and chemical-specific parameters on the receptor well concentration.
Data Sample - particular combination of parameters (input/output) with their associated values.
Data Matrix - rectangular array of numbers, or elements, arranged in m rows and n columns,
where each row represents one data sample.
EPACMTP - (EPA's Composite Model for Leachate Migration with Transformation Products),
probabilistic modeling software used to produce the raw data for the neural network training by
simulating the subsurface fate and transport of waste constituents leaching from land disposal
units; processes accounted for: advection, hydrodynamic dispersion, linear/non-linear sorption,
first-order decay (hydrolysis).
IWEM - (Industrial Waste Management Evaluation Model), Windows-based software for
determining the groundwater protection afforded by various WMU liner systems. Consists of a
two-tiered approach for evaluation of WMU liner systems: Tier I - conservative analysis based
on national data; Tier II - analysis with a limited set of site-specific data based on the neural
networks described in this document.
Feed Forward Propagation - neural network method that passes values from the input
parameter in the input layer to the hidden neuron in the hidden layer and then to the output
parameter in the output layer.
Hidden Neuron - a neuron or processing unit within a neural network that connects the known
input and output parameters. Hidden neurons contain the sigmoid transfer function needed to
calculate the weight adjustment factors; connected by weights with the input and output
parameters, also called the hidden node.
Hidden Layer - a layer of neurons or processing units within a neural network that are not
directly visible to the user, as the input and output values are. See hidden neuron.
Hlearning Rate - constant coefficient applied during adjustment of the neural network weights
between the input and the hidden layer after each batch; also called hidden rate. The Hlearning
rate should be twice that of the learning rate (output rate).
Input/Output Value - input value: the independent parameter data values that are used to
predict the dependent (output) parameter values; output value: the dependent parameter value
predicted by the neural network, based on combinations of input parameter values.
Input Layer - a layer of neurons or nodes that contains all neural network input parameters and
their data values. The input layer is linked directly to the hidden layer(s).
Interpolation - estimation of values at points x between given input values x_j for a function
known at those points, f(x_j), based on the assumption that in a neighborhood of the x-value in question, f can be
approximated by a polynomial p, whose value at that x is then taken as an approximation of the
value of f at that x; types: linear and quadratic (Kreyszig, 1988).
Interrogation - a test of the predictions of a trained neural network by querying it with desired
input values.
Land Application Unit - above ground waste management unit in which a liquid waste or
sludge is spread as a thin layer on the ground surface to promote biochemical decay of waste
constituents. Assumed maximum lifetime of a land application waste management unit and the
resulting leachate pulse duration is 40 years.
Landfill - a permanent waste management unit in which solid waste is deposited generally above
the ground surface. The base of the landfill may contain a liner to prevent leakage of waste
constituents into the groundwater. The duration of the leaching period for landfills is dependent
on the initial amount of contaminant, the infiltration rate, the landfill dimensions, and waste
density. For large landfills, the pulse duration can approach steady-state conditions.
LCTV - Leachate Concentration Threshold Value, the maximum allowable leachate
concentration of a waste constituent in a waste that is intended to be disposed in an industrial
waste management unit.
Learning Rate - rate of neural network learning (ability to predict dependent parameter values
based on independent parameter values) measured while adjusting the weights of the connections
between the hidden layer and the output layer.
Model Dimension - dimensionality of a neural network model based on one dimension per
input/output parameter set. For example, 7 inputs and 1 output results in seven model
dimensions.
Neural Network - a predictive model of a real world system in which dependent parameter
values (outputs) are determined based on a set of independent parameter values and the
relationships of the connections between these values, which are based on hidden neurons and
weights. Weights are adjusted during the development of a neural network to minimize the error
between the neural network answer (prediction) and the desired output (Swingler, 1996).
Neural Network Learning/Generalization - learning: the development of a predictive
capability of a neural network model based on patterns of data (i.e., input values, layer weights,
and hidden neurons); generalization: the ability of a neural network model to predict well over a
wide range of input values.
Number of Counts - number of neural network model iterations or adjustments of weights.
Output Layer - layer of neurons or nodes in a neural network that contains all output parameters
and their data values.
Overtraining - the condition of a neural network in which it predicts very well for data values
on which it was trained, but cannot predict well for input values on which it has not been
trained. Overtraining is indicated when, during the neural network training process, the error of
the validation data set suddenly increases (i.e., the R² value decreases) when the error is
expected to steadily decrease.
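The indicator described in the Overtraining entry can be checked automatically. The following Python sketch is only an illustration: the function name, the `patience` parameter, and the rule of waiting a fixed number of counts without improvement are assumptions, not the procedure used in developing the neural networks described in this document.

```python
def detect_overtraining(validation_errors, patience=3):
    """Return the index of the lowest validation error once the error has
    failed to improve for `patience` consecutive counts; otherwise None.

    validation_errors: validation-set error recorded after each training count.
    patience: hypothetical tolerance (number of counts without improvement).
    """
    best_index = 0
    for i, err in enumerate(validation_errors):
        if err < validation_errors[best_index]:
            best_index = i  # validation error is still decreasing
        elif i - best_index >= patience:
            return best_index  # error has risen or stalled: likely overtraining
    return None


if __name__ == "__main__":
    history = [0.90, 0.60, 0.40, 0.35, 0.50, 0.70, 0.90]
    print(detect_overtraining(history))
```

The returned index marks the count at which the weights would be kept if training were stopped at the point of best validation performance.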
Parameter Range - range of input parameter values that encompasses entire set of values which
are used in training and validating neural networks; defined by a minimum and maximum value.
Parameter - independent (input) or dependent (output) variable contained in the input or output
layer of a neural network.
Random Sample - a data sample created using a random number generator to produce an
unbiased data set which can be used to test or train a neural network.
Residual - the difference between an observed (e.g., EPACMTP) output value and the neural
network-predicted value; also called error.
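A short Python sketch of the residual calculation follows. The R² function assumes the common coefficient-of-determination formula (1 minus the ratio of the residual sum of squares to the total sum of squares); this document does not state which formula the neural network software uses, so treat that part as an assumption.

```python
def residuals(observed, predicted):
    """Residual = observed output value minus neural-network-predicted value."""
    return [o - p for o, p in zip(observed, predicted)]


def r_squared(observed, predicted):
    """Coefficient of determination (assumed formula: 1 - SS_res / SS_tot)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum(r * r for r in residuals(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]    # e.g., outputs of the emulated model
    predicted = [1.5, 2.0, 2.5]   # neural network predictions
    print(residuals(observed, predicted))
    print(r_squared(observed, predicted))
```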
Response Surface - the output of a model or system as a function of a set of input parameters
represented by a line or multi-dimensional surface. For example, a two-dimensional response
surface is created by connecting the input data points for each input parameter (1D) and output
parameter (1D).
Stratified Sample - data sample produced with pre-determined (non-random) combinations of
input parameter values.
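The difference between a random sample and a stratified sample can be sketched in Python as follows. The parameter names, ranges, and levels are hypothetical and serve only to show the two sampling styles; a uniform distribution is assumed for the random sample.

```python
import itertools
import random


def random_sample(ranges, n, seed=0):
    """Draw n examples, each parameter uniformly within its (min, max) range."""
    rng = random.Random(seed)  # seeded random number generator for repeatability
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        for _ in range(n)
    ]


def stratified_sample(levels):
    """Build every combination of pre-determined (non-random) parameter levels."""
    names = list(levels)
    combos = itertools.product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    # Hypothetical parameter names and values, for illustration only.
    print(random_sample({"infiltration": (0.01, 0.5), "depth": (1.0, 30.0)}, n=3))
    print(stratified_sample({"infiltration": [0.01, 0.5], "depth": [1.0, 15.0, 30.0]}))
```

The stratified sample covers the corners and chosen interior levels of the parameter ranges by construction, while the random sample fills the variable space without a pre-set pattern.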
Surface Impoundments - waste management unit in which a liquid waste is held in a pond-like
enclosure above or partly integrated in the ground surface. The assumed maximum life of a
surface impoundment and its duration of leaching period is 20 years.
Test Set/Master Test Set - an independent, randomly established set of data examples used to
test the accuracy of neural network predictions during or after the training process. Master test
sets are used to test the predictive capability of a neural network, whereas test sets may be totally
or partially incorporated into a training data set.
Tlearning Rate - initial neural network learning rate (rate at which neural network develops
predictive capabilities) for all threshold connections.
Training Parameters - settings of the neural network software (NNModel) used to start the
neural network training process, including learning rates, number of hidden neurons, number of
training counts, and training method; settings such as number of training counts or number of
hidden neurons may be changed during the neural network training process.
Training - process of developing the predictive capability of a neural network consisting of
adjusting the weights of a neural network by passing a set of data examples of input-output
values through the model and adjusting the weights between the input/output parameters and the
hidden neurons to minimize the error between the neural network prediction and the desired
answer (Swingler, 1996).
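The weight-adjustment idea in the Training entry can be illustrated with a deliberately small Python sketch: a single weight, no hidden neurons, and a generic delta-rule update. This is not the algorithm implemented in NNModel, which this glossary does not describe; the function name and default settings are assumptions.

```python
def train_single_weight(examples, learning_rate=0.05, counts=200):
    """Fit prediction = w * x by repeatedly passing (input, output) examples
    through the model and nudging w to reduce the prediction error."""
    w = 0.0
    for _ in range(counts):                # one count = one pass over the examples
        for x, desired in examples:
            error = desired - w * x        # desired answer minus prediction
            w += learning_rate * error * x # adjust weight in proportion to error
    return w


if __name__ == "__main__":
    # Examples generated from the relationship output = 2 * input.
    print(train_single_weight([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]))
```

The learning rate controls the size of each adjustment; a value that is too large for the scale of the inputs makes the weight oscillate instead of settling.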
Validation - method used to test the predictive capability of a neural network using independent
training data examples and to monitor the progress of the training process; validation data sets
can be completely or partially incorporated into a training data matrix to improve the neural
network predictive capability.
Variable Space - also called data space; a term used to describe the range of neural network
input parameter values and combinations of these values.
Waste Piles - an uncovered temporary waste management unit used to store solid waste above
the ground surface, with an assumed maximum lifetime and duration of leaching period of 20
years.
Waste Management Unit - permanent or temporary facility used to store/manage industrial
waste. Includes landfills, waste piles, surface impoundments, and land application units.