Data Quality Objectives Decision Error Feasibility Trials (DEFT) Beta Version 1


         United States Environmental Protection Agency
         Office of Solid Waste and Emergency Response
         Superfund
         540R93600                August 1993

EPA      DATA QUALITY OBJECTIVES
         DECISION ERROR
         FEASIBILITY TRIALS (DEFT)
         BETA VERSION 1.01

         User's Guide

-------
       DATA QUALITY OBJECTIVES
DECISION ERROR FEASIBILITY TRIALS (DEFT)

           BETA VERSION 1.01

             USER'S GUIDE
               August 2, 1993

-------
Directions:

Insert the floppy disk into drive B (or A)
type   b:  (or a:)
type   deft
Background:

        One of the most intensive steps in the DQO Process¹ is the final step, Optimize the Design.
During this step, the entire set of DQO outputs is incorporated into a sampling design. If the DQO
constraints are not feasible, it may be necessary to iterate through one or more of the earlier steps of
the DQO Process to identify a sampling design that will meet the budget and generate data that are
adequate for the decision.  This iteration can be time consuming and costly. The  proposed PC
package should reduce the intensiveness of this step by allowing a decision maker or member of the
DQO planning team to generate information about several simple sampling designs based on the DQO
constraints.  Through this  process,  the decision maker can evaluate whether these  constraints are
appropriate or feasible before the sampling and analysis design  team begins identifying a sampling
design.

        There is no easy method for developing a statistical sampling design. Factors such as the
media, the parameter of interest,  the contaminant of interest and the sampling boundaries all affect the
choice of a sampling design.  For instance, volatile and non-volatile contaminants must be treated
differently both in the field and in  the laboratory.  Sampling designs for soil, where samples can be
randomly placed, are different from sampling designs for ground water, where sampling locations are
fixed.  A composite sample design is applicable for testing hypotheses concerning the mean; however,
it is not applicable for testing hypotheses concerning percentiles. An optimal sampling design accounts
for these factors and others, yet is practical, feasible, and satisfies the DQO constraints.  The proposed
package is not an expert system that will produce an optimal (or even feasible) sampling design.

        Decision makers will  be able to tailor the application of this proposed package to their basic
needs by entering basic information on the DQO constraints. Then the user will be able to change
DQO constraints such as the limits on decision error or the grey region, and evaluate how  these
changes affect the sample  size for several basic sampling designs.  The relations and values that are
generated by the package can be  used to set upper bounds on the sample size and eventually optimize
the sampling design.
    1 The DQO Process is described in "Data Quality Objectives Process for Superfund: Interim Final
Guidance" EPA/540/G-93/071.  U.S. Environmental Protection Agency.  1993.

DEFT - Beta Version 1.01                       1                                         7/2/93

-------
Current Version:

The current version of this program assumes that a mean is being compared to some fixed standard.

Entering Information:

1. Parameter

The user will need to enter the parameter of interest. A parameter is a numerical descriptive
measure of a population. Possible parameters include:

1. the mean
2. a percentile
3. a proportion.

Since the current version of this program assumes that a mean will be used, the user will not have to
enter this information. The parameter of interest should have been identified in Step 5, Develop a
Decision Rule, of the DQO Process.

2. Minimum and Maximum Values

Estimates of the minimum and maximum possible values of the population are necessary for
scaling and graphing purposes, default estimates, and to create bounds on the parameter of interest.
These values should have been identified in Step 6, Specify Limits on Decision Errors, of the DQO
Process.

3. Action Level

The action level is a measurement threshold that provides the criterion for selecting among
alternative actions. The action level is bounded by the minimum and maximum values of the
population. It should have been identified in Step 5, Develop a Decision Rule, of the DQO Process.

4. Null Hypothesis

The direction of the hypothesis is very important. The null hypothesis is the 'default' or baseline
condition. If the data do not provide enough evidence to reject the null hypothesis, then the null
hypothesis is retained as the conclusion. The null hypothesis may have been identified in Step 6,
Specify Limits on Decision Errors, of the DQO Process.

The current version of the software compares a mean to a fixed standard (or action level).
Therefore the choices for selecting the form of the hypothesis are:

1. H0: μ > Action Level vs. Ha: μ < Action Level

2. H0: μ < Action Level vs. Ha: μ > Action Level

-------
5. Gray Region

The user will need to enter both the upper bound of the gray region and the lower bound,
although one of these bounds may be set equal to the action level depending on the null hypothesis.
The values which define the gray region are bounded by the maximum and minimum values of the
population and the action level. The boundary values of the Gray Region should have been identified
in Step 6, Specify Limits on Decision Errors, of the DQO Process.
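
The bounds described in items 2 through 5 can be checked before any sample size is computed. The following is a minimal Python sketch, not DEFT's own code: the class name, field names, and messages are hypothetical, and the rule that the action level lies inside or on the edge of the gray region is an interpretation of the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DQOConstraints:
    """Hypothetical container for the inputs described in items 2 to 5."""
    minimum: float        # smallest possible value of the population
    maximum: float        # largest possible value of the population
    action_level: float   # threshold used in the decision rule
    gray_lower: float     # lower bound of the gray region
    gray_upper: float     # upper bound of the gray region

    def problems(self) -> List[str]:
        """Return the violated bounds (an empty list means the inputs agree)."""
        found = []
        if not self.minimum < self.maximum:
            found.append("minimum must be less than maximum")
        if not self.minimum <= self.action_level <= self.maximum:
            found.append("action level must lie between minimum and maximum")
        if not self.minimum <= self.gray_lower < self.gray_upper <= self.maximum:
            found.append("gray region must lie between minimum and maximum")
        # Assumption: one gray region bound may equal the action level.
        if not self.gray_lower <= self.action_level <= self.gray_upper:
            found.append("action level must lie within the gray region")
        return found


# Example with made-up values: null hypothesis 1, upper bound at the action level.
constraints = DQOConstraints(0.0, 200.0, 100.0, 80.0, 100.0)
print(constraints.problems())
```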

6. Estimate of Variability

An estimate of variability is necessary for computing sample sizes. This program will ask for
an estimate of the standard deviation (the square root of the variance). If there is no estimate
available, use the rough approximation:

(Maximum Value - Minimum Value) / 6.

However, this approximation should only be used if there is absolutely no other information available.
The standard deviation must be greater than zero and less than the range of the population. An
estimate of standard deviation may have been derived in Step 3, Identify Inputs to the Decision, of the
DQO Process.
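
As an illustration of the rough approximation and the bounds on the standard deviation, here is a short Python sketch. The function names are hypothetical and are not part of DEFT.

```python
def rough_standard_deviation(minimum: float, maximum: float) -> float:
    """Rough estimate (maximum - minimum) / 6, for use only as a last resort."""
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    return (maximum - minimum) / 6.0


def standard_deviation_is_valid(sigma: float, minimum: float, maximum: float) -> bool:
    """True if sigma is greater than zero and less than the population range."""
    return 0.0 < sigma < (maximum - minimum)


# Example with made-up values: a population ranging from 0 to 120.
sigma = rough_standard_deviation(0.0, 120.0)
print(sigma)  # 20.0
print(standard_deviation_is_valid(sigma, 0.0, 120.0))  # True
```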

7. Limits on Decision Errors

The limits on decision errors are entered as alpha (α) and beta (β). Alpha and beta are based
on the null hypothesis. At the current time, the program only considers one alpha and one beta.

Alpha is the probability of rejecting the null hypothesis when it is true, or the probability of a
Type I error. Therefore, if the null hypothesis is that μ > AL, then alpha is the probability of
concluding that μ < AL when the true state of nature is μ > AL.

Beta is the probability of accepting the null hypothesis when it is false, or the probability of a
Type II error. Therefore, if the null hypothesis is that μ > AL, then beta is the probability of
concluding that μ > AL when the true state of nature is μ < AL.
-------
Changing Information:

Once all the information has been entered, the user will be able to change various information and see
the effect this has on the sample size, cost, and the expected power curve.
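
To show how changes in the constraints move the sample size, the sketch below uses a textbook normal-approximation formula for a one-sided test of a mean under simple random sampling. This guide does not state the formula DEFT uses, so the formula, the function name, and the example values are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size(alpha: float, beta: float, sigma: float,
                gray_lower: float, gray_upper: float) -> int:
    """Approximate sample size n = ((z_(1-alpha) + z_(1-beta)) * sigma / width)^2.

    Assumption: a normal approximation for simple random sampling, where
    width is the width of the gray region. Not taken from DEFT itself.
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    width = gray_upper - gray_lower
    if width <= 0.0 or sigma <= 0.0:
        raise ValueError("gray region width and sigma must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    return math.ceil(((z_alpha + z_beta) * sigma / width) ** 2)


# Made-up example: narrowing the gray region raises the sample size.
wide = sample_size(0.05, 0.20, sigma=20.0, gray_lower=80.0, gray_upper=100.0)
narrow = sample_size(0.05, 0.20, sigma=20.0, gray_lower=90.0, gray_upper=100.0)
print(wide, narrow)
```

Under this approximation, halving the width of the gray region roughly quadruples the sample size, which is why the gray region is often the first constraint to revisit when a design is too expensive.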

The user may change:

Alpha - Press 'A' to change alpha.

Beta - Press 'B' to change beta.

Action Level - Press 'C' to change the action level.

Gray Region - Press 'L' to change the lower value of the gray region.
Press 'U' to change the upper value of the gray region.

Standard Deviation - Press 'D' to change the estimate for the standard deviation.

Sample Size - Press 'N' to change the number of samples. The program will compute the optimal
beta based on this sample size while alpha will remain fixed. This option is not implemented yet.

Total Cost - To change costs, enter 'T'. Then enter values for both the cost of analyzing a sample (A)
and the cost of collecting a sample (F). In future versions, it will be possible to change these costs
individually as well as to set the total cost to restrict sample size.

Sampling Design - Press 'S' to switch the type of sampling that is used. Possible sampling designs
include:

1. Simple Random Sampling
2. Composite Sampling
3. Stratified Sampling
4. Sequential Sampling

This option is not yet implemented.
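
The Total Cost option in the list above takes a per-sample analysis cost (A) and a per-sample collection cost (F). The Python sketch below shows one simple way these could combine with the sample size. It assumes total cost is the number of samples times (A + F) with no fixed costs, which this guide does not confirm, and the function names are hypothetical.

```python
def total_cost(n_samples: int, analysis_cost: float, collection_cost: float) -> float:
    """Total cost assuming each sample costs A + F and there are no fixed costs."""
    return n_samples * (analysis_cost + collection_cost)


def max_samples_for_budget(budget: float, analysis_cost: float,
                           collection_cost: float) -> int:
    """Largest whole number of samples affordable within the budget."""
    per_sample = analysis_cost + collection_cost
    if per_sample <= 0.0:
        raise ValueError("per-sample cost must be positive")
    return int(budget // per_sample)


# Made-up example: A = 400 and F = 100 per sample.
print(total_cost(25, 400.0, 100.0))                 # 12500.0
print(max_samples_for_budget(10000.0, 400.0, 100.0))  # 20
```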

General Information:

(P)rint - press 'P' to send the current information to a printer. This option has not yet been
implemented.

Save(F)ile - press 'F' to save the current information in a file. The user will be prompted to enter a
filename. This option has not yet been implemented.

(O)riginal DQOs - press 'O' to restore the original DQO constraints entered at the start of the
program. This is useful for comparing variations of several sampling designs. For instance, the user
will start with a set of DQO constraints. The first sampling design may be too expensive to satisfy
these constraints, so the user may want to relax some constraints to obtain a feasible sample size.
After this is complete, the user may want to examine the performance of another sampling design
using the original DQO constraints. This option saves the user from re-entering the original
information. This option has not yet been implemented.

(G)raph - press 'G' to switch to the graph of the design performance. This diagram summarizes the
gray region, the limits on decision errors, the action level, and the expected power curve of the design.
This option has not yet been implemented. At the current time the graphics display is on the same
screen as the prompts for changing information.

DESIGN PERFORMANCE GRAPH

The design performance graph summarizes the gray region, the limits on decision errors, the
action level, and the expected power curve of the design. Information on sample size and costs will
also be included for the user. The only option on this screen will be to return to 'DQO Constraints'
screen. At the current time, the graphics display (Figure 1) is on the same screen as the prompts for
changing the DQO constraints.
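
For readers who want to see how an expected power curve of this kind can be traced, here is a minimal Python sketch. It assumes null hypothesis 1 (H0: μ > Action Level), a known standard deviation, and a normal approximation for the sample mean. The guide does not say how DEFT computes its curve, so the function name and the calculation are illustrative assumptions only.

```python
import math
from statistics import NormalDist


def prob_decide_below(true_mean: float, action_level: float,
                      sigma: float, n: int, alpha: float) -> float:
    """Probability of concluding the mean is below the action level.

    Assumption: H0 is rejected when the sample mean falls below
    action_level - z_(1-alpha) * sigma / sqrt(n).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    standard_error = sigma / math.sqrt(n)
    return NormalDist().cdf((action_level - true_mean) / standard_error - z_alpha)


# Made-up example: trace the curve across a range of possible true means.
for true_mean in (80.0, 90.0, 95.0, 100.0, 105.0):
    p = prob_decide_below(true_mean, action_level=100.0, sigma=20.0, n=25, alpha=0.05)
    print(f"true mean {true_mean:6.1f}: P(decide mean < AL) = {p:.3f}")
```

At a true mean equal to the action level the curve passes through alpha, and it rises toward 1 as the true mean falls further below the action level.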
-------
[Figure 1. Design performance graph (image not recoverable from the source)]
-------