-------
o The NAVIGATION PANEL displays the names of the data sets and all
generated outputs.
o At present, the navigation panel can hold at most 20 outputs. To see
more files (data files or generated output files), click the
Window option.
o The LOG PANEL displays transactions in green, warnings in orange, and
errors in red. For example, when one attempts to run a procedure
meant for censored data sets on a full (uncensored) data set, Scout prints
a warning message in orange in this panel.
o Should both panels be unnecessary, you can click Configure > Panel
ON/OFF.
Turning the panels off gives extra space to see and print out the statistics of interest.
For example, one may want to turn off those panels when multiple variables (e.g.,
multiple Q-Q plots) are analyzed and GOF statistics and other statistics need to be
captured for all of the variables.
-------
Chapter 2
Working with Data, Graphical Output, and
Non-Graphical Output
2.1 Creating a New Spreadsheet (Data Set)
To create a new worksheet: click File > New
[Screenshot: File menu (New, Open, Import, Exit)]
2.2 Open an Existing Spreadsheet (Data Set)
If your data sets are stored in the Scout data format (*.wst), Scout output format (*.ost),
Scout graphical format (*.gst), or an Excel spreadsheet (*.xls), then click File > Open.
o If your data sets are stored in the Microsoft Excel format (*.xls), or in the DOS-
Scout format (*.dat) or ParallAX format (*.pax), then choose File > Import
> Excel or Old Scout or ParallAX.
[Screenshot: File > Import submenu with the options Excel (.xls) Data, Old Scout (.dat) Data,
Comma Delimited (.csv) Data, Blank or Tab Delimited (.txt) Data, and ParallAX (.pax) Data]
o Make sure that the file you are trying to import is not currently open.
Otherwise, the following warning message appears in the Log Panel:
"[Information] Unable to open C:\***.xls." Check the validity of this file.
Note: *.csv files and *.txt files will be available in later versions of Scout.
-------
2.3 Input File Format
o The program can read Excel files (*.xls files), data files (*.dat files for DOS
versions of GeoEas and Scout software packages), ParallAX files (*.pax files),
comma delimited data files (*.csv files), and tab or space delimited files (*.txt
files).
o The user can perform typical Cut, Paste, and Copy operations, as in Microsoft
Excel.
o The first row in all input data files should consist of alphanumeric (strings of
numbers and characters) variable names representing the header row. Those
header names may represent meaningful variable names such as Arsenic,
Chromium, Lead, Temperature, Weight, Group-ID, and so on.
o The Group-ID column has the labels for the groups (e.g., Background,
AOC1, AOC2, 1, 2, 3, a, b, c, Site1, Site2, and so on) that might be
present in the data set. Alphanumeric strings (e.g., Surface,
Subsurface) can be used to label the various groups.
o The data file can have multiple variables (columns) with unequal numbers
of observations. NOTE: Some of the robust methods require all of the
variables to have an equal number of observations.
o Except for the header row and columns representing the group labels, only
numerical values should appear in all of the other columns.
o All of the alphanumeric strings and characters (e.g., blank, other
characters, and strings), and all of the other values (that do not meet the
requirements above) in the data file are treated as missing values.
o Also, a large value denoted by 1E31 (= 1 x 10^31) can be used to represent
missing data values. All of the entries with this value are excluded from the
computations. Those values are counted when missing data values are
tracked.
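
The input rules above can be illustrated with a minimal Python sketch. It is not Scout's own reader: the function names, the comma-delimited sample, and the assumption that the group column is called "Group" are all hypothetical, chosen only to show how blanks, non-numeric strings, and the 1E31 value end up counted as missing.

```python
import csv
import io

MISSING_SENTINEL = 1e31  # the 1E31 placeholder for missing data


def parse_cell(text):
    """Return the cell as a float, or None when it counts as missing."""
    try:
        value = float(text.strip())
    except ValueError:
        return None  # blanks and non-numeric strings are treated as missing
    if value != value or value == MISSING_SENTINEL:  # NaN or the sentinel
        return None
    return value


def read_columns(handle, group_columns=("Group",)):
    """Read a header row plus data rows into a dict of column lists."""
    reader = csv.reader(handle)
    header = [name.strip() for name in next(reader)]
    columns = {name: [] for name in header}
    for row in reader:
        for name, cell in zip(header, row):
            if name in group_columns:
                columns[name].append(cell.strip())  # group labels stay as text
            else:
                columns[name].append(parse_cell(cell))
    return columns


# Made-up sample data (hypothetical values, comma-delimited for illustration).
sample = io.StringIO(
    "Arsenic,Lead,Group\n"
    "4.5,1E31,Surface\n"
    "n/a,2.0,Surface\n"
    ",3.5,Subsurface\n"
)
data = read_columns(sample)
missing_counts = {
    name: sum(value is None for value in values)
    for name, values in data.items()
    if name != "Group"
}
print(missing_counts)
```

In this sketch the missing entries are kept as None so that they can still be counted, while any later computation can simply skip them.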
2.4 Number Precision
o You may turn Full Precision on or off by choosing Configure > Full Precision
On/Off.
-------
[Screenshot: Configure menu]
o With Full Precision turned on, Scout displays numerical values using
an appropriate (default) number of decimal digits. With Full
Precision turned off, all decimal values are rounded to the nearest
thousandth.
o The Full Precision On option is particularly useful when dealing with data sets
consisting of small numerical values (e.g., <1), which result in small values of the
various estimates and test statistics. Those values may have
several leading zeros after the decimal point (e.g., 0.00007332). In such situations, one
may want to use the Full Precision option to see nonzero digits after the decimal point.
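
The effect of the two settings on a small value can be sketched in Python as follows. This is only an illustration: the function name and the choice of eight decimal digits for the "on" mode are assumptions, since the guide does not state how many digits Scout displays by default.

```python
def display_value(value, full_precision=True, full_digits=8):
    """Format a number with many decimals (on) or three decimals (off)."""
    digits = full_digits if full_precision else 3
    return f"{value:.{digits}f}"


estimate = 0.00007332
print(display_value(estimate, full_precision=True))   # keeps the nonzero digits
print(display_value(estimate, full_precision=False))  # rounds to thousandths
```

With rounding to thousandths, a value such as 0.00007332 is shown as zero, which is why the Full Precision option matters for small estimates.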
2.5 Entering and Changing a Header Name
1. Highlight the column whose header name (variable name) you want to change by
clicking either the column number or the header as shown below.
[Screenshot: worksheet with the Arsenic column selected (values 4.5, 5.6, 4.3, 5.4, 9.2)]
2. Right-click and then click "Header Name."
[Screenshot: right-click menu with the "Header Name" option]
3. Change the header name.
-------
[Screenshot: Header Name dialog with "Arsenic Site 1" entered, and OK and Cancel buttons]
4. Click the "OK" button to get the following output with the changed variable
name.
[Screenshot: worksheet with the column header changed to "Arsenic Site 1"]
2.6 Editing
Click on the Edit menu item to reveal the following drop-down options.
[Screenshot: Edit menu with the options Cut (Ctrl-X), Copy (Ctrl-C), and Paste (Ctrl-V)]
The following Edit drop-down menu options are available:
o Cut option: similar to the standard Windows Edit option, such as in Excel. It
performs the standard cut function on the selected (highlighted) data, holding the data in a buffer.
o Copy option: similar to the standard Windows Edit option, such as in Excel. It
performs the standard copy function on the selected (highlighted) data, holding the data in a buffer.
o Paste option: similar to the standard Windows Edit option, such as in Excel. It
pastes the selected (highlighted) data to the
designated spreadsheet cells or area.
o Note that the Edit option can also be used to copy graphs.
-------
2.7 Handling Non-detect Observations
Scout can handle data sets with single and multiple detection limits.
For a variable with non-detect observations (e.g., arsenic), the detected values, and the
numerical values of the associated detection limits (for less than values) are entered in the
appropriate column associated with that variable.
Specifically, the data for variables with non-detect values are provided in two columns.
One column consists of the detected numerical values, with less than (< DL) values
entered as the corresponding detection limits (or reporting limits), and the second column
represents their detection status, consisting of only the values 0 (for less than values) and 1 (for
detected values). The name of the variable representing the
detection status should consist of the prefix d_ or D_ (not case sensitive) followed by the variable name.
The detection status column, with a variable name starting with D_ (or d_), should have
only two values: 0 for non-detect values, and 1 for detected observations.
For example, the header name D_Arsenic is used for the variable Arsenic, which has
non-detect observations. The variable D_Arsenic contains a 1 if the corresponding
Arsenic value represents a detected entry, and contains a 0 if the corresponding entry for
the variable Arsenic represents a non-detect.
There should not be any missing value in the non-detects column. If there exists an
observation with no indication of "0" or "1" in the non-detects column, then that
observation should be deleted if the various methods for non-detects are to be used.
Otherwise the methods for detected data (i.e., methods which do not require a non-detects
column) can be used.
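
The naming and coding conventions for non-detects can be checked with a short Python sketch like the one below. It is a hypothetical helper, not part of Scout: the function names and the dictionary-of-columns layout are assumptions made for illustration, and the sample values are made up.

```python
def detection_columns(columns):
    """Map each measured variable to its D_/d_ detection status column."""
    pairs = {}
    for name in columns:
        if name[:2].lower() == "d_":
            pairs[name[2:]] = name
    return pairs


def check_status(columns):
    """List problems with the detection status columns (empty list if none)."""
    problems = []
    for variable, status_name in detection_columns(columns).items():
        if variable not in columns:
            problems.append(f"{status_name}: no matching column {variable!r}")
            continue
        for row, flag in enumerate(columns[status_name], start=1):
            if flag not in (0, 1):  # a missing or invalid status entry
                problems.append(
                    f"{status_name} row {row}: expected 0 or 1, got {flag!r}"
                )
    return problems


# Made-up worksheet: the third Arsenic entry has no detection status.
worksheet = {
    "Arsenic": [4.5, 5.6, 4.3],
    "D_Arsenic": [0, 1, None],
    "Zinc": [89.3, 90.7, 95.5],
}
print(check_status(worksheet))
```

Rows flagged by such a check would need to be deleted before the methods for non-detects are used, as described above.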
-------
[Screenshot: example worksheet (D:\example.wst) with the columns Arsenic, D_Arsenic, Mercury,
D_Mercury, Vanadium, Zinc, and Group; the Group column contains the labels Surface and Subsurface]
2.8 Handling Missing Values
[Screenshot: Data menu with the options Copy, Generate Data, Impute ND Data, Missing, and
Transform (No NDs); the Missing submenu lists Replace Missing with Mean, Replace Missing with
Median, and Remove Rows with Missing Data]
Section 4.4 details how missing values are treated in Scout.
-------
2.9 Saving Files
[Screenshot: File menu (New, Open, Import, Close, Save, Save As, Print, Print Preview, Exit)]
o The Save option allows the user to save the active window.
o The Save As option allows the user to save the active window. This option
follows typical Windows standards, and saves the active window to a file in Excel
(*.xls) format or an output sheet (*.ost) format.
2.10 Printing Non-Graphical Outputs
1. Click the output you want to copy or print in the Navigation Panel.
-------
[Screenshot: Scout 4.0 window with the output sheet HuberOut.ost selected in the Navigation Panel
and informational messages listed in the Log Panel]
2. Click File > Print.
[Screenshot: File menu opened from the HuberOut.ost output window]
2.11 Working with Graphs
Advanced users are provided with two sets of tools to modify graphics displays. A
graphics toolbar is available above the graphics display, and when the user right-clicks on
the desired object within the graphics display, a drop-down menu appears. The user
can select an item from the drop-down menu list by clicking on that item. This allows
the user to make the modifications available for the selected menu item. An
illustration is given below.
2.11.1 Graphics Toolbar
[Screenshot: graphics toolbar above a graph titled "Scatter Plot of Discriminant Scores"]
The user can change fonts, font sizes, vertical and horizontal axes, and select new colors
for the various features and text. All of those actions are generally used to modify the
appearance of the graphic display. The user is cautioned that those tools can be
unforgiving and may put the user in a situation where the user cannot go back to the
original display. Users may want to explore the robustness of those tools and become
more experienced in their use before actually trying to use those graphic tools on real
data sets.
Another feature in this graphics toolbar is the presence of one, two, or three drop-down
variable selection boxes, depending upon the type of graph.
o The XY Plot in Regression has only one drop-down variable selection box
for different X variables.
o The Scatter Plots in 2D Graphs, Principal Component Analysis, and
Discriminant Analysis have two drop-down variable selection boxes for
selecting different X and Y variables. The first box is for the X variable
and the second box is for the Y variable.
o Scatter Plots in 3D Graphs have three drop-down variable selection boxes
for selecting different X, Y, and Z variables.
o The user can select the required variables and obtain the new graph
by clicking the "Redraw" button. An example is given below.
Note: One can select variables from the graph itself, as shown in the following figure.
Graph: PROP principal components scatter plot.
Data Set used: Well-known Wood data set. All five of the X-variables were selected to derive the PCs.
Default Graph Obtained: PC1 is drawn along the X-axis and PC2 is drawn along the Y-axis.
Changing X-axis variable to PC4 and Y-axis variable to variable X2.
[Figure: "Scatter Plot of PROP PCs" with PC1 on the X-axis and PC2 on the Y-axis; the variable
selection box lists PC1 through PC5]
-------
The X-axis variable is PC4 and the Y-axis variable is variable X2.
[Figure: "Scatter Plot of PROP PCs" redrawn with PC4 on the X-axis and x2 on the Y-axis]
2.11.2 Drop-Down Menu Graphics Tools
Those tools allow the user to move the mouse icon to a specific graphic item such as an
axis label or a display feature. The user then right clicks the mouse button and a drop-
down menu appears. This menu presents the user with available options for that
particular control or graphic object. If one is not careful and experienced, then there is a
small risk of making an unrecoverable error when using those drop-down menu graphics
tools. As a cautionary note, the user can always delete the graphics window and redraw
the graphical displays by repeating their operations from the datasheet and menu options
available in Scout. An example of a drop-down menu obtained by right clicking the
mouse button on the background area of the graphics display is given as follows. Some
of the options are: changing the color of the observations, changing the type of graph,
viewing the observation numbers (Point Labels), and editing the title of the graph.
-------
[Screenshot: right-click drop-down menu on the "Scatter Plot of PROP PCs" graph, with options
including Toolbar, Data Editor, Legend Box, Gallery, Color, Edit title, Point Labels, Font,
Properties, and Statistical Studies]
Scout provides a different drop-down menu graphics tool when
observations from various groups are present. It can be used to change the group assignment of
observations on the graph. To use this feature, move the mouse pointer to the particular
observation and click the right mouse button. A menu comes up. Click the
"Change Group" option. A window comes up with a "Change Group" drop-down
box. Select the new group for the observation and click "OK" to continue or "Cancel"
to cancel. Once a selection has been made, move the mouse pointer to that
particular observation and click the left mouse button. This changes the
group assignment of the observation, which then belongs to the new group shown on
the graph.
-------
Graph of 2D scatter plot with groups from graphs.
Data Set used: Beetles.
Changing the left-most observation from Group 3 (red triangle) to Group 2 (green circle).
[Figure: scatter plot with the right-click menu open, showing the "Change Group" option; X-axis: x2]
Change group option brings up a Change Group window, as shown below.
[Figure: scatter plot with the Change Group window open; X-axis: x2]
-------
The left-most observation from Group 3 (red triangle) now belongs to Group 2 (green circle) on the
graph.
[Figure: scatter plot after the group change]
To incorporate the changes in the graph to the worksheet, click the "Save Changes"
option after using the right-click button on the mouse. This saves the new grouping to
the first available column on the worksheet as "newGrp."
Observation 53 changed from Group 3 to Group 2.
Row   Group   x1    x2   newGrp
43    2       124   15   2
44    2       120   13   2
45    2       119   16   2
46    2       119   14   2
47    2       133   13   2
48    2       121   15   2
49    2       128   14   2
50    2       129   14   2
51    2       124   13   2
52    2       129   14   2
53    3       145    8   2
54    3       140   11   3
55    3       140   11   3
56    3       131   10   3
57    3       139   11   3
58    3       139   10   3
59    3       136   12   3
60    3       129   11   3
61    3       140   10   3
62    3       137    9   3
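
The effect of "Save Changes" on the worksheet can be mimicked with a small Python sketch. This is an illustration under assumptions, not Scout's implementation: the function name, the dictionary-of-columns layout, and the zero-based row indexing are hypothetical; only the column name "newGrp" and the three sample rows (observations 52 to 54 of the Beetles example) come from the text.

```python
def save_group_changes(worksheet, group_column, changes, new_column="newGrp"):
    """Return a copy of the worksheet with the edited grouping in a new column.

    `changes` maps a zero-based row index to its new group label.
    """
    new_groups = list(worksheet[group_column])  # start from the original groups
    for row_index, new_label in changes.items():
        new_groups[row_index] = new_label
    updated = dict(worksheet)
    updated[new_column] = new_groups  # the original group column is untouched
    return updated


# Observations 52 to 54 of the Beetles example; the middle one is regrouped.
worksheet = {"Group": [2, 3, 3], "x1": [129, 145, 140], "x2": [14, 8, 11]}
result = save_group_changes(worksheet, "Group", {1: 2})
print(result["newGrp"])
```

Keeping the original Group column and writing the edits to a separate column means the graphical change can always be compared against the initial grouping.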
-------
2.11.3 3D Graphics Chart Rotation Control Button
The axes in a 3D scatter plot can be rotated using the Chart Rotation Control button
present on the top-left corner of the 3D scatter plot.
[Screenshot: Chart Rotation Control button at the top-left corner of a 3D scatter plot]
When this Chart Rotation Control button is clicked, the Chart Rotation Control tool box
appears. This tool box has three scroll bars for the three axes and a fourth scroll bar for
adjusting the brightness of the graph. The scroll bars can be used to rotate any or all of
the three axes. When the "Reset" button is clicked, the graph is reset to the standard
front view. The "Cancel" button brings the graph to its default view.
[Screenshot: Chart Rotation Control tool box with X-Axis, Y-Axis, Z-Axis, and Light Level scroll
bars and Reset, OK, and Cancel buttons]
-------
The angle of rotation for the three axes ranges from -120 to +111 degrees. A positive
sign indicates rotation in the clockwise direction and a negative sign indicates rotation in the
counter-clockwise direction. The Light Level scroll bar ranges from 0 for black to 391 for the white
(brightest) level.
-------
References
ProUCL 4.00.04. (2009). "ProUCL Version 4.00.04 User Guide." The software
ProUCL 4.00.04 can be downloaded from the web site at:
http://www.epa.gov/esd/tsc/software.htm.
-------
Chapter 3
Select Variables Screens
Scout provides a number of variable selection screens for different types of statistical
analysis. Most of them are illustrated here.
3.1 Data Drop-Down Menu
3.1.1 Transform (No NDs)
o When the user clicks Data > Transform (No NDs), the following window will
appear:
[Screenshot: "Select Transform Variable" window with a "Select Transform" box (options include
Z-Transform, Linear (ax + b), Natural Log, Log Base 10, Exp(x), Pow(x, a), Box-Cox, Ranked,
Ordered, and ArcSine), a "Select a Variable to Transform" list, a "Variable to Transform" box,
a "Select Worksheet" box, a "New Column Name" field, and a "Select Column" box]
o This screen allows the user to transform a single variable. The available
transformations are listed in the "Select Transform" box.
o A single variable is selected, and that variable appears in the "Variable to
Transform" box.
o The user can select the worksheet in which to store the transformed values using the "New
Worksheet" or the "Other Worksheets" option; a set of available columns then appears in
the "Select Column" box. The user has to specify a name for the new column.
-------
o An example of the selections made is shown below.
[Screenshot: "Select Transform Variable" window with a transform chosen and the variable y
(ID 1, Count = 75) placed in the "Variable to Transform" box]
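
A few of the single-variable transformations named in the "Select Transform" box can be sketched in Python as follows. These are textbook definitions written for illustration; the function names are hypothetical, and the use of the sample standard deviation (n - 1) in the Z-transform is an assumption, since this section does not say which formulas Scout applies.

```python
import math
import statistics


def z_transform(values):
    """Center by the mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 in the denominator
    return [(v - mean) / sd for v in values]


def natural_log(values):
    """Natural logarithm; requires strictly positive values."""
    return [math.log(v) for v in values]


def box_cox(values, lam):
    """Box-Cox power transform; lam = 0 reduces to the natural log."""
    if lam == 0:
        return natural_log(values)
    return [(v ** lam - 1) / lam for v in values]


# Made-up sample values for one variable.
arsenic = [4.5, 5.6, 4.3, 5.4, 9.2]
new_column = z_transform(arsenic)  # would be stored under the new column name
print([round(z, 3) for z in new_column])
```

As on the selection screen, each function takes one variable and returns one new column of the same length.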
3.1.2 Impute: Transform Two Columns to a Column (NDs)
o When the user clicks Data > Impute (NDs), the window given below will appear.
o This selection screen comes up only for data sets having non-detects. If the file
does not have columns for indicating non-detects, then an error message is
displayed in the Log Panel.
o This screen allows the user to impute the non-detect values of a single variable. The
replacement methods available are in the "Select NDs Replacement" box.
o A single variable is selected and that variable appears in the "Variable to
Transform" box.
-------
[Screenshot: impute window with a "Select NDs Replacement" box (Detection Limit, 1/2 Detection
Limit, Zero, Normal ROS Estimates, Gamma ROS Estimates, Lognormal ROS Estimates, Uniform), a
"New Column Name" field, a "Select a Variable to Transform" list, a "Select Column" box, a
"Variable to Transform" box, and a "Select Worksheet" box]
o The user can select the worksheet in which to store the imputed values using the "New
Worksheet" or the "Other Worksheets" option; a set of available columns then appears in
the "Select Column" box. The user has to specify a name for the new column.
o An example of the selections made is shown below:
[Screenshot: impute window with Normal ROS Estimates selected, a variable placed in the
"Variable to Transform" box, a new column name entered, and "Other Worksheets" selected]
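
The three simplest replacement choices listed in the impute window (detection limit, half the detection limit, and zero) can be sketched in Python as below. The function name and method keys are hypothetical, the sample values are made up, and the ROS-based and uniform replacements are deliberately not attempted here because this section does not describe how Scout computes them.

```python
def impute_nondetects(values, detected, method="half_dl"):
    """Combine a value column and its 0/1 status column into one column.

    Non-detects (status 0) hold the detection limit in `values`; they are
    replaced by DL, DL/2, or 0 depending on `method`.
    """
    factors = {"dl": 1.0, "half_dl": 0.5, "zero": 0.0}
    if method not in factors:
        raise ValueError(f"unknown method: {method!r}")
    factor = factors[method]
    return [v if flag == 1 else v * factor for v, flag in zip(values, detected)]


# Made-up data: the first and third entries are non-detects reported at the DL.
arsenic = [4.5, 5.6, 5.0]
d_arsenic = [0, 1, 0]
print(impute_nondetects(arsenic, d_arsenic, "half_dl"))
```

The result is a single column with no detection status, which matches the idea of transforming two columns into one.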
-------
3.1.3 Copy
o When the user clicks Data > Copy, the following window will appear:
[Screenshot: "Select Variable to Copy" window with a "Select a Column to Copy" list, a "Variable
to Copy" box, a "Select Worksheet" box, a "New Column Name" field, a "Select Column" box, and
OK and Cancel buttons]
o This screen allows the user to copy a single variable to a new column.
3.2 Graphing and Statistical Analysis of Univariate Data
o Variables need to be selected to perform statistical analyses.
-------
o When the user clicks on any drop-down menu option (except the Background vs. Site
Comparison option), the following window will appear.
[Screenshot: "Select Variables" window. The Variables list contains:
Name       ID   Count
Arsenic    0    20
Mercury    2    30
Vanadium   4    20
Zinc       5    20
Group      6    20
The window also has a Selected list, a "Group by variable" drop-down list, and OK and Cancel
buttons.]
o The Options button is available in certain menus. The use of this option leads
to a different pop-up window.
o Multiple variables can be processed simultaneously in Scout.
o Moreover, if the user wants to perform a statistical analysis on a variable (e.g., a
contaminant) by a group variable, click on the arrow below "Group by
Variable" to get a drop-down list of the available variables and select an
appropriate group variable. For example, a group variable (e.g., Site Area)
can have alphanumeric values, such as AOC1, AOC2, AOC3, and
Background. Thus, in this example, the group variable, Site Area, takes
4 values: AOC1, AOC2, AOC3, and Background.
o The Group variable is particularly useful when data from two or more samples
need to be compared.
o Any variable can be a group variable. However, for meaningful results, only a
variable that really represents a group variable (categories) should be selected
as a group variable.
-------
o The number of observations in the group variable and the number of
observations in the selected variables (to be used in a statistical procedure)
should be the same. In the example below, the variable, "Mercury," is not
selected because the number of observations for Mercury is 30; in other
words, Mercury values have not been grouped. The group variable, and each
of the selected variables, has 20 data values.
[Screenshot: "Select Variables" window with Arsenic, Vanadium, and Zinc (Count = 20 each) in the
Selected list, Mercury (Count = 30) and Group (Count = 20) remaining in the Variables list, and
the "Group by variable" drop-down list open]
Caution: Care should be taken to avoid misrepresentation and improper use of group
variables. It is recommended not to assign any missing values for the group variable.
More on Group Option
o The group option provides a powerful tool to perform various statistical tests
and methods (including graphical displays) separately for each of the groups
(samples from different populations) that may be present in a data set. For
example, the same data set may consist of samples from various groups
(populations). The graphical displays (e.g., box plots, Q-Q plots) and statistics
of interest can be computed separately for each group by using this option.
o In order to use this option, at least one variable representing the group ID
(alphanumeric characters) should be included in the data set. The various
values of that group variable represent different group categories.
-------
o Note that the number of values (representing group membership) in a group
variable should equal the number of values in the variable (e.g., Arsenic) of
interest that needs to be partitioned into various groups (e.g., monitoring
wells).
o The group column can be any qualitative group ID representing different
species, laboratories, shifts, regions, and so on. For example, in
environmental applications, data for the various groups represent data from
the various site areas (e.g., background, AOC1, AOC2, ...), or from
monitoring wells (e.g., MW1, MW2, ...).
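
The idea of partitioning one variable by a group variable, including the requirement that both have the same number of values, can be sketched in Python as follows. The function names and sample data are hypothetical and serve only to illustrate the bullets above; Scout's own grouping logic is not documented in this section.

```python
import statistics
from collections import defaultdict


def split_by_group(values, groups):
    """Partition `values` by the labels in `groups` (same length required)."""
    if len(values) != len(groups):
        raise ValueError(
            "the variable and the group variable must have the same "
            "number of observations"
        )
    by_group = defaultdict(list)
    for value, label in zip(values, groups):
        by_group[label].append(value)
    return dict(by_group)


def group_means(values, groups):
    """Compute one statistic (here the mean) separately for each group."""
    return {
        label: statistics.mean(members)
        for label, members in split_by_group(values, groups).items()
    }


# Made-up example: arsenic measured in a background area and in AOC1.
arsenic = [4.0, 6.0, 5.0, 7.0]
site_area = ["Background", "AOC1", "Background", "AOC1"]
print(group_means(arsenic, site_area))
```

Any other statistic or graph could be substituted for the mean; the essential step is the split by group label.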
3.2.1 Graphs by Groups
o Individual or multiple graphs (Q-Q plots, box plots, and histograms) can be
displayed on a graph by selecting the "Graphs by Groups" option.
o Individual graphs for each group (specified by the selected group variable) are
produced by selecting the "Individual Graph" option.
o Multiple graphs (e.g., side-by-side box plots, multiple Q-Q plots on the same
graph) are produced by selecting the "Group Graph" option for a variable
categorized by a group variable. Using this "Group Graph" option, multiple
graphs can be displayed for all of the sub-groups included in the Group
variable. This option is useful when data to be compared are given in the
same column and are classified by the group variable.
o Multiple graphs (e.g., side-by-side box plots, multiple Q-Q plots) for selected
variables are produced by selecting the "Group Graph" option. Using the
"Group Graph" option, multiple graphs can be displayed for all selected
variables. This option is useful when data (e.g., lead) to be compared are
given in different columns, perhaps representing different populations.
Note: It is the user's responsibility to provide an adequate amount of detected data to perform the group
operations. For example, if the user desires to produce a graphical Q-Q plot (using only detected data)
with regression lines displayed, then there should be at least two detected points (to compute slope,
intercept, sd) in the data set. Similarly, if graphs are desired for each of the groups specified by the
group ID variable, there should be at least 2 detected observations in each group specified by the group
variable. Scout generates a warning message (in orange) in the lower panel of the Scout screen.
Specifically, the user should make sure that a variable with non-detects and categorized by a group
variable has enough detected data in each group to perform the various methods (e.g., GOF tests,
Q-Q plots with regression lines) incorporated in Scout.
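
The check described in the note, at least two detected observations in every group, can be sketched in Python as below. The function name, the `minimum` parameter, and the sample data are hypothetical; the sketch only mirrors the rule stated in the note and says nothing about how Scout itself performs the check.

```python
from collections import Counter


def groups_lacking_detects(detected, groups, minimum=2):
    """Return the group labels with fewer than `minimum` detected values."""
    counts = Counter({label: 0 for label in groups})  # start every group at 0
    for flag, label in zip(detected, groups):
        if flag == 1:
            counts[label] += 1
    return sorted(label for label, n in counts.items() if n < minimum)


# Made-up detection status (1 = detected, 0 = non-detect) and group labels.
d_arsenic = [1, 0, 1, 1, 0, 0]
group = ["Surface", "Surface", "Surface",
         "Subsurface", "Subsurface", "Subsurface"]
print(groups_lacking_detects(d_arsenic, group))
```

A non-empty result signals that a group-wise Q-Q plot with a regression line cannot be drawn for the listed groups.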
The analyses of data categorized by a group ID variable such as:
1) Surface vs. Subsurface,
2) AOC 1 vs. AOC 2,
3) Site vs. Background, and
4) Upgradient vs. Downgradient monitoring wells, are quite common in many
environmental applications.
3.2.2 Select Variables Screen for Two-Sample Hypothesis Testing
The variables selection screen is different for two-sample hypothesis testing when
compared to single sample hypothesis testing. The "Select Variables" screen is as
shown.
[Screenshot: "Select Variables" window for two-sample hypothesis testing, with a Variables list
(three variables, Count = 25 each), a "Without Group Variable" section (First Sample Set, Second
Sample Set), a "With Group Variable" section (Variable, Group Var, First Sample Set, Second
Sample Set), and Options, OK, and Cancel buttons]
3.2.2.1 Without Group Variable
o The first sample set (e.g., background concentration) and the second sample set
(e.g., site concentration) of variables (e.g., COPC) are selected.
o The "Options" button provides the various options available with the selected
test.
3.2.2.2 With Group Variable
o This option is used when data values of the variable (e.g., COPC) for the first
sample set (e.g., site) and the second sample set (e.g., background) are given in
the same column. The values are separated into different populations (groups) by
the values of an associated group variable. The group variable may represent
several populations (e.g., several AOCs, MWs). The user can compare two groups
at a time by using this option.
o When using this option, the user should select a group variable by clicking the
arrow next to the Group Var option for a drop-down list of available variables.
The user selects an appropriate (meaningful) variable representing groups, such as
Background and AOC. The user is allowed to use letters, numbers, or
alphanumeric labels for the group names. A sample variables selection screen is
shown below.
[Screenshot: two-sample "Select Variables" window with the "With Group Variable" option selected
and Group (Count = 53) chosen as the group variable; the two sample sets are labeled
"Background / Ambient" and "Area of Concern / Site"]
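
The "With Group Variable" selection amounts to pulling two samples out of one column according to two chosen group labels. A minimal Python sketch of that step, with a hypothetical function name and made-up data, is shown below; it illustrates the selection only and does not perform any hypothesis test.

```python
def two_samples(values, groups, first_label, second_label):
    """Extract the two sample sets identified by two group labels."""
    if len(values) != len(groups):
        raise ValueError("values and groups must have the same length")
    first = [v for v, g in zip(values, groups) if g == first_label]
    second = [v for v, g in zip(values, groups) if g == second_label]
    if not first or not second:
        raise ValueError("each selected group needs at least one observation")
    return first, second


# Made-up COPC column with three groups; only two are compared at a time.
copc = [1.2, 3.4, 0.9, 2.8, 5.1]
area = ["Background", "AOC1", "Background", "AOC2", "AOC1"]
background, site = two_samples(copc, area, "Background", "AOC1")
print(background, site)
```

Observations belonging to the remaining groups (here AOC2) are simply left out of the comparison.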
-------
3.3 Regression Menu
When the Regression Menu is clicked on, the following window pops up.
[Screenshot: "Select Regression Variables" window with a Variables list, "Selected Dependent
Variable" and "Selected Independent Variables" boxes, a Group drop-down list, and Graphics,
Options, OK, and Cancel buttons]
o Both dependent and independent variables need to be selected.
o The use of the "Options" button leads to a new options window. The methods on the
regression drop-down menu have different "Options" and "Graphics" screens.
They are discussed in Chapter 8.
o Grouping works in the same way as for univariate data.
-------
An example of the selected screen is shown below.
[Screenshot: "Select Regression Variables" window with y (ID 1, Count = 75) as the selected
dependent variable and the variables with IDs 2, 3, and 4 (Count = 75 each) as the selected
independent variables]
3.4 Multivariate Outliers and PCA Menu
For multivariate outliers or multivariate PCA, the following "Select Variables"
screen appears:
[Screenshot: "Select Variables" window with a Variables list (y, x1, x2, x3; Count = 75 each), a
Selected list, a "Group by Variable" drop-down list, and Options, Graphics, OK, and Cancel
buttons]
-------
The variables that are to be considered for the analyses are selected, and the
"Options" button may be clicked to select from the various options available.
Those options are discussed in Chapters 7 and 9.
A "Graphics" button is provided for Robust/Iterative methods and Principal
Component Analysis methods as shown below. Those options are discussed in
Chapters 7 and 9.
[Screenshot: "Select Variables" window showing the Options and Graphics buttons]
-------
3.5 Multivariate Discriminant Analysis Menu
o When the Multivariate EDA > Discriminant Analysis is clicked on, the following
window appears.
[Screen: "Linear Discriminant Analysis" window with a "Group by Variable" drop-down list; a "Variables" list (Count, y, x1, x2, and x3; Count 75 each); a "Selected Matrix Columns" list; "Prior Probability" options (Equal, Estimated, User Supplied); "Scores Storage" options (No Storage, Same Worksheet, New Worksheet); and "Graphics," "Options," and "Cancel" buttons.]
o There should be a group column specifying the various groups present.
o The group variable is selected from the "Group by Variable" drop-down bar.
o The various variables required for the analysis are then selected.
o If the prior probabilities are supplied by the user, then a column should exist in
the worksheet for the prior probabilities, and the probabilities can be selected
from the "Select Group Priors Column" drop-down bar.
-------
° An example is illustrated below.
[Screen: "Linear Discriminant Analysis Classical Method" window. "Group by Variable" is set to count (Count = 150); the selected matrix columns are sp-length, sp-width, pt-length, and pt-width (Count 150 each); "Prior Probability" is set to "User Supplied," with "Select Group Priors Column" set to Priors (Count = 3); "Scores Storage" offers "No Storage," "Same Worksheet," and "New Worksheet."]
Note: The Prior Probability box is not available for the Fisher Discriminant Analysis since equal priors
are assumed.
52
-------
Chapter 4
Data
Scout provides the user with an array of options to modify the given data, both without
non-detects and with non-detects. The various options include:
° Copy: copies data from one column to another.
o Generate: generates univariate and multivariate data.
° Impute: generates estimated data for non-detect observations.
o Missing: handles missing observations.
o Transform: transforms data without non-detects using mathematical functions.
4.1 Copy
1. Click Data > Copy.
2. The "Select Variable to Copy" screen (Section 3.1.3) will appear. Also, see the
example screens shown below.
o A single variable is selected, and that variable appears in the "Variable to
Copy" box.
o The user selects the worksheet in which to store the copied data using the
"New Worksheet" or "Other Worksheets" option. If the "New
Worksheet" option is selected, then the data is copied onto the new
worksheet. If the "Other Worksheets" option is selected, a set of
available worksheets is displayed, and the columns available in the
selected worksheet appear in the "Select Column" box. The user has to
specify a name for the new column.
-------
° Examples for the selections using "New Worksheet" and "Other
Worksheet" are shown below.
[Screen: "Select a Column to Copy" window with "New Worksheet" selected. The variables Aroclor1254 (ID 0, Count 53) and Aroclor_Without_NonD (ID 2, Count 44) are listed; the new worksheet filename is "NewFileName," the new column name is "CopiedColumn," and the "Select Column" box lists the available columns.]
[Screen: "Select a Column to Copy" window with "Other Worksheets" selected. The available worksheets (BRADU, censor-by-grps1, and Aroclor 1254) are listed; the new column name is "CopiedColumn," and the "Select Column" box lists the available columns of the selected worksheet.]
-------
4.2 Generate
The Generate option generates univariate uniform, normal, gamma and lognormal
distributed random numbers, and also multivariate normal data.
4.2.1 Univariate
1. Click Data > Generate > Univariate.
2. Random numbers can be generated from four different distributions:
o Uniform distribution: input parameters are "a" (lower limit) and "b"
(upper limit).
o Normal distribution: input parameters are "Mu" (mean) and "Sigma"
(standard deviation) of the raw data.
o Gamma distribution: input parameters are "Alpha" (scale parameter) and
"Beta" (shape parameter).
o Lognormal distribution: input parameters are "Mu" (mean) and "Sigma"
(standard deviation) of the data in log-transformed space (logged data).
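For readers who want to reproduce this kind of data outside Scout, the four distributions can be sketched with Python's standard library. This is an illustration only, not Scout's generator: the function name generate_univariate and its keyword names are hypothetical. Note that Python's random.gammavariate takes the shape parameter first and the scale parameter second, so the mapping to the "Alpha" and "Beta" fields of the Scout dialog should be checked before comparing results.

```python
import random

def generate_univariate(dist, n=20, seed=None, **params):
    """Return n random numbers drawn from the named distribution."""
    rng = random.Random(seed)
    if dist == "uniform":
        # a = lower limit, b = upper limit
        return [rng.uniform(params["a"], params["b"]) for _ in range(n)]
    if dist == "normal":
        # mu = mean, sigma = standard deviation of the raw data
        return [rng.gauss(params["mu"], params["sigma"]) for _ in range(n)]
    if dist == "gamma":
        # Python's gammavariate takes (shape, scale), in that order.
        return [rng.gammavariate(params["shape"], params["scale"]) for _ in range(n)]
    if dist == "lognormal":
        # mu and sigma describe the logged data, as in the list above.
        return [rng.lognormvariate(params["mu"], params["sigma"]) for _ in range(n)]
    raise ValueError(f"unknown distribution: {dist}")

# 20 observations with mean 0 and standard deviation 1 (the dialog defaults).
normal_sample = generate_univariate("normal", n=20, seed=1, mu=0.0, sigma=1.0)
uniform_sample = generate_univariate("uniform", n=5, seed=1, a=2.0, b=3.0)
```

Fixing the seed makes a run repeatable, which helps when the generated column is later used to test other modules.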
55
-------
An example for the normal distribution is illustrated.
o Click Data > Generate > Univariate > Normal.
[Screen: univariate normal generation window with boxes for "Number of Observations" (20), "Mu (Mean)" (0), "Sigma (Stdv)," and "Name of New Column," and "Select Worksheet" options ("New Worksheet," with a "Select New Worksheet Filename" box, or "Other Worksheets").]
o Specify the number of observations required. The default is "20."
o Specify "Mu" (mean) and "Sigma" (standard deviation). The
defaults are "0" and "1," respectively.
o Specify the name of the new column.
o Select the worksheet into which the new data is to be generated.
-------
Click "OK" to continue or "Cancel" to cancel the Generate option.
[Screen: the same window filled in, with "Sigma (Stdv)" set to 0.5, "Name of New Column" set to "RandomNumbers," and the new worksheet filename set to "NormalData."]
-------
Output Screen for Univariate Normal Data.
[Screen: Scout 4.0 worksheet "NormalData" with the generated values in the column "RandomNumbers."]
The new worksheet has been named "NormalData," as seen in the Navigation Panel.
4.2.2 Multivariate
1. Click Data > Generate > Multivariate > Normal.
-------
[Screen: "Generate Multinormal" window with an "Available Columns" list, a "Select Mean Vector Column" drop-down list, a "Number of Observations" box, a "Covariance S Matrix" list, and "Select Worksheet" options ("New Worksheet," with a "Select New Worksheet Filename" box, or "Other Worksheets").]
Note: In order to use this option, the user should make sure that there is a column for the mean vector and
p columns for the variance-covariance matrix, where p is the number of variables in the matrix.
o The mean vector is chosen from the "Select Mean Vector
Column" drop-down bar, and the columns representing the
columns of the variance-covariance matrix are chosen for the
"Covariance S Matrix."
o The selected worksheet is the worksheet where the newly
generated data will be stored. The generated data can then be
used in various other modules of Scout or in other software.
o If "New Worksheet" is selected, then a name for the
worksheet has to be specified.
o Click "OK" to continue or "Cancel" to cancel the Generate option.
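The inputs described above (one mean vector column and p covariance matrix columns) are enough to draw multivariate normal data. The sketch below shows one common way to do this, via a Cholesky factor of the covariance matrix. It is not taken from Scout; the function names and the numeric values of the mean and covariance are hypothetical.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L * L^T = cov (cov must be symmetric positive definite)."""
    p = len(cov)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def generate_multinormal(mean, cov_columns, n, seed=None):
    """mean: list of p means; cov_columns: p columns of the p x p covariance matrix."""
    p = len(mean)
    # Rebuild the matrix row by row from its columns.
    cov = [[cov_columns[j][i] for j in range(p)] for i in range(p)]
    L = cholesky(cov)
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # independent standard normals
        rows.append([mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                     for i in range(p)])
    return rows

# Hypothetical inputs: two variables, ten observations.
rows = generate_multinormal([10.0, 15.0], [[2.0, 0.6], [0.6, 3.0]], n=10, seed=1)
```

Each returned row is one observation; transposing the rows gives one generated column per variable, like the MN_0 and MN_1 columns in the output worksheet.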
[Screen: "Generate Multinormal" window filled in, with Mean (Count = 2) as the mean vector column, Std Dev 1 and Std Dev 2 as the covariance matrix columns, and "Other Worksheets" selected.]
-------
Output Screen for Multivariate Normal Data.
[Screen: Scout 4.0 worksheet showing the input columns Mean, Std Dev1, and Std Dev2 together with the generated columns MN_0 and MN_1.]
4.3 Impute (NDs)
Data sets with non-detect observations are transformed using the Impute option. Various
options are available to impute (estimate or extrapolate) the non-detect observations. The
use of this option generates additional columns consisting of all of the extrapolated non-
detects and detected observations. Those columns can be appended to any of the
existing open spreadsheets or placed in a new worksheet.
1. Click Data > Impute (NDs).
2. The "Select Variable to Impute" screen (see Section 3.1.2 and the screen below)
appears. The various options available are:
o Detection Limit: the non-detect observations are given the values of the
detection limit.
o 1/2 Detection Limit: the non-detect observations are given the values of
one-half of the detection limit.
o Zero: the non-detect observations are given zero values.
o Normal ROS: Regression on Order Statistics (ROS) is used to extrapolate
the non-detect observations using a normal model.
o Gamma ROS: Regression on Order Statistics (ROS) is used to extrapolate
the non-detect observations using a gamma model.
o Lognormal ROS: Regression on Order Statistics (ROS) is used to
extrapolate non-detect observations using a lognormal model.
o Uniform: each non-detect observation is given a random value from a uniform
distribution with zero as the lower limit and the detection limit as the
upper limit.
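The substitution options in the list above (Detection Limit, 1/2 Detection Limit, Zero, and Uniform) are simple enough to sketch in a few lines. The example below is an illustration under stated assumptions, not Scout's code: the function name impute_nds and the method keywords are hypothetical, and the three ROS options are not shown because they require fitting a regression on order statistics.

```python
import random

def impute_nds(values, detected, method, seed=None):
    """values[i] is the measurement, or the detection limit when detected[i] is False."""
    rng = random.Random(seed)
    imputed = []
    for value, is_detect in zip(values, detected):
        if is_detect:
            imputed.append(value)                    # detected values are kept as-is
        elif method == "dl":
            imputed.append(value)                    # ND replaced by the detection limit
        elif method == "half_dl":
            imputed.append(value / 2.0)              # ND replaced by 1/2 detection limit
        elif method == "zero":
            imputed.append(0.0)                      # ND replaced by zero
        elif method == "uniform":
            imputed.append(rng.uniform(0.0, value))  # random value between 0 and the DL
        else:
            raise ValueError(f"unknown method: {method}")
    return imputed

# Hypothetical data: two detects (5.0, 7.0) and two NDs with a detection limit of 2.0.
values = [5.0, 2.0, 7.0, 2.0]
detected = [True, False, True, False]
half = impute_nds(values, detected, "half_dl")
```

The result has the same length and order as the input, which matches the description of a new column holding both the detected and the imputed observations.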
3. An example for the Normal ROS is illustrated.
o Click Data > Impute (NDs).
o In the "Select Variable To Impute" screen, the following options
are selected.
o Select the method to replace NDs ("Select NDs Replacement"),
the variable to transform, the New Column Name, and the
worksheet.
° Click "OK" to continue or "Cancel" to cancel the impute option.
61
-------
Output Screen for Impute using Normal ROS.
[Screen: Scout 4.0 worksheet for the data file censor-by-grps1.]
4.4 Missing
Scout has three methods to handle missing observations. The first method replaces the
missing observations with the mean of the data, the second method replaces the missing
observations with the median of the data, and the third method removes the rows with
missing observations. A new column is created for the selected variable using the
selected option. This new column can be added to a new worksheet or an existing
worksheet. Note that observations with the values 1E-31 or 1E+31 are considered to be
missing.
1. Click Data > Missing > Replace Missing with Median.
[Screen: Data menu with the missing-data submenu entries "Replace Missing with Mean," "Replace Missing with Median," and "Remove Rows with Missing Data."]
-------
2. The following screen appears:
[Screen: variable selection window with a "Variables" list, a "Selected" list, and "Select Worksheet" options ("New Worksheet," with a "Select New Worksheet Filename" box, or "Other Worksheets").]
o Select the variable to modify ("Variables").
o Specify whether the new column should be added to a "New
Worksheet" or to existing "Other Worksheets" (under "Select
Worksheet").
o Click "OK" to continue or "Cancel" to cancel the missing option.
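The three ways of handling missing observations described in this section can be sketched as follows. This is a minimal illustration, assuming that missing values are stored exactly as 1E-31 or 1E+31; the function names and the sample column are hypothetical and do not come from Scout.

```python
import statistics

MISSING_CODES = (1e-31, 1e31)  # values the guide says are treated as missing

def is_missing(x):
    return x in MISSING_CODES

def replace_missing(column, how="median"):
    """Return a new column with missing values replaced by the mean or median."""
    observed = [x for x in column if not is_missing(x)]
    if how == "mean":
        fill = statistics.mean(observed)
    elif how == "median":
        fill = statistics.median(observed)
    else:
        raise ValueError(f"unknown option: {how}")
    return [fill if is_missing(x) else x for x in column]

def remove_rows_with_missing(rows):
    """Drop every row (a list of values) that contains a missing value."""
    return [row for row in rows if not any(is_missing(x) for x in row)]

data = [3.0, 5.0, 1e31, 4.0, 8.0]          # hypothetical column with one missing value
m_data = replace_missing(data, "median")   # the median of 3, 5, 4, 8 is 4.5
```

The mean and median are computed from the non-missing observations only, so the replacement does not depend on the placeholder value itself.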
Output Screen for Missing (Replace Missing with Median).
[Screen: worksheet with the original column "Data" and the new column "m_Data" for 15 observations. The missing value 1.0000E+031 in row 13 of "Data" is replaced by the median, 4, in "m_Data."]
-------
4.5 Transform (No NDs)
Scout offers a number of options to transform the variables without non-detects:
o z-transform: standardizes the variable; i.e., the mean of the observations is
subtracted and the result is divided by the standard deviation.
o Linear (ax + b): gives a linear transformation of x. The values of "a" and "b" are
entered by the user.
o Natural Log: gives the natural logarithm transform of the variable.
o Log Base 10: gives the logarithm to the base 10 transform of the variable.
o Exp(x): gives the exponential transformation of the variable.
o Pow(x, a): gives the value of the variable "x" raised to the power "a."
o Box-Cox: gives the Box-Cox transformation of the variable; i.e., $(x^{a} - 1)/a$. The
value of "a" is entered by the user.
o Ranked: gives the order number of the observations in the variable after sorting.
o Ordered: sorts the data in ascending order.
o Rankit: gives the expected values of the order statistics of the standard normal
distribution corresponding to the data points in a manner determined by the order
in which the data points appear.
o Arcsine: gives the arc-sine value of the observations in the selected variable.
o Group Items: this option is used in conjunction with the Discriminant Analysis
for data sets with groups. This option outputs the group names in a sorted order
in the selected column. This option is useful when the user wants to input the
values of prior probabilities for the groups.
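Most of the transforms in the list above have a direct one-line definition, sketched below with Python's standard library. This is an illustration, not Scout's implementation: the function name transform and its keywords are hypothetical, the z-transform is assumed to use the sample standard deviation (n - 1 denominator), and the Box-Cox case a = 0 is assumed to be the natural logarithm, which is the usual limiting form. Ranked, Rankit, and Group Items are left out because their exact conventions are not spelled out here. The sample column holds the m_Data values shown in this chapter's output screens.

```python
import math
import statistics

def transform(x, kind, a=None, b=None):
    """Apply one of the transforms from the list to a column x."""
    if kind == "z":
        m, s = statistics.mean(x), statistics.stdev(x)  # stdev divides by n - 1
        return [(v - m) / s for v in x]
    if kind == "linear":
        return [a * v + b for v in x]
    if kind == "ln":
        return [math.log(v) for v in x]
    if kind == "log10":
        return [math.log10(v) for v in x]
    if kind == "exp":
        return [math.exp(v) for v in x]
    if kind == "pow":
        return [v ** a for v in x]
    if kind == "box_cox":
        # (x^a - 1) / a; the a = 0 case is taken as ln(x), the usual limit.
        return [math.log(v) if a == 0 else (v ** a - 1) / a for v in x]
    if kind == "ordered":
        return sorted(x)
    if kind == "arcsine":
        return [math.asin(v) for v in x]
    raise ValueError(f"unsupported transform: {kind}")

column = [3, 5, 0.6, 0.8, 4, 8, 9, 4, 4, 1, 1, 3, 4, 4, 5]
z = transform(column, "z")
```

A z-transformed column has mean 0 and sample standard deviation 1, which gives a quick sanity check on the result.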
1. Click Data > Transform (No NDs).
-------
2. The "Select Transform Variable" screen (see also Section 3.1.1) appears.
o Specify the transform to apply ("Select Transform").
o Specify a variable to transform ("Select a Variable to
Transform").
o Specify whether the new column should be added to a "New
Worksheet" or to existing "Other Worksheets" (under "Select
Worksheet"); then, enter a name for the transformed variable
(under "New Column Name").
o Click "OK" to continue or "Cancel" to cancel the Transform option.
[Screen: "Select Transform Variable" window with the "Select Transform" options (Z-Transform, Linear (ax + b), Natural Log, Log Base 10, Exp(x), Pow(x, a), Box-Cox, Ranked, Ordered, Rankit, ArcSine, and Group Items), the "Select a Variable to Transform" list, the "Select Worksheet" options, a "New Column Name" box (Z_Transform), and a "Select Column" list.]
-------
Output Screen for Transform (No NDs).
Selected options: z - transform and Ranked.
[Screen: worksheet "WorkSheet_a" with the columns Data, m_Data, Z_Transform, and Ranked for 15 observations.]
4.6 Expand Data
Scout allows the user to generate interaction terms from the available variables. This
part of the Scout program was developed so that the user can generate interaction terms
for regression analysis. The highest power supported by Scout is 10. However, the user is
cautioned that the maximum number of interaction terms supported by Scout is 256. If
more than 256 terms are generated, then those terms will not be displayed on the
worksheet. The user is also cautioned that generating interaction terms of high degree
takes up considerable computer resources and computing time.
66
-------
1. Click Data > Expand Data.
2. The following "Select Variables for Expansion" screen appears.
[Screen: "Select Variables for Expansion" window. Stack-Loss remains in the "Variables" list; Air-Flow, Temp, and Acid-Conc (Count 21 each) are in "Variables to Expand." The window also provides an "Expand Selected Variables to this Power" box, the options "Add columns to current worksheet" and "Place expansion in a new worksheet," a "Select Filename for new worksheet" box (New Sheet for Data Expansion), a "Copy Dependent Column to new worksheet" list (Stack-Loss (Count = 21)), a "Copy Group column to new worksheet" check box, and "OK" and "Cancel" buttons.]
o Specify the variables to expand ("Variables to Expand").
o Specify the power/degree ("Expand Selected Variables to this
Power").
o Specify whether the new columns should be added to the current
worksheet or placed in a new worksheet (under "Select where to place
the expanded columns"); for a new worksheet, enter a filename
(under "Select Filename for new worksheet").
o If the new worksheet option is selected, specify whether the dependent
variable used in regression should be copied to the new worksheet.
o If the new worksheet option is selected, specify whether the group column
should be copied to the new worksheet.
o Click "OK" to continue or "Cancel" to cancel this option.
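The expansion itself can be pictured as forming every product of the selected variables up to the chosen power and labeling each variable with a letter. The sketch below is one way to do that, assuming that all terms of degree 2 through the selected power are produced; that assumption, the function name expand, and the letter labeling scheme are illustrative and not confirmed details of Scout. The single data row (Air-Flow 80, Temp 27, Acid-Conc 89) is the first observation of the STACKLOSS data used in this chapter.

```python
from itertools import combinations_with_replacement
from math import prod
from string import ascii_uppercase

def expand(columns, power):
    """columns: dict of variable name -> list of values. Returns (terms, legend)."""
    names = list(columns)
    letters = ascii_uppercase[:len(names)]
    legend = dict(zip(letters, names))           # e.g. "A" -> "Air-Flow"
    n_rows = len(columns[names[0]])
    terms = {}
    for degree in range(2, power + 1):
        for combo in combinations_with_replacement(range(len(names)), degree):
            label = "".join(letters[i] for i in combo)   # e.g. "AB"
            terms[label] = [prod(columns[names[i]][r] for i in combo)
                            for r in range(n_rows)]
    return terms, legend

columns = {"Air-Flow": [80], "Temp": [27], "Acid-Conc": [89]}
terms, legend = expand(columns, power=2)
# terms holds AA, AB, AC, BB, BC, CC; for this row AB = 80 * 27 = 2160.
```

The number of terms grows quickly with the power and the number of variables, so checking len(terms) against the 256-term limit before expanding a large worksheet is a cheap safeguard.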
67
-------
[Screen: Scout 2008 worksheet "New Sheet for Data Expansion" with the columns Stack-Loss, AA, AB, AC, BB, BC, and CC for the 21 observations. For example, the first row reads 42, 6,400, 2,160, 7,120, 729, 2,403, and 7,921.]
Note: A second output sheet called "Expansion.ost" will be generated. This output sheet will indicate what
the variables in the column header stand for in the interaction terms.
Expansion Legend
Date/Time of Computation    10/29/2008 12:49:41 PM
From File                   D:\Narain\WorkDatInExcel\STACKLOSS
To New Worksheet            New Sheet for Data Expansion
Expanded to the             2nd Power
Representation    Actual Variable Name
"A"               Air-Flow
"B"               Temp
"C"               Acid-Conc
-------
4.7 Benford's Analysis
Benford's law (see the separate pdf file of Appendix C for details), less commonly known as
Newcomb's law, the first digit law, the first digit phenomenon, and the leading digit
phenomenon, was discovered independently, first by Simon Newcomb (1881) and then
by Frank Benford (1938). Each noticed that the pages at the beginning of books of
logarithm tables were "dirtier" (due to use) than those at the end, and noted that some
particular first digits should occur with a greater "natural" frequency.
Newcomb's form of the law is given as
\[ p(d_1(i) = i) = \log_{10}\left(1 + \frac{1}{d_1(i)}\right), \quad i = 1, 2, 3, \ldots, 9 \]
and the equivalent Benford's form of the law is given as
\[ p(d_1(i) = i) = \log_{10}\left(\frac{d_1(i) + 1}{d_1(i)}\right), \quad i = 1, 2, 3, \ldots, 9 \]
where $p(d_1(i) = i)$ is the probability that the first-place, $j = 1$ ($j = 1, 2, 3, \ldots, n$), significant
non-zero integer digit, $d_j(i) = d_1(i)$, of a number, $N$, has a particular integer value, $i$.
Those logarithmically distributed significant digits can be calculated and summarized as
First Place Digit Integer, d_1(i)    Probability of Occurrence, p(d_1(i) = i)
(i = 1, 2, 3, ..., 9)
1                                    0.30103
2                                    0.17609
3                                    0.12494
4                                    0.09691
5                                    0.07918
6                                    0.06695
7                                    0.05799
8                                    0.05115
9                                    0.04576
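The probabilities in the table follow directly from the formula, and the same idea extends to the second digit. The sketch below computes both sets of expected probabilities and tallies the observed digit frequencies of a data column. It is an illustration rather than Scout's implementation: the function names are hypothetical, and the second-digit formula used here is the standard generalization of the law (summing over the possible first digits), which this chapter does not state explicitly.

```python
import math
from collections import Counter

def first_digit_expected():
    """Expected probability of each first significant digit 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def second_digit_expected():
    """Expected probability of each second significant digit 0..9."""
    return {d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
            for d in range(10)}

def significant_digit(x, place):
    """Return the first (place=1) or second (place=2) significant digit of x."""
    mantissa = f"{abs(x):.14e}"   # e.g. '3.14159000000000e+02'
    return int(mantissa[0] if place == 1 else mantissa[2])

def observed_frequencies(data, place):
    """Relative frequency of each digit 0..9 at the given place (zeros are skipped)."""
    values = [x for x in data if x != 0]
    counts = Counter(significant_digit(x, place) for x in values)
    return {d: counts.get(d, 0) / len(values) for d in range(10)}

expected_first = first_digit_expected()
expected_second = second_digit_expected()
```

Comparing observed_frequencies(column, 1) with expected_first gives the "Actual" and "Expected" rows of a first digit analysis; the first-digit probabilities sum to 1 because the logarithms telescope to log10(10).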
1. Click Data > Benford's Analysis.
-------
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
o If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
an appropriate variable representing a group variable.
o Click "OK" to continue or "Cancel" to cancel Benford's analysis.
Output example: The data set "RandomData2500.xls" was used. The results of the
first digit analysis and the second digit analysis were computed.
Output for Benford's Analysis.
Benford Analysis
User Selected Options
Date/Time of Computation    1/30/2008 5:53:14 PM
From File                   D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\RandomData2500.xls
Full Precision              OFF

MN_0
Number of Valid Observations       2500
Number of Distinct Observations    2500

Benford's First Digit Analysis
Digit       0        1        2        3        4        5        6        7        8        9
Expected    0.00000  0.30103  0.17609  0.12494  0.09691  0.07918  0.06695  0.05799  0.05115  0.04576
Actual      0.00000  0.40280  0.20040  0.07920  0.05080  0.05480  0.05600  0.05360  0.05040  0.05200

Benford's Second Digit Analysis
Digit       0        1        2        3        4        5        6        7        8        9
Expected    0.11968  0.11389  0.10882  0.10433  0.10031  0.09668  0.09337  0.09035  0.08757  0.08500
Actual      0.11760  0.11400  0.12520  0.10520  0.10640  0.09640  0.09200  0.08080  0.08920  0.07320
-------
References
F. Benford, "The Law of Anomalous Numbers." Proceedings of the American
Philosophical Society, 78, 551-572 (1938).
ProUCL 4.00.04. (2009). "ProUCL Version 4.00.04 Technical Guide." The software
ProUCL 4.00.04 can be downloaded from web site at:
http://www.epa.gov/esd/tsc/software.htm.
ProUCL 4.00.04. (2009). "ProUCL Version 4.00.04 User Guide." The software
ProUCL 4.00.04 can be downloaded from the web site at:
http://www.epa.gov/esd/tsc/software.htm.
S. Newcomb, "Note on the Frequency of Use of the Different Digits in Natural
Numbers," American Journal of Mathematics, 4, 39-40 (1881).
71
-------
Chapter 5
Graphs
The Graphs option provides graphical displays for both univariate and multivariate data.
[Screen: Graphs menu with the "Univariate" and "Scatter Plots" submenus.]
5.1 Univariate Graphs
Three commonly used graphical displays are available under the Univariate Graph
Option:
o Box Plots
o Histogram
o Multi-Q-Q
o The box plots and multiple Q-Q plots can be used for full data sets without non-
detects and also for data sets with non-detect values.
o Three options are available to draw Q-Q plots with non-detect (ND) observations.
Specifically, Q-Q plots are displayed only for detected values, with NDs replaced
by 1/2 detection limit (DL) values, or with NDs replaced by the respective
detection limits. The statistics displayed on a Q-Q plot (mean, sd, slope, and
intercept) are computed according to the method used. The NDs are displayed
in a smaller font and in red.
o Scout can display box plots for data sets with NDs. This kind of graph may not
be very useful if many NDs are present in the data set.
o A few choices are available to construct box plots for data sets with NDs.
For example, all non-detects below the largest detection limit (DL) and
the portion of the box plot below the largest DL are not shown on the box
plot. A horizontal line is displayed at the largest detection limit level.
o Alternatively, Scout constructs a box plot using all of the detected and
non-detect (using DL values) values. Scout shows the full box plot; however,
a horizontal line is displayed at the largest detection limit.
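To make the three Q-Q plot options for NDs concrete, the sketch below prepares the data under each option and computes the four statistics mentioned above (mean, sd, slope, and intercept) for a normal Q-Q plot. It is a rough illustration under stated assumptions: the plotting positions (i - 0.375)/(n + 0.25) and the least-squares fit are common choices but are not confirmed as the ones Scout uses, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values, detected, nd_option):
    """values[i] is the measurement, or the detection limit when detected[i] is False."""
    if nd_option == "detects_only":
        data = [v for v, d in zip(values, detected) if d]
    elif nd_option == "half_dl":
        data = [v if d else v / 2.0 for v, d in zip(values, detected)]
    elif nd_option == "dl":
        data = list(values)
    else:
        raise ValueError(f"unknown option: {nd_option}")
    data.sort()
    n = len(data)
    # Theoretical standard normal quantiles at assumed plotting positions.
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return quantiles, data

def qq_statistics(quantiles, data):
    """Mean and sd of the data, plus slope and intercept of the fitted Q-Q line."""
    q_mean, d_mean = mean(quantiles), mean(data)
    sxx = sum((q - q_mean) ** 2 for q in quantiles)
    sxy = sum((q - q_mean) * (v - d_mean) for q, v in zip(quantiles, data))
    slope = sxy / sxx
    return {"mean": d_mean, "sd": stdev(data),
            "slope": slope, "intercept": d_mean - slope * q_mean}

values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]              # hypothetical data
detected = [True, True, False, True, False, True]
q, data = qq_points(values, detected, "half_dl")
stats = qq_statistics(q, data)
```

Because the theoretical quantiles are symmetric about zero, the fitted intercept equals the sample mean, and the slope serves as an estimate of the standard deviation.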
73
-------
o When multiple variables are selected, one can choose to: 1) produce multiple
graphs on the same display by choosing the "Group Graphs" variable option, or
2) produce "Individual Graphs" for each selected variable.
o The "Graph by Group" variable option produces side-by-side box plots, multiple
Q-Q plots, or histograms for the groups of the selected variables representing
samples obtained from multiple populations (groups). Those multiple graphs are
particularly useful to perform two (background vs. site) or more sample visual
comparisons.
o Additionally, the box plot has an optional feature which can be used to
draw lines at statistical limits (e.g., upper limits of background data set)
computed from one population on the box plot obtained using the data
from another population (e.g., a site area of concern). This type of box
plot represents a useful visual comparison of site data with background
threshold values (background upper limits).
o Up to four (4) statistics can be added to a box plot. If the user inputs a
value in the value column, then the check box in that row will get
activated. For example, the user may want to draw horizontal lines at 80th
percentile, 90th percentile, 95th percentile, or a 95% UPL on a box plot.
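As a small illustration of the kind of values a user might enter in the "Label" and "Value" boxes, the sketch below computes background percentiles that could be drawn as horizontal lines on a site box plot. The percentile method (the "inclusive" method of Python's statistics.quantiles) is an assumption, since the guide does not say how Scout defines percentiles here, and a UPL is not computed. The function name and the data are hypothetical.

```python
import statistics

def box_plot_lines(background, percentiles=(80, 90, 95)):
    """Return up to four (label, value) pairs for lines to draw on a box plot."""
    if len(percentiles) > 4:
        raise ValueError("at most four statistics can be added to a box plot")
    # 99 cut points; cuts[p - 1] is the p-th percentile.
    cuts = statistics.quantiles(background, n=100, method="inclusive")
    return [(f"{p}th percentile", cuts[p - 1]) for p in percentiles]

background = list(range(1, 102))   # hypothetical background data: 1, 2, ..., 101
lines = box_plot_lines(background)
```

Each pair maps to one row of the options window: the label goes in the "Label" box and the number in the "Value" box.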
5.1.1 Box Plots
1. Click Graphs > Univariate > No NDs or With NDs > Box Plot.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
° If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select an
appropriate variable representing a group variable.
74
-------
o When the "Options" button is clicked, the following window appears.
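[Screen: box plot options window with "Graph by Groups" options ("Individual Graphs" or "Group Graphs"), four numbered rows of "Label" and "Value" boxes with check boxes, "Graphical Display Options" ("Color Gradient" or "For Export (BW Printers)"), and "OK" and "Cancel" buttons.]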
o The default option for "Graph by Groups" is "Individual Graphs."
This option will produce one graph for each selected variable. If you
want to put all the selected variables into a single graph, then select the
"Group Graphs" option. This group graphs option is used when
multiple graphs categorized by a group variable have to be produced
on the same graph.
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to use and import graphs in black and white
into a document or report, then check the radio button next to "For
Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the Box Plot.
75
-------
Box Plot Output Screen (Single Graph).
Selected options: Label (Background UPL), Value (103.85), Individual Graphs, and Color Gradient.
Box Plot Output Screen (Group Graphs).
Selected options: Group Graphs and Color Gradient.
76
-------
5.1.2 Histograms
5.1.2.1 No NDs
1. Click Graphs > Univariate > No NDs > Histograms.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
° If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
an appropriate variable representing a group variable.
° When the "Options" button is clicked, the following window appears.
[Screen: histogram options window with "Graph by Groups" options ("Individual Graphs" or "Group Graphs"), "Graphical Display Options" ("Color Gradient" or "For Export (BW Printers)"), a "Select Number of Bins" box (10), and "OK" and "Cancel" buttons.]
o The default selection for "Graph by Groups" is "Individual
Graphs." This option produces a histogram (or other graph)
separately for each selected variable. If multiple graphs or graphs by
groups are desired, then check the radio button next to "Group
Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to use and import graphs in black and white
into a document or report, then check the radio button next to "For
Export (BW Printers)."
o Specify the number of bins for the selected variable in "Select
Number of Bins" text box. The default is "10."
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the Histogram.
Histogram Output Screen.
Selected options: Group Graphs and Color Gradient.
[Figure: "Histograms for X (1), X (2), X (3)."]
-------
5.1.2.2 With NDs
1. Click Graphs > Univariate > With NDs > Histograms.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
° If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
an appropriate variable representing a group variable.
o When the "Options" button is clicked, the following window appears.
[Screen: histogram options window with "Display Non-Detects" options ("Do not Use Non-Detects," "Use Non-Detect Values," or "Use 1/2 Non-Detect Values"), "Graph by Groups" options ("Individual Graphs" or "Group Graphs"), "Graphical Display Options" ("Color Gradient" or "For Export (BW Printers)"), a "Select Number of Bins" box (10), and "OK" and "Cancel" buttons.]
-------
o Specify the "Use Non-detects" option. The default is "Do not Use
Non-detects."
Do not Use Non-detects: Selection of this option excludes the non-detects
and uses only detected values on the associated histogram.
Use Non-detect Values: Selection of this option treats detection limits as
detected values and uses those detection limits and detected values on the
histogram.
Use 1/2 Non-detect Values: Selection of this option replaces the detection
limits with their half values, and uses the halved detection limits and detected
values on the histogram.
o The default selection for "Graph by Groups" is "Individual
Graphs." This option produces a histogram (or other graphs)
separately for each selected variable. If multiple graphs or graphs by
groups are desired, then check the radio button next to "Group
Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to use and import graphs in black and white
into a document or report, then check the radio button next to "For
Export (BW Printers)."
o Specify the number of bins for the selected variable in the "Select
Number of Bins" text box. The default is "10."
o Click "OK" to continue or "Cancel" to cancel the option.
Click "OK" to continue or "Cancel" to cancel the Histogram.
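The three non-detect options described above can be sketched as a small function. This is an illustration of the stated behavior, not Scout's code; the function name `apply_nd_option` and the option labels `"exclude"`, `"use_dl"`, and `"half_dl"` are hypothetical, and the data layout (a value column plus a detected/non-detected flag) is an assumption.

```python
def apply_nd_option(values, detected, option="exclude"):
    """Return the values that would be graphed under one non-detect option.

    values   -- reported numbers (the detection limit for a non-detect)
    detected -- parallel list of booleans, True for a detected value
    option   -- "exclude", "use_dl", or "half_dl" (hypothetical labels)
    """
    if option == "exclude":
        # Do not Use Non-detects: keep detected values only.
        return [v for v, d in zip(values, detected) if d]
    if option == "use_dl":
        # Use Non-detect Values: treat detection limits as detected values.
        return list(values)
    if option == "half_dl":
        # Use 1/2 Non-detect Values: halve the detection limits.
        return [v if d else v / 2 for v, d in zip(values, detected)]
    raise ValueError(f"unknown option: {option}")
```

For example, with values 4, 2, 6 where the 2 is a detection limit, the three options yield (4, 6), (4, 2, 6), and (4, 1, 6), respectively.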
-------
Histogram Output Screen.
Selected options: Group Graphs and Color Gradient.
5.1.3 Q-Q Plots
5.1.3.1 No NDs
1. Click Graphs > Univariate > No NDs > Q-Q Plots.
2. The "Select Variables" screen (Section 3.2) will appear.
• Select one or more variables from the "Select Variables" screen.
• If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
an appropriate variable representing a group variable.
-------
When the "Options" button is clicked, the following window appears.
[Screenshot: Q-Q plot options window. "Display Regression Lines": Do Not Display, Display Regression Lines. "Graphical Display Options": Color Gradient, For Export (BW Printers). Buttons: OK, Cancel.]
o The default option for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines, then check the radio
button next to "Display Regression Lines."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to use and import graphs in black and white
into a document or report, then check the radio button next to "For
Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
Click "OK" to continue or "Cancel" to cancel the Q-Q Plot.
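The "Group by Variable" selection amounts to splitting the selected variable's values by the levels of the group variable, so that one graph (or one series on a group graph) is drawn per group. A minimal sketch, assuming the data are held as two parallel lists; the function name `split_by_group` is hypothetical.

```python
from collections import defaultdict

def split_by_group(values, groups):
    """Split values into one list per level of the group variable."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    return dict(by_group)
```

Each resulting list can then be plotted on its own ("Individual Graphs") or overlaid on a single display ("Group Graphs").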
-------
Q-Q Plot for No NDs Output Screen.
[Figure: "Q-Q Plot for Background." Horizontal axis: Theoretical Quantiles (Standard Normal). The legend reports N = 10 together with the mean, Sd, slope, intercept, and correlation (R = 0.9622) of the displayed regression line.]
Note: For the Multi-Q-Q plot option, both for "Full" data sets and for data sets "With NDs," the values along the
horizontal axis represent quantiles of a standardized normal distribution (normal distribution with mean 0
and standard deviation 1). Quantiles for other distributions (e.g., Gamma distribution) are used when
using the Goodness-of-Fit (GOF) test option.
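The quantities shown on a normal Q-Q plot (theoretical quantiles, ordered observations, and the slope, intercept, and correlation of the regression line) can be sketched as follows. The plotting positions (i - 0.375)/(n + 0.25) are an assumption (Blom's formula); the guide does not say which formula Scout uses, so results may differ slightly from Scout's output. The function name `normal_qq` is hypothetical.

```python
from statistics import NormalDist, mean

def normal_qq(values):
    """Return (theoretical quantiles, ordered observations, slope, intercept, r)."""
    y = sorted(values)
    n = len(y)
    # Blom plotting positions: an assumption, Scout may use another formula.
    x = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx                 # least-squares slope of y on x
    intercept = my - slope * mx       # equals the sample mean when mean(x) = 0
    r = sxy / (sxx * syy) ** 0.5      # correlation between quantiles and data
    return x, y, slope, intercept, r
```

Because symmetric plotting positions give theoretical quantiles that average to zero, the fitted intercept coincides with the sample mean. That is consistent with the Q-Q plot legends in this guide, where the reported intercept and mean agree.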
5.1.3.2 With NDs
1. Click Graphs > Univariate > With NDs > Q-Q Plots.
2. The "Select Variables" screen (Section 3.2) will appear.
• Select one or more variables from the "Select Variables" screen.
• If graphs have to be produced by using a group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
an appropriate variable representing a group variable.
-------
When the "Options" button is clicked, the following window appears.
[Screenshot: Q-Q plot options window for data with NDs. "Display Non-Detects": Do not Display Non-Detects, Display Non-Detect Values, Display 1/2 Non-Detect Values. "Display Regression Lines": Do Not Display, Display Regression Lines. "Graphical Display Options": Color Gradient, For Export (BW Printers). Buttons: OK, Cancel.]
o Specify the "Display Non-detects" option. The default is "Do not
Display Non-detects."
Do not Display Non-detects: Selection of this option excludes the
non-detects and displays only detected values on the associated Q-Q Plot.
Display Non-detect Values: Selection of this option treats detection limits
as detected values and displays those detection limits and detected values
on the Q-Q Plot.
Display 1/2 Non-detect Values: Selection of this option replaces the
detection limits with their half values, and displays the halved detection limits
and detected values on the Q-Q Plot.
o The default option for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines, then check the radio
button next to "Display Regression Lines."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to use and import graphs in black and white
into a document or report, then check the radio button next to "For
Export (BW Printers)."
-------
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the Q-Q Plot.
Q-Q Plot Output Screen
Selected options: Do not Display Non-detects and Color Gradient.
[Figure: "Q-Q Plot for Background." Horizontal axis: Theoretical Quantiles (Standard Normal); vertical axis: Ordered Observations. Legend: total number of data = 10, number of non-detects = 4, number of detects = 6, correlation R = 0.9806; the mean, Sd, slope, and intercept are also reported.]
5.2 Scatter Plots
Two-dimensional (2D) and three-dimensional (3D) scatter plot displays are available
under the Graphs > Scatter Plots menu. The points on those graphs can be labeled by
observation number or by group if a group variable exists in the data set.
5.2.1 2D Scatter Plots
1. Click Graphs > Scatter Plots > 2D.
-------
2. The "Select Variables" screen (Section 3.2) will appear.
• Select two or more variables from the "Select Variables" screen.
• If the graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The user
should select and click on an appropriate variable representing a group
variable.
• Click "OK" to continue or "Cancel" to cancel the Graphs.
2D Scatter Plot.
Data Set Used: Bradu (4 variables).
[Figure: 2D scatterplot of the Bradu data.]
The data set Bradu has four variables. The user can choose any one of the four variables
for the X-axis and one of the remaining three for the Y-axis using the drop-down bars in
the graphics toolbar as explained in Chapter 2. The observation numbers of the
points on the graph can be viewed by right-clicking the mouse and using the "Point
Labels" option.
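The number of distinct 2D displays available from the drop-down bars follows directly from the rule above: any variable for the X-axis and any of the remaining variables for the Y-axis. A small sketch, with the Bradu variable names taken from the output shown later in this guide and the function name `axis_choices` being hypothetical:

```python
from itertools import permutations

def axis_choices(variables):
    """All ordered (X-axis, Y-axis) pairs of distinct variables."""
    return list(permutations(variables, 2))

# Variable names of the Bradu data set as they appear in the Scout output.
bradu_pairs = axis_choices(["y", "x1", "x2", "x3"])
```

For four variables this gives 4 x 3 = 12 possible axis assignments.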
-------
2D Scatter Plot.
Data Set Used: Iris (4 variables, 3 groups).
[Figure: 2D scatterplot of the Iris data, with points labeled by group. Horizontal axis: sp-width.]
The user can choose any one of the four variables for the X-axis and one of the remaining
three for the Y-axis using the drop-down bars in the graphics toolbar as explained in
Chapter 2.
5.2.2 3D Scatter Plots
1. Click Graphs > Scatter Plots > 3D.
-------
2. The "Select Variables" screen (Section 3.2) will appear.
• Select two or more variables from the "Select Variables" screen.
• If the graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The user
should select and click on an appropriate variable representing a group
variable.
• Click "OK" to continue or "Cancel" to cancel the Graphs.
3D Scatter Plot.
Data Set used: Bradu.
[Figure: 3D scatterplot of the Bradu data.]
The user can choose different variables for the three axes using the drop-down bars in the
graphics toolbar as explained in Chapter 2.
-------
Rotation of axes using the Chart Rotation Control.
[Screenshot: Chart Rotation Control window with X-Axis, Y-Axis, Z-Axis, and Light Level controls.]
3D Scatter Plot using groups.
Data Set Used: Iris (4 variables, 3 groups).
[Figure: 3D scatterplot of the Iris data, with points labeled by group number (1, 2, 3). One axis is sp-length.]
-------
Chapter 6
Goodness-of-Fit and Descriptive Statistics
6.1 Descriptive Statistics of Univariate Data
This option is used to compute general summary statistics for any or all of the variables
in the data file. Summary statistics can be generated for full data sets without non-detect
observations, and for data sets with non-detect observations. Three menu options are available:
No NDs (Full), With NDs, and Multivariate.
o No NDs (Full) - This option computes summary statistics for any or all of the
variables in a data set without any non-detect values.
° With NDs - This option computes simple summary statistics for any or all of
the variables in a data set that also have ND observations. For variables with
ND observations, simple summary statistics are computed based upon the
detected observations only.
° Multivariate - This option computes the mean vector, the median vector, the
standard deviation vector, the covariance matrix and the correlation matrix for
the multivariate data.
6.1.1 Descriptive (Summary) Statistics for Data Sets with No Non-detects
1. Click Stats/GOF > Descriptive > No NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
° If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° Click "OK" to continue or "Cancel" to cancel the Descriptive Statistics.
-------
The following summary statistics are available for the variables selected.
o Number of Observations
o Number of Missing Values
o Minimum Observed Value
o Maximum Observed Value
o Mean = Sample Average Value
o Q1 = 25th Percentile
o Q2 = Median
o Q3 = 75th Percentile
o 90th Percentile
o 95th Percentile
o 99th Percentile
o (Sample) Standard Deviation
o MAD = Median Absolute Deviation
o MAD/0.6745 = Robust Estimate of Variability (Population Standard
Deviation, σ)
o Skewness = Skewness Statistic
o Kurtosis = Kurtosis Statistic
o CV = Coefficient of Variation
The details of these descriptive (summary) statistics are described in the EPA
(2006) guidance.
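Several of the statistics in this list can be sketched with the Python standard library. This is an illustration only, not Scout's implementation: the function name `summary_stats` is hypothetical, and percentiles, skewness, and kurtosis are omitted because the guide does not state which formulas Scout uses for them.

```python
from statistics import mean, median, stdev

def summary_stats(values):
    """Return a few of the summary statistics listed above for one variable."""
    m = mean(values)
    sd = stdev(values)                       # sample standard deviation (n - 1)
    med = median(values)
    # MAD: median of the absolute deviations from the median.
    mad = median(abs(v - med) for v in values)
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": m,
        "median": med,
        "sd": sd,
        "mad_over_0.6745": mad / 0.6745,     # robust estimate of sigma
        "cv": sd / m,                        # coefficient of variation
    }
```

Dividing the MAD by 0.6745 rescales it so that, for normally distributed data, it estimates the population standard deviation.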
-------
Output for Descriptive Statistics - No Non-detects (NDs).

Univariate Descriptive Statistics for Datasets with No NDs
Date/Time of Computation    5/28/2007 5:42:00 PM
User Selected Options
From File                   D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\IRIS.xls
Full Precision              OFF

Var 0: sp-length
Var 2: pt-length

                            Var 0     sp-width  Var 2     pt-width
Number of Observations      50        50        50        50
Number of Missing Values    0         0         0         0
Minimum Observed Value      4.3       2.3       1         0.1
Maximum Observed Value      5.8       4.4       1.9       0.6
Mean                        5.006     3.428     1.462     0.246
(Q1) 25% Percentile         4.8       3.15      1.4       0.2
(Q2) Median                 5         3.4       1.5       0.2
(Q3) 75% Percentile         5.2       3.65      1.55      0.3
90% Percentile              5.4       3.9       1.7       0.4
95% Percentile              5.6       4.05      1.7       0.4
99% Percentile              5.75      4.3       1.9       0.55
Standard Deviation          0.352     0.379     0.174     0.105
MAD / 0.6745                0.297     0.371     0.148     0
Skewness                    0.12      0.0412    0.106     1.254
Kurtosis                    -0.253    0.955     1.022     1.719
CV                          0.0704    0.111     0.119     0.428
Note: When a variable name is too long to fit in a single cell, the variable number and its name are
printed above the results table. In the above output sheet, the variable sp-length was chosen as the first
variable and the variable pt-length was chosen as the third variable. The names of those two variables
cannot fit in individual cells of the descriptive statistics table; hence they are labeled Var 0 and Var 2,
respectively, in the table.
-------
6.1.2 Descriptive (Summary) Statistics for Data Sets with Non-detects
1. Click Stats/GOF > Descriptive > With NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select a variable(s) from the list of variables.
o Only those variables that have non-detect values will be shown.
° If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click "OK" to continue or "Cancel" to cancel the Descriptive Statistics.
o The following summary statistics are available for the variables selected.
o Number of Observations
o Number of Missing Values
o Number of Detects
o Number of Non-detects
o Percentage of Non-detects
o Minimum Observed Detected Value
o Maximum Observed Detected Value
o Mean of Detected Values
o Median of Detected Values
o Standard Deviation of Detected Values
o MAD/0.6745 of Detected Values = Robust Estimate of Variability
(standard deviation)
o Skewness of Detected Values
o Kurtosis of Detected Values
o CV = Detected Values Coefficient of Variation
o Q1 = 25th Percentile of All Observations
o Q2 = Median of All Observations
o Q3 = 75th Percentile of All Observations
o 90th Percentile of All Observations
o 95th Percentile of All Observations
o 99th Percentile of All Observations
Note: In Scout, "Descriptive Statistics" for a data set with non-detect observations represent simple
summary statistics based upon, and calculated from, the data set without using non-detect observations.
The simple "Descriptive Statistics/Univariate/With NDs" option only provides simple statistics (e.g., %
NDs, max ND, min ND, mean of detected values) based upon the detected values only. Those statistics
may help a user to determine the degree of skewness (e.g., mild, moderate, or high) of the data set
consisting of detected values. Those statistics may also help the user to choose the most appropriate
method (e.g., KM (BCA) UCL or KM (t) UCL) to compute confidence, prediction, and tolerance intervals.
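The detect-only summary described in this note can be sketched as follows. The data layout (a value column plus a detected/non-detected flag) and the function name `nd_summary` are assumptions for illustration; the percentile entries, which use all observations, are omitted because the guide does not give Scout's percentile formula.

```python
from statistics import mean, median, stdev

def nd_summary(values, detected):
    """Simple summary statistics computed from the detected values only."""
    detects = [v for v, d in zip(values, detected) if d]
    n = len(values)
    n_nd = n - len(detects)
    return {
        "n": n,
        "n_detects": len(detects),
        "n_nondetects": n_nd,
        "pct_nondetects": 100.0 * n_nd / n,
        "min_detect": min(detects),
        "max_detect": max(detects),
        "mean_detect": mean(detects),
        "median_detect": median(detects),
        "sd_detect": stdev(detects),
    }
```

For example, 4 non-detects among 53 observations give 100 x 4 / 53, or about 7.547 percent non-detects, which matches the percentage in the sample output for this option.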
Output for Descriptive Statistics - With Non-detects.

Univariate Descriptive Statistics for Datasets with NDs
Date/Time of Computation               5/28/2007 5:44:23 PM
User Selected Options
From File                              D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1.xls
Full Precision                         OFF

                                       X
Number of Observations                 53
Number of Missing Values               0
Number of Detects                      49
Number of Non-Detects                  4
Percentage of Non-Detects              7.547%
Minimum Observed Detect Value          3.202
Maximum Observed Detect Value          121.1
Mean of Detect values                  55.05
Median of Detect values                31.57
Standard Deviation of Detect values    43.2
MAD / 0.6745 of Detect values          46.8
Skewness of Detect values              0.149
Kurtosis of Detect values              -1.758
CV of Detect values                    0.785
(Q1) 25% Percentile (All Obs)          9.608
(Q2) Median (All Obs)                  31.57
(Q3) 75% Percentile (All Obs)          95.73
90% Percentile (All Obs)               107.6
95% Percentile (All Obs)               112.9
99% Percentile (All Obs)               (illegible in source)
-------
6.1.3 Descriptive Statistics for Multivariate Data
1. Click Stats/GOF > Descriptive > Multivariate.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select a variable(s) from the list of variables.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° Click "OK" to continue or "Cancel" to cancel the Descriptive Statistics.
-------
Output for Descriptive Statistics - Multivariate.

[Screenshots: Scout 2008 window showing the output sheet MultiDesc.ost, scrolled over three screens. The contents are reproduced below.]

Multivariate Descriptive Statistics
Date/Time of Computation    11/13/2008 3:08:34 PM
User Selected Options
From File                   C:\OLD_Drive\MyFiles\WPWIN\SCOUT\Scout 2008 10-17-08\Data\Scout v. 2.0 Data\IR... (truncated in screenshot)
Full Precision              OFF

Multivariate Statistics
Number of Observations          166
Number of Selected Variables    4

Mean
sp-length   sp-width    pt-length   pt-width
5.97        3.149       3.772       1.346

Median
sp-length   sp-width    pt-length   pt-width
5.8         3           4.35        1.4

Standard Deviation
sp-length   sp-width    pt-length   pt-width
1.077       0.624       1.824       0.995

Covariance S Matrix
            sp-length   sp-width    pt-length   pt-width
sp-length   1.16        0.255       1.477       0.736
sp-width    0.255       0.389       -0.186      0.0255
pt-length   1.477       -0.186      3.326       1.176
pt-width    0.736       0.0255      1.176       0.991

Determinant           0.119
Log of Determinant    -2.126

Eigenvalues of Classical Covariance S Matrix
Eval 1      Eval 2      Eval 3      Eval 4
4.604       0.756       0.426       0.0806
Sum of Eigenvalues    5.866

Classical Correlation R Matrix
            sp-length   sp-width    pt-length   pt-width
sp-length   1           0.379       0.752       0.686
sp-width    0.379       1           -0.164      0.0411
pt-length   0.752       -0.164      1           0.648
pt-width    0.686       0.0411      0.648       1

Determinant           0.0802
Log of Determinant    -2.523

Eigenvalues of Classical Correlation R Matrix
Eval 1      Eval 2      Eval 3      Eval 4
2.409       1.147       0.365       0.0795
Sum of Eigenvalues    4

Log Panel
LOG 3:07:28 PM > [Information] ParallAX program started as separate independant program!
LOG 3:08:15 PM > [Information] C:\OLD_Drive\MyFiles\WPWIN\SCOUT\Scout 2008 10-17-08\Data\Scout v. 2.0 Data\IRISOUT.DAT was imported into IRISOUT.wst
-------
Multivariate Descriptive Statistics
Date/Time of Computation    3/13/2008 6:27:08 AM
User Selected Options
From File                   D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\BRADU
Full Precision              OFF

Multivariate Statistics
Number of Observations          75
Number of Selected Variables    4

Mean
y           x1          x2          x3
1.279       3.207       5.597       7.231

Median
y           x1          x2          x3
0.1         1.8         2.2         2.1

Standard Deviation
y           x1          x2          x3
3.493       3.653       8.239       11.74

Covariance S Matrix
        y           x1          x2          x3
y       12.2        9.477       20.39       31.03
x1      9.477       13.34       28.47       41.24
x2      20.39       28.47       67.88       94.67
x3      31.03       41.24       94.67       137.8

Determinant           1906
Log of Determinant    7.553

Eigenvalues of Classical Covariance S Matrix
Eval 1      Eval 2      Eval 3      Eval 4
0.914       1.688       5.538       223.1
Sum of Eigenvalues    231.3
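The relationship between the covariance S matrix and the correlation R matrix in this output can be checked with a short sketch. It uses only the standard definitions (each correlation is the covariance divided by the product of the two standard deviations, and the sum of the eigenvalues equals the trace); the function names `cov_to_corr` and `trace` are hypothetical, and the matrix entries are the rounded BRADU values from the Scout output.

```python
def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    p = len(cov)
    sd = [cov[i][i] ** 0.5 for i in range(p)]   # standard deviations
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

def trace(matrix):
    """Sum of the diagonal entries, which equals the sum of the eigenvalues."""
    return sum(matrix[i][i] for i in range(len(matrix)))

# Covariance S matrix of the BRADU data (variables y, x1, x2, x3), as displayed.
S = [
    [12.2, 9.477, 20.39, 31.03],
    [9.477, 13.34, 28.47, 41.24],
    [20.39, 28.47, 67.88, 94.67],
    [31.03, 41.24, 94.67, 137.8],
]
R = cov_to_corr(S)
```

Rounded to three decimals, the entries of `R` reproduce the correlations reported for the BRADU data (for example, 0.743 between y and x1), and the trace of `S` agrees with the reported sum of eigenvalues up to the rounding of the displayed entries.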
-------
6.2 Goodness-of-Fit (GOF)
Several goodness-of-fit (GOF) tests for univariate data (both for full data sets, i.e.,
without non-detects, and for data sets with NDs) and multivariate data are available in
Scout. In this user guide, those tests and available options have been illustrated using
screen shots generated by Scout. For more details about those tests, refer to the ProUCL
4.00.04 Technical Guide and the Scout Technical Guide (in preparation).
6.2.1 Univariate GOF
Two choices are available for the goodness-of-fit menu: No NDs (Full) and With NDs.
O No NDs (Full)
o This option is used to analyze full data sets without any non-detect
observations.
o This option tests for the normal, gamma, or lognormal distribution of the
variables selected using the Select Variables option.
o GOF Statistics: this option simply generates an output log of the GOF test
statistics and any derived conclusions about the data distributions of all
selected variables.
O With NDs
o This option analyzes data sets that have both non-detected and detected values.
o Six sub-menu items listed and shown below are available for this option.
1. Exclude NDs
2. Normal ROS Estimates
3. Gamma ROS Estimates
4. Lognormal ROS Estimates
5. DL/2 Estimates
6. GOF Statistics
Scout handles Univariate GOF tests in the same way as ProUCL 4.00.04. More
information can be obtained from the ProUCL 4.00.04 Technical Guide and User Guide
(Chapter 8). The major upgrade in Scout over ProUCL 4.00.04 for GOF tests of univariate data
is the availability of the Shapiro-Wilk test for sample sizes greater than 50
and less than 2000 (Royston 1982).
Output for Descriptive Statistics - Multivariate (BRADU data, continued).

Classical Correlation R Matrix
        y           x1          x2          x3
y       1           0.743       0.708       0.757
x1      0.743       1           0.946       0.962
x2      0.708       0.946       1           0.979
x3      0.757       0.962       0.979       1

Determinant           0.00125
Log of Determinant    -6.683

Eigenvalues of Classical Correlation R Matrix
Eval 1      Eval 2      Eval 3      Eval 4
0.0172      0.0558      0.368       3.559
Sum of Eigenvalues    4
-------
6.2.1.1 GOF Tests for Data Sets with No NDs
6.2.1.1.1 GOF Tests for Normal and Lognormal Distribution
1. Click Stats/GOF > GOF > Univariate > No NDs > Normal or Lognormal.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° Click "Options" for GOF options.
-------
[Screenshot: "Goodness-of-Fit (Normal, Lognormal)" options window. "Select Confidence Level": 90%, 95%, 99%. "Method": Shapiro Wilk, Lilliefors. "Display Regression Lines": Do Not Display, Display Regression Lines. "Graphs by Group": Individual Graphs, Group Graphs. "Graphical Display Options": Color Gradient, For Export (BW Printers). Buttons: OK, Cancel.]
o The default option for the "Select Confidence Level" is "95%."
o The default GOF method is "Shapiro Wilk." If the sample size is
greater than 50, the program automatically uses the "Lilliefors" test.
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on a Q-Q plot, then
check the radio button next to Display Regression Lines.
o The default option for "Graphs by Group" is "Individual Graphs."
If you want to see the plots for all selected variables on a single graph,
then check the radio button next to Group Graphs.
Note: This option for Graphs by Group is specifically provided for when the user wants to display multiple
graphs for a variable by a group variable (e.g., site AOC1, site AOC2, and background). This kind of
display represents a useful visual comparison of the values of a variable (e.g., concentrations of COPC-
Arsenic) collected from two or more groups (e.g., upgradient wells, monitoring wells, residential wells).
-------
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white to be
included in reports for later use, then check the radio button next to
For Export (BW Printers).
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
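The method rule stated in the options above (Shapiro Wilk by default, Lilliefors when the sample size exceeds 50) and the Lilliefors test statistic itself can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the statistic is the textbook definition (largest distance between the empirical distribution function and a normal distribution fitted with the sample mean and standard deviation), and critical values are not computed.

```python
from statistics import NormalDist, mean, stdev

def default_gof_method(n):
    """Mirror the rule stated in the guide: Shapiro Wilk unless n > 50."""
    return "Lilliefors" if n > 50 else "Shapiro Wilk"

def lilliefors_statistic(values):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    x = sorted(values)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    d = 0.0
    for i, v in enumerate(x, start=1):
        f = fitted.cdf(v)
        # Check the gap just after and just before the step at v.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

For a lognormal test the same statistic would be applied to the log-transformed data, since a variable is lognormal when its logarithm is normal.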
Output Screen for Normal Distribution (Full).
Selected options: Shapiro Wilk, Display Regression Line, and For Export (BW Printers).
[Figure: "Normal Q-Q Plot for Arsenic." Horizontal axis: Theoretical Quantiles (Standard Normal). Legend: N = 20, Mean = 5.8725, Sd = 1.224 (last digit illegible), Slope = 1.1798, Intercept = 5.8725, Correlation R = 0.9281. Shapiro-Wilk test value = 0.868, critical value (0.05) = 0.905. Conclusion: data not normal.]
-------
Output Screen for Lognormal Distribution (Full).
Selected options: Shapiro Wilk, Display Regression Lines, and Color Gradient.
[Figure: "Lognormal Q-Q Plot for Arsenic." Horizontal axis: Theoretical Quantiles (Standard Normal). Legend: N = 20, Mean = 1.7519, Sd = 0.1917, Intercept = 1.7519 (slope and correlation values are illegible). Shapiro-Wilk test statistic = 0.932, critical value (0.05) = 0.905. Conclusion: data appear lognormal.]
6.2.1.1.2 GOF Tests for Gamma Distribution
1. Click Stats/GOF > GOF > Univariate > No NDs > Gamma.
-------
[Screenshot: "Goodness-of-Fit (Gamma)" options window. "Select Confidence Level": 90%, 95%, 99%. "Method": Anderson Darling, Kolmogorov Smirnov. "Display Regression Lines": Do Not Display, Display Regression Lines. "Graph by Groups": Individual Graphs, Group Graphs. "Graphical Display Options": Color Gradient, For Export (BW Printers). Buttons: OK, Cancel.]
o The default option for the "Confidence Level" is "95%."
o The default GOF method is "Anderson Darling."
o The default option for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the Gamma Q-Q plot,
then check the radio button next to "Display Regression Lines."
o The default option for "Graph by Groups" is "Individual Graphs."
If you want to see the graphs for all the selected variables on a single
graph, then check the radio button next to "Group Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check
the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
o Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
-------
Output Screen for Gamma Distribution (Full).
Selected options: Anderson Darling, Display Regression Lines, Individual Graphs, and Color Gradient.
[Figure: "Gamma Q-Q Plot for Mercury." Horizontal axis: Theoretical Quantiles of Gamma Distribution. Legend: N = 30, k star = 1.1722, Slope = 1.0468 (mean, intercept, and correlation values are partly illegible). Anderson-Darling test statistic = 0.730, critical value (0.05) = 0.769. Conclusion: data appear gamma distributed.]
6.2.1.1.3 GOF Statistics
1. Click Stats/GOF > GOF > Univariate > No NDs > Statistics.
2. The "Select Variables" screen (Section 3.2) will appear.
• Select one or more variables from the "Select Variables" screen.
• If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
• Click "Options" for GOF options.
-------
[Screenshot: GOF Statistics options window. "Select Confidence Level": 90%, 95%, 99%. Buttons: OK, Cancel.]
o The default option for the "Confidence Level" is "95%."
o Click "OK" to continue or "Cancel" to cancel the option.
Click "OK" to continue or "Cancel" to cancel the Goodness-of-Fit
Statistics.
-------
Output for GOF Statistics for univariate data without Non-detects.

Goodness-of-Fit Test Statistics for Full Data Sets without Non-Detects
Date/Time of Computation    (illegible in source)
User Selected Options
From File                   D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\BEETLES
Full Precision              OFF
Confidence Coefficient      0.95

x2

Raw Statistics
Number of Valid Samples                       74
Number of Distinct Samples                    9
Minimum                                       8
Maximum                                       16
Mean of Raw Data                              12.99
Standard Deviation of Raw Data                2.142
Kstar                                         32.67
Mean of Log Transformed Data                  2.549
Standard Deviation of Log Transformed Data    0.177

Normal Distribution Test Results
Shapiro Wilk Test Statistic                   0.894
Shapiro Wilk Critical (0.95) Value            0.95
Lilliefors Test Statistic                     0.195
Lilliefors Critical (0.95) Value              0.103
Data not Normal at (0.05) Significance Level

Gamma Distribution Test Results
A-D Test Statistic                            3.183
A-D Critical (0.95) Value                     0.749
K-S Test Statistic                            0.214
K-S Critical (0.95) Value                     0.103
Data not Gamma Distributed at (0.05) Significance Level

Lognormal Distribution Test Results
Shapiro Wilk Test Statistic                   0.872
Shapiro Wilk Critical (0.95) Value            0.95
Lilliefors Test Statistic                     0.225
Lilliefors Critical (0.95) Value              0.103
Data not Lognormal at (0.05) Significance Level
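The conclusions printed in this output follow from comparing each test statistic with its critical value. The sketch below encodes the standard direction of each comparison: a small Shapiro-Wilk statistic is evidence against the distribution, while large Lilliefors, Anderson-Darling, and Kolmogorov-Smirnov statistics are. The function name `fits_distribution`, the method labels, and the handling of exact ties are assumptions, not Scout's documented behavior.

```python
def fits_distribution(method, statistic, critical_value):
    """Return True when the test does not reject the hypothesized distribution.

    Shapiro Wilk: reject when the statistic falls below the critical value.
    Lilliefors, Anderson Darling, Kolmogorov Smirnov: reject when the
    statistic exceeds the critical value.
    Ties are treated as "not rejected" (an assumption).
    """
    if method == "Shapiro Wilk":
        return statistic >= critical_value
    if method in ("Lilliefors", "Anderson Darling", "Kolmogorov Smirnov"):
        return statistic <= critical_value
    raise ValueError(f"unknown method: {method}")
```

Applied to the values in this guide, the rule reproduces the printed conclusions: a Shapiro-Wilk statistic of 0.868 against a critical value of 0.905 gives "data not normal," and an Anderson-Darling statistic of 0.730 against 0.769 gives "data appear gamma distributed." The Lilliefors critical value of 0.103 shown here is also consistent with the common large-sample approximation 0.886 / sqrt(n) at the 0.05 level for n = 74.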
-------
6.2.1.2 GOF Tests for Data Sets With NDs
6.2.1.2.1 GOF Tests Using Exclude NDs for Normal and Lognormal Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > Exclude NDs >
Normal or Lognormal.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs menu, which lists Exclude NDs, Normal-ROS Estimates, Gamma-ROS Estimates, Log-ROS Estimates, DL/2 Estimates, and Statistics.]
2. The "Select Variables" screen (Chapter 3) will appear.
° Select one or more variables from the "Select Variables" screen.
° If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click "Options" for GOF options.
-------
[Screenshot: "Goodness-of-Fit (Normal, Lognormal)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Shapiro Wilk, Lilliefors
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graphs by Group: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default GOF method is "Shapiro Wilk." If the sample size is
greater than 50, the program defaults to the "Lilliefors" test.
o The default for "Display Regression Lines" is "Do Not Display."
If you want to see regression lines on the associated Q-Q plot,
check the radio button next to "Display Regression Lines."
o The default option for "Graphs by Group" is "Individual
Graphs." If you want to see the plots for all selected variables on
a single graph, check the radio button next to "Group Graphs."
-------
Note: The Graphs by Group option is especially useful when the user wants to display multiple
graphs for a variable by a group variable (e.g., site AOC1, site AOC2, and background). This kind of
display provides a useful visual comparison of the values of a variable (e.g., concentrations of COPC-
Arsenic) collected from two or more groups (e.g., upgradient wells, monitoring wells, and residential
wells).
o The default option for Graphical Display Option is "Color
Gradient." If you want to see the graphs in black and white,
check the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
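The defaults described in the options above can be summarized as a small configuration sketch. This is only an illustration in Python; the class and function names are hypothetical and do not correspond to anything in Scout. The only rule encoded is the one stated in the text: Shapiro Wilk by default, Lilliefors when the sample size exceeds 50.

```python
from dataclasses import dataclass


@dataclass
class GofOptions:
    """Default choices of the Normal/Lognormal GOF options dialog."""
    confidence_level: float = 0.95
    method: str = "Shapiro Wilk"
    display_regression_lines: bool = False
    graphs_by_group: str = "Individual Graphs"
    graphical_display: str = "Color Gradient"


def default_gof_options(sample_size):
    """Return the defaults, switching to Lilliefors for samples larger than 50."""
    method = "Lilliefors" if sample_size > 50 else "Shapiro Wilk"
    return GofOptions(method=method)


small = default_gof_options(20)
large = default_gof_options(74)
```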
Output Screen for Normal Distribution (Exclude NDs).
Selected options: Shapiro Wilk, Display Regression Lines, Group Graphs, and For Export (BW Printers).

[Figure: Normal Q-Q Plots (Statistics using Detected Data) for Arsenic (subsurface), Arsenic (surface). X-axis: Theoretical Quantiles (Standard Normal).]

Statistics shown on the plot:

                               Arsenic (subsurface)    Arsenic (surface)
Total Number of Data           10                      10
Number treated as ND           1                       2
DL                             4.5                     4.5
N                              9                       8
Percent NDs                    10%                     20%
Mean                           5.6778                  6.6313
Sd                             0.5911                  1.4400
Slope                          0.1431                  0.1952
Intercept                      5.6778                  6.6313
Correlation, R                 0.2265                  0.1262
Shapiro Wilk Test Statistic    0.927                   0.807
Critical Value (0.05)          0.829                   0.818
Conclusion                     Data appear Normal      Data not Normal
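With the Exclude NDs option, the statistics are computed from the detected values only. The following Python sketch shows one way such a summary could be formed. The function name and the data are hypothetical; the 0/1 flag convention (1 = detected, 0 = non-detect) is the one this guide describes for ND columns.

```python
from statistics import mean, stdev


def detected_only_summary(values, detect_flags):
    """Summarize a variable after dropping non-detects (flag 0)."""
    detects = [v for v, d in zip(values, detect_flags) if d == 1]
    n_total = len(values)
    n_nd = n_total - len(detects)
    return {
        "total": n_total,
        "nds": n_nd,
        "n": len(detects),
        "percent_nds": 100.0 * n_nd / n_total,
        "mean": mean(detects),
        "sd": stdev(detects),
    }


# Hypothetical data: 10 observations, the first reported as a non-detect at DL = 4.5.
values = [4.5, 5.1, 5.4, 5.0, 6.2, 6.6, 5.9, 5.3, 6.0, 5.8]
flags = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
summary = detected_only_summary(values, flags)
```

The counts mirror the layout of the plot statistics: total number of data, number treated as ND, N (detects), and percent NDs.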
-------
Output Result for Lognormal Distribution (Exclude NDs).
Selected options: Shapiro Wilk, Display Regression Lines, Group Graphs, and Color Gradient.

[Figure: Lognormal Q-Q Plots (Statistics using Detected Data) for Arsenic (subsurface), Arsenic (surface). X-axis: Theoretical Quantiles (Standard Normal).]

Statistics shown on the plot (several values for Arsenic (surface) are illegible in the source):

                               Arsenic (subsurface)     Arsenic (surface)
Total Number of Data           10                       10
Number treated as ND           1                        [illegible]
DL                             1.5040774                [illegible]
N                              9                        [illegible]
Percent NDs                    10%                      20%
Mean                           1.7318                   [illegible]
Sd                             0.1019                   [illegible]
Slope                          0.1059                   [illegible]
Intercept                      1.7319                   [illegible]
Correlation, R                 0.8726                   [illegible]
Shapiro-Wilk Test Statistic    0.938                    0.835
Critical Value (0.05)          0.829                    0.818
Conclusion                     Data appear Lognormal    Data appear Lognormal
6.2.1.2.2 GOF Tests Using Exclude NDs for Gamma Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > Exclude NDs >
Gamma.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs > Exclude NDs menu with the Normal, Lognormal, and Gamma entries.]
2. The "Select Variables" screen (Chapter 3) will appear.
• Select one or more variables from the "Select Variables" screen.
If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
Click "Options" for GOF options.
[Screenshot: "Goodness-of-Fit (Gamma)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Anderson Darling, Kolmogorov Smirnov
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graph by Groups: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default GOF test method is "Anderson Darling."
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the gamma Q-Q plot,
check the radio button next to "Display Regression Lines."
-------
o The default option for "Graph by Groups" is "Individual Graphs."
If you want to display all selected variables on a single graph, check
the radio button next to "Group Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check
the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
Output Screen for Gamma Distribution (Exclude NDs).
Selected options: Anderson Darling, Do Not Display, Individual Graphs, and For Export (BW Printers).

[Figure: Gamma Q-Q Plot for Mercury with NDs (Statistics using Detected Data). X-axis: Theoretical Quantiles of Gamma Distribution.]

Statistics shown on the plot for Mercury:
Total Number of Data = 30
Number treated as ND = 5
DL = 0.5
N = 25
Percent NDs = 17%
Mean = 0.3124
k star = 1.0533
Slope = 1.0294
Intercept = 0.0048 [sign illegible in source]
Correlation, R = 0.9590
Anderson-Darling Test
Test Statistic = 0.861
Critical Value (0.05) = 0.770
Data not Gamma Distributed
-------
6.2.1.2.3 GOF Tests Using Log-ROS Estimates for Normal and Lognormal
Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > Log-ROS Estimates
> Normal or Lognormal.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs > Log-ROS Estimates menu with the Normal, Lognormal, and Gamma entries.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° Click "Options" for GOF options.
-------
[Screenshot: "Goodness-of-Fit (Normal, Lognormal)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Shapiro Wilk, Lilliefors
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graphs by Group: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default GOF test method is "Shapiro Wilk." If the sample size is
greater than 50, the program defaults to use the "Lilliefors" test.
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the normal Q-Q plot,
check the radio button next to "Display Regression Lines."
-------
o The default option for "Graphs by Group" is "Individual Graphs."
If you want to display all selected variables on a single graph, check
the radio button next to "Group Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check
the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
Output Screen for Normal Distribution (Log-ROS Estimates).
Selected options: Shapiro Wilk, Display Regression Lines, Group Graphs, and For Export (BW Printers).

[Figure: Normal Q-Q Plots using Robust ROS Method for Arsenic, Mercury. X-axis: Theoretical Quantiles (Standard Normal).]

Statistics shown on the plot:

                               Arsenic            Mercury
N                              20                 30
Mean                           5.8307             0.2767
Sd                             1.2799             0.2984
Slope                          1.2517             0.2643
Intercept                      5.8307             0.2767
Correlation, R                 0.9421             0.8618
Shapiro-Wilk Test Value        0.895              0.733
Critical Val (0.05)            0.905              0.927
Conclusion                     Data not Normal    Data not Normal
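The slope, intercept, and correlation reported with each Q-Q plot describe the regression line of the ordered data on the theoretical quantiles. The sketch below is a minimal Python illustration, not Scout's implementation. The function name and data are hypothetical, and the Blom plotting positions (i - 0.375) / (n + 0.25) are an assumed choice; Scout may use different positions.

```python
from math import sqrt
from statistics import NormalDist, mean


def normal_qq_fit(data):
    """Least-squares line of ordered data against standard normal quantiles."""
    y = sorted(data)
    n = len(y)
    # Assumed plotting positions (Blom); symmetric about 0.5.
    x = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data values.
data = [4.9, 5.6, 6.1, 5.2, 7.4, 6.8, 5.9, 6.3]
slope, intercept, r = normal_qq_fit(data)
```

Because the theoretical quantiles are symmetric about zero, the intercept of this line equals the sample mean, which is consistent with the Mean and Intercept values agreeing in the output above. For data that are close to normal, the slope is close to the standard deviation.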
-------
Output Screen for Lognormal Distribution (Log-ROS Estimates).
Selected options: Shapiro Wilk, Display Regression Lines, Group Graphs, and Color Gradient.

[Figure: Lognormal Q-Q Plot for Group (Statistics using Robust ROS Method). X-axis label as printed: Theoretical Quantiles of Gamma Distribution. Legend: Arsenic (subsurface), Arsenic (surface).]

Statistics shown on the plot:

                               Arsenic (subsurface)     Arsenic (surface)
N                              10                       10
Mean                           1.7068                   1.7781
Sd                             0.1246                   0.2678
Slope                          0.1308                   0.2758
Intercept                      1.7063                   1.7781
Correlation, R                 [illegible]              [illegible]
Shapiro-Wilk Test Statistic    0.931                    [illegible]
Critical Value (0.05)          0.842                    0.842
Conclusion                     Data appear Lognormal    Data appear Lognormal
6.2.1.2.4 GOF Tests Using Log-ROS Estimates for Gamma Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > Log-ROS Estimates
> Gamma.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs menu.]
-------
Click "Options" for GOF options.
[Screenshot: "Goodness-of-Fit (Gamma)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Anderson Darling, Kolmogorov Smirnov
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graph by Groups: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default GOF test method is "Anderson Darling."
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the gamma Q-Q plot,
check the radio button next to "Display Regression Lines."
o The default option for "Graph by Groups" is "Individual Graphs."
If you want to put all of the selected variables into a single graph,
check the radio button next to "Group Graphs."
-------
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check the
radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the options.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
Output Screen for Gamma Distribution (Log-ROS Estimates).
Selected options: Anderson Darling, Display Regression Lines, Individual Graphs, and Color Gradient.

[Figure: Gamma Q-Q Plot for Mercury (Statistics using Robust ROS Method).]

Statistics shown on the plot for Mercury:
N = 30
Mean = 0.2767
k star = 1.0653
Slope = 1.1071
Intercept = -0.0262
Correlation, R = 0.9579
Anderson-Darling Test
Test Statistic = 1.425
Critical Value (0.05) = 0.772
Data not Gamma Distributed
-------
6.2.1.2.5 GOF Tests Using DL/2 Estimates for Normal or Lognormal Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > DL/2 Estimates >
Normal or Lognormal.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs > DL/2 Estimates menu.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
® Click "Options" for GOF options.
-------
[Screenshot: "Goodness-of-Fit (Normal, Lognormal)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Shapiro Wilk, Lilliefors
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graphs by Group: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default method is "Shapiro Wilk." If the sample size is greater
than 50, the program defaults to the "Lilliefors" test.
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the normal Q-Q plot,
check the radio button next to "Display Regression Lines."
-------
o The default option for "Graphs by Group" is "Individual Graphs."
If you want to put all of the selected variables into a single graph,
check the radio button next to "Group Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check
the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the option.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
Output Screen for Normal Distribution (DL/2 Estimates).
Selected options: Shapiro Wilk, Display Regression Lines, Group Graphs, and Color Gradient.
-------
Output Screen for Lognormal Distribution (DL/2 Estimates).
Selected options: Shapiro Wilk, Display Regression Lines, Individual Graphs, and For Export (BW
Printers).

[Figure: Lognormal Q-Q Plot for Mercury (Statistics using DL/2 Method). X-axis label as printed: Theoretical Quantiles of Gamma Distribution.]

Statistics shown on the plot for Mercury:
N = 30
Mean = -1.7413
Sd = 0.9845
Slope = 0.9865
Intercept = -1.7413
Correlation, R = 0.9748
Shapiro Wilk Test
Test Statistic = 0.931
Critical Value (0.05) = 0.927
Data appear Lognormal
6.2.1.2.6 GOF Tests Using DL/2 Estimates for Gamma Distribution
1. Click Stats/GOF > GOF > Univariate > With NDs > DL/2 Estimates >
Gamma.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs > DL/2 Estimates menu with the Normal, Lognormal, and Gamma entries.]
2. The "Select Variables" screen (Section 3.2) will appear.
• Select one or more variables from the "Select Variables" screen.
If graphs have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
Click "Options" for GOF options.
[Screenshot: "Goodness-of-Fit (Gamma)" options dialog.
 Select Confidence Level: 90%, 95%, 99%
 Method: Anderson Darling, Kolmogorov Smirnov
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graph by Groups: Individual Graphs, Group Graphs
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Buttons: OK, Cancel]
o The default option for the "Confidence Level" is "95%."
o The default method is "Anderson Darling."
o The default method for "Display Regression Lines" is "Do Not
Display." If you want to see regression lines on the gamma Q-Q plot,
check the radio button next to "Display Regression Lines."
-------
o The default option for "Graph by Groups" is "Individual Graphs."
If you want to put all of the selected variables into a single graph,
check the radio button next to "Group Graphs."
o The default option for "Graphical Display Options" is "Color
Gradient." If you want to see the graphs in black and white, check
the radio button next to "For Export (BW Printers)."
o Click "OK" to continue or "Cancel" to cancel the options.
• Click "OK" to continue or "Cancel" to cancel the goodness-of-fit tests.
Output Screen for Gamma Distribution (DL/2 Estimates).
Selected options: Anderson Darling, Display Regression Lines, Individual Graphs, and Color Gradient.

[Figure: Gamma Q-Q Plot for Mercury (Statistics using DL/2 Substitution Method). X-axis: Theoretical Quantiles of Gamma Distribution.]

Statistics shown on the plot for Mercury:
N = 30
Mean = 0.2829
k star = 1.0875
Slope = 1.0897
Intercept = -0.023 [last digit illegible in source]
Correlation, R = 0.9617
Anderson-Darling Test
Test Statistic = 1.128
Critical Value (0.05) = 0.771
Data not Gamma Distributed
-------
6.2.1.2.7 GOF Statistics
1. Click Stats/GOF > GOF > Univariate > With NDs > Statistics.
[Screenshot: Scout 4.0 window showing the Stats/GOF > GOF > Univariate > With NDs menu, which lists Exclude NDs, Normal-ROS Estimates, Gamma-ROS Estimates, Log-ROS Estimates, DL/2 Estimates, and Statistics.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
° If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click "Options" for GOF options.
[Screenshot: GOF Statistics options dialog. "Select Confidence Level" choices: 90%, 95%, 99%. Buttons: OK, Cancel.]
o The default option for the "Confidence Level" is "95%."
o Click "OK" to continue or "Cancel" to cancel the option.
° Click "OK" to continue or "Cancel" to cancel the Goodness-of-Fit
Statistics.
-------
Output for GOF Statistics for univariate data with Non-detects.

Goodness-of-Fit Test Statistics for Data Sets with Non-Detects

User Selected Options
Date/Time of Computation      1/25/2008 1:01:29 PM
From File                     D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
Full Precision                OFF
Confidence Coefficient        0.95

Group1X

                   Obs No.   Num Miss   Num Valid   Detects   NDs   % NDs
Group1X Data       10        0          10          8         2     20.00%

                                              Number   Minimum   Maximum   Mean    Median   SD
Statistics (Non-Detects Only)                 2        4         4         4       4        0
Statistics (Detects Only)                     8        3.202     20.78     9.277   6.704    6.283
Statistics (All: NDs treated as DL value)     10       3.202     20.78     8.222   5.347    5.971
Statistics (All: NDs treated as DL/2 value)   10       2         20.78     7.822   5.347    6.334
Statistics (Normal ROS Estimated Data)        10       -2.508    20.78     7.256   5.347    7.034
Statistics (Gamma ROS Estimated Data)         10       1.421     20.78     8.027   5.405    6.182
Statistics (Lognormal ROS Estimated Data)     10       2.011     20.78     7.917   5.347    6.243

                                        K Hat   K Star   Theta Hat   Log Mean   Log Stdv   Log CV
Statistics (Detects Only)               2.674   1.938    3.469       2.029      0.673      0.332
Statistics (NDs = DL)                   2.578   1.872    3.189       1.901      0.652      0.343
Statistics (NDs = DL/2)                 1.844   1.357    4.242       1.762      0.818      0.464
Statistics (Gamma ROS Estimates)        1.995   1.463    4.024
Statistics (Lognormal ROS Estimates)                                 1.801      0.769      0.427
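The means reported for the two substitution methods can be checked by hand from the detects-only mean, the counts of detects and non-detects, and the detection limit. The Python sketch below does this arithmetic. The function name is hypothetical, and a single detection limit of 4 is assumed because the non-detect statistics above show a minimum and maximum of 4.

```python
def substituted_mean(detects_mean, n_detects, n_nds, dl, fraction=1.0):
    """Mean after replacing each non-detect by fraction * DL."""
    total = detects_mean * n_detects + n_nds * dl * fraction
    return total / (n_detects + n_nds)


# Values from the output: 8 detects with mean 9.277, 2 non-detects at DL = 4.
mean_dl = substituted_mean(9.277, 8, 2, 4.0)          # NDs treated as DL
mean_half = substituted_mean(9.277, 8, 2, 4.0, 0.5)   # NDs treated as DL/2
```

Rounded to three decimals, the two results are 8.222 and 7.822, which agree with the "NDs treated as DL value" and "NDs treated as DL/2 value" rows of the output.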
-------
Output for GOF Statistics for univariate data with Non-detects (continued).

Normal Distribution Test Results
                                          Test value   Crit. (0.95)   Conclusion with Alpha(0.05)
Shapiro-Wilks (Detects Only)              0.866        0.818          Data Appear Normal
Lilliefors (Detects Only)                 0.253        0.313          Data Appear Normal
Shapiro-Wilks (NDs = DL)                  0.796        0.842          Data Not Normal
Lilliefors (NDs = DL)                     0.266        0.28           Data Appear Normal
Shapiro-Wilks (NDs = DL/2)                0.848        0.842          Data Appear Normal
Lilliefors (NDs = DL/2)                   0.237        0.28           Data Appear Normal
Shapiro-Wilks (Normal ROS Estimates)      0.941        0.842          Data Appear Normal
Lilliefors (Normal ROS Estimates)         0.201        0.28           Data Appear Normal

Gamma Distribution Test Results
                                          Test value   Crit. (0.95)   Conclusion with Alpha(0.05)
Anderson-Darling (Detects Only)           0.404        0.722
Kolmogorov-Smirnov (Detects Only)         0.197        0.297          Data Appear Gamma Distributed
Anderson-Darling (NDs = DL)               0.737        0.734
Kolmogorov-Smirnov (NDs = DL)             0.244        0.269          Data appear Approximate Gamma Distribution
Anderson-Darling (NDs = DL/2)             0.367        0.737
Kolmogorov-Smirnov (NDs = DL/2)           0.165        0.27           Data Appear Gamma Distributed
Anderson-Darling (Gamma ROS Estimates)    0.355        0.736
Kolmogorov-Smirnov (Gamma ROS Est)        0.178        0.27           Data Appear Gamma Distributed

Lognormal Distribution Test Results
                                          Test value   Crit. (0.95)   Conclusion with Alpha(0.05)
Shapiro-Wilks (Detects Only)              0.932        0.818          Data Appear Lognormal
Lilliefors (Detects Only)                 0.191        0.313          Data Appear Lognormal
Shapiro-Wilks (NDs = DL)                  0.878        0.842          Data Appear Lognormal
Lilliefors (NDs = DL)                     0.226        0.28           Data Appear Lognormal
Shapiro-Wilks (NDs = DL/2)                0.94         0.842          Data Appear Lognormal
Lilliefors (NDs = DL/2)                   0.157        0.28           Data Appear Lognormal
Shapiro-Wilks (Lognormal ROS Estimates)   0.951        0.842          Data Appear Lognormal
Lilliefors (Lognormal ROS Estimates)      0.161        0.28           Data Appear Lognormal
Note: DL/2 is not a recommended method
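The gamma conclusions in the output combine two tests, Anderson-Darling and Kolmogorov-Smirnov. The Python sketch below shows a decision rule that is consistent with the conclusions printed in this chapter. The rule is inferred from those outputs and is an assumption; it is not taken from Scout's source code, and the function name is hypothetical.

```python
def gamma_conclusion(ad_stat, ad_crit, ks_stat, ks_crit):
    """Combine A-D and K-S results; each test passes when statistic <= critical value."""
    ad_ok = ad_stat <= ad_crit
    ks_ok = ks_stat <= ks_crit
    if ad_ok and ks_ok:
        return "Data Appear Gamma Distributed"
    if ad_ok or ks_ok:
        return "Data appear Approximate Gamma Distribution"
    return "Data not Gamma Distributed"


# Statistic / critical value pairs as they appear in this chapter's outputs.
detects_only = gamma_conclusion(0.404, 0.722, 0.197, 0.297)
nds_as_dl = gamma_conclusion(0.737, 0.734, 0.244, 0.269)
full_data = gamma_conclusion(3.183, 0.749, 0.214, 0.103)
```

In the "NDs = DL" case the Anderson-Darling statistic slightly exceeds its critical value while the Kolmogorov-Smirnov statistic does not, which matches the "Approximate Gamma" wording in the output.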
-------
6.2.2 Multivariate GOF
The multivariate goodness-of-fit test to test for multinormality of a data set can be
performed using Scout. Several test statistics, including the correlation coefficient based
upon ordered Mahalanobis distances (MDs) versus beta distribution quantiles (and also
approximate chi-square quantiles), multivariate kurtosis, and multivariate skewness, are
available in Scout. The details of those statistics can be found in Singh (1993) and
Mardia (1970).
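As an illustration of the quantities named above, the following Python sketch computes squared Mahalanobis distances together with Mardia's multivariate skewness and kurtosis for two-variable data. It is a minimal sketch under stated assumptions: the function name is hypothetical, only the two-variable case is handled so that the covariance matrix can be inverted by hand, and the covariance uses divisor n as in Mardia (1970). Scout's own computations and scaling may differ.

```python
def mardia_bivariate(points):
    """Squared MDs, Mardia skewness, and Mardia kurtosis for 2-variable data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(p[0] - mx, p[1] - my) for p in points]
    # Covariance with divisor n (maximum-likelihood form).
    sxx = sum(c[0] * c[0] for c in centered) / n
    syy = sum(c[1] * c[1] for c in centered) / n
    sxy = sum(c[0] * c[1] for c in centered) / n
    det = sxx * syy - sxy * sxy
    # Elements of the inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    def g(a, b):
        # Bilinear form a' S^-1 b.
        return a[0] * (ixx * b[0] + ixy * b[1]) + a[1] * (ixy * b[0] + iyy * b[1])

    md_squared = [g(c, c) for c in centered]
    skewness = sum(g(a, b) ** 3 for a in centered for b in centered) / n ** 2
    kurtosis = sum(d * d for d in md_squared) / n
    return md_squared, skewness, kurtosis


# Hypothetical, perfectly symmetric data: the four corners of a square.
md2, b1, b2 = mardia_bivariate([(1, 1), (1, -1), (-1, 1), (-1, -1)])
```

For a p-variate normal population the expected kurtosis is p(p + 2) and the expected skewness is 0; the Q-Q plot of ordered MDs against beta (or approximate chi-square) quantiles provides the graphical counterpart of these statistics.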
1. Click Stats/GOF > GOF > Multivariate.
[Screenshot: Scout window showing the Stats/GOF > GOF menu with the Univariate and Multivariate entries.]
2. The "Select Variables" screen (Section 3.4) will appear.
o Select two or more variables from the "Select Variables" screen.
° If graphs have to be produced by using a Group variable, then select a group
variable by clicking the arrow below the "Group by Variable" button. This
will result in a drop-down list of available variables. The user should select
and click on an appropriate variable representing a group variable.
Click "Options" for the multivariate GOF options.
[Screenshot: multivariate GOF options dialog.
 Display Regression Lines: Do Not Display, Display Regression Lines
 Graphical Display Options: Color Gradient, For Export (BW Printers)
 Select Critical Alpha: 0.010, 0.025, 0.050, 0.100, 0.150, 0.200, 0.250
 Quantiles: Beta Quantiles, Chi Quantiles
 Buttons: OK, Cancel]
-------
o Specify the preferred "Critical Alpha." The default is "0.05."
o Specify the distribution (scaled beta or approximate chi-square) of
the MDs used to compute the quantiles. The default is a "Beta"
distribution.
o The default option for Display Regression Lines is "Do Not
Display", and the default option for "Graphical Display Options"
is "Color Gradient."
o Click on "OK" to continue or "Cancel" to cancel the GOF options.
• Click on "OK" to continue or "Cancel" to cancel the GOF computations.
Output Screen for Multivariate GOF.

[Figure: GOF Q-Q Plot of MDs. X-axis: Beta Quantiles.]

Multivariate GOF Statistics shown on the plot:
N = 75
P = 4
Slope = 2.8882
Intercept = -1.7628
Skewness(0.05) = 2.3990
Skewness = 31.0467
Kurtosis(0.05) = 25.2002
Kurtosis = 53.9679
Beta Correlation Coefficient(0.05) = 0.994 [last digit illegible in source]
Beta Correlation Coefficient = 0.8738
Data set does not appear to be Multinormal
Note: Several test statistics (correlation coefficient, skewness, and kurtosis) are shown in the above GOF
display. Singh (1993) has outlined some of these procedures to assess multivariate normality. Critical
values for these three statistics have been computed using extensive Monte Carlo simulations. Critical
values are still being simulated at the time of publishing this document. These values will be available in
the Q-Q plots in the near future. The developers of Scout may be contacted to obtain these critical values.
They do plan to publish them in the near future.
-------
6.3 Hypothesis Testing
Scout can perform hypothesis tests on data sets with and without ND observations. For
two-sample hypothesis tests on data sets with NDs, Scout assumes that samples from both
of the groups have non-detect observations. This means that an ND column (with 0 or 1
entries only) needs to be provided for the variable in each of the two groups. This has
to be done even if one of the groups has all detected entries; in this case, the
associated ND column will have all entries equal to "1." This allows the user to compare
two groups (e.g., arsenic in background vs. site samples) with one group
having NDs and the other group having all detected data.
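The ND column requirement can be illustrated with a short Python sketch. The function name and the data are hypothetical; the convention that 1 marks a detected value and 0 marks a non-detect follows the statement above that a fully detected group has all entries equal to 1.

```python
def nd_indicator(values, nondetect_positions=()):
    """Return a 0/1 column: 1 = detected, 0 = non-detect."""
    nd = set(nondetect_positions)
    return [0 if i in nd else 1 for i in range(len(values))]


# Hypothetical background group: entries 0 and 3 are non-detects reported at DL = 4.
background = [4.0, 5.2, 6.1, 4.0, 7.3]
# Hypothetical site group: every value is detected.
site = [8.4, 9.1, 7.7, 10.2]

d_background = nd_indicator(background, nondetect_positions=[0, 3])
d_site = nd_indicator(site)
```

Both groups end up with an ND column, so a two-sample test for data with NDs can be set up even though the site group contains no non-detects.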
The hypothesis testing module of Scout is exactly the same as the one available in ProUCL
4.00.04, which was developed to address several environmental
applications. More information on those methods can be obtained from the ProUCL
4.00.04 Technical Guide and User Guide (Chapter 9).
Note: Since the hypothesis testing module of Scout is imported from ProUCL 4.00.04, most of the
terminology used (site concentration, background concentration, background threshold values, etc.) is
borrowed from various environmental applications. However, all of those tools (e.g., t-test, Gehan test)
can be used in various other applications. For example, a two-sample t-test can be used to compare the
means of the distributions of any two variables. Similarly, the Gehan test may be used to compare the
measures of central tendency of two distributions based upon data sets with below detection limit
observations.
6.3.1.1 Single Sample Hypothesis Tests for Data Sets with No Non-detects
6.3.1.1.1 Single Sample t-Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > No NDs > t-Test.
[Screenshot: Scout 4.0 window showing the Stats/GOF > Hypothesis Testing > Single Sample > No NDs menu, which lists t-Test, Proportion, Sign test, and Wilcoxon Signed Rank.]
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
-------
When the "Options" button is clicked, the following window will be shown.
[Screenshot: single sample t-test options dialog.
 Confidence Level: 0.95
 Substantial Difference, S (used with Test Form 2)
 Compliance Limit
 Select Null Hypothesis Form: Mean <= Compliance Limit (Form 1); Mean >= Compliance Limit (Form 2); Mean >= Compliance Limit + S (Form 2); Mean = Compliance Limit (2 Sided Alternative)
 Button: OK]
o Specify the "Confidence Level." The default is "0.95."
o Specify meaningful values for "Substantial Difference, S" and the
"Compliance Limit." The default choice for S is "0."
o Select the form of Null Hypothesis. The default is Mean <=
Compliance Limit (Form 1).
o Click "OK" to continue or "Cancel" to cancel the options.
Click "OK" to continue or "Cancel" to cancel the test.
-------
Output for Single Sample t-Test (Full Data without NDs).

Sample-1

Single Sample t-Test

Raw Statistics
Number of Valid Samples             9
Number of Distinct Samples          9
Minimum                             82.39
Maximum                             113.2
Mean                                99.38
Median                              103.5
SD                                  10.41
SE of Mean                          3.463

H0: Site Mean = 100
Test Value                          -0.178
Two Sided Critical Value (0.05)     2.306
P-Value                             0.863

Conclusion with Alpha = 0.05
Do Not Reject H0. Conclude Mean = 100
P-Value > Alpha (0.05)
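The test value in the output above is the usual one-sample t statistic: the difference between the sample mean and the compliance limit, divided by the standard error of the mean. The following Python sketch is a minimal illustration with hypothetical function names and hypothetical data; the two-sided critical value is supplied by the user (for example, from a t table) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(data)
    se = stdev(data) / sqrt(n)
    return (mean(data) - mu0) / se, n - 1


def two_sided_decision(t_value, critical_value):
    """Reject H0 when |t| exceeds the two-sided critical value."""
    return "Reject H0" if abs(t_value) > critical_value else "Do Not Reject H0"


# Hypothetical data set.
t_value, df = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0], mu0=2.0)

# Test value and critical value as reported in the output above.
decision = two_sided_decision(-0.178, 2.306)
```

With the reported test value of -0.178 and critical value of 2.306, the rule gives "Do Not Reject H0," in agreement with the conclusion shown in the output.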
6.3.1.1.2 Single Sample Proportion Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > No NDs >
Proportion.
[Screenshot: Scout 4.0 window showing the Stats/GOF > Hypothesis Testing > Single Sample Tests > No NDs menu, which lists t-Test, Proportion, Sign test, and Wilcoxon Signed Rank.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
-------
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Proportion Test Options dialog with fields Confidence Level (0.95), Proportion (0.3), and Action/Compliance Limit (6). Null hypothesis forms: P <= Proportion (Form 1, selected), P >= Proportion (Form 2), P = Proportion (2 Sided Alternative).]
o Specify the "Confidence Level." The default is "0.95."
o Specify the "Proportion" level and a meaningful
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is P <= Proportion
(Form 1).
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the test.
-------
Output for Single Sample Proportion Test (Full Data without NDs).
One-Sample Proportion Test
Raw Statistics
  Number of Valid Samples            85
  Number of Distinct Samples         33
  Minimum                            0.5S3
  Maximum                            7.676
  Mean                               5.133
  Median                             5.564
  SD                                 1.583
  SE of Mean                         0.172
  Number of Exceedances              27
  Sample Proportion of Exceedances   0.318
H0: Site Proportion <= 0.3 (Form 1)
  Large Sample z-Test Value          0.237
  Critical Value (0.05)              1.645
  P-Value                            0.406
Conclusion with Alpha = 0.05
  Do Not Reject H0. Conclude Site Proportion <= 0.3
  P-Value > Alpha (0.05)
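The large-sample z value reported above appears consistent with a continuity-corrected normal approximation to the binomial. The sketch below is an assumption about that calculation, not Scout's documented code. The counts used (27 exceedances in 85 samples, null proportion 0.3) are chosen because they reproduce a z of about 0.237 and a p-value of about 0.406; the function names are hypothetical.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def proportion_z_test(exceedances, n, p0):
    """Form 1 test (H0: P <= p0) with a 0.5 continuity correction."""
    expected = n * p0
    if exceedances > expected:
        correction = 0.5
    elif exceedances < expected:
        correction = -0.5
    else:
        correction = 0.0
    z = (exceedances - correction - expected) / math.sqrt(n * p0 * (1.0 - p0))
    return z, normal_upper_tail(z)


z_value, p_value = proportion_z_test(27, 85, 0.3)
print(f"z = {z_value:.3f}, p = {p_value:.3f}")
```

Because the p-value exceeds 0.05, this sketch reaches the same "Do Not Reject H0" conclusion as the output.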
6.3.1.1.3 Single Sample Sign Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > No NDs > Sign
test.
[Screenshot: Stats/GOF > Hypothesis Testing > Single Sample Tests > No NDs menu, listing t-Test, Proportion, Sign test, and Wilcoxon Signed Rank.]
-------
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Sign Test Options dialog with fields Confidence Level (0.95), Substantial Difference, S (used with Test Form 2), and Action/Compliance Limit (0). Null hypothesis forms: Median <= Compliance Limit (Form 1, selected), Median >= Compliance Limit (Form 2), Median >= Compliance Limit + S (Form 2), Median = Compliance Limit (2 Sided Alternative).]
o Specify the "Confidence Level." The default choice is "0.95."
o Specify meaningful values for "Substantial Difference, S" and
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is Median <=
Compliance Limit (Form 1).
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the test.
-------
Output for Single Sample Sign Test (Full Data without NDs).
Single Sample Sign Test
Raw Statistics
  Number of Valid Samples            10
  Number of Distinct Samples         10
  Minimum                            750
  Maximum                            1161
  Mean                               925.7
  Median                             888
  SD                                 136.7
  SE of Mean                         43.24
  Number Above Limit                 3
  Number Equal Limit                 0
  Number Below Limit                 7
H0: Site Median >= 1000 (Form 2)
  Test Value                         3
  Lower Critical Value (0.05)        1
  P-Value                            0.172
Conclusion with Alpha = 0.05
  Do Not Reject H0. Conclude Median >= 1000
  P-Value > Alpha (0.05)
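For small samples, the sign test reduces to a binomial calculation on the counts above and below the limit. The sketch below is a minimal illustration, assuming an exact Binomial(n, 0.5) reference distribution and that observations equal to the limit are dropped. The function names are hypothetical. With 3 values above and 7 below the limit, as in the output above, it gives a lower-tail p-value of 176/1024 and a lower critical value of 1.

```python
from math import comb


def sign_test_lower_p(n_above, n_below):
    """P(B <= n_above) for B ~ Binomial(n, 0.5), used for H0: median >= limit."""
    n = n_above + n_below  # observations equal to the limit are excluded
    return sum(comb(n, i) for i in range(n_above + 1)) / 2 ** n


def lower_critical_value(n, alpha):
    """Largest c with P(B <= c) <= alpha, or -1 if no such c exists."""
    cumulative = 0.0
    critical = -1
    for k in range(n + 1):
        cumulative += comb(n, k) / 2 ** n
        if cumulative > alpha:
            break
        critical = k
    return critical


p_value = sign_test_lower_p(3, 7)
critical = lower_critical_value(10, 0.05)
print(f"p = {p_value:.4f}, lower critical value = {critical}")
```

H0 is rejected only when the number above the limit is at or below the critical value, which is not the case here.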
6.3.1.1.4 Single Sample Wilcoxon Signed Rank Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > No NDs >
Wilcoxon Signed Rank test.
-------
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Wilcoxon Signed Rank Test Options dialog with fields Confidence Level (0.95), Substantial Difference, S (used with Test Form 2), and Action/Compliance Limit (0). Null hypothesis forms: Mean/Median <= Compliance Limit (Form 1, selected), Mean/Median >= Compliance Limit (Form 2), Mean/Median >= Compliance Limit + S (Form 2), Mean/Median = Compliance Limit (2 Sided Alternative).]
o Specify the "Confidence Level." The default is "0.95."
o Specify meaningful values for "Substantial Difference, S," and
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is Mean/Median <=
Compliance Limit (Form 1).
o Click "OK" to continue or "Cancel" to cancel the option.
o Click "OK" to continue or "Cancel" to cancel the test.
-------
Output for Single Sample Wilcoxon Signed Rank Test (Full Data without NDs).
Single Sample Wilcoxon Signed Rank Test
Raw Statistics
  Number of Valid Samples            10
  Number of Distinct Samples         10
  Minimum                            750
  Maximum                            1161
  Mean                               925.7
  Median                             888
  SD                                 136.7
  SE of Mean                         43.24
  Number Above Limit                 3
  Number Equal Limit                 0
  Number Below Limit                 7
  T-plus                             11.5
  T-minus                            43.5
H0: Site Median <= 1000 (Form 1)
  Test Value                         11.5
  Critical Value (0.05)              45
  P-Value                            0.947
Conclusion with Alpha = 0.05
  Do Not Reject H0. Conclude Mean/Median <= 1000
  P-Value > Alpha (0.05)
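T-plus and T-minus in the output above are the sums of the ranks of the positive and negative differences from the compliance limit; together they always total n(n+1)/2 (here 11.5 + 43.5 = 55 for n = 10). The following sketch shows how these sums can be formed, assuming average ranks for tied absolute differences and dropping values equal to the limit. The data and the function name are hypothetical.

```python
def signed_rank_sums(data, limit):
    """Return (T-plus, T-minus) for differences from the limit."""
    diffs = sorted((x - limit for x in data if x != limit), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(diffs) and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus


# Hypothetical concentrations compared with a limit of 1000.
values = [750, 820, 860, 880, 890, 905, 940, 1040, 1111, 1161]
t_plus, t_minus = signed_rank_sums(values, 1000)
print(t_plus, t_minus)
```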
6.3.1.2 Single Sample Hypothesis Tests for Data Sets With Non-detects
6.3.1.2.1 Single Sample Proportion Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > With NDs >
Proportion test.
-------
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Proportion Test Options dialog with fields Confidence Level (0.95), Proportion (0.3), and Action/Compliance Limit (6). Null hypothesis forms: P <= Proportion (Form 1, selected), P >= Proportion (Form 2), P = Proportion (2 Sided Alternative).]
o Specify the "Confidence Level." The default is "0.95."
o Specify meaningful values for "Proportion" and the
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is P <= Proportion
(Form 1).
o Click "OK" to continue or "Cancel" to cancel the option.
o Click "OK" to continue or "Cancel" to cancel the test.
-------
Output for Single Sample Proportion Test (with NDs).
Arsenic
Single Sample Proportion Test
Raw Statistics
  Number of Valid Samples            24
  Number of Distinct Samples         10
  Number of Non-Detect Data          13
  Number of Detected Data            11
  Percent Non-Detects                54.17%
  Minimum Non-detect                 0.9
  Maximum Non-detect                 2
  Minimum Detected                   0.5
  Maximum Detected                   3.2
  Mean of Detected Data              1.236
  Median of Detected Data            0.7
  SD of Detected Data                0.965
  Number of Exceedances              2
  Sample Proportion of Exceedances   0.0833
Some Non-Detect Values Exceed
The User Selected Action/Compliance Limit
Unable to do Proportion Test with such parameters
-------
6.3.1.2.2 Single Sample Sign Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > With NDs >
Sign test.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Sign Test Options dialog with fields Confidence Level (0.95), Substantial Difference, S (used with Test Form 2), and Action/Compliance Limit (0). Null hypothesis forms: Median <= Compliance Limit (Form 1, selected), Median >= Compliance Limit (Form 2), Median >= Compliance Limit + S (Form 2), Median = Compliance Limit (2 Sided Alternative).]
-------
o Specify the "Confidence Level." The default is "0.95."
o Specify meaningful values for "Substantial Difference, S" and
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is Median <=
Compliance Limit (Form 1).
o Click "OK" to continue or "Cancel" to cancel the option.
o Click "OK" to continue or "Cancel" to cancel the test.
Output for Single Sample Sign Test (Data with Non-detects).
Arsenic
Single Sample Sign Test
Raw Statistics
  Number of Valid Samples            24
  Number of Distinct Samples         10
  Number of Non-Detect Data          13
  Number of Detected Data            11
  Percent Non-Detects                54.17%
  Minimum Non-detect                 0.9
  Maximum Non-detect                 2
  Minimum Detected                   0.5
  Maximum Detected                   3.2
  Mean of Detected Data              1.236
  Median of Detected Data            0.7
  SD of Detected Data                0.965
  Number Above Limit                 0
  Number Equal Limit                 0
  Number Below Limit                 24
H0: Site Median <= 5 (Form 1)
  Test Value                         0
  Upper Critical Value (0.05)        17
  P-Value                            1
Conclusion with Alpha = 0.05
  Do Not Reject H0. Conclude Median <= 5
  P-Value > Alpha (0.05)
-------
6.3.1.2.3 Single Sample Wilcoxon Signed Rank Test
1. Click Stats/GOF > Hypothesis Testing > Single Sample > With NDs >
Wilcoxon Signed Rank test.
[Screenshot: Stats/GOF > Hypothesis Testing > Single Sample Tests > With NDs menu, listing Proportion, Sign test, and Wilcoxon Signed Rank.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the options button is clicked, the following window will be shown.
[Screenshot: Single Sample Wilcoxon Signed Rank Test Options dialog with fields Confidence Level (0.95), Substantial Difference, S (used with Test Form 2), and Action/Compliance Limit (0). Null hypothesis forms: Mean/Median <= Compliance Limit (Form 1, selected), Mean/Median >= Compliance Limit (Form 2), Mean/Median >= Compliance Limit + S (Form 2), Mean/Median = Compliance Limit (2 Sided Alternative).]
-------
o Specify the "Confidence Level." The default is "0.95."
o Specify meaningful values for "Substantial Difference, S" and
"Action/Compliance Limit."
o Select the form of Null Hypothesis. The default is Mean/Median <=
Compliance Limit (Form 1).
o Click "OK" to continue or "Cancel" to cancel the option.
o Click "OK" to continue or "Cancel" to cancel the test.
Output for Single Sample Wilcoxon Signed Rank Test (Data with Non-detects).
Arsenic
Single Sample Wilcoxon Signed Rank Test
Raw Statistics
  Number of Valid Samples            24
  Number of Distinct Samples         10
  Number of Non-Detect Data          13
  Number of Detected Data            11
  Percent Non-Detects                54.17%
  Minimum Non-detect                 0.9
  Maximum Non-detect                 2
  Minimum Detected                   0.5
  Maximum Detected                   3.2
  Mean of Detected Data              1.236
  Median of Detected Data            0.7
  SD of Detected Data                0.965
  Number Above Limit                 0
  Number Equal Limit                 0
  Number Below Limit                 24
  T-plus                             0
  T-minus                            300
H0: Site Median <= 6 (Form 1)
  Large Sample z-Test Value          -4.293
  Critical Value (0.05)              1.645
  P-Value
Conclusion with Alpha = 0.05
  Do Not Reject H0. Conclude Mean/Median <= 6
  P-Value > Alpha (0.05)
Dataset contains multiple Non-Detect values!
All Observations < 2 are treated as Non-Detects
-------
6.3.2.1 Two-Sample Hypothesis Tests for Data Sets With No Non-detects
6.3.2.1.1 Two-Sample t-Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > No NDs >
t-Test.
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: Two-Sample t-Test options dialog. Visible null hypothesis forms: Sample 2 >= Sample 1 (Form 2), Sample 2 >= Sample 1 + S (Form 2), Sample 2 = Sample 1 (2 Sided).]
-------
o Specify a useful "Substantial Difference, S" value. The default
choice is "0."
o Choose the "Confidence Level." The default choice is "95%."
o Select the form of Null Hypothesis. The default is Sample 2 <=
Sample 1 (Form 1), that is, AOC <= Background.
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the test.
Output for Two-Sample t-Test (Full Data without NDs).
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Distinct Samples         9           19
  Minimum                            3.202       1.5
  Maximum                            20.78       37.87
  Mean                               8.222       17.09
  Median                             5.347       18.79
  SD                                 5.971       9.713
  SE of Mean                         1.888       2.172
Sample 1 vs Sample 2 Two-Sample t-Test
H0: Mu of Sample 2 - Mu of Sample 1 <= 0
  Method                             DF      t-Test Value   Critical t (0.050)   P-Value
  Pooled (Equal Variance)            28      2.637          1.701                0.007
  Satterthwaite (Unequal Variance)   26.6    3.083          1.703                0.002
  Pooled SD 8.688
Conclusion with Alpha = 0.050
  * Student t (Pooled) Test: Reject H0, Conclude Sample 2 > Sample 1
  * Satterthwaite Test: Reject H0, Conclude Sample 2 > Sample 1
Test of Equality of Variances
  Numerator DF    Denominator DF    F-Test Value    P-Value
  19              9                 2.646           0.137
Conclusion with Alpha = 0.05
  * Two variances appear to be equal
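The pooled and Satterthwaite statistics in the two-sample t-test output can be checked from the summary rows alone. The sketch below uses the textbook formulas with the reported sample sizes, means, and SDs. It reproduces the pooled SD of 8.688 and, to within rounding of the inputs, the two t values, the Satterthwaite degrees of freedom, and the variance ratio. The function name is hypothetical, and the formulas are standard ones rather than a transcription of Scout's code.

```python
import math


def two_sample_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled and Satterthwaite t statistics for H0: mean2 - mean1 <= 0."""
    diff = mean2 - mean1
    var1, var2 = sd1 ** 2, sd2 ** 2
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    t_pooled = diff / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    a, b = var1 / n1, var2 / n2  # squared standard errors of each mean
    t_satterthwaite = diff / math.sqrt(a + b)
    df_satterthwaite = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    f_ratio = max(var1, var2) / min(var1, var2)  # larger variance on top
    return {
        "pooled_sd": pooled_sd,
        "t_pooled": t_pooled,
        "t_satterthwaite": t_satterthwaite,
        "df_satterthwaite": df_satterthwaite,
        "f_ratio": f_ratio,
    }


result = two_sample_t_from_summary(10, 8.222, 5.971, 20, 17.09, 9.713)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```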
-------
6.3.2.1.2 Two-Sample Wilcoxon Mann Whitney Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > No NDs >
Wilcoxon Mann Whitney test.
[Screenshot: Stats/GOF > Hypothesis Testing > Two Sample Tests > No NDs menu, including the Wilcoxon-Mann-Whitney and Quantile test entries.]
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: two-sample hypothesis test options dialog with Substantial Difference, S (0; used with Test Form 2) and Confidence Coefficient choices 99.9%, 99.5%, 99%, 97.5%, 95% (selected), and 90%. Null hypothesis forms: Sample 2 <= Sample 1 (Form 1, selected), Sample 2 >= Sample 1 (Form 2), Sample 2 >= Sample 1 + S (Form 2), Sample 2 = Sample 1 (2 Sided).]
-------
o Specify a "Substantial Difference, S" value. The default choice is
"0."
o Choose the "Confidence Level." The default choice is "95%."
o Select the form of Null Hypothesis. The default is Sample 2 <=
Sample 1 (Form 1), that is, AOC <= Background.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the selected options.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the test.
Output for Two-Sample Wilcoxon-Mann-Whitney Test (Full Data).
Sample 1 Data: X(1)
Sample 2 Data: X(2)
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Distinct Samples         9           19
  Minimum                            3.202       1.5
  Maximum                            20.78       37.87
  Mean                               8.222       17.09
  Median                             5.347       18.79
  SD                                 5.971       9.713
  SE of Mean                         1.888       2.172
Wilcoxon-Mann-Whitney (WMW) Test
H0: Mean/Median of Sample 2 <= Mean/Median of Sample 1
  Sample 2 Rank Sum W-Stat           366
  WMW Test U-Stat                    156
  WMW Critical Value (0.050)         137
  Approximate P-Value                0.00731
Conclusion with Alpha = 0.05
  Reject H0, Conclude Sample 2 > Sample 1
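The U statistic above is the Sample 2 rank sum minus n2(n2+1)/2, which gives 366 - 210 = 156. The approximate p-value of 0.00731 is consistent with a normal approximation that uses a 0.5 continuity correction and no tie correction; that reading is an inference from the numbers, not a statement of Scout's implementation. The sketch below shows the rank sum with average ranks and the approximation. The function names are hypothetical.

```python
import math


def rank_sum(sample1, sample2):
    """Rank sum of sample2 in the pooled data, using average ranks for ties."""
    pooled = sorted(sample1 + sample2)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        i = j + 1
    return sum(rank_of[x] for x in sample2)


def wmw_u_and_p(w2, n1, n2):
    """U statistic and upper-tail normal approximation for H0: Sample 2 <= Sample 1."""
    u = w2 - n2 * (n2 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - 0.5 - mean_u) / sd_u  # 0.5 continuity correction
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return u, z, p


u_stat, z_value, p_value = wmw_u_and_p(366, 10, 20)
print(f"U = {u_stat:.0f}, z = {z_value:.3f}, p = {p_value:.5f}")
```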
-------
6.3.2.1.3 Two-Sample Quantile Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > No NDs >
Quantile Test.
[Screenshot: Stats/GOF > Hypothesis Testing > Two Sample Tests > No NDs menu, listing t Test, Wilcoxon-Mann-Whitney, and Quantile test.]
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: Quantile Test Options dialog with Confidence Coefficient choices 99%, 97.5%, 95%, and 90%.]
o Choose the "Confidence Level." The default choice is "95%."
o Click on "OK" button to continue or on "Cancel" button to cancel
the option.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the test.
-------
Output for Two-Sample Quantile Test (Full Data).
Non-parametric Quantile Hypothesis Test for Full Dataset (No Non-Detects)
Date/Time of Computation             3/4/2008 6:52:32 AM
User Selected Options
  From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                     OFF
  Confidence Coefficient             95%
  Null Hypothesis                    Sample 2 Concentration Less Than or Equal to Sample 1 Concentration (Form 1)
  Alternative Hypothesis             Sample 2 Concentration Greater Than Sample 1 Concentration
Sample 1 Data: Group1X
Sample 2 Data: Group2X
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Distinct Samples         9           19
  Minimum                            3.202       1.5
  Maximum                            20.78       37.87
  Mean                               8.222       17.09
  Median                             5.347       18.79
  SD                                 5.971       9.713
  SE of Mean                         1.888       2.172
Quantile Test
H0: Sample 2 Concentration <= Sample 1 Concentration (Form 1)
  Approximate R Value (0.045)                      14
  Approximate K Value (0.045)                      12
  Number of Sample 2 Observations in 'R' Largest   13
  Calculated Alpha                                 0.0446
Conclusion with Alpha = 0.045
  Reject H0, Conclude Sample 2 Concentration > Sample 1 Concentration
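The "Calculated Alpha" in the quantile test output can be reproduced with a hypergeometric tail probability: under H0, it is the chance that at least K of the R largest pooled observations come from Sample 2. The sketch below assumes that standard formulation. The function name is hypothetical. With 10 Sample 1 values, 20 Sample 2 values, R = 14, and K = 12, it returns about 0.0446.

```python
from math import comb


def quantile_test_alpha(n_sample1, n_sample2, r, k):
    """P(at least k of the r largest pooled values come from sample 2) under H0."""
    total = n_sample1 + n_sample2
    favorable = sum(
        comb(n_sample2, i) * comb(n_sample1, r - i)  # comb is 0 when r - i > n_sample1
        for i in range(k, min(r, n_sample2) + 1)
    )
    return favorable / comb(total, r)


alpha = quantile_test_alpha(10, 20, r=14, k=12)
observed_in_r_largest = 13  # from the output above
reject_h0 = observed_in_r_largest >= 12
print(f"alpha = {alpha:.4f}, reject H0: {reject_h0}")
```

Because only discrete values of alpha are attainable, the achieved level (0.0446) differs slightly from the nominal 0.05.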
-------
6.3.2.2 Two-Sample Hypothesis Tests for Data Sets With Non-detects
6.3.2.2.1 Two-Sample Wilcoxon Mann Whitney Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > With NDs >
Wilcoxon Mann Whitney test.
[Screenshot: Stats/GOF > Hypothesis Testing > Two Sample Tests > With NDs menu, listing Wilcoxon-Mann-Whitney, Gehan, and Quantile Test.]
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: two-sample hypothesis test options dialog with Substantial Difference, S (0; used with Test Form 2) and Confidence Coefficient choices 99.9%, 99.5%, 99%, 97.5%, 95% (selected), and 90%. Null hypothesis forms: Sample 2 <= Sample 1 (Form 1, selected), Sample 2 >= Sample 1 (Form 2), Sample 2 >= Sample 1 + S (Form 2), Sample 2 = Sample 1 (2 Sided).]
-------
o Specify a meaningful "Substantial Difference, S" value. The
default choice is "0."
o Choose the "Confidence Level." The default choice is "95%."
o Select the form of Null Hypothesis. The default is Sample 2 <=
Sample 1 (Form 1), that is, AOC <= Background.
o Click on the "OK" button to continue or on the "Cancel" button to
cancel the selected options.
o Click on the "OK" button to continue or on the "Cancel" button to cancel the
test.
-------
Output for Two-Sample Wilcoxon-Mann-Whitney Test (with Non-detects).
User Selected Options
  From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                     OFF
  Confidence Coefficient             95%
  Substantial Difference (S)         0.000
  Selected Null Hypothesis           Sample 2 Mean/Median Less Than or Equal to Sample 1 Mean/Median (Form 1)
  Alternative Hypothesis             Sample 2 Mean/Median Greater Than Sample 1 Mean/Median
Sample 1 Data: Group1X
Sample 2 Data: Group2X
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Non-Detect Data          2           2
  Number of Detect Data              8           18
  Minimum Non-Detect                 4           1.5
  Maximum Non-Detect                 4           1.5
  Percent Non-detects                20.00%      10.00%
  Minimum Detected                   3.202       6.316
  Maximum Detected                   20.78       37.87
  Mean of Detected Data              9.277       18.83
  Median of Detected Data            6.704       19.36
  SD of Detected Data                6.283       8.582
Wilcoxon-Mann-Whitney Sample 1 vs Sample 2 Test
All observations <= 4 (Max DL) are ranked the same
Wilcoxon-Mann-Whitney (WMW) Test
H0: Mean/Median of Sample 2 <= Mean/Median of Sample 1
  Sample 2 Rank Sum W-Stat           369
  WMW Test U-Stat                    159
  WMW Critical Value (0.050)         137
  Approximate P-Value                0.00503
Conclusion with Alpha = 0.05
  Reject H0, Conclude Sample 2 > Sample 1
Note: In the WMW test, all observations below the largest detection limit are considered to be NDs
(potentially including detected values), and hence they all receive the same average rank. This action may
considerably reduce the power of the WMW test, which in turn may lead to an incorrect conclusion.
All of the hypothesis testing approaches should be supplemented with graphical displays such as Q-Q plots
and box plots. When multiple detection limits are present, the use of the Gehan test is preferable.
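The pooling described in the note can be illustrated with a short sketch: every observation at or below the largest detection limit, detected or not, is recoded to that limit, so all such values tie and share one average rank. The data, the (value, detected) tuple layout, and the function names are hypothetical; this illustrates the idea and is not Scout's code.

```python
def pool_below_max_dl(sample1, sample2):
    """Recode every value <= the largest detection limit to that limit.

    Each observation is a (value, detected) pair; for a non-detect,
    value is the reported detection limit.
    """
    nd_limits = [v for v, detected in sample1 + sample2 if not detected]
    if not nd_limits:  # no non-detects: nothing to pool
        return [v for v, _ in sample1], [v for v, _ in sample2]
    max_dl = max(nd_limits)

    def recode(sample):
        return [max_dl if v <= max_dl else v for v, _ in sample]

    return recode(sample1), recode(sample2)


def rank_sum(sample1, sample2):
    """Rank sum of sample2 in the pooled data, using average ranks for ties."""
    pooled = sorted(sample1 + sample2)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        i = j + 1
    return sum(rank_of[x] for x in sample2)


# Hypothetical data: False marks a non-detect reported at its detection limit.
background = [(4.0, False), (3.2, True), (6.0, True)]
site = [(1.5, False), (7.0, True), (9.0, True)]
x1, x2 = pool_below_max_dl(background, site)
w2 = rank_sum(x1, x2)
print(x1, x2, w2)
```

In this example the detected value 3.2 is pooled with the non-detects, which is the loss of information the note warns about.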
-------
6.3.2.2.2 Two-Sample Gehan Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > With NDs >
Gehan test.
[Screenshot: Stats/GOF > Hypothesis Testing > Two Sample Tests > With NDs menu, listing Wilcoxon-Mann-Whitney, Gehan, and Quantile Test.]
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: two-sample hypothesis test options dialog with Substantial Difference, S (0; used with Test Form 2) and Confidence Coefficient choices 99.9%, 99.5%, 99%, 97.5%, 95% (selected), and 90%. Null hypothesis forms: Sample 2 <= Sample 1 (Form 1, selected), Sample 2 >= Sample 1 (Form 2), Sample 2 >= Sample 1 + S (Form 2), Sample 2 = Sample 1 (2 Sided).]
-------
o Specify a "Substantial Difference, S" value. The default choice is
"0."
o Choose the "Confidence Level." The default choice is "95%."
o Select the form of Null Hypothesis. The default is Sample 2 <=
Sample 1 (Form 1), that is, AOC <= Background.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the selected options.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the test.
-------
Output for Two-Sample Gehan Test (with Non-detects).
Gehan Sample 1 vs Sample 2 Comparison Hypothesis Test for Data Sets with Non-Detects
Date/Time of Computation             3/4/2008 7:10:37 AM
User Selected Options
  From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                     OFF
  Confidence Coefficient             95%
  Substantial Difference             0.000
  Selected Null Hypothesis           Sample 2 Mean/Median Less Than or Equal to Sample 1 Mean/Median (Form 1)
  Alternative Hypothesis             Sample 2 Mean/Median Greater Than Sample 1 Mean/Median
Sample 1 Data: Group1X
Sample 2 Data: Group2X
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Non-Detect Data          2           2
  Number of Detect Data              8           18
  Minimum Non-Detect                 4           1.5
  Maximum Non-Detect                 4           1.5
  Percent Non-detects                20.00%      10.00%
  Minimum Detected                   3.202       6.316
  Maximum Detected                   20.78       37.87
  Mean of Detected Data              9.277       18.83
  Median of Detected Data            6.704       19.36
  SD of Detected Data                6.283       8.582
Sample 1 vs Sample 2 Gehan Test
H0: Mean/Median of Sample 2 <= Mean/Median of background
  Gehan z Test Value                 2.556
  Critical z (0.95)                  1.645
  P-Value                            0.00529
Conclusion with Alpha = 0.05
  Reject H0. Conclude Sample 2 > Sample 1
  P-Value < alpha (0.05)
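For readers who want to see how a Gehan-type z value is formed, the sketch below implements one common formulation: each observation is scored by how many pooled observations are unambiguously below it minus how many are unambiguously above it, and the Sample 2 score total is standardized with the permutation variance. Everything here is an assumption made for illustration, including the (value, detected) tuple layout, the handling of equal values, and the lack of a continuity correction. It is not expected to reproduce Scout's value of 2.556.

```python
import math


def gehan_z(sample1, sample2):
    """Gehan-type z for left-censored data; observations are (value, detected)."""
    pooled = [(v, d, 0) for v, d in sample1] + [(v, d, 1) for v, d in sample2]

    def definitely_less(a, b):
        """True when a is unambiguously below b."""
        (va, a_detected, _), (vb, b_detected, _) = a, b
        if not b_detected:
            return False  # b is only known to be below vb
        return va < vb if a_detected else va <= vb

    scores = []
    for obs in pooled:
        below = sum(definitely_less(other, obs) for other in pooled)
        above = sum(definitely_less(obs, other) for other in pooled)
        scores.append(below - above)

    m, n = len(sample1), len(sample2)
    total = m + n
    statistic = sum(s for s, obs in zip(scores, pooled) if obs[2] == 1)
    variance = m * n * sum(s * s for s in scores) / (total * (total - 1))
    return statistic / math.sqrt(variance)


# Hypothetical, fully detected data: the score then behaves like a rank test.
background = [(1.0, True), (2.0, True), (3.0, True)]
site = [(4.0, True), (5.0, True), (6.0, True)]
z_value = gehan_z(background, site)
print(f"z = {z_value:.3f}")
```

A large positive z supports the alternative that Sample 2 tends to be larger than Sample 1.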
-------
6.3.2.2.3 Two-Sample Quantile Test
1. Click Stats/GOF > Hypothesis Testing > Two-Sample Tests > With NDs >
Quantile Test.
2. The "Select Variables" screen (Section 3.2.2) will appear.
o Select the variables for testing.
o When the options button is clicked, the following window will be shown.
[Screenshot: Quantile Test Options dialog with Confidence Coefficient choices 99%, 97.5%, 95%, and 90%.]
o Choose the "Confidence Level." The default choice is "95%."
o Click on "OK" button to continue or on "Cancel" button to cancel the
option.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the test.
-------
Output for Two-Sample Quantile Test (with Non-detects).
Gehan Sample 1 vs Sample 2 Comparison Hypothesis Test for Data Sets with Non-Detects
Date/Time of Computation             3/4/2008 7:10:37 AM
User Selected Options
  From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                     OFF
  Confidence Coefficient             95%
  Substantial Difference             0.000
  Selected Null Hypothesis           Sample 2 Mean/Median Less Than or Equal to Sample 1 Mean/Median (Form 1)
  Alternative Hypothesis             Sample 2 Mean/Median Greater Than Sample 1 Mean/Median
Sample 1 Data: Group1X
Sample 2 Data: Group2X
Raw Statistics                       Sample 1    Sample 2
  Number of Valid Samples            10          20
  Number of Non-Detect Data          2           2
  Number of Detect Data              8           18
  Minimum Non-Detect                 4           1.5
  Maximum Non-Detect                 4           1.5
  Percent Non-detects                20.00%      10.00%
  Minimum Detected                   3.202       6.316
  Maximum Detected                   20.78       37.87
  Mean of Detected Data              9.277       18.83
  Median of Detected Data            6.704       19.36
  SD of Detected Data                6.283       8.582
Sample 1 vs Sample 2 Gehan Test
H0: Mean/Median of Sample 2 <= Mean/Median of background
  Gehan z Test Value                 2.556
  Critical z (0.95)                  1.645
  P-Value                            0.00529
Conclusion with Alpha = 0.05
  Reject H0. Conclude Sample 2 > Sample 1
  P-Value < alpha (0.05)
6.4 Classical Intervals
This section illustrates the computation of various parametric and nonparametric lower
and upper limits for confidence, prediction, and tolerance intervals. The data used are
univariate and can be with or without non-detects. A detailed description of those limits
can be found in the ProUCL 4.00.04 Technical Guide.
-------
6.4.1 Upper (Right Sided) Limits
This module in Scout computes various parametric and nonparametric statistics and
upper limits that can be used as background threshold values and other not-to-exceed
values. The detailed illustrations of the computing of those statistics can be found in the
ProUCL 4.00.04 Technical Guide and User Guide (Chapter 10 and Chapter 11).
Right sided limits can be obtained separately for data following normal, gamma,
lognormal, or nonparametric distributions, using any of the four options ("Normal,"
"Gamma," "Lognormal," or "Nonparametric") from the drop-down menu. If the "All"
option in the drop-down menu is used, then the limits for all four distributions are printed
on a single output sheet. Examples illustrated for the Upper (Right Sided) limits are shown
using the "All" option.
6.4.1.1 Upper (Right Sided) Confidence Limits (UCLs)
6.4.1.1.1 No NDs
1. Click Stats/GOF > Intervals > Upper (Right Sided) > UCLs > No NDs >
All.
[Screenshot: Stats/GOF > Intervals > Upper (Right Sided) > UCLs > No NDs menu.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
-------
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the option button is clicked, the following window will be shown.
[Screenshot: UCL options dialog with fields Confidence Level and Number of Bootstrap Operations (2000).]
o Choose the "Confidence Level." The default choice is "95%."
o Choose "Number of Bootstrap Operations." The default is "2000."
o Click on "OK" button to continue or on "Cancel" button to cancel the
option.
o Click on the "OK" button to continue or on the "Cancel" button to cancel
the UCLs.
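The two options above, the confidence level and the number of bootstrap operations, feed directly into several of the nonparametric UCLs. The sketch below illustrates two of them under textbook definitions: the Chebyshev (Mean, Sd) UCL and a percentile bootstrap UCL. The data, function names, and fixed random seed are hypothetical. The sketch is not expected to match Scout's bootstrap results, which depend on its own random draws.

```python
import math
import random
from statistics import mean, stdev


def chebyshev_ucl(data, confidence=0.95):
    """Chebyshev (Mean, Sd) UCL: mean + sqrt(1/alpha - 1) * sd / sqrt(n)."""
    alpha = 1.0 - confidence
    return mean(data) + math.sqrt(1.0 / alpha - 1.0) * stdev(data) / math.sqrt(len(data))


def percentile_bootstrap_ucl(data, confidence=0.95, n_boot=2000, seed=0):
    """Upper percentile of the means of n_boot resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    boot_means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_boot))
    index = min(n_boot - 1, math.ceil(confidence * n_boot) - 1)
    return boot_means[index]


# Hypothetical concentrations, not taken from the guide.
values = [1.5, 3.2, 7.0, 12.4, 24.6, 40.1, 55.3, 80.9, 121.1]
print(f"95% Chebyshev UCL:   {chebyshev_ucl(values):.2f}")
print(f"99% Chebyshev UCL:   {chebyshev_ucl(values, 0.99):.2f}")
print(f"95% bootstrap UCL:   {percentile_bootstrap_ucl(values):.2f}")
```

Raising the number of bootstrap operations reduces run-to-run variation in the bootstrap UCL at the cost of computing time.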
-------
Output Screen for UCL for Data Sets with No Non-detects (All option).
General UCL Statistics for Full Data Sets
User Selected Options
  From File                              D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                         OFF
  Confidence Coefficient                 95%
  Number of Bootstrap Operations         2000
General Statistics
  Number of Valid Observations           53
  Number of Distinct Observations        51
Raw Statistics
  Minimum                                1.5
  Maximum                                121.1
  Mean                                   51.1
  Median                                 24.56
  SD                                     43.79
  Coefficient of Variation               0.857
  Skewness                               0.277
Log-transformed Statistics
  Minimum of Log Data                    0.405
  Maximum of Log Data                    4.797
  Mean of Log Data                       3.325
  SD of Log Data                         1.233
Relevant UCL Statistics
Normal Distribution Test
  Lilliefors Test Statistic              0.247
  Lilliefors Critical Value              0.122
  Data Not Normal at 5% Significance Level
Assuming Normal Distribution
  95% Student's-t UCL                    61.18
  95% UCLs (Adjusted for Skewness)
  95% Adjusted-CLT UCL                   61.24
Lognormal Distribution Test
  Lilliefors Test Statistic              0.225
  Lilliefors Critical Value              0.122
  Data Not Lognormal at 5% Significance Level
Assuming Lognormal Distribution
  95% H-UCL                              100.5
  95% Chebyshev (MVUE) UCL               124.7
  97.5% Chebyshev (MVUE) UCL             151.5
  99% Chebyshev (MVUE) UCL               204.1
Gamma Distribution Test
  k star (bias corrected)                0.912
  Theta star                             56.04
  nu star                                96.66
  Approximate Chi Square Value (.05)     74.98
  Adjusted Level of Significance         0.0455
  Adjusted Chi Square Value              74.45
  Anderson-Darling Test Statistic        2.591
  Anderson-Darling 5% Critical Value     0.782
  Kolmogorov-Smirnov Test Statistic      0.222
  Kolmogorov-Smirnov 5% Critical Value   0.126
  Data Not Gamma Distributed at 5% Significance Level
Assuming Gamma Distribution
  95% Approximate Gamma UCL              65.88
  95% Adjusted Gamma UCL                 66.35
Data Distribution
  Data do not follow a Discernable Distribution (0.05)
Nonparametric Statistics
  95% CLT UCL                            61
  95% Jackknife UCL                      61.18
  95% Standard Bootstrap UCL             60.9
  95% Bootstrap-t UCL                    61.13
  95% Hall's Bootstrap UCL               61.15
  95% Percentile Bootstrap UCL           61.33
  95% BCA Bootstrap UCL                  61.03
  95% Chebyshev (Mean, Sd) UCL           77.32
  97.5% Chebyshev (Mean, Sd) UCL         88.66
  99% Chebyshev (Mean, Sd) UCL           110.9
Potential UCL to Use
  Use 97.5% Chebyshev (Mean, Sd) UCL     88.66
-------
6.4.1.1.2 With NDs
1. Click Stats/GOF > Intervals > Upper (Right Sided) > UCLs > With NDs
> All.
[Screenshot: Stats/GOF > Intervals > Upper (Right Sided) > UCLs > With NDs menu.]
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the option button is clicked, the following window will be shown.
[Options dialog: Confidence Level; Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Choose the "Confidence Level." The default choice is "95%."
o Choose "Number of Bootstrap Operations." The default is "2000."
o Click on "OK" button to continue or on "Cancel" button to cancel the
option.
-------
° Click on the "OK" button to continue or on the "Cancel" button to cancel
the UCLs.
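The "Number of Bootstrap Operations" option sets how many resamples the bootstrap-based limits use. As a rough illustration of the idea only (this is not Scout's code; the function name, the fixed seed, and the sample values are assumptions, and non-detects are ignored), a percentile bootstrap UCL of the mean can be sketched as:

```python
import random
import statistics

def percentile_bootstrap_ucl(data, confidence=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap upper confidence limit of the mean (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Mean of each resample drawn with replacement, sorted ascending.
    boot_means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    # Position of the `confidence` quantile among the sorted bootstrap means.
    idx = min(n_boot - 1, int(confidence * n_boot))
    return boot_means[idx]

sample = [3.2, 9.7, 24.6, 51.0, 96.9, 116.5, 121.1]  # made-up values
ucl = percentile_bootstrap_ucl(sample)
print(round(ucl, 2))
```

Raising `n_boot` mainly reduces the run-to-run variation of the limit; it does not change what is being estimated.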
Output Screen for UCL for Data Sets with Non-detects (All option).

General UCL Statistics for Data Sets with Non-Detects
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Confidence Coefficient                        95%
  Number of Bootstrap Operations                2000

X
General Statistics
  Number of Valid Data                          53
  Number of Detected Data                       49
  Number of Distinct Detected Data              49
  Number of Non-Detect Data                     4
  Percent Non-Detects                           7.55%

                              Raw Statistics    Log-transformed Statistics
  Minimum Detected                3.202              1.164
  Maximum Detected                121.1              4.797
  Mean of Detected                55.05              3.523
  SD of Detected                  43.2               1.128
  Minimum Non-Detect              1.5                0.405
  Maximum Non-Detect              4                  1.386

Note: Data have multiple DLs - Use of KM Method is recommended
For all methods (except KM, DL/2, and ROS Methods),
Observations < Largest ND are treated as NDs
  Number treated as Non-Detect                  5
  Number treated as Detected                    48
  Single DL Non-Detect Percentage               9.43%

UCL Statistics
Normal Distribution Test with Detected Values Only
  Lilliefors Test Statistic                     0.802
  5% Lilliefors Critical Value                  0.947
  Data Not Normal at 5% Significance Level
Lognormal Distribution Test with Detected Values Only
  Lilliefors Test Statistic                     0.856
  5% Lilliefors Critical Value                  0.947
  Data Not Lognormal at 5% Significance Level
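The non-detect counts on this screen follow from simple bookkeeping: 4 of 53 observations are non-detects (7.55%), and under the single-detection-limit view any detected value below the largest non-detect is also treated as a non-detect, giving 5 of 53 (9.43%). A minimal sketch of that rule, assuming a 1/0 detect flag per observation and at least one non-detect (the function name and the small data set are hypothetical):

```python
def nd_summary(values, detected):
    """values: reported value or detection limit; detected: 1 = detect, 0 = non-detect."""
    n = len(values)
    nd_limits = [v for v, d in zip(values, detected) if d == 0]
    largest_nd = max(nd_limits)  # assumes at least one non-detect is present
    # Single-DL view: every non-detect, plus detects below the largest ND.
    treated_nd = sum(
        1 for v, d in zip(values, detected) if d == 0 or v < largest_nd
    )
    return {
        "percent_nd": 100.0 * len(nd_limits) / n,
        "treated_as_nd": treated_nd,
        "treated_as_detected": n - treated_nd,
        "single_dl_percent_nd": 100.0 * treated_nd / n,
    }

# Made-up example: two NDs (limits 1.5 and 4.0) and one detect (3.2) below 4.0.
summary = nd_summary([1.5, 4.0, 3.2, 9.7, 24.6], [0, 0, 1, 1, 1])
print(summary)
```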
-------
Output Screen for UCL for Data Sets with Non-detects (All option) (continued).

Assuming Normal Distribution
  DL/2 Substitution Method
    Mean                                        51
    SD                                          43.9
    95% DL/2 (t) UCL                            61.1
  Maximum Likelihood Estimate (MLE) Method
    Mean                                        48.86
    SD                                          46.77
    95% MLE (t) UCL                             59.62
    95% MLE (Tiku) UCL                          59.4

Assuming Lognormal Distribution
  DL/2 Substitution Method
    Mean                                        3.273
    SD                                          1.406
    95% H-Stat (DL/2) UCL                       105.5
  Log ROS Method
    Mean in Log Scale                           3.34
    SD in Log Scale                             1.264
    Mean in Original Scale                      51.13
    SD in Original Scale                        43.75
    95% Percentile Bootstrap UCL                [illegible]
    95% BCA Bootstrap UCL                       60.82

Gamma Distribution Test with Detected Values Only
  k star (bias corrected)                       1.111
  Theta star                                    49.54
  nu star                                       108.9
  A-D Test Statistic                            2.882
  5% A-D Critical Value                         0.775
  K-S Test Statistic                            0.236
  5% K-S Critical Value                         0.13
  Data Not Gamma Distributed at 5% Significance Level

Data Distribution Test with Detected Values Only
  Data do not follow a Discernable Distribution (0.05)

Assuming Gamma Distribution
  Gamma ROS Statistics using Extrapolated Data
    Minimum                                     1.0000E-9
    Maximum                                     121.1
    Mean                                        50.9
    Median                                      24.56
    SD                                          44.02
    k star                                      0.302
    Theta star                                  168.3
    Nu star                                     32.05
    AppChi2                                     20.11
    95% Gamma Approximate UCL                   81.11
    95% Adjusted Gamma UCL                      82.21

Nonparametric Statistics
  Kaplan-Meier (KM) Method
    Mean                                        51.14
    SD                                          43.33
    SE of Mean                                  6.013
    95% KM (t) UCL                              61.21
    95% KM (z) UCL                              61.03
    95% KM (jackknife) UCL                      61.14
    95% KM (bootstrap t) UCL                    62.07
    95% KM (BCA) UCL                            60.58
    95% KM (Percentile Bootstrap) UCL           60.92
    95% KM (Chebyshev) UCL                      77.35
    97.5% KM (Chebyshev) UCL                    89.69
    99% KM (Chebyshev) UCL                      111

Potential UCLs to Use
  95% KM (Chebyshev) UCL                        77.35

Note: DL/2 is not a recommended method.
-------
6.4.1.2 Upper Prediction Limits (UPL) / Upper Tolerance Limits (UTL)
6.4.1.2.1 No NDs
1. Click Stats/GOF > Intervals > Upper (Right Sided) > UPL/UTL > No
NDs > All.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° When the option button is clicked, the following window will be shown.
[Options dialog: Confidence Level; Coverage (0.9); Different or Future K Values (1); Number of Bootstrap Operations (2000); OK and Cancel buttons]
-------
o Specify the "Confidence Level," a number in the interval [0.5, 1)
(0.5 is included, 1 is not). The default choice is "0.95."
o Specify the "Coverage" level, a number in the interval (0.0, 1).
The default is "0.9."
o Specify "Different or Future K Values," the number K of future
observations. The default choice is "1."
o Specify the "Number of Bootstrap Operations." The default choice
is "2000."
o Click the "OK" button to continue or the "Cancel" button to cancel the
option.
o Click the "OK" button to continue or the "Cancel" button to cancel the
UPLs and UTLs.
-------
Output Screen for UPL/UTL for Data Sets with No Non-detects (All option).

General Background Statistics for Full Data Sets
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Confidence Coefficient                        95%
  Coverage                                      90%
  Different or Future K Values                  1
  Number of Bootstrap Operations                2000

X
General Statistics
  Total Number of Observations                  53
  Number of Distinct Observations               51

                              Raw Statistics    Log-Transformed Statistics
  Minimum                         1.5                0.405
  Maximum                         121.1              4.797
  Second Largest                  116.5              4.758
  First Quartile                  9.708              2.273
  Median                          24.56              3.201
  Third Quartile                  96.88              4.573
  Mean                            51.1               3.325
  SD                              43.78              1.298
  Coefficient of Variation        0.857
  Skewness                        0.277

Background Statistics
Normal Distribution Test
  Lilliefors Test Statistic                     0.247
  Lilliefors Critical Value                     0.122
  Data Not Normal at 5% Significance Level
Lognormal Distribution Test
  Lilliefors Test Statistic                     0.225
  Lilliefors Critical Value                     0.122
  Data Not Lognormal at 5% Significance Level
-------
Output Screen for UPL/UTL for Data Sets with No Non-detects (All option) (continued).

                              Assuming Normal    Assuming Lognormal
                              Distribution       Distribution
  95% UTL with 90% Coverage       122.4              229.8
  95% UPL (t)                     125.1              249.3
  90% Percentile (z)              107.2              146.6
  95% Percentile (z)              123.1              234.9
  99% Percentile (z)              153                568.9

Gamma Distribution Test
  k star                                        0.912
  Theta star                                    56.04
  nu star                                       96.66
  A-D Test Statistic                            2.591
  5% A-D Critical Value                         0.782
  K-S Test Statistic                            0.222
  5% K-S Critical Value                         0.126
  Data Not Gamma Distributed at 5% Significance Level

Data Distribution Test
  Data do not follow a Discernable Distribution (0.05)

Assuming Gamma Distribution
  90% Percentile                                [illegible]
  95% Percentile                                159.2
  99% Percentile                                246.6

Nonparametric Statistics
  90% Percentile                                110
  95% Percentile                                116.4
  99% Percentile                                [illegible]
  95% UTL with 90% Coverage                     116.4
  95% Percentile Bootstrap UTL with 90% Coverage    114.8
  95% BCA Bootstrap UTL with 90% Coverage       114.8
  95% UPL                                       116.4
  95% Chebyshev UPL                             243.7
  Upper Threshold Limit Based upon IQR          227.1
Note: UPL (or upper percentile for gamma distributed data) represents a preferred estimate of BTV
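The "Percentile (z)" rows on this screen appear to be plain normal-theory percentiles, mean + z_p × SD, computed on the raw scale for the normal column and on the log scale (then exponentiated) for the lognormal column. A short sketch under that assumption, using the rounded summary statistics printed on the screen (so the last digit can differ slightly from Scout's values); the function names are illustrative only:

```python
import math
from statistics import NormalDist

def normal_percentile(mean, sd, p):
    """p-th percentile of a normal distribution with the given mean and SD."""
    return mean + NormalDist().inv_cdf(p) * sd

def lognormal_percentile(log_mean, log_sd, p):
    """p-th percentile of a lognormal distribution from log-scale mean and SD."""
    return math.exp(log_mean + NormalDist().inv_cdf(p) * log_sd)

# Rounded statistics from the output screen: raw mean 51.1, SD 43.78;
# log-scale mean 3.325, SD 1.298.
for p in (0.90, 0.95, 0.99):
    print(p,
          round(normal_percentile(51.1, 43.78, p), 1),
          round(lognormal_percentile(3.325, 1.298, p), 1))
```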
6.4.1.2.2 With NDs
1. Click Stats/GOF > Intervals > Upper (Right Sided) > UPL/UTL > With
NDs > All.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
-------
o If the statistics have to be produced by using a Group variable, then select
a group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o When the option button is clicked, the following window will be shown.
[Options dialog: Confidence Level; Coverage (0.9); Different or Future K Values (1); Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Specify the "Confidence Level"; a number in the interval [0.5, 1), 0.5
inclusive. The default choice is "0.95."
o Specify the "Coverage" level; a number in the interval (0.0, 1).
Default is "0.9."
o Specify "Different or Future K Values," the number K of future
observations. The default choice is "1."
o Specify the "Number of Bootstrap Operations." The default choice
is "2000."
o Click the "OK" button to continue or the "Cancel" button to cancel the
option.
o Click the "OK" button to continue or the "Cancel" button to cancel the
UPLs and UTLs.
-------
Output Screen for UPL/UTL for Data Sets With Non-detects (All option).

General Background Statistics for Data Sets with Non-Detects
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Confidence Coefficient                        95%
  Coverage                                      90%
  Different or Future K Values                  1
  Number of Bootstrap Operations                2000

X
General Statistics
  Number of Valid Data                          53
  Number of Detected Data                       49
  Number of Distinct Detected Data              49
  Number of Non-Detect Data                     4
  Percent Non-Detects                           7.55%

                              Raw Statistics    Log-transformed Statistics
  Minimum Detected                3.202              1.164
  Maximum Detected                121.1              4.797
  Mean of Detected                55.05              3.523
  SD of Detected                  43.2               1.128
  Minimum Non-Detect              1.5                0.405
  Maximum Non-Detect              4                  1.386

Data with Multiple Detection Limits
  Note: Data have multiple DLs - Use of KM Method is recommended
  For all methods (except KM, DL/2, and ROS Methods),
  Observations < Largest ND are treated as NDs
Single Detection Limit Scenario
  Number treated as Non-Detect with Single DL   5
  Number treated as Detected with Single DL     48
  Single DL Non-Detect Percentage               9.43%

Background Statistics
Normal Distribution Test with Detected Values Only
  Lilliefors Test Statistic                     0.802
  5% Lilliefors Critical Value                  0.947
  Data Not Normal at 5% Significance Level
Lognormal Distribution Test with Detected Values Only
  Lilliefors Test Statistic                     0.856
  5% Lilliefors Critical Value                  0.947
  Data Not Lognormal at 5% Significance Level
-------
Output Screen for UPL/UTL for Data Sets With Non-detects (All option) (continued).

Assuming Normal Distribution
  DL/2 Substitution Method
    Mean                                        51
    SD                                          43.9
    95% UTL 90% Coverage                        122.5
    95% UPL (t)                                 125.2
    90% Percentile (z)                          107.3
    95% Percentile (z)                          123.2
    99% Percentile (z)                          153.1
  Maximum Likelihood Estimate (MLE) Method
    Mean                                        48.86
    SD                                          46.77
    95% UTL with 90% Coverage                   125
    95% UPL (t)                                 127.9
    90% Percentile (z)                          108.8
    95% Percentile (z)                          125.8
    99% Percentile (z)                          157.7

Assuming Lognormal Distribution
  DL/2 Substitution Method
    Mean (Log Scale)                            3.273
    SD (Log Scale)                              1.406
    95% UTL 90% Coverage                        260.2
    95% UPL (t)                                 284.1
    90% Percentile (z)                          159.9
    95% Percentile (z)                          266.5
    99% Percentile (z)                          694.8
  Log ROS Method
    Mean in Original Scale                      51.13
    SD in Original Scale                        43.75
    95% UTL with 90% Coverage                   220.8
    95% BCA UTL with 90% Coverage               114.5
    95% Bootstrap (%) UTL with 90% Coverage     114.8
    95% UPL (t)                                 238.9
    90% Percentile (z)                          142.5
    95% Percentile (z)                          225.6
    99% Percentile (z)                          533.6

Gamma Distribution Test with Detected Values Only
  k star (bias corrected)                       1.111
  Theta star                                    49.54
  nu star                                       108.9
  A-D Test Statistic                            2.882
  5% A-D Critical Value                         0.775
  K-S Test Statistic                            0.236
  5% K-S Critical Value                         0.13
  Data Not Gamma Distributed at 5% Significance Level

Data Distribution Test with Detected Values Only
  Data do not follow a Discernable Distribution (0.05)

Nonparametric Statistics
  Kaplan-Meier (KM) Method
    Mean                                        51.14
    SD                                          43.33
    SE of Mean                                  6.013
    95% KM UTL with 90% Coverage                121.7
    95% KM Chebyshev UPL                        241.8
    95% KM UPL (t)                              124.4
    90% Percentile (z)                          106.7
    95% Percentile (z)                          122.4
    99% Percentile (z)                          151.9

Assuming Gamma Distribution
  Gamma ROS Statistics with Extrapolated Data
    Mean                                        50.9
    Median                                      24.56
    SD                                          44.02
    k star                                      0.302
    Theta star                                  168.3
    Nu star                                     32.05
    95% Percentile of Chisquare (2k)            2.759
    90% Percentile                              150
    95% Percentile                              232.3
    99% Percentile                              445.9

Note: UPL (or upper percentile for gamma distributed data) represents a preferred estimate of BTV
For an Example: KM-UPL may be used when multiple detection limits are present
Note: DL/2 is not a recommended method.
-------
6.4.2 Classical Confidence Intervals
6.4.2.1 Without Non-detects
The confidence intervals for data with no non-detects available in Scout are:
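o Normal:
   o Student's t
        $\bar{x} \pm t_{\alpha/2,\,n-1}\, \frac{s}{\sqrt{n}}$
o Gamma:
   o Approximate Gamma
   o Adjusted Gamma
o Lognormal:
   o Land's H
        $LCL = \exp\left( \bar{y} + \frac{s_y^2}{2} + \frac{s_y H_{\alpha/2}}{\sqrt{n-1}} \right)$
        $UCL = \exp\left( \bar{y} + \frac{s_y^2}{2} + \frac{s_y H_{1-\alpha/2}}{\sqrt{n-1}} \right)$
   o Chebyshev MVUE
        $\hat{\mu}_{mvue} \pm \frac{1}{\sqrt{\alpha}}\, \frac{\hat{\sigma}_{mvue}}{\sqrt{n}}$
o Nonparametric:
   o CLT
        $\bar{x} \pm z_{\alpha/2}\, \frac{s}{\sqrt{n}}$
   o Jackknife
        $J(\hat{\theta}) \pm t_{\alpha/2,\,n-1}\, \hat{\sigma}_{J(\hat{\theta})}$
   o Standard Bootstrap
        $\hat{\theta} \pm z_{\alpha/2}\, \hat{\sigma}_B$
   o Bootstrap t
        $LCL = \bar{x} - t_{(1-\alpha/2)}\, \frac{s}{\sqrt{n}}$
        $UCL = \bar{x} - t_{(\alpha/2)}\, \frac{s}{\sqrt{n}}$
   o Percentile Bootstrap
        LCL = $(\alpha/2)$ percentile of $\bar{x}$
        UCL = $(1 - \alpha/2)$ percentile of $\bar{x}$
   o Chebyshev
        $\bar{x} \pm \frac{1}{\sqrt{\alpha}}\, \frac{s}{\sqrt{n}}$
   o Modified (t)
        $\bar{x} + \frac{\hat{\mu}_3}{6 s^2 n} \pm t_{\alpha/2,\,n-1}\, \frac{s}{\sqrt{n}}$
   o Adjusted CLT
        $\bar{x} \pm \left[ z_{\alpha/2} + \frac{\hat{k}_3 \left( 1 + 2 z_{\alpha/2}^2 \right)}{6\sqrt{n}} \right] \frac{s}{\sqrt{n}}$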
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide.
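Two of the nonparametric intervals listed above need only the sample mean, standard deviation, and sample size, so they are easy to reproduce by hand. The sketch below is an illustration rather than Scout's implementation (function names are assumptions). It evaluates the CLT interval, mean ± z × s/√n, and the two-sided Chebyshev interval, mean ± (1/√α) × s/√n, for the rounded statistics shown on the output screen later in this section (mean 25.31, SD 5.023, n = 20, confidence 0.95):

```python
import math
from statistics import NormalDist

def clt_interval(mean, sd, n, confidence=0.95):
    """Two-sided CLT interval: mean +/- z_{alpha/2} * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half = z * sd / math.sqrt(n)
    return mean - half, mean + half

def chebyshev_interval(mean, sd, n, confidence=0.95):
    """Two-sided Chebyshev interval: mean +/- (1/sqrt(alpha)) * sd / sqrt(n)."""
    alpha = 1 - confidence
    half = sd / (math.sqrt(alpha) * math.sqrt(n))
    return mean - half, mean + half

clt = clt_interval(25.31, 5.023, 20)
cheb = chebyshev_interval(25.31, 5.023, 20)
print([round(v, 2) for v in clt], [round(v, 2) for v in cheb])
```

The Chebyshev interval is much wider than the CLT interval because it makes no distributional assumption beyond a finite variance.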
1. Click Stats/GOF > Intervals > Classical > Confidence Intervals > No NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
o If the results have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click on "Options" for interval options.
[Options dialog: Confidence Level; Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Specify the preferred "Confidence Level." The default is
"0.95."
o Specify the preferred number of bootstrap operations. The
default is "2000."
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
-------
Output for Classical Confidence Intervals without Non-detects.

Confidence Intervals for Datasets without Non-Detects
  Date/Time of Computation                      11/15/2008 12:39:56 PM
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\[illegible]
  Full Precision                                OFF
  Number of Bootstrap Operations                2000
  Confidence Coefficient                        0.95

Skin(x1)
  Number of Valid Observations                  20
  Number of Distinct Observations               20

Raw Statistics
  Mean                                          25.31
  Median                                        25.55
  Variance                                      25.23
  Standard Deviation                            5.023

Normal Intervals
  Normal                      Lower Limit    Upper Limit
  Student's t                    22.95          27.66

Gamma Statistics
  k Star (Bias Corrected)                       20.54
  Theta Star                                    1.232
  nu Star                                       821.5

Gamma Intervals
  Gamma                       Lower Limit    Upper Limit
  Approximate Gamma              23.03          27.94
  Adjusted Gamma                 22.82          29.21

Log-Transformed Statistics
  Mean of Log Transformed Data                  3.21
  Standard Deviation of Log Transformed Data    0.216
  MVU Estimate of Median                        24.75
  MVU Estimate of Mean                          25.34
  MVU Estimate of SD                            [illegible]
  MVU Estimate of Standard Error of Mean        1.232

Lognormal Intervals
  Lognormal                   Lower Limit    Upper Limit
  Land's H                       23.02          28.26
  Chebyshev (MVUE)               19.83          30.85

Nonparametric Intervals
  Nonparametric               Lower Limit    Upper Limit
  Central Limit Theorem          23.1           27.51
  Jackknife                      22.95          27.66
  Standard Bootstrap             23.19          27.42
  Bootstrap-t                    22.67          27.59
  Percentile Bootstrap           23.13          [illegible]
  Chebyshev                      20.28          30.33
  Modified (t)                   22.93          27.63
  Adjusted CLT                   23.3           27.31
-------
6.4.2.2 With Non-detects
The confidence intervals for data with non-detects available in Scout are:
o Normal:
   o Student's t
   o Normal ROS Student's t
o Gamma:
   o Gamma ROS Approximate Gamma
   o Gamma ROS Adjusted Gamma
o Lognormal:
   o Lognormal ROS Land's H
   o Lognormal ROS Chebyshev MVUE
   o Lognormal ROS % Bootstrap
o Nonparametric:
   o Kaplan-Meier (t)
   o Kaplan-Meier (z)
   o Kaplan-Meier % Bootstrap (bootstrapping the KM means)
        LCL = $(\alpha/2)$ percentile of $\bar{x}$
        UCL = $(1 - \alpha/2)$ percentile of $\bar{x}$
   o Kaplan-Meier BCA Bootstrap
   o Kaplan-Meier Chebyshev
        $\bar{x} \pm \frac{1}{\sqrt{\alpha}}\, \frac{s}{\sqrt{n}}$
   o Winsor (t)
        $\bar{x}_w \pm t_{\alpha/2,\,\nu-1}\, \frac{s_w}{\sqrt{n}}$
        where $\nu = n - 2k$, $s_w = \frac{s(n-1)}{\nu - 1}$, and
        $\bar{x}_w$ = Winsorized mean
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide.
1. Click Stats/GOF > Intervals > Classical > Confidence Intervals > With
NDs (Typical) or With NDs (Bounded).
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
° If the results have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click on "Options" for interval options.
[Options dialog: Confidence Level; Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Specify the preferred "Confidence Level." The default is
"0.95."
o Specify the preferred number of bootstrap operations. The
default is "2000."
o Click "OK" to continue or "Cancel" to cancel the options.
° Click "OK" to continue or "Cancel" to cancel the computations.
Output for Classical Confidence Intervals with Non-detects (Typical).

Confidence Intervals for Datasets with Non-Detects
  Date/Time of Computation                      11/21/2008 1:25:37 PM
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Number of Bootstrap Operations                2000
  Confidence Coefficient                        0.95

General Statistics
  Number of Valid Data                          53
  Number of Detected Data                       49
  Number of Distinct Detected Data              49
  Minimum Detected                              3.202
  Maximum Detected                              121.1
  Number of Non-Detect Data                     4
  Percent Non-Detects                           7.55%
  Minimum Non-detect                            1.5
  Maximum Non-detect                            4
  Mean of Detected Data                         55.05
  SD of Detected Data                           43.2

Maximum Likelihood Statistics
  Maximum Likelihood Estimated Mean             48.86
  Maximum Likelihood Estimated Stdv             46.77

Normal Confidence Intervals
  Normal                      Lower Limit    Upper Limit
  MLE (t)                        35.97          61.75

Normal ROS Statistics
  Mean of Normal ROS Data                       48.06
  Stdv of Normal ROS Data                       48.36
  ROS Student's t                34.73          61.39
-------
Output for Classical Confidence Intervals with Non-detects (Typical) (continued).

Gamma ROS Statistics
  k Star of Gamma ROS Data                      0.302
  Theta Star of Gamma ROS Data                  168.3
  Nu Star of Gamma ROS Data                     32.05

Gamma Intervals
  Gamma                       Lower Limit    Upper Limit
  ROS Approximate Gamma          32.93          89
  ROS Adjusted Gamma             43.94          62.09

Log-Transformed Statistics
  Mean of Log-Transformed Detected Data         3.523
  Stdv of Log-Transformed Detected Data         1.128
  Mean of Lognormal ROS Data                    51.13
  Stdv of Lognormal ROS Data                    43.75

Lognormal Confidence Intervals
  Lognormal                   Lower Limit    Upper Limit
  ROS Land's H                   41.91          109.5
  ROS % Bootstrap                40.11          62.98
  ROS BCA Bootstrap              39.71          63.51

Kaplan Meier Distribution Free Statistics
  Kaplan Meier Mean                             51.14
  Kaplan Meier Stdv                             43.33
  Kaplan Meier SEM                              6.013

Nonparametric Confidence Intervals
  Nonparametric               Lower Limit    Upper Limit
  Kaplan Meier (t)               39.07          63.21
  Kaplan Meier (z)               39.35          62.92
  Kaplan Meier % Bootstrap       40.1           62.95
  Kaplan Meier BCA Bootstrap     40.91          63.54
  Kaplan Meier Chebyshev         24.25          78.03

Winsorization Statistics
  Winsor Mean                                   50.72
  Winsor Stdv                                   42.87
  Winsor (t)                     38.83          62.6
-------
Output for Classical Confidence Intervals with Non-detects (Bounded).

Bounded Confidence Intervals for Datasets with Non-Detects
  Date/Time of Computation                      11/15/2008 12:45:11 PM
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Number of Bounding Operations                 1000
  Bounding Coefficient                          0.9
  Number of Bootstrap Operations                2000
  Confidence Coefficient                        0.9

X
General Statistics            Lower Bound (LB)    Upper Bound (UB)
  Mean                            50.95               51.06
  Standard Deviation              43.84               43.97

Normal Confidence Limits      LB LCL    UB LCL    LB UCL    UB UCL
  Student (t)                  40.83     40.97     61.06     61.14

Gamma Statistics              Lower Bound (LB)    Upper Bound (UB)
  k Star (Bias Corrected)         0.761               0.883
  Theta Star                      57.8                66.87
  nu Star                         80.62               93.58

Gamma Confidence Limits       LB LCL    UB LCL    LB UCL    UB UCL
  Approximate Gamma            40.04     40.77     66.11     67.4
  Adjusted Gamma               39.72     40.51     66.61     68.11

Lognormal Statistics                            Lower Bound (LB)    Upper Bound (UB)
  Mean of Log-Transformed Data                      3.179               3.297
  Standard Deviation of Log-Transformed Data        1.355               1.674

Lognormal Confidence Limits   LB LCL    UB LCL    LB UCL    UB UCL
  Land's H                     46.22     59.14     106.6     197.8
  Chebyshev (MVUE)             -1.432    16.07     114.1     189.1

Nonparametric Confidence Limits   LB LCL    UB LCL    LB UCL    UB UCL
  Central Limit Theorem            41.01     41.15     60.88     60.96
  Central Limit Theorem            40.83     40.97     61.06     61.14
  Standard Bootstrap               40.9      41.43     60.58     61.09
  Bootstrap-t                      40.64     41.65     60.88     61.88
  Percentile Bootstrap             40.76     41.69     60.42     61.36
  Chebyshev                        31.84     32.01     70.04     70.1
  Modified (t)                     40.87     41.02     61.1      61.18
  Adjusted CLT                     40.78     40.9      61.12     61.2
-------
6.4.3 Classical Tolerance Intervals
6.4.3.1 Without Non-detects
The tolerance intervals for data with no non-detects available in Scout are:
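o Normal:
        $LTL = \bar{x} - K_{(n,\alpha,p)}\, s$
        $UTL = \bar{x} + K_{(n,\alpha,p)}\, s$
o Lognormal:
        $LTL = \exp\left( \bar{y} - K_{(n,\alpha,p)}\, s_y \right)$
        $UTL = \exp\left( \bar{y} + K_{(n,\alpha,p)}\, s_y \right)$
o Nonparametric:
   o Percentile Bootstrap
   o BCA Bootstrap
        $\hat{a} = \frac{\sum_i \left( \bar{x} - \bar{x}_{-i} \right)^3}{6 \left[ \sum_i \left( \bar{x} - \bar{x}_{-i} \right)^2 \right]^{3/2}}$
        $z_0 = \Phi^{-1}\left( \frac{\#(\bar{x}_i < \bar{x})}{N} \right)$
        $\alpha_{2(LOWER)} = \Phi\left( z_0 + \frac{z_0 + z_{\alpha/2}}{1 - \hat{a}\,(z_0 + z_{\alpha/2})} \right)$
        $\alpha_{2(UPPER)} = \Phi\left( z_0 + \frac{z_0 + z_{1-\alpha/2}}{1 - \hat{a}\,(z_0 + z_{1-\alpha/2})} \right)$
        $LTL = \bar{x}^{(\alpha_{2(LOWER)})}$
        $UTL = \bar{x}^{(\alpha_{2(UPPER)})}$
   o Percentile Tolerance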
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide.
-------
1. Click Stats/GOF > Intervals > Classical > Tolerance Intervals > No NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
° Select one or more variables from the "Select Variables" screen.
o If the results have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
° Click on "Options" for interval options.
[Options dialog: Confidence Level; Coverage (0.9); Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Specify the preferred "Confidence Level." The default is
"0.95."
o Specify the preferred coverage percentage. The default is
"0.9."
o Specify the preferred number of bootstrap operations. The
default is "2000."
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
-------
Output for Classical Tolerance Intervals without Non-detects.

Tolerance Intervals/Limits (TLs) for Datasets Without Non-Detects
  Date/Time of Computation                      2/25/2008 7:51:11 AM
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Number of Bootstrap Operations                2000
  Coverage                                      0.9
  Confidence Coefficient                        0.95

X
  Number of Valid Observations                  53
  Number of Distinct Observations               51

Raw Statistics
  Mean                                          51.1
  Minimum                                       1.5
  5% Percentile                                 2.606
  10% Percentile                                4.071
  1st Quartile                                  9.608
  Median                                        24.56
  3rd Quartile                                  95.73
  90% Percentile                                107.6
  95% Percentile                                112.9
  Maximum                                       121.1
  Standard Deviation                            43.78
  MAD / 0.6745                                  30.48
  IQR / 1.35                                    64.57
  1% Percentile (z)                             -50.75
  5% Percentile (z)                             -20.91
  10% Percentile (z)                            -5.006
  1st Quartile (z)                              21.57
  ROS Median (z)                                51.1
  3rd Quartile (z)                              80.64
  90% Percentile (z)                            107.2
  95% Percentile (z)                            123.1
  99% Percentile (z)                            153

Normal Tolerance Limits
  Tolerance                   Lower Limit    Upper Limit
  Normal                         -35.74         137.9

Log-Transformed Statistics
  Mean of Log-Transformed Data                  3.325
  Standard Deviation of Log-Transformed Data    1.299

Log-Transformed Tolerance Limits
  Lognormal                      2.119          364.6

Nonparametric Tolerance Limits
  % Bootstrap                    98.51          116.4
  BCA Bootstrap                  97.97          114.8
  % TL                           2.053          116.4
-------
6.4.3.2 With Non-detects
The tolerance intervals for data with non-detects available in Scout are:
o Normal:
   o Using MLE of mean and standard deviation
   o Using Normal ROS methods
o Lognormal:
   o Lognormal ROS
   o Using bootstrap methods based on Lognormal ROS
o Nonparametric:
   o Nonparametric KM
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide and the
Scout Technical Guide.
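For the normal-theory cases, the two-sided tolerance limits reported on the output screens in this section are consistent with mean ± K2 × SD, where K2 is the tolerance factor printed on the screen (1.983 for n = 53, coverage 0.9, confidence 0.95). The sketch below takes K2 as given rather than computing it, and plugs in the rounded MLE, Normal ROS, and KM estimates from those screens; the function name is an assumption, and small last-digit differences from Scout's values are expected because of rounding:

```python
def normal_tolerance_limits(mean, sd, k):
    """Two-sided tolerance limits of the form mean +/- k * sd."""
    return mean - k * sd, mean + k * sd

K2 = 1.983  # tolerance factor as printed on the output screen

# (mean, SD) pairs as printed on the output screens.
estimates = {
    "MLE": (48.85, 46.77),
    "Normal ROS": (48.06, 48.36),
    "KM": (51.14, 43.33),
}
limits = {
    name: normal_tolerance_limits(m, s, K2) for name, (m, s) in estimates.items()
}
for name, (lower, upper) in limits.items():
    print(f"{name}: {lower:.2f} to {upper:.2f}")
```

Negative lower limits, as seen here, are a consequence of applying a symmetric normal formula to strongly right-skewed concentration data.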
-------
1. Click Stats/GOF > Intervals > Classical > Tolerance Intervals > With
NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o If the results have to be produced by using a Group variable, then select a
group variable by clicking the arrow below the "Group by Variable"
button. This will result in a drop-down list of available variables. The
user should select and click on an appropriate variable representing a
group variable.
o Click on "Options" for interval options.
[Options dialog: Confidence Level; Coverage (0.9); Number of Bootstrap Operations (2000); OK and Cancel buttons]
o Specify the preferred "Confidence Level." The default is
"0.95."
o Specify the preferred coverage percentage. The default is
"0.9."
o Specify the preferred number of bootstrap operations. The
default is "2000."
-------
o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Classical Tolerance Intervals with Non-detects.

Tolerance Intervals for Datasets with Non-Detects
  Date/Time of Computation                      2/25/2008 8:36:35 AM
User Selected Options
  From File          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
  Full Precision                                OFF
  Number of Bootstrap Operations                2000
  Coverage                                      0.9
  Confidence Coefficient                        0.95
K2 represents the two-sided cutoff for tolerance intervals based upon the procedure described in Hahn and Meeker (1991)

X
  Number of Valid Observations                  53
  Number of Distinct Observations               49
  Number of Non-Detect Data                     4
  Number of Detected Data                       49
  Minimum Detected                              3.202
  Maximum Detected                              121.1
  Percent Non-Detects                           7.55%
  Minimum Non-detect                            1.5
  Maximum Non-detect                            4

Raw Statistics
  Mean of Detected Data                         55.05
  SD of Detected Data                           43.2

Maximum Likelihood Estimates (MLEs)
  MLE Mean                                      48.85
  MLE Stdv                                      46.77
  1% Percentile (z)                             -59.95
  5% Percentile (z)                             -28.07
  10% Percentile (z)                            -11.08
  1st Quartile (z)                              17.31
  ROS Median (z)                                48.88
  3rd Quartile (z)                              80.41
  90% Percentile (z)                            108.8
  95% Percentile (z)                            125.8
  99% Percentile (z)                            157.7
-------
Output for Classical Tolerance Intervals with Non-detects (continued).

  K2                                            1.983

Normal Tolerance Intervals
                              Lower Limit    Upper Limit
  MLE                            -43.91         141.6

Normal ROS Statistics
  Minimum of ROS Data                           -49.39
  Maximum of ROS Data                           121.1
  Mean of ROS Data                              48.06
  SD of ROS Data                                48.36
  K2                                            1.983

Nonparametric Percentiles Using ROS Data
  1% ROS Percentile                             -49.39
  5% ROS Percentile                             -36.93
  10% ROS Percentile                            3.513
  1st ROS Quartile                              9.608
  ROS Median                                    24.26
  3rd ROS Quartile                              95.73
  90% ROS Percentile                            107.6
  95% ROS Percentile                            112.9
  99% ROS Percentile                            118.7

Parametric Percentiles Using Normal Distribution
  1% ROS Percentile (z)                         -64.44
  5% ROS Percentile (z)                         -31.49
  10% ROS Percentile (z)                        -13.92
  1st ROS Quartile (z)                          15.44
  ROS Median (z)                                48.06
  3rd ROS Quartile (z)                          80.68
  90% ROS Percentile (z)                        110
  95% ROS Percentile (z)                        127.6
  99% ROS Percentile (z)                        160.6

Normal ROS Tolerance Interval
                              Lower Limit    Upper Limit
  Normal                         -47.86         144
-------
Output for Classical Tolerance Intervals with Non-detects (continued).

Log-Transformed Statistics
  Mean of Log-Transformed Detected Data         3.523
  Stdv of Log-Transformed Detected Data         1.128
  Minimum of Lognormal ROS Data                 2.204
  Maximum of Lognormal ROS Data                 121.1
  Mean of Lognormal ROS Data                    51.13
  Stdv of Lognormal ROS Data                    43.75
  K2                                            1.983

Nonparametric Percentiles Using ROS Data
  1% ROS Percentile                             2.204
  5% ROS Percentile                             3.041
  10% ROS Percentile                            4.174
  1st ROS Quartile                              9.608
  ROS Median                                    24.26
  3rd ROS Quartile                              95.73
  90% ROS Percentile                            107.6
  95% ROS Percentile                            112.9
  99% ROS Percentile                            118.7

Parametric Percentiles Using Lognormal Distribution
  1% ROS Percentile (z)                         1.493
  5% ROS Percentile (z)                         3.532
  10% ROS Percentile (z)                        5.589
  1st ROS Quartile (z)                          12.04
  ROS Median (z)                                28.22
  3rd ROS Quartile (z)                          66.19
  90% ROS Percentile (z)                        142.5
  95% ROS Percentile (z)                        225.6
  99% ROS Percentile (z)                        533.6

Lognormal Tolerance Intervals
                              Lower Limit    Upper Limit
  ROS Lognormal                  2.302          346
  ROS % Bootstrap                98.51          116.4
  ROS BCA Bootstrap              97.97          116.4
-------
Output for Classical Tolerance Intervals with Non-detects (continued).

Kaplan Meier Distribution Free Statistics
  Mean                                          51.14
  1% Percentile (z)                             -49.66
  5% Percentile (z)                             -20.13
  10% Percentile (z)                            -4.339
  1st Quartile (z)                              21.91
  Median (z)                                    51.14
  3rd Quartile (z)                              80.36
  90% Percentile (z)                            106.7
  95% Percentile (z)                            122.4
  99% Percentile (z)                            151.9
  Standard Deviation                            43.33
  Kaplan Meier SEM                              6.013
  K2                                            1.983

Nonparametric Tolerance Intervals
                              Lower Limit    Upper Limit
  KM Nonparametric               -34.8          137.1

6.4.4 Classical Prediction Intervals
6.4.4.1 Without Non-detects
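The prediction intervals for data with no non-detects available in Scout are listed
below. The square root quantity, $\sqrt{(1/k) + (1/n)}$, is written in the equations
for k = 1 future observation, that is, as $\sqrt{1 + 1/n}$:
o Normal
        $\bar{x} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + \frac{1}{n}}$
o Lognormal
        $\exp\left( \bar{y} \pm t_{\alpha/2,\,n-1}\, s_y \sqrt{1 + \frac{1}{n}} \right)$
o Chebyshev
        $\bar{x} \pm \frac{1}{\sqrt{\alpha}}\, s \sqrt{1 + \frac{1}{n}}$
o Nonparametric
        $LPL = x_{(m)}$, where $m = (n + 1)\left( \frac{\alpha}{2} \right)$
        $UPL = x_{(m)}$, where $m = (n + 1)\left( 1 - \frac{\alpha}{2} \right)$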
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide and the
Scout Technical Guide.
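The nonparametric prediction limits are order statistics of the sorted data at ranks m = (n + 1)(α/2) and m = (n + 1)(1 − α/2). The sketch below illustrates that rule only. How Scout handles a non-integer rank is not stated here, so linear interpolation between neighbouring order statistics and clamping the rank to 1..n are assumptions, as are the function names and the made-up data:

```python
def order_statistic(sorted_x, m):
    """Value at 1-based rank m; fractional ranks are linearly interpolated (assumption)."""
    n = len(sorted_x)
    m = min(max(m, 1.0), float(n))  # clamp the rank to the observed range
    lower = int(m)                  # floor of the rank, valid because m >= 1
    frac = m - lower
    if lower >= n:
        return sorted_x[-1]
    return sorted_x[lower - 1] + frac * (sorted_x[lower] - sorted_x[lower - 1])

def nonparametric_prediction_limits(data, confidence=0.95):
    """LPL and UPL as order statistics at ranks (n+1)*alpha/2 and (n+1)*(1-alpha/2)."""
    alpha = 1 - confidence
    xs = sorted(data)
    n = len(xs)
    lpl = order_statistic(xs, (n + 1) * alpha / 2)
    upl = order_statistic(xs, (n + 1) * (1 - alpha / 2))
    return lpl, upl

data = list(range(1, 40))  # 39 made-up observations: 1, 2, ..., 39
lpl, upl = nonparametric_prediction_limits(data, 0.95)
print(round(lpl, 6), round(upl, 6))
```

With n = 39 and α = 0.05 the ranks are 1 and 39, so the limits are simply the smallest and largest observations; smaller samples cannot support a 95% nonparametric interval without clamping.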
1. Click Stats/GOF > Intervals > Classical > Prediction Intervals > No NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
   o Specify the preferred "Confidence Level." The default is "0.95."
   o Specify the number of future k values. The default is "5."
   o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Classical Prediction Intervals without Non-detects.
Prediction Intervals/Limits (PLs) for Datasets Without Non-Detects
User Selected Options
Date/Time of Computation           2/25/2008 9:03:29 AM
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
Full Precision                     OFF
Number of Future K Values          5
Confidence Coefficient             0.95

X
Number of Valid Observations       53
Number of Distinct Observations    51

Raw Statistics
Minimum                            1.5
Mean                               51.1
Median                             24.56
Maximum                            121.1
Standard Deviation                 43.78

Normal Prediction Intervals
Normal                             Lower Limit    Upper Limit
Student's t                        -37.58         139.8
For Next 5                         -67.06         169.3

Log-Transformed Statistics
Mean of Log-Transformed Data                   3.325
Standard Deviation of Log-Transformed Data     1.298

Lognormal                          Lower Limit    Upper Limit
Log                                2.007          385
For Next 5                         0.838          922.5

Chebyshev                          Lower Limit    Upper Limit
Chebyshev                          -146.5         248.7

Nonparametric                      Lower Limit    Upper Limit
Nonparametric                      0.394          119.5
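The Student's t and lognormal limits in the output above can be checked from the printed summary statistics. In the Python sketch below, the critical t values are supplied by the caller (read from a t table for 52 degrees of freedom) rather than computed, to keep the example free of third-party libraries. The use of the level 1 - alpha/(2k) for the "For Next 5" rows is an assumption that happens to reproduce the printed limits; it is not stated in this guide. The function name is hypothetical.

```python
import math

def t_prediction_interval(mean, sd, n, t_crit):
    """mean -/+ t_crit * sd * sqrt(1 + 1/n), with t_crit supplied by the caller."""
    half_width = t_crit * sd * math.sqrt(1.0 + 1.0 / n)
    return mean - half_width, mean + half_width

T_K1 = 2.0066  # t(0.975, 52 df): one future observation, 95% confidence
T_K5 = 2.6737  # t(0.995, 52 df): assumed level 1 - 0.05/(2*5) for k = 5

# Raw-scale limits (mean 51.1, SD 43.78, n = 53).
normal_k1 = t_prediction_interval(51.1, 43.78, 53, T_K1)
normal_k5 = t_prediction_interval(51.1, 43.78, 53, T_K5)

# Lognormal limits: build the interval on the log scale, then exponentiate.
log_lo, log_hi = t_prediction_interval(3.325, 1.298, 53, T_K1)
lognormal_k1 = (math.exp(log_lo), math.exp(log_hi))

print(normal_k1, normal_k5, lognormal_k1)
```

Small differences from the printed output are expected because the summary statistics shown with Full Precision OFF are rounded.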
6.4.4.2 With Non-detects
The prediction intervals for data with non-detects available in Scout are:
o MLE-t
o Lognormal ROS-t
o Nonparametric
o KM Chebyshev
o KM-t
o KM-z
Details of those intervals can be found in the ProUCL 4.00.04 Technical Guide and the
Scout Technical Guide.
1. Click Stats/GOF > Intervals > Classical > Prediction Intervals > With NDs.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
   o Specify the preferred "Confidence Level." The default is "0.95."
   o Specify the number of future k values. The default is "5."
   o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Classical Prediction Intervals with Non-detects.
Prediction Intervals for Datasets with Non-Detects
User Selected Options
Date/Time of Computation           2/25/2008 9:06:12 AM
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
Full Precision                     OFF
Number of Future K Values
Confidence Coefficient             0.95

X
General Statistics
Number of Valid Observations       53
Number of Distinct Observations    49
Number of Non-Detect Data          4
Number of Detected Data            49
Minimum Detected                   3.202
Maximum Detected                   121.1
Percent Non-Detects                7.55%
Minimum Non-detect                 1.5
Maximum Non-detect                 4

Raw Statistics
Mean of Detected Data              55.05
SD of Detected Data                43.2

Maximum Likelihood Estimates (MLEs)
MLE Mean                           48.86
1% Percentile (z)                  -59.95
5% Percentile (z)                  -28.07
10% Percentile (z)                 -11.08
1st Quartile (z)                   17.31
ROS Median (z)                     48.86
3rd Quartile (z)                   80.41
90% Percentile (z)                 108.8
95% Percentile (z)                 125.8
99% Percentile (z)                 157.7
MLE Stdv                           46.77
Output for Classical Prediction Intervals with Non-detects (continued).
Normal Prediction Intervals
                                   Lower Limit    Upper Limit
MLE (t)                            -45.88         143.6
Prediction Interval for Next 5     -77.37         175.1

Normal ROS Statistics
Minimum of ROS Data                -49.39
Mean of ROS Data                   48.06
Maximum of ROS Data                121.1
SD of ROS Data                     48.36

Nonparametric Percentiles Using ROS Data
1% ROS Percentile                  -49.39
5% ROS Percentile                  -36.93
10% ROS Percentile                 3.513
1st ROS Quartile                   9.608
ROS Median                         24.26
3rd ROS Quartile                   95.73
90% ROS Percentile                 107.6
95% ROS Percentile                 112.9
99% ROS Percentile                 118.7

Parametric Percentiles Using Normal Distribution
1% ROS Percentile (z)              -64.44
5% ROS Percentile (z)              -31.49
10% ROS Percentile (z)             -13.92
1st ROS Quartile (z)               15.44
ROS ROS Median (z)                 48.06
3rd ROS Quartile (z)               80.68
90% ROS Percentile (z)             110
95% ROS Percentile (z)             127.6
99% ROS Percentile (z)             160.6

Normal ROS Prediction Intervals
                                   Lower Limit    Upper Limit
Normal                             -49.89         146
Prediction Interval for Next 5     -82.46         178.6
Output for Classical Prediction Intervals with Non-detects (continued).
Kaplan Meier Distribution Free Statistics
Mean                               51.14
1% Percentile (z)                  -49.66
5% Percentile (z)                  -20.13
10% Percentile (z)                 -4.389
1st Quartile (z)                   21.91
Median (z)                         51.14
3rd Quartile (z)                   80.36
90% Percentile (z)                 106.7
95% Percentile (z)                 122.4
99% Percentile (z)                 151.9
Standard Deviation                 43.33
Kaplan Meier SEM                   6.013

Nonparametric Prediction Intervals
                                   Lower Limit    Upper Limit
KM Chebyshev                       -144.5         246.7
KM (t)                             -36.62         138.9
KM (z)                             -34.58         136.9
Prediction Interval for Next 5     -65.8          168.1
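The KM (z) limits above appear to follow the same form as the normal prediction interval, with a standard normal quantile in place of the t quantile. A minimal Python sketch of that reading is shown below; the function name is hypothetical, and the inputs are the KM mean and standard deviation printed in the output.

```python
import math
from statistics import NormalDist

def z_prediction_interval(mean, sd, n, conf=0.95):
    """mean -/+ z * sd * sqrt(1 + 1/n) for one future observation."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * sd * math.sqrt(1.0 + 1.0 / n)
    return mean - half_width, mean + half_width

# KM mean and standard deviation from the output above, n = 53.
km_z_lo, km_z_hi = z_prediction_interval(51.14, 43.33, 53)
print(round(km_z_lo, 2), round(km_z_hi, 1))
```

Note that these limits are built from the KM estimates of the mean and standard deviation, so they account for the non-detects only through those two estimates.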
6.5 Robust Intervals
Various robust and resistant univariate intervals (confidence intervals, prediction
intervals, tolerance intervals, and simultaneous intervals) can be computed using Scout.
For details of those robust intervals, refer to Kafadar (1982) and Singh and Nocerino
(1997); the latter also discusses the performance of those intervals.
Typically, those robust procedures are iterative and require initial estimates of
location and scale. In Scout, those robust intervals can be computed using either the
mean and the standard deviation, or the median and MAD/0.6745, as the initial estimates
of location and scale. The methods available in Scout for the computation of the
robust intervals are:
o PROP (using PROP influence function)
o Huber (using Huber influence function)
o Tukey's Biweight as described in Tukey (1977)
o Lax/Kafadar Biweight as described in Kafadar (1982) and Horn et al. (1988)
o MVT (using trimming percentage)
The performance of these intervals can also be compared using the graphics option in the
variable selection screen. If the graphics option is selected, then a plot of intervals will
be generated for all of the interval methods selected in the options window.
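The general idea of an iterative robust estimate that starts from the median and MAD/0.6745 can be sketched in Python as follows. This is a generic Tukey biweight location iteration with the scale held fixed at its initial value; it is an illustration under those assumptions and is not Scout's implementation of the PROP, Huber, biweight, or MVT methods. All function names are hypothetical.

```python
from statistics import median

def initial_estimates(data):
    """Initial location = median; initial scale = MAD / 0.6745."""
    loc = median(data)
    mad = median(abs(x - loc) for x in data)
    return loc, mad / 0.6745

def biweight_location(data, c=4.0, max_iter=10, tol=1e-8):
    """Iteratively reweighted mean with Tukey biweight weights.

    c is the tuning constant and max_iter the maximum number of iterations.
    The scale is kept at its initial value for simplicity.
    """
    loc, scale = initial_estimates(data)
    if scale == 0:
        return loc, scale
    for _ in range(max_iter):
        weights = []
        for x in data:
            u = (x - loc) / (c * scale)
            # Observations farther than c*scale from loc get zero weight.
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        total = sum(weights)
        if total == 0:
            break
        new_loc = sum(w * x for w, x in zip(weights, data)) / total
        converged = abs(new_loc - loc) < tol
        loc = new_loc
        if converged:
            break
    return loc, scale

sample = [10, 11, 9, 10, 12, 10, 11, 100]  # one gross outlier
init_loc, init_scale = initial_estimates(sample)
robust_loc, _ = biweight_location(sample)
print(init_loc, round(init_scale, 4), round(robust_loc, 2))
```

In this made-up sample the outlier at 100 receives zero weight, so the robust location stays near the bulk of the data while the ordinary mean is pulled above 20.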
6.5.1 Robust Confidence Intervals
1. Click Stats/GOF > Intervals > Robust > Confidence Intervals.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
[Screenshot: "Robust Confidence Intervals Options" dialog. Defaults shown:
PROP: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Huber: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Tukey Biweight: maximum of 10 iterations; initial estimate Median/MAD; location tuning constant 4; scale tuning constant 4.
Lax/Kafadar Biweight: maximum of 10 iterations; initial estimate Median/MAD; tuning constant 4.
MVT: 10 iterations; initial estimate Median/MAD; trimming 0.1 (10%).
Confidence Level: 0.95.]
   o Choose your methods and options. All of the options displayed
   in the above graphical user interface (GUI) are the default
   options.
   o Click "OK" to continue or "Cancel" to cancel the selected options.
   o Click "Graphics" for the graphics option.
[Screenshot: "Confidence Intervals Plot" dialog with a "Generate Robust Intervals Plot" check box and an "Intervals Plot Title" field (default "Robust Confidence Intervals").]
      o Click "OK" to continue or "Cancel" to cancel graphics
      options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Robust Confidence Intervals.
Robust Confidence Intervals
Date/Time of Computation           11/15/2008 11:48:55 AM
User Selected Options
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\STACKLOSS
Full Precision                     OFF
Confidence Coefficient             0.95
PROP Method                        Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   PROP CLs derived using 10 Iterations and initial estimates of median/MAD.
Huber Method                       Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   Huber CLs derived using 10 Iterations and initial estimates of median/MAD.
Tukey Biweight Method              Location Tuning Constant of 4 and a Scale Tuning Constant of 4.
                                   Tukey CLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
Lax/Kafadar Biweight Method        Tuning Constant of 4.
                                   Lax/Kafadar CLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
MVT Method                         Trimming Percentage of 10%.
                                   MVT CLs derived using 10 Iterations and initial estimates of median/MAD.

Stack-Loss
            Number                    Standard    MAD/
            Obs.     Mean    Median   Deviation   0.6745   SE Mean   Critical t   LCL     UCL
Classical   21       17.52   15       10.17       5.93     2.22      2.086        12.89   22.15

                       Initial   Initial   Final   Final
Method                 Mean      Stdv      Mean    Stdv    Wsum    SEM     Critical t   LCL     UCL
PROP                   15        5.93      13.3    4.206   17.13   1.016   2.119        11.14   15.45
Huber                  15        5.93      16.76   8.79    20.3    1.951   2.091        12.68   20.84
Tukey Biweight         15        5.93      13.21   5.839   16.63   1.432   2.124        10.17   16.25
Lax/Kafadar Biweight   15        5.93      14.57   7.571   17.42   1.814   2.116        10.74   18.41
MVT                    15        5.93      15.21   7.413   19      1.701   2.101        11.64   18.78
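The confidence limits in the output above are consistent with the form mean -/+ t * SEM, where SEM = Stdv / sqrt(Wsum) and the sum of weights Wsum plays the role of the sample size for the robust methods. The Python sketch below checks this for the Classical and PROP rows. It is an illustration of that reading only: the function name is hypothetical, and the critical t values are taken as printed rather than computed.

```python
import math

def t_confidence_interval(mean, sd, weight_sum, t_crit):
    """Return (SEM, LCL, UCL) with SEM = sd / sqrt(weight_sum)."""
    sem = sd / math.sqrt(weight_sum)
    return sem, mean - t_crit * sem, mean + t_crit * sem

# Classical row: n = 21 observations, each with weight 1.
classical = t_confidence_interval(17.52, 10.17, 21, 2.086)

# PROP row: final mean and Stdv with Wsum = 17.13.
prop = t_confidence_interval(13.3, 4.206, 17.13, 2.119)

print([round(v, 3) for v in classical])
print([round(v, 3) for v in prop])
```

A Wsum below the number of observations indicates that some observations were down-weighted, which is why the PROP interval is centered lower and is narrower than the classical interval for these data.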
Output for Robust Confidence Intervals (continued).
[Figure: "Robust Confidence Intervals" plot of the intervals for Stack-Loss, one interval per selected method.]
6.5.2 Robust Simultaneous Intervals
1. Click Stats/GOF > Intervals > Robust > Simultaneous Intervals.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
[Screenshot: "Robust Simultaneous Intervals Options" dialog. Defaults shown:
PROP: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Huber: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Tukey Biweight: maximum of 10 iterations; initial estimate Median/MAD; location tuning constant 4; scale tuning constant 4.
Lax/Kafadar Biweight: maximum of 10 iterations; initial estimate Median/MAD; tuning constant 4.
MVT: 10 iterations; initial estimate Median/MAD; trimming 0.1 (10%).
Confidence Level: 0.95.]
   o Specify the preferred options. All of the options displayed are
   defaults.
   o Click "OK" to continue or "Cancel" to cancel the options.
   o Click "Graphics" for the graphics option.
[Screenshot: plot options dialog with a "Generate Robust Intervals Plot" check box and an "Intervals Plot Title" field (default "Robust Simultaneous Intervals").]
      o Click "OK" to continue or "Cancel" to cancel graphics
      options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Simultaneous Intervals.
Robust Simultaneous Intervals/Limits (SLs)
Date/Time of Computation           2/25/2008 9:22:03 AM
User Selected Options
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
Full Precision                     OFF
Confidence Coefficient             0.95
PROP Method                        Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   PROP SLs derived using 10 Iterations and initial estimates of median/MAD.
Huber Method                       Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   Huber SLs derived using 10 Iterations and initial estimates of median/MAD.
Tukey Biweight Method              Location Tuning Constant of 4 and a Scale Tuning Constant of 4.
                                   Tukey SLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
Lax/Kafadar Biweight Method        Tuning Constant of 4.
                                   Lax/Kafadar SLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
MVT Method                         Trimming Percentage of 10%.
                                   MVT SLs derived using 10 Iterations and initial estimates of median/MAD.
D2Max represents unsquared critical value of Max-MD (Mahalanobis Distances) computed based upon Wsum Values.

X
            Number                    Standard    MAD/
            Obs.     Mean    Median   Deviation   0.6745   D2Max   LSL      USL
Classical   53       51.1    24.56    43.78       30.48    3.151   -86.88   189.1

                       Initial    Initial   Final   Final
Method                 Location   Scale     Mean    Stdv    Wsum    D2Max   LSL      USL
PROP                   24.56      30.48     51.1    43.78   53      3.151   -86.88   189.1
Huber                  24.56      30.48     51.1    43.78   53      3.151   -86.88   189.1
Tukey Biweight         24.56      30.48     14.95   15.9    41      3.047   -33.48   63.38
Lax/Kafadar Biweight   24.56      30.48     14.02   13.09   49.83   3.127   -26.9    54.93
MVT                    24.56      30.48     44.44   40.48   48      3.112   -81.52   170.4
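The simultaneous limits in the output above are consistent with the form mean -/+ D2Max * Stdv, using the final mean and standard deviation of each method. A short Python check of that reading is given below. The function name is hypothetical, and D2Max is taken as printed; how Scout derives D2Max from Wsum is not reproduced here.

```python
def simultaneous_limits(mean, sd, d_max):
    """Limits mean -/+ d_max * sd, with the critical value d_max supplied."""
    half_width = d_max * sd
    return mean - half_width, mean + half_width

classical_sl = simultaneous_limits(51.1, 43.78, 3.151)   # Classical row
tukey_sl = simultaneous_limits(14.95, 15.9, 3.047)       # Tukey Biweight row

print([round(v, 2) for v in classical_sl])
print([round(v, 2) for v in tukey_sl])
```

The agreement is only to about three significant figures because the printed statistics are rounded.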
Output for Simultaneous Intervals (continued).
[Figure: "Robust Simultaneous Intervals" plot of the intervals for X, one interval per selected method.]
6.5.3 Robust Prediction Intervals
1. Click Stats/GOF > Intervals > Robust > Prediction Intervals.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
   o Specify the preferred options. All of the options displayed are
   defaults.
   o Click "OK" to continue or "Cancel" to cancel the options.
   o Click "Graphics" for the graphics option.
[Screenshot: "Prediction Intervals Plot" dialog with a "Generate Robust Intervals Plot" check box and an "Intervals Plot Title" field (default "Robust Prediction Intervals").]
      o Click "OK" to continue or "Cancel" to cancel graphics
      options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Robust Prediction Intervals.
Robust Prediction Interval
Date/Time of Computation           11/15/2008 12:13:44 PM
User Selected Options
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\STACKLOSS
Full Precision                     OFF
Confidence Coefficient
PROP Method                        Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   PROP PLs derived using 10 Iterations and initial estimates of median/MAD.
Huber Method                       Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   Huber PLs derived using 10 Iterations and initial estimates of median/MAD.
Tukey Biweight Method              Location Tuning Constant of 4 and a Scale Tuning Constant of 4.
                                   Tukey PLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
Lax/Kafadar Biweight Method        Tuning Constant of 4.
                                   Lax/Kafadar PLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
                                   MVT PLs derived using 10 Iterations and initial estimates of median/MAD.

Air-Flow
            Number                    Standard    MAD/
            Obs.     Mean    Median   Deviation   0.6745   SE Mean   Critical t   LPL     UPL
Classical   21       60.43   58       9.168       5.93     2.001     2.086        40.85   80

                       Initial   Initial   Final   Final
Method                 Mean      Stdv      Mean    Stdv    Wsum    SEM     Critical t   LPL     UPL
PROP                   58        5.93      57.18   5.02    17.54   1.199   2.114        46.26   68.09
Huber                  58        5.93      60.07   8.546   20.62   1.882   2.089        41.78   78.34
Tukey Biweight         58        5.93      57.48   7.498   17.66   1.784   2.113        41.18   73.76
Lax/Kafadar Biweight   58        5.93      59.41   4.164   14.84   1.081   2.147        50.18   68.65
MVT                    58        5.93      58.37   6.809   18      1.605   2.101        43.63   73.04
Output for Robust Prediction Intervals (continued).
[Figure: "Robust Prediction Intervals" plot; axis label "Intervals for Stack-Loss", one interval per selected method.]
6.5.4 Robust Tolerance Intervals
1. Click Stats/GOF > Intervals > Robust > Tolerance Intervals.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options.
[Screenshot: "Robust Tolerance Intervals Options" dialog. Defaults shown:
PROP: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Huber: 10 iterations; initial estimate Median/MAD; influence alpha 0.05; MDs distribution Beta.
Tukey Biweight: maximum of 10 iterations; initial estimate Median/MAD; location tuning constant 4; scale tuning constant 4.
Lax/Kafadar Biweight: maximum of 10 iterations; initial estimate Median/MAD; tuning constant 4.
MVT: 10 iterations; initial estimate Median/MAD; trimming 0.1 (10%).
Confidence Level: 0.95. Coverage: 0.9.]
   o Specify the preferred options. All of the options displayed are
   defaults.
   o Click "OK" to continue or "Cancel" to cancel the options.
   o Click "Graphics" for the graphics option.
[Screenshot: plot options dialog with a "Generate Robust Intervals Plot" check box and an "Intervals Plot Title" field.]
      o Click "OK" to continue or "Cancel" to cancel graphics
      options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Robust Tolerance Intervals.
Robust Tolerance Intervals/Limits (TLs)
Date/Time of Computation           2/25/2008 9:23:20 AM
User Selected Options
From File                          D:\Narain\Scout_For_Windows\ScoutSource\WorkDatInExcel\Data\censor-by-grps1
Full Precision                     OFF
Confidence Coefficient             0.95
Coverage                           0.9
PROP Method                        Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   PROP TLs derived using 10 Iterations and initial estimates of median/MAD.
Huber Method                       Influence Function Alpha of 0.05 with MDs following Beta Distribution.
                                   Huber TLs derived using 10 Iterations and initial estimates of median/MAD.
Tukey Biweight Method              Location Tuning Constant of 4 and a Scale Tuning Constant of 4.
                                   Tukey TLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
Lax/Kafadar Biweight Method        Tuning Constant of 4.
                                   Lax/Kafadar TLs derived using a Maximum of 10 Iterations and initial estimates of median/MAD.
MVT Method                         Trimming Percentage of 10%.
                                   MVT TLs derived using 10 Iterations and initial estimates of median/MAD.
K2 represents the two-sided cutoff for tolerance intervals and is computed based upon Wsum Values
following the procedure described in Hahn and Meeker (1991).

X
            Number                    Standard    MAD/
            Obs.     Mean    Median   Deviation   0.6745   k2      LTL      UTL
Classical   53       51.1    24.56    43.78       30.48    1.983   -35.74   137.9

                       Initial    Initial   Final   Final
Method                 Location   Scale     Mean    Stdv    Wsum    k2      LTL      UTL
PROP                   24.56      30.48     51.1    43.78   53      1.983   -35.74   137.9
Huber                  24.56      30.48     51.1    43.78   53      1.983   -35.74   137.9
Tukey Biweight         24.56      30.48     14.95   15.9    41      2.045   -17.56   47.46
Lax/Kafadar Biweight   24.56      30.48     14.02   13.09   49.83   1.997   -12.12   40.15
MVT                    24.56      30.48     44.44   40.48   48      1.983   -35.85   124.7
Output for Robust Tolerance Intervals (continued).
[Figure: "Robust Tolerance Intervals" plot of the intervals for X, one interval per method (legible axis labels include Classical, PROP, Huber, and Tukey); the vertical axis runs from about -50 to 140.]
6.5.5 Intervals Comparison
1. Click Stats/GOF > Intervals > Robust > Intervals Comparison.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o To produce results by group, click the arrow below the "Group by Variable"
button and select the appropriate grouping variable from the drop-down list
of available variables.
o Click on "Options" for interval options. The options screens shown below
are the default options screen and the options screen for the PROP
method.
[Screenshot: "OptionsIntervalsRobustGA" dialog, default view. Select Method: Classical (selected), PROP, Huber, Tukey Biweight, Lax Kafadar Biweight, MVT. Select Intervals: Prediction Intervals, Tolerance Intervals, Simultaneous Intervals (all checked). Confidence Level: 0.95. Coverage: 0.9. Title for Method Analysis.]
[Screenshot: the same dialog with PROP selected, which adds Initial Estimate (Mean/Stdv or Median/MAD), MDs Distribution (Beta or Chisquared), maximum # Iterations (10), and Influence Alpha (0.05).]
   o Specify the preferred options.
   o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Intervals Comparison (Default Options - Classical on data set BRADU.xls).
[Figure: "Classical Intervals" plot against Index of Observations. Legend values:
95% Prediction Limits: Lower = -5.727178, Upper = 8.2845111
95% Simultaneous Limits: Lower = -10.18796, Upper = 12.745290
95% Tolerance Limits with 90% Coverage: Lower = -5.418152, Upper = 7.9754854
Classical Mean = 1.2786667]
Output for Intervals Comparison (Default Options - PROP on data set BRADU.xls).
[Figure: "Robust PROP Intervals using Median/MAD" plot against Index of Observations. Legend values:
95% Prediction Limits: Lower = -1.185427, Upper = 1.0531190
95% Simultaneous Limits: Lower = -1.862105, Upper = 1.7297975
95% Tolerance Limits with 90% Coverage: Lower = -1.146212, Upper = 1.0139043
PROP Mean = -0.066154]
6.5.6 Group Analysis
This option compares the intervals computed for each of the groups of a selected
variable in the data.
1. Click Stats/GOF > Intervals > Robust > Group Analysis.
2. The "Select Variables" screen (Section 3.2) will appear.
o Select one or more variables from the "Select Variables" screen.
o Select the Group variable: click the arrow below the "Group by
Variable" button and select the appropriate grouping variable from the
drop-down list of available variables.
o Click on "Options" for interval options. The options screen shown below
is the options screen for the PROP method.
[Screenshot: "Options Prediction Intervals Comparison by Group" dialog with PROP selected. Select Method: Classical, PROP, Huber, Tukey Biweight, Lax Kafadar Biweight, MVT. Confidence Level: 0.95. Future K. Initial Estimate: Median/1.48MAD (Mean/Stdv also available). MDs Distribution: Beta (Chisquared also available). Maximum # Iterations: 10. Influence Alpha: 0.025. Use Default Title (checked).]
   o Specify the preferred input parameters for the PROP method.
   o Click "OK" to continue or "Cancel" to cancel the options.
o Click "OK" to continue or "Cancel" to cancel the computations.
Output for Group Analysis (PROP Options - FULLIRIS.xls).
[Figure: "Prediction Intervals by Group" plot, 95% Prediction Intervals for sp-length (Future K = 1). Legend values:
Group 1: n = 50, sd = 0.35, Wsum = 49.89
Group 2: n = 50, sd = 0.52, Wsum = 50.00
PROP Method: Initial Estimate Median/1.48MAD; Beta MDs Distribution; Influence Alpha = 0.025; Number of Iterations = 10]
References
Dixon, W.J., and Tukey, J.W. (1968). "Approximate Behavior of the Distribution of
Winsorized t (Trimming/Winsorization 2)," Technometrics, 10, 83-98.
Fisher, A., and Horn, P. (1994). "Robust Prediction Intervals in a Regression Setting,"
Computational Statistics & Data Analysis, 17, 129-140.
Giummolè, F., and Ventura, L. (2006). "Robust Prediction Limits Based on M-
estimators," Statistics and Probability Letters, 76, 1725-1740.
Gross, A.M. (1976). "Confidence Interval Robustness with Long-Tailed Symmetric
Distributions," Journal of the American Statistical Association, 71, 409-417.
Horn, P.S., Britton, P.W., and Lewis, D.F. (1988). "On the Prediction of a Single Future
Observation from a Possibly Noisy Sample," The Statistician, 37, 165-172.
Huber, P.J. (1981). Robust Statistics, John Wiley and Sons, NY.
Kafadar, K. (1982). "A Biweight Approach to the One-Sample Problem," Journal of the
American Statistical Association, 77, 416-424.
Mardia, K.V. (1970). "Measures of Multivariate Skewness and Kurtosis with
Applications," Biometrika, 57, 519-530.
ProUCL 4.00.04. (2009). "ProUCL Version 4.00.04 User Guide." The software
ProUCL 4.00.04 can be downloaded from the web site at:
http://www.epa.gov/esd/tsc/software.htm.
ProUCL 4.00.04. (2009). "ProUCL Version 4.00.04 Technical Guide." The software
ProUCL 4.00.04 can be downloaded from the web site at:
http://www.epa.gov/esd/tsc/software.htm.
Royston, J.P. (1982). "The W Test for Normality," Applied Statistics, 31, 2, 176-180.
Scout. (2002). A Data Analysis Program, Technology Support Project, USEPA, NERL-
LV, Las Vegas, Nevada.
Scout. (2008). Technical Guide under preparation.
Singh, A. (1993). "Omnibus Robust Procedures for Assessment of Multivariate Normality
and Detection of Multivariate Outliers," in Multivariate Environmental Statistics, Patil,
G.P. and Rao, C.R., Editors, pp. 445-488, Elsevier Science Publishers.
Singh, A., and Nocerino, J.M. (1997). "Robust Intervals in Some Chemometric
Applications," Chemometrics and Intelligent Laboratory Systems, 37, 55-69.
Singh, A., and Nocerino, J.M. (2002). "Robust Estimation of the Mean and Variance
Using Environmental Data Sets with Below Detection Limit Observations,"
Chemometrics and Intelligent Laboratory Systems, 60, 69-86.
Tukey, J.W. (1977). Exploratory Data Analysis, Addison-Wesley Publishing Company,
Reading, MA.
USEPA. (2006). Data Quality Assessment: Statistical Methods for Practitioners, EPA
QA/G-9S. EPA/240/B-06/003. Office of Environmental Information, Washington,
D.C. Download from: http://www.epa.gov/quality/qs-docs/g9s-final.pdf.