EPA-430/1-74-004
Basic Environmental Statistics
Notebook
TRAINING MANUAL
U.S. ENVIRONMENTAL PROTECTION AGENCY
WATER PROGRAM OPERATIONS
BASIC ENVIRONMENTAL STATISTICS NOTEBOOK
This course is designed to introduce the concepts
and applications of statistics to environmentally oriented
studies. It is for professional personnel responsible for
the collection, analysis and interpretation of environ-
mental data. The emphasis is on parametric tests of significance for sampling from normally distributed data. It is necessarily methods oriented, where heuristic persuasion is used to give insights into the concepts, developments, and foundations of statistical theory.
ENVIRONMENTAL PROTECTION AGENCY
Water Program Operations
TRAINING PROGRAM
May 1974
INTRODUCTION TO TESTS OF SIGNIFICANCE
In this paper we characterize the program of action necessary in a two-tailed test of significance by defining and relating the following: (1) test of significance, (2) statistic, (3) null hypothesis, (4) test assumptions, (5) law of random variation for the statistic when H0 is true, (6) alternative hypothesis, (7) distribution of the statistic when HA is true, (8) four possible decisions in a test of significance, (9) type-I error, (10) critical values, or value, (11) type-II error, (12) power of the test, (13) rejection region, (14) acceptance region, (15) decision rules, (16) statistical decision, (17) diagram of a two-sided test of significance, (18) robustness, (19) sample size, (20) test of significance summarized, and (21) example of a two-tailed t-test.
AN INTRODUCTION TO TESTS OF SIGNIFICANCE
I INTRODUCTION
A Test of Significance
Statistics has been classified by many
into two broad subject areas: descriptive
statistics and statistical inference (see
Figure 1).
Descriptive statistics makes extensive
use of pictures, tables, graphs, and
other visual arrangements. These methods
are found popularized in magazines and
newspapers. The emphasis is centered on
summarizing the available information
into an easily assimilated form and not
on acquiring new knowledge about the world
we live in.
Inferential statistics on the other hand is further subdivided into two classes of problems, namely, estimation and tests of significance (other names for the latter are hypothesis testing, significance tests, and tests of hypothesis). New knowledge is acquired via induction; that is, we progress from the particular to the general or from sample to parent population.
Estimation methods are used to approximate
numerical values of unknown population
parameters (like the mean) from incomplete
or sample data. A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses.
These two methods are closely related, and indeed one can construct a procedure to cover both aspects. If they are separated, as done here, then the logical development for a test of significance is greatly simplified, resulting in a comprehensive understanding in a shorter time period.
The test of significance in turn is only one small link of the total experimental chain that can be briefly described as (1) problem definition, (2) data collection scheme, (3) data collection, (4) data analysis, and (5) the written report.
In the most carefully controlled investigation it is virtually impossible for two identical experiments to yield absolutely identical responses, under the assumption that the measuring instrument is sensitive. If, however, it is not, then identical responses can always be obtained.

FIGURE 1
CLASSIFICATION OF STATISTICS
(Statistics divides into descriptive statistics and statistical inference; inference divides into estimation and tests of significance; tests of significance divide into parametric and non-parametric.)
This inability to obtain identical responses is due to random variation in the surrounding environmental conditions, the imprecision of the measuring instruments, limitation of the observer's technique and experience, and the inherent variability encountered from sample to sample. All these factors plus others induce fluctuations in the experimental response. Even when the tested hypothesis is true the observed result will not match it.
The purpose then of a statistical test of
hypothesis is to sort out and identify the
differences attributable to expected random
observational fluctuations as opposed to the
differences attributable to deviations from
the hypothesis.
The basic question to be answered is "How
large must an observed difference be to
justify rejecting the hypothesis'" Or stated
otherwise, "How divergent must the observed
difference be in order to be called a rare
difference as opposed to expected random
variation' "
To answer this question the procedure or
the steps taken in a two-tailed test of
hypothesis are always the same'. The
explanation of the variations on the basic
two-tailed test will be taken up in the sequel.
The criterion or measure used for discrim-
inating between differences or distinguishing
between hypotheses is a statistic.
Statistic
A statistic (note no s on the end) can be
simply defined as a function of the sample.
That is, it is some value calculated from the
observational or sample data. This is
accomplished by using a formula or some prescribed recipe for manipulating the data.
The method varies from test to test and is
not unique.
Two examples are

    z = (x̄ - 10)√n / σ          (1)

    t = (x̄ - 10)√n / s          (2)

where x̄ = sample mean
      n = sample size
      σ = standard deviation of the parent population
      s = sample estimate of the standard deviation

In (1) the statistic value calculated from the sample is used to test if the mean of a sampled normal population is 10 when its standard deviation is known and equals 5. In (2) the test is the same except that the sampled population's standard deviation is unknown and is estimated from the data as s.

For example, if the four sampled values are 10, 11, 12, and 11, then z = (11 - 10)√4 / 5 = 0.4.
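The arithmetic can be sketched in a few lines of code; Python and all the names below are illustrative additions to this outline, not part of the original procedure:

    import math

    # Equation (1): test H0: mu = 10 with known sigma = 5.
    sample = [10, 11, 12, 11]
    n = len(sample)                        # n = 4
    xbar = sum(sample) / n                 # x-bar = 11
    z = (xbar - 10) * math.sqrt(n) / 5     # known sigma = 5
    print(z)                               # 0.4, agreeing with the text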
The statistic is the measure or criterion used in a test of significance to discriminate between the two hypotheses, namely, the null hypothesis and the alternative hypothesis.
Null Hypothesis
The null hypothesis is defined as the hypo-
thesis of no difference between a hypothetical
and the sample population. It will be shown
later that it should be formulated for the
express purpose of being rejected or nulli-
fied (see Alternative Hypothesis).
If the hypothetical population is labeled as Ph and the sampled population as Ps, then "no difference" can be written as Ps - Ph = 0, or null. Many statisticians notationally write the null hypothesis as

    H0: Ps = Ph    or    H0: Ps - Ph = 0

Unfortunately this notation is not universally uniform, but fortunately most deviations from it center on using a different subscript for H.

As an example, the null hypotheses for the statistics given in (1) and (2) are both written the same way:

    H0: μ = 10
For (1) the hypothetical population is
normally distributed with mean 10 and
standard deviation = 5 while the sampled
population is normally distributed with
standard deviation = 5 and an unknown
mean, and we wish to test whether the
unknown mean = 10.
For (2) the hypothetical population is nor-
mally distributed with mean = 10 and an
unknown standard deviation while the
sampled population has unknown mean and
standard deviation, and we wish to test
whether the unknown mean = 10.
In both cases we test whether the difference
between the sampled and hypothetical pop-
ulation is null or whether the sampled
population has a mean of 10.
Restrictions have been placed on the type of population sampled as well as on parameter information. These restrictions are popularly called test assumptions.
Test Assumptions
Each test of significance has associated
with it explicit or implicit, or both, test
assumptions that must be satisfied if the
results of the test are to be valid. Failure to meet the assumptions affects the probability statements that will be introduced later under the topic Type-I Error (see also Robustness).
For example the test assumptions or the
mathematical model for (1) are that the
sampled population is normally distributed
with standard deviation = 5. The theory
also requires that the samples be random
and independent.
Test assumptions are generally classified as either parametric or non-parametric. The former have more stringent requirements, and justification for their use is generally more difficult. For example, that the sampled population is normally distributed with a known standard deviation could be a parametric test assumption.
A non-parametric test requires no hypothesis about specific parameter values. These tests are often called distribution-free, which really means that the test is independent of the form of the underlying distribution. These two terms are not interchangeable but are so often interchanged that usage has made them indistinguishable. A non-parametric test might merely require that the sampled population be continuous in order to be validly applied. The requirements for a non-parametric test are generally fewer, and it can be applied to a larger set of source data, or used when the parametric assumptions cannot be proved or are unknown.
The advantages of parametric tests exceed those of non-parametric tests when they are applicable; this will be discussed further under the topic Power of the Test.
Let us assume that the test assumptions
(or assumption) are (or is) satisfied. If
the null hypothesis is true (sampled and
hypothetical populations are equal) then the
sampled values are determined by the sampled
population. These sampled values fix the
value for the statistic. Different random samples result in different random values for the statistic; that is, we say the statistic is
a random variable. The law of random varia-
tion that the statistic follows when the null
hypothesis is true must be known before
discrimination is possible.
Law of Random Variation for the Statistic When H0 is True
The law of random variation for the statistic
can be determined either analytically or
empirically. The analytic determination is
made by the originator of the test, who is
a researcher or a mathematical statistician.
For specific examples of an analytic develop-
ment the reader is referred to the journals
or the many commercial texts available on
mathematical statistics.
Some examples of the statements of several
of these laws as determined by the researcher
might be (1) the statistic is normally distri-
buted with zero mean and unit variance, or
(2) the statistic is distributed as chi-square
with degrees of freedom equal to sample
size minus one, or (3) the statistic is
t-distributed with degrees of freedom equal
to sample size minus two.
The empirical determination of the law of
random variation can be done by anyone who
really understands the test that is being
applied. If the experimenter cannot imagine
some sampling sequence using random
numbers to determine the law of variation
then there is a deficiency somewhere.
As an example the empirical determina-
tion for the z statistic defined by (1) could
be as follows. Take a random sample of
nine, n = 9 (any value for n can be used),
from a normal population whose mean is
10 and whose standard deviation is 5. From
these nine values calculate x̄ and finally the value of z from equation (1). As a result of this first sequence we have one value for z; label it z1.
This sequential procedure can be repeated as often as desired. Suppose it is repeated 1,000 times, resulting in 1,000 different values for z, namely z1, z2, ..., z1000. We can form a histogram using these 1,000 values and then draw a smooth, continuous curve through the center of the rectangles as shown in Figure 2.
The smooth curve labeled H0 in Figure 2 is an approximation to the law of random variation or the distribution of the z statistic when H0 is true. If instead of using 1,000 values for z, we used all possible values then the distribution would have been exact, that is, in perfect agreement with the analytic solution. With 1,000 values the differences between the two solutions are small and furthermore
two solutions are small and furthermore
can be made arbitrarily small by calcul-
ating enough sequences. The approximate
solution converges or approaches the
analytic solution as the number of sequences
increases.
If the above experiment were performed
then it would be determined that z would
be approximately normally distributed
with zero mean and unit variance, regard-
less of the value of n used.
The t statistic in (2) when sampling from
normal is distributed as a t-distribution
with degrees of freedom equal to (n - 1)
provided that the degrees of freedom used in estimating s are (n - 1). Usually this is true since x̄ and s are estimated from the same data.
That these distributions do approximate
the ones claimed can be verified objective-
ly via a test of significance or subjectively
by using the eyeball test. With 1,000
samples the agreement should pass the
eyeball test easily by the comparing of
percentiles or some other cumulative
values.
The reader is urged to verify (1) when
n = 4 or some small number. If a computer
is used then 1,000 is a small number of
sequences. If a computer is not used
then decrease the number of sequences,
realizing that the approximation will not
conform as well.
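One way to carry out the suggested verification is a short simulation; this sketch assumes Python with numpy, and the choices of n, the number of sequences, and all names are ours:

    import numpy as np

    rng = np.random.default_rng()
    n, sequences = 4, 1000
    # Each row is one sampling sequence from the hypothetical
    # population: normal with mean 10 and standard deviation 5.
    samples = rng.normal(loc=10, scale=5, size=(sequences, n))
    z = (samples.mean(axis=1) - 10) * np.sqrt(n) / 5
    # Under H0, z should be approximately standard normal.
    print(z.mean(), z.var())                   # near 0 and near 1
    counts, edges = np.histogram(z, bins=20)   # the histogram of Figure 2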
There are four distributions that one can
talk about up to this point: (1) the hypothe-
tical population or distribution, (2) the
sampled population, (3) the distribution
of the random sample of size n taken from
the sampled distribution, and (4) the dis-
tribution or law of random variation for the
statistic calculated from the sample (see
Figure 3A).
FIGURE 2
DISTRIBUTION OF THE STATISTIC WHEN THE NULL HYPOTHESIS IS TRUE
(A histogram of the z values with a smooth bell-shaped curve, labeled H0, drawn through the centers of the rectangles over the statistic axis.)
FIGURE 3
FOUR DISTRIBUTIONS IN HYPOTHESIS TESTING
(A. H0 is true: the hypothetical and sampled populations coincide at mean 10 on the variable axis; the random sample and the distribution of the sample are drawn with them, and the distribution of the statistic, with the unique value of the statistic obtained from the sample, is drawn on the statistic axis. B. HA is true: the sampled population is centered at 20 rather than 10; the distribution of the sample is like its parent if representative, and the distribution of the statistic is shifted accordingly, again with the unique value of the statistic marked.)
In an actual test there is only one sample
and hence only one value for the statistic.
In Figure 3A it is labeled z0. The reader
should bear in mind that it is only one
value out of a possible range of expected
random variation. Some intervals,
especially those near the mean, have
higher probabilities than others, owing
to the nature of the distribution. It will
be shown later that these probabilities
will be used to determine unexpected
random variation.
If H0 is not true then in the real world
some alternative hypothesis is true, and
this circumstance gives rise to a different
law of random variation, owing to the
truth of the alternative hypothesis.
Alternative Hypothesis
If the null hypothesis is not true then the
negation of this is the alternative hypothesis.
For our example earlier this is written by most as

    HA: μ ≠ 10

and read as "the mean of the sampled population is not equal to 10." Again observe the caution that this notation varies from author to author. The form of the test dictated by such an alternative hypothesis is a two-sided or two-tailed test, and the reason is shown in Figure 4.
There it will be noted that the alternative hypothesis occupies both sides of the parameter space, that is, to the right and left of μ = 10. A one-sided or one-tailed test will be discussed in the sequel, and, as the reader has guessed, the alternative occupies one side of the parameter space.
The alternative hypothesis must be the operational statement of the experimenter's research hypothesis. If this is not so, or the null hypothesis is the research hypothesis, then the researcher cannot really prove his research hypothesis. In order to prove the research or alternative hypothesis the null hypothesis must be rejected or nullified. A heuristic argument justifying this very important position (alternative and research hypothesis should be the same) will be taken up under Decision Rules.
In the design of experiments, data collection,
and the subsequent test of hypothesis, the
aim of the researcher should be to reject
his favorite theory.
If the result of the tests of significance from data so collected does reject the pet theory then savings in money, man-hours, and material have been realized.
On the other hand if the test results do not reject his position the implication is a large gain in the confidence of the validity of the theory. One is in the position of trying to prove something is wrong but being unable to do so. Each failure adds extra strength to the belief that it cannot be wrong.

FIGURE 4
PARAMETRIC SPACE FOR A TWO-SIDED TEST
(The μ axis: H0 is the single point μ = 10, while HA: μ ≠ 10 occupies both sides of it.)
Too often researchers design data collection
to support their theory and innocently build
in unknown biases. By taking the opposite
position and vigorously attacking it the
danger of bias is decreased. An aggressive
philosophy will yield more rapid advances
in research.
Distribution of the Statistic When HA is
True
Figure 3A shows the distribution of the statistic z in (1) when H0 is true or the sampled population has μ = 10. How is that distribution affected if the alternative hypothesis, μ ≠ 10, is true?

To be specific, suppose the sampled population had μ = 20 (see Figure 3B). By examining the numerator of (1) we note that z would now tend to have mostly positive values since most sample means would be greater than 10. This distribution for z would be shifted to the right of the μ = 10 curve.
A sequential sampling scheme would give its approximate shape. Obviously it would be some distribution like that labeled μ = 20 in Figure 3B. If μ = 0 then by the same argument the distribution for the statistic would be shifted to the left of the μ = 10 curve, as shown in Figure 5.
By applying the sequential sampling process it is easily seen that, for every possible value of μ for the sampled population, there would exist some distribution for the statistic. It could be either of the two shown in Figure 5 or some other not drawn there.
Regardless of the value of μ in the sampled population the reader can imagine one but only one distribution for HA if true. If H0 is true then there is only one distribution for the statistic (see Figure 3A).
The uncertainty faced is, what is the position of the true sampled population? Is it coincident as in 3A or shifted as in 3B?
This problem is solved by partitioning the
statistic axis so as to maximize our
possible correct decisions and minimize
our possible incorrect decisions when we
discriminate.
Four Possible Decisions in a Test of
Significance
The correct decisions are (1) do not reject H0 when it is true and (2) reject H0 when it is false. Our incorrect decisions are (3) reject H0 when it is true, and (4) do not reject H0 when it is false. These four possible decisions are shown in Figure 6.

In order to distinguish between the two types of incorrect decisions or errors, they are given the special names type-I and type-II error.
FIGURE 5
SOME POSSIBLE DISTRIBUTIONS FOR THE STATISTIC
(The distributions of the statistic for sampled populations with μ = 0, μ = 10, and μ = 20 drawn along the statistic axis; the critical values A and B, and the wider and narrower pairs C, D and E, F discussed in the text, partition the axis.)
FIGURE 6
FOUR POSSIBLE DECISIONS IN A TEST OF SIGNIFICANCE

                                 STATISTICAL DECISION
    REAL WORLD          DO NOT REJECT H0             REJECT H0
    H0 IS TRUE          Correct decision             Incorrect decision
                                                     (type-I error, α-error)
    H0 IS NOT TRUE      Incorrect decision           Correct decision
                        (type-II error, β-error)
Type-I Error

The type-I error (also called α-error or error of the first kind) is the rejection of the null hypothesis when it is actually true. The seriousness or quantification of this error is measured by using probability, and its value is obtained in the following way. Suppose in Figure 5 that we adopt the following discrimination rule: if the statistic is greater than or equal to A (some constant) reject H0; otherwise do not reject H0.
Suppose the sampled population is μ = 10. Since the distribution of the statistic for the hypothetical population is always known, its area under the curve to the right of a known A can be determined. This area is the probability of getting a value for the statistic which, according to our rule, means reject H0 falsely, or a type-I error has been made.
A familiar example of making a deliberate α-error is the fairy tale wherein the little girl jokingly cries "wolf, wolf" falsely (H0: no wolf; HA: wolf).
The magnitude of the α-error is called α and is also known as the level of significance. In a test of significance the reverse procedure is actually followed; that is, α is pre-specified before any data are collected, and from a set of appropriate tables the corresponding value for A is determined.
It should be clear from Figure 3 that the law of random variation is determined by the test assumptions, and failure to meet them results in a different law. Using the incorrect law implies an incorrect A. Thus, conforming to the test assumptions is important for a correct α and β. The only loophole possible is discussed under robustness.
The tables for various laws of random
variation are found in the back of almost
every book in statistics and are used to
specify A, which is called the critical
value or significant limit.
Critical Values or Value
In the two-sided tests used here two
critical values are used in order to maxi-
mize both correct decisions shown in
Figure 6. A heuristic argument leading
to this conclusion is as follows.
Assume that H0 is true; therefore the only distribution in Figure 5 is the μ = 10 curve. Let the statistic be the normally distributed z of (1). If α = 0.05 then from a set of normal-curve tables the single critical value for the statistic is A = 1.64. The probability of z less than or equal to 1.64 is 0.95, so the correct decision (do not reject H0) will be made, on the average, 95 percent of the time. The probability of getting z greater than 1.64 is 0.05. We shall make the incorrect decision or an α-error (reject H0 when true), on the average, 5 percent of the time. The value for A is determined by α, and the distribution of the statistic is determined by the known hypothetical distribution whether H0 is true or not.
Is A a good choice when H0 is not true? Let us assume for definiteness that μ = 0 in the sampled population (see Figure 5). The probability of a correct decision now (reject H0) is almost zero. The value of the statistic would be some point on the μ = 0 curve as determined by the particular sampled data. The probability that it is greater than A is the area under the μ = 0 curve to the right of A, or almost zero. The correct decision of rejecting H0 is almost never made; A is a poor choice under the assumption μ = 0.
On the other hand A is a good choice if the sampled population is μ = 20. In that case the probability of rejecting H0 is high and is the area under the μ = 20 curve to the right of A. If one alternative is true we have high probability of making the correct decision (reject H0) while if another alternative is true we have low probability.
We try to correct this situation by the following procedure. Keep α = 0.05 and change the critical value from A to B = -1.64. The decision rule is changed to reject H0 if the statistic is less than B; otherwise do not reject H0. Applying the same logic as in the previous two paragraphs shows that if μ = 20 is true then the correct decision (reject H0) has almost zero probability. If μ = 0 is true then the correct decision has high probability.
Thus to protect against the alternative's being on either side of μ = 10, the area should be put equally in both tails, giving rise to two critical values. For example, if α = 0.05 then A = 1.96 and B = -1.96 for the normal curve. Do not reject H0 will be the decision if the statistic lies between these values; otherwise it will be rejected.
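The table lookup can be reproduced with a quantile function; a sketch assuming Python with scipy (the function decide below is our illustration, not a prescribed procedure):

    from scipy.stats import norm

    alpha = 0.05
    A = norm.ppf(1 - alpha / 2)     #  1.96, right critical value
    B = norm.ppf(alpha / 2)         # -1.96, left critical value

    def decide(z):
        # Two-sided decision rule of the text.
        return "do not reject H0" if B <= z <= A else "reject H0"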
Values of the statistic from -1.96 to 1.96 are expected random variation. Those outside this range are unexpected and represent genuine differences when the power of the test is considered, a topic to be defined soon.
If the sampled population is μ = 20 then the probability of a correct decision (reject H0) is the area under its curve to the right of A and to the left of B, a nice high number. If the sampled population is μ = 0 then the probability of a correct decision is the area under its curve to the right of A and to the left of B, also a high value.
If the sampled population is μ = 10 then the probability of a correct decision (do not reject H0) is the area under its curve between A and B, which is 1 - α, a preselected number.
Both correct decisions have a high probability regardless of the value of μ in the sampled population.
The α is preselected by the researcher, not the consulting statistician; however, the implications of a large or small α should be made clear.
In initial research an α-error of 10 percent or even 20 percent is reasonable and not unusual. In that case the researcher is guarding against rejecting the research hypothesis and is less concerned about an α-error. A large α means that the distance between A and B in Figure 5 is decreased to, say, E and F. The decreased distance means that the continue-research or reject-H0 probabilities are increased regardless of the sampled population.
As research continues and other tests are made, the α level is gradually reduced. If an α-error (to incorrectly continue research) was made initially, a subsequent test will reveal the mistake, and no harm will have been done except extra research.
On the other hand if the α-error is made small initially, the distance between A and B is increased to, say, C and D. The increased distance means that the do-not-reject-H0 (abandon research) probabilities are increased regardless of the sampled population.
If a type-II error (see Figure 6; abandon research when it should be continued) is made then the mistake would never be uncovered unless another researcher works in the same area.
A type-I error can also be thought of as
the researcher's risk of following a false
clue in research. If a type-I error is
made then the statistical decision is that
the research hypothesis is true when it
really is not. The result is further re-
search in a false area. Eventually enough
evidence or subsequent tests of hypothesis
will reverse the decision and reject the
research hypothesis. Because of the initial incorrect decision, we have followed the false clue (continue research) and done subsequent research until a later test corrected the initial incorrect decision.
The type-I error can also be viewed as taking a risk. The risk is the probability of saying the research hypothesis is true when it really is not. This risk is α, and 1 - α is the probability or the confidence one would like for accepting a true H0. Both values are related to each other by odds. The odds of being right when H0 is true are 1 - α to α. Therefore, if one chooses α = 0.05 the odds of a right decision when H0 is true are 0.95 to 0.05, or 19 to 1. Changing α to 0.01 changes the odds to 99 to 1, and α = 0.1 implies odds of 9 to 1 of a correct decision if H0 is true.
One cannot set the odds too high since
there is a penalty for each increase, namely,
raising the probability of a type-II error.
These two errors are not independent, and
the interrelationship between them can be
explained in the following way.
Type-II Error

The type-II error (also called β-error or error of the second kind) is the probability of not rejecting H0 when it is false, or the probability of rejecting HA when it is true. β is the magnitude of the β-error.

Graphically this error can be shown in Figure 5 by considering the following hypothetical problem. Test H0 with the decision do not reject H0 if the statistic lies between A and B; otherwise reject.
Suppose HA is true and μ = 20. The probability that a statistic will be between A and B is the area under its curve between these two points. Therefore β, the probability of not rejecting H0 falsely (it should be rejected since HA is true), is the area under the appropriate true alternative hypothesis over the region of do not reject H0, or from A to B.
If the critical values are changed from A, B to C, D (decrease α) then β would be increased. Changing the critical values from A, B to E, F (increase α) would decrease β. Thus α and β are inversely related.
Unfortunately the value for the alternative is never known and β cannot be calculated. The size of the type-II error depends on the disparity or distance between the null and alternative hypotheses, sample size, α, and the particular test of hypothesis used. Further remarks will be made about β estimation in the sequel.
The α-error is generally considered by most to be more serious than a β-error. Making false claims about our research, an α-error, is a more serious error than abandoning a possibly fruitful research effort, a β-error.
Furthermore, journals frequently publish articles in which H0 is rejected, and one seldom sees articles that do not reject H0. This means α-errors are the only kind that can appear in print. One wonders how much duplication there is in research because of that fact. Failure to reject H0 is also information that could guide other researchers in the selection of their hypotheses.
A good example of minimizing α irrespective of the effect on the size of β exists in our courts of law (H0: not guilty; HA: guilty). We would sooner accept a β-error (let a guilty man go free) than commit an α-error (send an innocent man to jail). Of course, we can guarantee α = 0 always by setting everyone free.
The β-error can be thought of as the risk of failing to follow a true clue. Our decision is do not reject H0 (or reject the research hypothesis) when H0 should be rejected (or HA accepted). Owing to this incorrect decision, we abandon further research that should be pursued. The true clue is the truth of HA, but because of a β-error we reject it and abandon follow-up research.
Power of the Test

One minus beta (1 - β) is defined as the power of the test. It is the probability of accepting HA when true. Graphically it is the area under the true alternative over the rejection region.

It can be thought of as our confidence level of accepting the research hypothesis when true. Thus tests with high power are desirable. The odds of accepting HA if true are (1 - β) to β.
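For a specific assumed alternative, β and the power can be computed directly. A sketch assuming Python with scipy, using the z statistic of (1) with n = 9; the alternative value μ = 13 is an assumption made only for illustration:

    from math import sqrt
    from scipy.stats import norm

    n, sigma, alpha = 9, 5, 0.05
    A = norm.ppf(1 - alpha / 2)             # 1.96
    B = -A
    # Under HA: mu = 13, the z of (1) is normal with this mean:
    shift = (13 - 10) * sqrt(n) / sigma     # 1.8
    beta = norm.cdf(A - shift) - norm.cdf(B - shift)
    power = 1 - beta                        # about 0.44 here
    print(beta, power)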
If you have a set of data for which there is
a parametric and a non-parametric test
available then the parametric is more
powerful and preferred. Suppose you were
searching for someone in a large crowd, and
the only clue you had was that the individual
was definitely in the crowd. This search
would not have much hope of being fruit-
ful (i.e., powerful). On the other hand if
the available clues were that the person
sought is male, 6 feet tall, has black hair,
is wearing a brown suit, smokes cigars,
limps, has an amputated right arm, and so
on, then the search would be effective or
powerful. More clues imply a more
powerful search, and by analogy the more
assumptions that are made and realized, the
more powerful the test.
Rejection Region

The preselection of α in conjunction with the known distribution of the statistic for the hypothetical population determines the critical values. These critical values in turn partition the total statistic axis into two parts, the rejection region (also called significance region or critical region) and the acceptance region. The rejection region is so selected that if HA is true then the probability of accepting it is high (1 - β), while at the same time if H0 is true the probability of rejecting it is α, the preselected value.
These values are classified as real differences, not random variation, and represent a compromise when type-I and type-II errors are taken into consideration.
In Figure 5 if C and D are the critical
values then the rejection region is to the
right of C and to the left of D. The re-
mainder of the statistic axis is called the
acceptance region.
Acceptance Region
The acceptance region is selected so that if HA is true the probability of rejecting it is small (β), while at the same time if H0 is true the probability of not rejecting it is the preselected 1 - α.
The action to be taken when a statistic is in
either region is called the decision rule.
Decision Rules

Our first decision or discrimination rule (see Figure 6) is do not reject H0 if the statistic falls in the acceptance region (some aliases: cannot reject H0, a nonsignificant result, outside the rejection region, outside the critical region, accept H0, research hypothesis not proved).
A nonsignificant result should be interpreted as meaning that such a statistic value from a sample size this large is obtained so frequently when H0 is true that the data convince no one that anything more than an H0 random process produced them. If H0 is true then the acceptable odds, preselected by the researcher at 1 - α to α, favor this decision. It also means that the research hypothesis was not proved statistically according to the standard of evidence based on probability.
The contracted statement of the decision
rule, to accept H0, is often used instead
of the longer statement. If one took this
statement literally (unfortunately some
do) it would be incorrect, since the null
hypothesis is never proved when a non-
significant result is obtained.
A heuristic argument taken from Draper and Smith(1) is as follows. John Doe, an office worker, is at lunch when the idea is advanced that he is not rich (H0: not rich; HA: rich). Two proofs are cited: (1) he always buys his clothes at second-hand stores, and (2) he always brings his lunch or eats at the cheapest places, as today. The statistical parallel is that two nonsignificant tests of significance were incorrectly interpreted as proving H0 true instead of do not reject H0. It was learned later that Mr. Doe died and left $500,000 to charity. All the proofs of the null hypothesis are invalidated.
One can never prove the null hypothesis by getting a nonsignificant result. The implication is clear: given two tests, (1) H0 = research hypothesis and (2) HA = research hypothesis, one must choose the second test. If a significant result is obtained then the research hypothesis has been proved. See the sequel for further remarks.
Our second decision rule is reject H0 if the statistic falls in the rejection region. Rejection means that if H0 is true then a statistic value of this magnitude from a sample size this large is so rarely obtained (odds favoring it are α to 1 - α) by a random process alone that this peculiar statistic value points to something over and above the H0 random process.
When the statistic falls within the rejection region, two logical possibilities exist. The first is that H0 is true and a rare event, or an α-error, was produced by the random process. The only other possibility is that HA is true and caused this large statistic value. Since the acceptable α-error was preselected, the decision must be that the sampling was from HA.
Some authors state the result of a significant statistic as HA is true unless a rare event has happened. Constant repetition of this long statement becomes tedious, and it is often contracted to HA is true, accept the research hypothesis, reject H0, there is a significant result, and so forth.
Statistical Decision

The statistical decisions in a test are (1) do not reject H0 if the statistic is in the acceptance region, (2) reject H0 if the statistic is in the rejection region.
In a test where the statistic has a continuous distribution the critical value can be placed in either region since it does not affect the value of α.
In tests where the statistic is discrete
the tables clearly state whether the critical
value is the boundary of the rejection or of
the acceptance region. Caution is advised.
Note also that for discrete distributions probabilities vary by discrete jumps. Hence it is unusual to have α equal exactly to 0.05 or any other level. In those cases the α level is specified as 0.05 or less.
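The discrete jumps can be seen with a two-sided sign test; a sketch assuming Python with scipy (the sign test and the choice n = 10 are our illustrative assumptions):

    from scipy.stats import binom

    n = 10
    # Two-sided sign test of H0: p = 0.5; reject for k or fewer
    # successes or n - k or more.
    for k in (0, 1, 2):
        print(k, 2 * binom.cdf(k, n, 0.5))
    # Attainable alpha jumps from about 0.002 to 0.021 to 0.109;
    # exactly 0.05 cannot be attained.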
Diagram of a Two-Sided Test of Significance

The preceding terms have been collected and their interrelationship shown in Figure 7. Some interpretations are as follows.
If H0 is true, its distribution is the only one for the statistic in Figure 7, and three kinds of statistic values can be obtained. First would be a value in the left rejection region, meaning reject H0 incorrectly, or an α-error. Next, a value in the right rejection region, with the same result. The third value we can obtain lies between the two critical values, meaning do not reject H0 correctly, or no error.
On the other hand if the sampled population has a larger mean than Ph, its distribution is on the right. When the first of its three values is in the right rejection region, H0 is rejected, or HA is accepted, correctly or with no error. The next value could be in the left rejection region (not shown, to prevent overcrowding) with the same result. The third value could be in the acceptance region with the incorrect decision do not reject H0, or a β-error.

The discussion if the sampled population has a smaller mean follows analogously and is left to the reader. Again only two of the three values are shown.

FIGURE 7
TWO-SIDED TEST OF SIGNIFICANCE
(The real world possibilities HA true due to a smaller mean, H0 true, and HA true due to a larger mean are set against the statistical decisions reject H0 / do not reject H0 / reject H0. The distributions of the statistic for H0 and for the larger-mean alternative are drawn over the statistic axis; each combination of truth and decision is labeled no error, α/2 or type-I error, or β or type-II error, with the acceptance region between the two critical values and a rejection region in each tail.)
Robustness
A robust test is one that is insensitive to
departures from the assumptions. A test
that is sensitive to departures from the
test assumptions lacks robustness. If a
test is robust and the assumptions are not
violated badly then the odds still remain as
specified. This means that, for example,
if one uses a test which assumes normality
and the source data are quasi-normal, then
a robust test can be confidently applied.
It is impossible to quantitate the departures
from test assumptions that can be made in
general and still have valid odds. Each
test is different and much research has
been done. The reader should consult with
a statistician on these matters.
If a test requires several assumptions then
one can talk about robustness with respect
to each assumption or with respect to any
combination of assumptions. See the sequel
for further remarks.
Sample Size
A realistic question before data collection
is how many sample points are needed. In
many cases this question can be answered
provided the researcher can supply infor-
mation about variability and the size of
differences that are to be discriminated.
Consultation is again suggested since much
work has been and is being done.
Test of Significance Summarized
A test of significance can be defined as a
method of analyzing data so as to discrim-
inate between two hypotheses. The first
is the null hypothesis; the second is the
alternative hypothesis, which should be the
operational statement of the experimenter's
research hypothesis. Next, the best test
for the alternative hypothesis is selected,
that is, the one with the greatest power, and
is robust. This test is based upon assump-
tions, which may be explicitly or implicitly
stated, or both. The probability statements
used in discrimination are based upon the
assumptions. After the test is selected,
the α-error is prespecified, that is, the odds of being right if the null hypothesis is true. This α-error in conjunction with the known distribution of the statistic for the null hypothesis determines critical values. The critical values partition the entire statistic axis into two parts, the acceptance and rejection regions.
The sample size is fixed and the data are
now collected. From the collected data
the statistic is evaluated and compared
with the critical values to determine
whether it is in the acceptance or rejection
region. After this comparison the appro-
priate decision is made.
If the statistic is in the acceptance region, the appropriate decision is do not reject H0, or cannot reject H0, or reject HA, and so on. If the statistic is in the rejection region, the decision is reject H0, or accept HA, or there is a significant difference, and so forth. With each decision there is a possible error.
When the statistic falls in the acceptance
region and if in the real world HA is true
and we reject HA, this is a type-II error;
however, if HQ is true there is no error.
There is no chance for a type-I error when
the statistic is in the acceptance region.
When the statistic falls in the rejection
region and if in the real world HQ is true
and we reject H0, this is a type-I error;
however if HA is true there is no error.
There is no chance for a type-II error
when the statistic is in the rejection region.
Briefly, when a test of significance is performed the following information should be available: (1) H0, (2) HA, (3) α, (4) test assumptions, (5) statistic, (6) distribution for the statistic, (7) critical values, (8) acceptance region, (9) rejection region, and (10) decision after the statistic is evaluated.
Additional desirable information is (11) a comparison of the power of the test versus that of others available for the same H0 and HA, (12) sample size needed to discriminate differences of specified magnitudes, (13) information about robustness, and (14) type-II error information.
Example of a Two-Tailed t-Test
We illustrate a two-tailed t-test by using
a four-step process for convenience's sake.
Assume that we wish to test at the 5 percent significance level whether a new product average μ is different from the previous standard = 7.35. The observational data are random, independent, normally distributed Xi, i = 1, 2, ..., 25.

Step 1. H0: μ = 7.35
        HA: μ ≠ 7.35
        α = 0.05
Step 2. The statistic = √n (x̄ - 7.35)/s, which is distributed as a t-distribution with 24 degrees of freedom. From a table of t values, the two critical values are ±2.064. The acceptance region lies between these two values, and the rejection region is all other statistic values.
Step 3. From the data, x̄ = 7.1 and s = 0.504; therefore, the statistic = √25 (7.1 - 7.35)/0.504 = -2.480, which is in the left rejection region.

Step 4. Reject H0 since the statistic is in the rejection region, or the new average is different from the old one.
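The four steps can be checked numerically; a sketch assuming Python with scipy (the summary values x̄ = 7.1 and s = 0.504 are taken from the text):

    from math import sqrt
    from scipy.stats import t

    n, xbar, s, alpha = 25, 7.1, 0.504, 0.05
    crit = t.ppf(1 - alpha / 2, df=n - 1)      # 2.064
    statistic = sqrt(n) * (xbar - 7.35) / s    # -2.480
    print(abs(statistic) > crit)               # True: reject H0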
Many more examples can be found in
almost every textbook on statistics. The
four-step process used above is not
generally used but is adopted here for
instructional convenience.
In the concluding article we discuss one-tailed tests and variations on a two-tailed test, and conclude with several remarks that should further clarify other ideas about a test of significance.
REFERENCE

1. Draper, N.R. and Smith, H. Applied Regression Analysis. John Wiley and Sons, Inc., New York. 1967.
This outline was prepared by Joseph F. Santner, Mathematical Statistician, Manpower Development Staff, Office of Water Programs, National Training Center, EPA, Cincinnati, OH 45268.
VARIATIONS ON A TEST OF SIGNIFICANCE
This is the second of a two-part paper which describes the program of
action in a test of significance when extended to one-sided tests. In addition
simple and composite hypotheses are illustrated by several examples.
Finally, the relationship between a significance test and confidence limits
is exhibited when variations on a two-sided t-test are discussed.
VARIATIONS ON A TEST OF SIGNIFICANCE
In part one of this paper we characterized the program of action necessary in a two-tailed test of significance by defining and relating the following: (1) test of significance, (2) statistic, (3) null hypothesis, (4) test assumptions, (5) law of random variation for the statistic when H0 is true, (6) alternative hypothesis, (7) distribution of the statistic when HA is true, (8) four possible decisions in a test of significance, (9) type-I error, (10) critical values or value, (11) type-II error, (12) power of a test, (13) rejection region, (14) acceptance region, (15) decision rules, (16) statistical decision, (17) diagram of a two-sided test of significance, (18) robustness, (19) sample size, (20) test of significance summarized, and (21) example of a two-tailed t-test. The purpose of this paper is to extend the procedure to one-sided tests.
I TWO-SIDED VERSUS A ONE-SIDED TEST
The decision to make either a two-sided test or a one-sided test (also called one-tailed) is not left to the discretion of the researcher. This determination is made in advance of any data collection and is based upon either theoretical grounds or previous experience. If at least one of these causes is present then a one-sided test can be made and is actually preferred, since it is generally more powerful than the corresponding two-sided test. If, however, there exists no valid justification for a one-sided test, a two-sided test is mandatory. This procedure, as well as the preselection of α before data collection, guarantees that statistical decisions will not be biased.
II STATEMENT OF THE HYPOTHESES
The null and alternative hypotheses for a
one-sided test can be written in two different
ways, neither of which is universally accept-
able to statisticians and neither of which is
both logical and satisfying to the researcher.
Authors recognize the problem; however, their varied solutions cause much difficulty for readers.
We illustrate with a specific example. Suppose
one has theoretical grounds to test whether
a given distribution has a new mean greater
than 10 when the previous mean was 10.
These hypotheses can be written either as

    H0: μ = 10    HA: μ > 10        (1)

or

    H0: μ ≤ 10    HA: μ > 10        (2)

where ≤ is read less than or equal.
The parameter space for this one-sided test
is shown in Figure 1. In both (1) and (2) the
alternative hypothesis occupies one side of the
parameter space line, and hence its name.
On the other hand the null hypothesis is only
a point for (1) but a half line for (2).
Figure 1
PARAMETRIC SPACE FOR A ONE-SIDED TEST
(The μ axis: for (1), H0 is the point μ = 10; for (2), H0 is the half line μ ≤ 10; in both, HA: μ > 10 occupies one side of the line.)
It is logical to define the hypotheses as (1) since the definition of α remains as the probability of rejecting H0 if true. It is very unsatisfactory when one realizes that if the true mean is less than 10 the correct decision is impossible. A satisfying decision procedure would be one which makes a correct decision for all possible values of the parameter. This is true only when the union of the hypotheses is the entire parameter line.
It is satisfying to define the hypotheses as (2) but illogical, since the α-error must be redefined as the probability of rejecting H0 when μ = 10 only. This follows since the null hypothesis in (2) has an infinite number of different values for μ. Hence for a given α there are an infinite number of critical values with a corresponding number of acceptance and rejection regions. Out of this set there is only one which gives HA its greatest power, and that is when μ = 10. This is the one used in a test, and hence the redefinition of α.
Those who prefer (1) object to the illogical redefinition of α, and rightly so. Those who are partial to (2) object to the unsatisfactory, incomplete specification of the parameter space. The saving grace is that, regardless of which specification is used, the program of action and final decisions agree either that there is not enough evidence to support the research hypothesis or that the research hypothesis has been proved statistically. I prefer (2) and shall use it throughout with α redefined.
III DIAGRAM OF A ONE-SIDED SIGNIFICANCE TEST

Figure 2 shows the diagram of a one-sided test. Note that the distribution for H0 is a unique curve and would correspond to H0 in (1). If (2) were used then the distribution of H0 would be the same, obtained by using the μ = 10 value of H0.
Figure 2
ONE-SIDED SIGNIFICANCE TEST
(Real world versus statistical decision: do not reject H0 / reject H0. The distribution of the statistic for H0 is drawn over the statistic axis with a single critical value; the acceptance region lies to its left and the rejection region, carrying all of α in the right tail, to its right.)
In comparing Figure 2 with the diagram of a two-sided test of Figure 3 some differences are obviously apparent. The most notable is that all of α is in the right tail of H0 in Figure 2 as opposed to α/2 in Figure 3. This shifting is heuristically obtained by modifying Figure 3 sufficiently to obtain a one-sided test. Since larger differences are the research hypothesis and smaller differences are of no interest, the latter distribution should be deleted from Figure 3.

The probability of accepting HA when true is the area under its curve over the rejection region, and hence the rejection region should be selected to maximize this area. This is done by inspection of Figure 3. It is easily seen that increasing the right rejection region increases power. This is done by moving the right critical value as far to the left as possible (such that all of α is in the right tail of H0), which maximizes power.
Figure 3
TWO-SIDED TEST OF SIGNIFICANCE
(A reproduction of Figure 7 of part one: the distributions of the statistic for H0 and for the smaller- and larger-mean alternatives drawn over the statistic axis, with the acceptance region between the two critical values, α/2 or type-I error in each rejection tail, and β or type-II error under each alternative over the acceptance region.)
Note that this shifting, or increasing the size of the right rejection region, in turn increases the area under HA over the rejection region, or increases the probability of accepting HA when true, or increases the power of the test. Therefore if HA is true the one-sided test has more power than a corresponding two-sided test; in other words, one-sided tests are preferred.
The discussion of Figure 2 follows analogously
that of Figure 3 in part one of this paper
and is left to the reader.
IV EXAMPLE OF A ONE-TAILED t-TEST
An example of a one-sided test could be as follows. There exist theoretical grounds for testing whether a new product average exceeds the previous standard = 7.35. The observational data are random and independently normally distributed Xi, i = 1, 2, ..., 25.

Step 1. H0: μ ≤ 7.35
        HA: μ > 7.35
        α = 0.05

Step 2. The statistic = √n (x̄ - 7.35)/s, which is distributed as a t-distribution with degrees of freedom = 24. From a set of t tables, the critical value is 1.711 (see Figure 2). The acceptance region is less than or equal to 1.711, and the rejection region is all other values.

Step 3. From the data, x̄ = 7.6 and s = 0.504; hence the statistic = √25 (7.60 - 7.35)/0.504 = 2.48.

Step 4. Reject H0 since the statistic is in the rejection region, or the new average exceeds the old.
Modifying the test for a research hypothesis with a smaller mean would be as follows.

Step 1. H0: μ ≥ 7.35
        HA: μ < 7.35
        α = 0.05

Step 2. The statistic is the same as in the previous problem. Critical value = -1.711, with the acceptance region greater than or equal to -1.711 and the rejection region all other values.

Step 3. From the data, x̄ = 7.10 and s = 0.504; hence the statistic = √25 (7.10 - 7.35)/0.504 = -2.48.

Step 4. Reject H0.
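Both one-sided versions can be checked the same way; a sketch assuming Python with scipy (an illustrative addition only):

    from math import sqrt
    from scipy.stats import t

    n, alpha = 25, 0.05
    crit = t.ppf(1 - alpha, df=n - 1)     # 1.711, all of alpha in one tail

    stat_hi = sqrt(n) * (7.60 - 7.35) / 0.504   #  2.48 >  1.711: reject H0
    stat_lo = sqrt(n) * (7.10 - 7.35) / 0.504   # -2.48 < -1.711: reject H0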
V SIMPLE AND COMPOSITE HYPOTHESES
At this time it might be well to define two other terms used in connection with hypothesis testing. A simple hypothesis is one that specifies the distribution uniquely or is a point in the parameter space of the distribution. A composite hypothesis is one that is not simple.

If the distribution for Figure 1 has one parameter, say the mean, then H0 as written on the left is a simple hypothesis, or a point. H0 on the right is composite, or the half line, while HA is composite in both statements.
An example of a two-parameter family would
be the normal distribution as shown in Figure 4.
If the test H0: μ = 10 is made with known variance = 10 then H0 is a simple hypothesis since it is the point (10, 10).

If the test is H0: μ = 5 with an unknown variance then H0 is composite, or the half line (5, σ²), or the unmarked, solid, vertical line of Figure 4. The alternative in both cases is composite.
VI VARIATIONS ON A TWO-SIDED t-TEST

When one reads two different texts giving the same test, confusion can result if the test is presented in what looks like two different programs of action. The confusion is muddied even further when more references are consulted. Figure 5 shows some common ways of presenting a two-sided t-test for the mean, where s_x̄ = s/√n.

The top line of Figure 5 agrees with Figure 3. Note that one can algebraically manipulate from one line of Figure 5 to another. In some cases it is possible to do a two-sided test with only one critical value, as shown in line 2.
A good example of a two-sided test with only
one possible critical value is the F-test used
in an analysis of variance. The right tail is
always used since we have the ratio of a
numerator which consists of a larger or equal
mean square divided by a denominator mean
square.
Note that lines 3 and 4 have no statistic or critical values as defined in this paper. They would be classified as quasi-tests of significance by this writer.
Figure 4
PARAMETER SPACE FOR THE NORMAL DISTRIBUTION
(The plane with σ² axis vertical and μ axis horizontal; the point (10, 10) marks the simple hypothesis, and a solid vertical line marks the composite half line (5, σ²).)
Figure 5
VARIATIONS ON A TWO-SIDED TEST

    EXPLANATION                                METHOD OF TEST
    1. Random variable and test statistic      -t ≤ (x̄ - μ)/s_x̄ ≤ t
    2. Random variable and test statistic      |x̄ - μ|/s_x̄ ≤ t
    3. Random variables, no test statistic     -t·s_x̄ ≤ x̄ - μ ≤ t·s_x̄
    4. Random variables, no test statistic     |x̄ - μ| ≤ t·s_x̄
    5. Random variables, no test statistic     x̄ - t·s_x̄ ≤ μ ≤ x̄ + t·s_x̄

(Each line states the acceptance region; t is the tabled critical value with n - 1 degrees of freedom.)
At times it may be possible to manipulate a significance test into confidence limits. Hence one can perform a test by using confidence limits, as shown in the last line of Figure 5. This result is not always true. Here estimation and tests are on common ground. Lines 1 and 5 of Figure 5 are mathematically equivalent.
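A sketch of the line 1 / line 5 equivalence, assuming Python with scipy and reusing the two-sided example of part one (an illustrative addition):

    from math import sqrt
    from scipy.stats import t

    n, xbar, s, mu0 = 25, 7.1, 0.504, 7.35
    crit = t.ppf(0.975, df=n - 1)
    s_xbar = s / sqrt(n)

    # Line 1: compare the test statistic with +/- t.
    reject_1 = abs((xbar - mu0) / s_xbar) > crit
    # Line 5: does the confidence interval fail to cover mu0?
    lo, hi = xbar - crit * s_xbar, xbar + crit * s_xbar
    reject_5 = not (lo <= mu0 <= hi)
    assert reject_1 == reject_5    # the two decisions always agree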
If the reader is unsure of Figure 5 then a numerical example should be used to work out all cases. If the previous numerical example is used then arithmetic and time are at a minimum, with the advantage that the statistical decision can be checked for agreement if α = 0.1 is used.
Variations on a one-sided test are left to
the reader. They can be easily obtained
from lines 1, 3, and 5 with the appropriate
modification.
VII CONCLUDING REMARKS

We conclude with several remarks that should help clarify further the ideas associated with a test of significance.

A For small samples a significant result will be obtained only if the null hypothesis is very badly violated. In Figure 2 this means that the distance between the peaks of H0 and HA must be very large. Small samples imply large variances for both distributions, and hence poor power. It is a truism that from small samples you get small information.
B If a difference exists and is shown, then by increasing sample size this difference can be shown with a smaller type-I error. For example, if one gets a significant result at α = 0.05 and a sample size of 20, then with a sample size of 50 a significant result can be obtained with a smaller α. The effect of increasing sample size in Figure 2 would be to cause the variances of both H0 and HA to be decreased, or both distributions would be more peaked. Hence to keep α fixed the critical value must be shifted to the left, and this shift thereby increases the rejection region, or the power of the test. In addition, since HA is more peaked the type-II error would be decreased without changing the critical value, and hence power increases for a second reason. The sum of these two increments gives a more powerful test. Therefore α can be decreased, or the critical value moved right, to retain the original power, which was significant.
C Using the same α and increasing sample size will result in a rejection of H0 even when differences are smaller and thus more difficult to detect. As an example, if α = 0.05 and sample size = n failed to give a significant result because of a type-II error, then increasing the sample size sufficiently will result in a significant statistic. Increasing the sample size increases the power for the same α, as discussed in B above.
D For the same sample size and α, larger
mean differences are more easily dis-
criminated. As the difference between H0
and HA increases, this means in Figure 2
that the peaks of the two distributions be-
come farther apart. This increases power
and the probability of a significant result.

E For the same sample size and α, smaller
differences are more difficult to detect.
The heuristic argument follows that in
the previous paragraph: with HA and H0
shifted closer together, the loss in power can
be regained by increasing sample size.
F It is often claimed that if a difference exists,
no matter how small, one can prove it pro-
vided that sample size is large enough. One
can reduce both type-I and type-II errors
by increasing sample size. If sample size
is infinite then both errors can be made zero.
G If after increasing sample size the same
decision of nonsignificance is obtained, the
confidence of agreement between the sampled
and hypothetical populations is strengthened.
In other words, if one continuously obtains
nonsignificant results with increasing sample
size, then the acceptance of H0 becomes
more tenable. Complete confidence in H0
is obtained only with an infinitely large
sample.
H Trying to prove the validity of H0 by non-
significant results for finite n is like the
mathematician's trying to prove a theorem
by example. If the mathematician's
theorem is true for every possible example,
then it is proved. By the same reasoning,
if every possible sample of size n is tested,
then one would expect to have a significant
result αN times, where N = total number
of all possible samples of size n. Significant
deviations from αN would invalidate H0.
I If a test does not meet the assumptions,
then the α level is affected and is not what
the researcher specified. Power may also
be adversely affected. A test with fewer
-------
Variations on a Test of Significance
assumptions is preferred unless the
departures are compensated for by
robustness.
J If a researcher really understood the test
he was using, with enough time he could
determine the critical value empirically
via a Monte Carlo simulation study. Re-
peated sampling would permit the construc-
tion of the distribution of the statistic for
H0. From it the critical value could be
easily determined. If the researcher cannot
imagine an experiment for obtaining the
distribution via some random sampling
procedure, then he really does not under-
stand his test of significance.
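The remark in J can be made concrete with a small Monte Carlo sketch in Python (the normal parent, n = 10, and nominal α = 0.05 are assumed for the illustration): repeated sampling under H0 builds the distribution of the statistic, and the critical value is read off empirically.

# Sketch: empirical critical value for a two-sided t-test via Monte Carlo.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(1)
mu0, sigma, n, trials = 10.0, 5.0, 10, 20000
ts = sorted(abs(t_statistic([random.gauss(mu0, sigma) for _ in range(n)], mu0))
            for _ in range(trials))
crit = ts[int(0.95 * trials)]    # empirical 95th percentile of |t|
print(crit)                      # close to the tabled t.975(9) = 2.262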
K Similarly, by using the Monte Carlo
technique, one can empirically determine
the power of the test versus a
specific HA. First assume that HA has a
given set of parameter values; next deter-
mine its approximate distribution via Monte
Carlo; and determine 1-β, or the power, as
the area under the curve over the rejection re-
gion. By assuming a different set of parameter
values and by repeating the above, one
can study the change in the power of the
test versus any set of specified parameter
values.
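In the same spirit, the power against one specific HA can be estimated by sampling under HA and counting how often the statistic lands in the rejection region fixed under H0 (the sketch below assumes HA: μ = 13, σ = 5, n = 10, α = 0.05):

# Sketch: Monte Carlo power of the two-sided t-test against an assumed H_A.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(2)
mu0, mu_a, sigma, n, trials = 10.0, 13.0, 5.0, 10, 20000
t_crit = 2.262                   # tabled t.975 for 9 df
hits = sum(abs(t_statistic([random.gauss(mu_a, sigma) for _ in range(n)], mu0)) > t_crit
           for _ in range(trials))
print(hits / trials)             # estimate of the power 1 - beta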
L In the same way one can use Monte Carlo
techniques to investigate robustness. If
one assumes a departure of a certain magni-
tude from the basic assumptions, then the
evaluation of α is determined as in J,
where the rejection region is now known from
the distribution for H0 when no assumptions
are violated. If these two α's are in agree-
ment, then the procedure is robust with
respect to the preceding departures from
the basic assumptions. This procedure can
be repeated for any assumption, or combina-
tion of them, or any magnitude of departure
from them.
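A robustness study of the kind described in L can likewise be sketched (the assumed departure here is a skewed exponential parent in place of the normal parent; H0 still holds because the exponential mean is set to μ0):

# Sketch: actual alpha of the t-test when normality is violated.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(3)
mu0, n, trials, t_crit = 10.0, 10, 20000, 2.262
# expovariate(1/10) has mean 10, so H0 holds but the normality assumption fails
hits = sum(abs(t_statistic([random.expovariate(1 / mu0) for _ in range(n)], mu0)) > t_crit
           for _ in range(trials))
print(hits / trials)             # compare with the nominal alpha = 0.05

If the two α's agree, the test is robust against that departure.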
This outline was prepared by
Joseph F. Santner, Mathematical
Statistician, Manpower Develop-
ment Staff, Office of Water Programs,
National Training Center, EPA,
Cincinnati, OH 45268.
-------
STATISTICAL DATA
-------
Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION

Tests of significance (hypothesis testing, tests of hypothesis, significance tests, ...).

A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses.

[Handwritten outline of the experimental chain: 1) problem definition, 2) data collection scheme, 3) data collection, 4) data analysis, 5) written report]
1A-1
-------
A statistic (note: no s on the end) can be simply defined as a function of the sample, and is the measure or criterion used to discriminate between the two hypotheses used in a test of significance, namely the null and alternative hypotheses.

[Handwritten notes: n = sample size; s = stan. dev. of sample; σ = par. pop. stan. dev.; formulas 1) z = (X̄ - 10)/(σ/√n) and 2) t = (X̄ - 10)/(s/√n)]

Two examples of a statistic are formulas 1) and 2) above.
1A-2
-------
The null hypothesis is defined as the hypothesis of no difference between a hypothetical and the sampled population.

Hypothetical population (P) is normally distributed with mean = 10 and standard deviation = 5.

Sampled population (P) is normally distributed with unknown mean and standard deviation = 5.

Each test of significance has associated with it explicit or implicit test assumptions, or both, that must be satisfied if the results of the test are to be valid.

1A-3
-------
Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION
      PARAMETRIC
      NON-PARAMETRIC

[Handwritten annotations, illegible]
1A-4
-------
Distribution of the Statistic When the Null Hypothesis Is True
[Figure: distribution of the statistic over the statistic axis]

Four Distributions in Hypothesis Testing - H0 is True
[Figure: hypothetical pop. equals sampled pop.; random sample; distr. of sample (like parent if repeated); distr. of statistic, with the value of the statistic from the sample marked on the statistic axis]
1A-5
-------
The alternative hypothesis is usually the negation of the null hypothesis but
can be any admissible hypothesis alternative to the one under the test.
[Handwritten: the statistic t = (X̄ - 10)/(s/√n)]

Parametric Space for a Two-Sided Test
[Figure: μ axis with the point 10 marked]
H0: μ = 10
HA: μ ≠ 10
1A-6
-------
Four Distributions in Hypothesis Testing - HA is True
[Figure: hypothetical pop.; sampled pop.; random sample; distr. of sample (like parent if rep.); distr. of statistic; unique value of the statistic on the statistic axis]

Four Distributions in Hypothesis Testing - H0 is True
[Figure: hypothetical pop. equals sampled pop.; random sample; distr. of sample (like parent if rep.); distr. of statistic; value of the statistic from the sample]

Some Possible Distributions for the Statistic
[Figure: distributions of the statistic for μ = 0 and μ = 20, with points D, B, E, F, A, C marked on the statistic axis]
1A-7
-------
Four Possible Decisions in a Test of Significance

                            STATISTICAL DECISION
REAL WORLD          DO NOT REJECT H0             REJECT H0
H0 IS TRUE          CORRECT DECISION             INCORRECT DECISION
                                                 (TYPE I ERROR) (α-ERROR)
H0 IS NOT TRUE      INCORRECT DECISION           CORRECT DECISION
                    (TYPE II ERROR) (β-ERROR)

(Type I error = error of the first kind; Type II error = error of the second kind.)
α-error (Type I error, error of the first kind) is the rejection of the null hypothesis when it is true.

Some Possible Distributions for the Statistic
[Figure: distribution of the statistic for μ = 0, with points D, B, E marked on the statistic axis]

The magnitude of the α-error is called α and is also known as the level of significance.

The critical values (or value) partition the statistic axis into two regions, the acceptance region and the rejection region.
-------
Some Possible Distributions for the Statistic
[Figure: distributions of the statistic for μ = 0, μ = 10, μ = 20, with points D, B, E, F, A, C marked on the statistic axis]
1A-9
-------
[Handwritten: odds of a correct to an incorrect decision when H0 is true, e.g. 99 to 1 for α = 0.01]
1A-10
-------
β-error (Type II error, error of the second kind) is not rejecting the null hypothesis when it is false, or is the rejection of the alternative hypothesis when it is true. The magnitude of the β-error is called β.

The α-error is generally considered more serious than a β-error.

Some Possible Distributions for the Statistic
[Figure: points D, B, E on the statistic axis]

One minus beta (1-β) is defined as the power of the test.

The odds of accepting the alternative hypothesis (HA) when true are 1-β to β.

Parametric tests are generally more powerful than the corresponding non-parametric tests.

The rejection region is a region on the statistic axis such that if the statistic falls within it the null hypothesis is rejected. It is also called the significance region and the critical region.

The acceptance region is a region on the statistic axis such that if the statistic falls within it the null hypothesis is not rejected.

The first decision or discrimination rule is: do not reject H0 if the statistic falls in the acceptance region (some aliases: cannot reject H0, a non-significant result, outside the rejection region, outside the critical region, accept H0, research hypothesis not proved).

The second decision or discrimination rule is: reject H0 if the statistic falls in the rejection region (some aliases: HA is true, accept the research hypothesis, there is a significant result, HA is true unless a rare event has happened).
1A-11
-------
Two-Sided Test of Significance
[Figure: distribution of the statistic for H0; acceptance region between two critical values; rejection regions of size α/2 (type I error) in each tail of the statistic axis]

Two-Sided Test of Significance
[Figure: HA true due to larger mean; distributions of the statistic for H0 and for the larger-mean HA; β or type II error inside the acceptance region, no error in the rejection region; critical values on the statistic axis]
1A-12
-------
Two-Sided Test of Significance
[Figure: HA true due to smaller mean; distributions of the statistic for H0 and for the smaller-mean HA; β or type I/II error regions, acceptance and rejection regions, critical values]

Two-Sided Test of Significance
[Figure: all three cases on one statistic axis - HA true due to smaller mean, H0 true, HA true due to larger mean - showing where the no-error, β or type II error, and α/2 or type I error areas fall relative to the two critical values]
1A-13
-------
A robust test is one that is insensitive to departures from the assumptions. A test that is sensitive to departures from the test assumptions lacks robustness.

A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses. The first hypothesis is the null hypothesis; the second is the alternative hypothesis, which should be the operational statement of the experimenter's research hypothesis. Next the best test for the alternative hypothesis is selected, that is, the one with the greatest power, and one that is robust. This test is based upon assumptions, which may be explicitly or implicitly stated, or both. The probability statements used in discrimination are based upon the assumptions. After the test is selected, the α-error is prespecified, that is, the odds of being wrong if the null hypothesis is true. This α-error in conjunction with the known distribution of the statistic for the null hypothesis determines the critical values. The critical values partition the entire statistic axis into two parts, the acceptance and rejection regions. The sample size is fixed and the data are now collected. From the collected data the statistic is evaluated and compared with the critical values to determine whether it is in the acceptance or rejection region. After this comparison the appropriate decision is made. If the statistic is in the acceptance region, the appropriate decision is: do not reject the null hypothesis. If the statistic is in the rejection region, the decision is: reject the null hypothesis. With each decision there is a possible error. When the statistic falls in the acceptance region, if in the real world the alternative hypothesis is true and we reject it, this is a Type-II error; on the other hand, if the null hypothesis is true and we do not reject it, then there is no error. There is no chance for a Type-I error when the statistic is in the acceptance region. When the statistic falls in the rejection region, if in the real world the null hypothesis is true and we reject it, this is a Type-I error; on the other hand, if the alternative hypothesis is true and we do not reject it, then there is no error. There is no chance for a Type-II error when the statistic is in the rejection region.
1A-14
-------
[Handwritten worked example, illegible]

If there exists no valid justification (theoretical grounds or previous experience) for a one-sided test, then a two-sided test is mandatory.
-------
Parametric Space for a One-Sided Test
[Figure: μ axis with the point 10 marked]
H0: μ = 10
HA: μ > 10

1A-16
-------
One-Sided Significance Test
[Figure: real world H0 true vs. HA true; statistical decision do not reject H0 vs. reject H0; distributions of the statistic for H0 and HA; acceptance region, rejection region, critical value on the statistic axis]
Two-Sided Test of Significance
[Figure: HA true due to smaller mean, H0 true, HA true due to larger mean; distributions of the statistic for H0 and both alternatives; no-error, β or type II error, and α/2 or type I error areas; acceptance region, rejection regions, and critical values on the statistic axis]
-------
[Handwritten worked example: two-sided t-test with critical values C.V. = ±1.711]
-------
A simple hypothesis is one that specifies the distribution uniquely, or is a point in the parameter space of the distribution.

A composite hypothesis is one that is not simple.

Parametric Space for a Two-Sided Test
[Figure: μ axis with the point 10 marked; H0: μ = 10]

Parameter Space for the Normal Distribution
[Figure: σ² axis against μ axis, with μ = 10 marked]

1A-19
-------
Variations on a Two-Sided Test
[Partial repeat of Figure 5: explanation column (random variable and test statistic; random variables, no test statistic) and method-of-test column, e.g. |X̄ - μ|/S_X̄]
-------
1A-21
-------
1A-22
-------
Distribution of the Statistic When the Null Hypothesis Is True
[Figure: distribution over the statistic axis]

1A-23
-------
[Handwritten: fixed parameter values; H0(1) meets the test assumptions, H0(2) fails; critical values from tables]
-------
EFFECT OF A CHANGE IN μ ALONE
[Figure: two normal curves, μ2 larger than μ1, σ the same for the curves]
μ IS A MEASURE OF CENTRAL TENDENCY, OR WHERE VALUES TEND TO CENTRALIZE.

EFFECT OF CHANGING σ ALONE AND KEEPING μ THE SAME
[Figure: two normal curves with the same μ and different σ]
σ IS A MEASURE OF SPREAD OR VARIATION IN ORIGINAL SCALE UNITS.

1B-1
-------
EFFECTS OF CHANGING BOTH μ AND σ
[Figure: two normal curves, μ2 greater than μ1, σ2 greater than σ1]

1B-2
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1)
[Table excerpt: rows z = 0.0, 1.0, 2.0; columns .00 to .09; e.g. P = .8413 at z = 1.00 and P = .9798 at z = 2.05]

1B-3
-------
1B-4
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1, T-2)
[Table excerpt as on page 1B-3, with handwritten workings]
1B-5
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: 0.05 in the tail]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten workings]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645]
1B-8
-------
[Handwritten: Tables A-1 and A-2 are for the unit normal. If X is any normal random variable, Z = (X - μ)/σ is unit normal; to find an area for X, transform to Z and use the tables]

1B-9
-------
[Figure: normal curve over X from 10 to 30]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1, T-2)
[Table excerpt as on page 1B-3]

1B-10
-------
[Handwritten: if Z1, Z2, ... are independent unit normals, then χ² = Z1² + Z2² + ... is a chi-square random variable]
χ²-Distribution
[Figure: chi-square density]

1C-1
-------
[Handwritten: normal distribution with 0.025 in the tail; chi-square distribution]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]

1C-2
-------
[Handwritten: χ² distribution, 0.05 in the tail, acceptance region, df = 10]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3, T-4)
[Table excerpt as above]

1C-3
-------
[Handwritten: random variable, df]

1C-4
-------
1D-1
-------
Normal vs Student Distribution
[Figure: standard normal and Student t (4 d.f.) densities, f(t) against t from -2 to 2, peak near 0.4]
Student t-Distribution

1D-2
-------
[Handwritten: definition of the t random variable]

1D-3
-------
Percentiles of the t Distribution (Table A-4)
-------
[Handwritten: 0.05 in the tail]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: df = 24, 26, ..., 60; t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]

1D-5
-------
[Handwritten: 0.025 and 0.975 points]
Percentiles of the t Distribution (Table A-4)

1D-6
-------
[Handwritten: the F random variable as a ratio of independent chi-squares, each divided by its degrees of freedom]
F-DISTRIBUTION (n1 = 10, n2 = 4)
[Figure: F density]
-------
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)

n2\n1      1       30      ∞
1        161.4   250.1   254.3
2        18.51   19.46   19.50
30        4.17    1.84    1.62
∞         3.84    1.46    1.00

[Handwritten workings]

1E-2
1E-2
-------
[Handwritten: 0.95 and 0.05 areas]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5)
[Table excerpt as on page 1E-2]

1E-3
-------
[Handwritten: F.95(30, 30), 0.05 in the tail]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)
-------
PARENT POPULATION / SAMPLING DISTRIBUTION

Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION
      PARAMETRIC
      NON-PARAMETRIC

2-1
-------
ESTIMATION
  POINT (REAL LINE)
  CONFIDENCE-INTERVAL
    ONE-SIDED (LOWER or UPPER)
    TWO-SIDED

[Handwritten: for μ = par. pop. mean]
-------
2-3
-------
s IS A BIASED ESTIMATOR OF σ = POP. STAND. DEV. (Page 1-10)

SAMPLE SIZE, n    s IS AN UNBIASED ESTIMATE OF:
2                 0.7979σ
3                 0.8862σ
2-4
-------
ESTIMATION
  POINT (REAL LINE)
  CONFIDENCE-INTERVAL
    TWO-SIDED
    ONE-SIDED (LOWER or UPPER)

SAMPLE SIZE
2-5
-------
2-6
-------
[Handwritten: 100(1-α)% confidence interval for μ]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
-------
2-9
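These pages set up the two-sided confidence interval for μ with σ known, X̄ ± z(1-α/2)·σ/√n. A minimal Python sketch (the values X̄ = 12.0, σ = 5, n = 25 are assumed for the illustration), using z.975 = 1.960 from Table A-2:

# Sketch: two-sided 95% confidence interval for mu, sigma known.
import math

xbar, sigma, n, z = 12.0, 5.0, 25, 1.960
half_width = z * sigma / math.sqrt(n)
print(xbar - half_width, xbar + half_width)   # (10.04, 13.96)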
-------
CONF. INT. FOR σ

FACTORS FOR COMPUTING TWO-SIDED CONFIDENCE LIMITS FOR σ (Tables T-31 to T-35; also α = 0.01, 0.001)

DEG. OF FREEDOM v    α = 0.05:  B_U      B_L
1                               17.79    0.3576
2                               4.859    0.4581
3                               3.183    0.5178
100                             1.157    0.8757
2-10
-------
CONF. INT. (TWO-SIDED)
[Handwritten computation of two-sided confidence limits for σ]
-------
[Table (Figure 2-1): estimates of the standard deviation from the range b - a for several distributions, e.g. σ estimated by (b - a)/4.2]
2-12
-------
[Handwritten notes]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: confidence interval computation]
Percentiles of the t Distribution (Table A-4)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
2-14
-------
[Handwritten: γ = 0.95]
NUMBER OF MEASUREMENTS REQUIRED TO ESTABLISH THE VARIABILITY WITH STATED PRECISION (Figure 2-2)
[Figure: curves of required sample size (up to 1000) against precision P%, for γ = .90, .95, .99]
2-15
-------
-------
TWO-SIDED TOLERANCE LIMITS: X̄ ± Ks COVERING 100P% OF THE POP.

FACTORS FOR TWO-SIDED TOLERANCE LIMITS FOR NORMAL DISTRIBUTIONS (Table T-10)
[Table excerpt: n = 2, 3, 4; γ = 0.90; P = 0.75, 0.90, 0.95, 0.99, 0.999; e.g. K = 4.943 at one tabulated cell; tabulated for γ = 0.75, 0.90, 0.95, 0.99 and n = 2 (various) to ∞]
2-17
-------
One Sample Problem

TEST                         CONDITIONS
H0: μ = CONSTANT             1. σ UNKNOWN
HA: μ ≠ CONSTANT             2. σ KNOWN

H0: μ ≤ CONSTANT             3. σ UNKNOWN
HA: μ > CONSTANT             4. σ KNOWN

H0: μ ≥ CONSTANT             5. σ UNKNOWN
HA: μ < CONSTANT             6. σ KNOWN
-------
[Handwritten: sample size needed to detect a given difference d]
Sample Sizes Required... (two-sided t-test, α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt: columns d = .2, .4, .8; rows 1-β = .5, .6, .7, .8, .9, .95, .99; e.g. n = 97, 25, 7 across d at 1-β = .5; n = 31 and 8 at 1-β = .6; n = 263 at 1-β = .9]

3-2
-------
-------
[Handwritten worked example: one-sample t-test; decision: do not reject H0]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-4
-------
[Handwritten workings]
Sample Sizes Required... (Table A-8, T-16; also α = 0.01)
[Table excerpt as on page 3-2]
3-5
-------
[Handwritten: interval for detectable differences; small α and small β require the correct sample size]
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt as on page 3-2]

3-6
-------
One Sample Problem
[Table as on page 3-1: H0/HA pairs for μ with σ unknown or known; handwritten: case 5, σ unknown, marked; X̄ = 7.35]
-------
[Handwritten: computing the standardized difference d ≈ 0.8, with σ estimated from the range b - a, e.g. (b - a)/3.5 or (b - a)/4.2 (Figure 2-1, Page 2-9)]
Sample Sizes Required... (α = 0.05) (Table A-8, T-16)
[Table excerpt: e.g. n = 23 at 1-β = .6]
3-8
-------
[Handwritten workings; acceptance region]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-9
-------
3-10
-------
Two Sample Problem
[Table: tests comparing two means, with conditions on σA and σB, e.g. 1. σA = σB]
-------
[Handwritten workings]
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt as on page 3-2]
3-12
-------
3-13
-------
[Handwritten decision: do not reject H0]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
Two Sample Problem
[Table: cases 5 and 6, σA and σB both unknown; condition 1. σA = σB]
-------
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-9, T-17)
[Table excerpt: columns d = .1, .2, .4, ..., 3.0; rows 1-β = .5 to .99; e.g. n = 91 and 23 at 1-β = .6, n = 30 at 1-β = .7]
3-16
-------
3-17
-------
[Handwritten decision]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-18
-------
OC Curves for the Two-Sided t-test (α = .05) (Figure 3-1, Page 3-6)
[Figure: probability of not rejecting H0, from 0 to 1.0, plotted against the true standardized difference]
3-19
-------
[Handwritten: reading the OC curve, e.g. at d = 1.5 the difference is detected with high probability]
OC Curves for the Two-Sided t-test (α = .05)

3-20
-------
[Handwritten: given n = 10 and d ≈ 0.7, β is read from the curve; smaller differences give larger β]
OC Curves for the Two-Sided t-test (α = .05) (Figure 3-1, Page 3-6)

3-21
-------
[Handwritten: given n = 100, β is almost zero; power increases with n]
OC Curves for the Two-Sided t-test (α = .05)
3-22
-------
[Handwritten: given d]
see Table 3-1, Page 3-4
see Table 3-2, Page 3-22
see Article 3-3.1.4, paired observations, two-sided test
see Article 3-3.2.4, paired observations, one-sided test
see Article 3-4 for the k-sample problem
3-23
-------
One Sample Problem
[Table: H0 vs. HA for σ]
-------
[Handwritten: testing σ against a constant via the normal approximation]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
4-2
-------
[Handwritten: one-sided confidence interval for σ]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ

DEG. OF FREEDOM    A.05      A.95
1                  0.5103    15.947
2                  0.5778    4.415
20                 0.7979    1.358
100                0.8968    1.133

ALSO A.025, A.01, A.005, A.975, A.99, A.995
4-3
-------
[Handwritten decision workings]
4-4
-------
Operating Characteristics...
[Figure: OC curve against the ratio σ/CONSTANT from 1.0 to 3.0]

Two Sample Problem
[Table: tests 1 and 2 on σA vs. σB]
4-5
-------
4-6
-------
[Handwritten: large-sample test for σ]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: degrees of freedom and the computed F ratio]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)
[Table excerpt as on page 1E-2]
4-8
-------
Operating Characteristics...
[Figure: OC curve; handwritten note F.95(30, 30) = 1.84]
-------
One Sample Problem (C = CONSTANT)
[Table: cases 1-3, H0: σ = C against one- and two-sided alternatives]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ

DEG. OF FREEDOM    A.05      A.95
1                  0.5103    15.947
2                  0.5778    4.415
20                 0.7979    1.358
100                0.8968    1.133

ALSO A.025, A.01, A.005, A.975, A.99, A.995

4-10
-------
4-11
-------
One Sample Problem (C = CONSTANT)
[Table: 1. H0: σ = C vs. HA: σ ≠ C; 2 and 3: one-sided alternatives]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ
[Table as on page 4-10; also A.025, A.01, A.005, A.975, A.99, A.995]
4-12
-------
4-13
-------
One Sample Problem (C = CONSTANT)
[Table: H0 vs. HA for σ, cases 1-3]

FACTORS FOR COMPUTING TWO-SIDED CONFIDENCE LIMITS FOR σ (Table A-20, T-34; also α = 0.01, 0.001)

DEG. OF FREEDOM v    α = 0.05:  B_U      B_L
1                               17.79    0.3576
2                               4.859    0.4581
3                               3.183    0.5178
100                             1.157    0.8757
4-14
-------
[Handwritten workings]
Two Sample Problem
[Table: tests 1 and 2 on σA vs. σB]
4-15
-------
4-16
-------
BIVARIATE NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure: joint density f(X, Y)]
5-1
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure]

LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII
[Figure: joint distribution of Xi and Yi]
-------
FUNCTIONAL RELATIONSHIP FI
[Figure: X values chosen by experimenter; a line of Y means]
LINEAR FUNCTIONAL RELATIONSHIP (Figure 5-2)
5-3
-------
MODEL CLASSIFICATION (NATRELLA)

X FACTOR IS:    CONTROLLED                       UNCONTROLLED
QUANTITATIVE    FUNCTIONAL RELATIONSHIP          STATISTICAL RELATIONSHIP
                OR REG. ANA.:                    OR CORR. ANA.:
                MODEL FI (NO ERROR IN X,         MODELS SI & SII
                ERROR IN Y ONLY);
                MODEL FII (ERROR IN X & Y)
QUALITATIVE     ANA. OF VARIANCE,                ANA. OF VARIANCE,
                FIXED EFFECTS MODEL              RANDOM EFFECTS MODEL
5-4
-------
MODEL FI
[Figure: fitted line y = b0 + b1x plotted through the data points]
5-5
-------
[Handwritten: least-squares work sheet for X-Y data]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4
5-6
-------
[Handwritten: Sxx = ΣX² - (ΣX)²/n = 10; Syy = ΣY² - (ΣY)²/n = 50 - 36 = 14; Sxy = ΣXY - ΣXΣY/n = 11]

DATA BANK: ΣX = 0, ΣY = 12, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10; b1 = 1.1, b0 = 3
5-7
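The data-bank arithmetic above can be checked with a short Python sketch that reproduces the least-squares results from the sums alone:

# Sketch: Model FI least squares from the data-bank sums.
n, SX, SY, SXY, SX2, SY2 = 4, 0.0, 12.0, 11.0, 10.0, 50.0

xbar, ybar = SX / n, SY / n
Sxx = SX2 - SX ** 2 / n            # 10
Syy = SY2 - SY ** 2 / n            # 14
Sxy = SXY - SX * SY / n            # 11

b1 = Sxy / Sxx                     # 1.1
b0 = ybar - b1 * xbar              # 3.0
s2_Y = (Syy - b1 * Sxy) / (n - 2)  # residual variance 0.95
print(b0, b1, s2_Y)                # matches b0 = 3, b1 = 1.1, s2_Y = 0.95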
-------
Sxx = 10, Sxy = 11, Syy = 14; b1 = Sxy/Sxx = 1.1; b0 = Ȳ - b1X̄ = 3

MODEL FI
[Figure: fitted line y = b0 + b1x = 3 + 1.1x plotted through the data]
5-8
-------
MODEL FI: y = β0 + β1x + ε (MODEL), with ε normal about zero; fitted by ŷ = b0 + b1x
[Figure: model line with scatter, y from -4 to 10]
5-9
-------
[Handwritten: s²_Y = (Syy - b1·Sxy)/(n - 2)]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10, Sxy = 11, Syy = 14; b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082
5-10
-------
[Handwritten: s_b0 = 0.4873]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10, Sxy = 11, Syy = 14; b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082, s_b0 = 0.4873, t.975[df = 2] = 4.303 (PAGE T-5)

Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.975(2) = 4.303]
5-11
-------
[Handwritten: confidence interval for the mean of Y at a chosen X]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-12
-------
[Handwritten: confidence limits for Y at a chosen X, Model FI, ŷ = b0 + b1x]
5-13
-------
[Handwritten worked example: predicted Ŷ at a chosen X, with limits]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-14
-------
-------
[Handwritten: test of H0: β1 = constant, α = 0.05]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-16
-------
5-17
-------
[Handwritten workings]

DATA BANK: ΣX = 0, X̄ = 0, n = 4; ΣY = 12, Ȳ = 3; ΣXY = 11, ΣX² = 10, ΣY² = 50; Sxx = 10, Syy = 14; b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s_Y = 0.9747, s_b1 = 0.3082, s_b0 = 0.4873, t.975[df = 2] = 4.303 (PAGE T-5)
5-18
-------
[Handwritten: confidence interval for b0, α = 0.05]
5-19
-------
MODEL: LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FI
[Figure; error term ε]
5-20
-------
LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII (Figures 5-3, 5-5)
[Figure: joint distribution of Xi and Yi]

TRANSFORMATIONS TO A STRAIGHT LINE

RELATIONSHIP       PLOT
Y = a + bX         Y vs. X
1/Y = a + bX       1/Y vs. X
Y = ab^X           log Y vs. X
Y = aX^b           log Y vs. log X
5-21
-------
[Handwritten notes on testing linearity, with a reference to Cochran]
5-22
-------
H0: LINEAR RELATIONSHIP (ART. 5-4.1.6, ART. 5-4.1.3)
[Flow: REJECT H0 -> TRY a second-degree term; DO NOT REJECT H0 -> USE Ŷ = b0 + b1X]
NOTE: THE COEFFICIENT MAY NOT EQUAL ZERO WHEN A SECOND DEGREE TERM IS INTRODUCED.
[Handwritten reference to a statistics manual]
5-23
-------
5-24
-------
[Handwritten note]
BIVARIATE NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure: joint density f(X, Y)]
5-25
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP

[Model classification table as on page 5-4 (NATRELLA): controlled vs. uncontrolled X, quantitative vs. qualitative; Models FI and FII; Models SI & SII; fixed- and random-effects analysis of variance]
5-26
-------
[Handwritten: for Model SI the computations use the same data bank as FI (Art. 5-5); also confidence intervals, etc.]
5-27
-------
DATA BANK: ΣX = 0, X̄ = 0, n = 4; ΣY = 12, Ȳ = 3; ΣXY = 11, ΣX² = 10, ΣY² = 50; Syy = 14; Sxy = 11, b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082, t.975[df = 2] = 4.303 (PAGE T-5)
5-28
-------
MODEL SI
[Figure: fitted line ŷ = b0 + b1x = 3 + 1.1x with the joint scatter]
5-29
-------
DATA BANK: (as on page 5-28); s_b0 = 0.4873
5-30
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Handwritten: r² = coefficient of determination = (explained variance of Y)/(variance of Y)]
5-31
-------
5-32
-------
FUNCTIONAL RELATIONSHIP FI
[Figure: X values chosen by experimenter; a line of Y means]

LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII (Figures 5-3, 5-5)
[Figure: joint distribution of Xi and Yi]

5-33
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure; handwritten: data on X and Y]
5-34
-------
[Handwritten: Bernoulli trials; outcomes A and not-A, e.g. coin toss head or tail]
7A-1
-------
[Handwritten: take n trials and count occurrences of A; the count is binomial]
7A-2
-------
[Handwritten: binomial probability of x successes out of n trials, P(x) = n!/(x!(n - x)!) p^x (1 - p)^(n - x)]

7A-3
-------
[Handwritten: estimation of the parent pop. proportion]

TWO-SIDED CONF. INTERVALS FOR PARENT POP. PROPORTION P

NATURE     SAMPLE    ART.       METHOD
EXACT      n ≤ 30    7-3.1.1    TABLE A-22
EXACT      n > 50    7-3.1.2    TABLE A-24
APPROX.    n > 30    7-3.1.3    FORMULA
7B-1
-------
CONFIDENCE LIMITS FOR PROPORTION (TWO-SIDED) (Table A-22, T-40)
[Table excerpt: r = 11, n = 27; 90%: (.239, .593); 95%: (.223, .598); 99%: ...]

[Handwritten: r = no. of red ..., confidence statement for P]
7B-2
-------
TWO-SIDED CONF. INTERVALS FOR PARENT POP. PROPORTION P
[Table as on page 7B-1]
7B-3
-------
[Handwritten workings]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
ONE-SIDED CONF. INTERVAL FOR P

NATURE     SAMPLE    ART.       METHOD
EXACT      n ≤ 30    7-3.2.1    TABLE A-23
EXACT      n > 50    7-3.2.2    TABLE A-24
APPROX.    n > 30    7-3.2.3    FORMULA

CONFIDENCE LIMITS FOR PROPORTION (ONE-SIDED) (Table A-23)
[Table excerpt: r = 0, 11, 16, 26; n = 27; e.g. .549 at 90%, .583 and .752 at 95%, .645 at 99%]
7B-5
-------
CONFIDENCE LIMITS FOR PROPORTION (ONE-SIDED)
[Table excerpt as on page 7B-5]
7B-6
-------
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart: belts plotted against p̂ = r/n from about .1 to .5, ordinate 0.2 to 0.5]

7B-7
-------
SAMPLE SIZE DETERMINATION

TWO-SIDED CONFIDENCE INTERVAL
NATURE     SAMPLE    ART.       METHOD
EXACT      n > 50    7-4.1.1    TABLE A-24
APPROX.    n > 30    7-4.1.2    FORMULA

ONE-SIDED CONFIDENCE INTERVAL
EXACT      n > 30    7-4.2      FORMULA
7B-8
-------
[Handwritten: choice of P when P is unknown]
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
-------
[Handwritten: n from the normal approximation]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P = CONSTANT             1. N ≤ 30
HA: P ≠ CONSTANT             2. N > 30

[Further one-sided cases, conditions as above]

P = PARENT POPULATION PROPORTION
8-1
-------
[Handwritten worked example: the hypothesized P lies inside the 90% two-sided confidence interval (.239, .593) for r = 11, n = 27; decision: do not reject H0]

CONFIDENCE LIMITS FOR PROPORTION (TWO-SIDED) (Table A-22, T-40)
[Table excerpt: r = 11, n = 27; 90%: (.239, .593)]
8-2
-------
[Handwritten: number of observations required to detect a shift in P; δ = 0.1; for a value close to 0.5, take P = 0.5]

TABLE OF ARC SINE TRANSFORMATION FOR PROPORTIONS: θ = 2 arcsin √p
[Table excerpt: θ(0.4) = 1.37, θ(0.5) = 1.57]

CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]

8-3
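The transformation θ = 2 arcsin √p is variance-stabilizing: on the θ scale a sample proportion has variance close to 1/n whatever p is. A short Python check of the tabled values:

# Sketch: arc sine transformation for proportions.
import math

def theta(p):
    return 2 * math.asin(math.sqrt(p))

print(round(theta(0.4), 2), round(theta(0.5), 2))   # 1.37, 1.57 as tabled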
-------
[Handwritten workings]
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart against p̂ = r/n]
8-4
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P = CONSTANT             1. N ≤ 30
HA: P ≠ CONSTANT             2. N > 30

H0: P ≤ CONSTANT             1. N ≤ 30
HA: P > CONSTANT             2. N > 30

H0: P ≥ CONSTANT             1. N ≤ 30
HA: P < CONSTANT             2. N > 30

P = PARENT POPULATION PROPORTION
8-5
-------
TWO SAMPLE PROBLEM

TEST                 CONDITIONS
H0: PA = PB          1.-2. NA = NB
HA: PA ≠ PB          3. NA ≠ NB (LARGE SAMPLES)

PA = PROPORTION FOR A POPULATION
NA = SAMPLE SIZE FROM A POPULATION
8-6
-------
[Handwritten: comparing proportions from populations A and B]

MINIMUM CONTRAST...
5% LEVEL, TWO-SIDED (ALSO 1%)
2.5% LEVEL, ONE-SIDED (ALSO 0.5%)
[Table excerpt: sample size nA = nB = 20; pairs A1, A2: 0,5  1,7  2,9  3,10  4,11  5,13  6,14; tabulated for nA = nB = 1(1)20(10)100(50)200(100)500]
8-7
-------
[Handwritten: transforming each proportion to θ]

TABLE OF ARC SINE TRANSFORMATION FOR PROPORTIONS: θ = 2 arcsin √p (Table A-27)
[Table excerpt: θ(0.4) = 1.37, θ(0.48) = 1.53]

CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
8-8
-------
8-9
-------
[Handwritten: acceptance region]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3, T-4) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
8-10
-------
TWO SAMPLE PROBLEM
[Table: H0: PA = PB; conditions 1.-2. both samples small, 3. NA ≠ NB (large samples); PA = proportion for A population, NA = sample size from A population]

MINIMUM CONTRAST... (Table A-28, T-56)
5% LEVEL, TWO-SIDED (ALSO 1%); 2.5% LEVEL, ONE-SIDED (ALSO 0.5%)
[Table excerpt: nA = nB = 20; A1, A2: 0,5  1,7  2,9  3,10  4,11  5,13  6,14; nA = nB = 1(1)20(10)100(50)200(100)500]
8-11
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P ≥ CONSTANT             1. N ≤ 30
HA: P < CONSTANT             2. N > 30

P = PARENT POPULATION PROPORTION

CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart against p̂ = r/n]
8-12
-------
[Handwritten: night shift]
CATEGORIES OF REJECTION (DISCRETE): SAND, DROPPED, BROKEN, OTHER (ANNUAL PROD.)

CATEGORIES OF MEASUREMENT (CONTINUOUS): -∞ TO -1, -1 TO 0, 0 TO 1, 1 TO ∞
9-1
-------
[Contingency layouts:]
WEEK 1, WEEK 2, WEEK 3 × CATEGORIES OF REJECTION (DISCRETE): SAND, DROPPED, BROKEN, OTHER

SECOND CATEGORY, OR EDUCATION LEVEL: GRADE SCHOOL, HIGH SCHOOL, COLLEGE, GRADUATE
FIRST CATEGORY, OR TYPE OF SUCCESS: A, B, C
9-2
-------
[Handwritten: the X's are cell counts or frequencies; if the counts in the cells are Poisson, the statistic Σ(O - E)²/E is approximately χ²]
9-3
-------
[Handwritten: goodness-of-fit setup; observed counts 30, 40, 33, 47, ...; HA: not all expected counts equal; α = 0.05]
9-4
-------
[Handwritten: observed counts O = 30, 40, 33, 47, ... against expected counts E = 40 in every cell; E = 40 = the average count under H0]
9-5
-------
[Handwritten: χ² = Σ(O - E)²/E = 7.45; χ².95(4) = 9.49; since 7.45 < 9.49, the result is not significant - do not reject H0]

PERCENTILES OF THE χ² DISTRIBUTION (Table A-3)
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
9-6
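The goodness-of-fit computation can be verified in Python (the fifth observed count is illegible in the notes and is assumed here to be 50, which makes the counts total 200 and reproduces the recorded χ² = 7.45):

# Sketch: chi-square goodness of fit against equal expected counts.
observed = [30, 40, 33, 47, 50]          # fifth count assumed
expected = [40] * 5                      # E = 40 in every cell

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)                              # 7.45 < chi-square .95(4) = 9.49: do not reject H0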
-------
[Handwritten contingency-table example: hair color (brunette, red, ...) versus a second category, n = 360; H0: no association between the two categories; expected counts E computed from the margins]

9-7
-------
[Handwritten: the same hair-color contingency table, n = 360, with workings]
9-8
-------
PERCENTILES OF THE χ² DISTRIBUTION (T-24) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
9-9
-------
[Handwritten: rule of thumb for the χ² goodness-of-fit test - expected cell frequencies should not be too small (at most a few cells with E < 5); combine cells if necessary]
9-10
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT COULD BE IN EITHER TAIL

CASE    ART.        CONDITION
I       17-3.1.1    μ AND σ UNKNOWN
II      17-3.1.2    μ AND σ UNKNOWN, EXTERNAL ESTIMATE S FOR σ AVAILABLE
III     17-3.1.3    μ UNKNOWN, σ KNOWN
IV      17-3.1.4    μ AND σ KNOWN
17-1
-------
[Handwritten: observations ranked smallest to largest]

CRITERIA FOR REJECTION OF OUTLYING OBSERVATIONS (T-27)
[Table excerpt: statistic r11; n = 8, 9, 10; upper percentiles, e.g. .639]
17-2
-------
[Handwritten worked outlier example: ranked sample and decision]
17-3
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT COULD BE IN EITHER TAIL
[Table as on page 17-1: cases I-IV, conditions on μ and σ, external estimate S available]
17-4
-------
[Handwritten: n = 10, not significant]

PERCENTILES OF THE STUDENTIZED RANGE, q (Table A-10, T-22)
[Table excerpt: columns by sample size; e.g. q = 5.76 at one tabulated cell]
17-5
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT IS IN ONE TAIL ONLY

CASE    ART.        CONDITION
I       17-3.2.1    μ AND σ UNKNOWN
II      17-3.2.2    μ AND σ UNKNOWN, EXTERNAL ESTIMATE FOR σ AVAILABLE
III     17-3.2.3    μ UNKNOWN, σ KNOWN
IV      17-3.2.4    μ AND σ KNOWN
-------
[Handwritten data]
PERCENTAGE POINTS OF ... MEAN
[Table excerpt, illegible]
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT IS IN ONE TAIL ONLY
[Table as on page 17-6: cases I-IV]
-------
[Handwritten: α = 0.001; z computed from the sample]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
17-9
-------
[Flowchart: treatment of suspected outliers]

SUSPECT IN ONE TAIL OR BOTH TAILS -> TEST OF SIGNIFICANCE
  NOT SIGNIFICANT -> ANALYZE ALL DATA
  SIGNIFICANT -> SEARCH FOR PHYSICAL GROUNDS
    FOUND ->
      CORRECT & ANALYZE
      REJ. & ANALYZE REDUCED SAMPLE
      REJ. & GET ANOTHER OBSERVATION
      REJ. & REPLACE BY MEAN (OR ETC.)
      REJ. & USE TRUNCATED THEORY
    NOT FOUND ->
      REJECT ON UNKNOWN GROUNDS: SAME AS THE LAST 4 ABOVE
      DO NOT REJECT: ANALYZE ALL DATA

17-10
------- |