EPA-430/1-74-004
Basic Environmental Statistics
Notebook
TRAINING MANUAL
U.S. ENVIRONMENTAL PROTECTION AGENCY
WATER PROGRAM OPERATIONS
BASIC ENVIRONMENTAL STATISTICS NOTEBOOK
This course is designed to introduce the concepts
and applications of statistics to environmentally oriented
studies. It is for professional personnel responsible for
the collection, analysis and interpretation of environ-
mental data. The emphasis is on parametric tests of significance for sampling from normally distributed data. It is necessarily methods oriented, where heuristic persuasion is used to give insights into the concepts, developments, and foundations of statistical theory.
ENVIRONMENTAL PROTECTION AGENCY
Water Program Operations
TRAINING PROGRAM
May 1974
INTRODUCTION TO TESTS OF SIGNIFICANCE
In this paper we characterize the program of action necessary in a two-tailed test of significance by defining and relating the following: (1) test of significance, (2) statistic, (3) null hypothesis, (4) test assumptions, (5) law of random variation for the statistic when H0 is true, (6) alternative hypothesis, (7) distribution of the statistic when HA is true, (8) four possible decisions in a test of significance, (9) type-I error, (10) critical values, or value, (11) type-II error, (12) power of the test, (13) rejection region, (14) acceptance region, (15) decision rules, (16) statistical decision, (17) diagram of a two-sided test of significance, (18) robustness, (19) sample size, (20) test of significance summarized, and (21) example of a two-tailed t-test.
AN INTRODUCTION TO TESTS OF SIGNIFICANCE
I INTRODUCTION
A Test of Significance
Statistics has been classified by many
into two broad subject areas: descriptive
statistics and statistical inference (see
Figure 1).
Descriptive statistics makes extensive
use of pictures, tables, graphs, and
other visual arrangements. These methods
are found popularized in magazines and
newspapers. The emphasis is centered on
summarizing the available information
into an easily assimilated form and not
on acquiring new knowledge about the world
we live in.
Inferential statistics on the other hand is further subdivided into two classes of problems, namely, estimation and tests of significance (other names for the latter are hypothesis testing, significance tests, and tests of hypothesis). New knowledge is acquired via induction; that is, we progress from the particular to the general or from sample to parent population.
Estimation methods are used to approximate
numerical values of unknown population
parameters (like the mean) from incomplete
or sample data. A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses.
These two methods are closely related, and indeed one can construct a procedure to cover both aspects. If they are separated, as done here, then the logical development for a test of significance is greatly simplified, resulting in a comprehensive understanding in a shorter time period.
The test of significance in turn is only one small link of the total experimental chain that can be briefly described as (1) problem definition, (2) data collection scheme, (3) data collection, (4) data analysis, and (5) the written report.
In the most carefully controlled investigation it is virtually impossible for two identical experiments to yield absolutely identical responses, under the assumption that the measuring instrument is sensitive. If, however, it is not, then identical responses can always be obtained.

FIGURE 1
CLASSIFICATION OF STATISTICS
(Statistics divides into descriptive statistics and statistical inference; inference divides into estimation and tests of significance; tests of significance divide into parametric and non-parametric.)
This inability to obtain identical responses is due to random variation in the surrounding environmental conditions, the imprecision of the measuring instruments, limitation of the observer's technique and experience, and the inherent variability encountered from sample to sample. All these factors plus others induce fluctuations in the experimental response. Even when the tested hypothesis is true the observed result will not match it.
The purpose then of a statistical test of
hypothesis is to sort out and identify the
differences attributable to expected random
observational fluctuations as opposed to the
differences attributable to deviations from
the hypothesis.
The basic question to be answered is "How
large must an observed difference be to
justify rejecting the hypothesis'" Or stated
otherwise, "How divergent must the observed
difference be in order to be called a rare
difference as opposed to expected random
variation' "
To answer this question the procedure or
the steps taken in a two-tailed test of
hypothesis are always the same'. The
explanation of the variations on the basic
two-tailed test will be taken up in the sequel.
The criterion or measure used for discrim-
inating between differences or distinguishing
between hypotheses is a statistic.
Statistic
A statistic (note no s on the end) can be
simply defined as a function of the sample.
That is, it is some value calculated from the
observational or sample data. This is
accomplished by using a formula or some prescribed recipe for manipulating the data.
The method varies from test to test and is
not unique.
Two examples are

    z = (x̄ - 10)√n / σ          (1)

    t = (x̄ - 10)√n / s          (2)

where x̄ = sample mean
      n = sample size
      σ = standard deviation of the parent population
      s = sample estimate of the standard deviation

In (1) the statistic value calculated from the sample is used to test if the mean of a sampled normal population is 10 when its standard deviation is known and equals 5. In (2) the test is the same except that the sampled population's standard deviation is unknown and is estimated from the data as s.

For example, if the four sampled values are 10, 11, 12, and 11, then z = (11 - 10)√4 / 5 = 0.4.
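The arithmetic can be sketched in a few lines of code; Python and all the names below are illustrative additions to this outline, not part of the original procedure:

    import math

    # Equation (1): test H0: mu = 10 with known sigma = 5.
    sample = [10, 11, 12, 11]
    n = len(sample)                        # n = 4
    xbar = sum(sample) / n                 # x-bar = 11
    z = (xbar - 10) * math.sqrt(n) / 5     # known sigma = 5
    print(z)                               # 0.4, agreeing with the text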
The statistic is the measure or criterion used in a test of significance to discriminate between the two hypotheses, namely, the null hypothesis and the alternative hypothesis.
Null Hypothesis
The null hypothesis is defined as the hypo-
thesis of no difference between a hypothetical
and the sample population. It will be shown
later that it should be formulated for the
express purpose of being rejected or nulli-
fied (see Alternative Hypothesis).
If the hypothetical population is labeled as Ph and the sampled population as Ps, then "no difference" can be written as Ps - Ph = 0, or null. Many statisticians notationally write the null hypothesis as

    H0: Ps = Ph    or    H0: Ps - Ph = 0

Unfortunately this notation is not universally uniform, but fortunately most deviations from it center on using a different subscript for H.

As an example, the null hypotheses for the statistics given in (1) and (2) are both written the same way:

    H0: μ = 10
For (1) the hypothetical population is
normally distributed with mean 10 and
standard deviation = 5 while the sampled
population is normally distributed with
standard deviation = 5 and an unknown
mean, and we wish to test whether the
unknown mean = 10.
For (2) the hypothetical population is nor-
mally distributed with mean = 10 and an
unknown standard deviation while the
sampled population has unknown mean and
standard deviation, and we wish to test
whether the unknown mean = 10.
In both cases we test whether the difference
between the sampled and hypothetical pop-
ulation is null or whether the sampled
population has a mean of 10.
Restrictions have been placed on the type of population sampled as well as on parameter information. These restrictions are popularly called test assumptions.
Test Assumptions
Each test of significance has associated
with it explicit or implicit, or both, test
assumptions that must be satisfied if the
results of the test are to be valid. Failure to meet the assumptions affects the probability statements that will be introduced later under the topic Type-I Error (see also Robustness).
For example the test assumptions or the
mathematical model for (1) are that the
sampled population is normally distributed
with standard deviation = 5. The theory
also requires that the samples be random
and independent.
Test assumptions are generally classified as either parametric or non-parametric. The former have more stringent requirements, and justification for their use is generally more difficult. For example, that the sampled population is normally distributed with a known standard deviation could be a parametric test assumption.
A non-parametric test requires no hypothesis about specific parameter values. These tests are often called distribution-free, which really means that the test is independent of the form of the underlying distribution. These two terms are not interchangeable but are so often interchanged that usage has made them indistinguishable. A non-parametric test might merely require that the sampled population be continuous in order to be validly applied. The requirements for a non-parametric test are generally fewer, and it can be applied to a larger set of source data, or used when the parametric assumptions cannot be proved or are unknown.
The advantages of parametric tests exceed those of non-parametric tests when they are applicable; this will be discussed further under the topic Power of the Test.
Let us assume that the test assumptions
(or assumption) are (or is) satisfied. If
the null hypothesis is true (sampled and
hypothetical populations are equal) then the
sampled values are determined by the sampled
population. These sampled values fix the
value for the statistic. Different random samples result in different random values for the statistic; that is, we say the statistic is
a random variable. The law of random varia-
tion that the statistic follows when the null
hypothesis is true must be known before
discrimination is possible.
Law of Random Variation for the Statistic When H0 is True
The law of random variation for the statistic
can be determined either analytically or
empirically. The analytic determination is
made by the originator of the test, who is
a researcher or a mathematical statistician.
For specific examples of an analytic develop-
ment the reader is referred to the journals
or the many commercial texts available on
mathematical statistics.
Some examples of the statements of several
of these laws as determined by the researcher
might be (1) the statistic is normally distri-
buted with zero mean and unit variance, or
(2) the statistic is distributed as chi-square
with degrees of freedom equal to sample
size minus one, or (3) the statistic is
t-distributed with degrees of freedom equal
to sample size minus two.
The empirical determination of the law of
random variation can be done by anyone who
really understands the test that is being
applied. If the experimenter cannot imagine
some sampling sequence using random
numbers to determine the law of variation
then there is a deficiency somewhere.
As an example the empirical determina-
tion for the z statistic defined by (1) could
be as follows. Take a random sample of
nine, n = 9 (any value for n can be used),
from a normal population whose mean is
10 and whose standard deviation is 5. From
these nine values calculate x̄ and finally the value of z from equation (1). As a result of this first sequence we have one value for z; label it z1.
This sequential procedure can be repeated as often as desired. Suppose it is repeated 1,000 times, resulting in 1,000 different values for z, namely z1, z2, ..., z1000. We can form a histogram using these 1,000 values and then draw a smooth, continuous curve through the center of the rectangles as shown in Figure 2.
The smooth curve labeled H0 in Figure 2 is an approximation to the law of random variation or the distribution of the z statistic when H0 is true. If instead of using 1,000 values for z, we used all possible values then the distribution would have been exact, that is, in perfect agreement with the analytic solution. With 1,000 values the differences between the two solutions are small and furthermore
two solutions are small and furthermore
can be made arbitrarily small by calcul-
ating enough sequences. The approximate
solution converges or approaches the
analytic solution as the number of sequences
increases.
If the above experiment were performed
then it would be determined that z would
be approximately normally distributed
with zero mean and unit variance, regard-
less of the value of n used.
The t statistic in (2) when sampling from
normal is distributed as a t-distribution
with degrees of freedom equal to (n - 1)
provided that the degrees of freedom used in estimating s are (n - 1). Usually this is true since x̄ and s are estimated from the same data.
That these distributions do approximate
the ones claimed can be verified objective-
ly via a test of significance or subjectively
by using the eyeball test. With 1,000
samples the agreement should pass the
eyeball test easily by the comparing of
percentiles or some other cumulative
values.
The reader is urged to verify (1) when
n = 4 or some small number. If a computer
is used then 1,000 is a small number of
sequences. If a computer is not used
then decrease the number of sequences,
realizing that the approximation will not
conform as well.
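One way to carry out the suggested verification is a short simulation; this sketch assumes Python with numpy, and the choices of n, the number of sequences, and all names are ours:

    import numpy as np

    rng = np.random.default_rng()
    n, sequences = 4, 1000
    # Each row is one sampling sequence from the hypothetical
    # population: normal with mean 10 and standard deviation 5.
    samples = rng.normal(loc=10, scale=5, size=(sequences, n))
    z = (samples.mean(axis=1) - 10) * np.sqrt(n) / 5
    # Under H0, z should be approximately standard normal.
    print(z.mean(), z.var())                   # near 0 and near 1
    counts, edges = np.histogram(z, bins=20)   # the histogram of Figure 2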
There are four distributions that one can
talk about up to this point: (1) the hypothe-
tical population or distribution, (2) the
sampled population, (3) the distribution
of the random sample of size n taken from
the sampled distribution, and (4) the dis-
tribution or law of random variation for the
statistic calculated from the sample (see
Figure 3A).
FIGURE 2
DISTRIBUTION OF THE STATISTIC WHEN THE NULL HYPOTHESIS IS TRUE
(A histogram of the z values with a smooth bell-shaped curve, labeled H0, drawn through the centers of the rectangles over the statistic axis.)
FIGURE 3
FOUR DISTRIBUTIONS IN HYPOTHESIS TESTING
(A. H0 is true: the hypothetical and sampled populations coincide at mean 10 on the variable axis; the random sample and the distribution of the sample are drawn with them, and the distribution of the statistic, with the unique value of the statistic obtained from the sample, is drawn on the statistic axis. B. HA is true: the sampled population is centered at 20 rather than 10; the distribution of the sample is like its parent if representative, and the distribution of the statistic is shifted accordingly, again with the unique value of the statistic marked.)
In an actual test there is only one sample
and hence only one value for the statistic.
In Figure 3A it is labeled z0. The reader
should bear in mind that it is only one
value out of a possible range of expected
random variation. Some intervals,
especially those near the mean, have
higher probabilities than others, owing
to the nature of the distribution. It will
be shown later that these probabilities
will be used to determine unexpected
random variation.
If H0 is not true then in the real world
some alternative hypothesis is true, and
this circumstance gives rise to a different
law of random variation, owing to the
truth of the alternative hypothesis.
Alternative Hypothesis
If the null hypothesis is not true then the
negation of this is the alternative hypothesis.
For our example earlier this is written by most as

    HA: μ ≠ 10

and read as "the mean of the sampled population is not equal to 10." Again observe the caution that this notation varies from author to author. The form of the test dictated by such an alternative hypothesis is a two-sided or two-tailed test, and the reason is shown in Figure 4.
There it will be noted that the alternative hypothesis occupies both sides of the parameter space, that is, to the right and left of μ = 10. A one-sided or one-tailed test will be discussed in the sequel, and, as the reader has guessed, the alternative occupies one side of the parameter space.
The alternative hypothesis must be the operational statement of the experimenter's research hypothesis. If this is not so, or the null hypothesis is the research hypothesis, then the researcher cannot really prove his research hypothesis. In order to prove the research or alternative hypothesis the null hypothesis must be rejected or nullified. A heuristic argument justifying this very important position (alternative and research hypothesis should be the same) will be taken up under Decision Rules.
In the design of experiments, data collection,
and the subsequent test of hypothesis, the
aim of the researcher should be to reject
his favorite theory.
If the result of the tests of significance from data so collected does reject the pet theory then savings in money, man-hours, and material have been realized.
On the other hand if the test results do not reject his position the implication is a large gain in the confidence of the validity of the theory. One is in the position of trying to prove something is wrong but being unable to do so. Each failure adds extra strength to the belief that it cannot be wrong.

FIGURE 4
PARAMETRIC SPACE FOR A TWO-SIDED TEST
(The μ axis: H0 is the single point μ = 10, while HA: μ ≠ 10 occupies both sides of it.)
Too often researchers design data collection
to support their theory and innocently build
in unknown biases. By taking the opposite
position and vigorously attacking it the
danger of bias is decreased. An aggressive
philosophy will yield more rapid advances
in research.
Distribution of the Statistic When HA is
True
Figure 3A shows the distribution of the statistic z in (1) when H0 is true or the sampled population has μ = 10. How is that distribution affected if the alternative hypothesis, μ ≠ 10, is true?

To be specific, suppose the sampled population had μ = 20 (see Figure 3B). By examining the numerator of (1) we note that z would now tend to have mostly positive values since most sample means would be greater than 10. This distribution for z would be shifted to the right of the μ = 10 curve.
A sequential sampling scheme would give its approximate shape. Obviously it would be some distribution like that labeled μ = 20 in Figure 3B. If μ = 0 then by the same argument the distribution for the statistic would be shifted to the left of the μ = 10 curve, as shown in Figure 5.
By applying the sequential sampling process it is easily seen that, for every possible value of μ for the sampled population, there would exist some distribution for the statistic. It could be either of the two shown in Figure 5 or some other not drawn there.
Regardless of the value of μ in the sampled population the reader can imagine one but only one distribution for HA if true. If H0 is true then there is only one distribution for the statistic (see Figure 3A).
The uncertainty faced is, what is the position of the true sampled population? Is it coincident as in 3A or shifted as in 3B?
This problem is solved by partitioning the
statistic axis so as to maximize our
possible correct decisions and minimize
our possible incorrect decisions when we
discriminate.
Four Possible Decisions in a Test of
Significance
The correct decisions are (1) do not reject H0 when it is true and (2) reject H0 when it is false. Our incorrect decisions are (3) reject H0 when it is true, and (4) do not reject H0 when it is false. These four possible decisions are shown in Figure 6.

In order to distinguish between the two types of incorrect decisions or errors, they are given the special names type-I and type-II error.
FIGURE 5
SOME POSSIBLE DISTRIBUTIONS FOR THE STATISTIC
(The distributions of the statistic for sampled populations with μ = 0, μ = 10, and μ = 20 drawn along the statistic axis; the critical values A and B, and the wider and narrower pairs C, D and E, F discussed in the text, partition the axis.)
FIGURE 6
FOUR POSSIBLE DECISIONS IN A TEST OF SIGNIFICANCE

                                 STATISTICAL DECISION
    REAL WORLD          DO NOT REJECT H0             REJECT H0
    H0 IS TRUE          Correct decision             Incorrect decision
                                                     (type-I error, α-error)
    H0 IS NOT TRUE      Incorrect decision           Correct decision
                        (type-II error, β-error)
Type-I Error

The type-I error (also called α-error or error of the first kind) is the rejection of the null hypothesis when it is actually true. The seriousness or quantification of this error is measured by using probability, and its value is obtained in the following way. Suppose in Figure 5 that we adopt the following discrimination rule: if the statistic is greater than or equal to A (some constant) reject H0; otherwise do not reject H0.
Suppose the sampled population is μ = 10. Since the distribution of the statistic for the hypothetical population is always known, its area under the curve to the right of a known A can be determined. This area is the probability of getting a value for the statistic which, according to our rule, means reject H0 falsely, or a type-I error has been made.
A familiar example of making a deliberate α-error is the fairy tale wherein the little girl jokingly cries "wolf, wolf" falsely (H0: no wolf; HA: wolf).
The magnitude of the α-error is called α and is also known as the level of significance. In a test of significance the reverse procedure is actually followed; that is, α is pre-specified before any data are collected, and from a set of appropriate tables the corresponding value for A is determined.
It should be clear from Figure 3 that the law of random variation is determined by the test assumptions, and failure to meet them results in a different law. Using the incorrect law implies an incorrect A. Thus, conforming to the test assumptions is important for a correct α and β. The only loophole possible is discussed under robustness.
The tables for various laws of random
variation are found in the back of almost
every book in statistics and are used to
specify A, which is called the critical
value or significant limit.
Critical Values or Value
In the two-sided tests used here two
critical values are used in order to maxi-
mize both correct decisions shown in
Figure 6. A heuristic argument leading
to this conclusion is as follows.
Assume that H0 is true; therefore the only distribution in Figure 5 is the μ = 10 curve. Let the statistic be the normally distributed z of (1). If α = 0.05 then from a set of normal-curve tables the single critical value for the statistic is A = 1.64. The probability of z less than or equal to 1.64 is 0.95, so the correct decision (do not reject H0) will be made, on the average, 95 percent of the time. The probability of getting z greater than 1.64 is 0.05. We shall make the incorrect decision or an α-error (reject H0 when true), on the average, 5 percent of the time. The value for A is determined by α, and the distribution of the statistic is determined by the known hypothetical distribution whether H0 is true or not.
Is A a good choice when H0 is not true? Let us assume for definiteness that μ = 0 in the sampled population (see Figure 5). The probability of a correct decision now (reject H0) is almost zero. The value of the statistic would be some point on the μ = 0 curve as determined by the particular sampled data. The probability that it is greater than A is the area under the μ = 0 curve to the right of A, or almost zero. The correct decision of rejecting H0 is almost never made; A is a poor choice under the assumption μ = 0.
On the other hand A is a good choice if the sampled population is μ = 20. In that case the probability of rejecting H0 is high and is the area under the μ = 20 curve to the right of A. If one alternative is true we have high probability of making the correct decision (reject H0) while if another alternative is true we have low probability.
We try to correct this situation by the following procedure. Keep α = 0.05 and change the critical value from A to B = -1.64. The decision rule is changed to reject H0 if the statistic is less than B; otherwise do not reject H0. Applying the same logic as in the previous two paragraphs shows that if μ = 20 is true then the correct decision (reject H0) has almost zero probability. If μ = 0 is true then the correct decision has high probability.
Thus to protect against the alternative's being on either side of μ = 10, the area should be put equally in both tails, giving rise to two critical values. For example, if α = 0.05 then A = 1.96 and B = -1.96 for the normal curve. Do not reject H0 will be the decision if the statistic lies between these values; otherwise it will be rejected.
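The table lookup can be reproduced with a quantile function; a sketch assuming Python with scipy (the function decide below is our illustration, not a prescribed procedure):

    from scipy.stats import norm

    alpha = 0.05
    A = norm.ppf(1 - alpha / 2)     #  1.96, right critical value
    B = norm.ppf(alpha / 2)         # -1.96, left critical value

    def decide(z):
        # Two-sided decision rule of the text.
        return "do not reject H0" if B <= z <= A else "reject H0"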
Values of the statistic from -1.96 to 1.96 are expected random variation. Those outside this range are unexpected and represent genuine differences when the power of the test is considered, a topic to be defined soon.
If the sampled population is μ = 20 then the probability of a correct decision (reject H0) is the area under its curve to the right of A and to the left of B, a nice high number. If the sampled population is μ = 0 then the probability of a correct decision is the area under its curve to the right of A and to the left of B, also a high value.
If the sampled population is μ = 10 then the probability of a correct decision (do not reject H0) is the area under its curve between A and B, which is 1 - α, a preselected number.
Both correct decisions have a high probability regardless of the value of μ in the sampled population.
The α is preselected by the researcher, not the consulting statistician; however, the implications of a large or small α should be made clear.
In initial research an α-error of 10 percent or even 20 percent is reasonable and not unusual. In that case the researcher is guarding against rejecting the research hypothesis and is less concerned about an α-error. A large α means that the distance between A and B in Figure 5 is decreased to, say, E and F. The decreased distance means that the continue-research or reject-H0 probabilities are increased regardless of the sampled population.
As research continues and other tests are made, the α level is gradually reduced. If an α-error (to incorrectly continue research) was made initially, a subsequent test will reveal the mistake, and no harm will have been done except extra research.
On the other hand if the α-error is made small initially, the distance between A and B is increased to, say, C and D. The increased distance means that the do-not-reject-H0 (abandon research) probabilities are increased regardless of the sampled population.
If a type-II error (see Figure 6; abandon research when it should be continued) is made then the mistake would never be uncovered unless another researcher works in the same area.
A type-I error can also be thought of as
the researcher's risk of following a false
clue in research. If a type-I error is
made then the statistical decision is that
the research hypothesis is true when it
really is not. The result is further re-
search in a false area. Eventually enough
evidence or subsequent tests of hypothesis
will reverse the decision and reject the
research hypothesis. Because of the initial incorrect decision, we have followed the false clue (continue research) and done subsequent research until a later test corrected the initial incorrect decision.
The type-I error can also be viewed as taking a risk. The risk is the probability of saying the research hypothesis is true when it really is not. This risk is α, and 1 - α is the probability or the confidence one would like for accepting a true H0. Both values are related to each other by odds. The odds of being right when H0 is true are 1 - α to α. Therefore, if one chooses α = 0.05 the odds of a right decision when H0 is true are 0.95 to 0.05, or 19 to 1. Changing α to 0.01 changes the odds to 99 to 1, and α = 0.1 implies odds of 9 to 1 of a correct decision if H0 is true.
One cannot set the odds too high since
there is a penalty for each increase, namely,
raising the probability of a type-II error.
These two errors are not independent, and
the interrelationship between them can be
explained in the following way.
Type-II Error

The type-II error (also called β-error or error of the second kind) is the probability of not rejecting H0 when it is false, or the probability of rejecting HA when it is true. β is the magnitude of the β-error.

Graphically this error can be shown in Figure 5 by considering the following hypothetical problem. Test H0 with the decision do not reject H0 if the statistic lies between A and B; otherwise reject.
Suppose HA is true and μ = 20. The probability that a statistic will be between A and B is the area under its curve between these two points. Therefore β, the probability of not rejecting H0 falsely (it should be rejected since HA is true), is the area under the appropriate true alternative hypothesis over the region of do not reject H0, or from A to B.
If the critical values are changed from A, B to C, D (decrease α) then β would be increased. Changing the critical values from A, B to E, F (increase α) would decrease β. Thus α and β are inversely related.
Unfortunately the value for the alternative is never known and β cannot be calculated. The size of the type-II error depends on the disparity or distance between the null and alternative hypotheses, sample size, α, and the particular test of hypothesis used. Further remarks will be made about β estimation in the sequel.
The α-error is generally considered by most to be more serious than a β-error. Making false claims about our research, an α-error, is a more serious error than abandoning a possibly fruitful research effort, a β-error.
Furthermore, journals frequently publish articles in which H0 is rejected, and one seldom sees articles that do not reject H0. This means α-errors are the only kind that can appear in print. One wonders how much duplication there is in research because of that fact. Failure to reject H0 is also information that could guide other researchers in the selection of their hypotheses.
A good example of minimizing α irrespective of the effect on the size of β exists in our courts of law (H0: not guilty; HA: guilty). We would sooner accept a β-error (let a guilty man go free) than commit an α-error (send an innocent man to jail). Of course, we can guarantee α = 0 always by setting everyone free.
The β-error can be thought of as the risk of failing to follow a true clue. Our decision is do not reject H0 (or reject the research hypothesis) when H0 should be rejected (or HA accepted). Owing to this incorrect decision, we abandon further research that should be pursued. The true clue is the truth of HA, but because of a β-error we reject it and abandon follow-up research.
Power of the Test

One minus beta (1 - β) is defined as the power of the test. It is the probability of accepting HA when true. Graphically it is the area under the true alternative over the rejection region.

It can be thought of as our confidence level of accepting the research hypothesis when true. Thus tests with high power are desirable. The odds of accepting HA if true are (1 - β) to β.
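For a specific assumed alternative, β and the power can be computed directly. A sketch assuming Python with scipy, using the z statistic of (1) with n = 9; the alternative value μ = 13 is an assumption made only for illustration:

    from math import sqrt
    from scipy.stats import norm

    n, sigma, alpha = 9, 5, 0.05
    A = norm.ppf(1 - alpha / 2)             # 1.96
    B = -A
    # Under HA: mu = 13, the z of (1) is normal with this mean:
    shift = (13 - 10) * sqrt(n) / sigma     # 1.8
    beta = norm.cdf(A - shift) - norm.cdf(B - shift)
    power = 1 - beta                        # about 0.44 here
    print(beta, power)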
If you have a set of data for which there is
a parametric and a non-parametric test
available then the parametric is more
powerful and preferred. Suppose you were
searching for someone in a large crowd, and
the only clue you had was that the individual
was definitely in the crowd. This search
would not have much hope of being fruit-
ful (i.e., powerful). On the other hand if
the available clues were that the person
sought is male, 6 feet tall, has black hair,
is wearing a brown suit, smokes cigars,
limps, has an amputated right arm, and so
on, then the search would be effective or
powerful. More clues imply a more
powerful search, and by analogy the more
assumptions that are made and realized, the
more powerful the test.
Rejection Region

The preselection of α in conjunction with the known distribution of the statistic for the hypothetical population determines the critical values. These critical values in turn partition the total statistic axis into two parts, the rejection region (also called significance region or critical region) and the acceptance region. The rejection region is so selected that if HA is true then the probability of accepting it is high (1 - β), while at the same time if H0 is true the probability of rejecting it is α, the preselected value.
These values are classified as real differences, not random variation, and represent a compromise when type-I and type-II errors are taken into consideration.
In Figure 5 if C and D are the critical
values then the rejection region is to the
right of C and to the left of D. The re-
mainder of the statistic axis is called the
acceptance region.
Acceptance Region
The acceptance region is selected so that if HA is true the probability of rejecting it is small (β), while at the same time if H0 is true the probability of not rejecting it is the preselected 1 - α.
The action to be taken when a statistic is in
either region is called the decision rule.
Decision Rules

Our first decision or discrimination rule (see Figure 6) is do not reject H0 if the statistic falls in the acceptance region (some aliases: cannot reject H0, a nonsignificant result, outside the rejection region, outside the critical region, accept H0, research hypothesis not proved).
A nonsignificant result should be interpreted as meaning that such a statistic value from a sample size this large is obtained so frequently when H0 is true that the data convince no one that anything more than an H0 random process produced them. If H0 is true then the acceptable odds, preselected by the researcher at 1 - α to α, favor this decision. It also means that the research hypothesis was not proved statistically according to the standard of evidence based on probability.
The contracted statement of the decision
rule, to accept H0, is often used instead
of the longer statement. If one took this
statement literally (unfortunately some
do) it would be incorrect, since the null
hypothesis is never proved when a non-
significant result is obtained.
A heuristic argument taken from Draper and Smith(1) is as follows. John Doe, an office worker, is at lunch when the idea is advanced that he is not rich (H0: not rich; HA: rich). Two proofs are cited: (1) he always buys his clothes at second-hand stores, and (2) he always brings his lunch or eats at the cheapest places, as today. The statistical parallel is that two nonsignificant tests of significance were incorrectly interpreted as proving H0 true instead of do not reject H0. It was learned later that Mr. Doe died and left $500,000 to charity. All the proofs of the null hypothesis are invalidated.
One can never prove the null hypothesis by getting a nonsignificant result. The implication is clear: given two tests, (1) H0 = research hypothesis and (2) HA = research hypothesis, one must choose the second test. If a significant result is obtained then the research hypothesis has been proved. See the sequel for further remarks.
Our second decision rule is reject H0 if the statistic falls in the rejection region. Rejection means that if H0 is true then a statistic value of this magnitude from a sample size this large is so rarely obtained (odds favoring it are α to 1 - α) by a random process alone that this peculiar statistic value points to something over and above the H0 random process.
When the statistic falls within the rejection region, two logical possibilities exist. The first is that H0 is true and a rare event, or an α-error, was produced by the random process. The only other possibility is that HA is true and caused this large statistic value. Since the acceptable α-error was preselected, the decision must be that the sampling was from HA.
Some authors state the result of a significant statistic as HA is true unless a rare event has happened. Constant repetition of this long statement becomes tedious, and it is often contracted to HA is true, accept the research hypothesis, reject H0, there is a significant result, and so forth.
Statistical Decision

The statistical decisions in a test are (1) do not reject H0 if the statistic is in the acceptance region, (2) reject H0 if the statistic is in the rejection region.
In a test where the statistic has a continuous distribution the critical value can be placed in either region since it does not affect the value of α.
In tests where the statistic is discrete
the tables clearly state whether the critical
value is the boundary of the rejection or of
the acceptance region. Caution is advised.
Note also that for discrete distributions probabilities vary by discrete jumps. Hence it is unusual to have α equal exactly to 0.05 or any other level. In those cases the α level is specified as 0.05 or less.
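The discrete jumps can be seen with a two-sided sign test; a sketch assuming Python with scipy (the sign test and the choice n = 10 are our illustrative assumptions):

    from scipy.stats import binom

    n = 10
    # Two-sided sign test of H0: p = 0.5; reject for k or fewer
    # successes or n - k or more.
    for k in (0, 1, 2):
        print(k, 2 * binom.cdf(k, n, 0.5))
    # Attainable alpha jumps from about 0.002 to 0.021 to 0.109;
    # exactly 0.05 cannot be attained.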
Diagram of a Two-Sided Test of Significance

The preceding terms have been collected and their interrelationship shown in Figure 7. Some interpretations are as follows.
If H0 is true, its distribution is the only one for the statistic in Figure 7, and three kinds of statistic values can be obtained. First would be a value in the left rejection region, meaning reject H0 incorrectly, or an α-error. Next, a value in the right rejection region, with the same result. The third value we can obtain lies between the two critical values, meaning do not reject H0 correctly, or no error.
On the other hand if the sampled population has a larger mean than Ph, its distribution is on the right. When the first of its three values is in the right rejection region, H0 is rejected, or HA is accepted, correctly or with no error. The next value could be in the left rejection region (not shown, to prevent overcrowding) with the same result. The third value could be in the acceptance region with the incorrect decision do not reject H0, or a β-error.

The discussion if the sampled population has a smaller mean follows analogously and is left to the reader. Again only two of the three values are shown.

FIGURE 7
TWO-SIDED TEST OF SIGNIFICANCE
(The real world possibilities HA true due to a smaller mean, H0 true, and HA true due to a larger mean are set against the statistical decisions reject H0 / do not reject H0 / reject H0. The distributions of the statistic for H0 and for the larger-mean alternative are drawn over the statistic axis; each combination of truth and decision is labeled no error, α/2 or type-I error, or β or type-II error, with the acceptance region between the two critical values and a rejection region in each tail.)
Robustness
A robust test is one that is insensitive to
departures from the assumptions. A test
that is sensitive to departures from the
test assumptions lacks robustness. If a
test is robust and the assumptions are not
violated badly then the odds still remain as
specified. This means that, for example,
if one uses a test which assumes normality
and the source data are quasi-normal, then
a robust test can be confidently applied.
It is impossible to quantitate the departures
from test assumptions that can be made in
general and still have valid odds. Each
test is different and much research has
been done. The reader should consult with
a statistician on these matters.
If a test requires several assumptions then
one can talk about robustness with respect
to each assumption or with respect to any
combination of assumptions. See the sequel
for further remarks.
Sample Size
A realistic question before data collection
is how many sample points are needed. In
many cases this question can be answered
provided the researcher can supply infor-
mation about variability and the size of
differences that are to be discriminated.
Consultation is again suggested since much
work has been and is being done.
Test of Significance Summarized
A test of significance can be defined as a
method of analyzing data so as to discrim-
inate between two hypotheses. The first
is the null hypothesis; the second is the
alternative hypothesis, which should be the
operational statement of the experimenter's
research hypothesis. Next, the best test
for the alternative hypothesis is selected,
that is, the one with the greatest power, and
is robust. This test is based upon assump-
tions, which may be explicitly or implicitly
stated, or both. The probability statements
used in discrimination are based upon the
assumptions. After the test is selected,
the α-error is prespecified, that is, the odds of being right if the null hypothesis is true. This α-error in conjunction with the known distribution of the statistic for the null hypothesis determines critical values. The critical values partition the entire statistic axis into two parts, the acceptance and rejection regions.
The sample size is fixed and the data are
now collected. From the collected data
the statistic is evaluated and compared
with the critical values to determine
whether it is in the acceptance or rejection
region. After this comparison the appro-
priate decision is made.
If the statistic is in the acceptance region, the appropriate decision is do not reject H0, or cannot reject H0, or reject HA, and so on. If the statistic is in the rejection region, the decision is reject H0, or accept HA, or there is a significant difference, and so forth. With each decision there is a possible error.
When the statistic falls in the acceptance
region and if in the real world HA is true
and we reject HA, this is a type-II error;
however, if HQ is true there is no error.
There is no chance for a type-I error when
the statistic is in the acceptance region.
When the statistic falls in the rejection
region and if in the real world HQ is true
and we reject H0, this is a type-I error;
however if HA is true there is no error.
There is no chance for a type-II error
when the statistic is in the rejection region.
Briefly, when a test of significance is performed the following information should be available: (1) H0, (2) HA, (3) α, (4) test assumptions, (5) statistic, (6) distribution for the statistic, (7) critical values, (8) acceptance region, (9) rejection region, and (10) decision after the statistic is evaluated.
Additional desirable information is (11) a comparison of the power of the test versus that of others available for the same H0 and HA, (12) sample size needed to discriminate differences of specified magnitudes, (13) information about robustness, and (14) type-II error information.
Example of a Two-Tailed t-Test
We illustrate a two-tailed t-test by using
a four-step process for convenience's sake.
Assume that we wish to test at the 5 percent significance level whether a new product average μ is different from the previous standard = 7.35. The observational data are random, independent, normally distributed Xi, i = 1, 2, ..., 25.

Step 1. H0: μ = 7.35
        HA: μ ≠ 7.35
        α = 0.05
Step 2. The statistic = √n (x̄ - 7.35)/s, which is distributed as a t-distribution with 24 degrees of freedom. From a table of t values, the two critical values are ±2.064. The acceptance region lies between these two values, and the rejection region is all other statistic values.
Step 3. From the data, x̄ = 7.1 and s = 0.504; therefore, the statistic = √25 (7.1 - 7.35)/0.504 = -2.480, which is in the left rejection region.

Step 4. Reject H0 since the statistic is in the rejection region, or the new average is different from the old one.
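The four steps can be checked numerically; a sketch assuming Python with scipy (the summary values x̄ = 7.1 and s = 0.504 are taken from the text):

    from math import sqrt
    from scipy.stats import t

    n, xbar, s, alpha = 25, 7.1, 0.504, 0.05
    crit = t.ppf(1 - alpha / 2, df=n - 1)      # 2.064
    statistic = sqrt(n) * (xbar - 7.35) / s    # -2.480
    print(abs(statistic) > crit)               # True: reject H0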
Many more examples can be found in
almost every textbook on statistics. The
four-step process used above is not
generally used but is adopted here for
instructional convenience.
In the concluding article we discuss one-tailed tests and variations on a two-tailed test, and conclude with several remarks that should further clarify other ideas about a test of significance.
REFERENCE

1. Draper, N.R. and Smith, H. Applied Regression Analysis. John Wiley and Sons, Inc., New York. 1967.
This outline was prepared by Joseph F. Santner, Mathematical Statistician, Manpower Development Staff, Office of Water Programs, National Training Center, EPA, Cincinnati, OH 45268.
VARIATIONS ON A TEST OF SIGNIFICANCE
This is the second of a two-part paper which describes the program of
action in a test of significance when extended to one-sided tests. In addition
simple and composite hypotheses are illustrated by several examples.
Finally, the relationship between a significance test and confidence limits
is exhibited when variations on a two-sided t-test are discussed.
VARIATIONS ON A TEST OF SIGNIFICANCE
In part one of this paper we characterized the program of action necessary in a two-tailed test of significance by defining and relating the following: (1) test of significance, (2) statistic, (3) null hypothesis, (4) test assumptions, (5) law of random variation for the statistic when H0 is true, (6) alternative hypothesis, (7) distribution of the statistic when HA is true, (8) four possible decisions in a test of significance, (9) type-I error, (10) critical values or value, (11) type-II error, (12) power of a test, (13) rejection region, (14) acceptance region, (15) decision rules, (16) statistical decision, (17) diagram of a two-sided test of significance, (18) robustness, (19) sample size, (20) test of significance summarized, and (21) example of a two-tailed t-test. The purpose of this paper is to extend the procedure to one-sided tests.
I TWO-SIDED VERSUS A ONE-SIDED TEST
The decision to make either a two-sided test or a one-sided test (also called one-tailed) is not left to the discretion of the researcher. This determination is made in advance of any data collection and is based upon either theoretical grounds or previous experience. If at least one of these causes is present then a one-sided test can be made and is actually preferred, since it is generally more powerful than the corresponding two-sided test. If, however, there exists no valid justification for a one-sided test, a two-sided test is mandatory. This procedure, as well as the preselection of α before data collection, guarantees that statistical decisions will not be biased.
II STATEMENT OF THE HYPOTHESES
The null and alternative hypotheses for a
one-sided test can be written in two different
ways, neither of which is universally accept-
able to statisticians and neither of which is
both logical and satisfying to the researcher.
Authors recognize the problem; however, their varied solutions cause much difficulty for readers.
We illustrate with a specific example. Suppose
one has theoretical grounds to test whether
a given distribution has a new mean greater
than 10 when the previous mean was 10.
These hypotheses can be written either as

    H0: μ = 10    HA: μ > 10        (1)

or

    H0: μ ≤ 10    HA: μ > 10        (2)

where ≤ is read less than or equal.
The parameter space for this one-sided test
is shown in Figure 1. In both (1) and (2) the
alternative hypothesis occupies one side of the
parameter space line, and hence its name.
On the other hand the null hypothesis is only
a point for (1) but a half line for (2).
Figure 1
PARAMETRIC SPACE FOR A ONE-SIDED TEST
(The μ axis: for (1), H0 is the point μ = 10; for (2), H0 is the half line μ ≤ 10; in both, HA: μ > 10 occupies one side of the line.)
It is logical to define the hypotheses as (1) since the definition of α remains as the probability of rejecting H0 if true. It is very unsatisfactory when one realizes that if the true mean is less than 10 the correct decision is impossible. A satisfying decision procedure would be one which makes a correct decision for all possible values of the parameter. This is true only when the union of the hypotheses is the entire parameter line.
It is satisfying to define the hypotheses as (2) but illogical, since the α-error must be redefined as the probability of rejecting H0 when μ = 10 only. This follows since the null hypothesis in (2) has an infinite number of different values for μ. Hence for a given α there are an infinite number of critical values with a corresponding number of acceptance and rejection regions. Out of this set there is only one which gives HA its greatest power, and that is when μ = 10. This is the one used in a test, and hence the redefinition of α.
Those who prefer (1) object to the illogical redefinition of α, and rightly so. Those who are partial to (2) object to the unsatisfactory, incomplete specification of the parameter space. The saving grace is that, regardless of which specification is used, the program of action and final decisions agree either that there is not enough evidence to support the research hypothesis or that the research hypothesis has been proved statistically. I prefer (2) and shall use it throughout with α redefined.
III DIAGRAM OF A ONE-SIDED SIGNIFICANCE TEST

Figure 2 shows the diagram of a one-sided test. Note that the distribution for H0 is a unique curve and would correspond to H0 in (1). If (2) were used then the distribution of H0 would be the same, obtained by using the μ = 10 value of H0.
Figure 2
ONE-SIDED SIGNIFICANCE TEST
(Real world versus statistical decision: do not reject H0 / reject H0. The distribution of the statistic for H0 is drawn over the statistic axis with a single critical value; the acceptance region lies to its left and the rejection region, carrying all of α in the right tail, to its right.)
In comparing Figure 2 with the diagram of a two-sided test of Figure 3 some differences are obviously apparent. The most notable is that all of α is in the right tail of H0 in Figure 2 as opposed to α/2 in Figure 3. This shifting is heuristically obtained by modifying Figure 3 sufficiently to obtain a one-sided test. Since larger differences are the research hypothesis and smaller differences are of no interest, the latter distribution should be deleted from Figure 3.

The probability of accepting HA when true is the area under its curve over the rejection region, and hence the rejection region should be selected to maximize this area. This is done by inspection of Figure 3. It is easily seen that increasing the right rejection region increases power. This is done by moving the right critical value as far to the left as possible (such that all of α is in the right tail of H0), which maximizes power.
Figure 3
TWO-SIDED TEST OF SIGNIFICANCE
(A reproduction of Figure 7 of part one: the distributions of the statistic for H0 and for the smaller- and larger-mean alternatives drawn over the statistic axis, with the acceptance region between the two critical values, α/2 or type-I error in each rejection tail, and β or type-II error under each alternative over the acceptance region.)
Note that this shifting, or increasing the size of the right rejection region, in turn increases the area under HA over the rejection region, or increases the probability of accepting HA when true, or increases the power of the test. Therefore if HA is true the one-sided test has more power than a corresponding two-sided test; in other words, one-sided tests are preferred.
The discussion of Figure 2 follows analogously
that of Figure 3 in part one of this paper
and is left to the reader.
IV EXAMPLE OF A ONE-TAILED t-TEST
An example of a one-sided test could be as follows. There exist theoretical grounds for testing whether a new product average exceeds the previous standard = 7.35. The observational data are random and independently normally distributed Xi, i = 1, 2, ..., 25.

Step 1. H0: μ ≤ 7.35
        HA: μ > 7.35
        α = 0.05

Step 2. The statistic = √n (x̄ - 7.35)/s, which is distributed as a t-distribution with degrees of freedom = 24. From a set of t tables, the critical value is 1.711 (see Figure 2). The acceptance region is less than or equal to 1.711, and the rejection region is all other values.

Step 3. From the data, x̄ = 7.6 and s = 0.504; hence the statistic = √25 (7.60 - 7.35)/0.504 = 2.48.

Step 4. Reject H0 since the statistic is in the rejection region, or the new average exceeds the old.
Modifying the test for a research hypothesis with a smaller mean would be as follows.

Step 1. H0: μ ≥ 7.35
        HA: μ < 7.35
        α = 0.05

Step 2. The statistic is the same as in the previous problem. Critical value = -1.711, with the acceptance region greater than or equal to -1.711 and the rejection region all other values.

Step 3. From the data, x̄ = 7.10 and s = 0.504; hence the statistic = √25 (7.10 - 7.35)/0.504 = -2.48.

Step 4. Reject H0.
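Both one-sided versions can be checked the same way; a sketch assuming Python with scipy (an illustrative addition only):

    from math import sqrt
    from scipy.stats import t

    n, alpha = 25, 0.05
    crit = t.ppf(1 - alpha, df=n - 1)     # 1.711, all of alpha in one tail

    stat_hi = sqrt(n) * (7.60 - 7.35) / 0.504   #  2.48 >  1.711: reject H0
    stat_lo = sqrt(n) * (7.10 - 7.35) / 0.504   # -2.48 < -1.711: reject H0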
V SIMPLE AND COMPOSITE HYPOTHESES
At this time it might be well to define two other terms used in connection with hypothesis testing. A simple hypothesis is one that specifies the distribution uniquely or is a point in the parameter space of the distribution. A composite hypothesis is one that is not simple.

If the distribution for Figure 1 has one parameter, say the mean, then H0 as written on the left is a simple hypothesis, or a point. H0 on the right is composite, or the half line, while HA is composite in both statements.
An example of a two-parameter family would
be the normal distribution as shown in Figure 4.
If the test H0: μ = 10 is made with known variance = 10 then H0 is a simple hypothesis since it is the point (10, 10).

If the test is H0: μ = 5 with an unknown variance then H0 is composite, or the half line (5, σ²), or the unmarked, solid, vertical line of Figure 4. The alternative in both cases is composite.
VI VARIATIONS ON A TWO-SIDED t-TEST

When one reads two different texts giving the same test, confusion can result if the test is presented in what looks like two different programs of action. The confusion is muddied even further when more references are consulted. Figure 5 shows some common ways of presenting a two-sided t-test for the mean, where s_x̄ = s/√n.

The top line of Figure 5 agrees with Figure 3. Note that one can algebraically manipulate from one line of Figure 5 to another. In some cases it is possible to do a two-sided test with only one critical value, as shown in line 2.
A good example of a two-sided test with only
one possible critical value is the F-test used
in an analysis of variance. The right tail is
always used since we have the ratio of a
numerator which consists of a larger or equal
mean square divided by a denominator mean
square.
Note that lines 3 and 4 have no statistic or critical values as defined in this paper. They would be classified as quasi-tests of significance by this writer.
Figure 4
PARAMETER SPACE FOR THE NORMAL DISTRIBUTION
(The plane with σ² axis vertical and μ axis horizontal; the point (10, 10) marks the simple hypothesis, and a solid vertical line marks the composite half line (5, σ²).)
Figure 5
VARIATIONS ON A TWO-SIDED TEST

    EXPLANATION                                METHOD OF TEST
    1. Random variable and test statistic      -t ≤ (x̄ - μ)/s_x̄ ≤ t
    2. Random variable and test statistic      |x̄ - μ|/s_x̄ ≤ t
    3. Random variables, no test statistic     -t·s_x̄ ≤ x̄ - μ ≤ t·s_x̄
    4. Random variables, no test statistic     |x̄ - μ| ≤ t·s_x̄
    5. Random variables, no test statistic     x̄ - t·s_x̄ ≤ μ ≤ x̄ + t·s_x̄

(Each line states the acceptance region; t is the tabled critical value with n - 1 degrees of freedom.)
At times it may be possible to manipulate a significance test into confidence limits. Hence one can perform a test by using confidence limits, as shown in the last line of Figure 5. This result is not always true. Here estimation and tests are on common ground. Lines 1 and 5 of Figure 5 are mathematically equivalent.
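A sketch of the line 1 / line 5 equivalence, assuming Python with scipy and reusing the two-sided example of part one (an illustrative addition):

    from math import sqrt
    from scipy.stats import t

    n, xbar, s, mu0 = 25, 7.1, 0.504, 7.35
    crit = t.ppf(0.975, df=n - 1)
    s_xbar = s / sqrt(n)

    # Line 1: compare the test statistic with +/- t.
    reject_1 = abs((xbar - mu0) / s_xbar) > crit
    # Line 5: does the confidence interval fail to cover mu0?
    lo, hi = xbar - crit * s_xbar, xbar + crit * s_xbar
    reject_5 = not (lo <= mu0 <= hi)
    assert reject_1 == reject_5    # the two decisions always agree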
If the reader is unsure of Figure 5 then a numerical example should be used to work out all cases. If the previous numerical example is used then arithmetic and time are at a minimum, with the advantage that the statistical decision can be checked for agreement if α = 0.1 is used.
Variations on a one-sided test are left to
the reader. They can be easily obtained
from lines 1, 3, and 5 with the appropriate
modification.
VII CONCLUDING REMARKS

We conclude with several remarks that should help clarify further the ideas associated with a test of significance.

A For small samples a significant result will be obtained only if the null hypothesis is very badly violated. In Figure 2 this means that the distance between the peaks of H0 and HA must be very large. Small samples imply large variances for both distributions, and hence poor power. It is a truism that from small samples you get small information.
B If a difference exists and is shown, then by increasing sample size this difference can be shown with a smaller type-I error. For example, if one gets a significant result at α = 0.05 and a sample size of 20, then with a sample size of 50 a significant result can be obtained with a smaller α. The effect of increasing sample size in Figure 2 would be to cause the variances of both H0 and HA to be decreased, or both distributions would be more peaked. Hence to keep α fixed the critical value must be shifted to the left, and this shift thereby increases the rejection region, or the power of the test. In addition, since HA is more peaked the type-II error would be decreased without changing the critical value, and hence power increases for a second reason. The sum of these two increments gives a more powerful test. Therefore α can be decreased, or the critical value moved right, to retain the original power, which was significant.
C Using the same α and increasing sample size will result in a rejection of H0 even when differences are smaller and thus more difficult to detect. As an example, if α = 0.05 and sample size = n failed to give a significant result because of a type-II error, then increasing the sample size sufficiently will result in a significant statistic. Increasing the sample size increases the power for the same α, as discussed in B above.
D For the same sample size and α, larger
mean differences are more easily dis-
criminated. As the difference between H0
and HA increases, this means in Figure 2
that the peaks of the two distributions be-
come farther apart. This increases power
and the probability of a significant result.

E For the same sample size and α, smaller
differences are more difficult to detect.
The heuristic argument follows that in
the previous paragraph: with HA and H0
shifted closer together, the loss in power can
be regained by increasing sample size.
F It is often claimed that if a difference exists,
no matter how small, one can prove it pro-
vided that sample size is large enough. One
can reduce both type-I and type-II errors
by increasing sample size. If sample size
is infinite then both errors can be made zero.
G If after increasing sample size the same
decision of nonsignificance is obtained, the
confidence of agreement between the sampled
and hypothetical populations is strengthened.
In other words, if one continuously obtains
nonsignificant results with increasing sample
size, then the acceptance of H0 becomes
more tenable. Complete confidence in H0
is obtained only with an infinitely large
sample.
H Trying to prove the validity of H0 by non-
significant results for finite n is like the
mathematician's trying to prove a theorem
by example. If the mathematician's
theorem is true for every possible example,
then it is proved. By the same reasoning,
if every possible sample of size n is tested,
then one would expect to have a significant
result αN times, where N = total number
of all possible samples of size n. Significant
deviations from αN would invalidate H0.
I If a test does not meet the assumptions,
then the α level is affected and is not what
the researcher specified. Power may also
be adversely affected. A test with fewer
-------
Variations on a Test of Significance
assumptions is preferred unless the
departures are compensated for by
robustness.
J If a researcher really understood the test
he was using, with enough time he could
determine the critical value empirically
via a Monte Carlo simulation study. Re-
peated sampling would permit the construc-
tion of the distribution of the statistic for
H0. From it the critical value could be
easily determined. If the researcher cannot
imagine an experiment for obtaining the
distribution via some random sampling
procedure, then he really does not under-
stand his test of significance.
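The remark in J can be made concrete with a small Monte Carlo sketch in Python (the normal parent, n = 10, and nominal α = 0.05 are assumed for the illustration): repeated sampling under H0 builds the distribution of the statistic, and the critical value is read off empirically.

# Sketch: empirical critical value for a two-sided t-test via Monte Carlo.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(1)
mu0, sigma, n, trials = 10.0, 5.0, 10, 20000
ts = sorted(abs(t_statistic([random.gauss(mu0, sigma) for _ in range(n)], mu0))
            for _ in range(trials))
crit = ts[int(0.95 * trials)]    # empirical 95th percentile of |t|
print(crit)                      # close to the tabled t.975(9) = 2.262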
K Similarly, by using the Monte Carlo
technique, one can empirically determine
the power of the test versus a
specific HA. First assume that HA has a
given set of parameter values; next deter-
mine its approximate distribution via Monte
Carlo; and determine 1-β, or the power, as
the area under the curve over the rejection re-
gion. By assuming a different set of parameter
values and by repeating the above, one
can study the change in the power of the
test versus any set of specified parameter
values.
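In the same spirit, the power against one specific HA can be estimated by sampling under HA and counting how often the statistic lands in the rejection region fixed under H0 (the sketch below assumes HA: μ = 13, σ = 5, n = 10, α = 0.05):

# Sketch: Monte Carlo power of the two-sided t-test against an assumed H_A.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(2)
mu0, mu_a, sigma, n, trials = 10.0, 13.0, 5.0, 10, 20000
t_crit = 2.262                   # tabled t.975 for 9 df
hits = sum(abs(t_statistic([random.gauss(mu_a, sigma) for _ in range(n)], mu0)) > t_crit
           for _ in range(trials))
print(hits / trials)             # estimate of the power 1 - beta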
L In the same way one can use Monte Carlo
techniques to investigate robustness. If
one assumes a departure of a certain magni-
tude from the basic assumptions, then the
evaluation of α is determined as in J,
where the rejection region is now known from
the distribution for H0 when no assumptions
are violated. If these two α's are in agree-
ment, then the procedure is robust with
respect to the preceding departures from
the basic assumptions. This procedure can
be repeated for any assumption, or combina-
tion of them, or any magnitude of departure
from them.
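A robustness study of the kind described in L can likewise be sketched (the assumed departure here is a skewed exponential parent in place of the normal parent; H0 still holds because the exponential mean is set to μ0):

# Sketch: actual alpha of the t-test when normality is violated.
import math, random, statistics

def t_statistic(sample, mu0):
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

random.seed(3)
mu0, n, trials, t_crit = 10.0, 10, 20000, 2.262
# expovariate(1/10) has mean 10, so H0 holds but the normality assumption fails
hits = sum(abs(t_statistic([random.expovariate(1 / mu0) for _ in range(n)], mu0)) > t_crit
           for _ in range(trials))
print(hits / trials)             # compare with the nominal alpha = 0.05

If the two α's agree, the test is robust against that departure.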
This outline was prepared by
Joseph F. Santner, Mathematical
Statistician, Manpower Develop-
ment Staff, Office of Water Programs,
National Training Center, EPA,
Cincinnati, OH 45268.
-------
STATISTICAL DATA
-------
Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION

Tests of significance (hypothesis testing, tests of hypothesis, significance tests, ...).

A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses.

[Handwritten outline of the experimental chain: 1) problem definition, 2) data collection scheme, 3) data collection, 4) data analysis, 5) written report]
1A-1
-------
A statistic (note: no s on the end) can be simply defined as a function of the sample, and is the measure or criterion used to discriminate between the two hypotheses used in a test of significance, namely the null and alternative hypotheses.

[Handwritten notes: n = sample size; s = stan. dev. of sample; σ = par. pop. stan. dev.; formulas 1) z = (X̄ - 10)/(σ/√n) and 2) t = (X̄ - 10)/(s/√n)]

Two examples of a statistic are formulas 1) and 2) above.
1A-2
-------
The null hypothesis is defined as the hypothesis of no difference between a hypothetical and the sampled population.

Hypothetical population (P) is normally distributed with mean = 10 and standard deviation = 5.

Sampled population (P) is normally distributed with unknown mean and standard deviation = 5.

Each test of significance has associated with it explicit or implicit test assumptions, or both, that must be satisfied if the results of the test are to be valid.

1A-3
-------
Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION
      PARAMETRIC
      NON-PARAMETRIC

[Handwritten annotations, illegible]
1A-4
-------
Distribution of the Statistic When the Null Hypothesis Is True
[Figure: distribution of the statistic over the statistic axis]

Four Distributions in Hypothesis Testing - H0 is True
[Figure: hypothetical pop. equals sampled pop.; random sample; distr. of sample (like parent if repeated); distr. of statistic, with the value of the statistic from the sample marked on the statistic axis]
1A-5
-------
The alternative hypothesis is usually the negation of the null hypothesis but
can be any admissible hypothesis alternative to the one under the test.
[Handwritten: the statistic t = (X̄ - 10)/(s/√n)]

Parametric Space for a Two-Sided Test
[Figure: μ axis with the point 10 marked]
H0: μ = 10
HA: μ ≠ 10
1A-6
-------
Four Distributions in Hypothesis Testing - HA is True
[Figure: hypothetical pop.; sampled pop.; random sample; distr. of sample (like parent if rep.); distr. of statistic; unique value of the statistic on the statistic axis]

Four Distributions in Hypothesis Testing - H0 is True
[Figure: hypothetical pop. equals sampled pop.; random sample; distr. of sample (like parent if rep.); distr. of statistic; value of the statistic from the sample]

Some Possible Distributions for the Statistic
[Figure: distributions of the statistic for μ = 0 and μ = 20, with points D, B, E, F, A, C marked on the statistic axis]
1A-7
-------
Four Possible Decisions in a Test of Significance

                            STATISTICAL DECISION
REAL WORLD          DO NOT REJECT H0             REJECT H0
H0 IS TRUE          CORRECT DECISION             INCORRECT DECISION
                                                 (TYPE I ERROR) (α-ERROR)
H0 IS NOT TRUE      INCORRECT DECISION           CORRECT DECISION
                    (TYPE II ERROR) (β-ERROR)

(Type I error = error of the first kind; Type II error = error of the second kind.)
α-error (Type I error, error of the first kind) is the rejection of the null hypothesis when it is true.

Some Possible Distributions for the Statistic
[Figure: distribution of the statistic for μ = 0, with points D, B, E marked on the statistic axis]

The magnitude of the α-error is called α and is also known as the level of significance.

The critical values (or value) partition the statistic axis into two regions, the acceptance region and the rejection region.
-------
Some Possible Distributions for the Statistic
[Figure: distributions of the statistic for μ = 0, μ = 10, μ = 20, with points D, B, E, F, A, C marked on the statistic axis]
1A-9
-------
[Handwritten: odds of a correct to an incorrect decision when H0 is true, e.g. 99 to 1 for α = 0.01]
1A-10
-------
β-error (Type II error, error of the second kind) is not rejecting the null hypothesis when it is false, or is the rejection of the alternative hypothesis when it is true. The magnitude of the β-error is called β.

The α-error is generally considered more serious than a β-error.

Some Possible Distributions for the Statistic
[Figure: points D, B, E on the statistic axis]

One minus beta (1-β) is defined as the power of the test.

The odds of accepting the alternative hypothesis (HA) when true are 1-β to β.

Parametric tests are generally more powerful than the corresponding non-parametric tests.

The rejection region is a region on the statistic axis such that if the statistic falls within it the null hypothesis is rejected. It is also called the significance region and the critical region.

The acceptance region is a region on the statistic axis such that if the statistic falls within it the null hypothesis is not rejected.

The first decision or discrimination rule is: do not reject H0 if the statistic falls in the acceptance region (some aliases: cannot reject H0, a non-significant result, outside the rejection region, outside the critical region, accept H0, research hypothesis not proved).

The second decision or discrimination rule is: reject H0 if the statistic falls in the rejection region (some aliases: HA is true, accept the research hypothesis, there is a significant result, HA is true unless a rare event has happened).
1A-11
-------
Two-Sided Test of Significance
[Figure: distribution of the statistic for H0; acceptance region between two critical values; rejection regions of size α/2 (type I error) in each tail of the statistic axis]

Two-Sided Test of Significance
[Figure: HA true due to larger mean; distributions of the statistic for H0 and for the larger-mean HA; β or type II error inside the acceptance region, no error in the rejection region; critical values on the statistic axis]
1A-12
-------
Two-Sided Test of Significance
[Figure: HA true due to smaller mean; distributions of the statistic for H0 and for the smaller-mean HA; β or type I/II error regions, acceptance and rejection regions, critical values]

Two-Sided Test of Significance
[Figure: all three cases on one statistic axis - HA true due to smaller mean, H0 true, HA true due to larger mean - showing where the no-error, β or type II error, and α/2 or type I error areas fall relative to the two critical values]
1A-13
-------
A robust test is one that is insensitive to departures from the assumptions. A test that is sensitive to departures from the test assumptions lacks robustness.

A test of significance can be defined as a method of analyzing data so as to discriminate between two hypotheses. The first hypothesis is the null hypothesis; the second is the alternative hypothesis, which should be the operational statement of the experimenter's research hypothesis. Next the best test for the alternative hypothesis is selected, that is, the one with the greatest power, and one that is robust. This test is based upon assumptions, which may be explicitly or implicitly stated, or both. The probability statements used in discrimination are based upon the assumptions. After the test is selected, the α-error is prespecified, that is, the odds of being wrong if the null hypothesis is true. This α-error in conjunction with the known distribution of the statistic for the null hypothesis determines the critical values. The critical values partition the entire statistic axis into two parts, the acceptance and rejection regions. The sample size is fixed and the data are now collected. From the collected data the statistic is evaluated and compared with the critical values to determine whether it is in the acceptance or rejection region. After this comparison the appropriate decision is made. If the statistic is in the acceptance region, the appropriate decision is: do not reject the null hypothesis. If the statistic is in the rejection region, the decision is: reject the null hypothesis. With each decision there is a possible error. When the statistic falls in the acceptance region, if in the real world the alternative hypothesis is true and we reject it, this is a Type-II error; on the other hand, if the null hypothesis is true and we do not reject it, then there is no error. There is no chance for a Type-I error when the statistic is in the acceptance region. When the statistic falls in the rejection region, if in the real world the null hypothesis is true and we reject it, this is a Type-I error; on the other hand, if the alternative hypothesis is true and we do not reject it, then there is no error. There is no chance for a Type-II error when the statistic is in the rejection region.
1A-14
-------
[Handwritten worked example, illegible]

If there exists no valid justification (theoretical grounds or previous experience) for a one-sided test, then a two-sided test is mandatory.
-------
Parametric Space for a One-Sided Test
[Figure: μ axis with the point 10 marked]
H0: μ = 10
HA: μ > 10

1A-16
-------
One-Sided Significance Test
[Figure: real world H0 true vs. HA true; statistical decision do not reject H0 vs. reject H0; distributions of the statistic for H0 and HA; acceptance region, rejection region, critical value on the statistic axis]
Two-Sided Test of Significance
[Figure: HA true due to smaller mean, H0 true, HA true due to larger mean; distributions of the statistic for H0 and both alternatives; no-error, β or type II error, and α/2 or type I error areas; acceptance region, rejection regions, and critical values on the statistic axis]
-------
[Handwritten worked example: two-sided t-test with critical values C.V. = ±1.711]
-------
A simple hypothesis is one that specifies the distribution uniquely, or is a point in the parameter space of the distribution.

A composite hypothesis is one that is not simple.

Parametric Space for a Two-Sided Test
[Figure: μ axis with the point 10 marked; H0: μ = 10]

Parameter Space for the Normal Distribution
[Figure: σ² axis against μ axis, with μ = 10 marked]

1A-19
-------
Variations on a Two-Sided Test
[Partial repeat of Figure 5: explanation column (random variable and test statistic; random variables, no test statistic) and method-of-test column, e.g. |X̄ - μ|/S_X̄]
-------
1A-21
-------
1A-22
-------
Distribution of the Statistic When the Null Hypothesis Is True
[Figure: distribution over the statistic axis]

1A-23
-------
[Handwritten: fixed parameter values; H0(1) meets the test assumptions, H0(2) fails; critical values from tables]
-------
EFFECT OF A CHANGE IN μ ALONE
[Figure: two normal curves, μ2 larger than μ1, σ the same for the curves]
μ IS A MEASURE OF CENTRAL TENDENCY, OR WHERE VALUES TEND TO CENTRALIZE.

EFFECT OF CHANGING σ ALONE AND KEEPING μ THE SAME
[Figure: two normal curves with the same μ and different σ]
σ IS A MEASURE OF SPREAD OR VARIATION IN ORIGINAL SCALE UNITS.

1B-1
-------
EFFECTS OF CHANGING BOTH μ AND σ
[Figure: two normal curves, μ2 greater than μ1, σ2 greater than σ1]

1B-2
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1)
[Table excerpt: rows z = 0.0, 1.0, 2.0; columns .00 to .09; e.g. P = .8413 at z = 1.00 and P = .9798 at z = 2.05]

1B-3
-------
1B-4
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1, T-2)
[Table excerpt as on page 1B-3, with handwritten workings]
1B-5
-------
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: 0.05 in the tail]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten workings]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645]
1B-8
-------
[Handwritten: Tables A-1 and A-2 are for the unit normal. If X is any normal random variable, Z = (X - μ)/σ is unit normal; to find an area for X, transform to Z and use the tables]

1B-9
-------
[Figure: normal curve over X from 10 to 30]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF P (Table A-1, T-2)
[Table excerpt as on page 1B-3]

1B-10
-------
[Handwritten: if Z1, Z2, ... are independent unit normals, then χ² = Z1² + Z2² + ... is a chi-square random variable]
χ²-Distribution
[Figure: chi-square density]

1C-1
-------
[Handwritten: normal distribution with 0.025 in the tail; chi-square distribution]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]

1C-2
-------
[Handwritten: χ² distribution, 0.05 in the tail, acceptance region, df = 10]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3, T-4)
[Table excerpt as above]

1C-3
-------
[Handwritten: random variable, df]

1C-4
-------
1D-1
-------
Normal vs Student Distribution
[Figure: standard normal and Student t (4 d.f.) densities, f(t) against t from -2 to 2, peak near 0.4]
Student t-Distribution

1D-2
-------
[Handwritten: definition of the t random variable]

1D-3
-------
Percentiles of the t Distribution (Table A-4)
-------
[Handwritten: 0.05 in the tail]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: df = 24, 26, ..., 60; t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]

1D-5
-------
[Handwritten: 0.025 and 0.975 points]
Percentiles of the t Distribution (Table A-4)

1D-6
-------
[Handwritten: the F random variable as a ratio of independent chi-squares, each divided by its degrees of freedom]
F-DISTRIBUTION (n1 = 10, n2 = 4)
[Figure: F density]
-------
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)

n2\n1      1       30      ∞
1        161.4   250.1   254.3
2        18.51   19.46   19.50
30        4.17    1.84    1.62
∞         3.84    1.46    1.00

[Handwritten workings]

1E-2
1E-2
-------
[Handwritten: 0.95 and 0.05 areas]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5)
[Table excerpt as on page 1E-2]

1E-3
-------
[Handwritten: F.95(30, 30), 0.05 in the tail]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)
-------
PARENT POPULATION / SAMPLING DISTRIBUTION

Classification of Statistic

STATISTICS
  DESCRIPTIVE
  INFERENCE
    TESTS OF SIGNIFICANCE
    ESTIMATION
      PARAMETRIC
      NON-PARAMETRIC

2-1
-------
ESTIMATION
  POINT (REAL LINE)
  CONFIDENCE-INTERVAL
    ONE-SIDED (LOWER or UPPER)
    TWO-SIDED

[Handwritten: for μ = par. pop. mean]
-------
2-3
-------
s IS A BIASED ESTIMATOR OF σ = POP. STAND. DEV. (Page 1-10)

SAMPLE SIZE, n    s IS AN UNBIASED ESTIMATE OF:
2                 0.7979σ
3                 0.8862σ
2-4
-------
ESTIMATION
  POINT (REAL LINE)
  CONFIDENCE-INTERVAL
    TWO-SIDED
    ONE-SIDED (LOWER or UPPER)

SAMPLE SIZE
2-5
-------
2-6
-------
[Handwritten: 100(1-α)% confidence interval for μ]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
-------
2-9
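These pages set up the two-sided confidence interval for μ with σ known, X̄ ± z(1-α/2)·σ/√n. A minimal Python sketch (the values X̄ = 12.0, σ = 5, n = 25 are assumed for the illustration), using z.975 = 1.960 from Table A-2:

# Sketch: two-sided 95% confidence interval for mu, sigma known.
import math

xbar, sigma, n, z = 12.0, 5.0, 25, 1.960
half_width = z * sigma / math.sqrt(n)
print(xbar - half_width, xbar + half_width)   # (10.04, 13.96)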
-------
CONF. INT. FOR σ

FACTORS FOR COMPUTING TWO-SIDED CONFIDENCE LIMITS FOR σ (Tables T-31 to T-35; also α = 0.01, 0.001)

DEG. OF FREEDOM v    α = 0.05:  B_U      B_L
1                               17.79    0.3576
2                               4.859    0.4581
3                               3.183    0.5178
100                             1.157    0.8757
2-10
-------
CONF. INT. (TWO-SIDED)
[Handwritten computation of two-sided confidence limits for σ]
-------
[Table (Figure 2-1): estimates of the standard deviation from the range b - a for several distributions, e.g. σ estimated by (b - a)/4.2]
2-12
-------
[Handwritten notes]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: confidence interval computation]
Percentiles of the t Distribution (Table A-4)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
2-14
-------
[Handwritten: γ = 0.95]
NUMBER OF MEASUREMENTS REQUIRED TO ESTABLISH THE VARIABILITY WITH STATED PRECISION (Figure 2-2)
[Figure: curves of required sample size (up to 1000) against precision P%, for γ = .90, .95, .99]
2-15
-------
-------
TWO-SIDED TOLERANCE LIMITS: X̄ ± Ks COVERING 100P% OF THE POP.

FACTORS FOR TWO-SIDED TOLERANCE LIMITS FOR NORMAL DISTRIBUTIONS (Table T-10)
[Table excerpt: n = 2, 3, 4; γ = 0.90; P = 0.75, 0.90, 0.95, 0.99, 0.999; e.g. K = 4.943 at one tabulated cell; tabulated for γ = 0.75, 0.90, 0.95, 0.99 and n = 2 (various) to ∞]
2-17
-------
One Sample Problem

TEST                         CONDITIONS
H0: μ = CONSTANT             1. σ UNKNOWN
HA: μ ≠ CONSTANT             2. σ KNOWN

H0: μ ≤ CONSTANT             3. σ UNKNOWN
HA: μ > CONSTANT             4. σ KNOWN

H0: μ ≥ CONSTANT             5. σ UNKNOWN
HA: μ < CONSTANT             6. σ KNOWN
-------
[Handwritten: sample size needed to detect a given difference d]
Sample Sizes Required... (two-sided t-test, α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt: columns d = .2, .4, .8; rows 1-β = .5, .6, .7, .8, .9, .95, .99; e.g. n = 97, 25, 7 across d at 1-β = .5; n = 31 and 8 at 1-β = .6; n = 263 at 1-β = .9]

3-2
-------
-------
[Handwritten worked example: one-sample t-test; decision: do not reject H0]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-4
-------
[Handwritten workings]
Sample Sizes Required... (Table A-8, T-16; also α = 0.01)
[Table excerpt as on page 3-2]
3-5
-------
[Handwritten: interval for detectable differences; small α and small β require the correct sample size]
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt as on page 3-2]

3-6
-------
One Sample Problem
[Table as on page 3-1: H0/HA pairs for μ with σ unknown or known; handwritten: case 5, σ unknown, marked; X̄ = 7.35]
-------
[Handwritten: computing the standardized difference d ≈ 0.8, with σ estimated from the range b - a, e.g. (b - a)/3.5 or (b - a)/4.2 (Figure 2-1, Page 2-9)]
Sample Sizes Required... (α = 0.05) (Table A-8, T-16)
[Table excerpt: e.g. n = 23 at 1-β = .6]
3-8
-------
[Handwritten workings; acceptance region]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-9
-------
3-10
-------
Two Sample Problem
[Table: tests comparing two means, with conditions on σA and σB, e.g. 1. σA = σB]
-------
[Handwritten workings]
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-8, T-16)
[Table excerpt as on page 3-2]
3-12
-------
3-13
-------
[Handwritten decision: do not reject H0]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
Two Sample Problem
[Table: cases 5 and 6, σA and σB both unknown; condition 1. σA = σB]
-------
Sample Sizes Required... (α = 0.05; also α = 0.01) (Table A-9, T-17)
[Table excerpt: columns d = .1, .2, .4, ..., 3.0; rows 1-β = .5 to .99; e.g. n = 91 and 23 at 1-β = .6, n = 30 at 1-β = .7]
3-16
-------
3-17
-------
[Handwritten decision]
Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.95(24) = 1.711, t.95(60) = 1.671, t.975(26) = 2.056]
3-18
-------
OC Curves for the Two-Sided t-test (α = .05) (Figure 3-1, Page 3-6)
[Figure: probability of not rejecting H0, from 0 to 1.0, plotted against the true standardized difference]
3-19
-------
[Handwritten: reading the OC curve, e.g. at d = 1.5 the difference is detected with high probability]
OC Curves for the Two-Sided t-test (α = .05)

3-20
-------
[Handwritten: given n = 10 and d ≈ 0.7, β is read from the curve; smaller differences give larger β]
OC Curves for the Two-Sided t-test (α = .05) (Figure 3-1, Page 3-6)

3-21
-------
[Handwritten: given n = 100, β is almost zero; power increases with n]
OC Curves for the Two-Sided t-test (α = .05)
3-22
-------
[Handwritten: given d]
see Table 3-1, Page 3-4
see Table 3-2, Page 3-22
see Article 3-3.1.4, paired observations, two-sided test
see Article 3-3.2.4, paired observations, one-sided test
see Article 3-4 for the k-sample problem
3-23
-------
One Sample Problem
[Table: H0 vs. HA for σ]
-------
[Handwritten: testing σ against a constant via the normal approximation]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
4-2
-------
[Handwritten: one-sided confidence interval for σ]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ

DEG. OF FREEDOM    A.05      A.95
1                  0.5103    15.947
2                  0.5778    4.415
20                 0.7979    1.358
100                0.8968    1.133

ALSO A.025, A.01, A.005, A.975, A.99, A.995
4-3
-------
[Handwritten decision workings]
4-4
-------
Operating Characteristics...
[Figure: OC curve against the ratio σ/CONSTANT from 1.0 to 3.0]

Two Sample Problem
[Table: tests 1 and 2 on σA vs. σB]
4-5
-------
4-6
-------
[Handwritten: large-sample test for σ]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
[Handwritten: degrees of freedom and the computed F ratio]
PERCENTILES OF THE F DISTRIBUTION - F.95(n1, n2) (Table A-5, T-7)
[Table excerpt as on page 1E-2]
4-8
-------
Operating Characteristics...
[Figure: OC curve; handwritten note F.95(30, 30) = 1.84]
-------
One Sample Problem (C = CONSTANT)
[Table: cases 1-3, H0: σ = C against one- and two-sided alternatives]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ

DEG. OF FREEDOM    A.05      A.95
1                  0.5103    15.947
2                  0.5778    4.415
20                 0.7979    1.358
100                0.8968    1.133

ALSO A.025, A.01, A.005, A.975, A.99, A.995

4-10
-------
4-11
-------
One Sample Problem (C = CONSTANT)
[Table: 1. H0: σ = C vs. HA: σ ≠ C; 2 and 3: one-sided alternatives]

FACTORS FOR COMPUTING ONE-SIDED CONFIDENCE LIMITS FOR σ
[Table as on page 4-10; also A.025, A.01, A.005, A.975, A.99, A.995]
4-12
-------
4-13
-------
One Sample Problem (C = CONSTANT)
[Table: H0 vs. HA for σ, cases 1-3]

FACTORS FOR COMPUTING TWO-SIDED CONFIDENCE LIMITS FOR σ (Table A-20, T-34; also α = 0.01, 0.001)

DEG. OF FREEDOM v    α = 0.05:  B_U      B_L
1                               17.79    0.3576
2                               4.859    0.4581
3                               3.183    0.5178
100                             1.157    0.8757
4-14
-------
[Handwritten workings]
Two Sample Problem
[Table: tests 1 and 2 on σA vs. σB]
4-15
-------
4-16
-------
BIVARIATE NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure: joint density f(X, Y)]
5-1
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure]

LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII
[Figure: joint distribution of Xi and Yi]
-------
FUNCTIONAL RELATIONSHIP FI
[Figure: X values chosen by experimenter; a line of Y means]
LINEAR FUNCTIONAL RELATIONSHIP (Figure 5-2)
5-3
-------
MODEL CLASSIFICATION (NATRELLA)

X FACTOR IS:    CONTROLLED                       UNCONTROLLED
QUANTITATIVE    FUNCTIONAL RELATIONSHIP          STATISTICAL RELATIONSHIP
                OR REG. ANA.:                    OR CORR. ANA.:
                MODEL FI (NO ERROR IN X,         MODELS SI & SII
                ERROR IN Y ONLY);
                MODEL FII (ERROR IN X & Y)
QUALITATIVE     ANA. OF VARIANCE,                ANA. OF VARIANCE,
                FIXED EFFECTS MODEL              RANDOM EFFECTS MODEL
5-4
-------
MODEL FI
[Figure: fitted line y = b0 + b1x plotted through the data points]
5-5
-------
[Handwritten: least-squares work sheet for X-Y data]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4
5-6
-------
[Handwritten: Sxx = ΣX² - (ΣX)²/n = 10; Syy = ΣY² - (ΣY)²/n = 50 - 36 = 14; Sxy = ΣXY - ΣXΣY/n = 11]

DATA BANK: ΣX = 0, ΣY = 12, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10; b1 = 1.1, b0 = 3
5-7
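The data-bank arithmetic above can be checked with a short Python sketch that reproduces the least-squares results from the sums alone:

# Sketch: Model FI least squares from the data-bank sums.
n, SX, SY, SXY, SX2, SY2 = 4, 0.0, 12.0, 11.0, 10.0, 50.0

xbar, ybar = SX / n, SY / n
Sxx = SX2 - SX ** 2 / n            # 10
Syy = SY2 - SY ** 2 / n            # 14
Sxy = SXY - SX * SY / n            # 11

b1 = Sxy / Sxx                     # 1.1
b0 = ybar - b1 * xbar              # 3.0
s2_Y = (Syy - b1 * Sxy) / (n - 2)  # residual variance 0.95
print(b0, b1, s2_Y)                # matches b0 = 3, b1 = 1.1, s2_Y = 0.95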
-------
Sxx = 10, Sxy = 11, Syy = 14; b1 = Sxy/Sxx = 1.1; b0 = Ȳ - b1X̄ = 3

MODEL FI
[Figure: fitted line y = b0 + b1x = 3 + 1.1x plotted through the data]
5-8
-------
MODEL FI: y = β0 + β1x + ε (MODEL), with ε normal about zero; fitted by ŷ = b0 + b1x
[Figure: model line with scatter, y from -4 to 10]
5-9
-------
[Handwritten: s²_Y = (Syy - b1·Sxy)/(n - 2)]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10, Sxy = 11, Syy = 14; b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082
5-10
-------
[Handwritten: s_b0 = 0.4873]

DATA BANK: ΣX = 0, X̄ = 0, Ȳ = 3, ΣX² = 10, ΣY² = 50, n = 4; Sxx = 10, Sxy = 11, Syy = 14; b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082, s_b0 = 0.4873, t.975[df = 2] = 4.303 (PAGE T-5)

Percentiles of the t Distribution (Table A-4, T-5)
[Table excerpt: t.975(2) = 4.303]
5-11
-------
[Handwritten: confidence interval for the mean of Y at a chosen X]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-12
-------
[Handwritten: confidence limits for Y at a chosen X, Model FI, ŷ = b0 + b1x]
5-13
-------
[Handwritten worked example: predicted Ŷ at a chosen X, with limits]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-14
-------
-------
[Handwritten: test of H0: β1 = constant, α = 0.05]

DATA BANK: (as on page 5-11); t.975[df = 2] = 4.303 (PAGE T-5)
5-16
-------
5-17
-------
[Handwritten workings]

DATA BANK: ΣX = 0, X̄ = 0, n = 4; ΣY = 12, Ȳ = 3; ΣXY = 11, ΣX² = 10, ΣY² = 50; Sxx = 10, Syy = 14; b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s_Y = 0.9747, s_b1 = 0.3082, s_b0 = 0.4873, t.975[df = 2] = 4.303 (PAGE T-5)
5-18
-------
[Handwritten: confidence interval for b0, α = 0.05]
5-19
-------
MODEL: LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FI
[Figure; error term ε]
5-20
-------
LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII (Figures 5-3, 5-5)
[Figure: joint distribution of Xi and Yi]

TRANSFORMATIONS TO A STRAIGHT LINE

RELATIONSHIP       PLOT
Y = a + bX         Y vs. X
1/Y = a + bX       1/Y vs. X
Y = ab^X           log Y vs. X
Y = aX^b           log Y vs. log X
5-21
-------
[Handwritten notes on testing linearity, with a reference to Cochran]
5-22
-------
H0: LINEAR RELATIONSHIP (ART. 5-4.1.6, ART. 5-4.1.3)
[Flow: REJECT H0 -> TRY a second-degree term; DO NOT REJECT H0 -> USE Ŷ = b0 + b1X]
NOTE: THE COEFFICIENT MAY NOT EQUAL ZERO WHEN A SECOND DEGREE TERM IS INTRODUCED.
[Handwritten reference to a statistics manual]
5-23
-------
5-24
-------
[Handwritten note]
BIVARIATE NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure: joint density f(X, Y)]
5-25
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP

[Model classification table as on page 5-4 (NATRELLA): controlled vs. uncontrolled X, quantitative vs. qualitative; Models FI and FII; Models SI & SII; fixed- and random-effects analysis of variance]
5-26
-------
[Handwritten: for Model SI the computations use the same data bank as FI (Art. 5-5); also confidence intervals, etc.]
5-27
-------
DATA BANK: ΣX = 0, X̄ = 0, n = 4; ΣY = 12, Ȳ = 3; ΣXY = 11, ΣX² = 10, ΣY² = 50; Syy = 14; Sxy = 11, b1 = 1.1, b0 = 3
Ŷ = b0 + b1X = 3 + 1.1X
s²_Y = 0.95, s_Y = 0.9747, s_b1 = 0.3082, t.975[df = 2] = 4.303 (PAGE T-5)
5-28
-------
MODEL SI
[Figure: fitted line ŷ = b0 + b1x = 3 + 1.1x with the joint scatter]
5-29
-------
DATA BANK: (as on page 5-28); s_b0 = 0.4873
5-30
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Handwritten: r² = coefficient of determination = (explained variance of Y)/(variance of Y)]
5-31
-------
5-32
-------
FUNCTIONAL RELATIONSHIP FI
[Figure: X values chosen by experimenter; a line of Y means]

LINEAR FUNCTIONAL RELATIONSHIP OF TYPE FII (Figures 5-3, 5-5)
[Figure: joint distribution of Xi and Yi]

5-33
-------
TWO-DIMENSIONAL NORMAL DISTRIBUTION OR STATISTICAL RELATIONSHIP
[Figure; handwritten: data on X and Y]
5-34
-------
[Handwritten: Bernoulli trials; outcomes A and not-A, e.g. coin toss head or tail]
7A-1
-------
[Handwritten: take n trials and count occurrences of A; the count is binomial]
7A-2
-------
[Handwritten: binomial probability of x successes out of n trials, P(x) = n!/(x!(n - x)!) p^x (1 - p)^(n - x)]

7A-3
-------
[Handwritten: estimation of the parent pop. proportion]

TWO-SIDED CONF. INTERVALS FOR PARENT POP. PROPORTION P

NATURE     SAMPLE    ART.       METHOD
EXACT      n ≤ 30    7-3.1.1    TABLE A-22
EXACT      n > 50    7-3.1.2    TABLE A-24
APPROX.    n > 30    7-3.1.3    FORMULA
7B-1
-------
CONFIDENCE LIMITS FOR PROPORTION (TWO-SIDED) (Table A-22, T-40)
[Table excerpt: r = 11, n = 27; 90%: (.239, .593); 95%: (.223, .598); 99%: ...]

[Handwritten: r = no. of red ..., confidence statement for P]
7B-2
-------
TWO-SIDED CONF. INTERVALS FOR PARENT POP. PROPORTION P
[Table as on page 7B-1]
7B-3
-------
[Handwritten workings]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
ONE-SIDED CONF. INTERVAL FOR P

NATURE     SAMPLE    ART.       METHOD
EXACT      n ≤ 30    7-3.2.1    TABLE A-23
EXACT      n > 50    7-3.2.2    TABLE A-24
APPROX.    n > 30    7-3.2.3    FORMULA

CONFIDENCE LIMITS FOR PROPORTION (ONE-SIDED) (Table A-23)
[Table excerpt: r = 0, 11, 16, 26; n = 27; e.g. .549 at 90%, .583 and .752 at 95%, .645 at 99%]
7B-5
-------
CONFIDENCE LIMITS FOR PROPORTION (ONE-SIDED)
[Table excerpt as on page 7B-5]
7B-6
-------
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart: belts plotted against p̂ = r/n from about .1 to .5, ordinate 0.2 to 0.5]

7B-7
-------
SAMPLE SIZE DETERMINATION

TWO-SIDED CONFIDENCE INTERVAL
NATURE     SAMPLE    ART.       METHOD
EXACT      n > 50    7-4.1.1    TABLE A-24
APPROX.    n > 30    7-4.1.2    FORMULA

ONE-SIDED CONFIDENCE INTERVAL
EXACT      n > 30    7-4.2      FORMULA
7B-8
-------
[Handwritten: choice of P when P is unknown]
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
-------
[Handwritten: n from the normal approximation]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P = CONSTANT             1. N ≤ 30
HA: P ≠ CONSTANT             2. N > 30

[Further one-sided cases, conditions as above]

P = PARENT POPULATION PROPORTION
8-1
-------
[Handwritten worked example: the hypothesized P lies inside the 90% two-sided confidence interval (.239, .593) for r = 11, n = 27; decision: do not reject H0]

CONFIDENCE LIMITS FOR PROPORTION (TWO-SIDED) (Table A-22, T-40)
[Table excerpt: r = 11, n = 27; 90%: (.239, .593)]
8-2
-------
[Handwritten: number of observations required to detect a shift in P; δ = 0.1; for a value close to 0.5, take P = 0.5]

TABLE OF ARC SINE TRANSFORMATION FOR PROPORTIONS: θ = 2 arcsin √p
[Table excerpt: θ(0.4) = 1.37, θ(0.5) = 1.57]

CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]

8-3
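The transformation θ = 2 arcsin √p is variance-stabilizing: on the θ scale a sample proportion has variance close to 1/n whatever p is. A short Python check of the tabled values:

# Sketch: arc sine transformation for proportions.
import math

def theta(p):
    return 2 * math.asin(math.sqrt(p))

print(round(theta(0.4), 2), round(theta(0.5), 2))   # 1.37, 1.57 as tabled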
-------
[Handwritten workings]
CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart against p̂ = r/n]
8-4
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P = CONSTANT             1. N ≤ 30
HA: P ≠ CONSTANT             2. N > 30

H0: P ≤ CONSTANT             1. N ≤ 30
HA: P > CONSTANT             2. N > 30

H0: P ≥ CONSTANT             1. N ≤ 30
HA: P < CONSTANT             2. N > 30

P = PARENT POPULATION PROPORTION
8-5
-------
TWO SAMPLE PROBLEM

TEST                 CONDITIONS
H0: PA = PB          1.-2. NA = NB
HA: PA ≠ PB          3. NA ≠ NB (LARGE SAMPLES)

PA = PROPORTION FOR A POPULATION
NA = SAMPLE SIZE FROM A POPULATION
8-6
-------
[Handwritten: comparing proportions from populations A and B]

MINIMUM CONTRAST...
5% LEVEL, TWO-SIDED (ALSO 1%)
2.5% LEVEL, ONE-SIDED (ALSO 0.5%)
[Table excerpt: sample size nA = nB = 20; pairs A1, A2: 0,5  1,7  2,9  3,10  4,11  5,13  6,14; tabulated for nA = nB = 1(1)20(10)100(50)200(100)500]
8-7
-------
[Handwritten: transforming each proportion to θ]

TABLE OF ARC SINE TRANSFORMATION FOR PROPORTIONS: θ = 2 arcsin √p (Table A-27)
[Table excerpt: θ(0.4) = 1.37, θ(0.48) = 1.53]

CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2)
[Table excerpt: z.90 = 1.282, z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
8-8
-------
8-9
-------
[Handwritten: acceptance region]
PERCENTILES OF THE χ² DISTRIBUTION (Table A-3, T-4) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
8-10
-------
TWO SAMPLE PROBLEM
[Table: H0: PA = PB; conditions 1.-2. both samples small, 3. NA ≠ NB (large samples); PA = proportion for A population, NA = sample size from A population]

MINIMUM CONTRAST... (Table A-28, T-56)
5% LEVEL, TWO-SIDED (ALSO 1%); 2.5% LEVEL, ONE-SIDED (ALSO 0.5%)
[Table excerpt: nA = nB = 20; A1, A2: 0,5  1,7  2,9  3,10  4,11  5,13  6,14; nA = nB = 1(1)20(10)100(50)200(100)500]
8-11
-------
ONE SAMPLE PROBLEM

TEST                         CONDITIONS
H0: P ≥ CONSTANT             1. N ≤ 30
HA: P < CONSTANT             2. N > 30

P = PARENT POPULATION PROPORTION

CONFIDENCE BELTS FOR PROPORTIONS (CONFIDENCE COEFFICIENT 0.90)
[Chart against p̂ = r/n]
8-12
-------
[Handwritten: night shift]
CATEGORIES OF REJECTION (DISCRETE): SAND, DROPPED, BROKEN, OTHER (ANNUAL PROD.)

CATEGORIES OF MEASUREMENT (CONTINUOUS): -∞ TO -1, -1 TO 0, 0 TO 1, 1 TO ∞
9-1
-------
[Contingency layouts:]
WEEK 1, WEEK 2, WEEK 3 × CATEGORIES OF REJECTION (DISCRETE): SAND, DROPPED, BROKEN, OTHER

SECOND CATEGORY, OR EDUCATION LEVEL: GRADE SCHOOL, HIGH SCHOOL, COLLEGE, GRADUATE
FIRST CATEGORY, OR TYPE OF SUCCESS: A, B, C
9-2
-------
[Handwritten: the X's are cell counts or frequencies; if the counts in the cells are Poisson, the statistic Σ(O - E)²/E is approximately χ²]
9-3
-------
[Handwritten: goodness-of-fit setup; observed counts 30, 40, 33, 47, ...; HA: not all expected counts equal; α = 0.05]
9-4
-------
[Handwritten: observed counts O = 30, 40, 33, 47, ... against expected counts E = 40 in every cell; E = 40 = the average count under H0]
9-5
-------
[Handwritten: χ² = Σ(O - E)²/E = 7.45; χ².95(4) = 9.49; since 7.45 < 9.49, the result is not significant - do not reject H0]

PERCENTILES OF THE χ² DISTRIBUTION (Table A-3)
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
9-6
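The goodness-of-fit computation can be verified in Python (the fifth observed count is illegible in the notes and is assumed here to be 50, which makes the counts total 200 and reproduces the recorded χ² = 7.45):

# Sketch: chi-square goodness of fit against equal expected counts.
observed = [30, 40, 33, 47, 50]          # fifth count assumed
expected = [40] * 5                      # E = 40 in every cell

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)                              # 7.45 < chi-square .95(4) = 9.49: do not reject H0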
-------
[Handwritten contingency-table example: hair color (brunette, red, ...) versus a second category, n = 360; H0: no association between the two categories; expected counts E computed from the margins]

9-7
-------
[Handwritten: the same hair-color contingency table, n = 360, with workings]
9-8
-------
PERCENTILES OF THE χ² DISTRIBUTION (T-24) - THE VALUES OF χ²_p CORRESPONDING TO p
[Table excerpt: df = 1, 4, 10; χ².95(1) = 3.84, χ².95(4) = 9.49, χ².99(1) = 6.63, χ².90(10) = 15.99]
9-9
-------
[Handwritten: rule of thumb for the χ² goodness-of-fit test - expected cell frequencies should not be too small (at most a few cells with E < 5); combine cells if necessary]
9-10
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT COULD BE IN EITHER TAIL

CASE    ART.        CONDITION
I       17-3.1.1    μ AND σ UNKNOWN
II      17-3.1.2    μ AND σ UNKNOWN, EXTERNAL ESTIMATE S FOR σ AVAILABLE
III     17-3.1.3    μ UNKNOWN, σ KNOWN
IV      17-3.1.4    μ AND σ KNOWN
17-1
-------
[Handwritten: observations ranked smallest to largest]

CRITERIA FOR REJECTION OF OUTLYING OBSERVATIONS (T-27)
[Table excerpt: statistic r11; n = 8, 9, 10; upper percentiles, e.g. .639]
17-2
-------
[Handwritten worked outlier example: ranked sample and decision]
17-3
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT COULD BE IN EITHER TAIL
[Table as on page 17-1: cases I-IV, conditions on μ and σ, external estimate S available]
17-4
-------
[Handwritten: n = 10, not significant]

PERCENTILES OF THE STUDENTIZED RANGE, q (Table A-10, T-22)
[Table excerpt: columns by sample size; e.g. q = 5.76 at one tabulated cell]
17-5
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT IS IN ONE TAIL ONLY

CASE    ART.        CONDITION
I       17-3.2.1    μ AND σ UNKNOWN
II      17-3.2.2    μ AND σ UNKNOWN, EXTERNAL ESTIMATE FOR σ AVAILABLE
III     17-3.2.3    μ UNKNOWN, σ KNOWN
IV      17-3.2.4    μ AND σ KNOWN
-------
[Handwritten data]
PERCENTAGE POINTS OF ... MEAN
[Table excerpt, illegible]
-------
CIRCUMSTANCE DECIDED ON A PRIORI GROUNDS: SUSPECT IS IN ONE TAIL ONLY
[Table as on page 17-6: cases I-IV]
-------
[Handwritten: α = 0.001; z computed from the sample]
CUMULATIVE NORMAL DISTRIBUTION - VALUES OF z_p (Table A-2, T-3)
[Table excerpt: z.95 = 1.645, z.975 = 1.960, z.999 = 3.090]
17-9
-------
[Flowchart: treatment of suspected outliers]

SUSPECT IN ONE TAIL OR BOTH TAILS -> TEST OF SIGNIFICANCE
  NOT SIGNIFICANT -> ANALYZE ALL DATA
  SIGNIFICANT -> SEARCH FOR PHYSICAL GROUNDS
    FOUND ->
      CORRECT & ANALYZE
      REJ. & ANALYZE REDUCED SAMPLE
      REJ. & GET ANOTHER OBSERVATION
      REJ. & REPLACE BY MEAN (OR ETC.)
      REJ. & USE TRUNCATED THEORY
    NOT FOUND ->
      REJECT ON UNKNOWN GROUNDS: SAME AS THE LAST 4 ABOVE
      DO NOT REJECT: ANALYZE ALL DATA

17-10
------- |