EPA-650/2-74-080



SEPTEMBER 1974

Environmental Protection Technology Series

-------
                             EPA-650/2-74-080
STATISTICAL  CONCEPTS
FOR  DESIGN  ENGINEERS
                by

     J. R. Murphy and L. D. Broemeling

        Oklahoma State University
        Stillwater, Oklahoma 74074
          Grant No. R-802269
          ROAP No. 21ADE-026
       Program Element No. 1AB013
    EPA Project Officer: W. R. Schofield

       Control Systems Laboratory
   National Environmental Research Center
 Research Triangle Park, North Carolina 27711
             Prepared for

  OFFICE OF RESEARCH AND DEVELOPMENT
 U.S. ENVIRONMENTAL PROTECTION AGENCY
        WASHINGTON, D.C. 20460

            September 1974

-------
This report has been reviewed by the Environmental Protection Agency
and approved for publication.  Approval does not signify that the
contents necessarily reflect the views and policies of the Agency,
nor does mention of trade names or commercial products constitute
endorsement or recommendation for use.
                                 ii

-------
                          TABLE OF CONTENTS


Chapter                                                              Page

0.  PREFACE ........................................................   vi

1.  INTRODUCTION ...................................................    1

2.  POPULATIONS, VARIABILITY, UNCERTAINTY, AND SAMPLING ............    5

3.  VARIABILITY AND RANDOM ERROR ...................................   14

4.  BASIC CONCEPTS OF PROBABILITY AND MATHEMATICAL STATISTICS ......   19

    4.1  Random Experiments and Probability ........................   19
    4.2  Random Variables and Distributions ........................   24
    4.3  Moments and Expectation ...................................   32
    4.4  Other Descriptive Quantities ..............................   35
    4.5  Jointly Distributed Random Variables ......................   37
    4.6  Useful Distributions ......................................   41
         4.6.1  The Binomial Distribution ..........................   42
         4.6.2  The Poisson Distribution ...........................   44
         4.6.3  The Geometric Distribution .........................   45
         4.6.4  The Exponential Distribution .......................   46
         4.6.5  The Normal Distribution ............................   47
         4.6.6  The Chi-Square Distribution ........................   52
         4.6.7  Student's t-Distribution ...........................   52
         4.6.8  The F-Distribution .................................   53

5.  SAMPLING AND INFERENCE .........................................   54

    5.1  Description of Finite Number Populations ..................   54
    5.2  Statistical Inference in Finite Populations ...............   55
    5.3  Statistical Inference in Infinite Populations .............   66
    5.4  Statistical Inference in Normally Distributed Populations .   71
                                iii

-------
                                                                     Page

          5.4.1  A Single Normal Population ........................   71
          5.4.2  Two Non-Independent Normal Populations ............   78
          5.4.3  Two Independent Normal Populations ................   81
          5.4.4  Several Independent Normal Populations ............   85
          5.4.5  Analysis of Variance ..............................   93
6.  EXPERIMENTAL TEST PROGRAMS .....................................   99
    6.1   Introduction .............................................   99
    6.2   Data .....................................................  100
    6.3   Comparative Experiments ..................................  104
    6.4   The Use of the Word "Design" .............................  105
    6.5   Properties of a Good Experiment ..........................  106
    6.6   Experimental Units .......................................  109
    6.7   Experimental Error and Sampling Error ....................  110
    6.8   Degrees of Freedom in the Analysis of Variance ...........  114
    6.9   Randomization ............................................
    6.10  Randomized Designs .......................................
    6.11  Multifactor Experiments ..................................  120
    6.12  Mathematical Models ......................................  127
7.  SUMMARY ........................................................  136
8.  REFERENCES .....................................................  138
                                  iv

-------
                                 FIGURES


No.                                                                  Page

1.  Binomial Probability Mass Function .............................   43
2.  The Normal Density Curve .......................................   47
3.  Binomial Mass Function .........................................   50
4.  Results of the 0.05-Level LSD Test (LSD = 12.1) ................   91
5.  Experimental Design ............................................  105

-------
                            0.  PREFACE

        In the last few years, data gathering and analysis has reached
such a level of sophistication that in many cases, statistical treatment
is considered essential.  As a result, many scientists and engineers
have become acquainted with the valuable assistance that statistical
methods can often provide.  In addition, they have helped to point out
where new and/or better techniques are needed, so that statistical theory
and statistical techniques have also reached a high level of sophistication.
Consequently, the study and practice of statistics, like many disciplines,
has become a specialty field, which has tended to cause some scientists
and engineers to feel uncomfortable about the subject and to regard it as
too difficult to understand and/or utilize.  To be sure, statistics is
not a trivial subject, and the effective application of statistical
methods to real world problems does require special training that an
engineer oft-times simply cannot afford to undertake.  Thus, the business
of designing and conducting experiments, and analyzing and interpreting
the results has necessarily become a joint effort requiring teamwork
between specialists.  It is a common misconception that such teamwork
is of a  "production assembly line"  nature, where each member does his
part and then passes it on to the next.  Thus, many times, a consulting
statistician is not introduced to a project until the data  is already
gathered and it is time to analyze and interpret it.  It would seem that
the researcher who has invested his time in planning and carefully con-
ducting an experiment would be reluctant to entrust its analysis and
interpretation to one who, although an expert in techniques of analysis,
is relatively unfamiliar with the circumstances surrounding the data;
however, there are many who believe that this is what they are supposed
                                 vi

-------
to do.  Perhaps much of the distrust some people have for  "statistics"
is attributable to such a misconception.
        One of the points we shall try to emphasize in this manual is
that useful analysis of data cannot be based on the numbers alone, that
in order to be able to draw meaningful inferences from the results of
an experiment, one must consider how those results were obtained.   Thus,
when we speak of teamwork, we are referring to cooperation and interaction
of the members at all stages of a project.  If a statistician is to be a
member of the team, there are at least two reasons why he should be
included from the start.  (1)  In line with the thought expressed above,
a great deal of time can be saved.  Since the analysis of the data depends
so much upon the background, invariably a statistician brought in at the
later stages must be filled in and brought up to date.  On the other
hand, if he is already familiar with the project, the lag time between
collection of the data and analysis of the data is less and, in addition,
the analysis is more efficient.  (2)  More importantly (as we shall also
try to show throughout the manual), only a small part of statistical
training is concerned with data manipulation.  "Statistics"  is to a
large degree a way of thinking based upon principles general enough to
allow application to a wide variety of special circumstances.  As such,
statistical thinking can make contributions to every phase of experimentation;
sometimes the greatest contributions are made in the initial stages, say,
in helping to recognize and define the problem.
        We have emphasized that imaginative and timely analysis of data
requires that the consulting statistician have at least a surface
understanding of the field of knowledge pertaining to the experiment which
gives rise to the data, and within which the results are to be interpreted.
                                  vii

-------
A similar requirement also holds for the scientist or engineer who
wishes to make effective use of the services of a statistical consultant.
The best possible solution would be for one to be expert in his own
field and in statistics as well, but such a capability is possessed by
few individuals.  An alternative is for the statistician to understand
something of the field where his specialty is to be applied and for the
engineer to be acquainted with the principles upon which the statistical
treatment is based.
        There are two erroneous ideas that people sometimes have about
statistics that we hope to address throughout this manual.  The first is
that statistics is nothing more than a routine set of calculations to be
performed upon numbers.  Such a misconception is probably responsible
for much of the inappropriate application of statistics that one can see
today.  Many people, noting that many statistical techniques involve the
same standard calculations, get the idea that this is all there is to
statistics, and that the same thing is applicable regardless of the
situation.  Then, there are those who acquire the opposite idea about
statistics.  Frequently, people without a strong mathematical background
let themselves become  "snowed"  by the details of probability theory
and mathematical statistics, and sometimes even by the underlying
mathematics of applied statistics.  A common attitude in such cases is
that statistics is just so much mathematical  "wizardry."  It is not
difficult to understand the existence of such misconceptions about
statistics, because the fact is that statistical techniques do involve
some routine calculations and are based upon an underlying mathematical
structure that is sometimes detailed and difficult for someone not
grounded in mathematics to easily grasp.
                              viii

-------
        It will be our objective to present the reader with an overview
of statistics as it relates to the planning, analysis, and interpretation
of scientific experimentation.  We shall not dwell on convincing anyone
of a need to use statistics; the increasing evidence of use (and misuse)
of statistics in scientific and engineering literature attests to the
fact that engineers who shun statistical methods entirely are fast
becoming a minority.  Statistics has a valid case to make, but it has
been well presented elsewhere.  The manual is not intended to be a
text nor a complete exposition of statistical methodology, and no detailed
treatment of any of the subjects covered will be given.  There are many
very good references available on engineering and industrial statistics
in which one may find detailed information, and the creation of yet
another is beyond the scope of our limited objectives.  It is our belief
that it is possible for anyone trained in technical and scientific subject
areas to understand the basic principles upon which statistical methods
are based without having to wade through all the details, and the
presentation will be largely limited to fundamental concepts.  We shall
assume that the reader is relatively unfamiliar with statistics and has
neither the time nor the inclination to become expert in the field.  It
is anticipated that the manual will provide sufficient background so that
the project engineers to whom it is addressed will have the ability to:
          i.  Know what statistics can and cannot do.
         ii.  Recognize potential areas for the application of statistical
              methods.
        iii.  Aid in the direction and thrust of statistical analysis
              of data.
                                  ix

-------
         iv.  Share a greater role in the interpretation of results and
              the formulation of conclusions and recommendations.
          v.  Make efficient use of whatever statistical consulting
              expertise is available to them.
        For the reader who may be interested in delving further into
matters we may touch upon, a list of references is provided, many of
which are considered classics in their respective subject areas.
                                     x

-------
                          1.  INTRODUCTION






      The area of statistical theory that this manual will primarily
deal with is commonly referred to as inference.  If statistics, as
viewed from this angle, can be summarized in one statement, that
statement must surely be  "The role and commission of statistics is to
aid the researcher in drawing optimum inferences from his data in the
face of uncertainty."  Sounds very good, but....
      First,  "to aid"  means precisely that.  There is no method or
technique to replace the use of common sense.  Indeed, there are those
who would claim that application of statistical principles is merely
a structured application of common sense.  Statistics is not a mystical
collection of hocus-pocus to be applied to the data, suddenly making
every hidden secret become clear.  Statistical inference, as an entity
separate and distinct from scientific inference, does not exist.
Statistical methods and techniques are not things "better left to the
statisticians;" rather they are tools to be used in conjunction with
and consistent with the basic principles of scientific inference.
Robert Hooke in his monograph, Introduction to Scientific Inference,
p. 94, offers this observation,  "There is an old story about a flea
trainer who claimed that fleas hear with their legs.  As proof, he
taught some fleas to jump at his shout of 'Jump!'  After amputating the
fleas' legs and observing that they no longer responded to his shouts,
he rested his case.  Statistics, of course, is an aid to, not a substitute
for, intelligence.  The flea trainer could have made measurements of

-------
great precision and gathered pages of data, but these would not have
protected him from his faulty logic."
      The word "optimum" requires explanation and qualification.  To
one who has not thought carefully about the meaning, "optimum" often
carries the idea of the best, in all possible senses of being best.
In actuality, when the word is used in any precise way, it means best
only according to a specific criterion of goodness.  Thus, an optimal
procedure is optimal (in a real sense) only if the criterion is relevant.
      Many optimality criteria used in statistics are based upon
conceptual long-run properties.  It should be clearly understood that
these "in the long-run" or "on the average" properties cannot be imparted
to a single trial or single application of a statistical method.  It
may be that such long-run properties will not seem terribly compelling
to a researcher who has before him a single experiment, and he may wish
for something more.  But neither wishing nor cleverly devised words  can
change a long-run property into a short-run property.
      Finally, the phrase "in the face of uncertainty" must be clarified.
One of the basic premises of statistics is that observable phenomena are
subject to variation and, although some of the causes or sources of var-
iation can be accounted for, ultimately there remains variation due  to un-
known and unexplained sources.  Frequently the assumption is made that the
behavior of such variation can be described in terms of chance or random
occurrence.  That is, an exact deductive mathematical theory, the theory
of probability, is used to define a systematic structure, within which
variation can be placed and studied.  The use of mathematical models for
studying and understanding the physical world is not a new development by
any means, but the idea that unsystematic chaos, disorder, or chance hap-

-------
penings can also be understood in the context of a logical system is a
relatively new one.  The consequences of using probability models for
studying variability are far-reaching and, sometimes, almost astounding.
Many statistical techniques that have arisen as a result appear to be very
powerful, almost like  "getting something for nothing."  However, we
must remind ourselves that there is no way to "create knowledge" with
any statistical technique.  Assumptions are made in the absence of
definite knowledge and, no matter how reasonable, plausible, or compelling
the assumptions may be, the fact remains that derived methods are based
on and tied to those assumptions.
      It is reasonable to ask how statistical theory and methodology
may be utilized by engineers "to aid in drawing optimum inferences" from
their data.  Part of the answer comes with the realization that statistics
has something to contribute to every phase of experimentation; that, in
fact, the greater contribution may be in obtaining the data rather than
analyzing it after it has been gathered.  It may be useful to consider
alternate descriptions of the use of statistical methods in experimentation
which have been given by various engineering people who apply them:
      1.  With respect to experimental planning and data taking,
          (a)  Statistical methods are aids in planning orderly
               experimentation.
          (b)  Statistical methods help one to get the most information
               for the least amount of experimentation.
          (c)  Statistical methods help one to organize, categorize,
               and quantify his data.

-------
      2.  With respect to analysis of data,
          (a)  Statistical methods provide ways of condensing and
               summarizing data with the minimum loss of information.
          (b)  Statistical methods assist in determining what is
               apparently systematic and what is apparently random in
               a set of data.
          (c)  Statistical methods enable one to get a "complete
               picture"  of the way the relevant factors in an
               experiment are affecting the response of interest.
      3.  With respect to inferences and conclusions,
          (a)  Statistical methods permit one to determine how far
               his results can be safely generalized.
          (b)  Statistical methods give the user  "yardsticks"  by
               which the strength of his results may be measured.
          (c)  Statistical methods provide a quantification of the
               degree of uncertainty associated with estimates made
               from the data.
      The list is, of course, incomplete.  This manual, it is hoped,
will be useful in assisting engineers to avail themselves of the
contributions that statistical methods can make.

-------
          2.  POPULATIONS, VARIABILITY, UNCERTAINTY, AND SAMPLING

        The field of statistics began with and has traditionally been
considered as concerned with the gathering, organization, and analysis
of facts in order to determine the essential characteristics of interest
of some population under study without having to examine the population
in its entirety.  At first, the populations studied were largely existing
ones such as people or items of agricultural or industrial production,
and the main concerns were efficient sampling schemes and techniques
of analysis which reduced the risk of erroneous inferences about the
population when generalizing from the sample.  Let us note here an
evident fact, and that is that any study of a population of physical
objects necessarily involves some form of measurement of one or more
attributes of interest.  Thus, any given population of objects can give
rise to several number populations, depending upon what aspect of the
physical population we wish to study, and it is common usage to speak
of "the population" in reference to either the physical population
itself or to some number population derived from it by measurement.
(We include here measurements of the classification type, or so-called
qualitative measurements such as "present" or "not present," or such as
"belongs to Class 1, Class 2, ..., Class n," because these measurements
can be represented as numerical measurements.  For example, we can use
the correspondence, 0 = "not present" and 1 = "present."  Hence, we
shall think of all measurements of populations as being numerical
measurements.)
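
      By way of illustration, such a 0-1 coding is easily carried out;
the following minimal sketch (in Python, with purely hypothetical
observations) turns a population of classifications into a number
population:

          # Code the qualitative measurement "present"/"not present"
          # as 1/0, so classifications become numerical measurements.
          observations = ["present", "not present", "present", "present"]
          coded = [1 if obs == "present" else 0 for obs in observations]
          print(coded)                    # [1, 0, 1, 1]
          # The proportion "present" is then simply the mean of the codes.
          print(sum(coded) / len(coded))  # 0.75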

-------
      As the ideas of statistics continued to develop, it became apparent
that the principles used in the study of existent populations were also
applicable for studying conceptual populations, that is, populations which
did not exist already, but could be theoretically generated if necessary.
For example, we may conceive of the population generated by one million
tosses of a coin, the population generated by a large indefinite number
of rolls of a pair of dice, or the population of responses generated by
repeated measurement of some object or substance.  Conceptual populations
may be finite or infinite.  To carry the notion further, from a given
population, conceptual or otherwise, it is conceptually possible to
generate a derived population of samples (actually, several populations,
one for each fixed sample size) by repeatedly drawing samples from the
original (parent) population.  Conceptual populations generated by
repeated sampling are very important types of populations, and are the
kind with which a large portion of the discussion to follow will be
concerned.  From a statistical standpoint, in the study of any population,
there are actually two types of populations involved, the parent population
or the population of inference and one or more derived populations
generated by repeated sampling.  An important special case of this idea
is a population generated by conceptual repetitions of an experiment.
The study of the sampling techniques for populations to be studied by
experimentation comprises a portion of the subject matter of the area
of statistics known as experimental design.  That is, the performing
of an experiment can be thought of as taking a sample from some population,
and the theory of experimental design is concerned, not only with the
method of taking the sample and the population of inference (the experiment
itself), but also with the population generated by repeated sampling
(conceptual repetitions of the experiment).

-------
      Now, let us consider some of the concepts involved in the study of
populations.  First, there is the idea of variability, that is, that all
members of the population do not give the same measured numerical response
with respect to the attribute of interest.  If there is no variation,
there is no need of any statistical techniques, for the population will
be known completely as soon as we pick some member and observe it.  We
shall take as a starting point, therefore, that the reason why any
population requires more than examining just one of its members is that
the population is subject to variation.
      Assuming a population of interest does possess variability, it
naturally follows that any conclusions about the population as a whole,
based on observing a subset of the population, must be subject to
uncertainty.  Can the uncertainty be reduced?  Yes, in fact it can be
eliminated entirely if we can examine the entire population, but this is,
of course, impractical in all but a few cases.  The question arises as to
whether it is possible to find a satisfactory measure of uncertainty in
order to quantify the concept and give it more definite meaning.
Consider, for a moment, some intuitive aspects of uncertainty:  (1) the
larger the subset of the population it is possible to examine, the less
uncertain will be our conclusions;  (2) the more variable the population
is, the more uncertain our inferences must be; and (3) the more that we
already know about the population, the less uncertain we are.  Any
measure of uncertainty should somehow embody these considerations.
      In many cases, the reason for sampling can be taken to be the
estimation of some quantity in the population, and this usually involves
calculating some estimate from a sample.  Focusing our attention on
this sample estimate, we see that the derived population of samples

-------
gives rise to yet another population--the population of different values
that the sample estimate may assume from one sample to the next.  Due to
variability in the parent population, the population of values of the
sample estimate will also exhibit variability.  Moreover, the variation
of a sample estimate is generally governed by:  (1) as the sample size
increases, variation decreases;  (2) the more variable the parent
population, the greater the variation; and (3) with knowledge of the parent
population, the sampling procedure can be modified in order to reduce
variability.  Thus, in terms of estimation, uncertainty may be measured
in terms of the variability of the sample estimate over repeated
sampling.  A somewhat startling result, which we will later discuss in
more detail, is that when sampling from a parent population is done in
a certain way, the variability of a sample estimate can itself be
estimated from the same sample.  That means that, from a single sample,
it is possible to get both an estimate of some population characteristic
and a measure of the uncertainty of that estimate.  Thus, although
uncertainty can sometimes be reduced and sometimes not, it can and should
be measured.
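
      To make the point concrete, consider a brief sketch in Python (the
data are hypothetical):  the sample mean estimates the population mean,
and the standard error, computed from the same sample, estimates the
variability of the sample mean over conceptual repeated sampling.

          import math

          # A single random sample from some parent population.
          sample = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9]
          n = len(sample)

          # Estimate of the population mean.
          mean = sum(sample) / n

          # The sample variance (divisor n - 1) estimates the parent
          # variability; the standard error estimates the variability
          # of the sample mean itself over repeated sampling.
          var = sum((x - mean) ** 2 for x in sample) / (n - 1)
          std_error = math.sqrt(var / n)

          print(mean, std_error)
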
      It is a widely held belief among some people that the uncertainty
of inference can somehow be negated by insuring that a  "representative"
sample is used.  Representative sampling, or stratified sampling, is a
valuable method of sampling, but its use depends upon having some prior
knowledge of the population.  Often, however, the term is incorrectly
used to refer to sampling when nothing is known about the population.
It would be very nice indeed if we knew that the sample being examined
"represented" the population of interest, for then everything we
observe in the sample would also be valid for the population.  The
"clinker" in this logic is discovered when one considers the following
question, "How is it determined that the sample represents the population?"
That is, if we know that the sample is representative with respect to the
attributes we desire to pinpoint, then we already know about them in the
population, and taking a sample is a waste of time and effort.
      Let us try to determine exactly what is meant, or should be meant,
by the term "representative sample" for the cases where no prior
knowledge of the population is assumed.  When one says "representative,"
he probably has in mind that the sample is "like" the population in the
most important respects.  That is, by the very use of the term "population,"
one has implicitly delineated the properties which define the population.
He knows whether something is or is not a member of the population by
whether or not it possesses those properties.  Thus, we see that the
question, "Is this a representative sample?" is properly translated,
"Did this sample come from the population we wish to characterize?"
This is an important consideration, and is often a factor in erroneous
inferences.  Generalization from the sample to the population is a risky
business at best, but it can be suicidal when the population one actually
samples is only a subset of the population he purports to make conclusions
about.  Thus, we arrive at one of the fundamental rules of inferential
statistics, "The legitimate population of inference is that population
actually sampled."  One should constantly ask himself the question,
"What is the valid population of inference this sample corresponds to?"
A reasonable approach to the problem is to first clearly define the
population to be studied, and to then make certain that the sampling
scheme used allows each member of the population at least a chance to
come into the sample.

-------
                                                                   10
      We should hasten to add, however, that there are times when
generalization beyond the legitimate population of inference is apparently
unavoidable.  We are often forced to come to conclusions or make decisions
based on scant data, and we do this by convincing ourselves that the
population not sampled is "not very different" from the one sampled.  It
is a well known fact, for example, that opinion surveys are necessarily
restricted to sampling the population of cooperative people who will
respond to them, but this does not prevent fairly accurate predictions
from being made the majority of the time.
      In the case of conceptual populations of repetitions of experiments,
the relation between the legitimate population of inference and the
experimental design should be clearly understood.  To a certain extent,
the desired population of inference determines the experimental design,
but not completely.  However, the experimental design does completely
determine the legitimate population of inference.  In other words, the
design dictates the manner in which the experiment would be repeated,
and this, in turn, determines the conceptual population of repetitions.
It can be very disconcerting to discover, after an experiment has been
conducted, that the population of inference in which the experiment has
valid interpretation is far more restricted than originally intended,
but it is a frequent consequence of poorly designed experiments.  To
decrease the likelihood of this happening, one should first get a clear
idea of the population to which inferences are to be drawn, and then
he should enlist the aid of a competent statistician in designing the
experiment so that that population is properly sampled.
      We have discussed how one aspect of the problem of getting "good"
samples is that of taking the sample from the correct population.

-------
                                                                    11
Suppose one has clearly defined the population of interest and realizes
that his method of sampling should permit each member of the population
a chance to be chosen.  (Recall that by "population" and "member,"
we are including experiments and conceptual repetitions thereof.)  What,
then, should be the method of taking the sample?  The approach, which
from the standpoint of valid inference is essential,  is one called
random sampling.  Simply stated, random sampling is the use of a chance
mechanism which chooses a sample in such a way that every member of the
population has a given probability of being chosen, and a random sample
is a sample which has been obtained by such a method.  It is a common
practice to use simple random sampling, whereby every member of the
population is given an equal chance to come into the sample.  For
experiments and their associated populations of repetitions, obtaining
random samples is generally more involved.  Here, random sampling is
accomplished with a technique referred to in the theory of experimental
design as randomization.  Any given experimental design has an associated
randomization scheme; in fact, the randomization scheme determines the
design and thereby the population of inference.  Thus, for experimentation,
getting a proper sample from the desired population of inference is a
matter of proper randomization.
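
      In practice, the chance mechanism is a table of random numbers or,
nowadays, a computer's pseudo-random number generator.  A minimal sketch
of simple random sampling in Python (the population labels and the sample
size are illustrative only):

          import random

          # Enumerate the population, then let chance alone pick the
          # sample; random.sample gives every member an equal chance.
          population = list(range(1, 101))    # members labeled 1..100
          sample = random.sample(population, 10)
          print(sample)
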
      What, then, is the advantage of random sampling?  We discussed
before the idea of variability of a population, and the notion will be
further developed in the next section.  For the moment, however, let us
agree that, in situations most frequently encountered, different members
of the population yield varying measurements and that we may not know
exactly what causes  some to be high and others  low.  When the sample is
selected systematically according to some criterion, we run the risk of

-------
                                                                     12

 selecting  only the high ones or the low ones, for the criterion we use
 may very well  be  characteristic of high responses only or of low responses
 only.   Systematic sampling has another drawback also, and it is that the
 samples drawn  in  this way often tend to be more homogeneous.  This, then,
 can have the effect of causing the variability in the population of
 inference  to be grossly underestimated which, in turn, leads us to believe
 our estimates  to  be more precise than they really are.  But, when we
 use random sampling, chance alone dictates whether or not a particular
 member  comes into the sample.  Note that the use of such an approach
 does not preclude obtaining  "unlucky"  samples (e.g., all high yields);
 rather,  it allows such events to occur with small probability.
        This brings us to a very important property of random sampling or
 simple  random  sampling, and that is the long-run nature of the procedure.
 To  put  it  plainly, there simply is not a technique which will guarantee
 that a  "good"  sample will be drawn every time.   For the single sample
 or  experiment, random sampling cannot make the results more or less valid;
 it  is only when we place the sample or experiment in the context of being
 one  of  a population of conceptual repetitions that random sampling is of
 any value.
        Simple random sampling is applicable for  the case when nothing
 is  known about the structure of the population being sampled.   Often,
however, one does know something about the population and, in such cases,
that knowledge can be taken advantage of by the use of restricted random
sampling;  i.e., sampling which is partly systematic and partly random.  For
example, we may know from past experience that in measuring some response
 in a population of people,  a certain group tends  to give high responses,
another group moderate responses,  and the remaining group low responses.

-------
                                                                     13
Knowing this, we would not want a sample containing only members of any
one group, for then we would get a distorted estimate of the response of
interest.  For such a case, we might require that any sample chosen
contain certain proportions of observations from each group.  For example,
if the sample size is  n  with  n1, n2, n3  to be taken from the three
groups, respectively, we may choose  n1  members from the first group at
random,  n2  members at random from the second group, and  n3  members at
random from the last group.  Many similar and more complex situations are
possible, and the general rule is that any prior knowledge of response
patterns in the population should be taken advantage of by means of
restricted random sampling.  This is especially true for experiments,
where restricted randomization is almost always used.
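
      A sketch of such restricted (stratified) random sampling in Python;
the three groups, their sizes, and the quotas  n1 = 5, n2 = 3, n3 = 2
are all hypothetical:

          import random

          # Three groups known (say, from prior experience) to respond
          # high, moderate, and low; each is sampled at random, but in
          # fixed numbers n1, n2, n3.
          groups = {
              "high":     ["H%d" % i for i in range(1, 31)],
              "moderate": ["M%d" % i for i in range(1, 51)],
              "low":      ["L%d" % i for i in range(1, 21)],
          }
          quotas = {"high": 5, "moderate": 3, "low": 2}

          sample = []
          for name, members in groups.items():
              sample.extend(random.sample(members, quotas[name]))
          print(sample)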



        Even though restrictions may be placed on sampling, in order to
provide a basis for valid statistical inference, there must remain an
element of randomness.  The rule is that the sampling procedure must
determine for each member of the parent population the probability that
it will be chosen in the sample and must be conducted in such a way that
population members will be chosen according to that probability.  Thus
population members may be chosen with equal probability (simple random
sampling), or they may be chosen with unequal probabilities (restricted
random sampling), but they must be chosen according to some probability;
chance must govern the selection process.

-------
                                                                     14
                    3.  VARIABILITY AND RANDOM ERROR

       We have mentioned variability as a factor one must consider in
the study of populations, and it should be emphasized again that one of
the major problems that one is faced with in such an endeavor is how to
draw reasonable conclusions about the population by observing only a few
members, when different members give varying measurements.  Let us ex-
amine the matter of variability somewhat more closely.  To simplify
matters, let us focus on conceptual populations of repetitions of ex-
periments.  Here we know, for example, that even under conditions as
identical as we can make them, two or more experiments do not yield the
exact same response.  What causes this variation in results?  There are
at least two ways to view the matter.  The deterministic point of view
is that if all the circumstances surrounding an event were known, then
the result could be predicted with exactitude, or that if time could be
turned back and exactly the same set of conditions repeated, the exact
same result would be obtained.  In other words, the failure of two or
more experiments to give the same result is said to be due to an inability
to match the set of conditions.  The probabilistic approach, on the other
hand, is that even if it were possible to obtain identical conditions,
the outcome would still be subject to variation because of an underlying
chance process.  Any distinction between the two philosophies is unim-
portant to the scientist, however, because from a practical standpoint,
they both lead to the same place.  Since it is, in fact, humanly imposs-
ible to identify all of the sources of variation affecting an event, one

-------
                                                                       15
 simply recognizes and isolates a few of the important sources of varia-
 tion, calling these the "set of conditions,"  and the remaining variability
 is said to be due to random error.   Thus,  regardless of whether random
 error is perceived to be an inherent quality in nature, or the result
 of the combined (and chaotic)  effect of undetermined conditions, or
 even a combination of both of these, the end result is the same.  That
 is, in both cases, we postulate the existence of random error and
 attribute unexplained variation to  its effect.   It is immediately seen,
 however, that unless we are willing to also assume something about the
 behavior of random error,  we can make very little use of the presumption
 of its existence.   We therefore go  another step further and assume that
 the behavior of random error is governed by the laws of probability.
 This latter assumption is  a significant one and is one of the fundamental
 reasons why statistics has had such an impact upon almost every area  of
 scientific experimentation.   It appears that, almost without exception,
 one can do a better job of describing the  physical world with models which
 have a probabilistic component to them rather than a mathematical
 (deterministic)  component  alone,  in some cases  even to the exclusion  of
 a mathematical  component altogether.
        Let us now  consider for a  moment a  few of  the unfortunate conse-
 quences which can  sometimes  result when people  are unwilling to  go  any
 further than to  grudgingly admit  that  their experimental  results may be
 subject to  "some"  uncontrolled variation.  A common attitude in  this
 instance is that any acknowledgement of "error" is  a  reflection upon
 technique, and suitable refinement of  technique will  serve to make the
 error  "negligible,"  There are several potential pitfalls which can be
precipitated by such an attitude.  One  is that the error may not be

-------
                                                                     16

 negligible at all.  Recall that, in most cases, experimentation is done
 in order to try to determine what happens "in general," or what will
 happen if the experiment is repeated at some future date under the same
 general conditions, and the error which we should be thinking of is the
 variation in results which should be expected when the experiment is
 repeated anywhere within the restrictions which define the population
 of inference (experimental error).   All too often, when someone claims
 that the error is negligible, he has in mind the variation observed when
 multiple measurements are taken (sampling error).   Almost without ex-
 ception, the magnitude of the error is  smaller in the latter instance.
 Thus,  it is  possible for sampling error to be negligible while experi-
 mental error is  considerable.   Another  slightly different problem which
 can arise, whenever one tries to eliminate error by refinement of experi-
 mental technique, is that an experiment can often be rendered worthless
 by over-restriction.   In other words, the conditions can be  made  so
 specialized  so as  to preclude  generalization to the  broader  class  of
 conditions of real  interest.   In addition,  it sometimes  happens that
 entirely too much  time  and money is wasted by trying to  reduce the
 error  further than  it really needs to be.   One  of  the  lessons  the
 statistical approach has  taught  us is that  it is possible  to make  sense
 of results in spite of  the presence of  error, provided it  does not  cloud
 the issue too greatly.  Thus, one should be concerned with the magnitude
 of the error in relation  to the magnitude of  the response  he is trying
 to study, rather than with the absolute magnitude of the error alone.
       The most satisfactory way to avoid the above complications, as
well as others, seems to be the use of a probability model to represent
what we believe to be the behavior of random error.  In adopting this

-------
                                                                    17
approach, it is necessary to make an educated guess as to the probability
law (usually formulated in rather general terms) which most nearly fits
the situation.  There are many alternatives to choose from, and the study
of them comprises some of the subject matter of mathematical statistics
and distribution theory.  Some of the concepts associated with probability
theory in general, as well as some commonly encountered distributions,
will be discussed later.  The question which naturally arises, of course,
is, "How does one make a reasonable choice with respect to the probabil-
ity law which is supposed to apply in a given situation?"  It should be
obvious that we can never really know for sure whether the model chosen
is the correct one, but even so, it is still possible to take a
scientific approach to the matter.  That is, on the basis of the under-
lying theory, the data itself, intuition, and good judgement, the prob-
ability model is chosen.  The choice is never considered irrevocable,
however, and a particular model is kept only so long as it is not incon-
sistent with observations.  By way of clarification, we should point out
that in the large majority of cases, one is not required to go through
the process of choosing a model.  Most of the time, past experience by
previous researchers has indicated the most appropriate model, and the
only responsibility one must bear is to verify that his data and the
model are not grossly inconsistent.  One will discover that a great many
of the statistical methods and techniques employed are based on the
assumption of one particular probability model, the normal distribution.
There is much justification for assuming a normal distribution for random
error.
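
      A sketch of this point of view in Python (all numbers are
hypothetical):  each observation is modeled as a true response plus a
normally distributed random error, so that conceptual repetitions under
identical conditions still vary.

          import random

          TRUE_RESPONSE = 50.0    # the systematic part of the model
          ERROR_SD = 2.0          # spread of the random-error component

          # Conceptual repetitions of the experiment: identical
          # conditions, yet varying results, the variation being
          # attributed to random error.
          results = [TRUE_RESPONSE + random.gauss(0.0, ERROR_SD)
                     for _ in range(5)]
          print(results)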



       The point, then, to be derived from the preceding discussion for
those who wish to apply statistical methods in experimentation, but do

-------
                                                                    18
not want to become entangled in the underlying theory, is the following:
uncontrolled or unexplained variation is a problem in experimentation
which cannot be ignored if useful and meaningful results are among the
primary goals of the experiment.  The most satisfactory approach to the
problem found to date is to attribute such variation to random error and
to employ some probability model for a structure within which the effects
of random error upon the experimental results can be assessed.  Although
it is usually unnecessary to become overly involved with the underlying
model, it is a recommended practice to verify that required assumptions
are not blatantly violated and to make a few quick checks to see that the
data itself does not give evidence of non-conformity.
       As a final note, in answer to those who might question the correct-
ness or appropriateness of using the abstractions of probability and
probability models to study physical phenomena, we must reply that, while
such objections are valid, they do not apply only to the present instance
but to every case where the behavior of the observable physical world
is explained, idealized, and simplified in terms of an abstract model.
Thus, in order to pass judgement in this situation, we should use the
criteria which have been applied in the past to similar situations, and
we must therefore consider the use of probability models as the latest
(not necessarily the last) step in an evolutionary process.  In that con-
text, it may be said without equivocation that the use of probability and
probability models has represented a giant step forward in approaches to
data collection, data analysis, and data interpretation, and this, in
turn, has had a significant impact in a great many areas of scientific
endeavor.

-------
                                                                        19

       4.   BASIC  CONCEPTS OF PROBABILITY AND MATHEMATICAL STATISTICS

        To understand statistical methods, one should know something of
 the mathematical concepts behind them.  In this section, we shall take
 a brief look at a few of the basic ideas from mathematical statistics
 and in the next section, we shall show how they can be applied to some
 simple problems of inference.  For those who are familiar with elementary
 probability theory and mathematical statistics, this section may be
 either skipped entirely or used for purposes of quick review.

 4.1  Random Experiments and Probability
        For the purposes of this section, by  a random experiment we shall
 mean any observable phenomenon, which can  be repeatedly observed,  with
 varying results  attributable  to chance.  We  call the results  of a
 random experiment  its outcomes  and  the set of all possible outcomes,
 we call the  sample  space of the random experiment.   When outcomes  are
 combined into  sets, we  call these combinations of outcomes events.
 Suppose we are interested  in  a  particular  event.  If the outcome of the
 random  experiment is contained  in this event, then we say that  the event
 has occurred.

 Example 1:  Let a random experiment consist of tossing a pair of dice,
one white and one black.  The outcomes have to do with the numbers showing
on the top faces of the dice.  Let us denote the outcomes by writing
 (i,j), where  i  is the number on the white die and  j  is the number
on the black die.  For this random experiment, the sample space consists
of 36 possible outcomes, which we may identify as:

-------
                                                                     20
                  (1, 1)    (2, 1)    . . .    (6, 1)
                  (1, 2)    (2, 2)    . . .    (6, 2)
                    .         .                  .
                    .         .                  .
                    .         .                  .
                  (1, 6)    (2, 6)    . . .    (6, 6)

Let A be the event that the sum of  i  and  j  is 7.  Then, we can write:
A = {(1, 6), (2,  5),  (3, 4),  (4, 3), (5, 2),  (6, 1)},  so  A  is made up
of 6 outcomes, and  A occurs whenever a toss of the dice results in any
one of the 6 outcomes.
     Since a random experiment has varying outcomes, any event (except the
event containing  all possible outcomes and the event  containing no
outcomes) will sometimes occur and will sometimes not occur.  In order to
somehow describe what is to be expected with regard to the occurrence
of an event, we use the concept of probability and speak of the probability
of occurrence of  the  event.  For the case above, it seems reasonable to
think as follows:  If the dice are evenly balanced (not "loaded"), then
on a single die,  any one face should come up as often as another.  Since
the die has six faces, each face should come up  1/6  of the time.  For
the pair of dice, the reasoning is similar.  There are 36 "faces" possible,
each equally likely, so that any one "face" will come up  1/36  of the
time.   The proportion of the time that the event  A, described above, will
occur is  6/36  or  1/6, because  A  contains 6 of the 36 equally likely
outcomes.
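
      This reasoning is easily checked by enumerating the sample space;
a minimal sketch in Python:

          # All 36 equally likely outcomes (i, j) of the two dice.
          space = [(i, j) for i in range(1, 7) for j in range(1, 7)]

          # Event A: the sum of i and j is 7.
          A = [(i, j) for (i, j) in space if i + j == 7]

          # With equally likely outcomes,
          # P(A) = (outcomes in A) / (all outcomes).
          print(len(A) / len(space))    # 6/36 = 0.1666...
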
        One should note that the reasoning process used above does not
define probability; it merely describes an interpretation of it for the
problem at hand.   Unfortunately,  here,  as in many cases, it is possible to

-------
                                                                       21
define the concept in exact mathematical terms, and yet have no guidance
as to how it is to be used or interpreted in practice.   In pure
mathematics, it is permissible (and necessary) to define a concept, say
C, by stating what essential properties it must possess, without going
into a philosophical discussion of the type, "What is C?".  Thus, while
everyone agrees on the mathematical properties that probability must
have, there is some disagreement as to how the question, "What is prob-
ability?", should be answered.  More on that later, but now let us look
at some properties of probability.
       Before going any further, we must understand events a little
better.  First, we must recognize that the whole sample space itself can
be considered an event; call this event S.  We must also allow impossible
occurrences to be regarded  as events.  If the event B contains no outcomes
of S, then B will be said to be empty, and we write  B = ∅.  Finally, we
must say how events may be  combined.  There are two operations which are
needed, and they will be denoted by  ∩ , read  "and" or "intersection,"
and  ∪ , read "or" or "union".  Let  A  and  B  be two events.  Then by
the event  A ∩ B, it is meant those outcomes, and only those outcomes,
which are common to both  A  and  B.  By the event  A ∪ B,  it is meant
all outcomes in  A  and all outcomes in  B.  In addition, there is nega-
tion or complementation;  for example, the event  "A  and not B"  means
those outcomes in  A  but not in  B.  The event "not  A"  is taken to
mean "S  and not  A".
Example 2:  For the dice throwing experiment,  let  A  be  the event that
the sum of the values on the  faces is  7, let  B  be the event that a  1
is on the white die, and let  C  be the event of getting  a double.  Then

-------
                                                                     22
          A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
          B = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)}
          C = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}
and
          A ∩ B = {(1, 6)}
          A ∪ B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1),
                      (1, 1), (1, 2), (1, 3), (1, 4), (1, 5)}
          A ∩ C = ∅
          A  and not  B = {(2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
       If the events  A  and  B  have no outcomes in common  (i.e.,
A ∩ B = ∅), then  A  and  B  are said to be mutually exclusive.  Since
P(A ∩ B)  will be zero in this case, we sometimes say that "If  A  and  B
are mutually exclusive events, then they cannot occur together."
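
      Since events are simply sets of outcomes, the operations of
Example 2 can be mirrored directly with sets in Python; a brief
illustrative sketch:

          space = {(i, j) for i in range(1, 7) for j in range(1, 7)}
          A = {(i, j) for (i, j) in space if i + j == 7}  # sum is 7
          B = {(i, j) for (i, j) in space if i == 1}      # 1 on white die
          C = {(i, j) for (i, j) in space if i == j}      # doubles

          print(A & B)    # intersection: {(1, 6)}
          print(A | B)    # union: the 11 outcomes listed above
          print(A & C)    # empty: A and C are mutually exclusive
          print(A - B)    # "A and not B"
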
       We may now state some of the mathematical properties  of  probability.
In what follows, we shall use P(A)  to mean, "the probability that the
event  A  occurs."  Let  S  be the sample  space of  outcomes  of  a random
experiment, then probability is a function  P  defined for all  the events
of  S, with the following properties:
        1.  P  ranges in value from  0  to  1  (0   means impossible,  1
            means surety).
        2.  P(S) = 1.
        3.  For any event  A, P(not A) = 1 - P(A).
        4.  For events  A1, A2, A3, ..., which are pairwise mutually ex-
            clusive, P(A1 ∪ A2 ∪ A3 ∪ ...) = P(A1) + P(A2) + P(A3) + ...
       Actually, probability is something almost everyone has a good in-
tuitive feel for, and there  is no great problem in bridging the gap

-------
                                                                    23
between mathematical definition and practical interpretation.  The most
widely accepted method of interpretation used to arrive at a satisfactory
probability assignment, consistent with the properties above, is the
frequency interpretation of probability.  Using the frequency interpreta-
tion, the probability of occurrence of an event is taken to be the
proportion of the time that the event occurs.  It is now evident that
this method of assignment of probabilities was the one used to obtain
the probabilities for the dice-throwing experiment.
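
      The frequency interpretation itself can be illustrated by
simulation; a sketch in Python that "throws" the dice a large number of
times (the number of throws is arbitrary) and records the proportion of
throws on which the event  A  of Example 1 occurs:

          import random

          throws = 100000
          occurrences = sum(
              1 for _ in range(throws)
              if random.randint(1, 6) + random.randint(1, 6) == 7)

          # The observed proportion settles near 1/6 = 0.1666...
          print(occurrences / throws)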




       Some other features of probability should be introduced at this
time, one being the concept of independence of events.  Let  A  and  B
be two events, necessarily from the same sample space.  Then  A  and  B
are said to be independent if and only if  P(A ∩ B) = P(A) · P(B).  In
words, "A  and  B  are independent whenever the probability that  A  and
B  occur together is equal to the product of their respective probabil-
ities."  As a general rule, we may classify events as independent whenever
the occurrence of one event does not affect the occurrence of the other.
       Closely associated with the concept of independence is that of
conditional probability.  Consider the following situation:  from an
ordinary deck of 52 cards (completely shuffled), what would be the prob-
ability that the first card dealt was the queen of hearts?  The answer,
of course, is 1/52.  Now, suppose someone looked at the card and told us
that it was indeed a heart.  What odds would you now place on the card
being the queen of hearts?  Of course, the odds would not still be 52
to 1, because 3/4 of the possibilities have been eliminated.  One would
surely reason that, since there are 13 hearts, the probability of it
being the queen is 1/13.
       Let  A  and  B  be two events.  Then the conditional probability

-------
                                                                       24


that  B  occurs, given that  A  has occurred, written  P(B|A), is:

                                 P(A ∩ B)
                     P(B|A)  =  ----------
                                   P(A)

For the cards, let  A  be the event of getting a heart, and let  B  be
the event of getting the queen of hearts.  Then

                                 P(A ∩ B)     1/52      1
                     P(B|A)  =  ----------  =  ------  =  ----
                                   P(A)        1/4       13


       The reason that we can characterize independence in terms of occur-
rence of events not affecting each other can be easily seen.  If  A  and
B  are independent, then

                    P(A ∩ B)     P(A) · P(B)
         P(B|A) =  ---------- =  ----------- =  P(B)
                      P(A)          P(A)

and

                    P(A ∩ B)     P(A) · P(B)
         P(A|B) =  ---------- =  ----------- =  P(A).
                      P(B)          P(B)
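
      The card calculation can be verified by enumeration; a sketch in
Python (the representation of the deck is illustrative):

          # The 52 cards as (rank, suit) pairs.
          ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9",
                   "10", "J", "Q", "K"]
          suits = ["hearts", "diamonds", "clubs", "spades"]
          deck = [(r, s) for r in ranks for s in suits]

          A = [c for c in deck if c[1] == "hearts"]        # a heart
          B = [c for c in deck if c == ("Q", "hearts")]    # queen of hearts
          A_and_B = [c for c in deck if c in A and c in B]

          p_A = len(A) / len(deck)                  # 13/52 = 1/4
          p_B_given_A = (len(A_and_B) / len(deck)) / p_A
          print(p_B_given_A)                        # 1/13 = 0.0769...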


       The concepts of independence and conditionality have far-reaching

consequences in statistical theory, and one who is interested in delving

further should consult a textbook in mathematical statistics.


4.2  Random Variables and Distributions


       In order to put all random experiments on a common footing and to

simplify studying them, it is useful to transform the outcomes of random

experiments into numbers, and to think of the outcomes as spread along

the number line.  A function which transforms the outcomes of a random

experiment into numerical results is called a random variable.  For a

-------
                                                                      25





given random experiment, there are usually several alternative ways to




effect such a transformation, so that several different random variables



can be associated with the same random experiment.  Generally, however,



we have in mind a particular random variable for a given random experi-



ment.  Random variables are commonly denoted with capital letters, X, Y,



etc.




Example 3:  For the dice-throwing experiment, let the random variable  X



be the sum of the numbers on the top faces.  Then the transformation of



the outcomes to numbers is:





   Outcomes                                    Value of X   Probability of Value

   (1, 1)                                           2             1/36
   (1, 2), (2, 1)                                   3             2/36
   (1, 3), (2, 2), (3, 1)                           4             3/36
   (1, 4), (2, 3), (3, 2), (4, 1)                   5             4/36
   (1, 5), (2, 4), (3, 3), (4, 2), (5, 1)           6             5/36
   (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)   7             6/36
   (2, 6), (3, 5), (4, 4), (5, 3), (6, 2)           8             5/36
   (3, 6), (4, 5), (5, 4), (6, 3)                   9             4/36
   (4, 6), (5, 5), (6, 4)                          10             3/36
   (5, 6), (6, 5)                                  11             2/36
   (6, 6)                                          12             1/36
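
       The transformation in Example 3 can be tabulated mechanically; here
is a minimal Python sketch (added for illustration) that enumerates the 36
equally likely outcomes and accumulates the probability of each value of X:

           from collections import Counter
           from fractions import Fraction

           # Count how many of the 36 outcomes map to each value of
           # X = sum of the top faces.
           counts = Counter(d1 + d2 for d1 in range(1, 7)
                                    for d2 in range(1, 7))

           for value in sorted(counts):
               print(value, Fraction(counts[value], 36))  # e.g., X = 7 -> 1/6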





       In effect, a new sample space is created, where outcomes are de-



scribed in terms of the random variable  X.  Thus, we speak of  P(X = a),



"the probability that  X  has the value  a" or  P(a < X  <  b), "the



probability that  X  is greater than  a  and less than or equa] to  b."



       For any random variable  X, once it is specified what values it



is possible for  X  to have and the probabilities that these values occur


are given, then the behavior of  X  is completely determined.  In theory,
there are two types of random variables, discrete and continuous, and the
type is determined by the nature of the specifications above.  A function
which performs these specifications for a random variable is called the
probability function:  it is commonly called a probability mass func-
tion (pmf) when it describes a discrete random variable, and a probability
density function (pdf) when it describes a continuous random variable.  To
understand the concepts associated with random variables, it is useful to
think of the real number line as a rigid one-dimensional mechanical sys-
tem with mass distributed along its length.  If the entire system is
considered to have unit mass, then the analogy between mass and probabil-
ity is complete.  The discrete case can be thought of as one where a
finite or, at most, a countable number of discrete points in the system
have non-zero mass, and the continuous case can be thought of as one
where the mass is spread continuously along the line, the
distribution of mass being described by means of a density function.
       The dice-throwing example illustrates a discrete random variable
in mathematical terms.  Let  p  be a function which is non-zero for at
most a countable number of points on the real line such that:

                        1.  p(x) ≥ 0  for all  x.

                        2.  Σ p(x) = 1, the sum taken over all  x.
                            x

Then  p  can be considered to be a probability mass function for some
random variable  X, and any random variable which has such a probability
mass function is said to be a discrete random variable.  If  p(x)  is the
probability mass function for the random variable  X, then for every
x, p(x) = P(X = x); that is, p(x)  is the probability that  X  takes the

 value   x.   In terms of the mechanical  system  it  is seen that  p(x)



 specifies  what points  are to  receive non-zero mass and apportions the



 mass to those points.



        As  a general rule, a discrete random variable will result from



 experiments where a counting  process is used.





 Example 4:  Consider an experiment where  there are only two possible out-



 comes,  zero or one, and suppose  P(zero)  = q, P(one) = p, with  p + q = 1.



 Let the experiment be repeated  5  independent times.  If we consider a



 random  experiment to consist  of  5  repetitions of the simple zero-one



 experiment, the outcomes can  be represented as a string of zeros and



 ones, 5 digits long.   Let  X  be the number of ones in each outcome of



 digit strings; then  X is a  discrete  random variable which can have the



 values  0, 1, 2, 3, 4, 5.  The probability mass function for  X  may be



 determined  as  follows:  For   x = 0, 1, 2, 3, 4, or 5, we must count



 those outcomes having  x  ones and  5 - x  zeros, since we can determine

 that the probability of obtaining any single such outcome is  p^x q^(5-x).



 By the  use  of  permutation and combination theory, it is determined that



 for any given  x, there will be  (5!)/[x!(5-x)!]  outcomes having exactly

 x  ones and  5-x  zeros.  For economy of notation, let  C(n,x) = (n!)/[x!(n-x)!].


 Then, we have determined that





         p(x) = P(X = x) = C(5,x) p^x q^(5-x)     for  x = 0, 1, ..., 5

                         = 0                      for all other  x.




 Is  p(x)  a proper probability mass function?  Well, we certainly have



p(x) > 0  for all  x.   The only question, then, is whether the "system"



has unit "mass."  But,

        Σ p(x) = q^5 + 5pq^4 + 10p²q³ + 10p³q² + 5p^4q + p^5
        x
               = (q + p)^5

               = 1,  since  p + q = 1.
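
This identity is easy to spot-check numerically for any  p; a brief
sketch (illustrative only):

           from math import comb

           # Verify that the pmf of Example 4 sums to 1 for an
           # arbitrary p in (0, 1).
           p = 0.3
           q = 1.0 - p
           print(sum(comb(5, x) * p**x * q**(5 - x)
                     for x in range(6)))   # 1.0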





This example illustrates a special and important type of discrete random



variable, the binomial random variable.  Some other important discrete



random variables will be discussed later.




       Whenever outcomes of random experiments are measurements, as



opposed to counts, it is convenient to idealize the situation by using



continuous random variables.  Strictly speaking, a continuous random



variable must be considered a mathematical approximation, because the



set of realized outcomes of any physical measuring process must be a



discrete set.  Nevertheless, it is useful to suppose that there are ran-



dom experiments which yield a continuum of outcomes, if they could only

be measured as such.  In mathematical terms:  Let  f  be a function de-

fined for all real  x, which is continuous except for at most a countable



number of points such that





                       1.  f(x) ≥ 0    for all  x.

                            ∞
                       2.   ∫ f(x)dx = 1.
                           -∞

Then  f(x)  can be considered to be a probability density function for



some random variable  X, and any random variable which has such a prob-



ability density function is said to be a continuous random variable.  If



f(x)  is the probability density function for some random variable X, then


                                                 b
for any  a  and  b  with  a < b, P(a < X < b) =  ∫ f(x)dx.  In terms of
                                                 a
the mechanical system, a probability density function is completely

analogous to a mass density function, and the mass of any section  (a, b)

is determined as the area under the density curve from  a  to  b.


Example 5:  Consider a process for which occurrences of some event are

anticipated, and assume that frequency of occurrence is governed by the

following:  (1)  there is some positive number  r  such that if a short

enough interval is taken, say  h, the probability of exactly one occurrence

in the interval  h  is  rh;  (2)  the probability of more than one

occurrence in the interval  h  is practically 0; and (3)  the occurrence

in any one interval of length  h  does not affect occurrence in some

other non-overlapping interval of length  h.  Such a process is called

a Poisson process with parameter  r, and is highly useful in applications

involving queueing theory (customer arrivals in a service line, incoming

telephone calls to a central operator, flaws along a length of wire,

failures in transistors, etc.).  It applies equally well to time intervals

and to distance intervals.  Observing such a process can give rise to

both discrete and continuous random variables, depending upon what aspect

of the process is observed.   Assume that the interval measure is time.

(a)  A Discrete Random Variable

       Suppose a Poisson process with parameter  r  is observed for a

length of time  t, let  X  be the number of occurrences observed during

that time, and let  λ = rt.  Then it can be shown that  X  is a discrete

random variable with probability mass function:

          p(x) = P(X = x) = &-  e"A      for  x = 0, 1, 2, ...
                             x!
                          = 0              otherwise.



It is easily determined that  p(x)  is a proper probability mass function,



since

         ∞
         Σ  (λ^x / x!) e^(-λ) = e^(-λ) [1 + λ + λ²/2! + λ³/3! + ...]
        x=0

                              = e^(-λ) · e^λ = 1.
A discrete random variable having a probability mass function like  p,



given above, is called a Poisson random variable with parameter  λ.




(b)  A Continuous Random Variable




       Suppose a Poisson process with parameter  r  is observed until



the first occurrence is noted, and let  T  be the time it takes for this



to happen.  Then it can be shown that  T   is a continuous random  variable



with probability density function:





                       f(t) =  re"rt    for  t > 0



                           *  0        otherwise.




Here, to convince ourselves that  f  is a valid probability density func-

tion, we must verify that the area under the curve of  f  is 1, and this

is easily done, since, substituting  u = rt,

                       ∞              ∞
                       ∫ re^(-rt)dt = ∫ e^(-u)du = 1.
                       0              0




A continuous random variable having a probability density function like

f, given above, is called an exponential random variable with parameter

r.
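
       The two random variables of Example 5 can both be observed in a
simulated Poisson process.  In this illustrative Python sketch (not from
the original report), the gaps between occurrences are drawn as
exponential variables with rate  r:

           import random

           r, t = 2.0, 3.0       # occurrence rate, observation length
           trials = 100_000
           count_total = 0.0     # running sum of counts in (0, t]
           first_total = 0.0     # running sum of first-occurrence times

           for _ in range(trials):
               elapsed = random.expovariate(r)  # time of first occurrence
               first_total += elapsed
               n = 0
               while elapsed <= t:
                   n += 1
                   elapsed += random.expovariate(r)
               count_total += n

           print(count_total / trials)   # near lambda = r*t = 6.0
           print(first_total / trials)   # near 1/r = 0.5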







        Although random variables are completely determined by  their prob-



 ability functions,  it  is  sometimes useful to consider another  function



 which,  from a mathematical viewpoint,  is even more elementary  than a prob-



 ability function.   Let X be a  random variable (discrete or continuous).



 The cumulative  distribution function of  X  is defined as:




                            F(x) = P(X < x).




 For the discrete case,

                         F(x) =   Σ   p(t),
                                t ≤ x

 and for the continuous case,

                                  x
                         F(x) =   ∫ f(t)dt.
                                 -∞

 The probability function can be recovered from  F:  p(x)  is the size of
 the jump in  F  at  x  if  X  is discrete, and  f(x) = F'(x)  if
 continuous.  Whenever a random variable  X  can be


 specified with some particular probability function or cumulative distri-


 bution function, we speak of the distribution of  X, referring, literally,

 to how its probability mass is distributed along the real line.  Thus,


 we say that "X  has a Poisson distribution," or "X  has an exponential

 distribution," and so on.



 4.3  Moments and Expectation


        It is natural to try to find ways of describing distributions

 without having to resort to a full description given by a pmf,  pdf, or

 cdf.   When a probability distribution is visualized as a mechanical  sys-

 tem,  it readily occurs to one to use moments  of the system to accomplish

 this.  Consider the first two moments.  When a system has unit mass,

 finding the first moment simply locates the center of mass, which is re-

 ferred to as the mean of the distribution.  Thus, for a discrete distri-

 bution with pmf  p(x),


      mean = m = Σ x p(x)   (the sum taken over all  x),
                 x


 and for a continuous distribution with pdf  f(x),

                                       ∞
                          mean = m =   ∫ x f(x)dx.
                                      -∞


       Next, we can compute the second moment about  m, which is the mom-
ent of inertia of the system, and which we refer to as the variance of
the distribution; for a continuous distribution with pdf  f(x),

                                      ∞
                     variance = σ² =  ∫ (x - m)² f(x)dx.
                                     -∞
        For most distributions of interest, these moments exist and, for
many  distributions,  the mean and variance give a reasonably good descrip-
tion;  in a few cases, they give a complete description.  The mean, of
course, gives us a measure of where the distribution is located in terms
of  its mass, and the variance gives us a measure of the spread of the
distribution, in the sense that a small variance indicates that the
probability mass tends to be concentrated at the mean, while a larger
variance indicates that the probability mass tends to be less concentrated
at the mean and more "spread out."  There are many instances where it is
more convenient to use the square root of the variance.  The square root
of the variance is called the standard deviation and is denoted by  σ.
       Consider, now, a distribution which has center of mass at  m, but
has, say, 90% of its probability mass close to  m  on the left and 10%
spread far out to the right of  m.  Here, mean and variance do not give
a reasonable description, due to the fact that the distribution is skew
(not symmetric).   For some distributions, more than the first two moments
are needed for descriptive purposes, and a combination of second and third
moments can be used to describe this lack of symmetry or skewness.  For
discrete distributions,

               skewness = α₃ = [Σ (x - m)³ p(x)] / σ³,
                                x

and for continuous distributions,

                                ∞
               skewness = α₃ = [∫ (x - m)³ f(x)dx] / σ³.
                               -∞






       For symmetric distributions,  α₃ = 0, and for others,  α₃  is pos-

itive or negative depending upon whether the distribution is skewed to

the right or skewed to the left ("skewed to the right (left)" means that

the bulk of the probability mass is to the left (right) of  m).



       The calculation of moments is a special case of a more general

type of operation called expectation or taking expected value.  Let  X

be a random variable (discrete or continuous), and let  g(x)  be some

function defined over the points in the range of  X.  Then, we define

the expected value of  g(X), written  E[g(X)], as:




    E[g(X)] = Σ g(x)p(x)   if  X  is a discrete r.v. with pmf  p(x),
              x
or
               ∞
    E[g(X)] =  ∫ g(x)f(x)dx   if  X  is a continuous r.v. with pdf  f(x).
              -∞

Thus, it is immediately seen that by taking  g(x) = x,

    mean = m = E(X),

and by taking  g(x) = (x - m)²,

    variance = σ² = E[(X - m)²].




       Expected value is a mathematical operator, and is highly useful

when applied as such.  From properties of summation and integration, we

may derive the following rules governing expected value:

      1.  If  c  is a constant (not a random variable),  E(c) = c.

      2.  If  c  is a constant and  X  is a random variable,  E(cX) = cE(X).

      3.  If  X  and  Y  are both random variables,  E(X + Y) = E(X) + E(Y).




By using  E  as an operator, we may derive, for example,

     variance = E[(X - m)²] = E(X² - 2Xm + m²) = E(X²) - 2mE(X) + m²

                                               = E(X²) - 2m² + m²

                                               = E(X²) - m² = E(X²) - E²(X).



Example 6:  Consider the exponential distribution, where

                        f(x) = λe^(-λx)   for  x > 0

                             = 0          otherwise.

Then, substituting  u = λx,

                ∞                        ∞
  mean = E(X) = ∫ xλe^(-λx)dx = (1/λ)    ∫ ue^(-u)du = 1/λ,
                0                        0

  variance = E[(X - m)²] = E(X²) - m²,

but,
             ∞                          ∞
  E(X²) =    ∫ x²λe^(-λx)dx = (1/λ²)    ∫ u²e^(-u)du = 2/λ²,
             0                          0

so, variance = 2/λ² - (1/λ)² = 1/λ².
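
Both moments can be checked by simulation; a minimal sketch (added here,
with  λ  chosen arbitrarily):

           import random

           # Sample the exponential distribution with rate lam and compare
           # the sample mean and variance with 1/lam and 1/lam**2.
           lam, n = 2.0, 200_000
           xs = [random.expovariate(lam) for _ in range(n)]

           m = sum(xs) / n
           v = sum((x - m) ** 2 for x in xs) / n
           print(m, 1 / lam)        # both near 0.5
           print(v, 1 / lam ** 2)   # both near 0.25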



       Since variance  is defined in terms of expectation, we may also


consider variance as an operator, and it is useful to do so.  Let us de-


note it by Var ( ) and note that the following rules can be derived:


     1.  If  c  is a constant (not a random variable),  Var(c) = 0.

     2.  If  c  is a constant and  X  is a random variable,

         Var(cX) = c²Var(X).



4.4  Other Descriptive Quantities


       In addition to mean, variance, and skewness, there are other


attributes of distributions which can aid in describing them.  Two such


quantities are the median and the mode.   For some distributions, these


quantities may or may not be unique, but no essential difficulties arise


as a result of this fact.  A median is any point on the x-axis which


divides the distribution in half; that is, 50% of the probability mass
lies on either side of the median.  For a discrete distribution, working
from the lower end, it frequently happens that a point  x  will be
reached where including  p(x)  puts more than half of the distribution on
the left end, while leaving  p(x)  out puts more than half of the distri-
bution on the right.  In such a case,  x or any point close by will do
fine for a median.  For a continuous distribution the median will always
be unique.  We might note, in passing, that the position of the median
relative to the mean can be used as an indication of asymmetry; for
example, if the median is to the left of the mean, the distribution is
skewed to the right.
       As a generalization of the idea of a median, one may consider the
use of distribution quantiles.  For example, the quartiles divide the
distribution into fourths, the deciles into tenths, the percentiles into
hundredths, and so forth.  Distribution quantiles are interesting in that
they give measures of both location and spread of a distribution.  In
addition, their behavior can be mathematically characterized independently
of what distribution they arise from, and for that reason, techniques
based on properties of distribution quantiles are said to be distribution-
free.
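
       For a set of observations these quantities are computed routinely;
a small illustrative sketch using the Python standard library:

           import statistics

           data = [2.1, 3.5, 1.7, 4.2, 2.8, 3.9, 2.2, 5.0, 3.1, 2.6]

           print(statistics.median(data))           # the sample median
           print(statistics.quantiles(data, n=4))   # three quartile points
           print(statistics.quantiles(data, n=10))  # nine decile points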
       The use of modes to describe distributions is less common, but
they will be included here for the sake of completeness.  For a contin-
uous distribution, the mode or modes  are simply the points at which the
relative maxima of the probability density function occur.  Most contin-
uous distributions of practical interest have a single mode.  Most
theoretical discrete distributions also have a single mode, and in these
cases the mode is the value of  x  for which  p(x)  is maximum.  If a

distribution has a single mode, it is said to be unimodal, and if it has



two  (or more) modes, it is said to be bimodal (multimodal).




4.5  Jointly Distributed Random Variables



       Sometimes the outcomes of interest of random experiments cannot



be adequately represented by single numbers, but require sets of numbers.



Suppose, for the moment, that the outcomes of some random experiment can



be represented by a pair  (x, y).  To handle this situation, we use the



concept of a two-dimensional random variable, say  (X, Y), where each of



the components is a one-dimensional random variable.  In the sample space



of  (X, Y), events are of the type  (a < X < b, c < Y < d).  Joint
probability density functions of random variables must conform to the
same rules as before; that is, they must satisfy:

           1.  f(x,y) ≥ 0  for all  x  and  y.

                ∞  ∞
           2.   ∫  ∫ f(x,y)dxdy = 1.
               -∞ -∞
             -00   -00
        The joint pdf of  X  and  Y  is used to determine probabilities
of events in a manner analogous to the one-dimensional case.  For example,

                                     b  d
           P(a < X < b, c < Y < d) = ∫  ∫ f(x,y)dydx
                                     a  c
or
                          b  ∞
           P(a < X < b) = ∫  ∫ f(x,y)dydx
                          a -∞
or
                          ∞  d
           P(c < Y < d) = ∫  ∫ f(x,y)dydx.
                         -∞  c
        We may also define the joint cumulative distribution function of
X  and  Y  as:
                                            x  y
               F(x,y) = P(X ≤ x, Y ≤ y) =   ∫  ∫ f(u,v)dvdu.
                                           -∞ -∞
        Whenever random variables are distributed jointly, it is
frequently of interest to know how they are distributed individually.
To describe how  X  and  Y  are distributed  individually, we use the
concept of marginal distributions.  Suppose the joint distribution of
X  and  Y  is known, and we want the marginal distribution of  X.  Consider
some fixed value of  x, say  x₀.  To determine the value of the marginal
probability density function of  X  at  x₀, we "consolidate" the probability
along the line  x = x₀  by integrating over  y.  Doing this for all  x, we
would obtain
                             ∞
                    f₁(x) =  ∫ f(x,y)dy
                            -∞
for the marginal pdf of  X.  By similar reasoning, we would also have
                             ∞
                    f₂(y) =  ∫ f(x,y)dx
                            -∞
for the marginal pdf of  Y.  The properties of multiple integration
guarantee that  f₁  and  f₂  satisfy the conditions for probability
density functions.
        An interesting question now arises:  If we know how  X  and  Y
are distributed marginally, do we know how they are distributed jointly?
The answer is sometimes, but not always, and it depends upon whether or
not  X  and  Y  are independent random variables.  The random variables
X  and  Y  are said to be independent if and only if

             P(a < X < b  and  c < Y < d) = P(a < X < b) • P(c < Y < d).

Note that independence of random variables is defined in terms of indepen-


dence of events; let the events  A  and  B  be defined as:  A =  (a < X < b)


and  B = (c < Y < d).  Then the condition of independence may be stated:


P(ADB) = P(A) • P(B).

        For continuous random variables  X  and  Y  having joint pdf

f(x,y)  and marginal pdf's  f₁(x)  and  f₂(y), it can be mathematically

proven that  X  and  Y  are independent if and only if

               f(x,y) = f₁(x)f₂(y)    for all  x  and  y.

Thus, if  X  and  Y  are independent, knowing their marginal distributions

is sufficient to determine their joint distribution.

        Descriptions of joint distributions, for the most part, are

analogous to the one-dimensional case.  Thus we use

                   ∞               ∞  ∞
   μx = E(X)    =  ∫ xf₁(x)dx   =  ∫  ∫ xf(x,y)dydx,
                  -∞              -∞ -∞

                   ∞               ∞  ∞
   μy = E(Y)    =  ∫ yf₂(y)dy   =  ∫  ∫ yf(x,y)dydx,
                  -∞              -∞ -∞

                                  ∞                   ∞  ∞
   σx² = Var(X) = E[(X - μx)²] =  ∫ (x-μx)²f₁(x)dx =  ∫  ∫ (x-μx)²f(x,y)dydx,
                                 -∞                  -∞ -∞

                                  ∞                   ∞  ∞
   σy² = Var(Y) = E[(Y - μy)²] =  ∫ (y-μy)²f₂(y)dy =  ∫  ∫ (y-μy)²f(x,y)dxdy.
                                 -∞                  -∞ -∞
        Whenever jointly distributed random variables are not independent,

the marginal descriptions are not enough, however, and the concept of

covariance is  needed.   The  covariance  of the  jointly distributed random

 variables  X  and  Y, denoted by  Cov(X, Y)  and  σxy, is defined as:

                                         ∞  ∞
    σxy = Cov(X,Y) = E[(X-μx)(Y-μy)] =   ∫  ∫ (x-μx)(y-μy)f(x,y)dxdy.
                                        -∞ -∞

         The covariance of  X  and  Y  gives us a measure of how  X  and
 Y  vary together.  The correlation of  X  and  Y, denoted by  ρ, is a
 "standardized" covariance,

                         ρ = σxy / (σx · σy).

         Dividing by  σx · σy  has the effect of reducing  σxy  to
 "standardized units" so that, for any joint distribution, -1 ≤ ρ ≤ 1,
 regardless of the form of the distribution.
         The concepts of independence and correlation are closely associated
 with each other, and are frequently confused.  Whenever random variables
 are independent,  ρ = 0, but the converse is not always true.  Thus, un-
 correlated random variables are not necessarily independent random
 variables.
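
         A standard counterexample can be simulated.  In the sketch below
 (an illustration added here), Y = X²  is completely determined by  X, yet
 the sample correlation is near zero because  X  is symmetric about 0:

           import random
           import statistics

           # X uniform on (-1, 1); Y = X**2 is a function of X, so the
           # two variables are certainly not independent.
           xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]
           ys = [x * x for x in xs]

           # Near 0, despite the strong dependence.
           print(statistics.correlation(xs, ys))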
         The  development  of this section was done  for  the  case of  con-
 tinuous  random variables, but  the development  for discrete  distributions
 is completely analogous.  Also, it is possible  to consider  more than two
 jointly distributed random variables, with a similar development.  The
mathematics of n-dimensional random variables is no more difficult than
that of two-dimensional ones; however, the geometrical visualizations
become almost impossible.
4.6  Useful Distributions
        There are some random variables whose distributions are encountered

 repeatedly in applications,  and  in this  section, some of them will be
 discussed briefly.
 4.6.1.   The Binomial  Distribution
         A Bernoulli trial  is a random experiment which has only two
 possible outcomes, generally called success and failure.  Let  X  be the
 number  of successes in  n  independent repetitions of a Bernoulli trial,
 where   P(success) = p, for each  trial.   Then  X  is a discrete random
 variable having binomial distribution with parameters  n  and  p.  The
 probability mass function for  X  is

                    n!
         p(x) = ---------- p^x (1-p)^(n-x)     for  x = 0, 1, 2, ..., n
                 x!(n-x)!

              = 0                              otherwise.

 It can be verified that the mean and variance are,

                      μx = E(X) = np

                      σx² = E[(X - np)²] = np(1-p).
The cumulative distribution function for the binomial distribution is
very tedious to calculate, because sums of terms of a binomial expansion
are involved,  and for  that reason, tables of the binomial cdf have been
computed for various values of  n  and  p.  Whenever one must do calcula-
tions for the binomial distribution, such a table is generally consulted.
Example 7:  Let a "fair" coin be tossed 8 times in succession, and let  X
be the number of heads which would be obtained in those 8 tosses.  Then
X  has a binomial distribution with parameters  n = 8  and  p = 1/2.  The
probability mass function for  X  would be

                     8!
          p(x) = ----------- (1/2)^8     for  x = 0, 1, 2, ..., 8
                  x!(8 - x)!

               = 0                       otherwise,

 and the distribution for  X  could be tabulated as,
 value of X       0      1      2      3      4      5      6      7      8

 P(X=x)=p(x)    1/256  8/256  28/256 56/256 70/256 56/256 28/256 8/256  1/256
       [Histogram of  p(x)  for  x = 0, 1, ..., 8; vertical scale marked
       from 10/256 to 70/256.]

       Figure 1.  Binomial probability mass function.


        A binomial distribution is symmetric about the mean  np
if  p = 1/2, and is skewed for other values of  p.
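
The tabulation in the coin-tossing example above can be reproduced
exactly with rational arithmetic; a brief sketch (added for illustration):

           from fractions import Fraction
           from math import comb

           # pmf of the binomial with n = 8, p = 1/2.  Fraction reduces
           # the values, so 8/256 prints as 1/32, 70/256 as 35/128, etc.
           n = 8
           for x in range(n + 1):
               print(x, Fraction(comb(n, x), 2**n))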
4.6.2  The Poisson Distribution
        As shown earlier in Example 5, a Poisson random variable can
be obtained by observing a Poisson process with parameter  r.  Suppose
the process is repeatedly observed for a time  t  each time, and let  X
be the number of occurrences in each time interval of length  t.  Then  X
is a discrete random variable and has a Poisson distribution with
parameter  λ = rt.  The probability mass function for  X  is

                      λ^x
               p(x) = ---- e^(-λ)     for  x = 0, 1, 2, ...
                       x!

                    = 0               otherwise.

We can compute,

                   μx = E(X) = λ

                   σx² = E[(X - λ)²] = λ.

        With  E(X) = λ, we can give a physical interpretation to  r.
Since the average or expected number of occurrences in time  t  is
X = rt, the expected number of occurrences per unit time is r.  Thus,
we may refer to  r  as the occurrence rate of the process.
        The cumulative distribution function of the Poisson distribution
has also been tabulated in standard tables for various values of  λ.
Example 9:  The occurrence of failures in large lots of transistors can
often be represented by a Poisson process.  Suppose a particular lot has
a failure rate of  1/2000 per hour  (one every 2000 hours), and suppose
the lot is to be tested for 1000 hours.  What is the probability that

no more than 2 failures will occur?

        If we let X be the number of failures in 1000 hours, then  X

has a Poisson distribution with parameter  λ = 1000 · (1/2000) = 1/2.

The probability mass function is



                           (1/2)^x
                    p(x) = -------- e^(-1/2)     for  x = 0, 1, 2, ...
                              x!

                         = 0                     otherwise.


The probability sought is  P(X ≤ 2).  But

P(X ≤ 2) = P(X = 0 or X = 1 or X = 2) = P(X = 0) + P(X = 1) + P(X = 2)

                                      = p(0) + p(1) + p(2)

                                      = e^(-1/2) (1 + 1/2 + 1/8)

                                      = 0.9856.

Note that if we had desired, we could have consulted a table of the

Poisson cdf, F(x), for  λ = 1/2  and  x = 2.
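
The same arithmetic, done directly rather than from a table (an
illustrative sketch):

           from math import exp, factorial

           # P(X <= 2) for a Poisson random variable with lambda = 1/2.
           lam = 0.5
           print(sum(lam**x / factorial(x) * exp(-lam)
                     for x in range(3)))   # 0.9856...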


4.6.3  The Geometric Distribution.

        Let a Bernoulli trial, with  P(success) = p, be independently

repeated until the first success is obtained, and let  X  be the number

of trials necessary to do this.  Then  X  is a discrete random variable

having a geometric distribution with parameter p.  The probability mass

function for  X  is

                   p(x) = p(1 - p)^(x-1)       for  x = 1, 2, 3, ...

                        = 0                    otherwise,

and

                   μx = E(X) = 1/p

                   σx² = E[(X - 1/p)²] = (1 - p)/p².

The pmf of the geometric distribution is actually very easy to derive.



If  x  trials are required to get the first success, then the first



x - 1  trials must be failures and the  xth  trial a success.  The



probability of failures in every one of the first  x - 1  trials is



(1-p)(1-p)...(1-p) = (1-p)^(x-1)  and the probability of success on the

xth trial is  p.  Thus, P(taking x trials) = P(X = x) = p(1-p)^(x-1).  The



geometric distribution is a special case of a more general type of



distribution known as the negative binomial distribution.




4.6.4  The Exponential Distribution



        This distribution has also been discussed earlier as an example



of a continuous distribution  (Example 5).  If a Poisson process with



occurrence rate  r  is observed until the first occurrence, and if  X



is the elapsed time until the first occurrence, then X is a continuous



random variable having the exponential distribution with parameter r.



The probability density function for  X  is,



                          f(x) = re"rx        for x > 0



                               = 0            otherwise,



and


                          PX  = E(X) = 1/r




                          o2  = E[(X - 1/r)2] = 1/r2.
                           A.



        One might have suspected that the average time to  the  first



occurrence  E(X)  would be  1/r, since the occurrence rate is   r  per



unit time.  The similarity of the discrete geometric distribution and



the continuous exponential distribution  should be noted.   Like the



geometric distribution, the exponential  distribution  is  a  special case



of a more general family  of distributions called the  gamma distributions.



The gamma distribution  is a two-parameter distribution with probability

 density function,

                    x^(α-1) e^(-x/β)
            f(x) = ------------------     for  x > 0
                       Γ(α) β^α

                 = 0                      otherwise,

where  α, β > 0.  The exponential distribution is obtained by taking

α = 1  and  β = 1/r.


4.6.5  The Normal Distribution.


        The normal distribution is by far the most familiar and most


used distribution in applications.  Let  X  be a random variable having


the normal distribution.  Then the probability density function for  X


is
                         1
             f(x) = ----------- e^(-(x-μ)²/(2σ²))     for all real  x,
                      σ√(2π)

where  σ > 0  and  -∞ < μ < ∞.  We get

                  μx = E(X) = μ

                  σx² = E[(X-μ)²] = σ².


The normal pdf is the familiar "bell curve" shown in Figure 2.  It is sym-

metric about the point  x = μ  and has its maximum value there.  The curve has

two points of inflection, at  μ - σ  and  μ + σ, and is more sharply peaked

for smaller values of  σ²  and flatter for larger values of  σ².


               [Normal density curve plotted over  -5 ≤ x ≤ 5.]

                    Figure 2.  The normal density curve.

 If  μ = 0  and  σ² = 1, we have

                        f(x) = (1/√(2π)) e^(-x²/2),

 and any random variable having such a pdf is said to have the standard

 normal distribution.  The cumulative distribution function for a

 standard normal distribution is
                                                     x
                        F(x) = P(X ≤ x) = (1/√(2π))  ∫ e^(-t²/2)dt,
                                                    -∞
 which is recognized as a form of the well-known erf function.  A closed-



 form functional expression for  F  does not  exist,  and tables  of the


 values of  F  have been computed, but only for the standard case of

 μ = 0  and  σ² = 1.  The reason why further tables are unnecessary lies

 in the following result.  Let  X  be normally distributed with mean  μ

and variance  σ².  (For economy, this can be written as  X ~ N(μ, σ²).)

Let  Z = (X-μ)/σ; then  Z  is also a random variable and  Z ~ N(0, 1).



In other words, subtracting the mean  μ  and dividing by the standard

deviation  σ  "standardizes" any normal (μ, σ²) random variable to a

normal (0, 1) random variable.  This result can be used in the following

way.  Suppose  X ~ N(μ, σ²)  and we want to know  Fx(c).  We can compute

    Fx(c) = P(X ≤ c) = P(X - μ ≤ c - μ) = P[(X-μ)/σ ≤ (c-μ)/σ]

                                        = P[Z ≤ (c-μ)/σ]

                                        = Fz[(c-μ)/σ],






 and since  Z ~ N(0, 1), Fz  can be extracted from a table of the standard

 normal cumulative distribution.



        Because of the relation between X  and  Z  above, it is a



 common practice to think of values of a non-standard normal variable in

 terms of  σ-units deviation from the mean  μ.  For  Z ~ N(0, 1),

 P(-1 < Z < 1) = 0.683; thus we can say that for any normal distribution,

 68.3% of the observations lie between  μ - σ  and  μ + σ  and, in a

 similar manner, we can determine that about 95.5% of the observations

 lie between  μ - 2σ  and  μ + 2σ  and that 99.7% of the observations

 are between  μ - 3σ  and  μ + 3σ.



        One of the reasons that the normal distribution is so useful is



 the fact that it can be used for accurate approximation of other



 probability distributions under certain conditions.  For example, if  X



 is a binomial (n,p) random variable, then the random variable



 (X - np)/√(np(1-p))  is approximately normally distributed for large values



 of n, and the approximation improves with increasing n.  The approximation



 is also better for  p  close to  1/2, since the binomial distribution is



more symmetric for  p  near  1/2.  The process of subtracting  np  and
dividing by  √(np(1-p))  can be thought of as standardization, since
μx = np  and  σx = √(np(1-p)).




Example 10:  Let  X  have a binomial distribution with  n = 8  and p = 1/2.



We will use the standard normal distribution to approximate  P(X ≤ 6).



In applying the approximation, one step in the technique is that of making



a  "correction for continuity."  To understand why this improves the



approximation, it is useful to use a histogram to represent the pmf of



the binomial.  A histogram is a graphical device used to represent



discrete distributions for which the pmf is positive at evenly spaced

points along the number line (such as at the integers  0, 1, 2, ...).
Rectangles of constant width are drawn above and centered over the
points of positive probability mass in such a way that the areas of the
rectangles are proportional to the probability masses of the points.   A
histogram is given below in Figure 3 for the binomial with  n = 8, p = 1/2.
Also drawn is a smooth curve, which we will suppose is an approximating
curve.
   [Histogram of the binomial (n = 8, p = 1/2) mass function with a smooth
   approximating curve superimposed; vertical scale in units of 1/256.]

                Figure 3.  Binomial mass function.
Now, to find  P(X < 6) from the histogram, we would add the areas of the
7 rectangles at  0, 1, 2, ..., 6  together.  To approximate this area by
means of the curve, one would compute the area under it to the left of
6.5.  Let us suppose that the curve is the graph of the pdf of a random
variable  Y  which is approximately normally distributed with mean
np = 4  and standard deviation   np(l-p)  =  2.  Then
     PCX < 6) = P(Y < 6.5) = P[(Y-4)/vT< (6.5-4.0)/V2]
                           = P(Z < 1.77), where
                           = 0.9616.

 The actual probability obtained from the histogram is

                P(X ≤ 6) = 247/256 = 0.9648.




 The procedure of finding the area under the  curve to the left of 6.5



 instead of 6.0 is an illustration of the use of the correction for



 continuity.  The matter of a continuity correction is often a source of



 confusion to people, but a graph such as Figure 3 illustrates that it



 can be visualized simply as a matter of including (or excluding) whole



 rectangles, which implies the use of boundary points, rather than



 midpoints along the  x  axis.  For example,

       P(X < 6) = P(Y ≤ 5.5)  and  P(2 < X ≤ 5) = P(2.5 < Y ≤ 5.5).




         A reason often given for using the normal distribution is the



 convenience of  doing so.   The mathematical properties of the normal



 distribution are very  "nice," indeed,  and a large body of known



 mathematical facts  has consequently been built  up around it.   While



 convenience and tractability are important considerations,  they are not



 the only influencing factors in the matter.   One of the most important



 and intriguing  theorems in all of statistical theory is the Central Limit



 Theorem.  Roughly stated, the theorem goes as follows:  let  X₁, X₂, ..., Xₙ

 be independent random variables all having the same distribution with  E(Xᵢ) = μ

 and  Var(Xᵢ) = σ²; and let  X̄ = (X₁ + X₂ + ... + Xₙ)/n.  Then the random

 variable  X̄  has an approximate normal distribution with mean  μ  and

 variance  σ²/n.  The remarkable thing about the theorem is that the

 distribution for the  Xᵢ's  is not restricted to any particular form;

any discrete or continuous distribution is allowed.  In short, part of

the reason for the widespread use of the normal distribution in

applications is the Central Limit Theorem.
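
       A simulation makes the theorem concrete.  In this illustrative
sketch the parent distribution is exponential with rate 1 (so  μ = 1  and
σ² = 1), which is strongly skewed, yet the sample means behave as the
theorem predicts:

           import random
           import statistics

           n, trials = 30, 20_000   # sample size; number of sample means

           means = [sum(random.expovariate(1.0) for _ in range(n)) / n
                    for _ in range(trials)]

           print(statistics.mean(means))      # near mu = 1
           print(statistics.variance(means))  # near sigma**2/n = 1/30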

 4.6.6.  The Chi-Square Distribution

         Let  X₁, X₂, ..., Xₖ  be independent random variables, each

 normally distributed with mean  μ = 0  and variance  σ² = 1, and let

 X = X₁² + X₂² + ... + Xₖ².  Then  X  is a continuous random variable having

 the Chi-Square distribution with parameter  k, and we write  X ~ χ²(k).

 The parameter in the Chi-Square distribution is called the degrees of

 freedom, for reasons which will be explained later.  If  X ~ χ²(k), the

 probability density function for  X  is,

                         1
           f(x) = ----------------- x^(k/2 - 1) e^(-x/2)     for  x > 0
                   2^(k/2) Γ(k/2)

                = 0                                          elsewhere,

with
                      μx = E(X) = k
and
                      σx² = E[(X-k)²] = 2k.

Note that the Chi-Square distribution is also a special case of the
gamma distribution with  α = k/2,  β = 2.
4.6.7.  Student's t-Distribution
        Let  X ~ N(0,1)  independently of  Y ~ χ²(r), and let  T = X/√(Y/r).
Then  T  has a Student's t-distribution with  r  degrees of freedom, and
we write  T ~ t(r).  The actual form of the probability density function
of the t-distribution is of secondary interest, and will not be given here.
The t-distribution has its mean at zero, is symmetric about that point,
and can be visualized as a "spread-out" standard normal curve.

 4.6.8.  The F-Distribution

         Let  X ~ χ²(f₁)  independently of  Y ~ χ²(f₂), and let

 F = (X/f₁)/(Y/f₂).  Then  F  has an F-distribution with  f₁  and  f₂

 degrees of freedom, and we write  F ~ F(f₁, f₂).  The mathematical form

 of the probability density function of the F-distribution is also of

 secondary interest, and will likewise be omitted.



        The  short discussion of the Chi-Square, Student's t, and F



 distributions should not be taken to imply that they are not important



 distributions, because  the opposite is the case.   These distributions are



 called sampling distributions because they were discovered and derived



 as a result of studying the behavior of sample statistics.  Our real



 interest in these last three types of random variables is in the nature



of their cumulative distribution functions,  which have been extensively



tabulated.   A  t-table, an F-table, and a Chi-Square-table are standard



 in almost every text of applied statistics.
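
       Today such tables are usually consulted through software.  The
sketch below assumes the SciPy library (a modern tool, mentioned here
only for illustration) and retrieves the usual upper 5% points:

           from scipy import stats

           # Upper 5% critical values, as read from standard tables.
           print(stats.t.ppf(0.95, df=10))          # Student's t, 10 d.f.
           print(stats.chi2.ppf(0.95, df=10))       # Chi-Square, 10 d.f.
           print(stats.f.ppf(0.95, dfn=5, dfd=10))  # F with 5 and 10 d.f.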

                   5.  SAMPLING AND INFERENCE

      In this section, we shall examine some of the elementary activities
commonly associated with statistical inferences and try to show how
probability theory is used to assist in these activities.  When we speak
of inference, or an inference situation, we will be referring to a
situation where one desires to make statements or guesses about a
population by examining a subset of it which is believed to somehow typify
or represent the population as a whole.  Some of the difficulties involved
with the terms  "typical"  and  "representative"  have been mentioned
already and we have seen that about all one can do is to try to insure
that a random sample  is drawn from the population of inference.  Beyond
that, "typicality" is largely a matter of faith.

5.1  Description of Finite Number Populations
      The function of condensing and summarizing the information in a
set of numbers is one which can be traced to the early beginnings of
statistics as a discipline.  If interest is only in the set of numbers
itself, it is quite evident that no statistical inference is involved,
since generalization beyond the number population is not required.  How-
ever, some of the same activities used in describing finite number
populations have also been found to be useful in inference situations,
and for that reason, we shall mention some of the informative operations
one can perform upon a set of numbers.  Let the numbers be denoted by
x₁, x₂, ..., x_N; then one may:

         1.  Rank the numbers in order of increasing magnitude.

                                          N
         2.  Compute the mean  m = (1/N)  Σ xᵢ.
                                         i=1

         3.  Compute the range = x_(N) - x_(1), the largest value minus
             the smallest.

         4.  Compute the variance  σ² = Σ(xᵢ - m)²/N  or the standard
             deviation  σ.

         5.  Find the median, quartiles, deciles, etc.

         6.  Select classes, construct a frequency table, and draw a
             histogram.

         7.  Calculate a cumulative frequency function.  (Several of
             these computations are sketched in code below.)
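
      As indicated in the list, items 2 through 5 in a brief illustrative
Python sketch:

           import statistics

           x = [12.1, 9.8, 11.4, 10.2, 13.0, 9.5, 10.9, 11.7]
           N = len(x)

           m = sum(x) / N                            # item 2: the mean
           rng = max(x) - min(x)                     # item 3: the range
           var = sum((xi - m) ** 2 for xi in x) / N  # item 4: the variance
           sd = var ** 0.5                           #   and std. deviation
           med = statistics.median(x)                # item 5: the median

           print(m, rng, var, sd, med)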
      It should be emphasized that each of the activities mentioned above
retains only a part of the full information contained in the number
population.  This being the case, it is not difficult to see how a
number population could be misrepresented by a  "judicious"  choice of
descriptive techniques.

5.2  Statistical Inference in Finite Populations
      In a very real sense,  all populations generated by measurement are
finite populations, due to the physical limitation of measuring instruments,
so the distinction being made here between finite and infinite populations
is whether or not a population to be studied is considered to be a finite
subset of a conceptual infinite population of similar members.
      For the moment, assume that one desires to describe a large
population but does not wish to view it as infinite, and assume that it
is not possible in terms of time, money, or both to examine the entire
population.  The experimenter is, of course, confronted with an inference
situation, for he must form his opinions about the population by examining

one or more samples from it.  To simplify matters, let us suppose that
the goal is estimation of the population mean  μ.  It is decided that
when the sample  (x₁, x₂, ..., xₙ)  is obtained, the estimate of  μ
will be the sample mean  x̄ = Σ xᵢ/n.  Is this a good thing to do?  We
have seen that it is not possible to judge on the basis of the sample
itself, because  x̄  might be "right on the mark" if the sample values
(x₁, x₂, ..., xₙ)  happened to be evenly situated close to  μ, or  x̄
could be "badly off" if the sample values happened to be much higher
or lower than  μ.  Moreover, the fact that the value of  μ  is not
known precludes knowing what the actual situation might be.  One must
therefore resort to an evaluation of the long-run consequences of his
procedure; that is, if he continued to take samples and compute  x̄'s,
how would his  x̄'s, taken as a whole group, measure up?  There are at
least two things to look for.  First, would the  x̄'s  tend to cluster in
such a way so as to not consistently overestimate or underestimate  μ,
and second, would the variation in the  x̄'s  from sample to sample be
large or small?  In short, what would be the accuracy and the precision
of the estimation procedure?  It is not possible to say anything
definite about either of these properties, unless the method of taking
the samples can be given a probability structure; that is, unless random
sampling is required.  Let us consider the matter of accuracy; over all
the possible samples of size  n, what do the associated  x̄'s  average out to?
      A quantity such as  x̄, when regarded as an estimate of a population
parameter, say  θ, is often denoted by  θ̂  whenever the particular
estimate is not of immediate interest, and we say that  θ̂  is unbiased
for  θ  if  E(θ̂) = θ; otherwise, θ̂  is said to be a biased estimate
with  Bias = E(θ̂) - θ.  Thus, bias is simply another way of describing accuracy.



   Recall that simple random sampling is taking random samples in such a

way that each member of the population has the same chance of being

included in the sample.  One of the primary reasons for using simple

random sampling is that the result of doing so makes  x̄  an unbiased

estimate of  μ; the average of  x̄'s  over all possible samples of size  n

is  μ.  Furthermore, the validity of this assertion does not depend,

in any way, on knowing the value of  μ; whatever  μ  may be, simple random

sampling guarantees that the  x̄'s  average out to it.


Example 11;  As a very simple illustration, let us suppose that a finite

population is made up of the 5 numbers 1, 2, 3, 4, 5, and suppose that

simple random samples of size 2 are to be drawn.  It can be easily shown

that with simple random sampling, every sample of a given size has the

same probability of being drawn.   Assume it is desired to estimate the

population mean  μ  (which we know to be 3) with the sample means.

We will have:


   Samples*            Value of x̄              Associated Probability

   (1, 2)                 1.5                          1/10
   (1, 3)                 2.0                          1/10
   (1, 4)                 2.5                          1/10
   (1, 5)                 3.0                          1/10
   (2, 3)                 2.5                          1/10
   (2, 4)                 3.0                          1/10
   (2, 5)                 3.5                          1/10
   (3, 4)                 3.5                          1/10
   (3, 5)                 4.0                          1/10
   (4, 5)                 4.5                          1/10

*Order is not considered; for example, the sample (1, 2) is the same as
sample (2, 1).

We may consider  x̄  as a discrete random variable with the distribution

given by:

   Values of x̄       1.5    2.0    2.5    3.0    3.5    4.0    4.5

   Prob. of value    1/10   1/10   2/10   2/10   2/10   1/10   1/10
Thus, E(x̄) = (1/10)[1.5 + 2.0 + 2(2.5) + 2(3.0) + 2(3.5) + 4.0 + 4.5]

           = (1/10)(30) = 3,

so that  x̄  is an unbiased estimate of  μ.
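
The enumeration of Example 11 can be mechanized; an illustrative sketch:

           from itertools import combinations
           from statistics import mean

           population = [1, 2, 3, 4, 5]

           # All 10 equally likely simple random samples of size 2
           # (order ignored).
           xbars = [mean(s) for s in combinations(population, 2)]

           print(mean(xbars))   # 3.0: the sample means average out to mu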



      It may be that for some population parameter, several different


types of unbiased estimates are possible.  In such a case, the one with


the greatest precision would probably be the most desirable.  We have


seen that random sampling allows us to determine the expected value of


estimates computed from samples of fixed size.  It also allows us to


determine their variances over all possible samples of a given size.  The


variance of estimates is used to measure their precision; larger variance


means less precision, and vice versa.  Thus, if we have several ways of


unbiasedly estimating some unknown population quantity, we may choose


those with smaller variances as being more desirable.


Example 12:  For a population of size  N  with population variance  σ²,

it can be shown that over all samples of size  n,

                              σ²   N - n
                   Var(x̄)  =  -- · ----- .
                              n    N - 1

Thus, if there were some other unbiased estimate of  μ  besides  x̄  being

considered, its precision relative to  x̄  could be determined by comparing

its variance to  (σ²/n)·(N - n)/(N - 1).  For the population of the previous

example,
                    5
        σ²  =  (1/5)Σ (xᵢ - 3)² = (1/5)[(-2)² + (-1)² + (0)² + (1)² + (2)²]
                   i=1

            = 2,




 so by formula,

                  Var(x̄) = (2/2)·(5 - 2)/(5 - 1) = 0.75.

 By computation over all samples,

                  Var(x̄) = E(x̄²) - E²(x̄),

 where

 E(x̄²) = (1/10)[2.25 + 4.00 + (2)(6.25) + (2)(9.00) + (2)(12.25) + 16.00 + 20.25]

        = 9.75,

 so that

                  Var(x̄) = 9.75 - 9.00 = 0.75,

 in agreement with the formula.



      For finite populations, it is customary to use

                        N
                  S² =  Σ (xᵢ - μ)² / (N - 1)
                       i=1

rather than the variance  σ²  to measure the population variability,

because both  S²  and  σ²  are equally valid measures of variability, and

the use of  S²  permits some simplification.  Since  S² = [N/(N - 1)]σ², we

can write,

                  Var(x̄) = (S²/n)·(1 - n/N).
      Although  Var(x̄)  can be computed and compared with variances of other

estimates, the expression as it stands is of limited value, due to the

fact that in most cases  S²  is an unknown quantity.  However,  S²  may be

also estimated from the sample, and

                        n
                  s² =  Σ (xᵢ - x̄)² / (n - 1)
                       i=1

is an unbiased estimate of  S²; that is,  E(s²) = S², where the

expectation is taken over all samples of size  n.  Hence, an unbiased

estimate of  Var(x̄)  is  (s²/n)·(1 - n/N).
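
      All of these assertions can be verified by brute force on the
5-number population of Example 11; a sketch added for illustration:

           from itertools import combinations
           from statistics import mean

           pop = [1, 2, 3, 4, 5]
           N, n = len(pop), 2
           mu = mean(pop)                                  # 3
           S2 = sum((x - mu) ** 2 for x in pop) / (N - 1)  # 2.5

           samples = list(combinations(pop, n))
           xbars = [mean(s) for s in samples]
           s2s = [sum((x - mean(s)) ** 2 for x in s) / (n - 1)
                  for s in samples]

           print(mean(x * x for x in xbars) - mean(xbars) ** 2)  # 0.75
           print((S2 / n) * (1 - n / N))    # the formula also gives 0.75
           print(mean(s2s), S2)             # E(s**2) = S**2 = 2.5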

      The discussion to this point illustrates the type of activity
 associated with estimation in finite populations.  It is important to note
 that  there has been nothing, as yet,  which indicates any connection
 between the mathematical distributions discussed earlier and inference in
 finite populations.  In fact, the number of mathematical distributions
 which deal with finite discrete populations is small; indeed, the most
 common is the binomial distribution.  Thus, the situation with finite
 populations is briefly this:  unless the population definitely arises as
 the result of some process which specifically produces outcomes which
 follow a finite discrete mathematical distribution, or the population can
 be considered to be a subset of some infinite population, then statistical
 estimation is limited to finding estimates of unknown population quantities,
 such  as the mean, variance, and median, and assessing the bias and
 precision of those estimates.  Furthermore, these activities are possible
 only when random sampling is used and are consequences of the probability
 structure induced by random sampling.
      Statistical testing, which is also an activity associated with
 inference, is similarly limited in finite populations.  We shall illustrate
 one type of test by means of an example.
Example 13:  A manufacturer of a medicine claims that at least 7 out of
 10 doctors recommend his product for a particular ailment.  To test this
 claim, suppose that from a random sampling of 100 M.D.'s in the United
States, it was determined that 52 of them did recommend his product,
Brand A,  and 48 recommended something other than Brand A.  On the  basis
of this sample,  should the claim be considered extravagant?  Since 100
doctors represents a very small proportion of the M.D. population,  we
may suppose that the sampling process did not noticeably change the

 relative  proportions  for or against Brand A.  If the claim is valid, then

 at  least  70%  of the M.D.'s do recommend Brand A, and the process of

 sampling  could then be considered as a binomial experiment with  n = 100

 and p =  0.7; i.e.,   100 repetitions of a Bernoulli trial where P(Success) =

 P(Recommendation of Brand A) = 0.7.  In such a case,  we may ask what the

 probability is of getting 52 or fewer successes out of 100 trials.  This is

   52
   Σ  C(100,i) (0.7)^i (0.3)^(100-i),  approximately 0.0001.  We reason as follows:
  i=0
 either the manufacturer's claim is valid or it isn't.  If it is valid, then

 the probability of getting a random sample of 100 M.D.'s with 52 or less

 positive  responses is 0.0001.  In other words, if, in fact, at least  7

 out of 10 doctors do recommend Brand A, then of all the samples of size 100

 which could be drawn,  99.99%  of them would have yielded 53 or more Brand A
 recommendations,  and what is more,  over 90%  of all the samples would have
 yielded 64 or more Brand A recommendations.  Thus, a claim of 70% is strongly

 contradicted by  "experimental"  evidence, and we would conclude that the

 proportion of M.D.'s recommending Brand A is almost certainly less than

 70%  and  is probably somewhere between 40% and 60% .
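The tail probability above can be computed directly; a minimal sketch, assuming SciPy is available:

    from scipy.stats import binom

    # P(X <= 52) when X ~ Binomial(n=100, p=0.7), as in the claim test above
    print(binom.cdf(52, 100, 0.7))    # about 0.0001

    # Fraction of samples yielding 64 or more recommendations if p = 0.7
    print(binom.sf(63, 100, 0.7))     # about 0.92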

      The essential features of statistical testing are illustrated by

 this example.  One has a tentative hypothesis about some aspect of the

 population, and he obtains observations from that population which tend

 either to support or to contradict the hypothesis.  The degree of

 agreement or disagreement can be quantified whenever it is possible to

 construct from the observations some quantity which should have a certain

 behavior  (distribution) if the tentative hypothesis does actually hold.

One then asks whether or not the value of the quantity obtained from

 the observations could have plausibly occurred under the conditions

 imposed by the tentative hypothesis.   If it is determined that the

 probability of getting such a value  is  small, then we say that the tentative
 hypothesis is  contradicted, and we tend to regard it with suspicion,
 at the very least,  and in many cases, to discard it altogether.  The
 probability mentioned above is often referred to as the significance
 level, and statistical tests are sometimes called tests of significance
 or significance tests.  They are also often called tests of hypotheses.
       The area of statistical testing and decision-making has been a
 battleground for statistical theorists  for many years, and a great
 amount has been written on these subjects.  We would not presume to con-
 dense the matter into  a few words, but we can say that much of the
 controversy has been concerned with whether the purpose of statistical
 testing should be regarded as one of decision-making or one of simply
 learning from  the data.  Also, a large portion of the material written
 about statistical testing has been concerned with finding optimal testing
 and/or decision procedures under various criteria of optimality.
       The theoretical disagreements about statistical testing should not
 be permitted to obscure  the fact that it has been found to be a highly
 useful statistical tool by researchers everywhere.   Statistical testing
 helps  one in forming opinions about populations and, moreover,  assists
 one in objectively substantiating and defending those opinions.  It is
 sometimes used as a device for decision-making and sometimes as a process
 of learning from the data, and it is a mistake to think that it cannot
 serve both functions.
      One should note that the construction of the test shown in Example 13
was made  possible by two things:  (1)  random sampling, and (2)  a large
 finite population.   Actually,  in most situations encountered, a population
 is large  enough to be effectively considered an infinite one.  In the next


 section,  entitled Statistical Inference in Infinite Populations, we shall
 discuss methods  of estimation and  testing which are appropriate whenever
 a population can be considered  to  be  infinite and,  for all practical
 purposes, continuous.
      Although estimation and testing are generally thought of as two
 distinct  and separate  functions, there is a very useful technique of
 statistical  inference  which  is  sort of a  hybrid between estimation and
 testing called consonance interval or confidence interval construction.
 Suppose we want to form an opinion about some parameter  θ  of a number
 population.  A random sample of observations is drawn from the population,
 and from it an interval is calculated, within which the value of  θ  is
 believed to be.  One way of interpreting such an interval is that it is
 a "list" of hypothetical values of the parameter  θ  which are consistent
 with the  sample  drawn.  When this  interpretation is used, the intervals
 are  sometimes called consonance intervals.
 Example 14:  Assume that a population is large enough so that the drawing
 of one member at random does not essentially affect the probability that
 any other will be drawn.  Actually, the use of the following requires
 the assumption of some nonspecific continuous underlying population
 distribution,  but it can often be applied, in an approximate sense, to a
 large finite population.  Suppose we want to find out something about
 the population median, and that a random sample of size 10 was drawn.
 Let the ordered sample be represented as  (x₁, x₂, ..., x₁₀),  and consider
 the question,  "What is the probability that a sample would be drawn such
that the population median lies between the smallest and largest observa-
tions of the sample?"  Now, since the median is such that half of the
 population is on either side of it, any observation has a 50-50 chance


 of being greater or less than the median.  Thus, the median would fail
 to be between the smallest and largest observations only if the sample
 values were all larger than the median, or all smaller than the
 median.  The probability of this happening is  1/2¹⁰ + 1/2¹⁰  or  1/2⁹,
 about 0.002.  Thus, we would say that an estimated or a hypothesized
 value of the median less than  x₁  or greater than  x₁₀  is highly
 inconsistent or disconsonant with the sample, and we could call the
 interval  (x₁, x₁₀)  a  99.8%  consonance interval for the median.
 Similar intervals with their consonance coefficients are given below.
                     (x₂, x₉)          97.9%
                     (x₃, x₈)          89.1%
                     (x₄, x₇)          65.6%
                     (x₅, x₆)          24.6%
 Intervals given the consonance interpretation are probably best under-
 stood by considering the  "degree of disconsonance"  of values outside
 the interval.
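These coefficients come from counting binomial outcomes: the median lies inside  (x_(k), x_(n+1-k))  exactly when between  k  and  n-k  of the 10 observations fall below it.  A minimal sketch of that computation, assuming SciPy:

    from scipy.stats import binom

    n = 10
    for k in range(1, 6):
        # The count of observations below the median is Binomial(n, 1/2);
        # the interval (x_(k), x_(n+1-k)) covers the median when that
        # count is between k and n-k inclusive.
        coverage = binom.cdf(n - k, n, 0.5) - binom.cdf(k - 1, n, 0.5)
        print(k, n + 1 - k, round(100 * coverage, 1))
    # prints 99.8, 97.9, 89.1, 65.6, 24.6 -- the coefficients above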
        Another way of looking at intervals constructed as above is
 regarding them as confidence intervals.  Of all the (random) samples of
 size 10 that could be drawn,  99.8%  of them will be such that the
 population median lies between  x₁  and  x₁₀.  Hence, if one draws a
 sample of size 10 at random, and flatly claims that the population
median is between  x₁  and  x₁₀,  he has a 99.8% chance of being correct.
Thus, we may call  (x₁, x₁₀)  a confidence interval on the population
median with confidence coefficient  99.8%,  or simply a 99.8% confidence
 interval.   Similar statements apply to the other intervals, the confidence
 coefficients being exactly identical to the consonance coefficients.  It
may seem tempting to discard a consonance statement or a confidence


statement in favor of a simple probability statement such as,  "The
probability that the median lies between  x₁  and  x₁₀  is 0.998,"  but
we cannot do that, because the position of the median, although unknown,
is fixed and is not a subject of probability.  Furthermore, since the
median is fixed, any given interval either will or will not include it;
again, there is no probability involved.  What the probability refers to
is the method of making confidence statements, and is generally given a
frequency interpretation such as:  if one continued to draw random samples
of size  10  and stated every time that the population median lay between
the smallest and largest observation of the sample,  then  99.8%  of the
time he would be making a correct statement.
        Confidence intervals are most commonly applied to situations
illustrated by the preceding  example, to make statements about population
quantities such as the mean, median, and variance.  Two additional uses
of confidence statements have been given special names.  Sometimes it is
useful to be able to make confidence statements about limits which are to
include a given proportion of all population values, and such limits are
called tolerance  limits or tolerance intervals.   Although the use of
tolerance intervals also requires the assumption of some nonspecific
continuous underlying population distribution, they too are often applied
in an approximate sense to a large finite population.  Another slightly
different application of confidence statements is using them in connection
with future observations.  For example, for any given probability level
p  and any given proportion  q,  it is possible to construct limits which
will include the proportion  q  of future observations with probability  p.
Confidence intervals used in this way are called prediction intervals.
Tolerance intervals and prediction intervals will be discussed further in


the section,  "Statistical Inference in Infinite Populations."
        We have introduced in this section three activities which can be
considered the basic elements of statistical inference: estimation, testing,
and construction of statistical intervals.   These subjects will be
discussed further in the next two sections.
5.3  Statistical Inference in Infinite Populations
        The large majority of cases where one is attempting to learn
something about a population by studying samples taken from it are such
that the population of inference can be considered to be effectively
infinite.  In the case of finite populations, it was pointed out that no
structure or distribution was assumed for the population; the only
probability structure available for purposes of statistical inference was
the result of random sampling.  We also found that estimation, testing,
and statistical interval construction were thereby somewhat limited,
although the things which can be done represent a giant step above a
policy of pure subjectivity.  At this point, we will now suppose that a
number population is such that its structure or behavior can be represented,
or at least approximated with a probability distribution, and we shall use
the terms population and distribution interchangeably.  By and large, we
will have in mind continuous distributions, although Poisson, geometric,
negative binomial, and other infinite discrete distributions can describe
certain infinite populations.  Continuous populations are generally considered
to arise as a result of numerical measurements.  Classification or
categorical populations are not of this type, and the methods of this
section are not appropriate for studying them.
        Whenever a population can be considered to follow some continuous
distribution, it is not always necessary to know the specific form of the

-------
                                                                      67
 distribution.  There are methods, called nonparametric statistical methods,
 for which it is sufficient to assume only that  the population follows
 some continuous distribution without specifying  the  actual  form.  The
 other methods,   which have been by far  the most studied, most applied, and
 most discussed, are the so-called  parametric methods, whereby the form of
 the distribution is specified except for one or more  unknown parameters.
 The parametric approach,  surprisingly enough,  still allows a great deal
 of latitude.  The normal family of distributions, for example, with  μ
 and  σ²  allowed to vary,  encompasses a wide variety of shapes and locations.
 However,  the primary reason for the prevalence  of parametric methods in
 present-day statistical methodology is  the fact that  a  great body of powerful
 mathematical theory may be  brought to bear with regard  to determining
 optimum procedures:   "best"  estimates,   "most  powerful"  tests, etc.
 It is possible  to answer questions like,   "Of all the unbiased estimates
 of some population parameter,  does one  exist having maximum  precision
 (minimum variance),  and if  so,  how may it be found?"
        Random sampling from infinite populations translates into,
 "drawing observations  independently."   From a theoretical standpoint, it
 is  convenient to  think  of a sample of   n   observations drawn at random
 as  being a  set  of n  independent  random variables,   each having the
 same distribution as the population from which  it was drawn.  Quantities
 calculated from samples, such as a sample mean or sample variance, will
 then be functions of randan variables and so will be themselves random
variables having distributions.  Such quantities are called statistics.
        Traditionally, the terra statistic has been applied to any
descriptive quantity calculated from a set of numbers; however,  when we
speak here of a statistic,  we shall mean a random variable which is a

 function of sample observations and which does not depend upon any unknown

 parameters.


 Example 15:  Let a sample of size  n  be drawn at random from a normal
 distribution with mean  μ  and variance  σ²;  denote the sample by
 (x₁, ..., xₙ).  Then,  the sample mean  x̄ = Σxᵢ/n  and the sample variance
 s² = Σ(xᵢ - x̄)²/(n-1)  are statistics, and it can be shown that  x̄  has a
normal distribution with mean  μ  and variance  σ²/n,  and that  (n-1)s²/σ²
has a Chi-Square distribution with  n-1  degrees of freedom.  The purpose
of considering these statistics in the first place is that  x̄  and  s²  are
unbiased estimates of  μ  and  σ²,  respectively; i.e.,  E(x̄) = μ  and
E(s²) = σ².  A somewhat startling fact is that  x̄  and  s²  are
independent random variables whenever the parent distribution is normal.
One would probably not have suspected that this could be possible, as the
formula for  s²  seems to functionally involve  x̄.  A useful algebraic
identity is
             Σ(xᵢ - x̄)²  =  Σxᵢ² - nx̄²  =  Σxᵢ² - (Σxᵢ)²/n ,
and one of the two expressions on the right is generally used to actually
compute  s².
        Now, suppose that the value of  μ  is known,  say  μ = μ₀.
Since  (x̄ - μ₀)/(σ/√n)  would then be a standard normal variable and
(n-1)s²/σ²  an independent Chi-Square variable, we may form
      t  =  [(x̄ - μ₀)/(σ/√n)] / √[(n-1)s²/((n-1)σ²)]  =  √n(x̄ - μ₀)/s ,
which has the Student's t-distribution with  n-1  degrees of freedom.
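A quick numerical check of this fact is easy to run.  The following sketch (assuming NumPy and SciPy; the sample size and parameters are arbitrary) simulates  √n(x̄ - μ₀)/s  for normal samples and compares an empirical tail frequency with the t-distribution's:

    import numpy as np
    from scipy.stats import t

    rng = np.random.default_rng(1)
    n, mu0, sigma = 8, 5.0, 2.0
    tvals = np.array([
        np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)
        for x in rng.normal(mu0, sigma, size=(100_000, n))
    ])
    # Both frequencies should be near 0.05; 2.365 is the two-tailed
    # 5% critical value of Student's t with 7 degrees of freedom.
    print(np.mean(np.abs(tvals) > 2.365))
    print(2 * t.sf(2.365, df=n - 1))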

         It is useful to think of the t-distribution in relation to a
 normal distribution.  If  X  has a normal distribution with mean zero
 and variance  σ²,  then  X/σ  has the standard normal distribution.
 Suppose that we do not know  σ²,  but can unbiasedly estimate it from
 n  observations with the statistic  s².  Then  X/s  has the t-distribution
 with  n-1  degrees of freedom,  the distribution depending only upon the
 number of observations available for estimating  σ².  The shape of
 the distribution of  X/s  is similar to that of  X/σ,  except the
 distribution of  X/s  is more spread out, owing to the variability of the
 statistic  s.  As  n  increases,  the t-distribution becomes more like
 the standard normal until,  for 100 degrees of freedom,  a t-distribution
 is,  for all practical purposes,  the same as a standard normal distribution.



         There are some facts hinted at in the preceding example which hold
 across almost all continuous distributions.
 Practically every continuous distribution of interest has a mean  μ  and
 a variance  σ²,  expressible in terms of distribution parameters.  In
 random samples of size  n  from any of these distributions:

    1.  The sample mean  x̄  is an unbiased estimate of  μ,  and
        Var(x̄) = σ²/n .

    2.  The sample variance  s²  is an unbiased estimate of  σ².

    3.  An unbiased estimate of Var(x̄) is  s_x̄² = s²/n .


        Thus,  the precision of the sample mean as an estimate of the



population mean depends on the sample size and the inherent variability



of the population.  Since  s.d.(x̄) = σ_x̄ = σ/√n,  it is evident that for


a given population, the standard deviation of a sample mean is inversely



proportional to the square root of the sample size.   Precision is often



expressed in terms of standard deviation instead of variance,  and when

 it is,  the statement is often made that to increase the precision of a
 sample mean by a factor of  k,  the sample size must be increased by a
 factor of  k².  For example,  to reduce the standard deviation of a
 sample mean by a factor of  1/2  (double the precision),  the sample size
 must be increased by a factor of  4.
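A one-line numerical illustration of this square-root law (plain Python; σ = 2 is arbitrary):

    import math

    sigma = 2.0
    for n in (25, 100):                  # quadruple the sample size ...
        print(n, sigma / math.sqrt(n))   # ... and s.d. of the mean halves: 0.4, 0.2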
 5.4 Statistical Inference in Normally Distributed Populations
        As previously mentioned, many statistical techniques are based
 on the assumption that the population of inference follows a normal distri-
 bution.   To  reiterate briefly, mathematical tractability, usefulness for
 approximations, and consequences of the Central Limit Theorem were
 suggested as justifications for using the normal distribution.  Additionally,
 we should note that in several instances in the sciences where
 probabilistic models have been used to describe physical phenomena,
 theoretical  considerations have led directly to a normal distribution.
 In many more cases, it can be shown that a normal distribution approximately
 fits the  situation and, often, an approximate population distribution
 is  quite  satisfactory.  The general rule of operation whenever a con-
 tinuous distribution is thought to describe some population is this:
 if  there  is good reason to assume that some specific continuous distribution
 applies   (normal,  exponential,  Chi-Square,  etc.), then the appropriate
 theory for the particular distribution is employed.  If the form of the
 distribution cannot be determined,  then,  unless there is definite
 evidence  to the contrary,  a normal distribution is often used in an
 approximate sense.   If the form of the distribution is not known,  and it
 is  definitely felt that the assumption of a normal distribution is not
warranted,  then one usually reverts to one of the several nonparametric
methods.    Only one or two nonparametric methods have been or will be




discussed, but they are generally straightforward  and uncomplicated to


apply.  One who is interested in nonparametric methods may find them


discussed in the references.


        In the following discussions, we shall assume that a normal

population distribution applies, but that the parameters  μ  and  σ²


are unknown and that inferences are to be made concerning them.  We shall


restrict ourselves to the basic elements of statistical inference:


estimation,  testing, and construction of statistical intervals.


5.4.1  A Single Normal Population


        Suppose that a random sample of size  n,  (x₁, x₂, ..., xₙ),  is
drawn from a normal population with unknown mean  μ  and unknown variance
σ².  Then we know that the following facts are true:

         1.  The sample mean  x̄ = (Σxᵢ)/n  is an unbiased estimate of
             μ,  and over repeated sampling,  x̄  follows a normal
             distribution with mean  μ  and variance  σ²/n.  The variance
             of  x̄  is smaller than that of any other unbiased estimate
             of  μ,  so that  x̄  is the unbiased estimate of  μ  with maximum
             precision.  We characterize this situation by saying that
             x̄  is a minimum-variance unbiased estimate of  μ.  If a
             minimum-variance unbiased estimate is unique,  then we say
             that it is the best unbiased estimate.  Thus,  x̄  is the
             best unbiased statistic for estimating  μ.

         2.  The sample variance
                          s²  =  Σ(xᵢ - x̄)²/(n-1)
             is the best unbiased estimate of  σ²,  and over repeated
             sampling,  (n-1)s²/σ²  has a Chi-Square distribution with
             n-1  degrees of freedom.

         3.  The statistic  s²/n  is the best unbiased estimate of
             Var(x̄) = σ²/n.

         4.  For any  μ₀,  if the true value of  μ  is  μ₀,  then
             √n(x̄ - μ₀)/s  is distributed as Student's t with  n-1  degrees
             of freedom.  That is, over repeated sampling with sample
             size  n,  with a new  x̄  and  s  calculated for each new sample,
             repeated calculations of  √n(x̄ - μ₀)/s  would yield numbers
             occurring according to the probability law for a Student's t-
             distribution with  n-1  degrees of freedom.
        Suppose we suspect that the value of  μ  is  μ₀.  Then we say
that the tentative null hypothesis is that  μ = μ₀,  and we may test the
tenability of  μ₀  by taking a random sample  (x₁, x₂, ..., xₙ),  and from
it calculating  t_cal = √n(x̄ - μ₀)/s.  Now, if  μ₀  were the exact value
of  μ,  then  t_cal  should be a reasonably occurring value of a random
variable having a Student's t-distribution with  n-1  degrees of freedom.
Suppose, however, that we are mistaken, that  μ  is really  μ₁ < μ₀.  Then,
in this case,  t₁ = √n(x̄ - μ₁)/s  would have been the proper value to
calculate.  We see that what we actually calculated,  t_cal,  and what we
should have calculated,  t₁,  are related by
          t_cal  =  √n(x̄ - μ₁)/s - √n(μ₀ - μ₁)/s  =  t₁ - √n(μ₀ - μ₁)/s .

 Thus, if  μ  is really  μ₁,  some value less than  μ₀,  t_cal  is a number
 less than the Student's t-value  t₁  by the amount  √n(μ₀ - μ₁)/s.  In
 such a case,  we would get a value for  t_cal  which would tend to be too
 low a value for a reasonable t-variable to assume, and the greater the
 difference  μ₀ - μ₁,  the lower  t_cal  would be in relation to a reasonable
 t-value.  If, on the other hand,  μ  is some value  μ₂ > μ₀,  then  t_cal
 is larger than a Student's t-variable by an amount  √n(μ₂ - μ₀)/s.
        Therefore,  to test the tenability of the value  μ₀  as the
 population mean, we calculate  t_cal = √n(x̄ - μ₀)/s,  and determine if
 t_cal  is a reasonable value for a Student's t-variable with  n-1  degrees
 of freedom to assume.  If  t_cal  is judged to be too large or too small,
 then we say that there is strong evidence against the hypothesis that
 μ = μ₀.



        Throughout this discussion, we have used the word  "reasonable"
without explanation.  What is meant by a  "reasonable"  value of a
Student's t-variable?  To answer this question,  we should examine a
table of the Student's t-distribution.  Such a table can be found in a
CRC Handbook of Chemistry and Physics.  For the various degrees of freedom,
one will find critical values tabulated under columns for p-values
ranging from 0.9 to 0.01.  The p-values refer to two-tailed critical
values,  and the meaning of the terms is as follows:  consider the
tabulated values for,  say,  10 degrees of freedom.  For the two-tailed
p-value  0.5,  the critical value of  t  is given as  0.7.  This means
that  Pr(-0.7 < t < 0.7) = 0.5,  or  50%  of the t-values fall between
-0.7 and 0.7.  Reading across, we similarly find that 60% are between
-0.879 and 0.879, since  Pr(-0.879 < t < 0.879) = 0.6,  ...,  95% are between
-2.228 and 2.228, and so on.  Often, tail areas called α-levels are

needed.  To find the two-tailed critical value for a given
α-level, one finds the corresponding two-tailed critical value for  p = 1 - α.
Also, it is sometimes necessary to know the probability of getting a
value larger than a certain value, say  t₀,  or the area in the right tail
of the t-distribution above  t₀.  Since the t-distribution is symmetric,
the one-tail α-level is  (1 - p)/2,  where  p = Pr(-t₀ < t < t₀)  as given
by the table.
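Tables like these can be reproduced with any modern statistics library; a small sketch using SciPy's t-distribution:

    from scipy.stats import t

    df = 10
    # Two-tailed critical value for p = 0.5, i.e. Pr(-t0 < t < t0) = 0.5
    print(t.ppf(1 - (1 - 0.5) / 2, df))   # about 0.700
    # One-tail area above 2.228 (the two-tailed 95% point for 10 df)
    print(t.sf(2.228, df))                # about 0.025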
        Now, suppose that from a sample of size 10,  t_cal = √n(x̄ - μ₀)/s
was calculated to be 2.8; is this a reasonable t-value, or is it too
large?  From the t-table for 9 degrees of freedom, we determine that the
probability of getting a t-value larger than 2.8 is about 0.01, so that, if  μ
really has the value  μ₀,  then on the average, only one sample in a hundred will
give a  t_cal  of 2.8 or larger.  What if the sample had given a  t_cal  of 0.8
instead of 2.8?  In this case, the probability of a t-value larger than
0.8 is about 0.22.  Should a t-value of 0.8 be considered unusual?  It
should now be evident that  "reasonableness"  is largely a matter of
personal preference.  The probabilities 0.01 and 0.22 are called
significance levels.  The smaller the significance level, the stronger
the evidence against the null hypothesis, but the strength of evidence
needed to discredit the tentative null hypothesis varies from person to
person and from situation to situation.  As a general rule, people tend
to consider a significance level of 0.05 or less as strong evidence
against the null hypothesis.
        It is also possible to construct confidence intervals for the
parameters  μ  and  σ².

 (a)  Confidence Interval for  μ
        Since  √n(x̄ - μ)/s ~ t(n-1),  for any level of confidence desired,
 say 90%, we may find from a t-table a value  t₀  such that
 Pr(-t₀ < t(n-1) < t₀) = 0.9,  and we may assert  (before any one sample is
 drawn)  that  Pr(-t₀ < √n(x̄ - μ)/s < t₀) = 0.9.  Changing quantities
 around, we may make the above interval into a fixed-length interval with
 random placement for which the probability statement still holds, that is,
                  Pr(-t₀ < √n(x̄ - μ)/s < t₀)  =  0.9
 so               Pr(-t₀s/√n < x̄ - μ < t₀s/√n)  =  0.9,
 and              Pr(x̄ - t₀s/√n < μ < x̄ + t₀s/√n)  =  0.9.

Thus, we think of the limits  x̄ ± t₀s/√n  as a random interval, and over
repeated sampling, intervals will be generated, 90% of which will contain
μ.  After taking a sample, we have no way of knowing whether the resulting
interval does or does not contain  μ.  However, we do know that only 10%
of all the intervals which could be constructed in this way would fail to
include  μ  and, accordingly, we call  x̄ ± t₀s/√n  a 90% confidence interval
for  μ.


(b)  Confidence Interval for  σ²
        Suppose a 95% confidence interval is desired.  Since
(n-1)s²/σ² ~ χ²(n-1),  we may find from tables of Chi-Square values
a₁  and  a₂  such that  Pr((n-1)s²/σ² < a₁) = 0.025  and
Pr((n-1)s²/σ² > a₂) = 0.025,  or
Pr(a₁ < (n-1)s²/σ² < a₂) = 0.95   (a₁  and  a₂  must be found separately
because of the asymmetry of the Chi-Square distribution).  From the
probability statement above, we derive  [(n-1)s²/a₂, (n-1)s²/a₁]  as
a 95% confidence interval on  σ².
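As a sketch of the arithmetic (SciPy supplies the Chi-Square table values; the sample size and variance are borrowed from the pH example that follows):

    from scipy.stats import chi2

    n, s2 = 8, 0.0031            # sample size and sample variance
    a1 = chi2.ppf(0.025, n - 1)
    a2 = chi2.ppf(0.975, n - 1)
    # 95% confidence interval on sigma^2
    print((n - 1) * s2 / a2, (n - 1) * s2 / a1)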

Example 16:  For a pilot-plant experiment, 500 gallons of reactant must
be mixed prior to a day's run.  Although some leeway is permissible, for
best results, the pH of the solution should not exceed 6.3 by much and
should not be much below 6.1.  Experience has shown that fluctuations in
measurements of this type tend to be normally distributed.  Let us assume
that this is the case here.  Over an hour's time, 8 samples were taken
yielding pH values 6.23, 6.31, 6.29, 6.19, 6.23, 6.17, 6.15, 6.21.  On
the basis of this sampling, may the experiment be started?  We calculate
                  x̄ = 6.22,   s² = 0.0031,   s/√n = 0.0196 .
Testing  μ₀ = 6.3  gives  t_cal = (6.22 - 6.3)/0.0196 = -4.08,  and testing
μ₀ = 6.1  gives  t_cal = (6.22 - 6.1)/0.0196 = 6.12.
If the actual pH is 6.3 or higher, the probability of getting a sample
yielding a t-value as low as -4.08 is small, less than 0.005.  Thus, we
would say that a pH of 6.3 or higher is strongly contradicted by the sample.
Similarly, if the actual pH is 6.1 or lower, the probability of getting
a sample with a t-value as high as 6.12 is less than 0.0005, so a pH lower
than 6.1 is also contradicted by the data.
        A 99% confidence interval on the pH of the solution is  6.22 ±
(3.499)(0.0196),  or 6.15 to 6.29.  Using the consonance interpretation of
the confidence interval, we could have combined the testing into the single
statement that a true solution pH lower than 6.15 or higher than 6.29 is
highly inconsistent with the sample evidence.
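The same numbers drop out of a few lines of SciPy; a sketch (small discrepancies come from the rounding of  x̄  to 6.22 in the text):

    import numpy as np
    from scipy import stats

    ph = np.array([6.23, 6.31, 6.29, 6.19, 6.23, 6.17, 6.15, 6.21])
    n = len(ph)
    xbar, se = ph.mean(), ph.std(ddof=1) / np.sqrt(n)

    print((xbar - 6.3) / se)               # about -3.95 (text: -4.08, rounded mean)
    print((xbar - 6.1) / se)               # about  6.25 (text:  6.12)

    t0 = stats.t.ppf(0.995, n - 1)         # 3.499 for a 99% interval
    print(xbar - t0 * se, xbar + t0 * se)  # about (6.15, 6.29)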
        A confidence  interval on a parameter does not usually give much
information about population members, and generally should not be used for
this purpose.  However, statistics can be calculated from a sample which
may be  used in constructing confidence intervals for population members.






       Let us first consider tolerance intervals, which are confidence



statements about population proportions.   Let us also assume that the



parent population is normally distributed with unknown mean and variance.



From a random sample of size  n,  calculate the sample mean  x̄  and the
sample standard deviation  s.  Suppose we desire an interval for which
we can be  100γ%  confident that it includes at least  100P%  of the
population.  Special tables are available giving tolerance factors
K(n, γ, P)  for various combinations of sample sizes  n,  confidence levels
γ,  and coverages  P.  From such a table, find the appropriate factor,
say  K;  then  x̄ ± Ks  defines a  100γ%  tolerance interval for  100P%  of
the population.




Example 17:  For the data of Example 16,  let us calculate a 95% tolerance
interval for 90% of all pH readings obtained by the same method.  Recall
that  x̄ = 6.22  and  s = 0.0554.  From a CRC Handbook of Tables for
Probability and Statistics (6), we find in the Table of Tolerance Factors
for Normal Distributions,  K(8, 0.95, 0.90) = 3.264.  Thus,  6.22 ± (3.264)
(0.0554)  or  6.22 ± 0.1808  defines the desired tolerance interval, and we
say that with 95% confidence, 90% of all pH readings which could be
obtained in the same way as the sample of 8 are between 6.04 and 6.40.
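Tolerance factors can also be approximated in code.  The sketch below uses Howe's approximation, which gives about 3.14 here rather than the tabled 3.264, so treat it as an approximation of the table, not a reproduction of it:

    import numpy as np
    from scipy.stats import norm, chi2

    n, gamma, P = 8, 0.95, 0.90
    z = norm.ppf((1 + P) / 2)
    # Howe's approximate two-sided tolerance factor K(n, gamma, P)
    K = z * np.sqrt((n - 1) * (1 + 1 / n) / chi2.ppf(1 - gamma, n - 1))
    print(K)    # about 3.14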



       It is possible to construct distribution-free or nonparametric



tolerance limits whenever the assumption of a specific underlying



continuous distribution for the parent population is not justified;



however, this case will not be discussed.  For a discussion of distribution-



free tolerance intervals, a standard text on nonparametric methods such
as Conover (9) should be consulted.


 5.4.2  Two Non-independent Normal Populations
       A very  common and  effective method of experimentation is the so-
 called comparative experiment.  It is used whenever two sets of observations
 are  thought  to be  correlated with each other.  For example, an effective
 way  to test  the  effect of a new feed ration for animals is to select
 pairs  of individuals  from several litters and to give one member of
 each pair the  new  ration.  In this type of experiment, it is the
 difference in  response that is of interest.  The methods of analysis
 for  the  paired or  comparative experiment are applicable to any case
 involving two  samples, whether they are correlated or not, but the use
 of comparative experimentation is most effective whenever pairs of
 "similar" individuals or  objects are used.
       The discussion will be for the case when the respective populations
 from which the pairs are  drawn are assumed to be normally distributed.
 If normality assumptions  do not appear to be warranted, a comparative
 experiment can still be analyzed with one of the non-parametric
 techniques such as the sign test, the Wilcoxon sign-rank test, or the
 Walsh  test (see Siegel (31) ).
       Let  (x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ)  be  n  pairs of observations
 drawn at random from normal populations, with the x's having a normal
 (μ₁, σ₁²)  distribution and the y's having a normal  (μ₂, σ₂²)  distribution.
 Also assume that each pair  (xᵢ, yᵢ)  is correlated, with correlation
 equal to  ρ.  Note that because of random sampling, there will be no
 correlation between  xᵢ  and  xⱼ,  yᵢ  and  yⱼ,  or  xᵢ  and  yⱼ  for  i ≠ j.
 Form a "new" sample of differences  (d₁, d₂, ..., dₙ)  by taking
 dᵢ = xᵢ - yᵢ.  Then the following facts hold:






          1.  If the x's and y's are normally distributed, then the d's
              are also normally distributed with mean  μ_d = μ₁ - μ₂  and
              variance  σ_d² = σ₁² + σ₂² - 2ρσ₁σ₂.
          2.  The differences  (d₁, d₂, ..., dₙ)  may then be treated as a
              random sample from a single normal population, so the methods
              for a single normal population apply:  d̄ = Σdᵢ/n  is the best
              unbiased estimate of  μ₁ - μ₂,  and  s_d̄ = s_d/√n,  where  s_d²
              is the sample variance of the dᵢ,  estimates the standard
              deviation of  d̄.
          3.  If the true difference is  d,  then  (d̄ - d)/s_d̄  is distributed
              as Student's t with  n-1  degrees of freedom.

Thus, to test whether there is evidence of a real difference of at least  d,
we form  t_cal = (d̄ - d)/s_d̄;  if  Pr(t(n-1) > t_cal),  as determined from
the table, is small,  we would say that there is strong evidence that
there is a real difference of at least  d .

        To get a 100p% confidence interval on the true difference,
from the t-table for  n-1  degrees of freedom, find  t₀  such that
Pr(-t₀ < t(n-1) < t₀) = p.  Then  d̄ ± t₀s_d̄  defines the 100p% confidence
interval.
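A compact sketch of the paired procedure (the x/y readings are hypothetical; SciPy's ttest_rel does the same work):

    import numpy as np
    from scipy import stats

    x = np.array([5.1, 4.8, 5.6, 5.0, 4.9])   # hypothetical paired readings
    y = np.array([4.7, 4.9, 5.1, 4.6, 4.8])
    d = x - y
    dbar = d.mean()
    s_dbar = d.std(ddof=1) / np.sqrt(len(d))

    t_cal = dbar / s_dbar                      # tests a zero true difference
    print(t_cal, 2 * stats.t.sf(abs(t_cal), len(d) - 1))
    # Equivalent: stats.ttest_rel(x, y)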
Example 18:   In a coal-fired boiler, Unit 10 at the Shawnee Steam Plant,
Paducah, Kentucky, there are essentially two separate furnaces, the East
side and the West side.  During limestone injection testing on the unit,
although input and control variables for the two sides were thought to
be approximately the same, with respect to some variables the two sides
seemed to be consistently different.  Without trying to explain the
differences, it was decided to try to determine,  on the basis of the
variation observed in the variables, whether or not there was any evidence
that real differences existed.  The complete data will not be given.
The number of paired observations available for testing each variable
varied,  but where readings were taken for both East and West sides, a
difference  dᵢ = (East reading) - (West reading)  was formed.  Thus, for
each variable, a sample of  dᵢ's  was formed, a  d̄  calculated, and
s_d̄  estimated.  With the assumption that the variation in the variables
is governed by normality, t-values such as  t_cal = d̄/s_d̄  may be
calculated and tested.  The summary of results is given below:

  Variable                       d̄       s_d̄     t_cal    df    Conclusion*
  Initial SO2 Concentration     32.3    14.7      2.2    100    East >> West
  Plane M Temperature           -9.9     9.2     -1.1     63    East ≈  West
  % Limestone Utilization       -0.97    0.19    -5.0     97    East << West
  % Excess Air                   0.88    0.52     1.7     95    East ≈  West
  % Hydration                   -1.8     0.59    -3.4     45    East << West
  % SO2 Removal                 -2.0     0.5     -4.0     97    East << West

              * >> (<<)  means  "is significantly higher (lower) than"
                ≈         means  "is not significantly different from"







 5.4.3   Two  Independent Normal Populations



        Often when two samples are taken, there is no reason to consider
them related or dependent.  Let  (x₁, x₂, ..., x_k)  and  (y₁, y₂, ..., y_m)
be respective random samples from normal populations with  xᵢ ~ N(μ₁, σ₁²)
and  yⱼ ~ N(μ₂, σ₂²).  Then we have:

           1.  x̄  is the best unbiased estimate of  μ₁;  ȳ  is the best
               unbiased estimate of  μ₂;  and over repeated sampling,
               x̄ ~ N(μ₁, σ₁²/k)  and  ȳ ~ N(μ₂, σ₂²/m).

           2.  x̄ - ȳ  is the best unbiased estimate of  μ₁ - μ₂,  and over
               repeated sampling,  (x̄ - ȳ) ~ N(μ₁ - μ₂, σ₁²/k + σ₂²/m).

           3.  s₁² = [Σxᵢ² - (Σxᵢ)²/k]/(k-1)  and  s₂² = [Σyᵢ² - (Σyᵢ)²/m]/(m-1)
               are, respectively, the best unbiased estimates of  σ₁²  and
               σ₂².

           4.  If  σ₁² = σ₂² = σ²,  say, then the best unbiased estimate
               of  σ²  is a weighted combination of the estimates
               s₁²  and  s₂².  The combined estimate is called a pooled
               estimate and is given by

                 s²  =  [(k-1)s₁² + (m-1)s₂²] / (k + m - 2)
                     =  {[Σxᵢ² - (Σxᵢ)²/k] + [Σyᵢ² - (Σyᵢ)²/m]} / (k + m - 2) .

           5.  An estimate of the standard deviation of  x̄ - ȳ  is
               √(s₁²/k + s₂²/m)  if  σ₁² ≠ σ₂²,  and  s√(1/k + 1/m)  if
               σ₁² = σ₂².

        It is very important in the analysis of two independent samples
to know whether or not  σ₁²  and  σ₂²  can be considered to be the same.
This can be tested with an F-test for equality of variances as follows:



          Suppose  s₁²  is somewhat greater than  s₂²,  so that we may suspect
  that  σ₁²  is greater than  σ₂².  Now,  (k-1)s₁²/σ₁²  and  (m-1)s₂²/σ₂²  are
  distributed as Chi-Squares with k-1 and m-1 degrees of freedom, respectively.
  Recall that a ratio of Chi-Square random variables, each divided by its
  degrees of freedom, has an F distribution; thus,

            (s₁²/σ₁²) / (s₂²/σ₂²)   has an F distribution with  k-1  and  m-1
                                    degrees of freedom.

  If  σ₁² = σ₂²,  then the variances cancel out of the expression.  Therefore,
  if  σ₁² = σ₂²,  then  s₁²/s₂²  should be a reasonably occurring value of an
  F random variable with k-1 and m-1 degrees of freedom.  We may compare the
  value of  s₁²/s₂²  obtained with tabulated values for an  F(k-1, m-1).  Such
  a table can be found in any applied statistics text and in a CRC Handbook
  of Chemistry and Physics.  If  s₁²/s₂²  is  "too large,"  then it is concluded
  that  σ₁²  is probably larger than  σ₂²;  otherwise, we say that there is
  insufficient evidence to consider  σ₁²  and  σ₂²  to be different.  Depending
  upon the results of the F-test, we can take one of two alternatives:


 Case 1:  Variances Assumed the Same
         If one concludes that  σ₁²  and  σ₂²  are probably about the same,
 then  s₁²  and  s₂²  can be regarded as two independent estimates of the same
 thing, and they can be combined as in 4, above, for a pooled estimate  s²,
 which makes use of all the observations.
        Suppose an experiment yielded a positive difference  x̄ - ȳ,  and
suppose a real positive difference of magnitude  d  is practically significant.
To test whether there is evidence that  μ₁  is greater than  μ₂  by  d  or

more, form

       t_cal  =  [(x̄ - ȳ) - d] / [s√(1/k + 1/m)] ,   based on  k + m - 2
                                                      degrees of freedom,

and compare  t_cal  with tabulated values of  t(k + m - 2).  If
Pr(t(k + m - 2) > t_cal)  is small, then we say that there is evidence of a
real difference of at least  d.
        A 100p% confidence interval on  μ₁ - μ₂  is given by
(x̄ - ȳ) ± t₀√(1/k + 1/m) s,  where  t₀  is a value from the t-table for
k + m - 2  degrees of freedom such that  Pr(-t₀ < t(k + m - 2) < t₀) = p.
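A sketch of Case 1 with hypothetical samples (SciPy's equal-variance t-test gives the  d = 0  case directly):

    import numpy as np
    from scipy import stats

    x = np.array([10.2, 9.8, 10.5, 10.1, 9.9])    # hypothetical sample 1
    y = np.array([9.6, 9.4, 9.9, 9.5, 9.7, 9.3])  # hypothetical sample 2
    k, m = len(x), len(y)

    # Pooled estimate of the common variance, as in fact 4 above.
    s2 = ((k - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (k + m - 2)
    d = 0.0                                       # difference of practical interest
    t_cal = ((x.mean() - y.mean()) - d) / np.sqrt(s2 * (1 / k + 1 / m))
    print(t_cal, stats.t.sf(t_cal, k + m - 2))    # one-tailed significance level
    # Equivalent for d = 0: stats.ttest_ind(x, y, equal_var=True)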




Case 2:  Variances Assumed Different
        Unfortunately, whenever  σ₁² ≠ σ₂²,  √(s₁²/k + s₂²/m)  does not form the
denominator of a  t  random variable except in an approximate way.  The
problem of testing and constructing confidence intervals for the case
σ₁² ≠ σ₂²  is one of some note and is known as the Behrens-Fisher problem.
There have been several satisfactory approximate methods proposed; the
following one is called the Aspin-Welch solution:
        Let  s₁²/k  be denoted by  s_x̄²,  let  s₂²/m  be denoted by  s_ȳ²,  and
let  s_x̄² + s_ȳ² = s₁²/k + s₂²/m  be denoted by  s₀².  Then, if the true
difference  μ₁ - μ₂  is  d,

                       t'  =  [(x̄ - ȳ) - d] / s₀

has an approximate  t  distribution with  r  degrees of freedom, where  r
is computed by

                       r  =  s₀⁴ / [s_x̄⁴/(k-1) + s_ȳ⁴/(m-1)] .

 Testing and confidence interval calculations are similar to Case 1,
 except  t'  and  s₀  are used.
 Example 19:  (Snedecor (32))
        A quick but imprecise method of estimating the concentration of a
 chemical in a vat has been developed.  Eight samples from the vat are
 analyzed by the quick method, and four samples are analyzed by the standard
method which is precise but slow.  It is desired to determine whether
results by the quick method are biased.  The data and calculations are:

      Standard Analysis (x)                        Quick Analysis (y)
               25                                         23
               24                                         18
               25                                         22
               26                                         28
                                                          17
                                                          25
                                                          19
                                                          16

           x̄  =  25                                   ȳ  =  21
          s₁²  =  0.67                               s₂²  =  17.71
  s_x̄² = s₁²/4 =  0.17                       s_ȳ² = s₂²/8 =  2.21

                         s₀²  =  0.17 + 2.21  =  2.38

           r  =  (2.38)² / [(0.17)²/3 + (2.21)²/7]  ≈  8





         First, it appears that  s₁²  and  s₂²  are not estimating the same
 thing.  F_cal = 17.71/0.67 = 26.43  with 7 and 3 degrees of freedom supports
 this, as the probability of an F-value as large as 26.43 occurring (if  σ₂²
 is no larger than  σ₁²)  is less than 0.025 and only slightly more than
 0.01.  Thus we should assume that  σ₁² ≠ σ₂²  and use  t'  for testing
 and confidence intervals.  Let us evaluate the probability of getting a
 difference of  x̄ - ȳ = 4  if the quick method is not biased.  We use

          t'  =  4/√2.38  =  2.6   with  r = 8  degrees of freedom.

 The probability of getting a value as large as 2.6  (if  μ₂ - μ₁ = 0)  is
 less than 0.017.  Since this is small, there is strong evidence that the
 quick method of analysis consistently underestimates the concentration.
         Let us take this opportunity to illustrate the use of a one-sided
 confidence interval.  We shall find a number which we are  90%  confident
 that  x̄ - ȳ  exceeds; that is, a lower 90% confidence limit for the
 difference  μ₁ - μ₂.  From a t-table for 8 degrees of freedom, we must
 find a value  t₀  such that  Pr(t(8) < t₀) = 0.9.  The value of  t₀  is
 1.397,  so that  (x̄ - ȳ) - t₀s₀ = 4 - (1.397)(1.54) = 1.84  is the
 desired limit.  Hence, we say that we are 90% confident that the difference
 μ₁ - μ₂  is at least 1.84 units.
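The whole example runs in a few lines; SciPy's Welch option reproduces the approximate t and degrees of freedom (a sketch):

    import numpy as np
    from scipy import stats

    x = np.array([25, 24, 25, 26], dtype=float)                  # standard
    y = np.array([23, 18, 22, 28, 17, 25, 19, 16], dtype=float)  # quick

    s2x, s2y = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
    s0 = np.sqrt(s2x + s2y)                                      # sqrt(2.38)
    t_prime = (x.mean() - y.mean()) / s0                         # about 2.6
    r = s0**4 / (s2x**2 / (len(x) - 1) + s2y**2 / (len(y) - 1))
    print(t_prime, r)                                            # r about 8
    # Equivalent: stats.ttest_ind(x, y, equal_var=False)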




 5.4.4  Several  Independent Normal Populations



        For more than two populations, we will consider only the simplest



 case.  Let  (x₁₁, x₁₂, ..., x₁ₙ), (x₂₁, x₂₂, ..., x₂ₙ), ..., (x_k1, x_k2, ...,
x_kn)  be  k  samples all of size  n  drawn from normal populations such
 that for any  i = 1, 2, ..., k,  x_ij ~ N(μᵢ, σ²)  for all  j = 1, 2, ..., n.
That is, all  k  samples are from normal populations having the same
variance  σ²  but possibly different means  μᵢ.  Stated in general terms,

-------
                                                                      86
 the goal is to draw inferences of some sort about the means  μᵢ.
 Translated into specific terms, one or more of the following questions
 (as well as some others) might be asked:
            1.  Are there any differences in the  μᵢ,  and if so, where are
                the differences?
            2.  Can some grouping pattern in the  μᵢ  be determined?
            3.  What can be said about some specific differences, or
                combinations of differences?
            4.  How may confidence intervals be constructed for some or all
                of the  μᵢ  and for some or all of the differences
                μᵢ - μⱼ?
         The answers to  these and other similar questions have been
 attempted by the  development of multiple  inference procedures.  The
 problem  of  multiple inferences  has been kicked around for years, and the
 theory is by no means complete  on the  subject.  One of the basic difficulties
 seems to center around  how  to deal with error  rates.  The situation is
 most easily explained by using  confidence intervals.  Assume  that one
 adopts a fixed confidence level,  say 99%, for constructing all his con-
 fidence intervals;  then we would say that his error rate is  1%,
 because on the average,  1% of all the intervals he could construct would
 fail to cover the parameter of interest, whenever he continued to take
 independent samples from the same population.  Moreover, if he constructed
N  confidence intervals, he should expect  (0.01)N  of them to be wrong.
However, what seems to bother people is the fact that with an error rate
of  α,  the probability of getting  k  or more independent confidence
intervals without an error is less than  1 - α  and is actually  (1 - α)ᵏ.
For example, with  α = 0.05, the probability of getting 10 or more






 independent confidence intervals without an error is only about 0.6 .
 Ordinarily, people don't get too upset about it and accept as a  "fact
 of life"  that regardless of how unlikely an event may be, if  "fate"
 is tempted long enough, the event is eventually almost sure to occur.
 Perhaps the difficulty is largely psychological, in that there is a natural
 reluctance to  "push one's luck too far"  at any one time.  In spite of
 the theoretical controversy surrounding multiple inference procedures,
 they are considered by most to be useful techniques and are, in fact,
 widely used.



         We will not consider question 4 in any detail, but will try
 to show what may be done to answer 1, 2, and 3.  Let us consider
 3 first.  Whenever more than two means are involved, the concept of
 difference can be generalized by the use of contrasts or comparisons.
 Consider a linear combination of population means  r₁μ₁ + r₂μ₂ + ... +
 r_kμ_k  such that  Σrᵢ = 0.  Such a linear combination can be estimated
 unbiasedly by the same linear combination of sample means  r₁x̄₁ + r₂x̄₂ +
 ... + r_kx̄_k,  where  x̄ᵢ  is the sample mean of the ith sample
 (x_i1, x_i2, ..., x_in),  and any linear combination  Σrᵢx̄ᵢ  of sample means
 is called a contrast or comparison on the means if  Σrᵢ = 0  with not all
 rᵢ = 0.  Let  C = Σrᵢx̄ᵢ  be a contrast on sample means.  Then:



           1.  Over repeated sampling,  C  has a normal distribution with
               mean  Σrᵢμᵢ  and variance  Σrᵢ²σᵢ²/nᵢ  for unequal sample
               sizes and unequal variances.

           2.  The best estimate of  Var(C)  is  s_C² = (Σrᵢ²)s₀²/n,  where
               s₀²  is the pooled estimate of  σ²  based on  k(n-1)  degrees
               of freedom; i.e.,

                    s₀²  =  [s₁² + s₂² + ... + s_k²]/k  =  Σsᵢ²/k ,   for

               sᵢ²  =  sample variance of the ith sample  =

                       [Σⱼx_ij² - (Σⱼx_ij)²/n]/(n-1) .

           3.  Let  C₀ = Σrᵢμᵢ;  then  (C - C₀)/s_C  is distributed as a
               Student's  t  random variable with  k(n-1)  degrees of freedom.



        Note that a contrast may involve fewer than all sample means, since
some of the  rᵢ  are not prevented from being zero.  Thus, a difference
x̄ᵢ - x̄ⱼ  is a contrast, and the estimate of the standard deviation of
x̄ᵢ - x̄ⱼ  can be seen to be  √(2/n) s₀,  based on  k(n-1)  degrees of freedom.
Thus, for any general contrast  C,  t = (C - C₀)/s_C  is appropriate for
testing and confidence intervals, and for contrasts which are simple
differences,  t = [(x̄ᵢ - x̄ⱼ) - (μᵢ - μⱼ)]/[√(2/n) s₀]  is the appropriate
quantity.  For example, to test whether there is evidence that  μ₁ ≠ μ₂,  form
t_cal = (x̄₁ - x̄₂)/[√(2/n) s₀],  and compare with tabulated  t(k(n-1)).


        One of the problems associated with testing only certain contrasts


is that an experimenter is sometimes told that the contrasts to be tested


must have been planned before the data was taken and that making unplanned


tests is to be discouraged.  There are theoretical justifications for this


advice, but it is often puzzling to someone to be told that the statistical


significance of his results depends upon whether he planned to find them


before the experiment or not.


        When simple differences are of interest, a way to partially over-


come the dilemma is to say that all possible differences will be of interest






and to employ methods appropriate for answering question 1,  "Are there
any differences in the  μᵢ,  and if so, where are the differences?"
There are several multiple comparison procedures available for handling
the situation, but we shall discuss only one, the protected LSD procedure.
The procedure is implemented in two steps.  With Step 1, the first part
of question 1,  "Are there any differences in the  μᵢ?"  can be answered
by an F-test.  Let  x̿ = Σᵢx̄ᵢ/k = Σᵢ[Σⱼx_ij/n]/k  and compute

           s_b²  =  n Σᵢ(x̄ᵢ - x̿)²/(k-1)  =  n[Σᵢx̄ᵢ² - kx̿²]/(k-1) .

Then  F_cal = s_b²/s₀²  is distributed as an  F(k-1, k(n-1))  random variable
if  μ₁ = μ₂ = ... = μ_k,  and should be larger than  F(k-1, k(n-1))  otherwise.


If we deem  F_cal  to be too large to be a reasonable observation from the
distribution of  F(k-1, k(n-1)),  then we say that there is evidence that
there are differences in the  μᵢ.  It is important to note that an F-test
can only tell us whether there may be differences in the means  μᵢ,  but
that it gives no indication of any kind as to where the differences may
be.  This must be determined by Step 2 of the protected LSD procedure,
which is a series of t-tests.  Let some constant significance level  α  be
chosen for the individual t-tests; then a simple method of performing the
tests is to compute the α-level least significant difference,
LSD_α = t₀√(2/n) s₀,  where  t₀  is the value of  t(k(n-1))  for which
Pr(-t₀ < t(k(n-1)) < t₀) = 1 - α.  The individual differences  x̄ᵢ - x̄ⱼ  are
then compared directly with  LSD_α,  and any differences exceeding  LSD_α
are said to be significant at level  α.
Example 20:  Suppose the following experiment was performed.  To determine



if four particular limestone types showed any detectible differences in



reactivity under laboratory conditions, six 10-gram samples of each

-------
                                                                      90
 limestone type, pulverized to a Fisher size index of 3.0,  were subjected
 for 1 hour at 2400°F to a simulated flue gas mixture having an initial
 SO2 concentration of 2000 ppm.  At the end of each test, each sample was
 analyzed for per cent conversion by determining the fraction of unreacted
 limestone.   The data summary and calculations are given as follows:

   Limestone Type:        A        B        C        D

   % Conversion:         54       68       65       45
                         62       81       83       56
                         58       87       79       39
                         67       72       61       54
                         46       75       53       60
                         85       67       66       58

   Means:                62       75       66       52

        s_b²     =  545.3  with  3  degrees of freedom
        s₀²      =  100.9  with  20  degrees of freedom
        F_cal    =  5.4  with  3  and  20  degrees of freedom
        LSD_0.05 =  √(100.9/3) × 2.086  =  12.1

Step 1.  The probability of an F(3, 20) as large as 5.4 is less than 0.01
(about 0.0075), so there is strong evidence of some differences somewhere.
Step 2.  A convenient way to carry out the procedure is to use a ranking
and underlining method, whereby series of underlinings are made under the
ranked means with the interpretation that any group of means with a
continuous line under them are to be considered not significantly different
from one another.  It may also be helpful to rank the means by simply
plotting them on a horizontal axis.  Figure 4, below, illustrates
Step 2.


 [Figure 4 plots the ranked means on a horizontal axis from 50 to 75:
 D (52), A (62), C (66), B (75), with underlines joining D-A, A-C, and C-B.]

         Figure 4.  Results of the 0.05-level LSD test (LSD_0.05 = 12.1).

 Thus it is concluded that at the 0.05 level, the significant differences
 are  μ_C - μ_D,  μ_B - μ_D,  and  μ_B - μ_A.
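The Step 1 and Step 2 arithmetic, reproduced from the summary statistics above (a sketch):

    import numpy as np
    from scipy import stats

    means = {"A": 62, "B": 75, "C": 66, "D": 52}
    n, k = 6, 4
    s0_sq = 100.9                                  # pooled variance, 20 df

    # Step 1: F-test for any differences among the means.
    grand = np.mean(list(means.values()))
    sb_sq = n * sum((m - grand) ** 2 for m in means.values()) / (k - 1)
    F = sb_sq / s0_sq                              # about 5.4
    print(F, stats.f.sf(F, k - 1, k * (n - 1)))    # p about 0.007

    # Step 2: the 0.05-level least significant difference.
    t0 = stats.t.ppf(0.975, k * (n - 1))           # 2.086
    print(t0 * np.sqrt(2 * s0_sq / n))             # LSD about 12.1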


         Let us now turn to question 2,  "Can some grouping pattern in
 the  μᵢ  be determined?"  Unfortunately,  although this question is
 frequently asked as a result of experiments involving several (possibly)
 different populations, very little has been done toward finding viable
 procedures to answer it.  The majority of existing multiple comparison
 procedures do not yield solutions,  some because of logical complications.
 Consider,  for example, what happens when we try to use the protected LSD
 procedure to determine a grouping pattern.  On the basis of the results
 of the 0.05-level LSD,  as seen above,  Limestones D and A belong together,
 Limestones A and C belong together, and Limestones C and B belong
 together.  Logically,  then, all four belong together.  But that cannot
 be, because Limestones D and C are different, as are Limestones D and B,
 as well as Limestones A and B.  Thus, a logical contradiction is reached.


        Recently, two procedures have been proposed by John R. Murphy
(28), which have been designed specifically as grouping procedures.
They are the studentized maximum gap procedure and the studentized range-
maximum gap procedure.  Let us discuss the second procedure.  For a
sample (x₁, x₂, ..., xₙ) from a normal (μ, σ²) distribution, let
(x(1), x(2), ..., x(n)) represent the ordered sample; i.e.,
x(1) = min{x₁, x₂, ..., xₙ}, ..., x(n) = max{x₁, x₂, ..., xₙ}.  Then,
over repeated sampling, the statistic (x(n) - x(1))/s is called the
studentized range (with n-1 degrees of freedom) and has a well-known
distribution which has been computed and tabulated.  (Some of the
multiple comparison procedures not discussed are, in fact, based on the
distribution of the studentized range.)  The appropriate studentized
range statistic for a set of k means (x̄₁, x̄₂, ..., x̄ₖ), each based
on n observations, is (x̄(k) - x̄(1))/(s₀/√n) with k(n-1) degrees of
freedom.  With a table of critical values of the studentized range
available, the studentized range-maximum gap procedure is carried out
as follows:
         1.   Rank the   k  means  (order  of increasing magnitude, say)  and
             compute the magnitude of the gaps  (differences between
             adjacent  ranked means).
         2.   Compute the studentized range for the whole group and test
             for  significance.
         3.   If the studentized range is judged not significant, stop
              testing.  If the studentized range is judged significant,
              break the means into two groups by splitting between the
              two means having the largest gap between them.
         4.   Repeat steps 2 and 3 for the two new groups formed, and
              continue repeating 2 and 3 for each of the new groups at the
              previous stage until no more studentized ranges are declared
              significant.
        The grouping of the means is then determined by the breaks declared
over all stages.
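        The procedure lends itself to a short recursive program.  Below is
a minimal sketch, assuming a scipy version (1.7 or later) that provides the
studentized_range distribution; the function group_means and its argument
names are our own illustrative choices.

    import numpy as np
    from scipy import stats

    def group_means(means, se, df, alpha=0.05):
        # Recursively split sorted class means at the largest gap
        # whenever the group's studentized range is significant.
        # means: class means; se: s0/sqrt(n); df: k*(n-1).
        means = np.sort(np.asarray(means, dtype=float))
        k = len(means)
        if k < 2:
            return [list(means)]
        q_obs = (means[-1] - means[0]) / se
        q_crit = stats.studentized_range.ppf(1 - alpha, k, df)
        if q_obs <= q_crit:                        # not significant: one group
            return [list(means)]
        cut = int(np.argmax(np.diff(means))) + 1   # split at the largest gap
        return (group_means(means[:cut], se, df, alpha)
                + group_means(means[cut:], se, df, alpha))

    # Limestone means, se = sqrt(100.9/6) = 4.1, df = 20:
    print(group_means([52, 62, 66, 75], 4.1, 20))
    # [[52.0], [62.0, 66.0, 75.0]]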



Example 21:  For the limestone data of the previous example, we calculate
s₀/√n = √(100.9/6) = 4.1 with 20 degrees of freedom.  The ranked and
plotted means were:

     [(D) 52, (A) 62, (C) 66, (B) 75, plotted on a horizontal axis, with
     gaps of 10, 4, and 9 between adjacent means.]

        The observed studentized range for all four means is 23/4.1 = 5.6
with 20 degrees of freedom, and the probability of a value this large is
less than 0.01.  Thus, a break is declared between x̄_D and x̄_A, and at
this stage, we now have two groups, D by itself, and A, C, and B together.
To determine whether the group of three should be further broken down,
we calculate the studentized range for the three means as 13/4.1 = 3.17.
The probability of a value as large as 3.17 is about 0.09.  One might or
might not call this a significant result, depending upon his inclination;
but if 3.17 is judged to be a significant value for the three-mean
studentized range, then another break would be declared between x̄_C and x̄_B.
The difference between x̄_A and x̄_C would not be significant, and the final
grouping would then consist of three groups:  Limestone D by itself,
Limestones A and C together, and Limestone B by itself.


5.4.5  Analysis of Variance

        Inferences about the means of several independent normal populations

with common variance are sometimes presented within the context of a

one-way classification analysis of variance.  The analysis of variance is

a very useful statistical technique and is much more general than what

will be presented here.  Its basis derives from the fact that for any

sample or group of samples, the total sample variance can be algebraically

partitioned into components which are usually meaningful.  Let us refer

to the samples from the different normal populations of the previous
section as classes, and show how the partitioning of the total sample
variance may be done for that case.   Recall that we had a total of  kn
observations (x₁₁, x₁₂, ..., x₁ₙ), ..., (xₖ₁, xₖ₂, ..., xₖₙ), where it
was assumed that for each i = 1, 2, ..., k, (xᵢ₁, xᵢ₂, ..., xᵢₙ) was
a sample from a normal (μᵢ, σ²) population, with some of the μᵢ being,
perhaps, the same.  Now, each observation xᵢⱼ may be represented as

     xᵢⱼ = x̄ + (x̄ᵢ - x̄) + (xᵢⱼ - x̄ᵢ) ,

where x̄ᵢ = Σⱼxᵢⱼ/n and x̄ = Σᵢx̄ᵢ/k = ΣᵢΣⱼxᵢⱼ/kn.  Thus,

     xᵢⱼ - x̄ = (x̄ᵢ - x̄) + (xᵢⱼ - x̄ᵢ) .

For fixed i, square both sides and sum with respect to j.  Then

     Σⱼ(xᵢⱼ - x̄)² = n(x̄ᵢ - x̄)² + 2(x̄ᵢ - x̄)Σⱼ(xᵢⱼ - x̄ᵢ) + Σⱼ(xᵢⱼ - x̄ᵢ)² .

The cross-product term vanishes, because Σⱼ(xᵢⱼ - x̄ᵢ) = Σⱼxᵢⱼ - nx̄ᵢ = 0.
Now, summing on i,

     ΣᵢΣⱼ(xᵢⱼ - x̄)² = nΣᵢ(x̄ᵢ - x̄)² + ΣᵢΣⱼ(xᵢⱼ - x̄ᵢ)² .

The quantity on the left is called the total sum of squares,  and the
two quantities on the right are called, respectively, the sum of squares
between classes  and the sum of squares within classes.   The sum of
squares between classes measures the failure of the class means to all
be the same value  x; that is, their variation about x.    For fixed  i,

Σⱼ(xᵢⱼ - x̄ᵢ)² measures the failure of the observations in the ith class to
be equal to the class mean x̄ᵢ; that is, the variation of the observations
in the ith class about the ith class mean x̄ᵢ.  Adding together all k
of the within-class variation terms, we get the total within-class
variation term, the sum of squares within classes.  Now, note that

     s_b² = nΣᵢ(x̄ᵢ - x̄)²/(k-1) .

Also, recall that

     s₀² = [(n-1)s₁² + (n-1)s₂² + ... + (n-1)sₖ²]/[k(n-1)]

         = [Σⱼ(x₁ⱼ - x̄₁)² + Σⱼ(x₂ⱼ - x̄₂)² + ... + Σⱼ(xₖⱼ - x̄ₖ)²]/[k(n-1)]

         = ΣᵢΣⱼ(xᵢⱼ - x̄ᵢ)²/[k(n-1)] .

It is now evident that the sum of squares between classes and the sum of
squares within classes are just s_b² and s₀², multiplied by their
respective degrees of freedom, k-1 and k(n-1).  The information can be
neatly summarized in an analysis of variance table, sometimes referred to
as an AOV or ANOVA.
        ANALYSIS OF VARIANCE TABLE FOR A ONE-WAY CLASSIFICATION
                         WITH EQUAL CLASS SIZES

Source of    Degrees of
Variation    Freedom        Sum of Squares       Mean Square

Total        nk - 1         ΣᵢΣⱼ(xᵢⱼ - x̄)²

Between      k - 1          nΣᵢ(x̄ᵢ - x̄)²        nΣᵢ(x̄ᵢ - x̄)²/(k-1)

Within       k(n - 1)       ΣᵢΣⱼ(xᵢⱼ - x̄ᵢ)²     ΣᵢΣⱼ(xᵢⱼ - x̄ᵢ)²/[k(n-1)]

     F_cal = between mean square / within mean square
             (with k-1 and k(n-1) degrees of freedom)







A feature of an AOV table is that the degrees of freedom and sums of
squares columns "add up;" for the table above, in both columns,
Total = Between + Within.  This property makes an AOV table a convenient
means of "variation accounting," and is partly responsible for its great
appeal.  Another appealing feature is that one or more of the mean square
quantities (which are simply the corresponding sums of squares, divided
by their respective degrees of freedom) can often be regarded as estimates
of things of interest.  For example, the "Within Mean Square" above is
actually s₀², the best unbiased estimate of the common unknown variance
σ² of the normal populations from which the k samples were assumed to
have been taken.  In some complex cases, the AOV mean squares are the
only way to obtain reasonable estimates of unknown population variance.



        Let us construct an AOV table for the Limestone data of the
previous section.  As a first step, note that the expressions which define
the sums of squares do not lend themselves well to computation, and can be
put into more convenient forms for computation purposes:

           1.   Define  x.. = ΣᵢΣⱼxᵢⱼ  and  xᵢ. = Σⱼxᵢⱼ,  so that  x..  is
                the total of all observations and  xᵢ.  is the total of the
                observations in the ith class.

           2.   Since  nkx̄ = ΣᵢΣⱼxᵢⱼ = x.. ,  define the  "correction
                factor"  as

                     CF = nkx̄² = (ΣᵢΣⱼxᵢⱼ)²/kn = x..²/kn .

                Then

                     Total Sum of Squares = ΣᵢΣⱼ(xᵢⱼ - x̄)² = ΣᵢΣⱼxᵢⱼ² - nkx̄²
                                          = ΣᵢΣⱼxᵢⱼ² - CF .

           3.   Since  nx̄ᵢ = Σⱼxᵢⱼ = xᵢ. ,  nx̄ᵢ² = xᵢ.²/n.  Thus,

                     Between Sum of Squares = nΣᵢ(x̄ᵢ - x̄)² = nΣᵢx̄ᵢ² - nkx̄²
                                            = Σᵢxᵢ.²/n - CF .

           4.   Within Sum of Squares = Σᵢ[Σⱼ(xᵢⱼ - x̄ᵢ)²] = Σᵢ(Σⱼxᵢⱼ² - nx̄ᵢ²)

                                      = Σᵢ(Σⱼxᵢⱼ² - xᵢ.²/n) .


Now, for the Limestone data,  n = 6, k = 4, ΣᵢΣⱼxᵢⱼ = x.. = 1530,
ΣᵢΣⱼxᵢⱼ² = (54)² + (62)² + ... + (60)² + (58)² = 101192, and CF = (1530)²/24
= 97537.5 .  Thus

        Total Sum of Squares = 101192 - 97537.5 = 3654.5

        Between Sum of Squares = [(372)² + (450)² + (396)² + (312)²]/6 - 97537.5

                               = 99174 - 97537.5 = 1636.5

        Within Sum of Squares = [(54)² + ... + (85)² - (372)²/6]

                              + [(68)² + ... + (67)² - (450)²/6]

                              + [(65)² + ... + (66)² - (396)²/6]

                              + [(45)² + ... + (58)² - (312)²/6]

                              = 890 + 302 + 488 + 338 = 2018

Note that the Within Sum of Squares could have been obtained by subtracting
the Between Sum of Squares from the Total Sum of Squares, and this is
usually the recommended procedure; however, computing it both ways does
provide a check on the arithmetic.  The AOV table is, then,
 Source of            Degrees of Freedom  Sum of Squares  Mean Square   F
 Variation

 Total                        23             3654.5

 Between Limestones            3             1636.5         545.5      5.41

 Within Limestones            20             2018.0         100.9
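        These computational forms translate directly into a few lines of
code.  The following is a minimal sketch in Python (numpy assumed
available) reproducing the table above; the variable names are ours.

    import numpy as np

    data = {                    # limestone % conversion data, by type
        "A": [54, 62, 58, 67, 46, 85],
        "B": [68, 81, 87, 72, 75, 67],
        "C": [65, 83, 68, 61, 53, 66],
        "D": [45, 56, 39, 54, 60, 58],
    }
    x = np.array(list(data.values()), dtype=float)   # k x n array
    k, n = x.shape
    cf = x.sum()**2 / (k * n)                        # correction factor
    total_ss = (x**2).sum() - cf
    between_ss = (x.sum(axis=1)**2).sum() / n - cf
    within_ss = total_ss - between_ss
    f_cal = (between_ss / (k - 1)) / (within_ss / (k * (n - 1)))
    print(total_ss, between_ss, within_ss, f_cal)
    # 3654.5  1636.5  2018.0  5.41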

        As a final word, let us sound a note of caution.  The analysis
of variance has become a popular statistical tool among data analysts
everywhere, and rightly so, for it is convenient and orderly, and
sometimes it is the only way to derive certain estimates.  There is a
tendency for some, however, after gaining some familiarity with the
techniques involved, to regard analysis of variance and analysis
of data as one and the same thing.  This, of course, is a mistake,
in that the analysis of variance is but one of several statistical tools
available for the data analyst to draw upon.

                    6.  EXPERIMENTAL TEST PROGRAMS

6.1  Introduction
        In recent times it has become more and more common to see
considerations of  "statistical design"  cropping up in experimental
testing programs, and it is reasonable to ask  "why?".  Perhaps, as some
contend, it is just a faddish thing which, in a few years, will cease
to be fashionable; but, as ones who are called upon to apply the principles
of statistical design time and again, we prefer to believe otherwise.
In that regard, we should note that, as with most disciplines, statistics
is a dynamic subject, constantly changing to address new problems and
to deal more effectively with old ones and, as a result, specific
techniques will come and go.  However, the principles upon which the
various techniques are based will cease to be valid only when our
present methods of scientific reasoning become obsolete.  Thus, our
answer to the question of why statistical design is so prevalent in
experimentation is that the underlying principles have been found to be
useful and applicable in a wide variety of experimental situations.
        In this short chapter on experimental design we hope to
accomplish the following:   (1)  introduce the reader to some of the
basic concepts of experimental design, (2)  equip the reader with enough
of the basic ideas so that he can make some application of the subject
of experimental design, and (3)  help the reader to consult with a
statistician on the problems of experimental design.
        As far as the first objective is concerned, it should not be
expected that a single course or a single book on any subject will provide
an in-depth knowledge of a subject.  It can only provide an introduction.
To the practicing engineer who has designed many tests and experimental

 programs, the need for an introduction may not appear obvious.  We can
 only say that many scientists and engineers have benefitted from an
 introduction to experimental design.
         We should hope that, in addition to providing an introduction,
 this effort will also provide enough information so that the readers
 for whom it is intended will actually be able to make use of it in their
work.   It will shortly be apparent that most of the material of this
chapter can be studied independently of the first four chapters and most of
 the fifth chapter.   Therefore,  it is possible for the reader to jump very
 quickly to what we  have to say  about experimental design.   It  apparently
 was the purpose of  Snedecor in  his monumental work (32)   to provide the
 researcher with methods which he could apply almost immediately.   We hope
 to emulate his example.   Of course it is not our purpose to write a text-
 book, giving a thorough coverage of every topic.
         Despite the natural  human inclination to want  to do everything
 for ourselves,  it is often worthwhile to avail ourselves of the expertise
 of a consultant.  Untold numbers  of scientists and  researchers  have
 found their work strengthened and clarified by consulting with  a
 statistician  in the planning stages  of their research.   The researcher
 going through this experience for the first time may find that  the
 statistician uses many seemingly strange terms  and unfamiliar concepts.
 It  is hoped that this chapter will help the reader to benefit from the
 experience of consulting with a statistician.
 6.2  Data
         In the scientific disciplines, we are accustomed to thinking of
data as  information arising from purposefully designed investigations.
However, we should remind ourselves that, as the term is commonly used,

 it refers to information arising  from any  source whatever.  We hear  a
 lot nowadays about the  "data explosion,"  and  it therefore  seems
 appropriate to discuss briefly some  interrelated aspects of the subject
 of data.   First,  let us  consider  how it may be viewed from  the standpoint
 of answering the  question,  "What  should be done with it  (or to it)?"
Earlier, it was pointed out that the function of condensing and summarizing
the information in a set of data is not generally included within the
category of statistical inference.  However, in view of recent technical
improvements in our ability to obtain more data and obtain it faster,
the problem should by no means be considered trivial or unimportant.
There is no doubt that more imaginative and efficient methods of
extracting and summarizing salient information from data are needed, and
some of the tools of statistics can be of great assistance in that regard.
 However,  emphasis upon the  ability to obtain more and more  data sometimes
 goes hand-in-hand with the  attitude  that the more data, the better,  and
 it  is  this attitude we wish to criticize.  Such thinking is to be
 deplored  because  it fosters the belief that if a large enough data base
 is  created,  then  the probability  of  finding a  meaningful trend somewhere
 is  greater.   It can also  lead  to  the belief that it is valid to search
 a set of  data until some  "unusual"   trend is  discovered, and then to
 proceed as  if the subsequent conclusions based on this trend were fact,
 rather than hypotheses to be further  examined.  We regard such behavior
 as  totally unscientific.
        Let us now examine the  gathering and analysis of data in the
 context of  the scientific method. While there are several important
 facets, let us agree that the scientific method is a cyclic activity
with two  important but separate parts to the cycle.   One part of the


cycle is the study of existing data for trends, and combining the
resulting discoveries with established knowledge to form tentative
hypotheses about those trends.  The other equally important part of the
cycle is the purposeful experimentation (gathering of new data)  designed
for the purpose of determining whether the hypotheses are supported or
discredited by further observation.  It is evident that the practices
described in the preceding paragraph emphasize the first part of the
cycle to the almost total exclusion of the second.  Thus, what it comes
down to is that data should be taken with a purpose in mind.  One must
have a clear idea of what a set of data is to show in order to decide
after the fact whether this or that trend has been demonstrated.  A
large data base of  "historical"  information is valuable only when it
leads to theories which may be tested or questions which may be answered
by further experimentation.  The gathering of new data is justifiable only
when it serves to resolve the questions raised.  The cyclic nature of the
process is evident, because the new data becomes historical data, and
frequently suggests further hypotheses or questions to be resolved, and
the cycle is thus repeated.
        For all the remaining discussion, we shall be talking about
gathering and analysis of data with respect to the second part of the
investigative cycle; that is, data which is collected as the result of a
purposeful experiment.  It can be argued that all data contains useful
information, but let us remind ourselves that we are dealing in terms of
statistical and scientific inference.  In such a case, we must be able
to consider the data under examination to be a legitimate and useful
sample from some larger population which we wish to say something about.
Usually it is difficult, if not impossible, to justify any relation

 between sample and population for some particular set of data when the
 purpose and manner of its taking are not known.   Even when both of
 these requirements are met, it is sometimes an eye-opening exercise to
 stop and consider what the legitimate population of inference really is.
 One can be quite surprised and thoroughly disappointed to find how restricted
 that population is compared to the population he really wanted to sample.
Thus, the advantage of carefully planned data collection is that we are
thereby forced to think about why and how it is to be done which, in turn,
requires that we define the population of inference.  It is only then
that the principles described throughout this handbook can be applied.
To reiterate, once that population has been delineated, then one can do
his best to draw a random sample from it.  And we have seen that, as a
consequence of such a procedure, the powerful mathematics of the theory
of probability may be applied to the problem, which can give us a measure
of the risk and uncertainty associated with the inferences that are
drawn from sample to population.  The upshot of the preceding discussion
 is  that  the analysis and interpretation  of data can be accomplished with
 at  least a  degree of objectivity, rather than being based largely upon
 opinion and  subjective judgement, and that such an accomplishment is
made possible by  proper data gathering techniques.
        To  summarize, in this section, we have pointed out that the term
"data," in relation to inference, means something other than a volume
of  information to be sorted through and manipulated.  In a manner of
speaking, we are actually referring to information which has not yet
been collected.  We have once again emphasized the importance of the
data gathering step and have stressed how careful planning can reduce
the difficulties and risk associated with drawing inferences.  However,

the heart of the matter is this:  regardless of whether careful planning
is involved or not, the manner in which the data is collected must be
known, for the analysis and interpretation (inferences to be drawn)
depend directly upon it.
6.3  Comparative Experiments
        A definition of  "experiment"  given by Webster's dictionary is
"A test made  to prove or disprove something doubtful; an operation
undertaken to discover some unknown principle or to test some known
truth; as, a laboratory experiment in physics."
        Although there is little difficulty in achieving a consensus on
the meaning of the words experiment or experimental, it seems that there
is greater difficulty in attempts to distinguish between two types of
experiments, comparative and absolute.  This distinction is discussed by
Kempthorne (22)  who references Anscombe (1).
        Basically the idea of an absolute experiment is that the experiment
is being conducted to determine the absolute value of some quantity; for
example, the speed of light.  The world of physics and chemistry abounds
with physical constants, usually identified with some particular person's
name, which resulted from absolute experiments.  For example, we mention
Avogadro's constant, Faraday's constant, and Planck's constant.
        Comparative experiments on the other hand are concerned with
measuring the differences in the effects of two or more stimuli (called
treatments) upon some characteristic (or characteristics) of interest.
These differences are almost always measured in an absolute sense.   This has
sometimes seemed strange to engineers more accustomed to thinking of
relative differences or percentage changes.  However, almost all theory
and methodology in the design of experiments deals with absolute differences.

6.4  The Use of the Word "Design"
        Although there are subsets of the statistics community for which
the term  "experimental design"  has a clear and agreed-upon meaning,  the
expression is used in a bewildering variety of ways by statisticians
(for example those listed in the 1973 directory of statisticians published
by the American Statistical Association).  Of course the practice of
statistics is not confined to this restricted group of people.  There
are other large groups of people with vital interests in statistics and
specifically in the field of experimental design.  The result is that a
novice may have great difficulty in comprehending the way in which the
word  "design"  is being used.  In order to discuss the matter further,
let us refer to Figure  5.
[Figure 5 is a schematic: treatments are applied to experimental units,
producing a response.]

                  Figure 5.  Experimental design

        In a comparative experiment, certain stimuli, called treatments,
are applied to experimental material, producing responses.  If the responses
are not already quantitative, they are converted to quantitative responses.
An analysis is then performed to compare the different stimuli with respect
to the quantitative responses.  All of this should be done in a manner
following existing theory and methodology rather than in a completely
ad hoc pattern.  The word "design" seems like a perfectly suitable word to
describe the plan for conducting and analyzing the experiment.  The
difficulty with the word  "design"  arises because it is used to describe
(among other things): (1)  the treatments, (2) the experimental material,
(3)  the way in which treatments are assigned to the experimental material,
(4)  the responses, and  (5)  the analysis.  In this handbook,  we use
the term "experimental design" to describe the experimental material and
the way in which treatments are assigned to the experimental material.
This is consistent with well-known books on experimental design: Cochran
and Cox (8) and Kempthorne (22).  Thus, we use "experimental design" to include
topics (2) and (3), above.  The use of the word  "design"  to describe
topics (1), (4), and (5) will be discussed in later sections.
6.5  Properties of a Good Experiment
        In an excellent chapter on comparative experiments, D.R. Cox  (10)
lists the following requirements for a good experiment:  (1)  absence of
systematic error,  (2) precision, (3) range of validity,  (4) simplicity,
 and (5)  the calculation of uncertainty.   We will now illustrate these
 five requirements as we understand and interpret them.
         First, an experiment should be free from systematic error.  Al-
 though this statement  seems totally obvious and compelling, it is sometimes
 flagrantly violated either knowingly or not.   As an example, suppose that

an experiment was designed to compare particulate mass emission rates
resulting from the use of two gasolines in test automobiles.  Although
the test equipment was quite sophisticated, there were only two automobiles
used in the test.  Automobile A used the first gasoline and automobile B
used the second gasoline.  Regardless of laboratory techniques and precision
of measurement, the comparison of gasolines is handicapped because of the
systematic error incorporated.  Any comparison of gasolines is also a
comparison of automobiles.  Needless to say, such systematic errors should
be avoided if at all possible.  The primary technique which is used to
eliminate systematic errors is that of randomization.  This will be
discussed in some detail a little later.
        Secondly, let us discuss the matter of precision in a comparative
experiment.  For simplicity,  suppose we have two treatments.   Inevitably,
some statistic will be calculated as representing the difference in the
effects of these two treatments.  In the simplest cases this statistic
will be simply the difference of two sample means.   If we say that the
experiment has high precision, we mean that the variance of this difference
is small, the variance being defined for the conceptual population of
differences which would result from repetition of the experiment.  For
example, suppose that an experiment was designed to compare the outlet
dust concentration (gr/10³ ft³) in two types of filter fiber material,
cotton and nylon.  The experimental results give a difference of 105.
The importance of this result depends heavily upon its precision.  If it
were known that the inherent standard deviation is 50, say, the results
would be interpreted much more skeptically than if it were known that the
standard deviation is 25, for example.
        While the need for precision may be clearly perceived, it is not

 so obvious  how it  is  to be  achieved.  The variance and precision of such



 differences as discussed  above depend upon:  (1)  the  inherent  variability



 of the experimental material,  (2)  the number of  observations, and  (3) the



 design of the  experiment.   While the experimenter can do nothing about



 the first factor,  he  can  exercise  considerable control over precision



 through the second and  third factors.



        One other  comment needs to be made about precision.  The term is



 widely used in statistics to discuss the amount of variation rather than



 the average or central  value.  It  is not to be confused with accuracy



 which  is used  to discuss  the amount of bias in a result.



        The third  property  of a good experiment alluded to above was



 range  of validity.  The basic idea is that the experiment should encompass



a sufficiently wide range of treatments, experimental material, etc., so



 that the  inferences  drawn from the experiment may with some validity be



made about the population of interest.  All of us are aware that laboratory
experiments on SO2 emission may not apply at all to industrial SO2 emissions
over Birmingham, for example.  Yet, this point must be emphasized over



 and over again.  It is not possible in any statistical sense to make



 inferences  about populations not sampled.



        The desire to have a wide range of validity is at direct odds with



the motivation to conduct a precise experiment.  Incorporating a wide



range of conditions into an experiment tends to decrease the precision of



the experiment.  On the other hand, carefully controlled laboratory



experiments with great precision may have such a narrow range of validity



that the results are virtually of no interest.



        The fourth requirement of a good experiment is that it be simple.



This does not mean that it should be naive or stupid.  On the other hand,


 experiments designed by statistical novices are often too ambitious and
 much  too complicated, and the experiment may be doomed to failure before
 it is ever begun.  An experiment should be complex enough that it utilizes
 available theory and methods but simple enough that it can be conducted
 and analyzed easily.
         The fifth requirement listed by Cox was the calculation of
 uncertainty.  We interpret this requirement as meaning that the experiment
 should be performed in such a way that the analysis can be based upon
 theoretical development which has been field-tested.  Generally, the
 theory for any analysis will begin with a probability model;  hence the
 expression "calculation of uncertainty."  It is a curious situation that
 well-trained people will often embark upon data collection and analysis
 armed only with ad hoc procedures, inventing statistical methods as they
 go, even though they would not consider following the same practice with
 respect to chemical or physical theory or laboratory methods.
 6.6  Experimental Units
         The concept of experimental unit is  a central one in  the theory of
 experimental design but is  apparently  not an easy concept to  comprehend.
 Stated simply, we define an experimental unit as  the  smallest  subdivision
 of experimental material so that different subdivisions  of the material  have
 the opportunity to receive  different treatments.  The opportunity to receive
 different treatments refers to  the opportunity in the population of
 conceptual repetitions.  Obviously, this depends to some  extent upon the
 scheme used for assigning treatments to  the experimental material and
 therefore upon the randomization.  Thus, a discussion of  experimental units
 is bound to that of randomization, but we shall attempt to simplify the
discussion by separating the two topics.

        Consider the following experiment to determine the rate of sulfate
ion formation in water droplets.  Droplets of water were placed on a grid
of Pyrex fibers and suspended in a chamber with a known concentration of
SO2.  The amount of H2SO4 formed during the reaction period was then
calculated.  The laboratory technicians were reluctant to make many changes
in the SO2 concentration so the experiment was done as follows.  The SO2
level was fixed and four grids were run through the experiment.  The level
was changed and four more grids were run.  The level was changed a third
time and four more grids were run.  Although there are 12 determinations
of H2SO4, there are really only three experimental units.  Note that the
first four grids do not have an opportunity to receive different levels
of SO2.
        As a second example illustrating the idea of experimental units,
consider an experiment to study the usefulness of plants for the monitoring
of air pollutants.  Two test chambers were used.  Twelve plants were
placed in test chamber A which was filled with unfiltered air.  Twelve
plants were placed in test chamber B which was filled with filtered air.
After a specified period of time, the fluorine content (mg/lOOg dry weight)
was measured for each of the plants.  Although the experiment involved
24 plants, there were really only two experimental units.  The experiment
could have been modified to involve 24 experimental units by doing the
following:  using only one plant per experimental chamber, refilling the
chambers after each run, and randomizing the order so that the sequence might
appear as U,F,F, U... etc., where F stands for filtered and U stands for
unfiltered.
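        As an illustration of the modified scheme, the run order itself can
be randomized with a few lines of code.  This is a minimal sketch, assuming
one plant per chamber filling and 24 runs in all; the names are ours.

    import random

    runs = list("F" * 12 + "U" * 12)   # 12 filtered, 12 unfiltered runs
    random.shuffle(runs)               # random run order
    print(",".join(runs))              # e.g. U,F,F,U,...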
6.7  Experimental Error and Sampling Error
        Experimental error is as important to the subject of experimental


design as experimental unit.  To understand the concept of experimental
error let us return to the first example in the previous section, that of
H2SO4 formation.  Having completed the experiment, consider the difference
of the average responses between levels A and B of SO2, x̄_A - x̄_B.  When
we calculate the difference x̄_A - x̄_B, it can be written as

                    x̄_A - x̄_B = (μ_A - μ_B) + e

where μ_A - μ_B represents the difference in population means of the two
concentrations of SO2 and e represents a random error.  The population
referred to is the conceptual population which would result from repetition
of the experiment.  The random error e is composed of all those variables
which cause the difference of sample means to be in error and is called
the experimental error.   It is extremely worthwhile to attempt an
enumeration of these variables.
      The variables which contribute to the experimental error are
       1.  Differences between experimental units.
       2.  Errors in the  application of treatments.
       3.  Measurement errors.
       4.  Other factors  either known or unknown but neglected in the
           design of the  experiment.
The experimental error e is commonly assumed to have zero mean and
variance σ².  The assumption of zero mean implies that we have designed
our experiment so that we have an unbiased estimate of μ_A - μ_B.
      Quite frequently, the  experimental error variance is simply referred
to as experimental  error.  This may cause some confusion but need cause
no concern to the person  consulting the statistician.  It is simply a
matter of learning the terminology.  Interest lies in estimating the
variance of e, the experimental error variance, so that we can express
our uncertainty about μ_A - μ_B.  The emphasis has been almost totally on
unbiased estimation of σ².

       One of the most bewildering experiences for nonstatisticians

 in consulting with statisticians occurs when the  estimation  of experimental

 error is discussed.   To the statistician's statement,   "there is no

 experimental error,"  or  "there are  no degrees of  freedom left for error,"

 or  "you have no estimate of  experimental  error,"  the  non-statistician

 either nods in agreement or leaves  in silent resentment.   We  would like

 to give an introduction to the  statistician's  thought processes about

 estimation of experimental error variance.

       The matter almost always arises in analysis of variance situations.
In the simplest case, the experimental error variance σ² is estimated
by a quantity which is equivalent to a constant times a sum of squares of
deviations about the mean; i.e.,

                    s² = Σᵢ(xᵢ - x̄)²/(n-1) .

The problem is in deciding which sum of squares is appropriate for

estimating the experimental error variance.  Sometimes the highly trained
mathematical statistician despairs of trying to understand how the applied
statistician makes this decision.  This decision making is partially an
art, partially a science; of course, a thread of rigor runs through it.
There is no way that the novice can easily master this idea but we
believe that we can introduce a few of the important ideas.

      The question of whether or not a particular sum of squares is
appropriate for estimating σ² can be studied by the use of the following
identity:

                    Σᵢ(xᵢ - x̄)² = (1/2n) ΣᵢΣⱼ(xᵢ - xⱼ)² .

The sum of squares of deviations about the mean can be calculated from
squares of differences between individual numbers.  The basic principle
in deciding whether or not a given sum of squares is appropriate is
that every difference xᵢ - xⱼ should contain all of the variables that
enter into the experimental error e.
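        As a quick numerical check of this identity, here is a minimal
sketch in Python; the data are the class A limestone values from Example 20.

    import numpy as np

    x = np.array([54., 62., 58., 67., 46., 85.])
    n = x.size
    lhs = ((x - x.mean())**2).sum()                       # 890.0
    rhs = ((x[:, None] - x[None, :])**2).sum() / (2 * n)  # 890.0
    print(lhs, rhs)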
         Let us now illustrate the concept with the last example in
Section 6.6.  Recall that the example called for putting 12 plants in
a chamber with filtered air and 12 plants in a second chamber with
unfiltered air.  Denote the responses as follows

                   First Chamber          Second Chamber
                       x₁,₁                    x₂,₁
                       x₁,₂                    x₂,₂
                        .                       .
                        .                       .
                       x₁,₁₂                   x₂,₁₂
The obvious procedure is to calculate the difference of means, x̄₁ - x̄₂,
and its variance.

  Now, how should σ² be estimated?  The obvious procedure (though incorrect)
would be to estimate σ² by

          [Σⱼ(x₁ⱼ - x̄₁)² + Σⱼ(x₂ⱼ - x̄₂)²]/(n₁ + n₂ - 2) .

  To see that this is not a proper estimate of σ², consider the differences
entering into the sum of squares in the numerator.  None of these
differences contain the difference between chambers.  The difference
between chambers, after all, is likely to be one of the major variables
which cause x̄₁ - x̄₂ to be in error.

        The estimate of σ², just considered, is an unbiased estimate of
  another variance, called  the variance of  sampling error.  Sampling error
  simply contains all  of  the variables which cause observations resulting
  from the same treatment to be different, and generally has a much smaller
 variance than experimental error.  This point is often not appreciated with
 the result that the experimental  error  variance is  badly underestimated
 and, to make matters worse, too many degrees of freedom are assigned to
 experimental error.

 6.8  Degrees of  Freedom in the Analysis of Variance
         We now want to try to describe a phenomenon which is not well
described in any textbook, probably for good reason.  It is difficult.
Nevertheless, the user of statistics will soon become aware of the
ability of applied statisticians to think and work in terms of degrees
of freedom in the analysis of variance.

        For over fifty years (see Fisher and Mackenzie (16)) the analysis
of variance has been a powerful tool not only for analyzing experimental
data, but also for thinking about the design of the experiment.  Basically,
the analysis of variance is a partition of the sum of squares of deviations
Σ(x - x̄)² into recognizable component sums of squares, each with its own
degrees of freedom.  If there are n observations in the experiment, there
are n-1 degrees of freedom associated with the total sum of squares
Σ(x - x̄)².  It is always possible to write this as the sum of n-1 component
sums of squares, each with 1 degree of freedom.  Usually these component
sums of squares would not be interpretable or recognizable.  Instead, there
will generally be fewer than n-1 component sums of squares, each with more
than 1 degree of freedom.

        The experienced statistician develops the ability to describe the
appropriate analysis of variance and the breakdown of degrees of freedom.
To do this he leans heavily upon the concepts of experimental unit,
randomization, and estimation of experimental error variance.
6.9   Randomization
        The idea of arranging a set of numbered objects in random order
is another of the central concepts in experimental design.  This can be
accomplished for small sets of objects by the use of  physical processes
like drawing cards from a deck, marbles from a bowl, or tickets from a
container.  The primary methods of randomization which are used,
however,  are either  the use of random number tables or computerized
random number generators.  In both cases, it should be realized that
we are simulating a random device.

        There are many different random number tables available.  These
tables present digits zero through 9 hopefully "scrambled" together in
something resembling random order.  Hald (18) presents numbers compiled
from drawings in the Danish State Interest Lottery of 1948.  Fisher and
Yates (17) present a table of digits abstracted from Logarithmetica
Britannica consisting of the 15th to 19th digits of logarithms to the
base 10.  The Rand Corporation  (30)   presents a table of a million
random digits generated from a computer program.
        Numerous computer programs exist to generate sequences of random
digits.  Many of these generating routines are refinements  of the
following simple procedure described by Kempthorne and Folks (23):
        1.  Choose a multiplicand.
        2.  Multiply by an arbitrary multiplier.
        3.  Use the k digits on the right for the second multiplier
            and record the p digits immediately preceding these k.
        4.  Multiply the multiplicand by the second multiplier, and
            continue the process.
Using a constant multiplicand of 341 and an initial multiplier of 443
we obtain a product of 151,063.  Recording the fourth digit from the
right, "1", and using the three digits on the right, 063, as a new
multiplier, we obtain a second product of 21,483.  Recording "1" and
using 483 as the third multiplier we proceed.  This results in the
following sequence of 100 digits.

                     1149655830
                     9927433717
                     7715111596
                     4683080274
                     2461868052
                     1149655830
                     9927433717
                     7715111596
                     4683080274
                     2461868052

If continued, the process will simply repeat this sequence of 100 digits.
This periodicity is typical of such generating schemes.  Naturally, in
order to be useful, the period should be quite large.  For the sequence
we have generated, the frequency of digits is as follows:

     Digit       0    1    2    3    4    5    6    7    8    9
     Frequency   8   16    8    8   10   10   10   12   10    8
These frequencies appear to be reasonable outcomes for a random device
which would generate each digit with equal frequency.
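        The scheme is easy to program.  Below is a minimal sketch in Python
of the procedure just described (multiplicand 341, k = 3, p = 1); the
function name digit_stream is ours.

    def digit_stream(multiplicand=341, multiplier=443, k=3, p=1, count=100):
        digits = []
        for _ in range(count):
            product = multiplicand * multiplier
            s = str(product).zfill(k + p)
            digits.append(s[-(k + p):-k])   # the p digits preceding the last k
            multiplier = int(s[-k:])        # the k rightmost digits
        return "".join(digits)

    print(digit_stream()[:10])   # 1149655830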
        Published random number tables have been scrutinized not only
with respect to the frequency of occurrence of digits, but with respect
to runs of the same digit, gaps between successive occurrences of the same
digit, etc.  Excessive irregularities have been eliminated.  The question
often arises about how we can call any table of digits "random,"
particularly after it has been "doctored."  Remember that we want the
use of the table to simulate a random number generator.  Thus we want the
table to be reasonably free from irregularities so that repeated use of
the table might at least resemble repeated use of a random mechanism.
        The use of randomization for the assignment of treatments is
credited to R.A. Fisher (15).  Although not always practiced by
experimenters, the principle is widely recognized.  The random assignment
of treatments to experimental units achieves two important effects:
(1) the minimization of systematic effects, and (2) a basis for the
assumption of a probability model.
probability model.
        Left to his own devices, an experimenter may be more likely to
arrange treatments in a systematic fashion than if he makes formal use
of a randomization device (or table of random numbers) to assign
treatments.  Thus the more or less "haphazard" order introduced by randomization

 would have desirable effects even if there were to be no analysis based
 on a probability model.
         The second effect requires more sophistication to appreciate.
 We have referred previously several times  to a conceptual population of
 repetitions of the experiment.   The immediate effect of randomization  is
 that the conceptual population  is clearly  defined and the consequence  of
 having used one of the possible permutations of treatments is  that we
 feel justified in using  a probability model.
 6.10  Randomized Designs
         Recall that in Section  6.4 we used the term  "experimental design"
 to mean the description  of experimental units and the assignment  of
 treatments  to  the experimental  units.  From this  viewpoint an  experimental
 design is a plan for generating the data.   It is  not  a description of
 the analysis to be performed on the data.   In the previous  section it
 was indicated  strongly that the only method of assigning  treatments
 would  be random.   Different designs may be  thought  of in  terms of the
 restrictions placed upon the random assignment of treatments.
 Paired Design
         There  are  n pairs  of experimental units and two treatments.
 Within each pair of experimental units, treatment 1 is assigned at
 random to one  of the units and  treatment 2  is  assigned to the other.
 Two-Group Design
        There are n experimental units and two treatments.  Treatment 1
is assigned at random to n₁ of the units and treatment 2 to the other n₂
units, where  n₁ + n₂ = n .

Completely Randomized Design



        There are n experimental units and k treatments.  Treatment 1 is
assigned at random to n₁ units, treatment 2 at random to n₂ units, ...,
and treatment k at random to nₖ units, where  n₁ + n₂ + ... + nₖ = n.
Quite obviously, a completely randomized experiment is a generalization
of the two-group design.



Randomized Block Design



        There are n experimental units grouped (or blocked) into b blocks



of t experimental units each and there are t treatments.  Within each



block the treatments are assigned at random to the experimental units.



The paired design is a special case of the randomized block design.



Latin-Square Design

        There are n (= t²) experimental units which have been grouped or
classified by rows and columns into a  t x t  square and there are  t



treatments.  The treatments are assigned to the experimental units subject



to the restriction that the resulting array of treatment numbers has the



Latin-square property, namely that each number occurs once and only once



in each row and each column.
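        To make the assignment mechanics concrete, here is a minimal sketch
in Python (numpy assumed) of randomization for two of the designs above;
the function names are our own.  An illustration of each appears in the
examples that follow.

    import numpy as np

    rng = np.random.default_rng()

    def completely_randomized(treatment_counts):
        # treatment_counts[i] = n_i units for treatment i; returns the
        # treatment assigned to units 1, ..., n in order.
        labels = np.repeat(np.arange(1, len(treatment_counts) + 1),
                           treatment_counts)
        return rng.permutation(labels)

    def randomized_block(b, t):
        # b blocks of t units each; treatments 1..t are permuted
        # independently within each block.
        return np.array([rng.permutation(t) + 1 for _ in range(b)])

    print(completely_randomized([2, 2, 2, 2]))  # 8 units, 4 treatments
    print(randomized_block(4, 3))               # 4 blocks of 3 units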



        Consider an experiment to study the reaction of fabrics made



from synthetic fibers to air contaminated with nitrogen dioxide, ozone,



or sulfur dioxide.  We wish to use this setting to illustrate the ele-



mentary experimental designs just described.



        Suppose that ten pieces of nylon were available and that each



piece of fabric is cut in half.  The treatments are exposure to light
and air, and to light and air containing 0.2 ppm SO2.  For each piece of
nylon, one of the treatments is assigned to one half and the other
treatment to the other half.  After exposure, the relative breaking load

is measured for each of the ten pairs of pieces of fabric.   This is obviously
an example of what we called the paired design.
        As a variation of this example suppose that, instead of cutting
the ten pieces of nylon in half, we simply assign at random one of the
treatments to five of the pieces, and the other treatment to the other
five pieces.  This would be an example of what we called the two-group
design.
        In the above experiment, there were three treatments:  (1) control,
(2) exposure to light and air, and (3) exposure to light and air
containing 0.2 ppm SO2.  If each treatment is assigned at random to
several pieces of nylon fabric, this is an example of a completely
randomized design.
        Suppose that we have three pieces of nylon fabric from each of four
different manufacturers.  For the fabric from each manufacturer, the
three treatments are assigned at random to the three pieces of fabric.
This illustrates a randomized block design.
        As a final illustration in this section, we might have nine pieces
of nylon fabric representing three different manufacturers and three
different years of manufacture.  Then the three treatments are assigned
at random subject to the Latin-square restriction.
        We have not discussed reasons for using the different designs
discussed.  These matters are discussed  at great length in many
statistics textbooks; but in our experience it is not always easy to
understand the differences between the designs.
6.11  Multifactor Experiments
        The problem of studying several experimental variables in an
experimental program is an extremely common problem.  However, the idea

of simultaneously varying all variables of interest is still not
universally known or accepted, years after Fisher (14) introduced the idea.
In a landmark paper entitled  "The Arrangement of Field Experiments,"
Fisher put forward the idea of multifactor experiments which he called
complex experimentation.  Fisher stated,  "No aphorism is more frequently
repeated in connection with field trials, than that we must ask Nature
few questions, or, ideally, one question at a time.  The writer is
convinced that this view is wholly mistaken.  Nature, he suggests, will
best respond to a logical and carefully thought out questionnaire;
indeed, if we ask her a single question, she will often refuse to answer
until some other topic has been discussed."  Instead of studying one
factor at a time, multifactor or factorial experiments study all
the factors of the experiment.
        In terms of the ideas presented in previous sections, the
description of a factorial (or multifactor) experiment is not a design.
Rather, the factorial description refers to the treatment structure
which may be used with various kinds of experimental designs.  We may have
a two-factor arrangement of treatments in a completely randomized design
or in a randomized block or in a Latin-square, etc.
        Korth (24) describes a three-factor factorial experiment designed
to investigate irradiated auto exhaust under conditions of continuous
mixing.  The three factors and their levels were:

         1.  Initial exhaust concentration
             13 ppm carbon and 35 ppm carbon.
         2.  Average irradiation time
             85 minutes and 120 minutes.
         3.  Fuel composition
             14% olefins and 23% olefins.

 As far as  the  treatments  of the experiment were concerned,  there were
 eight treatments determined by the  eight  combinations of  three  factors,
 each at two levels.
        A number of different response variables were looked at but for
one of them, NO2 formation rate in pphm/min, 17 runs were reported
as follows:

     Exhaust Level,   Fuel Composition,   Average Irrad.   NO2 Formation Rate,
      ppm carbon          % olefins         Time, min           pphm/min

          13                 14                85           1.76, 1.25, 1.50
          13                 14               120           1.45, 1.67
          13                 23                85           1.88, 2.22
          13                 23               120           2.10, 1.95
          35                 14                85           2.42, 2.90
          35                 14               120           2.61, 2.36
          35                 23                85           3.59, 3.80
          35                 23               120           2.68, 2.74
From looking at this table, one cannot decide what experimental design
was used and therefore what the appropriate analysis should be.  There
are many possibilities.  We suggest three of them:  (1)   2 x 2 x 2
factorial in a completely randomized design with 17 experimental units,
(2)   2 x 2 x 2  factorial in a completely randomized design with eight
experimental units, and (3)  2 x 2 x 2  factorial in a randomized block
design with 16 experimental units grouped into two blocks.  We now discuss
each of these possibilities in greater detail.

          1. Completely Randomized Design with 17 Experimental Units

             Unequal sample sizes are awkward for trying to present an
elementary account.  In order to make this presentation simpler, let

-------
                                                                     123
us assume that there were two runs with all eight treatments.  Let us use
(1.76, 1.25) instead of (1.76, 1.25, 1.50).  Then we consider the
experimental units to be the 16 runs and each of the eight treatments
would be assigned at random to two of the experimental units.  A random
assignment might have been as follows, for example:
                    Exhaust      Fuel          Average
                     Level,   Composition,   Irrad. Time,
            Run   ppm carbon   % olefins        min

              1       13           14            85
              2       13           23            85
              3       35           23            85
              4       35           14            85
              5       13           14           120
              6       13           23           120
              7       13           14           120
              8       35           14            85
              9       35           14           120
             10       35           23            85
             11       13           23           120
             12       35           14           120
             13       35           23           120
             14       13           14            85
             15       13           23            85
             16       35           23           120
The appropriate analysis of variance for this randomization is

     Source          df          SS            MS
     Total           15       7.563975
     Treatments       7       7.170375      1.024339
     Error            8       0.393600      0.049200

Then the F value (1.024339/0.049200 = 20.8, with 7 and 8 degrees of
freedom) is highly significant and we conclude that treatment
means are different.
        To take advantage of the factorial structure, we partition the
treatment sum of squares into seven components, each with 1 degree of
freedom.  The main effect and interaction sums of squares with F values
are given below:
     Source*     df        SS          F

       A          1     0.319225     6.488
       B          1     1.288225    26.183
       C          1     4.862025    98.822
       AB         1     0.198025     4.025
       AC         1     0.354025     7.196
       BC         1     0.015625     0.318
       ABC        1     0.133225     2.708
        The tabulated F value at the 0.0005 level with 7 and 8 degrees
of freedom is  15.1, so we conclude that treatments are different.  The
main effect and interaction sums of squares help us determine where the
differences lie.  Of course the factor C, exhaust level, is the greatest
source of variation.  At the  5%  level, all three main effects and the
AC interaction are significant.  To understand the AC interaction, let
us look at the following two-way table of means.
                                      C  (exhaust level)
                                  13              35

             85 min             1.7775          3.1775
        A
            120 min             1.7925          2.5975

        * A = Average irradiation time
          B = Fuel composition
          C = Exhaust level

-------
                                                                   125
At the lower exhaust level, the effect of increasing irradiation time
is to increase the NO2 formation rate.  However, at the higher exhaust
level, the effect of increasing irradiation time is to decrease the
NO2 formation rate.
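        As a computational note, each entry in the sum-of-squares column
above can be reproduced from the 16 observations by the usual contrast
formula for a two-level factorial, SS = (contrast)^2/16.  A minimal Python
sketch of this arithmetic (our illustration, not the report's original
computation) is:

     # Sketch: effect sums of squares for the 2x2x2 factorial, from the
     # 16 runs used in possibility 1.  Each observation is multiplied by
     # -1 or +1 according to its factor levels; SS = contrast**2 / 16.
     data = [  # (A: irrad. time, B: fuel comp., C: exhaust level, responses)
         (85, 14, 13, [1.76, 1.25]), (120, 14, 13, [1.45, 1.67]),
         (85, 23, 13, [1.88, 2.22]), (120, 23, 13, [2.10, 1.95]),
         (85, 14, 35, [2.42, 2.90]), (120, 14, 35, [2.61, 2.36]),
         (85, 23, 35, [3.59, 3.80]), (120, 23, 35, [2.68, 2.74]),
     ]
     lows = (85, 14, 13)  # low level of A, B, C respectively

     for name, idx in [("A", (0,)), ("B", (1,)), ("C", (2,)), ("AB", (0, 1)),
                       ("AC", (0, 2)), ("BC", (1, 2)), ("ABC", (0, 1, 2))]:
         contrast = 0.0
         for row in data:
             sign = 1
             for i in idx:
                 sign *= -1 if row[i] == lows[i] else 1
             contrast += sign * sum(row[3])
         # dividing each SS by the error mean square 0.0492 gives the F column
         print(f"SS_{name} = {contrast ** 2 / 16:.6f}")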

        2.  Completely Randomized Design with Eight Experimental Units

            In physical experiments, the type of duplication often used
has been described as  "bang-bang"  duplication.  That is, a treatment
is run and then run again before going on to another treatment.
With this sort of approach, the randomization might have been as follows:
                Exhaust           Fuel             Average
                 Level,       Composition,     Irradiation Time,
      Run      ppm carbon      % olefins            min

       1           35              23               120
       2           35              23               120
       3           13              23               120
       4           13              23               120
       5           35              23                85
       6           35              23                85
       7           13              14               120
       8           13              14               120
       9           35              14                85
      10           35              14                85
      11           13              23                85
      12           13              23                85
      13           35              14               120
      14           35              14               120
      15           13              14                85
      16           13              14                85
With the experiment run in this fashion, it appears to us that the
experimental unit consists of the two successive time periods for which
a treatment is used.  Then we have only eight experimental units and zero
degrees of freedom for experimental error.  If we use the residual

-------
                                                                   126
mean square in the analysis of variance for experimental error, we will
declare more results significant than are justified by the data.

        3.  Randomized Block Design with 16 Experimental Units

            Suppose that we run all eight treatments and then repeat the
experiment.  Then we have two blocks, consisting of eight runs each.
Suppose the data were as follows:
     Exhaust          Fuel          Average       NO2 Formation Rate,
      Level       Composition,    Irrad. Time,        pphm/min
   ppm carbon      % olefins          min         Block I    Block II

       13             14              85            1.25       1.76
       13             14             120            1.45       1.67
       13             23              85            1.88       2.22
       13             23             120            1.95       2.10
       35             14              85            2.42       2.90
       35             14             120            2.36       2.61
       35             23              85            3.59       3.80
       35             23             120            2.68       2.74
Then the analysis of variance would be as follows:

     Source          df          SS            MS           F
     Total           15       7.563975
     Blocks           1       0.308025
     Treatments       7       7.170375      1.024339      83.7905
     Error            7       0.085575      0.012225
With this analysis, all of the factorial effects and interactions,
except the BC interaction, would be significant at the 5% level.
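The arithmetic behind this table is elementary; a short Python sketch of
it (again our illustration) follows, using the block totals and the
treatment pair totals:

     # Sketch: randomized-block ANOVA for the two-block data above.
     block1 = [1.25, 1.45, 1.88, 1.95, 2.42, 2.36, 3.59, 2.68]
     block2 = [1.76, 1.67, 2.22, 2.10, 2.90, 2.61, 3.80, 2.74]

     n = 16
     grand = sum(block1) + sum(block2)
     cf = grand ** 2 / n                       # correction for the mean

     total_ss = sum(y * y for y in block1 + block2) - cf
     block_ss = (sum(block1) ** 2 + sum(block2) ** 2) / 8 - cf
     trt_ss   = sum((a + b) ** 2 / 2 for a, b in zip(block1, block2)) - cf
     error_ss = total_ss - block_ss - trt_ss   # by subtraction, 7 df

     print(f"Blocks     SS = {block_ss:.6f}")
     print(f"Treatments SS = {trt_ss:.6f}, MS = {trt_ss / 7:.6f}, "
           f"F = {(trt_ss / 7) / (error_ss / 7):.4f}")
     print(f"Error      SS = {error_ss:.6f}, MS = {error_ss / 7:.6f}")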

-------
                                                                    127

 6.12  Mathematical Models

        For planned experimentation, a general description of most
 situations is that one has a set of conditions, which can be more or less
 controlled, and a response measurable on a numeric scale, which can be
 made to vary by manipulation of the conditions.  The objective, of
 course, is to be able to relate the response to the settings of the con-
 ditions in such a way as to permit predictions to be made about the response
 obtainable from a given combination of conditions.  The most common
 approach is to use a mathematical model to approximate the relationship.
 We have already discussed the fact that exact reproducibility is never
 attained and have discussed how the use of probability theory can assist
 in characterizing non-reproducibility or random variation, as we have
 called it.
        It is evident that the large selection of available mathematical
 functions and the great variety of probability distributions allow almost
 unlimited latitude in representing phenomena of nature with bi-component
 models.  It is also evident that the more familiar and knowledgeable
 about a particular phenomenon one becomes, the more representative and
 sophisticated he can make his model.  One occasionally hears the comment
 that statistical methods should be avoided because they encourage empiricism.
 This criticism is not wholly without foundation, and it is often the
 result of the observation that a relatively crude mathematical model
 coupled with a probability component can sometimes do an amazingly good
 job of representing nature.  This, of course, attests to the power of the
 tool of probability.  However, the criticism is also applied to cases
where a crude model has been used when a better one exists and is readily
available.  This is related to the matter of discriminating between

-------
                                                                    128

systematic variation and random variation or, as we have previously
called it, explainable variation versus unexplainable variation.  We
must agree that the fact that statistics provides a means of dealing
with random or unexplained variation in no way excuses sloppiness or
laziness in finding the best model that resources will allow for describing
systematic variation.  However, there are a couple of additional points
one should consider before he attaches the label  "empiricist"  to a
colleague.  First, one must always ask the questions,  "What is the
purpose of the experiment for which the predictive model is sought; how
exact a model is needed in relation to that purpose; what is the cost of
developing a more refined model; and is such a refinement justifiable?"
Secondly, empiricism is largely a matter of degree for, generally, no
matter how  "theoretical"  or  "fundamental"  a model one derives for a
given situation, there remain unknown constants or parameters to be
determined by experimental observation.  In that sense, all models are
empirical.
        From a statistical standpoint, models may be classified in
several different ways.  We may speak of purely stochastic models versus
those which are not purely stochastic.  A purely stochastic model is one
which is based wholly upon one or more probability distributions; in
other words, one which is made up of only the probability component.
The most common type of model, however, is one which is composed of some
closed function of the experimental conditions, with a random error added
on; i.e., we say
       Response = f(Experimental Conditions) + (Random Error).

-------
                                                                    129
 Of the models in this class, two subclasses are differentiated:  linear
 models and non-linear models.  To simplify the discussion, let us assume
 that the  "experimental conditions"  can be translated into a set of
 numerically measurable quantitative factors.  This, of course, is the
 common situation in the physical sciences.  More specifically, let us
 suppose that the set of variables  (x1, x2, ..., xk)  has been identified
 as one which should account for "most of" the variation in some response
 variable, say, y.  Then, linearity refers to the form of the function  f
 in the model
              y = f(x1, x2, ..., xk) + error.
 A linear model is one for which the function  f  can be expressed as a
 linear combination of functions
      f = b1 f1(x1,...,xk) + b2 f2(x1,...,xk) + ... + bm fm(x1,...,xk),
where  b1, b2, ..., bm  are constants to be determined by experimentation.
Any model which cannot be put into this form is said to be non-linear.
Let us note that a non-linear model may sometimes be linearized by
transformation.  For example, a model such as
         y = b0 x1^b1 x2^b2 [(x3 + x4)/x5]^b3
may also be expressed in log units as
         log y = log b0 + b1 log x1 + b2 log x2 + b3 log[(x3 + x4)/x5]
or
         y' = a0 + a1 z1 + a2 z2 + a3 z3 ,
which is a linear model.  For this reason, the term  intrinsically
non-linear  is sometimes used to describe non-linear models which are
not linearizable.
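To make the linearization concrete, the following sketch (with coefficient
and factor values of our own choosing, purely for illustration) verifies
numerically that the multiplicative model above and its log form agree:

     import math

     # Sketch: y = b0 * x1**b1 * x2**b2 * ((x3 + x4)/x5)**b3, linearized
     # by taking logs.  All numeric values here are hypothetical.
     b0, b1, b2, b3 = 2.0, 0.5, -1.2, 0.8
     x1, x2, x3, x4, x5 = 3.0, 1.5, 2.0, 4.0, 1.0

     y = b0 * x1 ** b1 * x2 ** b2 * ((x3 + x4) / x5) ** b3

     # The same relationship, linear in the transformed variables z1, z2, z3:
     z1, z2, z3 = math.log(x1), math.log(x2), math.log((x3 + x4) / x5)
     log_y = math.log(b0) + b1 * z1 + b2 * z2 + b3 * z3

     print(math.log(y), log_y)   # agree, so the model is linearizable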

-------
                                                                    130
        One may wonder why such distinctions as linear, linearizable, or
 intrinsically non-linear are made, since the general formulation is the
 same  in any case.  Let us note, however, that up to now in the discussion
 of models, there has been nothing covered which is uniquely associated
 with, or is a unique product of, statistical theory, other than the idea
 that  one may include, as an integral part of any model, a probability
 component.  From a statistical standpoint, it is possible as well as
 advantageous to view the models we have been discussing entirely as
 probability models; that is, to consider the response y as a random
 variable with its distribution determined by both the function f and the
 random error component.  Looking at models in this way permits us to
 see the essentials involved, which are:
          1.  The determination (estimation) of unknown constants
              (parameters).
          2.  The determination of the behavior of the random component.
          3.  The determination of the behavior of the parameter estimates
              obtained in 1, based upon the information from 2.
 While many scientists and engineers recognize the importance of 1 and
 2, unfortunately a great many do not know that a suitably designed
 experiment can yield information for all three activities, rather than
 for 1 only.  In order to keep one's perspective, it is helpful to be
 reminded that the activities we are discussing here are based upon the
 same basic principles covered in Chapter 5.   That is, regardless of how
 complex the structure of the population (as described by some model)
may be, our efforts, though they may be complex experiments, can be
regarded as taking samples from that population.  In Chapter 5, we  saw
that the inferences we draw to the population sampled depend upon what we

-------
                                                                    131
assume about the structure of the population and how the population is
sampled.  It is no different in the present case; the interpretation of
the results of an experiment is dependent upon the model chosen and
the type of experiment conducted.  Unfortunately, at this point, the
distinction linear or non-linear becomes important for, given that the
model is linear, it is a relatively straightforward matter to determine
how an experiment should be conducted and analyzed so that objectives
1, 2, and 3 above may be met in an optimal manner.  For this reason,
statistical theory of model fitting has, until recently, concentrated
upon the linear case, with the result that the non-linear case has
lagged behind in the development of a unified statistical approach.  The
situation is being remedied, however, to the point that nowadays one need
not compromise the usefulness of his research by settling for a linear
model to describe a population structure when a non-linear representation
is definitely required.
        Since linear models have occupied a central role in the development
of statistical approaches to the gathering and analysis of data, let us
briefly discuss the essential ideas involved.  Recall that a linear model
is one of the form
      y = b1 f1(x1,x2,...,xk) + b2 f2(x1,x2,...,xk) + ...
                              + bm fm(x1,x2,...,xk) + error.
Let  zj = fj(x1,x2,...,xk)  and denote the error term by the letter  e;
then the model is
             y = b1 z1 + b2 z2 + ... + bm zm + e .

-------
                                                                    132
 Suppose that an experiment is conducted for which  n  observations
 (y1, y2, ..., yn)  on the response are taken.  Denoting the values of
 the conditions associated with  yi  by  (zi1, zi2, ..., zim),  we may
 express the  n  observations as

         y1 = b1 z11 + b2 z12 + ... + bm z1m + e1
         y2 = b1 z21 + b2 z22 + ... + bm z2m + e2
          .                                     .
          .                                     .
         yn = b1 zn1 + b2 zn2 + ... + bm znm + en

where the  bj's  and  ei's  are unknown.  Matrix notation may be used to
express the situation more concisely by

     | y1 |   | z11  z12 ... z1m | | b1 |   | e1 |
     | y2 | = | z21  z22 ... z2m | | b2 | + | e2 |
     |  . |   |  .    .       .  | |  . |   |  . |
     | yn |   | zn1  zn2 ... znm | | bm |   | en |

or, simply
                      Y = Xβ + e .
Now, since the vector  Y  and the matrix  X  are determined respectively
by the experimental observations and conditions, the vectors  β  and  e
are the only undetermined quantities in the expression.  Also, the
vector  e  we are regarding as the errors between the observed responses  Y
and those responses computed from the model.  It is clear, of course,
that the  m + n  bj's  and  ei's  cannot all be solved for uniquely with
only  n  equations.  Imagine for the moment that the  bj's  are known
and the  ei's  are unknown, but that the  ei's  are random fluctuations

-------
                                                                    133
about zero.  In such a case, we would say that the responses  ŷi,  for
the associated conditions  zi1, zi2, ..., zim,  predicted by the model are

    ŷi = b1 zi1 + b2 zi2 + ... + bm zim ,       i = 1, 2, ..., n ,

and we would call the respective  ei's  the errors of prediction.  The
smaller the  ei's  are as a group, the better, we say, that the model and
data agree.
        This, then, suggests a method of choosing the  bj's;  that is, to
find the set of  bj's  that makes the discrepancies between  yi  and  ŷi
as small as possible.  As a first cut, one might consider minimizing the
sum of the unsigned differences  Σ|yi - ŷi| .  A better way, from some
points of view, is to minimize the sum of squared differences  Σ(yi - ŷi)² .
This is, of course, the familiar method of least squares.  The details
of the method may be found in many statistical references, so let us
simply summarize what is done.  With the matrix notation formulation of
the model given earlier, it can be shown that  β̂  is such that

                            X'Xβ̂ = X'Y

is satisfied.  For those unfamiliar with matrix notation, the above is
simply a shorthand way of saying that the  bj's  must satisfy a system
of simultaneous linear equations.  This particular set of equations is
called the normal equations.  Thus, what it boils down to is that to
utilize a linear model to describe the results of an experiment, the
parameters to be determined by the method of least squares are found by
solving the normal equations.  In this day of readily accessible high
speed computers, one usually lets a computer perform the laborious
computations involved.
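As an illustration of what such a computation involves, the sketch below
forms and solves the normal equations for a simple straight-line model;
the data are hypothetical, and the matrix library is, of course, a modern
substitute for hand calculation:

     import numpy as np

     # Sketch: least squares via the normal equations X'X b = X'Y.
     # Model: y = b1 + b2*x, so the columns of X are z1 = 1 and z2 = x.
     x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
     y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # hypothetical observations

     X = np.column_stack([np.ones_like(x), x])  # the n x m matrix of z-values
     b = np.linalg.solve(X.T @ X, X.T @ y)      # solve the normal equations

     y_hat = X @ b                              # predicted responses
     e = y - y_hat                              # errors of prediction
     print("estimates:", b, " residual SS:", float(e @ e))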

-------
                                                                     134
         Let us now return  to the  idea that the model is determined for
 the purpose of predicting  the responses one might expect to get from
 experimental conditions  "similar"  to those used to determine the model.
 Once again,  one  is confronted with the fact that, in all probability,
 even though identical conditions  are ostensibly met at some future time,
 the responses observed will not agree with those predicted by any model
 and,  in fact, will not agree with the responses obtained in the experiment
 being analyzed.  The disagreement between the responses predicted
 from the chosen model and  the observed responses is often used to measure
 the variability to be expected if the experiment were repeated indefinitely.
 One can see  here a fundamental and nontrivial difficulty; namely, that the
 measure of variability depends upon the model used.  In other words, the
 discrepancy  between predicted and observed might be largely a result of
 the model being incorrect, sometimes referred to as lack of fit of the
 model.   For  that reason, many researchers, whenever possible, make it a
 standard practice to replicate experimental conditions.  In the context
 of  models, we may think of replication as obtaining several independent
 responses at each experimental condition.  Then, as long as the factors
 included in the predictive model are constant, the several independent
responses for the same  "setting"  would be predicted to be constant.
Hence, the failure of the several observed responses to all be equal to
 the predicted response may be broken down into two components:  (1)   the
failure  of the observed responses to all be equal to their mean value,
and  (2)  the failure of the mean value of the observed responses to be the
same as  the response predicted by the model.   In this way,  one can get
respective measures of both the random variation and the lack of fit of
the model.  The result of not replicating is to force one to assume that

-------
                                                                    135
lack of fit of the model is merely random error in order to get a measure
of variability, an assumption which may or may not be justified.  This
is why replication is recommended whenever possible.
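The decomposition described above is easily sketched numerically.  In the
hypothetical example below (data of our own, a straight-line model with
duplicate runs at each condition), the residual sum of squares splits into
a pure-error part and a lack-of-fit part:

     import numpy as np

     # Sketch: with replication, residual variation about a fitted model
     # separates into pure error (replicates about their cell means) and
     # lack of fit (cell means about the fitted model).  Data hypothetical.
     xs = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
     ys = [1.2, 1.4, 2.9, 3.1, 5.2, 4.8, 6.1, 6.3]

     X = np.column_stack([np.ones(len(xs)), xs])          # straight-line model
     b = np.linalg.solve(X.T @ X, X.T @ np.array(ys))
     resid_ss = float(np.sum((np.array(ys) - X @ b) ** 2))

     pure_error = 0.0                                     # 4 df here
     for x0 in sorted(set(xs)):
         cell = [y for x, y in zip(xs, ys) if x == x0]
         mean = sum(cell) / len(cell)
         pure_error += sum((y - mean) ** 2 for y in cell)

     lack_of_fit = resid_ss - pure_error                  # 2 df here
     print(f"residual SS = {resid_ss:.4f}: pure error = {pure_error:.4f}, "
           f"lack of fit = {lack_of_fit:.4f}")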

-------
                                                                    136
                            7.  SUMMARY

        A large portion of scientific activity can be loosely
characterized as  "attempting to discern the general pattern" or
"trying to predict the behavior of some response under a given set of
conditions."  These objectives would be fairly easy to attain if  it were
not for the fact that variability seems to be one property that all
physical material possesses.  In recent years, there has been a growing
tendency to formally recognize variability by incorporating a provision
for it into the models used to characterize physical happenings.   This,
in turn, has led to the practice of visualizing numerical observations
upon a phenomenon as having arisen from some conceptual population of
"similar" observations that would occur under "the same conditions."
The principles of Statistical Inference have shown, in addition,  that the
idea of a population can be extended even further by thinking in terms
of populations of samples which could be generated by repeated sampling
from the original population.  This way of looking at observational data
has been an extremely important development in data analysis, because it
has helped scientists to realize that what one may infer from a set of
data  (sample) depends very much upon how it was obtained; that is, the
procedure by which "similar" samples would be obtained.  Almost everyone
is familiar with the type of calculations which can be performed on a set
of numbers, giving rise to  things  such as mean, standard deviation,
median, and percentiles.  Whenever interest lies only in the set  of
numbers  itself,  such quantities  are helpful  in providing summary descriptions.
However,  when a set  of numbers is  considered to be a sample from  a large
finite population or from an infinite population,  sample quantities  such
as mean and standard deviation are of limited value in terms of

-------
                                                                    137
the larger population unless it is possible to ascertain something about
their behavior over repeated sampling.  There are two factors which
determine this behavior:  (1)  the sampling procedure, and (2)  the
structure of the sampled population.  We have seen that when the selection
of samples is allowed to be governed by the laws of chance, evaluation
of the performance of estimates and formulation of tests and confidence
statements are possible to a certain degree, even when the sampled
population is not given a structure.  In a great many cases, however, it
is advantageous to assume that the sampled population has an underlying
probability structure or distribution.  Of all the distributions available,
the most commonly used for this purpose is the normal distribution.  Models
for population structures can be quite varied.  The simplest type is one
of the form  Y = μ + e,  where  μ  is a fixed constant and  e  is a random
variable.  Such a model is frequently used to represent repeated measure-
ments, where  μ  is the true value and  e  is a random measurement error.
Note that if we give  e  a probability distribution with mean zero and
variance  σ²,  then  Y  is a random variable also, having mean  μ  and
variance  σ².  A simple model of this type with  e ~ N(0, σ²)  would
describe the structure of most of the continuous populations discussed in
the context of statistical inference to this point.  Populations which are
sampled by conducting experiments are generally considered to have
structures described by more complex models such as
Y = f(Set of Conditions, e),  where  f  is some function and  e  is some
random error.

-------
                                                                     138
 1.  Anscombe, F.J.  The Validity of Comparative Experiments.   Jour.  Roy.
      Stat. Soc. A, 61:181-211, 1948.

 2.  Bennett, C.A. and N.L. Franklin.  Statistical Analysis in Chemistry
      and in the Chemical Industry, New York, Wiley, 1954.

 3.  Bowker, A.H. and G.J. Lieberman, Engineering Statistics.   Englewood
      Cliffs, N.J., Prentice-Hall, 1959.

 4.  Brownlee, K.A.  Industrial Experimentation.  New York, Chemical
      Pub. Co. Inc., 1947.

 5.  Brownlee, K.A.  Statistical Theory and Methodology in Science and
      Engineering. New York, Wiley, 1965.

 6.  Chemical Rubber Company.  Handbook of Tables for Probability and
      Statistics. Cleveland, Chemical Rubber Co., 1968.

 7.  Chew, V.  Experimental Designs in Industry. New York,  Wiley, 1958.

 8.  Cochran, W.G. and G.M. Cox.  Experimental Designs, Second Edition,
      New York, John Wiley, 1957.

 9.  Conover, W.J.  Practical Nonparametric Statistics.   New York,
      John Wiley, 1971.

10.  Cox, D.R.  Planning of Experiments.  John Wiley, New York, 1958.

11.  Davies, O.L., ed.  The Design and Analysis of Industrial  Experiments.
      New York, Hafner, 1954.

12.  Davies, O.L., ed.  Statistical Methods in Research and Production.
      New York, Hafner, 1957.

13.  Draper, N.R. and H. Smith.  Applied Regression Analysis.  New York,
      Wiley, 1966.

14.  Fisher, R.A.  The Arrangement of Field Experiments.  Journal of the
      Ministry of Agriculture, 33:503-513, 1926.

15.  Fisher, R.A.  The Design of Experiments.  Edinburgh, Oliver and Boyd,
      1947.

16.  Fisher, R.A.  and W.A.  MacKenzie.  Studies in Crop Variation, II.
      The Manurial Response of Different Potato Varieties.   Journal of
      Agricultural Science, 13:311, 1923.

-------
                                                                    139


17.  Fisher, R.A. and F. Yates.  Statistical Tables for Biological,
        Agricultural, and Medical Research, Sixth Edition.  Edinburgh,
        Oliver and Boyd, 1963.

18.  Hald, A.  Statistical Tables and Formulas.  New York, John Wiley,
        1952.

19.  Hald, A.  Statistical Theory with Engineering Applications. New York,
        Wiley, 1965.

20.  Hogg, R.V.  and A.T. Craig.  Introduction to Mathematical Statistics,
        2nd ed.  New York, Macmillan, 1965.

21.  Hooke, R.  Introduction to Scientific Inference.  San Francisco,
        Holden-Day, 1963.

22.  Kempthorne, 0.  The Design and Analysis of Experiments.  New York,
        John Wiley, 1952.

23.  Kempthorne, 0. and J.L. Folks.  Probability, Statistics, and Data
        Analysis.  Ames, Iowa, Iowa State Press, 1971.

24.  Korth, W.  Dynamic Irradiation Chamber Tests of Automotive Exhaust,
        Public Health Service Publication No.  999-AP-5, 1963.

25.  Larson, H.J.  Introduction to Probability Theory and Statistical
        Inference.  New York, Wiley, 1969.

26.  Miller, I.  and  J.E.  Freund.  Probability and Statistics for
        Engineers. Englewood Cliffs, N.J.,  Prentice-Hall, 1965.

27.  Mood, A.M.  and F.A. Graybill.  Introduction to the Theory of
        Statistics, 2nd ed.   New York, McGraw-Hill, 1963.

28.  Murphy, J.R.  Procedures for Grouping a Set of Observed Means.
        Unpublished Dissertation.  Oklahoma State University, 1973.

29.  Ostle, B.  Statistics in Research,  2nd ed.  Ames, Iowa, Iowa State
        University Press, 1963.

30.  Rand Corporation.   A Million Random Digits with 100,000 Normal
        Deviates.  Glencoe,  Illinois, Free Press, 1955.

31.  Siegel, S.  Nonparametric Statistics for the Behavioral Sciences.
        New York, McGraw-Hill, 1956.

32.  Snedecor, G.W.  Statistical Methods, First Edition.  Ames, Iowa, Iowa
        State Press, 1937.

33.  Walpole, R.E. and R.H. Myers.  Probability and Statistics for
        Engineers and Scientists.  New York, Macmillan, 1972.

-------
                                                                    140
34.  Winer, B.J.   Statistical Principles in Experimental Design,
        2nd ed.  New York, McGraw-Hill, 1971.

35.  Youden, W.J.  Statistical Methods for Chemists.   New York, Wiley,  1951.

-------
                                                                               141
                                  TECHNICAL REPORT DATA

  1.  Report No.:  EPA-650/2-74-080
  4.  Title and Subtitle:  Statistical Concepts for Design Engineers
  5.  Report Date:  September 1974
  7.  Author(s):  J. R. Murphy and L. D. Broemeling
  9.  Performing Organization Name and Address:  Oklahoma State University,
        Stillwater, Oklahoma 74074
 10.  Program Element No.:  1AB013; ROAP 21ADE-026
 11.  Contract/Grant No.:  Grant R-802269
 12.  Sponsoring Agency Name and Address:  EPA, Office of Research and
        Development, NERC-RTP, Control Systems Laboratory,
        Research Triangle Park, NC 27711
 13.  Type of Report and Period Covered:  Final; through 8/74
 16.  Abstract:  The report describes basic statistical concepts for
        engineers engaged in test design.  Although written in handbook
        form for use within the Environmental Protection Agency, it is not
        intended to replace existing statistics textbooks.  Its objectives
        are:  to enable design engineers to converse with consulting
        statisticians, to introduce basic ideas for further individual
        study, and to enable the reader to make some immediate applications
        to his own work.
 17.  Key Words and Document Analysis.  Descriptors:  Statistical Inference;
        Experimental Design; Environmental Engineering; Air Pollution.
        COSATI Field/Group:  12A, 14B, 05E, 13B
 18.  Distribution Statement:  Unlimited
 19.  Security Class (This Report):  Unclassified
 20.  Security Class (This Page):  Unclassified
 21.  No. of Pages:  151

-------