PB86-203783
User-Friendly IBM PC (Personal Computer)
Computer Programs for Solving Sampling and
Statistical Problems
(U.S.) Environmental Monitoring and Support Lab.,
Cincinnati, OH

                   U.S. DEPARTMENT OF COMMERCE
                 National Technical Information Service

-------
           UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                    OFFICE OF RESEARCH AND DEVELOPMENT
              ENVIRONMENTAL MONITORING AND SUPPORT LABORATORY
                              CINCINNATI, OHIO
Gentlemen:
    Enclosed is a copy of a diskette of the "User-Friendly IBM PC Computer
Programs for Solving Sampling and Statistical Problems" as requested.
    The program's menu will automatically appear on the screen when the
diskette is inserted and the computer is turned on.  The programs on the
diskette can also be copied to the hard disk.  In this case, the user just
types "EMSLSTAT" on the C drive to run the program's menu.
    If there are any suggestions on the programs, please do not hesitate to
let me know.
    Your personal comments would also be appreciated.
                                       Sincerely yours,
                                       Philip C. L. Lin, Ph.D.
                                         Mechanical Engineer
                               Sampling and Field Measurements Section
                                 Physical and Chemical Methods Branch

-------
                                            EPA/600/4-86/023
                                            May 1986


     USER-FRIENDLY  IBM PC COMPUTER PROGRAMS          PB86-20J763

                     FOR

   SOLVING SAMPLING AND  STATISTICAL  PROBLEMS
                      BY

               PHILIP C. L. LIN
ENVIRONMENTAL MONITORING  AND SUPPORT LABORATORY

      OFFICE OF RESEARCH AND DEVELOPMENT

     U.  S.  ENVIRONMENTAL  PROTECTION AGENCY

            CINCINNATI, OHIO 45268

-------
                                    TECHNICAL REPORT DATA
                             (Please read Instructions on the reverse before completing)
 1. REPORT NO.
      EPA/600/4-86/023
 3. RECIPIENT'S ACCESSION NO.
 4. TITLE AND SUBTITLE
      User-Friendly IBM PC Computer Programs for
      Solving Sampling and Statistical Problems
 5. REPORT DATE
      May 1986
 6. PERFORMING ORGANIZATION CODE
 7. AUTHOR(S)
      Philip C. L. Lin
 8. PERFORMING ORGANIZATION REPORT NO.
 9. PERFORMING ORGANIZATION NAME AND ADDRESS
      Sampling and Field Measurements Section
      Physical and Chemical Methods Branch
      Environmental Monitoring and Support Laboratory
      USEPA, Cincinnati, Ohio 45268
10. PROGRAM ELEMENT NO.
11. CONTRACT/GRANT NO.
12. SPONSORING AGENCY NAME AND ADDRESS
      Environmental Monitoring and Support Laboratory
      Office of Research and Development
      U. S. Environmental Protection Agency
      Cincinnati, Ohio 45268
13. TYPE OF REPORT AND PERIOD COVERED
14. SPONSORING AGENCY CODE
      EPA 600/6
15. SUPPLEMENTARY NOTES
16. ABSTRACT
      User friendly IBM personal computer programs for solving sampling and
      related statistical problems have been prepared.  The programs are designed
      so that persons without an in-depth understanding of statistics can easily
      use them.  Specific, detailed, written instructions for application of the
      programs are provided in the report.  The computer disc containing the
      programs will be made available on request to the Environmental Monitoring
      and Support Laboratory - Cincinnati (EMSL-Cincinnati).
17. KEY WORDS AND DOCUMENT ANALYSIS
      a. DESCRIPTORS    b. IDENTIFIERS/OPEN ENDED TERMS    c. COSATI Field/Group
18. DISTRIBUTION STATEMENT
      Distribute to Public
19. SECURITY CLASS (This Report)
      Unclassified
20. SECURITY CLASS (This page)
      Unclassified
21. NO. OF PAGES
      74
22. PRICE
EPA Form 2220-1 (Rev. 4-77)   PREVIOUS EDITION IS OBSOLETE

-------
                                   DISCLAIMER
     This report has been reviewed by the Environmental  Monitoring and
Support Laboratory - Cincinnati,  U.S. Environmental  Protection Agency,
and approved for publication.   Mention of trade names or commercial
products does not constitute endorsement or recommendation for use.

-------
                                    FOREWORD

     Environmental measurements are required to determine the quality of
ambient waters and the character of waste effluents.   The Environmental
Monitoring and Support Laboratory - Cincinnati conducts research to:

         Develop and evaluate techniques to measure the presence and
         concentration of physical, chemical, and radiological  pollutants in
         water, wastewater, bottom sediments, and solid waste.

         Investigate methods for the concentration, recovery, and
         identification of viruses, bacteria, and other microbiological
         organisms in water, and determine the responses of aquatic  organisms
         to water quality.

         Develop and operate an Agency-wide quality assurance program to
         assure standardization and quality control of systems  for monitoring
         water and wastewater.

     The function of the Sampling and Field Measurements Section of the
Physical and Chemical Methods Branch in the Environmental  Monitoring and
Support Laboratory is to provide field measurement and sampling techniques
relating to water quality sampling programs.  This report provides
user-friendly IBM PC computer programs for solving sampling and statistical
problems so that an individual may use the programs and obtain the benefits of
the statistical package without an in-depth understanding of the statistics
employed.  Descriptions of basic statistics are also presented for those who
wish to know more of the details of the statistics.
                                    Robert L. Booth
                                        Director
              Environmental Monitoring and Support Laboratory - Cincinnati

-------
                                   ABSTRACT
    User friendly IBM personal  computer programs for solving sampling and
related statistical  problems have been prepared.  The programs are designed
so that persons without an in-depth understanding of statistics can easily
use them.  Specific, detailed,  written instructions for application of the
programs are provided in the report.  The computer disc containing the
programs will be made available on request to the Environmental  Monitoring
and Support Laboratory - Cincinnati (EMSL-Cincinnati).

-------
                                    CONTENTS
Foreword	
Abstract 	
Figures	
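    1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . . .     1
    2.  Instructions for Using Sampling Programs on the IBM PC . . . . .     4
    3.  Examples of Sampling Programs  . . . . . . . . . . . . . . . . .     6
    Appendix A.  Definitions of Basic Statistics . . . . . . . . . . . .   A-1
    Appendix B.  Descriptions of Statistical Sampling Program on the Disk  B-1
                 B.1   Curve Fitting With a Linear Regression  . . . . .   B-1
                 B.2   Normal Deviate Z  . . . . . . . . . . . . . . . .   B-2
                 B.3   Percentage Area Under the Normal Curve  . . . . .   B-3
                 B.4   Student t . . . . . . . . . . . . . . . . . . . .   B-5
                 B.5   Percentage Area Under the Student t . . . . . . .   B-5
                 B.6   Chi Square  . . . . . . . . . . . . . . . . . . .   B-6
                 B.7   Sample Mean, Standard Deviation, and Confidence
                       Intervals for the Mean and Variance . . . . . . .   B-6
                 B.8   Determination of the Number of Samples  . . . . .   B-8
                 B.9   Probability of Exceeding a Standard . . . . . . .   B-9
                 B.10  Hypothesis Testing  . . . . . . . . . . . . . . .   B-10
                 B.11  Power Spectrum Analysis . . . . . . . . . . . . .   B-11
                 B.12  Comparing Two Means . . . . . . . . . . . . . . .   B-18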

-------
                             CONTENTS  (Cont'd.)
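                 B.13  Percentage Area Under the F Distribution  . . . .   B-19
                 B.14  F Distribution  . . . . . . . . . . . . . . . . .   B-19
                 B.15  Significant Test between Variabilities of
                       Two Samples . . . . . . . . . . . . . . . . . . .   B-19
                 B.16  Significant Test between the Population Variability
                       and the Sample Variability  . . . . . . . . . . .   B-21
    Appendix C.  Nomenclature  . . . . . . . . . . . . . . . . . . . . .   C-1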

-------
                                    FIGURES
Number
   A-1   Normal  distribution  	    A-2
   A-2   Distribution of student t with 6=4 degrees of freedom  ....    A-7
   A-3   Chi square distribution	    A-7
   B-1   Standard normal distribution 	    B-4
   B-2   Time record of TOC of municipal wastewater at Racine,
         Wisconsin	    B-17
   B-3   Power spectrum of TOC concentration of municipal  wastewater
         at Racine, Wisconsin 	    B-17

-------
                                  SECTION 1
                                 INTRODUCTION

    Statistical techniques are useful in assessing the quality of a sampling
program.  Field personnel engaged in sample collection frequently do not have
the time to study and understand thoroughly all the statistics required to
take a representative sample.  The computer programs described herein were
developed for those people and are designed so that an individual may use
the programs and obtain the benefits of the statistical package without an
in-depth understanding of the statistics employed.  A disc containing the
programs will be provided by the Environmental Monitoring and Support
Laboratory - Cincinnati (EMSL-Cincinnati) upon request.
    For those who wish to know more of the details of the statistical
package, descriptions are presented in the Appendices.  Those who wish to
proceed directly to the computer portion will find the instructions and
examples in Sections 2 and 3.  The programs are user-friendly to those
familiar with the IBM PC.
Typical Examples for Use of the Programs
    To assist the user in working with the computer programs, a series of
questions and answers has been developed.  Questions that those designing
field sampling programs may wish to have answered are listed below, together
with the names of the computer programs designed to answer them:
Question - How many samples must be  taken to reduce the anticipated error to
some reasonably fixed value?

-------
Answer - Use program No. 8 "Determination of Sample Number"  if the reduction
of the anticipated error is based on the accuracy of the sample variance.
Use program No. 9 "Determination of Sample Number" if the reduction of  the
anticipated error is based on the accuracy of the mean.
Question - What is the probability of an effluent exceeding  a  standard?
Answer - Use program No. 10 "Probability of Exceeding the Standard."
Question - How does one test whether a sample belongs in a particular
distribution?
Answer - Use program No. 11, "Hypothesis Testing."
Question - What is the sampling frequency required to capture  a significant
event in a long-term monitoring program?
Answer - Use program No. 12, "Power Spectrum Analysis."
Question - How does one determine the sample mean, standard  deviation,  and
confidence intervals for the mean and variance?
Answer - Use program No. 7, "Sample Mean, Standard Deviation,  and  Confidence
Intervals for Population Mean and Variance."
Question - Which program should one use to correlate observed  data in a
linear manner?
Answer - Use program No. 1, "Linear Regression," to determine the linear
relationship and its correlation coefficient.
Question - A material is treated by two different processes.   Would there be
any justification for saying there was a difference between  the two
processes?  Which program should one use to answer this  question?
Answer - Use program No. 13, "Comparing Two Means."
Question - New equipment is used to measure a compound, and the measurement
uniformity is expected to improve.  Is the improvement (greater uniformity)
real, or has it occurred by chance?  Which program should one use to test for
a significant difference between the variances of two samples?
Answer - Use program No. 16, "Test for Significant Difference between
Variabilities of Two Samples."

-------
                               SECTION 2

           INSTRUCTIONS FOR USING SAMPLING PROGRAMS ON THE IBM PC



    Some individuals, especially those who have had extensive computer
experience, will be at ease with these programs in a few minutes.  In those
cases the instructions may be bypassed, and the reader may begin to run the
programs immediately.  For those who need additional assistance, the
following instructions explain how to "boot up" the programs and make
logical selections.



Instructions to load and use the disk
1.  Place the program disk in Disk Drive A and close the door.
2.  Turn on the power to each device, beginning with the printer, then the
    monitor, and, finally, the computer.  After a brief warm-up, you will
    see the program menu:

           ******************************************
           *     PROGRAM MENU ........ PAGE 1       *
           ******************************************

    1.  LINEAR REGRESSION
    2.  CALCULATION  OF NORMAL  DEVIATE Z
    3.  CALCULATION  OF THE  PERCENTAGE AREA OF NORMAL DISTRIBU-
        TION  (FROM  MINUS  INFINITY TO NORMAL DEVIATE Z)
    4.  CALCULATION  OF STUDENT T
    5.  CALCULATION  OF THE  PERCENTAGE AREA OF STUDENT T  DIST-
        RIBUTION 

-------
3.  Type an option number after the question mark (?) and press ENTER.  The
    desired program will be loaded into the computer.
4.  After you run the desired program, you have several choices:
    (a)  go back to the program menu,
    (b)  do another calculation,
    (c)  quit.
    Select one by typing the requested option number and pressing ENTER.
5.  If you want to abort a calculation, press the CONTROL-BREAK keys.  If you
    want to start over again, type "A:EMSLSTAT" and press ENTER.

-------
                             SECTION 3

                     EXAMPLES OF SAMPLING PROGRAMS
                *     PROGRAM MENU ........ PAGE 1       *
                ******************************************

          1. LINEAR REGRESSION
          2. CALCULATION OF NORMAL DEVIATE Z
          3. CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
              TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
          4. CALCULATION OF STUDENT T
          5. CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
              RIBUTION (FROM MINUS INFINITY TO STUDENT T)
          6. CALCULATION OF CHI SQUARE
          7. CALCULATION OF SAMPLE MEAN, STANDARD DEVIATION, AND CON-
              FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
          8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
              THE VARIANCE
          9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
              THE SAMPLE MEAN
         10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
         11. HYPOTHESIS TESTING
         12. POWER SPECTRUM ANALYSIS
         13. PROCEED TO NEXT PAGE
         14. QUIT
         TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 1
                 ************************
                 * 1. LINEAR REGRESSION *
                 ************************
THIS PROGRAM ESTIMATES A LINE, Y=A+BX, WHERE X IS THE INDEPEN-
DENT VARIABLE AND Y IS THE DEPENDENT VARIABLE.

ANSWER EACH QUESTION AFTER A QUESTION MARK  (?) AND THEN PRESS ENTER.

DEFINE X(INDEPENDENT VARIABLE SUCH AS BOD, ETC)=? BOD
DEFINE Y(DEPENDENT VARIABLE SUCH AS TOC, ETC)=? TOC
YOUR DATA STORED IN A FILE MUST BE IN X(INDEPENDENT VAR)
AND Y(DEPENDENT VAR) FORMAT (FOR EXAMPLE, 30.1,100.3).
AN EXAMPLE FILE TEST1.DAT IS ON THIS DISK WHICH YOU CAN USE FOR
A TEST RUN.
IS YOUR DATA STORED IN A FILE (Y/N) ? Y
INPUT FILENAME(NO MORE THAN 8 CHARACTERS)
DATA FROM DISK A, TYPE A:  DATA FROM HARD DISK, TYPE C: FIRST
AND THEN TYPE FILENAME.


DO YOU WISH TO LIST THE FILENAME BEFORE YOU PROCEED(Y/N) ? N
 TYPE FILENAME? A:TEST1.DAT

-------
 SELECT  NUMBER  OF  OPTION:
           1. LIST INPUT DATA
           2. MODIFY  OR  ADD  INPUT  DATA
           3. DELETE  SOME OF THE DATA
           4. PERFORM REGRESSION ANALYSIS
           5. STORE DATA
           6. GO TO PROGRAM  MENU
           7. DO ANOTHER REGRESSION
 OPTION  ?  1
 LISTING OF  DATA
 DATA POINT        X             Y
  1                10            115
  2                31            249
  3                17            208
  4                42            374
  5                36            307
  6                33            299

 SELECT  NUMBER  OF  OPTION:
           1. LIST INPUT DATA
          2. MODIFY  OR  ADD  INPUT  DATA
          3. DELETE  SOME  OF THE DATA
          4. PERFORM REGRESSION ANALYSIS
          5. STORE DATA
          6. GO TO PROGRAM  MENU
          7. DO ANOTHER REGRESSION
 OPTION  ? 4

 REGRESSION  EQUATION:
 Y= 55.95309 +  7.196932  X

 COEFFICIENT OF CORRELATION= .9712766

 ACTUAL  VERSUS ESTIMATED VALUES
 X=BOD   Y=TOC
 X             Y              ESTIMATED Y   ERROR
 10            115            127.9224     -12.92239
 31            249            279.058      -30.05798
 17            208            178.3009      29.69908
 42            374            358.2243      15.77576
 36            307            315.0426     -8.042633
 33            299            293.4519      5.548157

SELECT  NUMBER OF OPTION:
          1. LIST  INPUT DATA
          2. MODIFY  OR  ADD  INPUT DATA
          3. DELETE  SOME OF  THE DATA
          4. PERFORM REGRESSION ANALYSIS
          5. STORE DATA
          6. GO TO PROGRAM  MENU
          7. DO ANOTHER REGRESSION
OPTION  ? 6
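
The regression run above can be checked independently. The following is a minimal Python sketch, not the program supplied on the disk: the function name linear_regression is hypothetical, and the sketch assumes the ordinary least-squares formulas for the line Y = A + BX. It uses the six BOD/TOC pairs listed in the example.

```python
import math


def linear_regression(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                   # slope
    a = mean_y - b * mean_x         # intercept
    r = sxy / math.sqrt(sxx * syy)  # coefficient of correlation
    return a, b, r


# Data pairs from the example listing (X = BOD, Y = TOC).
bod = [10, 31, 17, 42, 36, 33]
toc = [115, 249, 208, 374, 307, 299]

a, b, r = linear_regression(bod, toc)
print(f"Y = {a:.5f} + {b:.6f} X")
print(f"COEFFICIENT OF CORRELATION = {r:.7f}")
for x, y in zip(bod, toc):
    estimate = a + b * x
    print(x, y, round(estimate, 4), round(y - estimate, 5))
```

The coefficients should agree with the printed regression equation to within rounding; the ERROR column in the example is the observed Y minus the estimated Y.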

-------
                *     PROGRAM MENU ........ PAGE 1       *
                ******************************************

           1. LINEAR REGRESSION
           2. CALCULATION OF NORMAL  DEVIATE Z
           3. CALCULATION OF THE PERCENTAGE AREA  OF  NORMAL DISTRIBU-
              TION  (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
           4. CALCULATION OF STUDENT T
           5. CALCULATION OF THE PERCENTAGE AREA  OF  STUDENT T  DIST-
              RIBUTION (FROM MINUS  INFINITY TO STUDENT  T)
           6. CALCULATION OF CHI SQUARE
           7. CALCULATION OF SAMPLE  MEAN,STANDARD DEVIATION, AND  CON-
              FIDENCE INTERVALS FOR THE POPULATION  MEAN AND VARIANCE
           8. CALCULATION OF SAMPLE  NUMBER BASED  ON  THE  ACCURACY  OF
              THE VARIANCE
           9. CALCULATION OF SAMPLE  NUMBER BASED  ON  THE  ACCURACY  OF
              THE SAMPLE MEAN
         10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
         11. HYPOTHESIS TESTING
         12. POWER SPECTRUM ANALYSIS
         13. PROCEED TO NEXT PAGE
         14. QUIT
         TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 2
                        ***********************
                        * 2. NORMAL DEVIATE Z *
                        ***********************


Z IS THE DISTANCE FROM THE POPULATION MEAN IN UNITS OF THE STANDARD
DEVIATION IN A NORMAL DISTRIBUTION CURVE. THE CREATION OF THE CONFIDENCE
INTERVAL FOR THE MEAN AT A CERTAIN CONFIDENCE LEVEL REQUIRES THE VALUE OF Z.
TO USE THIS PROGRAM TO CALCULATE THE Z VALUE REQUIRES THE USER TO PROVIDE
THE CONFIDENCE LEVEL (TWO-TAILED TEST). INPUT A VALUE LESS THAN 99.997 %.

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.


         INPUT CONFIDENCE LEVEL %  ? 95
              *********************************************
                   CONFIDENCE LEVEL =            95  %
                   THE NORMAL DEVIATE Z =        1.959961
              *********************************************
             DO YOU WISH TO DO ANOTHER CALCULATION (Y/N)? N
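
As a cross-check on the run above, the two-tailed normal deviate can be sketched in Python with the standard library. This is an illustration under stated assumptions, not the algorithm used on the disk: the function name normal_deviate is hypothetical, and the sketch assumes that a two-tailed confidence level of C percent corresponds to a cumulative area of 0.5 + C/200.

```python
from statistics import NormalDist


def normal_deviate(confidence_percent):
    """Two-tailed normal deviate Z for a confidence level given in percent."""
    if not 0.0 < confidence_percent < 100.0:
        raise ValueError("confidence level must lie between 0 and 100 percent")
    # A two-tailed level of C percent leaves (100 - C)/2 percent in each tail,
    # so Z is the point with cumulative area 0.5 + C/200.
    cumulative_area = 0.5 + confidence_percent / 200.0
    return NormalDist().inv_cdf(cumulative_area)


z = normal_deviate(95)
print(f"THE NORMAL DEVIATE Z = {z:.6f}")
```

For a 95 percent level this gives a value close to 1.96. The value printed by the program differs slightly in its last digits, which may reflect the approximation it uses.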

-------
                ******************************************
                *     PROGRAM MENU ........ PAGE 1       *
                ******************************************

          1. LINEAR REGRESSION
          2. CALCULATION OF NORMAL  DEVIATE Z
          3. CALCULATION OF THE PERCENTAGE AREA  OF  NORMAL  DISTRIBU-
              TION  (FROM MINUS  INFINITY TO NORMAL DEVIATE  Z)
          4. CALCULATION OF STUDENT T
          5. CALCULATION OF THE PERCENTAGE AREA  OF  STUDENT T  DIST-
              RIBUTION  (FROM MINUS  INFINITY TO STUDENT  T)
          6. CALCULATION OF CHI SQUARE
          7. CALCULATION OF SAMPLE  MEAN,STANDARD DEVIATION, AND  CON-
              FIDENCE INTERVALS FOR THE POPULATION  MEAN AND VARIANCE
          8. CALCULATION OF SAMPLE  NUMBER BASED  ON  THE  ACCURACY  OF
              THE VARIANCE
          9. CALCULATION OF SAMPLE  NUMBER BASED  ON  THE  ACCURACY  OF
              THE SAMPLE MEAN
         10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
         11. HYPOTHESIS TESTING
         12. POWER SPECTRUM ANALYSIS
         13. PROCEED TO NEXT PAGE
         14. QUIT
         TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ?  3
    * 3. CALCULATION OF THE PERCENTAGE AREA % OF NORMAL DISTRIBUTION  *
THIS IS A NORMAL DISTRIBUTION PROGRAM TO CALCULATE THE PROBABILITY
INTEGRATED FROM MINUS INFINITY TO A NORMAL DEVIATE Z.
THE USER HAS TO INPUT A VALUE OF NORMAL DEVIATE Z.
DO NOT EXCEED A Z VALUE OF 4.12 WHICH GENERATES AN AREA OF 99.99901  %

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY  1 AND RETURN.

INPUT A Z VALUE =? 1.96

         *******************************************************
         AREA OF NORMAL DISTRIBUTION (FROM Z=MINUS INFINITY TO
         Z= 1.96 ) = 97.50023  %
         *******************************************************
          DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
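
The percentage area in this run is the cumulative normal probability expressed in percent. A minimal Python sketch of the same quantity follows; the function name normal_area_percent is hypothetical and the standard-library NormalDist class stands in for whatever routine the disk program uses.

```python
from statistics import NormalDist


def normal_area_percent(z):
    """Percentage area under the standard normal curve from minus infinity to z."""
    return 100.0 * NormalDist().cdf(z)


area = normal_area_percent(1.96)
print(f"AREA (FROM Z=MINUS INFINITY TO Z= 1.96 ) = {area:.5f} %")
```

Because the curve is symmetric, the areas for Z and -Z add to 100 percent, which gives a quick sanity check on any value the program returns.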


-------
       ******************************************
       *     PROGRAM MENU ........ PAGE 1       *
       ******************************************

  1. LINEAR REGRESSION
  2. CALCULATION OF NORMAL  DEVIATE Z
  3. CALCULATION OF THE PERCENTAGE AREA  OF NORMAL  DISTRIBU-
     TION  (FROM MINUS INFINITY TO NORMAL DEVIATE  Z)
  4. CALCULATION OF STUDENT T
  5. CALCULATION OF THE PERCENTAGE AREA  OF STUDENT T  DIST-
     RIBUTION  (FROM MINUS  INFINITY TO STUDENT T)
  6. CALCULATION OF CHI SQUARE
  7. CALCULATION OF SAMPLE  MEAN,STANDARD DEVIATION, AND CON-
     FIDENCE INTERVALS FOR THE POPULATION MEAN  AND VARIANCE
  8. CALCULATION OF SAMPLE  NUMBER BASED  ON THE ACCURACY OF
     THE VARIANCE
  9. CALCULATION OF SAMPLE  NUMBER BASED  ON THE ACCURACY OF
     THE SAMPLE MEAN
10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
11. HYPOTHESIS TESTING
12. POWER SPECTRUM ANALYSIS
13. PROCEED TO NEXT PAGE
14. QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ?  4
          **********************************
          * 4.     STUDENT T               *
          **********************************
IF THE VARIABILITY OF A NORMAL DISTRIBUTION IS ESTIMATED FROM
A SET OF SAMPLES, A STUDENT T DISTRIBUTION INSTEAD OF THE NORMAL
DISTRIBUTION IS USED TO CREATE THE CONFIDENCE INTERVAL FOR THE MEAN.
THE STUDENT T IS TO BE CALCULATED FROM THE PROVIDED CONFIDENCE
LEVEL FOR THE POPULATION MEAN AND THE DEGREES OF FREEDOM WHICH
ARE ONE LESS THAN THE NUMBER OF SAMPLES. THE CONFIDENCE LEVEL
PROVIDED BY USER HERE IS FOR TWO-TAILED TEST.

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
PRESS FUNCTION KEY 1 AND THEN PRESS ENTER TO START OVER AGAIN.

 INPUT CONFIDENCE LEVEL? 95
 INPUT DEGREES OF FREEDOM =? 12
*********************************************************
DEGREES OF FREEDOM                     =12
CONFIDENCE LEVEL FOR THE POPULATION MEAN= 95  %
THE STUDENT T =                               2.178711
*********************************************************

-------
      DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
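
The Student t run above can be reproduced approximately with the standard library alone. The sketch below is an assumption-laden illustration, not the method used on the disk: the names t_pdf, t_cdf, and student_t are hypothetical, the cumulative area is obtained by Simpson's rule, and the t value is found by bisection. It is intended for moderate confidence levels and degrees of freedom, where the search bracket stays small.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))


def t_cdf(t, df, steps=400):
    """Area from minus infinity to t, using Simpson's rule on [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if t >= 0 else 0.5 - half


def student_t(confidence_percent, df):
    """Two-tailed Student t for a confidence level given in percent."""
    if not 0.0 < confidence_percent < 100.0:
        raise ValueError("confidence level must lie between 0 and 100 percent")
    target = 0.5 + confidence_percent / 200.0
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it holds the answer
        hi *= 2
    for _ in range(60):             # bisection on the cumulative area
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_value = student_t(95, 12)
print(f"THE STUDENT T = {t_value:.6f}")
```

With 12 degrees of freedom and a 95 percent level the sketch returns a value near 2.179, in agreement with the run above to about three decimal places.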

            ******************************************
            *     PROGRAM MENU ........ PAGE 1       *
            ******************************************

      1. LINEAR REGRESSION
      2. CALCULATION OF NORMAL DEVIATE Z
      3. CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
          TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
      4. CALCULATION OF STUDENT T
      5. CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
          RIBUTION (FROM MINUS INFINITY TO STUDENT T)
      6. CALCULATION OF CHI SQUARE
      7. CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
          FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
      8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE VARIANCE
      9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE SAMPLE MEAN
     10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
     11. HYPOTHESIS TESTING
     12. POWER SPECTRUM ANALYSIS
     13. PROCEED TO NEXT PAGE
     14. QUIT
     TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 5
* 5. CALCULATION OF THE PERCENTAGE AREA % OF STUDENT T DISTRIBUTION *
     THIS PROGRAM IS TO CALCULATE THE PERCENTAGE AREA OF STUDENT
     T DISTRIBUTION.  THE AREA IS INTEGRATED FROM MINUS INFINITY TO THE
     T VALUE WHICH HAS TO BE PROVIDED BY THE USER. THE DEGREES OF
     FREEDOM ARE ALSO NEEDED.

     ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.

     IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.
      INPUT DEGREES OF FREEDOM=? 12
      INPUT STUDENT T VALUE =? 2
   *******************************************
      DEGREES OF FREEDOM =             12
      THE T VALUE        =             2
      THE PERCENTAGE AREA FOR
      THE STUDENT T DISTRIBUTION =     96.56724  %
   *******************************************
      DO YOU WISH TO DO ANOTHER CALCULATION (Y/N)? N
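
The percentage area under the Student t curve can be checked by numerical integration. The following Python sketch is illustrative only: the names t_pdf and t_area_percent are hypothetical, and Simpson's rule is an assumed substitute for the routine on the disk.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))


def t_area_percent(t, df, steps=400):
    """Percentage area from minus infinity to t (Simpson's rule on [0, |t|])."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3            # area between 0 and |t|
    area = 0.5 + half if t >= 0 else 0.5 - half
    return 100.0 * area


area = t_area_percent(2, 12)
print(f"THE PERCENTAGE AREA = {area:.5f} %")
```

For T = 2 with 12 degrees of freedom the result is close to 96.567 percent, matching the run above to about three decimal places.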



-------
       ******************************************
       *     PROGRAM MENU ........ PAGE 1       *
       ******************************************

 1.  LINEAR REGRESSION
 2.  CALCULATION OF NORMAL DEVIATE Z
 3.  CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
     TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
 4.  CALCULATION OF STUDENT T
 5.  CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
     RIBUTION (FROM MINUS INFINITY TO STUDENT T)
 6.  CALCULATION OF CHI SQUARE
 7.  CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
     FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
 8.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
     THE VARIANCE
 9.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
     THE SAMPLE MEAN
10.  CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
11.  HYPOTHESIS TESTING
12.  POWER SPECTRUM ANALYSIS
13.  PROCEED TO NEXT PAGE
14.  QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 6
               *************************
               * 6. CHI-SQUARE PROGRAM *
               *************************
NORMALLY DISTRIBUTED DATA COULD BE TRANSFORMED INTO A UNIT
NORMAL DISTRIBUTION WITH MEAN=0 AND VARIANCE=1.  THE SUM OF
SQUARES OF DEVIATIONS FROM THE SAMPLE MEAN THEN HAS A CHI-SQUARE
DISTRIBUTION WITH (N-1) DEGREES OF FREEDOM WHERE N IS THE NUMBER
OF OBSERVATIONS. ONE OF THE APPLICATIONS FOR THE CHI-SQUARE
DISTRIBUTION IS TO DETERMINE THE CONFIDENCE LIMITS OF THE VARIANCE
ESTIMATION FOR NORMALLY DISTRIBUTED DATA.
TO USE THIS PROGRAM TO DETERMINE THE VALUE OF CHI-SQUARE, THE
USER MUST PROVIDE THE PERCENTAGE AREA AND THE DEGREES OF FREEDOM
FOR THE VARIANCE ESTIMATION.  THE UPPER PERCENTAGE AREA IS THAT
INTEGRATED FROM THE DESIRED VALUE TO INFINITY OF CHI-SQUARE.

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.

INPUT DEGREES OF FREEDOM=? 12
INPUT UPPER PERCENTAGE AREA %=? 95

*******************************************************
DEGREES OF FREEDOM =                 12
CHI-SQUARE ( 12 , .95 ) = 5.22471
THE PERCENTAGE AREA =                95  %
*******************************************************
 DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
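
The chi-square value for a given upper percentage area can be sketched in Python as follows. This is not the routine on the disk: the names chi2_cdf and chi_square are hypothetical, the lower-tail area is computed from the standard series for the incomplete gamma function, and the value is located by bisection. The upper percentage area is taken, as in the program description, to be the area from the desired value to infinity.

```python
import math


def chi2_cdf(x, df):
    """Lower-tail area of the chi-square distribution (incomplete gamma series)."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:     # add series terms until they are negligible
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def chi_square(df, upper_percent):
    """Chi-square value with upper_percent of the area lying above it."""
    if not 0.0 < upper_percent < 100.0:
        raise ValueError("upper percentage area must lie between 0 and 100")
    target = 1.0 - upper_percent / 100.0    # equivalent lower-tail area
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:        # widen the bracket
        hi *= 2
    for _ in range(60):                     # bisection on the lower-tail area
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


value = chi_square(12, 95)
print(f"CHI-SQUARE ( 12 , .95 ) = {value:.5f}")
```

For 12 degrees of freedom and an upper area of 95 percent the sketch gives a value near 5.226, which agrees with the run above to about two decimal places.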



-------
            ******************************************
            *     PROGRAM MENU ........ PAGE 1       *
            ******************************************

      1. LINEAR REGRESSION
      2. CALCULATION OF NORMAL DEVIATE Z
      3. CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
          TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
      4. CALCULATION OF STUDENT T
      5. CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
          RIBUTION (FROM MINUS INFINITY TO STUDENT T)
      6. CALCULATION OF CHI SQUARE
      7. CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
          FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
      8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE VARIANCE
      9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE SAMPLE MEAN
     10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
     11. HYPOTHESIS TESTING
     12. POWER SPECTRUM ANALYSIS
     13. PROCEED TO NEXT PAGE
     14. QUIT
     TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 7
* 7. CALCULATION OF SAMPLE MEAN, STANDARD DEVIATION, AND *
*    CONFIDENCE INTERVALS FOR THE MEAN AND THE VARIANCE  *
   ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
   IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.

   DATA MUST BE INPUT TO THE COMPUTER BY THE USER EITHER FROM THE
   KEYBOARD OR FROM A FILE ON THE DISK BEFORE CALCULATIONS CAN BE
   PERFORMED.
   DEFINE Y
-------
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. START TO CALCULATE
OPTION ? 1
LISTING OF DATA
FOR SET 1
DATA POINTS     Y
 1              50
 2              30
 3              40
 4              30
 5              35

FOR SET 2
DATA POINTS     Y
 1              40
 2              40
 3              35
 4              45
 5              50
 6              35
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. START TO CALCULATE
OPTION ? 5
INPUT CONFIDENCE LEVEL % FOR THE POPULATION MEAN? 95
INPUT CONFIDENCE LEVEL % FOR THE STANDARD DEVIATION? 95

         FOR SAMPLE SET  1
         START TO CALCULATE THE SAMPLE MEAN, STANDARD DEVIATION,
         CONFIDENCE INTERVALS FOR THE MEAN AND STANDARD DEVIATION.

-------
******************************************************
NUMBER OF SAMPLES =                        5
MEAN =                                     37
ESTIMATED STANDARD DEVIATION =             8.3666
DEGREES OF FREEDOM =                       4
CONFIDENCE LEVEL FOR THE MEAN =            95  %
THE STUDENT T =                            2.776367
THE CONFIDENCE INTERVAL FOR THE MEAN
 26.61179 <= POPULATION MEAN <= 47.38822

CONFIDENCE LEVEL FOR THE VARIANCE =        95  %
X^2=CHI-SQUARE
X^2( 4 , .975 ) =                          .4834985
X^2( 4 , .025 ) =                          11.15244

THE CONFIDENCE INTERVAL FOR THE STANDARD DEVIATION
 5.010649 <= STANDARD DEVIATION <= 24.06476
******************************************************

PRESS ENTER TO CONTINUE?
         FOR SAMPLE SET  2
         START TO CALCULATE THE SAMPLE MEAN, STANDARD DEVIATION,
         CONFIDENCE INTERVALS FOR THE MEAN AND STANDARD DEVIATION.
******************************************************
NUMBER OF SAMPLES =                        6
MEAN =                                     40.83333
ESTIMATED STANDARD DEVIATION =             5.845227
DEGREES OF FREEDOM =                       5
CONFIDENCE LEVEL FOR THE MEAN =            95  %
THE STUDENT T =                            2.570313
THE CONFIDENCE INTERVAL FOR THE MEAN
 34.69979 <= POPULATION MEAN <= 46.96688

CONFIDENCE LEVEL FOR THE VARIANCE =        95  %
X^2=CHI-SQUARE
X^2( 5 , .975 ) =                          .8301781
X^2( 5 , .025 ) =                          12.83213

THE CONFIDENCE INTERVAL FOR THE STANDARD DEVIATION
 3.64869 <= STANDARD DEVIATION <= 14.345
****************************************************

PRESS ENTER TO CONTINUE?
DO YOU WANT TO DO ANOTHER CALCULATION  (Y/W)? N
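
The figures printed for sample set 1 can be reproduced with a short calculation. The following Python sketch is an illustration only, not the program's own source: the function names are hypothetical, and the Student t and chi-square values are simply copied from the printout above rather than computed.

```python
import math
import statistics

def mean_interval(data, t_value):
    """Return (mean, s, lower, upper) for the population mean.

    t_value is the tabulated Student t for n - 1 degrees of freedom.
    """
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)              # uses n - 1 in the denominator
    half_width = t_value * s / math.sqrt(n)
    return mean, s, mean - half_width, mean + half_width

def sd_interval(data, chi2_lower, chi2_upper):
    """Return (lower, upper) limits for the population standard deviation."""
    df = len(data) - 1
    sum_sq = df * statistics.variance(data)
    return math.sqrt(sum_sq / chi2_upper), math.sqrt(sum_sq / chi2_lower)

# Sample set 1, with the t and chi-square values taken from the printout.
set1 = [50, 30, 40, 30, 35]
mean, s, mean_lo, mean_hi = mean_interval(set1, t_value=2.776367)
sd_lo, sd_hi = sd_interval(set1, chi2_lower=0.4834985, chi2_upper=11.15244)
print(mean, round(s, 4), round(mean_lo, 3), round(mean_hi, 3))
print(round(sd_lo, 3), round(sd_hi, 3))
```

Passing the table values in as arguments keeps the sketch free of any assumption about how the program evaluates the t and chi-square distributions.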
                                  15

-------
                ******************************************
                *     PROGRAM MENU ........ PAGE 1       *
                ******************************************

          1. LINEAR REGRESSION
          2. CALCULATION OF NORMAL DEVIATE Z
          3. CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
              TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
          4. CALCULATION OF STUDENT-T
          5. CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
              RIBUTION (FROM MINUS INFINITY TO STUDENT T)
          6. CALCULATION OF CHI SQUARE
          7. CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
              FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
          8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
              THE VARIANCE
          9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
              THE SAMPLE MEAN
         10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
         11. HYPOTHESIS TESTING
         12. POWER SPECTRUM ANALYSIS
         13. PROCEED TO NEXT PAGE
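         14. QUIT
         TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 8
 *************************************************************************
 * 8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF THE VARIANCE *
 *************************************************************************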
THE USER HAS TO PROVIDE THE ALLOWABLE ERROR RATIO  ( WIDTH OF CONFIDENCE
INTERVAL OF STANDARD DEVIATION / STANDARD DEVIATION ) AND THE CONFIDENCE
LEVEL FOR THE MEAN.

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.

ALLOWABLE ERROR RATIO = ? .5
CONFIDENCE LEVEL % =    ? 95
     **************************************************
      NUMBER OF SAMPLE REQUIRED=                34
      THE CONFIDENCE LEVEL FOR THE MEAN=        95
      ALLOWABLE ERROR OF STANDARD DEVIATION=    .5
     **************************************************


          DO YOU WISH TO DO ANOTHER CALCULATION (Y/N)? N

                                    16

-------
              ******************************************
              *     PROGRAM MENU ........ PAGE 1       *
              ******************************************

        1.  LINEAR REGRESSION
        2.  CALCULATION OF NORMAL DEVIATE Z
        3.  CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
            TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
        4.  CALCULATION OF STUDENT T
        5.  CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
            RIBUTION (FROM MINUS INFINITY TO STUDENT T)
        6.  CALCULATION OF CHI SQUARE
        7.  CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
            FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
        8.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
            THE VARIANCE
        9.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
            THE SAMPLE MEAN
       10.  CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
       11.  HYPOTHESIS TESTING
       12.  POWER SPECTRUM ANALYSIS
       13.  PROCEED TO NEXT PAGE
       14.  QUIT
       TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 9
*********************************************************************
* 9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF THE MEAN *
*********************************************************************


  THIS PROGRAM IS TO DETERMINE THE SAMPLE NUMBER BASED ON THE ACCURACY
  OF THE MEAN.  IF THE CALCULATED SAMPLE NUMBER IS LESS THAN 3, THEN
  3 IS SELECTED. THE USER HAS TO PROVIDE THE FOLLOWING:

  CONFIDENCE LEVEL % FOR THE MEAN
  COEFFICIENT OF VARIATION (STANDARD DEVIATION/SAMPLE MEAN) IN %
  ERROR % OF THE MEAN REQUIRED ((SAMPLE MEAN-POPULATION
   MEAN)/POPULATION MEAN)

  ANSWER EACH QUESTION AFTER A QUESTION MARK  (?) AND THEN PRESS ENTER.
  PRESS FUNCTION KEY 1 AND THEN PRESS ENTER TO START OVER AGAIN.

        CONFIDENCE LEVEL % FOR THE MEAN=? 95
        COEFFICIENT OF VARIATION IN %=   ? 50
        ERROR % OF THE MEAN=             ? 10
  ***************************************************************
  CONFIDENCE LEVEL % FOR THE MEAN=                          95
  COEFFICIENT OF VARIATION (STANDARD DEVIATION/SAMPLE MEAN)= 50  %
  ERROR OF THE MEAN=                                        10  %
  NUMBER OF SAMPLES REQUIRED=                                *?*?
                                    17

-------
     DO YOU WISH TO DO ANOTHER CALCULATION (Y/N)? N

        ******************************************
        *     PROGRAM MENU ........ PAGE 1       *
        ******************************************

  1.  LINEAR  REGRESSION
  2.  CALCULATION  OF  NORMAL  DEVIATE  Z
  3.  CALCULATION  OF  THE PERCENTAGE  AREA OF NORMAL DISTRIBU-
     TION  (FROM  MINUS  INFINITY TO  NORMAL  DEVIATE Z)
  4.  CALCULATION  OF  STUDENT T
  5.  CALCULATION  OF  THE PERCENTAGE  AREA OF STUDENT T  DIST-
     RIBUTION (FROM MINUS  INFINITY TO  STUDENT  T)
  6.  CALCULATION  OF  CHI  SQUARE
  7.  CALCULATION  OF  SAMPLE  MEAN,STANDARD DEVIATION, AND  CON-
     FIDENCE  INTERVALS FOR THE POPULATION MEAN AND VARIANCE
  8.  CALCULATION  OF  SAMPLE  NUMBER BASED ON THE  ACCURACY  OF
     THE VARIANCE
  9.  CALCULATION  OF  SAMPLE  NUMBER BASED ON THE  ACCURACY  OF
     THE SAMPLE  MEAN
10.  CALCULATION  OF  THE PROBABILITY OF  EXCEEDING A STANDARD
11.  HYPOTHESIS TESTING
12.  POWER SPECTRUM  ANALYSIS
13.  PROCEED TO NEXT PAGE
14.  QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS  ENTER ?  10
*******************************************************
* 10. PROBABILITY OF AN EFFLUENT EXCEEDING A STANDARD *
*******************************************************
THIS PROGRAM IS TO INVESTIGATE THE PROBABILITY OF EXCEEDING
A STANDARD. THIS REQUIRES THE KNOWLEDGE OF  :

1. POPULATION MEAN
2. STANDARD DEVIATION
3. THE STANDARD NOT TO BE EXCEEDED

ANSWER EACH QUESTION AFTER A QUESTION MARK  (?) AND PRESS ENTER.
PRESS FUNCTION KEY 1 AND THEN PRESS ENTER TO START OVER AGAIN.

 INPUT STANDARD=                    ? 100
 INPUT POPULATION MEAN=             ? 120
 INPUT STANDARD DEVIATION=          ? 20


********************************************************
STANDARD=                                   100
POPULATION MEAN=                            120
STANDARD DEVIATION=                         20
PROBABILITY OF EXCEEDING STANDARD =  84.13449  %
********************************************************


                             18

-------
      DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
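
The result of option 10 is consistent with the upper-tail area of a normal distribution whose mean and standard deviation are the values entered. A minimal Python sketch follows, assuming normality (the printout does not state the method, and the function name is hypothetical):

```python
from statistics import NormalDist

def probability_of_exceeding(standard, mean, std_dev):
    """Upper-tail area of a normal distribution above the standard."""
    return 1.0 - NormalDist(mu=mean, sigma=std_dev).cdf(standard)

# Inputs from the example run: standard 100, mean 120, standard deviation 20.
p = probability_of_exceeding(100, 120, 20)
print(f"PROBABILITY OF EXCEEDING STANDARD = {100 * p:.3f} %")
```

Because the standard lies one standard deviation below the mean, the probability is the normal area from minus infinity to Z = 1, which agrees with the printed 84.13449 % to the precision shown.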

            ******************************************
            *     PROGRAM MENU ........ PAGE 1       *
            ******************************************
      1. LINEAR REGRESSION
      2. CALCULATION OF NORMAL DEVIATE Z
      3. CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
          TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
      4. CALCULATION OF STUDENT T
      5. CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
          RIBUTION (FROM MINUS INFINITY TO STUDENT T)
      6. CALCULATION OF CHI SQUARE
      7. CALCULATION OF SAMPLE MEAN, STANDARD DEVIATION, AND CON-
          FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
      8. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE VARIANCE
      9. CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
          THE SAMPLE MEAN
     10. CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
     11. HYPOTHESIS TESTING
     12. POWER SPECTRUM ANALYSIS
     13. PROCEED TO NEXT PAGE
     14. QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 11
               **************************
               * 11. HYPOTHESIS TESTING *
               **************************

     HYPOTHESIS TESTING IS TO TEST WHETHER A SAMPLE COMES FROM
     A PARTICULAR DISTRIBUTION. IN ORDER TO USE THIS HYPOTHESIS
     TESTING, THE USER CAN SELECT ANY OPTIONS IF NO DATA POINTS
     HAVE BEEN ENTERED.  IF DATA POINTS ARE TO BE ENTERED FROM THE
     KEYBOARD OR FROM A FILE ON THE DISK, THEN OPTION 3 MUST BE SELECTED.

     GROUP 1                                 GROUP 2
     1. POPULATION MEAN                      1. POPULATION MEAN
     2. SAMPLE MEAN                          2. ONE SAMPLE VALUE
     3. NUMBER OF SAMPLE                     3. POPULATION STANDARD DEVIATION
     4. POPULATION STANDARD DEVIATION        4. CONFIDENCE LEVEL % FOR THE MEAN
     5. CONFIDENCE LEVEL % FOR THE MEAN

     GROUP 3
     1. POPULATION MEAN
     2. SAMPLE MEAN
     3. NUMBER OF SAMPLE
     4. SAMPLE STANDARD DEVIATION
     5. CONFIDENCE LEVEL % FOR THE MEAN

     ANSWER EACH QUESTION AFTER A QUESTION MARK  (?) AND THEN PRESS ENTER.
     IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY  1 AND RETURN.
     INPUT GROUP NUMBER 
-------
        DO  YOU  HAVE  TO  INPUT DATA SET (Y/N)? Y
        IS  YOUR DATA STORED  IN A FILE (Y/N)  ? Y
        INPUT FILENAME(NO  MORE THAN 8 CHARACTERS)
        DATA FROM  DISK  A,  TYPE A:   DATA  FROM HARD  DISK,  TYPE C:  FIRST
        THEN TYPE  FILENAME.   EXAMPLE FILENAMES  LIN1  AND LIN2 ARE
        AVAILABLE  ON DISK  A  FOR YOUR USE.   IF THE  PROGRAMS HAVE  BEEN
        LOADED TO THE HARD DISK, THEN THE FILENAMES ARE ON THE HARD DISK.
       DO YOU WISH TO LIST THE FILENAME BEFORE YOU PROCEED(Y/N) ? N
       HOW MANY SETS OF DATA TO BE RETRIEVED? 2
        TYPE FILENAME FOR SET 1 ? A:LIN1
        TYPE FILENAME FOR SET 2 ? A:LIN2

SELECT NUMBER OF OPTION:
           1. LIST INPUT DATA
           2. MODIFY OR ADD INPUT DATA
           3. DELETE SOME OF INPUT DATA
           4. STORE DATA
           5. START TO CALCULATE
OPTION ? 1
LISTING OF DATA
FOR SET 1
DATA POINTS      Y
 1             50
 2             30
 3             40
 4             30
 5             35
FOR SET 2
DATA POINTS      Y
 1             40
 2             40
 3             35
 4             45
 5             50
 6             35
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. START TO CALCULATE
OPTION ? 5
                                 20

-------
   FOR SET                                1
   INPUT POPULATION MEAN=               ? 40
   INPUT CONFIDENCE LEVEL % FOR THE POPULATION MEAN=? 95
  ************************************************************
  FOR SET                                 1
  POPULATION MEAN=                        40
  SAMPLE MEAN=                            37
  NUMBER OF SAMPLES=                      5
  SAMPLE STANDARD DEVIATION=           8.3666
  THE STUDENT T FOR 95  % CONFIDENCE LEVEL=        2.776367
  THE CALCULATED T VALUE=                 -.8017838
  THE SAMPLE HAS A MEAN EQUAL TO THE POPULATION MEAN
  ************************************************************

   FOR SET                                2
   INPUT POPULATION MEAN=               ? 40
   INPUT CONFIDENCE LEVEL % FOR THE POPULATION MEAN=? 95
  ************************************************************
  FOR SET                                 2
  POPULATION MEAN=                        40
  SAMPLE MEAN=                            40.83333
  NUMBER OF SAMPLES=                      6
  SAMPLE STANDARD DEVIATION=           5.845227
  THE STUDENT T FOR 95  % CONFIDENCE LEVEL=        2.570313
  THE CALCULATED T VALUE=                  .3492146
  THE SAMPLE HAS A MEAN EQUAL TO THE POPULATION MEAN
  ************************************************************
DO YOU WANT ANOTHER CALCULATION  (Y/N)? N
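
The calculated t values in this run agree with the usual one-sample statistic, (sample mean - population mean) / (s / sqrt(n)). The Python sketch below is an illustration under that assumption; the function names are hypothetical and the tabulated Student t values are copied from the printout.

```python
import math
import statistics

def one_sample_t(data, population_mean):
    """Calculated t for testing a sample mean against a population mean."""
    n = len(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    return (statistics.mean(data) - population_mean) / standard_error

def mean_differs(t_calculated, t_table):
    """Two-sided decision: True when the hypothesis of equal means is rejected."""
    return abs(t_calculated) > t_table

set1 = [50, 30, 40, 30, 35]
set2 = [40, 40, 35, 45, 50, 35]
t1 = one_sample_t(set1, 40)
t2 = one_sample_t(set2, 40)
print(round(t1, 6), mean_differs(t1, 2.776367))
print(round(t2, 6), mean_differs(t2, 2.570313))
```

In both sets the absolute calculated t is below the tabulated value, which corresponds to the program's message that the sample has a mean equal to the population mean.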
                                21

-------
       ******************************************
       *     PROGRAM MENU ........ PAGE 1       *
       ******************************************

 1.  LINEAR REGRESSION
 2.  CALCULATION OF NORMAL DEVIATE Z
 3.  CALCULATION OF THE PERCENTAGE AREA OF NORMAL DISTRIBU-
     TION (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
 4.  CALCULATION OF STUDENT T
 5.  CALCULATION OF THE PERCENTAGE AREA OF STUDENT T DIST-
     RIBUTION (FROM MINUS INFINITY TO STUDENT T)
 6.  CALCULATION OF CHI SQUARE
 7.  CALCULATION OF SAMPLE MEAN,STANDARD DEVIATION, AND CON-
     FIDENCE INTERVALS FOR THE POPULATION MEAN AND VARIANCE
 8.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
     THE VARIANCE
 9.  CALCULATION OF SAMPLE NUMBER BASED ON THE ACCURACY OF
     THE SAMPLE MEAN
10.  CALCULATION OF THE PROBABILITY OF EXCEEDING A STANDARD
11.  HYPOTHESIS TESTING
12.  POWER SPECTRUM ANALYSIS
13.  PROCEED TO NEXT PAGE
14.  QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 12
               *******************************
               * 12. POWER SPECTRUM ANALYSES *
               *******************************


THIS PROGRAM CONDUCTS A POWER SPECTRUM ANALYSIS IN WHICH
THE MINIMUM SAMPLING FREQUENCY REQUIRED TO CAPTURE A SPECIFIC
WATER QUALITY EVENT CAN BE DETERMINED.  THE PROGRAM DETERMINES
THE MAGNITUDE OF COMPONENTS OF THE TOTAL VARIANCE OF A RECORD
THAT RECUR AT CONSTANT TIME INTERVALS.

ANSWER EACH QUESTION AFTER A QUESTION MARK  (?) AND THEN PRESS ENTER.
PRESS FUNCTION KEY  1 AND THEN PRESS ENTER TO START OVER AGAIN.


DEFINE X  (TIME UNIT SUCH AS DAYS, HOURS)= ? HOURS
DEFINE Y(DEPENDENT VARIABLE SUCH AS TOC)= ? TOC
INPUT SAMPLING TIME INTERVAL (IN TIME UNIT)=? 5
 YOUR DATA STORED IN A FILE MUST BE IN Y(DEPENDENT VAR)
 FORMAT SUCH AS 10,20,25,...
 AN EXAMPLE FILE TEST3.DAT IS ON THIS DISK FOR YOUR USE.
IS YOUR DATA STORED IN A FILE (Y/N) ? Y
DO YOU WISH TO LIST THE FILENAME BEFORE YOU CONTINUE TO PROCEED
(Y/N)? N
INPUT FILENAME(NO MORE THAN 8 CHARACTERS)
DATA FROM DISK A, TYPE A:  DATA FROM HARD DISK, TYPE C: FIRST
THEN TYPE FILENAME? A:TEST3.DAT
                             22

-------
   SELECT NUMBER OF OPTION:
         1. LIST INPUT DATA
         2. MODIFY OR ADD  INPUT DATA
         3. DELETE SOME OF THE DATA
         4. PERFORM CALCULATIONS
         5. STORE DATA
         6. GO TO MAIN MENU
         7. DO ANOTHER CALCULATION
    OPTION ? 1

   RECORD OF DATA OF DEPENDENT VARIABLE AT SAMPLING TIME  INTERVALS
           Y

849.96        964.51        916.73        879.38        983.39
1168.8        1235.43       1045.1        672.61        351.6
264.6         381.75        504.82        470.01        306.53
194.99        271.45        477.85        612.64        532.92
305.88        155.24        251.97        553.75        849.81
964.5         916.78        879.36        983.27        1168.68
1235.47       1045.33       672.88        351.75        264.58
381.64        504.78        470.1         306.65        195.01
271.33        477.7         612.62        533.05        306.05
155.27        251.81        553.51        849.65        964.49
916.84        879.34        983.15        1168.57       1235.51
1045.55       673.16        351.91        264.55        381.53
504.74        470.19        306.78        195.03        271.21
477.55        612.59        533.19        306.21        155.3
251.65        553.27        849.49
                  THE TOTAL NUMBER OF SAMPLES=     73
   SELECT NUMBER OF OPTION:
         1. LIST INPUT DATA
         2. MODIFY OR ADD INPUT DATA
         3. DELETE SOME OF THE DATA
         4. PERFORM CALCULATIONS
         5. STORE DATA
         6. GO TO MAIN MENU
         7. DO ANOTHER CALCULATION
    OPTION ? 4

        INPUT NUMBER OF LAGS REQUIRED FOR THE CALCULATION OF
        AUTOCORRELATION COEFFICIENTS. THE SUGGESTED NUMBER  IS ABOUT
        15 % OF THE TOTAL NUMBER OF SAMPLES. SELECT A NUMBER THAT
        CAN DIVIDE 360 AND GENERATE AN INTEGER

        THE DEFAULT LAG NUMBER IS= 12

        DO YOU WANT TO CHANGE THE LAG NUMBER  ? N
        SAMPLE MEAN=                                 601.4425
        ESTIMATED STANDARD DEVIATION=                323.0803

         DO YOU WANT TO SEE THE DETAILS (Y/N)? N
                                 23

-------
     LAG    VARIANCE CONTRIBUTION/          SAMPLING INTERVAL
            TOTAL VARIANCE % ( >.5 %)       TO CAPTURE A SPECIFIC
            (EVENTS OF SIGNIFICANCE         WATER QUALITY EVENT
            LARGE NUMBER, GREATER
            SIGNIFICANCE)
      0         26.22311                    INFINITE
      1         32.52712                     40
      2         19.12219                     20
      3         7.247379                     13.33333
      4         10.41866                     10
      5         4.551088                     8

     LAG  NOMINAL PERIOD
      0     INFINITE
      1         120
      2         60
      3         40
      4         30
      5         24

     TO CAPTURE ALL SPECIFIC WATER QUALITY EVENTS, THE MINIMUM SAMPLING
     INTERVAL SHOULD BE 8  TIME UNITS
      PRESS ENTER TO CONTINUE?
     THE NOMINAL PERIOD MEANS THAT THE EVENT WILL REPEAT ITSELF AFTER
     A PERIOD OF TIME. FOR EXAMPLE, THERE IS AN EVENT THAT REPEATS
     ITSELF EVERY 24  TIME UNITS. IN ORDER TO CAPTURE THE EVENT,
     THE MINIMUM SAMPLING INTERVAL MUST BE 8  TIME UNITS OR LESS.
     OF COURSE, THIS ALSO CAPTURES ALL EVENTS WITH A PERIOD GREATER
     THAN 24  TIME UNITS.
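
The two tables printed for this run are consistent with simple relations between lag, nominal period, and sampling interval. The Python sketch below states those relations as an assumption inferred from the printed numbers (12 lags, a sampling time interval of 5, and a sampling interval equal to one third of the period); it is not taken from the program's source, and the function names are hypothetical.

```python
import math

def nominal_period(lag, number_of_lags, time_interval):
    """Period of the component at a given lag, assuming 2 * m * dt / lag."""
    if lag == 0:
        return math.inf                     # lag 0 is the non-recurring part
    return 2 * number_of_lags * time_interval / lag

def sampling_interval(period, samples_per_period=3):
    """Longest sampling interval that still gives three samples per period."""
    return period / samples_per_period

# Values from the example run: 12 lags, one observation every 5 hours.
periods = [nominal_period(k, 12, 5) for k in range(6)]
intervals = [sampling_interval(p) for p in periods]
for k, (p, s) in enumerate(zip(periods, intervals)):
    print(k, p, round(s, 5))
```

The shortest period with a reported variance contribution (24 time units at lag 5) sets the sampling interval of 8 time units quoted by the program.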

SELECT NUMBER OF OPTION:
      1. LIST INPUT DATA
      2. MODIFY OR ADD INPUT DATA
      3. DELETE SOME OF THE DATA
      4. PERFORM CALCULATIONS
      5. STORE DATA
      6. GO TO MAIN MENU
      7. DO ANOTHER CALCULATION
 OPTION ? 6
                                  24

-------
        ******************************************
        *     PROGRAM MENU ........ PAGE 1       *
        ******************************************

  1. LINEAR REGRESSION
  2. CALCULATION OF NORMAL  DEVIATE Z
  3. CALCULATION OF THE PERCENTAGE AREA OF  NORMAL DISTRIBU-
     TION  (FROM MINUS INFINITY TO NORMAL DEVIATE Z)
  4. CALCULATION OF STUDENT T
  5. CALCULATION OF THE PERCENTAGE AREA OF  STUDENT T  DIST-
     RIBUTION  (FROM MINUS  INFINITY TO  STUDENT  T)
  6. CALCULATION OF CHI SQUARE
  7. CALCULATION OF SAMPLE  MEAN, STANDARD DEVIATION, AND  CON-
     FIDENCE INTERVALS FOR THE POPULATION  MEAN AND VARIANCE
  8. CALCULATION OF SAMPLE  NUMBER BASED ON  THE  ACCURACY  OF
     THE VARIANCE
  9. CALCULATION OF SAMPLE  NUMBER BASED ON  THE  ACCURACY  OF
     THE SAMPLE MEAN
10. CALCULATION OF THE PROBABILITY OF  EXCEEDING A STANDARD
11. HYPOTHESIS TESTING
12. POWER SPECTRUM ANALYSIS
13. PROCEED TO NEXT PAGE
14. QUIT
TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ?  13

       *******************************************
       *      PROGRAM MENU ........ PAGE 2       *
       *******************************************
15. COMPARING TWO MEANS
16. CALCULATION OF THE PERCENTAGE AREA % IN F-DISTRIBUTION
17. CALCULATION OF THE F VALUE IN F-DISTRIBUTION
18. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN
    VARIABILITIES OF TWO SAMPLES
19. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN THE
    POPULATION VARIANCE AND THE SAMPLE VARIANCE
20. RETURN TO PREVIOUS PAGE
21. QUIT

TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 15
                            25

-------
               ***************************
               * 15. COMPARING TWO MEANS *
               ***************************

THIS PROGRAM COMPARES TWO MEANS IN ORDER TO DETERMINE WHETHER BOTH
MEANS ORIGINATE FROM THE SAME POPULATION.  FOR EXAMPLE, TWO DIFFERENT
PROCESSES ARE COMPARED TO DETERMINE WHETHER ANY STATISTICAL DIFFERENCE
EXISTS.  THE COMPARISON IS A TWO-TAILED TEST. THE NULL HYPOTHESIS TO BE
TESTED IS Ho: MEANS ARE EQUAL, AGAINST Ha: MEANS ARE NOT EQUAL.
IF YOU WANT TO DETERMINE WHETHER PROCESS 1 IS BETTER THAN PROCESS 2,
THEN THE COMPARISON IS ONE-TAILED. THE NULL HYPOTHESIS TO BE TESTED
IS Ho: MEANS ARE EQUAL, AGAINST THE ALTERNATIVE HYPOTHESIS Ha: MEAN 1
IS GREATER THAN MEAN 2, OR VICE VERSA.
TWO GROUPS ARE CONSIDERED HERE.  IN GROUP 2, DATA SETS DO NOT HAVE
TO BE ENTERED.  IN GROUP 1, DATA SETS MAY BE ENTERED FROM THE
KEYBOARD OR FROM A FILE ON THE DISK.  IF DATA SETS ARE NOT AVAILABLE,
THE INFORMATION REQUESTED MAY BE ENTERED FROM THE KEYBOARD.
BEFORE THE TWO MEANS ARE COMPARED, THE TWO SAMPLE STANDARD DEVIATIONS
MUST BE COMPARED BY USING THE F-TEST TO DETERMINE WHETHER THEY ARE
SIGNIFICANTLY DIFFERENT.  THE EQUATION USED TO POOL THE SAMPLE
STANDARD DEVIATIONS DEPENDS ON THE RESULT. THEREFORE, IF THEY HAVE NOT
BEEN COMPARED, GO BACK TO THE PROGRAM MENU, SELECT THE TEST FOR SIGNIFICANT
DIFFERENCE BETWEEN VARIABILITIES OF TWO SAMPLES, AND THEN USE THIS PROGRAM.
 GO BACK TO PROGRAM MENU (Y/N)? N

GROUP 1                                 GROUP 2
1. TWO SAMPLE MEANS                     1. TWO SAMPLE MEANS
2. NUMBER OF SAMPLES FROM BOTH          2. NUMBER OF SAMPLES FROM BOTH
   SETS OF SAMPLES                         SETS OF SAMPLES
3. SAMPLE STANDARD DEVIATIONS FROM      3. STANDARD DEVIATION
   BOTH SETS OF SAMPLES
4. CONFIDENCE LEVEL % REQUIRED          4. CONFIDENCE LEVEL % REQUIRED
   FOR THE COMPARISON                      FOR THE COMPARISON

   ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
     IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.


     INPUT GROUP NUMBER (1-2)             =? 1

     ARE THE TWO STANDARD DEVIATIONS SIGNIFICANTLY DIFFERENT ? N
     WILL DATA SET BE PROVIDED ?  PLEASE ENTER (Y/N)? Y

     DEFINE Y(VARIABLE SUCH AS TOC, ETC)=? TOC

     IS YOUR DATA STORED IN A FILE (Y/N) ? Y
   INPUT FILENAME(NO MORE THAN 8 CHARACTERS)
   DATA FROM DISK A, TYPE A:  DATA FROM HARD DISK, TYPE C: FIRST
   THEN TYPE FILENAME.  EXAMPLE FILENAMES  LIN1 AND LIN2 ARE
   AVAILABLE ON DISK A FOR YOUR USE.   IF THE PROGRAMS HAVE BEEN
   LOADED INTO THE HARD DISK, THEN THE FILENAMES ARE ON THE HARD DISK.
                                  26

-------
         DO YOU WISH TO LIST THE FILENAME BEFORE YOU PROCEED(Y/N)  ?  N

       TYPE FILENAME FOR SET 1 ? A:LIN1
       TYPE FILENAME FOR SET 2 ? A:LIN2

SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. START TO CALCULATE
OPTION ? 1
LISTING OF DATA
FOR SET 1
DATA POINTS      Y
 1             50
 2             30
 3             40
 4             30
 5             35
FOR SET 2
DATA POINTS      Y
 1             40
 2             40
 3             35
 4             45
 5             50
 6             35
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. START TO CALCULATE
OPTION ? 5

       IF THE NULL HYPOTHESIS IS Ho: MEANS ARE EQUAL, AGAINST THE ALTERNATIVE
       HYPOTHESIS Ha: MEANS ARE NOT EQUAL, THEN IT IS A TWO-TAILED TEST.
       ENTER Y TO THE FOLLOWING QUESTION.
       IF YOU ARE GOING TO DETERMINE WHETHER ONE MEAN IS SIGNIFICANTLY GREATER
       THAN THE OTHER, OR VICE VERSA, THEN IT IS A ONE-TAILED TEST. ENTER N TO
       THE FOLLOWING QUESTION.
       TO DETERMINE ANY SIGNIFICANT DIFFERENCE BETWEEN THESE TWO MEANS  ? Y
       THE COMPARISON IS TWO-SIDED.
         INPUT CONFIDENCE LEVEL FOR THE COMPARISON      =? 95
                                     27

-------
 ************************************************************
 THE SAMPLES ARE TOC
 MEAN FOR SAMPLE 1                            = 37
 MEAN FOR SAMPLE 2                            = 40.83333
 NUMBER OF SAMPLES FOR SET 1                  = 5
 NUMBER OF SAMPLES FOR SET 2                  = 6
 ESTIMATED STANDARD DEVIATION FOR SAMPLE 1    = 8.3666
 ESTIMATED STANDARD DEVIATION FOR SAMPLE 2    = 5.845227
 POOLED STANDARD DEVIATION                    = 7.077612
 THE STUDENT T FOR 95  % CONFIDENCE LEVEL     = 2.262207
 THE CALCULATED STUDENT T                     =-.8944457
 A 95 % INTERVAL STATEMENT FOR THE DIFFERENCE IS
       5.861824  >= DIFFERENCE >=-13.52849
 THE TWO MEANS HAVE THE SAME POPULATION
 THIS IS TWO-SIDED TEST.
 ************************************************************
     DO YOU WANT TO DO ANOTHER CALCULATION  
-------
DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
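
The pooled standard deviation, calculated t, and interval for the difference reported by option 15 agree with the standard equal-variance formulas. The Python sketch below is an illustration under that assumption, with hypothetical function names; the tabulated Student t (9 degrees of freedom) is copied from the printout rather than computed.

```python
import math

def pooled_std(n1, s1, n2, s2):
    """Pooled standard deviation of two samples with similar variability."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def compare_means(m1, s1, n1, m2, s2, n2, t_table):
    """Return (pooled s, calculated t, lower limit, upper limit)."""
    sp = pooled_std(n1, s1, n2, s2)
    standard_error = sp * math.sqrt(1 / n1 + 1 / n2)
    difference = m1 - m2
    t_calculated = difference / standard_error
    half_width = t_table * standard_error
    return sp, t_calculated, difference - half_width, difference + half_width

# Summary statistics of LIN1 and LIN2 as printed by the program.
sp, t_calc, lower, upper = compare_means(
    37, 8.3666, 5, 40.83333, 5.845227, 6, t_table=2.262207)
print(round(sp, 6), round(t_calc, 6), round(lower, 4), round(upper, 4))
```

The interval contains zero and the absolute calculated t is below the tabulated value, so the two means are not judged to be different.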
        *******************************************
        *      PROGRAM MENU ........ PAGE 2       *
        *******************************************
 15. COMPARING TWO MEANS
 16. CALCULATION OF THE PERCENTAGE AREA % IN F-DISTRIBUTION
 17. CALCULATION OF THE F VALUE IN F-DISTRIBUTION
 18. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN
     VARIABILITIES OF TWO SAMPLES
 19. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN THE
     POPULATION VARIANCE AND THE SAMPLE VARIANCE
 20. RETURN TO PREVIOUS PAGE
 21. QUIT

 TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 18

      ***************************************************
      *  18. TESTS FOR SIGNIFICANT DIFFERENCE BETWEEN   *
      *           VARIABILITIES OF TWO SAMPLES          *
      ***************************************************

 THE OBJECTIVE IS TO TEST THE DIFFERENCE IN VARIABILITY BETWEEN
 TWO SAMPLES WHEN THE STANDARD DEVIATION IS NOT KNOWN. FOR EXAMPLE,
 NEW EQUIPMENT IS USED TO MEASURE A COMPOUND AND IT IS EXPECTED
 THAT THE MEASUREMENT UNIFORMITY WOULD IMPROVE. THE QUESTION TO
 ASK NOW IS THAT WHETHER THE IMPROVEMENT 
-------
         IF A REAL IMPROVEMENT EXISTS, IT WOULD BE NECESSARY FOR THE CALCULATED
         VALUE TO EXCEED 2.03 TO REPORT THAT AN IMPROVEMENT IN VARIABILITY
         EXISTS WITH A 95 % CHANCE OF BEING CORRECT. THIS IS A ONE-TAILED TEST.
         IN THIS CASE THERE IS NO IMPROVEMENT IN VARIABILITY.

         TO USE THIS PROGRAM, THE USER MUST PROVIDE THE ESTIMATED STANDARD
         DEVIATIONS AND NUMBER OF SAMPLES BEFORE AND AFTER.  THE CONFIDENCE
         LEVEL AND ONE- OR TWO-TAILED ARE ALSO NEEDED. ARRANGE THE VARIANCE
         OF SAMPLE 1 TO BE GREATER THAN THAT OF SAMPLE 2  IN TWO-TAILED TEST.
         INPUT ONE- OR TWO-TAILED (1 OR 2) ? 1
         WILL DATA SET BE ENTERED BY THE USER (Y/N)? Y
       ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
       IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.

       IN GENERAL, DATA MUST BE PROVIDED BY THE USER EITHER FROM THE KEYBOARD
       OR FROM A FILE ON THE DISK.
       DEFINE Y(VARIABLE SUCH AS TOC, ETC)=? TOC

       IS YOUR DATA STORED IN A FILE (Y/N) ? Y
       INPUT FILENAME(NO MORE THAN 8 CHARACTERS)
       DATA FROM DISK A, TYPE A:  DATA FROM HARD DISK, TYPE C: FIRST
       THEN TYPE FILENAME.  EXAMPLE FILENAMES  LIN1 AND LIN2 ARE
       AVAILABLE ON DISK A FOR YOUR USE.  IF YOU HAVE LOADED THE PROGRAMS
       INTO THE HARD DISK, THEN THE FILENAMES ARE ON THE HARD DISK.
       DO YOU WISH TO LIST THE FILENAME BEFORE YOU PROCEED (Y/N) ? N

        TYPE FILENAME FOR SET 1 ? A:LIN1
        TYPE FILENAME FOR SET 2 ? A:LIN2
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. PROCEED TO THE SELECTED PROGRAM
OPTION ? 1
LISTING OF DATA
FOR SET 1
DATA POINTS      Y
 1             50
 2             30
 3             40
 4             30
 5             35
FOR SET 2
DATA POINTS      Y
 1             40
 2             40
 3             35
 4             45
 5             50
 6             35
                                        31

-------
SELECT NUMBER OF OPTION:
          1. LIST INPUT DATA
          2. MODIFY OR ADD INPUT DATA
          3. DELETE SOME OF INPUT DATA
          4. STORE DATA
          5. PROCEED TO THE SELECTED PROGRAM
OPTION ? 5
         INPUT CONFIDENCE LEVEL REQUIRED=      ? 95

                ********************************************************
                   THIS IS ONE-TAILED TEST.
                   CONFIDENCE LEVEL                  =       95  %
                   NUMBER OF SAMPLES BEFORE          =       5
                   NUMBER OF SAMPLES AFTER           =       6
                   SAMPLE MEAN BEFORE                =       37
                   SAMPLE MEAN AFTER                 =       40.83333
                   SAMPLE STANDARD DEVIATION BEFORE  =       8.3666
                   SAMPLE STANDARD DEVIATION AFTER   =       5.845227
                   F VALUE AT  95  % CONFIDENCE =            5.192483
                   CALCULATED F RATIO                =       2.04878
                   AN IMPROVEMENT IN VARIABILITY DOES NOT EXIST
                ********************************************************


       DO YOU WANT TO DO ANOTHER CALCULATION (Y/N)? N
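
The calculated F ratio in this run equals the ratio of the two sample variances (before over after). The Python sketch below illustrates the one-tailed comparison on that assumption; the function names are hypothetical and the tabulated F value is copied from the printout.

```python
def f_ratio(s_before, s_after):
    """Ratio of the sample variances, before divided by after."""
    return (s_before / s_after) ** 2

def variability_improved(f_calculated, f_table):
    """One-tailed decision: True when the reduction in variability is significant."""
    return f_calculated > f_table

# Standard deviations of LIN1 (before) and LIN2 (after) from the printout.
f_calc = f_ratio(8.3666, 5.845227)
improved = variability_improved(f_calc, f_table=5.192483)
print(round(f_calc, 5), improved)
```

Since the calculated ratio is well below the tabulated F, the sketch reaches the same conclusion as the program: an improvement in variability is not demonstrated.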
                *******************************************
                *      PROGRAM MENU ........ PAGE 2       *
                *******************************************
         15.  COMPARING  TWO  MEANS
         16.  CALCULATION  OF THE PERCENTAGE AREA % IN F-DISTRIBUTION
         17.  CALCULATION  OF THE F  VALUE IN F-DISTRIBUTION
         18.  TEST  FOR SIGNIFICANT  DIFFERENCE  BETWEEN
             VARIABILITIES  OF  TWO  SAMPLES
         19.  TEST  FOR SIGNIFICANT  DIFFERENCE  BETWEEN THE
             POPULATION VARIANCE AND  THE SAMPLE VARIANCE
         20.  RETURN  TO  PREVIOUS PAGE
         21.  QUIT

         TYPE THE  DESIRED OPTION NUMBER AND PRESS ENTER ? 19
                                     32

-------
**************************************************************
*  19. TEST FOR SIGNIFICANT DIFFERENCE FOR THE POPULATION    *
*         VARIABILITY AND THE SAMPLE VARIABILITY             *
**************************************************************
THE OBJECTIVE IS TO TEST DIFFERENCES BETWEEN THE SAMPLE VARIABILITY
AND THE POPULATION VARIABILITY.  THE TEST WILL INDICATE
WHETHER THE SAMPLE VARIABILITY IS AN IMPROVEMENT OVER THE
POPULATION VARIABILITY.  FOR EXAMPLE, DOES A LOWER VALUE OF SAMPLE
VARIABILITY MEAN THAT THE NEW MEASUREMENTS ARE SIGNIFICANTLY
MORE UNIFORM?
TO TEST FOR A DIFFERENCE IN VARIABILITY, THE CHI SQUARE TEST
USING THE FOLLOWING FORMULA IS UTILIZED.

CHI SQUARE= DEGREES OF FREEDOM X SAMPLE VARIANCE / POPULATION
            VARIANCE
PRESS ENTER TO CONTINUE?
IF THE CALCULATED CHI SQUARE IS LARGER THAN THE CHI SQUARE
AT AN UPPER 5 % CHI SQUARE DISTRIBUTION AND (N-1) DEGREES OF
FREEDOM, THEN THE SAMPLE VARIABILITY IS SIGNIFICANTLY GREATER
THAN THE POPULATION VARIABILITY.  IF THE CALCULATED CHI SQUARE
IS IN BETWEEN THE CHI SQUARE VALUES AT UPPER 5 AND 95 % CHI SQUARE
DISTRIBUTION, THEN THE SAMPLE VARIABILITY IS NOT SIGNIFICANTLY
LARGER OR SMALLER THAN THE POPULATION VARIABILITY. IF THE CALCULATED
CHI SQUARE VALUE IS SMALLER THAN THE CHI SQUARE VALUE AT AN UPPER
95 % LEVEL, THEN THE SAMPLE VARIABILITY IS SIGNIFICANTLY SMALLER
THAN THE POPULATION VARIABILITY.
THE UPPER X % HERE IS THE TOTAL AREA INTEGRATED FROM INFINITY TO
THE DESIRED VALUE OF CHI-SQUARE.

ANSWER EACH QUESTION AFTER A QUESTION MARK (?) AND THEN PRESS ENTER.
IF YOU WANT TO START OVER AGAIN, PRESS FUNCTION KEY 1 AND RETURN.

INPUT SAMPLE STANDARD DEVIATION=? 20
INPUT POPULATION STANDARD DEVIATION=? 30
INPUT NUMBER OF SAMPLES  =? 20
*******************************************************
SAMPLE STANDARD DEVIATION =          20
POPULATION STANDARD DEVIATION =      30
NUMBER OF SAMPLES =                 20
CHI-SQUARE ( 19 , .95 ) = 10.11973
CALCULATED CHI SQUARE        = 8.444444
THE SAMPLE VARIANCE IS SIGNIFICANTLY SMALLER THAN THE POPULATION
VARIANCE.
*******************************************************
 DO YOU WISH TO DO ANOTHER CALCULATION  (Y/N)? N
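
The chi square formula stated above can be checked against the example run. The Python sketch below is an illustration only: the function names are hypothetical, and the critical value for 19 degrees of freedom is copied from the printout rather than computed.

```python
def chi_square_statistic(n, sample_sd, population_sd):
    """Degrees of freedom times sample variance over population variance."""
    return (n - 1) * sample_sd**2 / population_sd**2

def significantly_smaller(chi2_calculated, chi2_upper_95):
    """True when the calculated value falls below the upper 95 % point."""
    return chi2_calculated < chi2_upper_95

# Inputs from the example run: s = 20, sigma = 30, n = 20.
chi2_calc = chi_square_statistic(20, 20, 30)
smaller = significantly_smaller(chi2_calc, chi2_upper_95=10.11973)
print(round(chi2_calc, 6), smaller)
```

The calculated value, 19 x 400 / 900, is below the tabulated 10.11973, which matches the program's statement that the sample variance is significantly smaller than the population variance.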
                             33

-------
       *******************************************
       *      PROGRAM MENU ........ PAGE 2       *
       *******************************************
15. COMPARING TWO MEANS
16. CALCULATION OF THE PERCENTAGE AREA % IN F-DISTRIBUTION
17. CALCULATION OF THE F VALUE IN F-DISTRIBUTION
18. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN
    VARIABILITIES OF TWO SAMPLES
19. TEST FOR SIGNIFICANT DIFFERENCE BETWEEN THE
    POPULATION VARIANCE AND THE SAMPLE VARIANCE
20. RETURN TO PREVIOUS PAGE
21. QUIT

TYPE THE DESIRED OPTION NUMBER AND PRESS ENTER ? 21
NOTE:  IF YOU WANT TO SEE THIS PROGRAM MENU AGAIN,
           JUST TYPE EMSLSTAT AND THEN PRESS ENTER.
                             34

-------
                                  APPENDIX A
                        DEFINITIONS  OF  BASIC STATISTICS

NORMAL DISTRIBUTION
    The probabilistic model for the frequency distribution of a continuous
random variable is called the probability distribution.   While these
distributions may assume a variety of shapes,  a very large number of random
variables observed in nature possess a frequency distribution which is
bell-shaped and can be approximated by using a normal  curve.   The density
function of a normal distribution of mean μ and variance σ² is given by
the equation:

         f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(y-\mu)^2}{2\sigma^2} \right),   -\infty < y < \infty      Eq.(A-1)
    This equation for the normal  distribution is constructed  such  that  the
area under the curve will represent probability.  Hence,  the  total  area is
equal to one.
    The normal distribution is completely determined by two parameters: μ,
the population mean of the distribution, and σ, the population standard
deviation.  The parameter μ is the center of the distribution while σ is a
measure of the dispersion of the data from the population mean.  Thus, a
change in μ merely slides the curve right or left without changing its
profile, while a change in σ widens or narrows the curve without changing
the location of its center.

-------
    In practice, variables seldom range over all values from "minus
infinity" to "plus infinity".  Nevertheless, the relative frequency
distribution for many types of measurements will generate a bell-shaped
figure which may be approximated by the function shown in Figure A-1.  One
property of the normal distribution as illustrated in the figure is that
randomly selected observations will have approximately a 68.3% probability
of falling within the interval μ ± σ, 95.5% within the interval μ ± 2σ, and
99.7% within the interval μ ± 3σ.
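
    The three interval probabilities quoted above can be checked numerically.
The following Python sketch is an illustration only and is not part of the
programs on the diskette; the function name prob_within is hypothetical.  It
uses the identity that the area of a normal curve within k standard
deviations of the mean equals erf(k/√2).

```python
from math import erf, sqrt

def prob_within(k):
    """Probability that a normal observation falls within mu +/- k*sigma."""
    return erf(k / sqrt(2.0))

for k in (1, 2, 3):
    print(f"within mu +/- {k} sigma: {100 * prob_within(k):.2f}%")
```

    The two-sigma value comes out near 95.45%, which the text rounds to 95.5%.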
                    Figure A-1. Normal Distribution

-------
SAMPLE MEAN, ȳ
    A specified number of items (a sample) randomly drawn from a large body
of data (a population) is presumed to represent the population.  One of the
most common and useful measures of the center of the distribution for the
sample is the arithmetic mean of a set of measurements.  This is often
referred to as the sample mean.  The arithmetic mean of a set of n
measurements y_1, y_2, y_3, ..., y_n is equal to the sum of the
measurements divided by n.  It is used to estimate the population mean μ and
can be calculated from the following formula:

         \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i                     Eq.(A-2)

SAMPLE VARIANCE, s²
    The population variance σ² is an indicator of the spread of a
probability distribution about its mean.  It is estimated by the sample
variance s².  The sample variance of a set of n measurements y_1, y_2,
y_3, ..., y_n is equal to the sum of the squares of the deviations of the
measurements from the mean divided by the degrees of freedom (n-1).  The
sample variance s² can be calculated from the following formula:

         s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}             Eq.(A-3)

-------
SAMPLE STANDARD DEVIATION, s
    The sample standard deviation s is the positive square root of the
sample variance s².  It is the estimate of the standard deviation σ, which
is defined as the distance along the abscissa from the mean to the point of
inflection on the normal curve.  The standard deviation is, as was the
variance, a measure of dispersion and has the same unit of measure as the
mean μ.  The sample standard deviation can be calculated from the following
formula:

         s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}}       Eq.(A-4)

    Both the variance and the standard deviation play an important role in
statistics.  Since ȳ approximates μ and s approximates σ, those percentages
as described in the normal distribution will hold approximately for ȳ ± s,
ȳ ± 2s, and ȳ ± 3s.

SAMPLE COEFFICIENT OF VARIATION, CV
    Another measure of dispersion from the mean is the coefficient of
variation CV.  The CV provides a measure of the dispersion relative to the
location of the data set, so that the spread of the data in sets with
different means can be compared.  It can be calculated from the following
formula:

         CV = \frac{s}{\bar{y}}                                      Eq.(A-5)
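
    The sample statistics defined in this appendix can be computed in a few
lines.  The sketch below is an illustration only; the function name describe
and the data values are hypothetical.  It relies on the Python standard
library, whose variance and stdev functions divide by n-1 as in the
definitions above.

```python
import statistics

def describe(data):
    """Return sample mean, variance (n-1 divisor), standard deviation and CV."""
    mean = statistics.mean(data)
    var = statistics.variance(data)   # sum of squared deviations / (n-1)
    sd = statistics.stdev(data)       # positive square root of the variance
    cv = sd / mean                    # dispersion relative to the mean
    return mean, var, sd, cv

# Hypothetical measurements, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, var, sd, cv = describe(sample)
print(mean, var, sd, cv)
```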

-------
CENTRAL LIMIT THEOREM
    The central limit theorem states that, if random samples of n
observations are repeatedly drawn from a population with a finite mean μ and
a standard deviation σ, then, when n is large, the frequency distribution of
the sample means will be approximately bell-shaped.  Thus, when n is large,
the sample mean, ȳ, will be approximately normally distributed with mean
equal to μ and standard deviation σ/√n.  The approximation will become more
accurate as n becomes larger.
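
    A small simulation makes the theorem concrete.  The sketch below is an
illustration under stated assumptions: the population is taken to be
uniform on (0, 1), which is far from bell-shaped, and the function name
sample_means, the sample size and the number of trials are all hypothetical
choices.  The standard deviation of the simulated sample means should come
out close to σ/√n.

```python
import random
import statistics

def sample_means(n, trials, seed=0):
    """Means of `trials` random samples of size n from a uniform(0, 1) population."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

n = 30
means = sample_means(n, trials=2000)
sigma = (1.0 / 12.0) ** 0.5            # standard deviation of uniform(0, 1)
print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", sigma / n ** 0.5)
```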

RANDOM SAMPLING
    A set of observations may be regarded as a random sample from the
population if each member of the sample is a random drawing from the whole
population.  Mathematically speaking, suppose that a sample of n
measurements is drawn from a population consisting of N total measurements.
The number of different combinations of n measurements which can be
selected from the population is:

         C_n^N = \frac{N!}{n!\,(N-n)!}

If the sampling is conducted in such a way that each of the C_n^N samples has
an equal probability of being selected, the sampling is said to be random and
the result is said to be a random sample.
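
    The count of possible samples grows quickly with N.  The sketch below is
an illustration only; the function name number_of_samples and the
ten-member population are hypothetical.  The standard library routine
random.sample draws without replacement so that every combination is
equally likely.

```python
import math
import random

def number_of_samples(N, n):
    """Number of distinct samples of n measurements from a population of N."""
    return math.comb(N, n)

population = list(range(1, 11))          # hypothetical population, N = 10
print(number_of_samples(len(population), 3))
# Each 3-member combination has the same chance of being drawn.
print(random.sample(population, 3))
```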

-------
STUDENT t DISTRIBUTION
    A random variable y having a student t distribution with β = (n-1) degrees
of freedom has a probability density function (pdf) of the form:

         f(y) = \frac{\Gamma\left(\frac{\beta+1}{2}\right)}{\sqrt{\pi\beta}\,\Gamma\left(\frac{\beta}{2}\right)} \left( 1 + \frac{y^2}{\beta} \right)^{-\frac{\beta+1}{2}}

    It is noteworthy that a student t distribution with infinite degrees of
freedom is a standard normal distribution.  The student t distribution
instead of the normal distribution is utilized when the variance of the
normal distribution must be estimated from a set of observations.  The
distribution of

         t_\beta = \frac{\bar{y} - \mu}{s/\sqrt{n}}                    Eq.(A-8)

is a student t distribution with β = n-1 degrees of freedom.  A student t
distribution with β = 4 degrees of freedom is shown in Figure A-2.

CHI SQUARE DISTRIBUTION, χ²
    If s² is the variance of a random sample of size n from a normal
distribution, the quantity βs²/σ² has a chi square distribution as shown
in Figure A-3.  This quantity is represented by χ²_β.  The chi square
distribution is characterized by one parameter β, the degrees of freedom
(n-1).  It has a mean value of β and a variance of 2β.
    A test based on the chi square distribution determines whether there is any
significant difference between the sample variability and the population
variability.  For example, does a lower value of sample variance mean that
the new measurements are significantly more uniform?
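
    The test statistic for this comparison is the quantity βs²/σ² defined
above.  The sketch below is an illustration only; the function name
chi_square_statistic is hypothetical.  The inputs and the critical value
are copied from the sample run of the program reproduced earlier in this
report (sample standard deviation 20, population standard deviation 30,
20 samples), and the critical value is not recomputed here.

```python
def chi_square_statistic(s, sigma, n):
    """beta * s^2 / sigma^2 with beta = n - 1 degrees of freedom."""
    return (n - 1) * s ** 2 / sigma ** 2

stat = chi_square_statistic(s=20.0, sigma=30.0, n=20)
critical = 10.11973      # CHI-SQUARE ( 19 , .95 ) as printed by the program
print(round(stat, 6), stat < critical)
```

    A calculated value below the critical value corresponds to the program's
conclusion that the sample variance is significantly smaller than the
population variance.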

-------
Figure A-2.  Distribution of student t with β = 4 degrees of freedom.
Figure A-3.  Chi square distribution.

-------
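    The mathematical form for this distribution is:

         f(\chi^2_\beta) = \frac{1}{2^{\beta/2}\,\Gamma(\beta/2)} \left(\chi^2_\beta\right)^{\beta/2 - 1} \exp\left(-\chi^2_\beta / 2\right)    for \chi^2_\beta > 0      Eq.(A-9)

                         = 0                                         for \chi^2_\beta < 0
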
F DISTRIBUTION
    Suppose that a sample of n₁ observations is randomly drawn from a
normal distribution having variance σ₁², a second sample of n₂ observations
is also randomly drawn from a second normal distribution having variance σ₂²,
and estimates s₁² and s₂² of the two population variances are calculated,
having β₁ and β₂ degrees of freedom.  The ratio (χ²_{β₁}/β₁)/(χ²_{β₂}/β₂) has
an F distribution having β₁ and β₂ degrees of freedom.  The F distribution
may be used to test whether the variances are equal (σ₁² = σ₂²) by comparing
the ratio of the sample variances (s₁²/s₂²) with the F distribution having β₁
and β₂ degrees of freedom.

    The mathematical form for this distribution is:

         f(F) = \frac{\Gamma\left(\frac{\beta_1+\beta_2}{2}\right)}{\Gamma\left(\frac{\beta_1}{2}\right)\Gamma\left(\frac{\beta_2}{2}\right)} \, \beta_1^{\beta_1/2} \beta_2^{\beta_2/2} \, \frac{F^{\beta_1/2 - 1}}{(\beta_2 + \beta_1 F)^{(\beta_1+\beta_2)/2}}    for F > 0     Eq.(A-10)

              = 0                                                   for F < 0

-------
                                  APPENDIX B

           DESCRIPTIONS OF STATISTICAL SAMPLING PROGRAMS ON THE DISK

B.1  CURVE FITTING WITH A LINEAR REGRESSION
    A linear regression is a curve fitting technique.  It is based on
fitting a straight line through a series of observed data points so that the
sum of squares of the deviations of these points from the line is
minimized.  A straight line is expressed by the mathematical form:
                                  y = a + bx                        Eq.(B-1)
    where:

         a = \bar{y} - b\bar{x}                                      Eq.(B-1a)

         b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}      Eq.(B-1b)

         y_i = dependent variable or observed data point
         x_i = independent variable

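    After fitting a line through a set of data, it is necessary to determine
if the fit is good.  One indicator is the coefficient of correlation (cc),
which can have a value between -1 and +1.  It may be expressed as:

         cc = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n \sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n \sum y_i^2 - \left(\sum y_i\right)^2\right]}}      Eq.(B-2)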

-------
    A positive coefficient of correlation indicates that increases in the
independent variable are accompanied by increases in the dependent variable;
the variables are directly related.  If the coefficient is negative, then
they are inversely related.  A coefficient of correlation of 0 means that
the variables are not linearly related.
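
    The least squares coefficients and the coefficient of correlation
described in this section can be computed directly from the sums that
appear in their definitions.  The sketch below is an illustration only and
is not the program on the diskette; the function name linear_fit and the
data points are hypothetical.

```python
from math import sqrt

def linear_fit(xs, ys):
    """Return intercept a, slope b and coefficient of correlation cc."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n                      # a = ybar - b * xbar
    cc = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return a, b, cc

# Hypothetical points lying exactly on y = 1 + 2x.
a, b, cc = linear_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(a, b, cc)
```

    Because the illustrative points lie exactly on a rising straight line,
the coefficient of correlation is +1.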

B.2  NORMAL DEVIATE Z
    Figure A-1 in Appendix A shows the percentage of elements of the
population contained in various intervals of a normal distribution.  About
68.3% of the area under the normal curve falls within μ ± 1σ.  About 95.5% of
the area under the curve falls within μ ± 2σ.  And about 99.7% of the area
under the curve falls within μ ± 3σ.  If we introduce Z, which is defined as
the distance from the population mean in units of the standard deviation, we
can produce a standard normal profile as shown in Figure B-1.  The Z value
is calculated by the following formula:

         Z = \frac{y - \mu}{\sigma}                                  Eq.(B-3)

    By redefining Z as the distance from the population mean in units of the
standard deviation of the average, the same standard normal profile as shown
in Figure B-1 is still produced.  Now, however, it represents the normal
distribution for the sample average.  The Z is calculated by the following
formula:

         Z = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}}                   Eq.(B-4)

-------
    where:
           n is the total number of observations.
           ȳ is the sample average.
           σ/√n is the standard deviation of the average.

    The percentage area in Figure B-1 depends only upon the value of Z.  For
example, the percentage area between Z = ±1 is 68.26%.  To use this program
to obtain the Z value, the user must provide the confidence level or
the percentage area.  For example, the Z value for a 95% confidence level in
a two-sided test is 1.959961.
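
    The same lookup can be reproduced with the inverse of the standard
normal distribution.  The sketch below is an illustration only; the
function name z_two_sided is hypothetical, and it uses the Python standard
library rather than the routine on the diskette.

```python
from statistics import NormalDist

def z_two_sided(confidence):
    """Normal deviate Z leaving (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

print(z_two_sided(0.95))
```

    The result is close to 1.95996, in agreement with the value quoted above
to about five decimal places.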

B.3  PERCENTAGE AREA UNDER THE NORMAL CURVE
    This program is the opposite of the program in (B.2).  In this program
the user needs to provide the Z value in order to obtain the corresponding
percentage area.  For example, it can answer
the question, "What is the probability that a single observation y, drawn
from a normal distribution with population mean μ = 6 and variance σ² = 4,
lies between 6 and 9?"  By using equation (B-3), two values of Z, namely, 0
and 1.5, are obtained.  The corresponding areas for Z = 0 and 1.5 are 0.5 and
0.9332, respectively.  Therefore, the probability that a single observation
y, drawn at random from a normal population with μ = 6 and σ² = 4, will have a
value between 6 and 9 is (0.9332 - 0.5) = 0.4332 or 43.32%.  The user
must input a value of Z in order to calculate the area of the normal
distribution (integrated from minus infinity to the desired Z value).
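
    The worked example can be checked by differencing two cumulative areas.
With the tabulated areas 0.9332 for Z = 1.5 and 0.5 for Z = 0, the
difference is 0.4332.  The sketch below is an illustration only; the
function name prob_between is hypothetical.

```python
from statistics import NormalDist

def prob_between(lower, upper, mu, sigma):
    """Probability that a normal observation lies between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

p = prob_between(6.0, 9.0, mu=6.0, sigma=2.0)   # variance 4, so sigma = 2
print(f"{100 * p:.2f}%")
```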

-------





Figure B-1.  Standard normal distribution.

-------
B.4  STUDENT t
    If the variance σ² of a normal distribution is estimated from a set
of samples, the student t, instead of the normal deviate Z, should be used
when the confidence interval for the mean is to be calculated.  The normal
curve is replaced by a student t distribution which varies according to the
degrees of freedom.
    This program requires the user to provide the desired confidence level
and the degrees of freedom, β = n-1, in order to obtain the value of t in a
two-sided test.

B.5  PERCENTAGE AREA UNDER THE STUDENT t
    This program is the opposite of the program in (B.4).  This program
requires the input of the value of t in order to obtain the percentage area
of the student t distribution.  The area is integrated from minus infinity
to the provided t value.  For example, the percentage area for β = 15 and
t = 2.7 is 99.18%.
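
    The area under the student t curve can be obtained by numerical
integration of its density.  The sketch below is an illustration only and
makes no claim about the method used on the diskette; the function names
simpson and t_cdf are hypothetical.  It assumes the standard student t
density and substitutes y = √β tan θ so that the integral runs over a
finite interval.

```python
from math import atan, cos, exp, lgamma, pi, sqrt

def simpson(f, a, b, steps=400):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def t_cdf(t, dof):
    """Area under the student t density from minus infinity to t."""
    # With y = sqrt(dof) * tan(theta) the density becomes a constant
    # times cos(theta) ** (dof - 1) on a finite interval.
    theta = atan(t / sqrt(dof))
    const = exp(lgamma((dof + 1) / 2.0) - lgamma(dof / 2.0)) / sqrt(pi)
    return 0.5 + const * simpson(lambda x: cos(x) ** (dof - 1), 0.0, theta)

print(f"{100 * t_cdf(2.7, 15):.2f}%")
```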

-------
B.6  CHI SQUARE
    One of the applications for the chi square distribution, other than
those described in the previous section, is to determine the confidence
limits of the variance estimation for normally distributed data.

    To use this program to determine the value of chi square, the user must
provide the degrees of freedom and desired percentage area (integrated from
0 to the desired chi square value).  For example, the value of chi square
for β = 15 and percentage area = 95% is 24.99.
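
    The chi square value for a given area can be found by inverting the
cumulative area numerically.  The sketch below is an illustration only and
is not the routine on the diskette; the function names chi2_cdf and
chi2_value are hypothetical.  It assumes the standard series for the
incomplete gamma function and a simple bisection search, and it expects an
area strictly between 0 and 1.

```python
from math import exp, lgamma, log

def chi2_cdf(x, dof):
    """Area under the chi square density from 0 to x (incomplete gamma series)."""
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * exp(-z + a * log(z) - lgamma(a))

def chi2_value(area, dof):
    """Chi square value whose area from 0 equals `area`, found by bisection."""
    lo, hi = 0.0, dof + 10.0
    while chi2_cdf(hi, dof) < area:      # widen the bracket if needed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < area:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"{chi2_value(0.95, 15):.3f}")
```

    For 15 degrees of freedom and a 95% area the search converges near
24.996, which matches the value of 24.99 quoted above apart from the last
digit.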

B.7  SAMPLE MEAN, STANDARD DEVIATION, AND CONFIDENCE INTERVALS FOR THE MEAN
     AND VARIANCE
    If a sample is taken from a population, the sample average will seldom
be exactly the same as the population mean.  An estimate of an interval that
will bracket the population mean is then made.  If such interval estimates
were made a large number of times, and actually did contain the true mean in
95% of the cases, it might be said that we are operating at a 95% confidence
level (C.L.).  The interval estimates are called 95% confidence intervals
(C.I.).  The expressions for confidence intervals for the mean and the
variance are:
    Confidence Interval (C.I.) for the Population Mean μ if σ is unknown

         \bar{y} - t_{\beta;\alpha/2} \frac{s}{\sqrt{n}} \le \mu \le \bar{y} + t_{\beta;\alpha/2} \frac{s}{\sqrt{n}}


-------
    where:
         β = degrees of freedom, (n-1)
         α = significance level, %; C.L. = 100% - α%
         n = number of samples
         ȳ = sample mean
         t_{β;α/2} = student t at degrees of freedom β and
                   significance level α.
         s = sample standard deviation.

    Confidence Interval (C.I.) for the Population Mean μ if σ is known

         \bar{y} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{y} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}      Eq.(B-6)

    where:
         Z_{α/2} = normal deviate Z at significance level α.

    Confidence Interval for the Variance σ²

         \frac{\beta s^2}{\chi^2_{\beta;\alpha/2}} \le \sigma^2 \le \frac{\beta s^2}{\chi^2_{\beta;(1-\alpha/2)}}      Eq.(B-7)

    The user must provide sample data (individual observations), number of
samples, the desired confidence levels for the mean, and the standard
deviation in order to use this program.

-------
B.8  DETERMINATION OF THE NUMBER OF SAMPLES
    The number of samples necessary to reasonably characterize a water or
wastewater can be determined if background data on the concentration and
variance of the concentration of the parameter are available.   Two
techniques can be used to determine the required number of samples;  one is
based on the allowable confidence interval for the standard deviation, the
other on the accuracy of the mean.

    Determining the Number of Samples Based on the Accuracy of the Mean

    The relationship among the number of samples n, the coefficient of
variation s/ȳ, the accuracy of the sample mean as an estimate of μ, the
student t with degrees of freedom β, and confidence level (1-α) can be
expressed as:

         n = \left[ \frac{t_{\beta;\alpha/2}\,(s/\bar{y})}{(\mu - \bar{y})/\bar{y}} \right]^2      Eq.(B-8)

    This program requires the user to input the following information:
     .  Confidence level for the mean
     .  Coefficient of variation s/ȳ (CV)
     .  Error of the mean (μ - ȳ)/ȳ
    For example, given α = 5%, CV = 0.5, and (μ - ȳ)/ȳ = 0.25, the number of
samples needed would be 18.
    If the coefficient of variation is not already available, it can be
estimated by collecting three or four samples to determine the sample mean
and the standard deviation.
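
    The worked example can be reproduced by searching for the smallest n
that satisfies the relationship.  The sketch below is an illustration under
stated assumptions and is not the program on the diskette.  It assumes the
relationship takes the form n = (t · CV / E)², where E is the relative
error of the mean and t is the two-sided student t value at n-1 degrees of
freedom; this reading has been checked only against the example above.  The
function names simpson, t_quantile and samples_for_mean are hypothetical.

```python
from math import cos, exp, lgamma, pi, sqrt, tan

def simpson(f, a, b, steps=400):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def t_quantile(p, dof):
    """Student t value with area p (p > 0.5) to its left, by bisection."""
    const = exp(lgamma((dof + 1) / 2.0) - lgamma(dof / 2.0)) / sqrt(pi)

    def cdf(theta):
        # Area up to t = sqrt(dof) * tan(theta), using the tan substitution.
        return 0.5 + const * simpson(lambda x: cos(x) ** (dof - 1), 0.0, theta)

    lo, hi = 0.0, pi / 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return sqrt(dof) * tan(0.5 * (lo + hi))

def samples_for_mean(cv, rel_error, confidence=0.95, n_max=10000):
    """Smallest n with n >= (t * CV / relative error) ** 2 (assumed form)."""
    p = 1.0 - (1.0 - confidence) / 2.0
    for n in range(2, n_max + 1):
        t = t_quantile(p, n - 1)
        if n >= (t * cv / rel_error) ** 2:
            return n
    raise ValueError("no n up to n_max satisfies the requirement")

print(samples_for_mean(cv=0.5, rel_error=0.25))
```

    The search is needed because t itself depends on n through the degrees
of freedom, so the relationship cannot be solved for n in one step.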

-------
    Determining the Number of Samples Based on the Accuracy of the Sample Variance

    The relationship among the number of samples n, the chi square χ²_β, and
confidence level for the variance s² can be expressed as:

         A = s \left[ \sqrt{\frac{\beta}{\chi^2_{\beta;(1-\alpha/2)}}} - \sqrt{\frac{\beta}{\chi^2_{\beta;\alpha/2}}} \right]

    where A is the allowable width of the confidence interval for the standard
deviation with a confidence level (100% - α).

    To apply this method to determine the sample number required, the user
must provide the following information:
    .  Confidence level (100% - α)
    .  Relative error of the standard deviation A/s.
    For example, given A/s = 0.5 and (100% - α) = 95%, 36 samples would be
    required.

B.9  PROBABILITY OF EXCEEDING A STANDARD
    The probability of exceeding a standard is one of the statistical methods
used to determine the percentage violation of a parameter being monitored.  The
user must provide the following information in order to use this program:
    .  Population mean μ
    .  Standard deviation σ
    .  The standard that should not be exceeded, Y.

-------
    The probability P of an effluent exceeding a standard can be determined by:

    .  calculating the Z value using the following formula:

         Z = \frac{Y - \mu}{\sigma}                                 Eq.(B-10)

    .  determining the area from Z to ∞ from the standard normal
       distribution.

    For example, given Y = 100, μ = 75, σ = 18, the probability (P) would be
8.23%.
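
    The two steps above amount to one subtraction from the cumulative
normal area.  The sketch below is an illustration only; the function name
prob_exceeding is hypothetical, and it uses the Python standard library
rather than the routine on the diskette.

```python
from statistics import NormalDist

def prob_exceeding(standard, mu, sigma):
    """Area of the normal curve above the standard Y."""
    z = (standard - mu) / sigma
    return 1.0 - NormalDist().cdf(z)

print(f"{100 * prob_exceeding(100.0, 75.0, 18.0):.2f}%")
```

    The result is about 8.24%, close to the 8.23% reported above; the small
difference presumably reflects the approximation used by the program.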

B.10  HYPOTHESIS TESTING

    Hypothesis testing determines whether a sample comes from a particular
distribution with a specific parameter.  The information required for this
program is as follows:
Group 1                                Group 2
    1.  population mean, μ             1.   population mean, μ
    2.  sample mean, ȳ                 2.   one sample value, y
    3.  number of samples, n           3.   standard deviation, σ
    4.  standard deviation, σ          4.   confidence level for the mean
    5.  confidence level for the mean

Group 3
    1.  population mean, μ
    2.  sample mean, ȳ
    3.  number of samples, n
    4.  sample standard deviation, s
    5.  confidence level for the mean

For example, given μ = 100, ȳ = 120, n = 10, σ = 50, confidence level = 95%,
determine whether the sample comes from a particular distribution with a
population mean of 100.

    Solution:
            Substituting the given conditions into equation (B-4),

         Z = \frac{120 - 100}{50/\sqrt{10}} = 1.265

-------
which is less than the normal deviate Z = 1.96 at a 95% confidence level.  We
therefore conclude that the sample mean is not significantly different from
the population mean.
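
    The arithmetic of this example is short enough to verify directly.  The
sketch below is an illustration only; the function name z_for_sample_mean
is hypothetical, and the critical value 1.96 is the one quoted in the text.

```python
from math import sqrt

def z_for_sample_mean(ybar, mu, sigma, n):
    """Distance of the sample mean from mu in units of sigma / sqrt(n)."""
    return (ybar - mu) / (sigma / sqrt(n))

z = z_for_sample_mean(ybar=120.0, mu=100.0, sigma=50.0, n=10)
z_critical = 1.96        # two-sided 95% normal deviate quoted in the text
print(round(z, 3), abs(z) < z_critical)
```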

B.11  POWER SPECTRUM ANALYSIS
    The use of statistics discussed so far depends on the assumption that the
data record remains in equilibrium about a constant mean.  For a long-term
monitoring program the data may contain a harmonic component and a trend, so
that the variance is not a random dispersion about a constant mean.  If trend
and harmonics are not identified or removed, distortions can occur both in data
processing and in conclusions on the probability distribution of the measured
parameter.  Two techniques used to evaluate and identify these components are
trend removal and power spectrum analysis.
    A trend may be defined as a harmonic component whose period is longer than
the record length.  The technique to remove a trend requires the method of
least squares or regression analysis.  Linear regression can be expressed by a
straight line which has the same equation as (B-1):
                                    y = a+bx

-------
The coefficients a and b are also calculated by the same equations (B-1a) and
(B-1b).  The new time series, y*_i, after a linear trend is removed is:

         y^*_i = y_i - (a + b x_i)                                  Eq.(B-11)
    The other technique is called the power spectrum analysis which  is  a
statistical method for analyzing time-dependent records.   It is  used to
analyze a long, continuous record with high frequency of data acquisition.   It
should not be used for short surveys or low frequency monitoring when limited
amounts of data are available, or if part of the record is missing.  The
general rule is that if we want to monitor one year's data,  we should have
accumulated a data record for ten years.
    A time-dependent record can be resolved into a  spectrum.   Any dominant
periodicities in the record will  appear as peaks in the power spectrum.
Therefore, the spectral analysis is used to extract any regular variations of
the respective parameter with respect to time.  It helps to uncover some of
the phenomena governing the variation of the parameter being studied by
connecting the frequencies corresponding to the peaks of the power spectrum to
physical, chemical and other factors which may be present in the record.  In
short, power spectrum analysis computes the following:

 .  Those parts of the total variance of a record which recur at constant time
    intervals.
 .  Those parts of the total variance of a record which are not recurring in
    character, either a trend or random fluctuations.
 .  The frequencies at which different factors cause the record to vary.
 .  The optimal sampling frequency.

-------
The Computation Procedures of a Power Spectrum
    An important requirement for the computation  of a  power spectrum is  that
there is no missing data in the record.   If some  of the data are  missing,
their values must be interpolated before the spectral  analysis  is attempted.
No more than 5% of the total data should be interpolated.   The  following are
the computational procedures:
    .  Calculate the sample mean of the record.

         \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i                     Eq.(B-12)

    where n is the number of measurements.
    .  Calculate autocorrelation coefficients C_r.

         C_r = \frac{1}{n-r} \sum_{i=1}^{n-r} (y_i - \bar{y})(y_{i+r} - \bar{y})      Eq.(B-13)

    where:
         r = 0, 1, 2, ..., m
         m = the total number of lags to which the computation is carried out.
         n = the number of samples in the record.
    .  Calculate the Fourier cosine transform V_r for each autocorrelation
       coefficient.

         V_r = \frac{K}{m} \left[ C_0 + C_m \cos(\pi r) + 2 \sum_{q=1}^{m-1} C_q \cos\left(\frac{q r \pi}{m}\right) \right]

-------
    where:
         K = 1/2 for r = 0 and r = m
         K = 1 for r = 1, 2, ..., m - 1.

    .  Smooth some distortion of the spectrum for the small sample size.

         U_0 = 0.54 V_0 + 0.46 V_1
         U_r = 0.23 V_{r-1} + 0.54 V_r + 0.23 V_{r+1},   r = 1, 2, 3, ..., m-1      Eq.(B-15)
         U_m = 0.46 V_{m-1} + 0.54 V_m

    where:
         U_0, U_1, ..., U_m are power spectrum estimates corresponding to
         lags 0, 1, 2, ..., m, respectively.
    .  Calculate percentage contribution of each lag to the total variance
       of the record.

         P_r = \frac{U_r}{\sum_{j=0}^{m} U_j} \times 100\%             Eq.(B-16)

    Each of these estimates represents the part of the total record variance
that is estimated to occur within a certain period of time.
    .  Calculate period corresponding to each lag, t_r.

         t_r = \frac{2 m \Delta t}{r}                                Eq.(B-17)

-------
    where:
         t_r is the period corresponding to lag r.
         Δt is the sampling interval.
    The spectral values, U_r, represent estimates over a range of periods,
a band with limits of

         \frac{2 m \Delta t}{r + 1/2}   to   \frac{2 m \Delta t}{r - 1/2}

    In other words, U_r are average values for all frequencies within a band
with a lag r.
    At lag 0, the period becomes infinite.   Thus,  the  spectral estimate
includes all the record variance that does  not recur during  the length of the
record used in the analysis.   Therefore,  it includes any  random fluctuations
and linear trends in the record.
    The longest period other than zero frequency period is determined by the
number of lags used in the computation.   It is generally  recommended that the
number of lags be no greater than 15% of  the total  number of points in the
record.
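
    The computational procedure described in this section can be sketched
in a short routine.  The code below is an illustration under stated
assumptions and is not the program on the diskette.  It assumes that the
autocorrelation coefficients are formed from deviations about the sample
mean with divisor n - r, that the cosine transform carries the factor K/m
with K = 1/2 at lags 0 and m, and that the period at lag r is 2mΔt/r.  The
function name power_spectrum and the synthetic record are hypothetical; the
record is not the Racine data.

```python
from math import cos, pi

def power_spectrum(y, m, dt=1.0):
    """Return (U, percent, periods) for lags 0..m."""
    n = len(y)
    mean = sum(y) / n
    d = [v - mean for v in y]
    # Autocorrelation coefficients at lags 0..m (assumed form, see lead-in).
    c = [sum(d[i] * d[i + r] for i in range(n - r)) / (n - r)
         for r in range(m + 1)]
    # Raw estimates by the cosine transform, with K = 1/2 at the end lags.
    v = []
    for r in range(m + 1):
        k = 0.5 if r in (0, m) else 1.0
        inner = sum(c[q] * cos(q * r * pi / m) for q in range(1, m))
        v.append(k / m * (c[0] + c[m] * cos(pi * r) + 2.0 * inner))
    # Smoothing with weights 0.23, 0.54, 0.23 (0.54 and 0.46 at the ends).
    u = [0.54 * v[0] + 0.46 * v[1]]
    u += [0.23 * v[r - 1] + 0.54 * v[r] + 0.23 * v[r + 1] for r in range(1, m)]
    u.append(0.46 * v[m - 1] + 0.54 * v[m])
    total = sum(u)
    percent = [100.0 * x / total for x in u]
    periods = [float("inf")] + [2.0 * m * dt / r for r in range(1, m + 1)]
    return u, percent, periods

# Synthetic hourly record with a 24-hour cycle, 240 points, 36 lags (15%).
record = [70.0 + 30.0 * cos(2.0 * pi * i / 24.0) for i in range(240)]
u, percent, periods = power_spectrum(record, m=36)
peak = max(range(len(u)), key=lambda r: u[r])
print(peak, periods[peak])
```

    For this synthetic record the largest estimate falls at the lag whose
period is 24 hours, which is how a daily cycle would show up as a peak.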
         Determination of sampling frequency
    The shortest period that is theoretically  possible to resolve with a given
sampling interval is one which is twice as  large as the sampling interval.  In
practice, the sampling interval should be equal to or less than one-third the
length of the shortest period which we want to resolve; for example, if we
want to resolve a 24-hour period, sampling intervals of 8 hours or less are
required.  In mathematical form, the highest frequency which can be resolved
from a discrete record with sampling interval Δt is

-------
         f_{max} = \frac{1}{2 \Delta t}                              Eq.(B-19)
Example:
    The wastewater influent for the city of Racine, Wisconsin, was sampled
hourly in the summer of 1974 and analyzed for TOC.  The record is shown in
Figure B-2.  The mean and variance were calculated to be 70.56 mg/L and
1262.07 mg²/L², respectively.

    The power spectrum corresponding to the record of Figure B-2 is obtained as
depicted in Figure B-3.  This power spectrum exhibits a significant peak at the
1/24 hour frequency and a less significant peak at 1/8 hour.  Since the last
significant peak in the spectrum occurs at the 1/8 hour frequency, the
sampling frequency should be at least two times the frequency of the last
significant peak, i.e., 1/4 hour frequency.  Therefore, a sampling interval
of 4 hours or less should be selected.

-------
    Figure B-2.  Time record of TOC of municipal wastewater at Racine, Wisconsin.
                 (Horizontal axis: time, Sunday through Saturday; annotated
                 mean 70.56 mg/L and s² = 1262.07 mg²/L².)

    Figure B-3.  Power spectrum of TOC concentration of municipal wastewater
                 at Racine, Wisconsin.  (Horizontal axis: frequency, 1/hour.)

-------
B.12  COMPARING TWO MEANS
    This program compares two means in order to determine whether both means
originate from the same population.  For example, two different processes are
compared to determine if any statistical difference exists.  The comparison is
a two-tailed test.  The null hypothesis Ho:  "means are equal" is tested
against Ha:  "means are not equal."  If you want to determine whether process 1
is better than process 2, then the comparison is one-tailed.  The null
hypothesis Ho:  "means are equal" is tested against the alternative hypothesis
Ha:  "mean 1 is greater than mean 2."
    Before the two means are compared, the two sample standard deviations must
be compared by using an F-test to determine whether they are significantly
different.  The equation used to pool the sample standard deviations depends
on the result.

    To use this program, the information required is  as  follows:
Group 1.
    1.  Two sample means
    2.  Number of samples from both sets of data
    3.  Sample standard deviations from both sets of  data
    4.  Confidence level required for the comparison

-------
    In Group 2, the calculating processes are similar to the above example.
However, the user must provide the population standard deviation and the
normal deviate Z.  The Z test statistic is calculated from the formula:

         Z = \frac{\bar{y}_1 - \bar{y}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}      Eq.(B-21)

B.13  PERCENTAGE AREA UNDER THE F DISTRIBUTION
    The F statistic is the ratio of two estimates of variance.  Its
distribution has two parameters, the degrees of freedom of the two estimates of
variance.  It is used to test hypotheses concerning treatment effects or
significant differences between the variabilities of two samples.
    The calculation of the percentage area in this program is an integration
from zero to the desired value of F.  The inputs needed to obtain the
area are the degrees of freedom for the two variances and the desired F
value.  For example, if the degrees of freedom for both variances are 12
and the desired value of F is 3, then the percentage area in
the F distribution is 96.567%.

B.14  F DISTRIBUTION
    This program is to calculate the F value for any desired  percentage area
of the F distribution.  It is the inverse of the program in (B.13).

B.15  SIGNIFICANT TEST BETWEEN VARIABILITIES OF TWO SAMPLES
    The objective is to test the difference in variability between  two

-------
samples.  For example, new equipment is used to measure a compound and it is
expected that the measurement uniformity will improve (less variance, or
more precision).  The question is whether the improvement really exists
or has occurred by chance.  To be sure of a significant improvement in
variability, the ratio (F-ratio) of the two variabilities before and after
must be calculated and compared with the F value at a 95% confidence level
and the degrees of freedom for both sets of data.  If a real improvement does
exist, the calculated F value must exceed the F value at the 95% confidence
level.  Then it can be reported that an improvement in variability exists
with a 95% chance of being correct.  Testing the null hypothesis
Ho:  "variabilities are equal" against Ha:  "variability before is
greater than variability after" is a one-tailed test.  However, if
the null hypothesis Ho:  "variabilities are equal" is tested against
Ha:  "variabilities are not equal," then the test is two-tailed.
    To use this program, the user must provide the sample standard
deviations and the number of samples before and after.  The confidence level
and the choice of a one- or two-tailed test are also needed.
    For example, given ȳ1 = 79.1, ȳ2 = 76.2, n1 = 7, n2 = 5, s1 = 5.10, and
s2 = 3.33, determine whether a significant difference between the
variabilities does exist.
Solution:
        First, calculate the ratio of the two variances, s_1^2 / s_2^2:

                        \frac{s_1^2}{s_2^2} = 2.35                          Eq.(B-22)
        Second, calculate the F value at a 95% confidence level  of 6
        and 4 degrees of freedom.
                        F (6, 4, 0.95) = 6.16                       Eq.(B-23)
        Third, compare s_1^2 / s_2^2 and F (6, 4, 0.95).

        Since the ratio of the two variances is less than the value of
        F (6, 4, 0.95), it is concluded that the variances are not
        significantly different.
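    The three steps of this solution can be sketched in Python.  The
function name variance_ratio_test is hypothetical, and the critical F value
is entered by hand from the example above rather than computed.

```python
def variance_ratio_test(s_before, s_after, f_critical):
    """Return the F ratio and whether it exceeds the critical F value."""
    f_ratio = s_before ** 2 / s_after ** 2
    return f_ratio, f_ratio > f_critical

# Values from the worked example: s1 = 5.10, s2 = 3.33, F(6, 4, 0.95) = 6.16
ratio, significant = variance_ratio_test(5.10, 3.33, 6.16)
print(f"F ratio = {ratio:.2f}, significant = {significant}")
# prints: F ratio = 2.35, significant = False
```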

B.16  SIGNIFICANT TEST BETWEEN THE POPULATION VARIABILITY AND  THE  SAMPLE
      VARIABILITY
    The objective is to test the difference between the sample variability
and the population variability.  For example, does a lower sample
variability from a new measurement mean that the measurement is now more
uniform than the past population variance indicates?  To answer this
question, a chi square test using the following formula must be utilized:
                    \chi^2 = \frac{\nu s^2}{\sigma^2}                       Eq.(B-24)
    where:
         ν  = degrees of freedom, (n-1)
         s² = sample variance
         σ² = population variance
    If the chi square value calculated from equation (B-24) is larger than
the upper 5% point of chi square with (n-1) degrees of freedom, then the
sample variability is significantly greater than the population
variability.  If the calculated chi square value lies between the upper 5%
and upper 95% points, then the sample variability is not significantly
larger or smaller than the population variability.  On the other hand, if
the calculated chi square value is smaller than the upper 95% point,
then the sample variability is significantly smaller than the population
variability.
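    The decision rule above can be sketched in Python.  Instead of looking
up the upper 5% and upper 95% chi square points, the sketch computes the
lower-tail area of the calculated chi square value and compares it with 0.95
and 0.05, which is an equivalent test.  The function names, the series used
for the chi square area, and the input data are assumptions made for
illustration and are not taken from the report.

```python
import math

def chi2_cdf(x, df):
    """Lower-tail area of chi square, from the series for the regularized
    incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def variability_test(sample_var, population_var, n, alpha=0.05):
    """Chi square statistic (n-1)*s^2/sigma^2 and the resulting verdict."""
    df = n - 1
    statistic = df * sample_var / population_var
    lower_area = chi2_cdf(statistic, df)
    if lower_area > 1.0 - alpha:      # beyond the upper 5% point
        verdict = "significantly greater"
    elif lower_area < alpha:          # below the upper 95% point
        verdict = "significantly smaller"
    else:
        verdict = "not significantly different"
    return statistic, verdict

# Hypothetical data: 11 samples, population variance 10.0
for s2 in (3.0, 10.0, 20.0):
    print(s2, variability_test(s2, 10.0, 11))
```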

-------
                  APPENDIX C
                 NOMENCLATURE
μ        Population Mean
σ        Standard Deviation
ν        Degrees of Freedom
χ²(ν)    Chi-Square with Degrees of Freedom ν
α        Significance Level
y_i      Observations at i = 1, 2, 3, ....
ȳ        Sample Mean
s        Sample Standard Deviation
n        Number of Samples
cc       Coefficient of Correlation
Z        Normal Deviate
t        Student t
Exp      Exponential Function
Γ        Gamma Function

-------