EPA-600/1-76-015
February 1976
Environmental Health Effects Research Series
                                    REALISTIC  MODELS  FOR
           MORTALITY RATES  AND  THEIR  ESTIMATION
                                            Health Effects Research Laboratory
                                           Office of Research and Development
                                          U.S. Environmental Protection Agency
                                    Research Triangle Park, North Carolina  27711

-------
                 RESEARCH  REPORTING SERIES

Research reports  of the Office of Research and Development, U.S. Environ-
mental Protection Agency, have been grouped into five series.  These five broad
categories were established to facilitate further development and application
of environmental  technology.   Elimination  of traditional  grouping  was con-
sciously planned  to foster technology transfer  and a  maximum interface in
related fields. The five series are:
    1.    Environmental Health Effects Research
    2.    Environmental Protection Technology
    3.    Ecological Research
    4.    Environmental Monitoring
    5.    Socioeconomic Environmental Studies
This report has been  assigned to the ENVIRONMENTAL HEALTH EFFECTS
RESEARCH series. This series describes projects and  studies relating to the
tolerances of man for unhealthful substances or conditions.  This work is gener-
ally assessed from a  medical  viewpoint, including  physiological or psycho-
logical studies.  In addition to toxicology and other medical specialities, study
areas include biomedical instrumentation and health research techniques uti-
lizing  animals—but always with intended application to human  health measures.
This document is available to the public through the National Technical Informa-
tion  Service,  Springfield,  Virginia 22161.

-------
                                         February 1976
 REALISTIC MODELS FOR MORTALITY RATES AND

            THEIR ESTIMATION
                   By

             V. K. Murthy
University of California at Los Angeles
                  and
   University of Southern California
        Los Angeles, California
          Grant No. 800230
            Project Officer

         Dr. Wilson B. Riggan
      Population Studies Division
   Health Effects Research Laboratory
   Research Triangle Park, N.C.  27711
  U.S.  ENVIRONMENTAL PROTECTION  AGENCY
   OFFICE OF RESEARCH AND DEVELOPMENT
   HEALTH EFFECTS RESEARCH LABORATORY
   RESEARCH TRIANGLE PARK, N.C.  27711

-------
                             DISCLAIMER
     This report has been reviewed by the Health Effects Research
Laboratory, U.S. Environmental Protection Agency, and approved for
publication.  Approval does not signify that the contents necessarily
reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names or commercial products
constitute endorsement or recommendation for use.

-------
                              ABSTRACT
        The object of a "medical follow-up study"  is generally to determine
the effectiveness of each of several types of treatments or regimen by
analyzing the responses of the patients.  Depending on the nature of the
disease and the nature of the drugs or combination of drugs or other
forms of treatment, such as surgery, chemotherapy, etc., a "Protocol"
is developed with the sole object of rendering the study feasible and
valid.  Thus, the "Protocol" concept is much broader in scope than
the concept of plain so-called clinical trials which often give the
investigator the misleading belief  that he has an experimental design
(in its usual sense)  for his study as in the case  of agricultural and
animal experimentation.
        The most reliable response data coming out of these investigations
thus far are the times to death of patients who are not otherwise lost to
the follow-up of our investigation.  The statistical nature of these data,
among other things, will be characterized in the body of this report.
A  salient parameter of interest is  the force of mortality experienced by
patients under the specific protocol.
        By definition, the force of mortality is the rate associated with
the probability of the patient's death in a specified neighborhood of
an instant of time, given that the patient has survived up to that instant.
This quantity, expressed as a function of time, is also called the "mortality
rate function."  Given the mortality rate function, the survival function
can be determined.
        The estimation of the mortality rate function therefore assumes an
important role in the estimation, as well as prediction, of the survival
function.  Nonparametric methods for estimating hazard rates are considered
in Murthy (1965) and Grenander (1956).  However, the most that the
aforementioned estimates can do is to throw light on the shape of the
underlying hazard rate function, since they cannot be used for decision making

-------
and prediction purposes, as the estimation of a continuous infinity of
parameters is implicit in all those procedures.  Thus,  parametric
procedures based on realistic models for the mortality rate  are the
only suitable approaches for efficient estimation and prediction.  It
is generally accepted, based on both deductive reasoning and extensive
empirical evidence,  that the ideal mortality rate curve consists of an
early decreasing phase due to burn-in and de-bugging  (infantile mortality)
followed by a phase of useful life, where it remains approximately
constant, before climbing to failure due to wear out, aging, or fatigue
damage.  However, no models  for the mortality  rate have been advanced
so far that depend on as few parameters as possible and which  also
contain all the three phases mentioned earlier and which,  at  the same
time, give rise to the survival  function in a simple closed form.
        Therefore,  families of distributions which contain those
corresponding to the increasing mortality rate (IMR), those
corresponding to the decreasing mortality rate (DMR),  and finally,
those corresponding to the  constant mortality rate (exponential distributions)
and combinations thereof giving rise  to the classical bath-tub shape
mortality rate curves,  are characterized in a closed form in this report.
Methods of estimating the parameters are developed and procedures
dealing with computational  details and statistical properties are
discussed.
       A  graphics program is developed which gives rise to  the
curve that is most apt, in a given situation, by playing with the
parameters in the present models.  The interactive part of the
program, as well as  the problem of dealing with  competing risks
in a realistic manner are reserved for the continuation part of the
study.

-------
                              TABLE OF CONTENTS
Chapter
   1    Introduction
   2    Annotated Bibliography on Statistical Analyses of Follow-Up
        Studies for Patients
   3    Realistic Models for Mortality Rates and Their Estimation
        Introduction
        Estimation of the Hazard Rate Function
        Theory
        Realistic Models of the Mortality Rate
        Application to Medical Follow-Up Studies
        Life Table Model
        Follow-Up Model
        Test Results

   4    Applications of the Models and an Example
        Real Life Examples to Which the Model Can be Applied
        Statistical Models with the Bath-Tub Property
        Graphics Program
        Maximum Likelihood Procedure for Estimation
        A Numerical Example

   5    Computer Program for Graphics

   6    Concluding Remarks

        Bibliography

-------
                               CHAPTER 1
                            INTRODUCTION

       The investigation leading to the study in this final report,
interestingly, has its roots in a typical problem which is widely
encountered in medical or epidemiological follow-up studies.  The
particular follow-up study, brought to the attention of this author
via Dr. Emanuel Landau, then at the Bureau of Radiological Health,
and Dr. W. J. Dixon of the Health Sciences Computing Facility at
U.C.L.A., is widely known as "The Cooperative Thyro-Toxicosis
Study."  The data base came from twenty-four hospitals in the United
States and one in the United Kingdom.  Several people, notably Mrs.
Tompkins, presently of the EPA at Durham, North Carolina, and Professor
Lewis of the California Institute of Technology, Pasadena, California,
have worked with this data base and come up with conflicting conclusions.
        Dr. Lewis maintained that the incidence of the acute form of
leukemia in patients treated with radioactive iodine is statistically
significantly larger than the incidence among those treated with
thyroidectomy and/or medicines.
       On the other hand, Mrs. Tompkins maintained, basing her
analyses on the same data base, that there was indeed no significant
difference in the incidence of the acute form of leukemia between the
two groups.  The author agrees with the stand taken by Mrs. Tompkins
because of the following two flaws in the arguments advanced by Dr.
Lewis:
       The main flaw lies in the arbitrariness of the age grouping that
Dr. Lewis resorted to (indicating a possibly deliberate attempt to make
the data support the stand taken by him), rather than treating age
as a concomitant or co-influencing variable.
       The other flaw is inherent in the data base, by way of the
very small incidence of the acute form of leukemia in relation to
the large sample of patients.

-------
        Also, the empirical age distributions of the patients in the
two groups, namely those treated with radioactive iodine and those
treated otherwise, turn out to be almost mirror images of one
another, the one corresponding to those subject to surgery being
skewed to the right while the other, being its mirror image, is
skewed to the left.
        More explicitly and in greater detail, we recall the following
findings:
        The February 1972 issue of the Journal of the National
Cancer Institute contained an article entitled "Irradiation in the
Epidemiology of Leukemia Among Adults," by Gibson, et al.
This paper was also presented by them at the American Public
Health Association annual meeting in October 1971.  The authors
stated in the summary that the relation between leukemia and
diagnostic medical irradiation was compared.  The comparison was
based on 1,414 cases of adult leukemia in patients who were diagnosed
as having leukemia or who died with leukemia between 1959 and 1962
in twenty-six upstate New York counties and counties in and around
Baltimore, Maryland, and Minneapolis-St. Paul, Minnesota.  These
1,414 cases were compared with a so-called control group of 1,370
persons randomly selected from households in the same geographic
area as the cases in the sample population.  The association of
irradiation with leukemia was examined by histologic type, sex, and
age of the subject.  The two histologic types of leukemia considered
were lymphatic and myeloid-monocytic, and within each type the acute
and chronic types were distinguished.  In the rest of the cases the
type of leukemia was unspecified.
       From a statistical analysis of the data,  the authors concluded
       (a) irradiation had an enhanced  risk for both acute and chronic
         myeloid leukemia in males but not  in females;
       (b) the effect or association was more pronounced for irradiation
          to the trunk than to all other sites;
       (c) neither irradiated males  nor females had  an elevated risk

-------
        for any other histologic type of leukemia; and
     (d) only a small proportion of the irradiated population developed
        leukemia, and irradiation accounted for only a small proportion
        of leukemia cases.



     Many criticisms can be made about their report and procedures.
There are not sufficient data published to examine their adjustments for
age differences in the exposed and control groups.  Any modification
could affect the analysis significantly.  The computations for estimated
relative risk were not given.  Examination of these computations would
be desirable, since the observed correlation of leukemia with number
of films occurs throughout their tables, whether the ratios are significantly
large or not.  This may indicate an even stronger relation between
leukemia and exposure than these authors indicate.

      Some of the fundamental criticisms about most of the analyses
made by different investigators of the thyro-toxicosis data base were
given in a paper by Murthy, et al., that appeared in the Journal of the
National Cancer Society.  Their criticisms are:



      (1)  The major defect of the paper is the large difference in age
      distribution between cases and controls.  Only 37 percent of the
      controls were over sixty years of age, compared with 62 percent
      of the cases.  An age standardization may not be an adequate
      compensation for such a large difference, in view of the implied
      cohort differences.  Moreover, the chi-square test is a crude
      assessment of the proportional differences of cases and controls
      with a given attribute, and the Cochran modification cannot be
      relied on to control variations in the age differences of the two
      groups.

      (2)  So much of the paper deals with the method of removing possible
      sources of bias from the data that it is difficult, if not impossible,
      to characterize the ultimate samples which form the basis of the
      authors' analysis.

      (3)  Table 1, on page 303, shows the age distribution (in 10-year intervals)
      of 1,370 controls and 1960 census data for the appropriate geographic
      area, which includes approximately 13 million persons.  These data

-------
    are not consistent with the statement on page 302 that
    the controls form approximately a "one in 3,000 probability
    sample" of the population.  Also,  in our opinion a "probability
    sample" is not relevant unless the authors of the paper explain
    their meaning of the term.

(4)  In our opinion, the paper would be strengthened by a statistical
    appendix, explaining the age adjustment, the weighted percent
    exposed, the test of significance, and the estimation of relative
    risk; if variances are important, they too should be included.

(5)  The suggested appendix should also
    a)  describe the respondents for cases and controls, to determine
        whether husbands speaking for their wives, and vice versa, are
        equally reliable, and
    b)  show the effectiveness of the "auxiliary reporting system,"
        e.g., did this affect the proportions of cancers notified
        during life and after death?  If, as the reviewers suggest, a
        high proportion of leukemias was first reported by death
        certificates, were the radiographic histories equally reliable
        for leukemias and other cancers?

(6)  Conclusions comparing trunk with all other sites, as stated by
    the authors, cannot be examined, since they merely report "all
    sites" and "trunk only."

(7)  The conclusion, that X-rays of the chest and abdomen were responsible
    for a small proportion of the myeloid leukemias, did not
    consider all the pertinent factors.  For instance, the authors did
    not record intervals between exposure and onset or make allowance

-------
             for the possibility that the preleukemic state, which increases
             susceptibility to infection, might increase the chance of being
             X-rayed during the 5-6 years before the onset of symptoms.




         The interval between exposure and onset is important.  Stewart, in
his article "Tissue Aging as a Factor in Juvenile Cancers," [Proc. R. Soc.
Med., 65:245-246, 1972] has shown that when myeloid leukemia is caused by
obstetric radiography, the interval between birth and onset of symptoms is
usually 5-10 years.  The preleukemic state is also important.  The probability
of developing a nonfatal respiratory infection while incubating leukemia
is increased, as described in Kneale's article "Excess Sensitivity of
Pre-Leukemics to Pneumonia:  A Model Situation for Studying the Interaction of
Infectious Disease with Cancer," [Br. J. Prev. Soc. Med., 25:152-159, 1971], as
well as the probability of developing a fatal infection.  The latter is
brought up in Kneale's 1971 article as well as in a study done by A. Stewart,
"Epidemiology of Acute (and Chronic) Leukemias," appearing in Clinics in
Haematology, edited by S. Roath [London:  W.B. Saunders, 1972].




         It was suggested by Stewart, et al., in an article entitled "Adult
Leukaemias and Diagnostic X-rays," [Br. Med. J., 5309:882-890, 1962] that
X-rays of the chest and abdomen might be responsible for a few myeloid
leukemias in adults.  However, their data showed that only for five years before
the onset of symptoms were the patients with myeloid leukemia more at risk of
being X-rayed than their controls.  Therefore, it is reasonable to conclude
that the extra X-ray examinations in the adult surveys were a consequence of

-------
 the heightened sensitivity of the recipients  and not  a cause of the  disease.




         Since basic issues such as the selection of an appropriate model, the



 procedure used to adjust for the age of the subject,  and finally, taking into



 consideration the environmental factors and the  subject's individual medical



 status are involved in all these studies,  the author  undertook a systematic



 investigation that will probably extend over  a period of time,  in order to



 tackle these basic issues.




         Towards this end, during the first phase of this investigation,
 culminating in this report, the author made an extensive study of the state
 of the art of procedures for analysis, under the title "Annotated
 Bibliography."  This has been summarized in Chapter Two of this final report.
 Chapters Three and Four deal with extremely versatile models for survival
 data, and an example of the relative performance of the models, dealing with
 a hypothetical cohort of 1,000 people born in the United States in the year
 1900, is given.




         In Chapter 5  a graphics method which can be made  interactive is de-



 scribed and this procedure will enable the  investigator to choose the appro-



priate model, based on his empirical experience.




        In Chapter Six, the author points out how these methods can be applied



to assess the marginal detrimental effect of any designated pollutant on the



mortality rate, in the presence of other competing risks such as the concen-



trations of an entire spectrum of harmful pollutants.

-------
              Annotated Bibliography on Statistical Analyses
                    of Follow-Up Studies for Patients
       In this chapter, we review comprehensively the state of the art
in modeling the survival curve and the methods used for comparing
different survival curves.  The review follows the alphabetical listing
of the authors.  This chapter, therefore, gives an insight into where we
stand in our ability to deal with the survival experiences of patients in
the (competing risk) environment subjected to different regimens or
treatments.  Consequently, this chapter serves as a stimulant for additional
research, meeting the real world situation and revealing challenges as to how
the investigator should tackle the various violations of the underlying
design or profile.


       In his paper "The Comparison of Survival Curves," Armitage [1959]
considered the problem of comparing two survival curves.  The survival time
distributions, $F_A(t) = 1 - e^{-\lambda_A t}$ and $F_B(t) = 1 - e^{-\lambda_B t}$,
are compared by the

statistic
Several techniques are compared by considering the ratio of their respective


      3 •



       Parametric maximum likelihood estimators of $\lambda$ are given by

       $$\frac{1}{\hat{\lambda}} = \frac{\sum_{j=1}^{d} t_j + \sum_{i=1}^{n-d} T_i}{d},$$
-------
where $t_j$, $1 \le j \le d$, is the time from entry to death and $T_i$,
$1 \le i \le n - d$, are the survivors' times since entry, out of a total of
n patients entering treatment.
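
As a rough illustration of this kind of estimator, the sketch below computes the usual maximum likelihood estimate of a constant death rate from censored follow-up data: the number of deaths divided by the total time under observation. The function name and the sample figures are hypothetical, and the closed form is the standard exponential-model result rather than a transcription of Armitage's own notation.

```python
def exponential_rate_mle(death_times, survivor_times):
    """Estimate a constant death rate from censored follow-up data.

    death_times    : times from entry to death (one per observed death)
    survivor_times : times since entry for patients still alive
    Returns deaths / total observed time, the standard exponential MLE.
    """
    deaths = len(death_times)
    exposure = sum(death_times) + sum(survivor_times)
    if deaths == 0 or exposure <= 0:
        raise ValueError("need at least one death and positive follow-up time")
    return deaths / exposure


# Hypothetical data: three deaths and two survivors, times in years.
rate = exponential_rate_mle([2.0, 3.0, 5.0], [6.0, 4.0])
mean_survival = 1.0 / rate
```

With these made-up figures there are 3 deaths in 20 patient-years of follow-up, so the estimated rate is 3/20 per year.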




       The Sign Method assumes that patients enter in pairs and one is given
treatment A and the other is given treatment B.  This test requires that
$(1 - F_A(t)) = (1 - F_B(t))^k$ for some constant k.  Any of the sequential
procedures for binomial sequences can be applied.




       Direct comparison of proportions of survivors at time  T  after treat-



ment is shown to be less efficient than the sign method.




       Actuarial methods divide the interval (0, T) into equal subintervals
and estimate the probability of survival through each subinterval directly.
The product-limit (PL) estimate is

       $$\prod_{r} \frac{n - r}{n - r + 1},$$

where the product is taken over the ranks r of the deaths for which $t_r < T$.
This is a non-parametric maximum likelihood estimator of the survival
function $1 - F(t)$.
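
A minimal sketch of a product-limit calculation follows, assuming each patient contributes one observation time and a flag saying whether that time is a death or a censored (still alive) observation. The function name, the tie-breaking rule (deaths before censored observations at equal times), and the use of "at or before T" rather than "strictly before T" are choices made for this illustration only.

```python
def product_limit_estimate(times, died, T):
    """Product-limit estimate of the probability of surviving beyond T.

    times : observation time for each patient
    died  : True if the time is a death, False if it is censored
    """
    # Order observations by time; at ties, deaths come before censored cases.
    order = sorted(zip(times, died), key=lambda pair: (pair[0], not pair[1]))
    n = len(order)
    survival = 1.0
    for rank, (time, is_death) in enumerate(order, start=1):
        if time > T:
            break
        if is_death:
            # n - rank + 1 patients are at risk just before this death.
            survival *= (n - rank) / (n - rank + 1)
    return survival


# Hypothetical follow-up of five patients (times in years).
times = [1.0, 2.0, 3.0, 4.0, 5.0]
died = [True, False, True, True, False]
estimate = product_limit_estimate(times, died, T=3.5)
```

In this example the deaths at ranks 1 and 3 contribute the factors 4/5 and 2/3, while the censored observation at rank 2 contributes no factor but still reduces the number at risk.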




       Computations of relative efficiencies are given and an illustrative



example is worked out.




       A "Study of Life-Testing Models and Statistical Analysis for the



Logistic Distribution," has been made by Bain, et al, [1973] in which they



have reviewed bath-tub hazard rates in Section I of their paper; the model I

-------
of Murthy and Swartz [1971] is shown to correspond to the distribution of
the minimum of a Pareto and a Weibull variable.  It is shown that the minimum
of two Weibull variables gives a hazard rate

       $$h(t) = \lambda_1 \beta_1 t^{\beta_1 - 1} + \lambda_2 \beta_2 t^{\beta_2 - 1},$$

which is U-shaped if $\beta_1 < 1$ and $\beta_2 > 1$.
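
The U-shape can be checked numerically with a short sketch. It assumes each Weibull variable has survival function exp(-a t^b), so that its hazard is a b t^(b-1) and the hazard of the minimum of two independent variables is the sum of the two hazards. The parameter names a1, b1, a2, b2 and the numerical values are hypothetical and are not taken from Bain, et al.

```python
import math


def weibull_min_hazard(t, a1, b1, a2, b2):
    """Hazard of the minimum of two independent Weibull variables.

    Each variable is assumed to have survival exp(-a * t**b), so its own
    hazard is a * b * t**(b - 1); the hazards of independent risks add.
    """
    return a1 * b1 * t ** (b1 - 1) + a2 * b2 * t ** (b2 - 1)


def weibull_min_survival(t, a1, b1, a2, b2):
    """Survival function of the minimum: the product of the two survivals."""
    return math.exp(-(a1 * t ** b1 + a2 * t ** b2))


# Hypothetical parameters: one shape below 1 (early failures), one above 1 (wear-out).
params = dict(a1=0.5, b1=0.5, a2=0.01, b2=3.0)
early, middle, late = (weibull_min_hazard(t, **params) for t in (0.1, 2.0, 10.0))
```

With one shape parameter below 1 and the other above 1, the hazard is high at early times, lower in the middle, and rising again later, which is the bath-tub pattern discussed in the text.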


       Estimation of the parameters in a polynomial model is discussed in
Section II.  For a linear hazard, as proposed by Kodlin [1967], it is shown
that the maximum likelihood equations $dL/da = 0$ and $dL/db = 0$, where
$L = L(t_1, \ldots, t_n; a, b)$ is the likelihood function, may have multiple
solutions and that the maximum of the likelihood may occur on the boundary.
The Cramer-Rao lower bounds are computed for sample sizes of 10, 20, 40 and 80.
Estimates using a least squares fit to the empirical distribution function and a
fit of $\ln(1 - F(t_i))$ to $\ln(1 - i/(n+1))$ are developed.  A test of
$b = 0$ vs. $b \ne 0$ is proposed for testing whether the distribution is
exponential or has a linear increasing hazard rate.
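
To make the likelihood concrete, the sketch below evaluates the log-likelihood of a complete (uncensored) sample under a linear hazard. The form h(t) = a + b t is an assumption made here for illustration, with a and b matching the parameter names in the text; under it the density is (a + b t) exp(-(a t + b t^2 / 2)). The function name and data are hypothetical.

```python
import math


def linear_hazard_loglik(times, a, b):
    """Log-likelihood of uncensored lifetimes under the hazard h(t) = a + b*t.

    The cumulative hazard is a*t + b*t**2/2, so each observation contributes
    log(a + b*t) - (a*t + b*t**2/2).  Parameter values that give a
    non-positive hazard at an observed time are assigned -infinity.
    """
    if any(a + b * t <= 0 for t in times):
        return float("-inf")
    return sum(math.log(a + b * t) - (a * t + 0.5 * b * t * t) for t in times)


# Hypothetical lifetimes; with b = 0 the model reduces to the exponential case.
times = [1.0, 2.0, 3.0]
exponential_case = linear_hazard_loglik(times, a=0.5, b=0.0)
linear_case = linear_hazard_loglik(times, a=0.2, b=0.2)
```

Setting b = 0 recovers the exponential log-likelihood, n log(a) - a times the sum of the lifetimes, which is the boundary case the proposed test compares against.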


       Section III presents the results of Monte Carlo techniques to obtain

the distributions of maximum likelihood estimates of the logistic

distribution.


       In a paper dealing with the "Failure of a Model to Predict Cancer
Survival," Berg and Robbins [1967] consider a model in which the mortality
due to cancer is assumed to be constant and equal to a linear sum of tumor
size, tumor

-------
 grade,  node metastases  and node  sinus histiocytosis.   It  is  shown that  for a



 time beyond ten years after mastectomy, mortality does not agree  with the



 assumption of constant  mortality.




        Following this, Berkson and Gage [1952] take into consideration a
 "Survival Curve for Cancer Patients Following Treatment," using a function
 of the years after operation, t,

        $$l_t = c \, l_0 + (1 - c) \, l_0 \, e^{-\beta t},$$

 depending upon two parameters, with $l_0$ the corresponding survival of the
 normal population.  These parameters are c, the fraction cured, and $\beta$,
 the death rate from the cancer.  It is fitted by least squares to data on
 survival of cancer of the stomach.  The function is based on the assumption
 that a certain fraction of the cancerous population is dying at the normal
 rate, while the rest encounter an instantaneous rate of risk to the cancer
 that is constant.  The function fits the data very closely.  It is
 recommended that c and $\beta$, with the ratio of expectations of life of
 the cancerous to the normal population, be used as a summary of the
 mortality of the cancer, rather than the five-year survival rate.
 Adjustment for age-at-entry-to-treatment is not made.
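
A small sketch of a cured-fraction survival curve of this kind is given below. It assumes the form c l0(t) + (1 - c) l0(t) exp(-beta t), where l0(t) is the survival of the normal population, c is the cured fraction, and beta is the constant extra death rate from the cancer. The function names, the stand-in normal survival curve, and the parameter values are all hypothetical.

```python
import math


def cured_fraction_survival(t, c, beta, normal_survival):
    """Survival at t years when a fraction c is cured.

    Cured patients follow normal_survival(t); the rest follow the same curve
    multiplied by exp(-beta * t), i.e. they face an extra constant death rate.
    """
    l0 = normal_survival(t)
    return c * l0 + (1.0 - c) * l0 * math.exp(-beta * t)


def normal_population(t):
    """Stand-in normal survival curve with a constant 2% yearly death rate."""
    return math.exp(-0.02 * t)


# Hypothetical parameters: 30% cured, extra death rate 0.5 per year.
five_year = cured_fraction_survival(5.0, c=0.3, beta=0.5,
                                    normal_survival=normal_population)
long_run_ratio = (cured_fraction_survival(50.0, 0.3, 0.5, normal_population)
                  / normal_population(50.0))
```

Long after treatment the ratio of the patients' survival to normal survival settles at the cured fraction c, which is why c and beta together say more about the course of the disease than a single five-year survival figure.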




        Boag [1949] uses maximum likelihood estimates to obtain the proportion
 of patients cured by cancer therapy.  Histograms of survival times for



 all patients who died from cancer, after treatment, show  a  distribution that



 is  skewed with  a marked tail extending towards long survival times.   This  is



 true for data collected from various hospitals, for different cancers, var-



 ious sites of the disease  and  different methods of treatment.  The gamma,



Weibull, lognormal and exponential distributions were considered, but the

-------
lognormal gives the best fit, as measured by the chi-square goodness-of-fit
criterion.  For a large group of people treated for cancer, the survival time
is described by a distribution that is a mixture of c percent who are
permanently cured and (1 - c) percent whose survival time follows a lognormal


distribution with parameters  H  and  
-------
where f(x,y)dx is the number of survivors after y years and f(x,0) is
the age distribution among patients at the time of treatment, with F(X)
the total number of deaths in the interval (x, x + dx) and g(x + y) is
the death rate of the normal population.  The derivation was made on the
assumption that the age of the patient has no effect on the response of the
tumor to the treatment.




       The discussion contained in this paper brings out many practical con-



siderations involved in follow-up studies such as the time between occurrence



of the cancer and its detection, the accurate determination of cause of



death, and the assumption of independence of deaths from cancer and deaths



from other causes.





       In his paper "Simple Mortality Rates," Broadbent [1958] fits the
following mortality rates, m(t), by graphical plots to data on the
lifetime of milk bottles.

                      Model               m(t)

                      Constant            a
                      Linear              a + 2b t
                      Strange             a\ t
                      Gompertz            a e^{bt}
                      Makeham             a e^{bt} + c
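
For reference, the sketch below turns four of these rates into survival curves through S(t) = exp(-M(t)), where M(t) is the integral of m(t) from 0 to t. The Gompertz and Makeham forms used here, a e^(bt) and a e^(bt) + c, are the standard textbook forms and are an assumption about the notation in the table; the function and dictionary names are hypothetical.

```python
import math

# Cumulative mortality M(t), the integral of m(t) from 0 to t, for each rate.
CUMULATIVE_RATE = {
    "constant": lambda t, a, b, c: a * t,                                # m = a
    "linear": lambda t, a, b, c: a * t + b * t * t,                      # m = a + 2bt
    "gompertz": lambda t, a, b, c: (a / b) * (math.exp(b * t) - 1.0),    # m = a e^(bt)
    "makeham": lambda t, a, b, c: (a / b) * (math.exp(b * t) - 1.0) + c * t,
}


def survival(model, t, a, b=0.0, c=0.0):
    """Survival probability S(t) = exp(-M(t)) for the named mortality rate.

    The Gompertz and Makeham models require a non-zero b.
    """
    return math.exp(-CUMULATIVE_RATE[model](t, a, b, c))


# Hypothetical parameter values, time measured in arbitrary units.
s_constant = survival("constant", 2.0, a=0.5)
s_linear = survival("linear", 2.0, a=0.1, b=0.05)
s_gompertz = survival("gompertz", 2.0, a=0.1, b=0.3)
s_makeham = survival("makeham", 2.0, a=0.1, b=0.3, c=0.05)
```

Each of these rates integrates in closed form, so the survival function can be written down directly once the parameters are chosen.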

-------
       Buckland [1964] presents the advantages of using the hazard rate in
his "Statistical Assessment of the Life Characteristic."  Reference
papers that consider Weibull, Gamma, Normal, Extreme Value and Log Normal
distributions and their estimates are listed.  Chapter 2.15 of this paper
surveys the literature on the life table.




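       In his paper, "A Note on the Consistency and Maxima of the Roots of
Likelihood Equations," Chanda [1954] presents properties of the maximum
likelihood estimators of the unknown parameters
$\theta = (\theta_1, \ldots, \theta_k)$, which are developed for random
samples of size n.  The density, $f(x, \theta)$, is assumed to
meet the following regularity conditions:

1.     $\theta$ lies in a k-dimensional interval $\Omega$, and

       $$\frac{\partial \log f}{\partial \theta_r}, \qquad
         \frac{\partial^2 \log f}{\partial \theta_r \, \partial \theta_s}
         \qquad \text{and} \qquad
         \frac{\partial^3 \log f}{\partial \theta_r \, \partial \theta_s \, \partial \theta_t}$$

       exist for all r,s,t = 1,2,...,k, for all $\theta \in \Omega$ and for
       almost all x.

2.     For almost all x and for all $\theta \in \Omega$,

       $$\left| \frac{\partial f}{\partial \theta_r} \right| < F_r(x), \qquad
         \left| \frac{\partial^2 f}{\partial \theta_r \, \partial \theta_s} \right| < F_{rs}(x)
         \qquad \text{and} \qquad
         \left| \frac{\partial^3 \log f}{\partial \theta_r \, \partial \theta_s \, \partial \theta_t} \right| < H_{rst}(x),$$

       with $F_r(x)$ and $F_{rs}(x)$ integrable over $(-\infty, \infty)$ and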

-------
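       $$\int_{-\infty}^{\infty} H_{rst}(x) \, f \, dx < M$$

 for all r,s,t = 1,2,...,k, where M is a positive constant.

 3.      For all $\theta \in \Omega$ the matrix $J = (J_{rs}(\theta))$, where

       $$J_{rs}(\theta) = \int_{-\infty}^{\infty} \frac{\partial \log f}{\partial \theta_r} \, \frac{\partial \log f}{\partial \theta_s} \, f \, dx,$$

 is positive-definite and $|J|$ is finite.

       By expanding

       $$\frac{1}{n} \frac{\partial \log L}{\partial \theta_r} = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \log f_i}{\partial \theta_r}$$

 as

       $$L_r(\theta^*) + \sum_{s=1}^{k} (\theta_s - \theta_s^*) \, L_{rs}(\theta^*)
         + \frac{1}{2} \sum_{s=1}^{k} \sum_{t=1}^{k} (\theta_s - \theta_s^*)(\theta_t - \theta_t^*) \, L_{rst}(\theta'),$$

 where

       $$L_r(\theta) = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial \log f_i}{\partial \theta_r}, \qquad
         L_{rs}(\theta) = \frac{1}{n} \sum_{i=1}^{n} \frac{\partial^2 \log f_i}{\partial \theta_r \, \partial \theta_s}$$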

-------
and

       $$L_{rst}(\theta') = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{\partial^3 \log f_i}{\partial \theta_r \, \partial \theta_s \, \partial \theta_t} \right)_{\theta = \theta'(x)},$$

where $\theta'(x)$ is a value depending on x, for r = 1,2,...,k.

       It is shown that $L_{rs}(\theta^*) \to -J_{rs}(\theta^*)$ with probability tending to 1
and that the $\hat{\theta}_r - \theta_r^*$ have asymptotically a joint k-variate normal distribution with
zero means and variance-covariance matrix $V = (n J_{rs}(\theta^*))^{-1}$.  Of all possible
solutions to the likelihood equations, only one tends, in probability, to the
true parameter, $\theta^*$.






       Chiang [1961] presents a "Stochastic Study of the Life Table and its
Applications," where the lifetimes of individuals in a group are assumed to
be i.i.d. random variables and it is also assumed that a constant force of
mortality for each year applies for each cause of death.  The case of censored
samples is considered.  Maximum likelihood estimators are found for $p_x$,
the probability that an individual who is alive at time x will survive the
interval (x, x+1), assuming a constant force of mortality at each x.
Computation of the life table is made, using $p_x$, if all who entered the study
die before the study is completed; if live withdrawals occur, the
life table is approximated by assuming a constant force of mortality beyond
the end of the study period.




       In the case of simultaneous risks, it is assumed that in each year the

-------
force of mortality for each risk is constant and that their sum is the total
force of mortality.  The crude probability of death, in the presence of all
risks, the net probability of death due to one cause, and the partial crude
probability of death due to a specific cause with all but one of the competing
risks eliminated are defined.



The maximum likelihood estimators of these probabilities are obtained.  The



asymptotic variance and covariance of the estimators are given, as is an ex-



ample of life-table construction using cancer data.  The interval over which



the force of mortality is constant depends on the investigator; in this



case, one year was selected.
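
The constant-force assumption makes these probabilities easy to compute within one interval, as the sketch below suggests. It assumes that, with cause-specific forces that are constant over the interval and add up to the total force, the crude probability for a cause is its share of the total force times the overall probability of death, and the net probability is what would be observed if that cause acted alone. Function names and the numerical forces are hypothetical; this is not a transcription of Chiang's estimators.

```python
import math


def death_probabilities(forces, length=1.0):
    """Crude and net probabilities of death over one interval.

    forces : mapping from cause of death to its constant force of mortality
    length : length of the interval (one year in the example discussed)
    Returns (overall probability of death, crude by cause, net by cause).
    """
    total = sum(forces.values())
    q_total = 1.0 - math.exp(-total * length)
    # Crude: each cause takes a share of deaths proportional to its force.
    crude = {cause: (mu / total) * q_total if total > 0 else 0.0
             for cause, mu in forces.items()}
    # Net: probability of death if this cause were the only risk acting.
    net = {cause: 1.0 - math.exp(-mu * length) for cause, mu in forces.items()}
    return q_total, crude, net


# Hypothetical yearly forces of mortality for two causes.
q, crude, net = death_probabilities({"cancer": 0.03, "other": 0.01})
```

The crude probabilities add up to the overall probability of death in the interval, and each net probability is at least as large as the corresponding crude one, because competing causes remove some individuals before the cause in question can act.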




       It is remarked that the problem of cases lost due to failure to



follow up is unsolved and probably has no unique solution.  Suggestions for



handling the data are given if the number of lost cases is small.





       David [1970] has presented a paper on "Chiang's Proportionality
Assumption in the Theory of Competing Risks," in which an individual's
lifetime, under  k  independent hazards  $H_i$,  $1 \le i \le k$,  is assumed
to be the minimum of the  k  lifetimes.  Hence, the total force of mortality,
$\mu(x)$,  is

                    $$\mu(x) = \sum_{i=1}^{k} \mu_i(x) .$$

Chiang assumes that the ratio

                    $$\mu_i(x) / \mu(x) = c_{ij}$$

over the time interval  $(x_j, x_{j+1})$  is constant and independent of  x.
This condition, it is shown, can be satisfied by members of the three classes
of extreme-value distributions.  If the distributions must be of the same
functional form, then these are the only permissible forms.





       In another paper entitled "An Analysis of Some Failure Data," D. J.



Davis [1952] presents failure data from JO systems which are compared to ex-



ponential and normal distributions.  Chi-square comparison is made on time-



to-failure data by choosing convenient intervals of time, estimating the mean



by the sample mean, and calculating the expected number of occurrences in



each interval.  When data are the number of failures in equal time intervals,



a comparison is made to the Poisson distribution.  When only a few time
intervals are available, an index of dispersion is calculated as
$\sum (X_i - \bar{X})^2 / \bar{X}$.



Normality of the data is assessed by visual observation and by a chi-square



test on equal intervals of time.  In the example considered, it is shown



that the exponential distribution may be regarded as a useful approximation



to the actual failure distribution.
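
       The index of dispersion mentioned above can be illustrated with a
minimal Python sketch.  It assumes the statistic has the usual form
$\sum (X_i - \bar{X})^2 / \bar{X}$,  where the  $X_i$  are counts of failures in
equal time intervals.  The counts and the function name are hypothetical and
are not taken from Davis [1952].  Under a Poisson hypothesis the statistic is
commonly referred to a chi-square distribution with one fewer degrees of
freedom than the number of intervals.

```python
def index_of_dispersion(counts):
    """Return sum((x_i - mean)^2) / mean for counts in equal time intervals."""
    n = len(counts)
    if n < 2:
        raise ValueError("at least two intervals are required")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("index of dispersion is undefined when the mean is zero")
    return sum((x - mean) ** 2 for x in counts) / mean


# Hypothetical numbers of failures in six equal time intervals.
example_counts = [3, 5, 2, 4, 6, 4]
dispersion = index_of_dispersion(example_counts)
# Degrees of freedom for the chi-square reference distribution.
degrees_of_freedom = len(example_counts) - 1
```

A value of the statistic far above its degrees of freedom would suggest more
variation between intervals than a Poisson model allows.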




       Dixon, et al, present a study of the "Use of
Triethylenethiophosphoramide (thio TEPA) as an Adjuvant to the Surgical
Treatment of Gastric and Colorectal Carcinoma: Ten-Year Follow-Up."  Data on
survival from a colorectal study and a gastric study are presented.  For the
colorectal study, the survival rate is fitted by the equation  EXP (-bt),
excluding 50-day deaths.  For the gastric study, survival is fitted by
a EXP (-0.087t) + (1-a) EXP (-bt)  using a nonlinear least squares fit.  The
curves are described by the half-life, which is the number of years after the
operation at which  50  percent of the patients have died.  In the gastric
study, the proportion of patients with a half-life of  8  years is found (the
rate of  0.087  per year corresponds to a half-life of  ln 2 / 0.087,  or
about  8  years) and the half-life of the remaining cases is given.  The
half-life of  8  years was apparently found by trial and error.  The survival
rate is compared with the survival for the U.S. population of the same age,
sex and race constitution, although the age of the patients in the study is
neither indicated nor used to adjust the survival rate.




       In their paper "The Relative Survival Rate:  A Statistical
Methodology," Ederer, et al [undated] define the survival rate as the
proportion of



patients surviving a specified time  interval at a given age.  The relative



survival rate, then, is the ratio of observed survival rate to the expected



survival rate of a group of patients with similar characteristics of age,



sex and race, but free of the specific disease under study.  This rate is



preferred over the computation of survival rate by excluding deaths from



other causes than the disease under study.  The relative rate is also called
the age-adjusted survival rate, but it does not adjust for any association



between age and risk of dying from the specific cancer under study.  Life



tables of the entire population may be used for the expected rate in most



cases, since excluding the specific disease has a negligible effect on the



overall rate.  It is, however,  pointed out that there may be a large effect,



depending on the effect of associations such as that between smoking and

-------
lung cancer.   In making the adjustment, care must be exercised  in  selecting


either the static or  fluent life table.  For analysis over an extended


number of follow-up years, the varying age composition  of the patient group


must be considered in determining the expected survival rate.


       An exact method, approximate methods, and a simple approximate method


for calculating expected survival rates are given.  The approximate methods


divide the population into varying age groups.  The exact method uses the


survival probability  for each individual and averages over the  group.



       The standard error of the relative rate is shown to be approximately


equal to the standard error of the observed survival rate divided  by the


expected survival rate.
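
       As a minimal sketch of the quantities just described, the Python
fragment below averages individual survival probabilities to obtain an
expected survival rate (the exact method), then forms the relative survival
rate and its approximate standard error.  The function names and all numerical
values are hypothetical and serve only to illustrate the arithmetic; they are
not taken from Ederer, et al.

```python
def expected_survival_exact(individual_probabilities):
    """Average the life-table survival probabilities of the individual patients."""
    if not individual_probabilities:
        raise ValueError("at least one patient is required")
    return sum(individual_probabilities) / len(individual_probabilities)


def relative_survival(observed_rate, se_observed, expected_rate):
    """Return the relative survival rate and its approximate standard error."""
    if expected_rate <= 0:
        raise ValueError("expected survival rate must be positive")
    # Both the rate and its standard error are divided by the expected rate.
    return observed_rate / expected_rate, se_observed / expected_rate


# Hypothetical five-year survival probabilities from a general-population
# life table, one per patient, matched on age, sex and race.
expected = expected_survival_exact([0.95, 0.90, 0.85])
# Hypothetical observed five-year survival rate and its standard error.
relative, se_relative = relative_survival(0.45, 0.05, expected)
```

Here the expected rate is treated as a known constant, which is why the
standard error of the ratio is simply the observed standard error rescaled.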



       Epstein and Sobel [1953], in their article "Life Testing," assume
that times-to-death have an exponential distribution with mean  $\theta$.
The estimator

        $$\hat{\theta}_{r,n} = \frac{x_1 + x_2 + \cdots + x_r + (n-r)\,x_r}{r} ,$$

based on the first  r  out of  n  possible failures, is shown to be the
maximum likelihood estimate and to be unbiased, minimum variance, efficient
and sufficient, with density

        $$\frac{1}{(r-1)!} \left( \frac{r}{\theta} \right)^{r} y^{r-1} e^{-ry/\theta}
          \qquad \text{for} \quad y > 0 .$$
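
       The estimator based on the first  r  of  n  failures can be sketched
in Python as follows.  The sketch assumes the estimator is the total time on
test divided by the number of observed failures, that is, the sum of the  r
observed failure times plus  (n - r)  times the largest of them, all divided
by  r.  The function name and the data are hypothetical.

```python
def epstein_sobel_estimate(failure_times, n):
    """Estimate the exponential mean from the first r failures among n items."""
    x = sorted(failure_times)
    r = len(x)
    if r == 0 or r > n:
        raise ValueError("need 1 <= r <= n observed failures")
    # Total time on test: the observed failure times plus the (n - r) items
    # that were still alive when the r-th failure occurred.
    total_time_on_test = sum(x) + (n - r) * x[-1]
    return total_time_on_test / r


# Hypothetical life test: 10 items on test, stopped at the third failure.
theta_hat = epstein_sobel_estimate([2.0, 5.0, 9.0], n=10)
```

With these illustrative data the total time on test is  16 + 7(9) = 79,  so
the estimate is  79/3.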

-------
        Fix and Neyman [1951]  in their paper entitled "A Simple Stochastic

 Model of Recovery,  Relapse, Death and Loss  of Patients," have found that

 cancer studies on either the  same treatment applied to different  categories

 of patients or of different treatments applied to a specific  category of

 patients,  usually must be based on small  numbers of observations.   The  first

 step  towards the  development  of statistical tests must consist in building

 a  stochastic model  of the phenomena studied.   The model must  involve simpli-

 fications.   The following four  states  are defined:


        $S_0$  =  initial state;

        $S_1$  =  dead from cancer or operative death;

        $S_2$  =  recovered from cancer or leading normal life; and

        $S_3$  =  lost after recovery, either by death due to other causes
                  or inability to trace the patient.


A  state diagram showing the possible transitions is given.  The probability
of transition from state  i  to state  j  over the time interval,
$Q_{ij}(t_1, t_2)$,  for a patient in state  i  at time  $t_1$  and in state
j  at time  $t_2$,  is defined.  The

rate  intensities
                                         for    k * °  or
are defined and only one transfer in the time interval is assumed.  The

probabilities are assumed constant in small time intervals.  The solutions

for  $Q_{ij}(t)$,  where  $t = t_2 - t_1$,  are obtained and the estimation
problem is discussed.  The notion of the expected length of normal life,
state  $S_2$,



following a given treatment is introduced and calculated.  A numerical cal-



culation shows that different intensities can produce essentially the same



death rates, but different normal life lengths.




       A study of the "Maximum-Likelihood Estimation of the Parameters of



the Gompertz Survival Function," is undertaken by Garg, et al [1970].  They



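define the Gompertz distribution survivor function as

        $$R(t) = \exp\left[ -\frac{k}{\alpha} \left( e^{\alpha t} - 1 \right) \right]$$

and hazard rate as

        $$\lambda(t) = k e^{\alpha t} .$$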





A shift in the time scale keeps the same functional form, but with a new



constant for  k.
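
       The remark about a shift in the time scale can be checked numerically.
The Python sketch below assumes a Gompertz hazard of the form  $k e^{\alpha t}$,
where the symbol  $\alpha$  for the rate parameter is an assumption, and
obtains the survivor function by exponentiating minus the integrated hazard.
All parameter values and function names are hypothetical.

```python
import math


def gompertz_hazard(t, k, alpha):
    return k * math.exp(alpha * t)


def gompertz_survivor(t, k, alpha):
    # Obtained by integrating the hazard from 0 to t and exponentiating.
    return math.exp(-(k / alpha) * (math.exp(alpha * t) - 1.0))


def shifted_constant(k, alpha, shift):
    """Constant that replaces k when the time origin is moved to `shift`."""
    return k * math.exp(alpha * shift)


k, alpha, shift, t = 0.02, 0.1, 5.0, 3.0   # hypothetical values
k_new = shifted_constant(k, alpha, shift)
hazard_gap = abs(gompertz_hazard(t + shift, k, alpha)
                 - gompertz_hazard(t, k_new, alpha))
# Survival beyond shift + t, given survival to shift, is again Gompertz.
conditional = (gompertz_survivor(t + shift, k, alpha)
               / gompertz_survivor(shift, k, alpha))
survivor_gap = abs(conditional - gompertz_survivor(t, k_new, alpha))
```

Both gaps are zero up to rounding, which is the sense in which the functional
form is preserved with a new constant in place of  k.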




       The maximum likelihood equation for grouped data is given and its



solution obtained by an iterative method.  The empirical covariance matrix



of the maximum likelihood estimates is derived.  The estimation is applied



to grouped data from a study on the mortality of mice, used to investigate



the effect of oral contraceptives.  A chi-square value is given, but it is



not clear what was tested; it is stated that the distribution described the



mortality quite well but no justification is given.  Maximum likelihood
estimation (MLE) appears to give a better fit than the least squares values.

-------
        The use and advantage of using the hazard rate  is  described by Gehan

 [1969]  in his  paper "Estimating Survival  Functions  from the  Life  Table."

 The  author shows  that  the  log normal survivorship function,  although  used  in

 medical follow-up studies, has  a hazard rate that is inappropriate for most

 medical situations.  Methods of estimating the hazard  rates, based on a par-

 tition  of the  time scale,  are reviewed.   Approximations to the variance of

 the  estimates  of  the hazard  rate and probability density  functions are de-

 rived.   A new  statistic, the median  remaining life-time,  is proposed  and an

 approximation  to  its variance is obtained.


        In developing "An Approximating Function  for the Hazard Rate," Ghare

 and  Kim [1970]  define  the hazard rate as:
             h(t)  -  -*±—  +  cet(d/e),      0 0  and  b,c,e > 0.  This hazard rate is used to approximate the

bath-tub hazard shape.  The parameters are estimated by a non-linear least

squares fit of  h(t)  with the empirical hazard rate determined by an arbi-

trary partition of the time period.  Two examples are given, the first based

on  1,000  failure times of B-52 aircraft and the second based on
time-to-failure of  480  fuses.  Neither example, however, exhibited a bath-tub

hazard rate.


       Grenander [1956], in his paper "On the Theory of Mortality Measure-

ment," discusses some of the problems encountered in this type of study.

-------
Among others, the more basic problems that arise in biometric mortality



studies are the small size of the sample and also that of choosing the best



method, among the various methods available, for measuring mortality.  The



author suggests choosing a computationally simple method which has a rela-



tively high efficiency, as measured by its mean square error.




       Denote by  $\ell_x$  the probability at the time of birth for a person
belonging to the population under study, of still being alive at age  x.
$\ell_x$  is normed so that  $\ell_0 = 1$.  The mortality intensity  $\mu_x$
(also known as the conditional failure rate, failure rate, or hazard rate) is
then defined, as usual, by

                    $$\mu_x = -\frac{\ell_x'}{\ell_x} ,$$

where  $\ell_x'$  denotes the derivative of  $\ell_x$  with respect to  x.




       In Chapter 1 of Grenander's paper, he discusses some of the usual



methods (maximum likelihood, method of moments and its variations and the



King-Hardy method) for estimation of the so-called Makeham model where
$\mu_x$  is given by

                    $$\mu_x = \alpha + \beta e^{\gamma x} .$$
Some tests for differential mortality (i.e., whether or not the mortality



rates in two populations are the same against appropriate alternatives) are

-------
 also considered in this chapter.




        In Chapter 2, the author deals with mortality measurement when the
 analytic form of the mortality intensity  $\mu_x$  is not known and,
 therefore, no particular form is assumed.  Estimates and their properties
 are studied for the



 probabilities of death, reverses,  etc.




        The situation,  when we have some information about the general form



 of the mortality curve, although we cannot express this knowledge as a
 parametric representation for  $\mu_x$,  is considered in Chapter 3.  For
 example,


 when the  frequency function  is  known to be nonincreasing,  it  is  shown that



 the maximum likelihood estimate is given by the  largest convex minorant.   A



 similar result  was shown to  hold  for a nondecreasing mortality intensity.




        Finally, in Chapter 4, the results of numerical computations on the
 efficiencies of various methods of estimation are presented.  The question
 of how to make a compromise between efficiency and ease of computation was
 left open to further research.




        "Maximum Likelihood Estimation  in Truncated  Samples,"  is the  subject



 of Halperin's [1952] paper.   The maximum likelihood estimation of parameters



 from a censored sample  of  r out  of  n  ordered observations  is considered.



 The  following regularity assumptions are required for the main theorem, which



 states that maximum likelihood estimators  from the  joint density function



 behave the same as those from independent and identically distributed
 samples.

-------
       The  following Regularity Assumptions are made on Density



       ,...^  ):
c> In f(x,er...,e )      In
A.
exist a.e.
)
                                                       In
  and
                                                             >l
B.
          df
                                 XT.
ft
»l
< V*> ,
y In FU^,
...;o )
" 5
i
                                                              Ht(x)
where  F  .(x),  F2.(x),   F...(x.)  are integrable over  (-00,00)  and
                  _ 00
                                  .,...,Op)dx
and  M,.  is independent of  6-,....©
      Ji                       J-      T
C.  The matrix with elements  A,
                                                              dx
               +  —
                                       ,,...
                                       •1
                                                  dx
                      dx

-------
 is positive definite for all  9,,...,0,  where  ^  is defined by



                               r~*

                        q  «  /     f(x,e.,...,0 )dx
                             J «       J-      P
                               — 00 •
 and  q  is such that  M- = [qn].




 D.   t(x,Q ,...,9 )  is continuous at  ^ = x  and its derivative is continu-


 ous with
                In f (x)          In f (x)       o  In f(x)

                     ~
 continuous  at   x
E.   I/(F(X,S. ,...,0 ))   is bounded independently of  0. ,   1 < i < p,   by
            J.      p                                  i      —   —

I.(x)  and  Ed^XjO., ...,Q  )) < I.  when  I  is independent of  6..   Under


these regularity assumptions, Halperin proves:
       THEOREM :  The likelihood equations




              S In
                                                    i
-------
            a multivariate normal limit law;  and

        3)   the concentration ellipsoid, corresponding to the covariance
            matrix of the limiting distribution of
            $\sqrt{n}\,(\hat{\theta} - \theta)$,  is identical



            with
       A study of the "Timing of the Distribution of Events Between
Observations" is made in the paper by Harris, et al [1950].  They have
determined that the rate at which a disease occurs (the example considered
is tuberculosis) can be estimated from knowledge of the time of last
diagnosis and the time of first diagnosis.  Lost cases are accounted for.
Two cases are considered:  the constant rate  p,  and stepped rates  $p_i$.
Letting  n



be the date of loss, that is,  n is  the last date before occurrence, and



n + m  be the  first date after occurrence, then a first approximation to the



rate is given by
                                    S'l
where  $\Sigma$  indicates the summation over all cases and  $\Sigma'$  indicates summation



over eventful cases.  In the stepped case

-------
where  f   depends on the estimated  p.,






                                     pk\
                            f,
       Kale [1962] presented a paper on "The Solution of Likelihood Equations
by Iteration Processes.  The Multiparametric Case."  The metric

                    $$b(A) = \max_{i,j} |\lambda_{ij}|$$

for a matrix  $A = (\lambda_{ij})$,  $1 \le i \le k$  and  $1 \le j \le k$,  is
used to prove that as the sample size approaches infinity and under
regularity conditions on the density, the iterative methods of solving
maximum likelihood equations converge.  A consistent estimate of the
parameters must be used as the starting values.  The general iterative
method is
where
                                     - G(Tp) A(Tr)
and

-------
                   0(0)
                                 log L
log L


cek
and A  is  a matrix*   It  is  shown that  if

                       b   •—   < ^ *     6 
-------
           number of deaths in the  $i$th  interval,  $y_i$,  are recorded; and

       2)  the  $y_i$  are preassigned and the boundaries of the intervals,
           $x_i$  to  $x_{i+1}$,  are observed.

       The classical estimate of mortality at the midpoint of the  $i$th
interval is
                     ,co  .  ifi  ._!£..

                      *4               i I Iff -i.  IT
                       JL               o \*»j  •*  •"
where  $T_i = x_{i+1} - x_i$  is the interval width and  $N_i$  is the number
alive at time  $x_i$.  Another estimate is
                           +    T   .      log
                      v       94      T   J-IT>.
                      Xi      2  i      T±
A numerical study of these estimates showed the bias to be a function of all



the parameters, assuming a Gompertz mortality function.



       For the second type of sampling,  the estimate used is
and it is unbiased for  M-.   if  M-  + T » M-^   for  0 < i < T,  i.e., constant
                         JL        ***4                  ~~


mortality in the interval.  If the mortality increases linearly over the
interval, the estimate

where  $A_i$  and  $B_i$  are complicated functions of the statistics, is
proposed.




       The four estimates are applied to statistics on time- to- death of rats



exposed to radiation.  Differences of up to  8  percent between the four



estimates are found.  Comparison due to different partitions is not dis-



cussed.  Plots of  p.     and  M-     are given and show several dips.





       In his paper, "A New Response Time Distribution," Kodlin [1967] gives
an increasing mortality rate,  $\mu(t)$,  as

                    $$\mu(t) = c + kt$$

with probability density function,  f(t),  as

        $$f(t) = (c + kt)\, e^{-(ct + kt^2/2)} ,  \qquad  t, c, k > 0 ,$$






for the time-to-death of an individual under a specific treatment.  The mean



and variance are obtained.  The maximum likelihood equations are given for



grouped data with censored trials, using:  n = number of individuals in a
sample;  $d_i$ = number dying at  $t_i$;  $s_i$ = number alive at  $t_i$;
and  T = time of loss or end of follow-up.
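
       The relation between a linear mortality rate and its density can be
illustrated with a short Python sketch.  It assumes the rate  c + kt  and
obtains the survivorship function as the exponential of minus the integrated
rate,  $ct + kt^2/2$;  the density is then the product of the two.  Parameter
values, function names and the crude midpoint integration are all
hypothetical choices made for illustration.

```python
import math


def kodlin_hazard(t, c, k):
    return c + k * t


def kodlin_survivor(t, c, k):
    # Exponential of minus the integrated hazard, c*t + k*t^2/2.
    return math.exp(-(c * t + 0.5 * k * t * t))


def kodlin_density(t, c, k):
    return kodlin_hazard(t, c, k) * kodlin_survivor(t, c, k)


def integrate(func, lower, upper, steps=20000):
    """Midpoint-rule integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


c, k, horizon = 0.1, 0.05, 5.0   # hypothetical parameter values
probability_of_death = integrate(lambda t: kodlin_density(t, c, k), 0.0, horizon)
closed_form = 1.0 - kodlin_survivor(horizon, c, k)
```

The numerically integrated density agrees with one minus the survivorship
function at the horizon, as it should for any proper lifetime distribution.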

-------
        The model is  applied to  121  breast cancer patients to compute the


 excess  risk by using only those that die from causes other than cancer.  The


 overall mortality is decreasing and cannot be used with the model.



        A  second  model for mortality,  Mt),  is proposed
                     Mt)  -  -   ±-  -  + c + kt

                              8 e  * +1-8
as are some "crude" estimators.



       Interpretation of  f(t)  for various parameters is given, as applied


to medical follow-up studies.




       Mantel  [1966] proposes the overall comparison of two survival func-


tions, requiring a value function for rating the duration of survival.  A



chi-square procedure is proposed with an implicit value function.  The study


begins with a population of  N.^  homogeneous patients.  In the  i    inter-



val of the study,  r^ .  die  and  t..  are lost track of.  The estimated prob-


ability of surviving to the  J    period, then, is
                                   N   - r
                                    li   rli
where  N     - R^ - rn
       A chi-square test, based on a contingency table is proposed.

-------
        In their paper, "Life Tests Under Competing Causes of Failure and the



Theory  of Competing Risks," Moeschberger, et al [1971] use the nonnegative



random variables,  $y_i$,  to denote the lifetime of a subject due to
cause-of-failure  $C_i$,  i = 1,...,k.  It is assumed that  $y = \min_i y_i$
and the corresponding  $C_i$  are observed.  Parametric distributions are
assumed; in particular, a Weibull distribution with ungrouped data is
considered.  Results with this method are smoother than similar tests made
using nonparametric or partially nonparametric methods.




        Murthy [1965a] in a paper entitled "Estimation of Probability Density,"



defines a function  k(x)  as a window, if




        1)  $k(x) \ge 0$;

        2)  $k(x) = k(-x)$;

        3)  $\lim_{|x| \to \infty} x\,k(x) = 0$;  and

        4)  $\int_{-\infty}^{\infty} k(x)\,dx = 1$.

It is shown that the estimate

        $$f_n(t) = \int_{-\infty}^{\infty} B_n\, k(B_n(x - t))\, dF_n(x) ,$$

where  $F_n(x)$  is the empirical distribution and  $B_n$  is a sequence of
nonnegative constants with  $B_n \to \infty$  and  $B_n/n \to 0$  as
$n \to \infty$,  is consistent at a point of continuity of  f(x).
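
       A window estimate of this kind can be sketched in Python.  Because
the empirical distribution places mass  1/n  on each observation, the
integral is taken here to reduce to the average of  $B_n k(B_n(x_i - t))$
over the sample.  The standard normal density is used as the window, which is
one admissible choice and an assumption of this sketch; the sample and the
scale constant are hypothetical.

```python
import math


def gaussian_window(x):
    """Standard normal density: nonnegative, symmetric, integrates to 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def window_density_estimate(t, sample, b_n, window=gaussian_window):
    # Integrating against the empirical distribution puts mass 1/n on each
    # observation, so the integral reduces to an average.
    return sum(b_n * window(b_n * (x - t)) for x in sample) / len(sample)


sample = [1.0, 2.0, 2.5, 4.0]   # hypothetical observations
b_n = 2.0                        # hypothetical scale constant
step = 0.01
grid = [-10.0 + step * i for i in range(2501)]   # covers -10 to 15
# Riemann sum of the estimate over the grid; it should be close to 1.
total_mass = step * sum(window_density_estimate(t, sample, b_n) for t in grid)
```

A larger  $B_n$  narrows each window, which reduces bias but increases
variance; the conditions on  $B_n$  balance the two as  n  grows.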




       In another paper reviewing the "Estimation of Jumps, Reliability and

-------
Hazard Rate," Murthy [1965b] defines the reliability estimator  $R_n(t)$
as having the large sample mean,  R(t),  and variance  R(t)(1 - R(t)),  and
as being asymptotically normal.  The hazard rate estimator is defined as

                    $$Z_n(t) = \frac{f_n(t)}{R_n(t)} ,$$

where  $R_n(t) = \frac{1}{n}$  times the number of observations greater than
t  among  $x_1, \ldots, x_n$,  and where  $f_n(t)$  is as defined by Murthy in
[1965a] and is

shown for large samples to be consistent.  The estimator
                         «?<*>  '  WTtT
                                    n


is shown to be consistent and asymptotically normal.


       Murthy and Swartz [1971] in their report "Contributions to Reliabil-

ity," present various realistic models for life distributions in Chapter IV

that correspond to density functions and that can be expressed in closed

form.  Some methods of estimation, using order statistics, are also

suggested.


       Two factors are considered in Sampford's [1952] paper on the estima-

tion of response-time distributions.  They are:

-------
                       I.   fundamental concepts  and general methods,  and


                      II.   multi-stimulus distributions.



                       A normal distribution of the response time is assumed and the
                experiment is carried through to completion.  Maximum likelihood methods
                are developed for grouped data.  A second part considers estimation of
                parameters


                in the distribution of response  time when  two causes  of  death are operating


                on the subject.   The main model  assumes that the joint action of the two


                causes of death is independent and the density of the primary cause is normal.


                Various densities for  the secondary cause  of death are considered, including


                those  of the exponential and normal types.




                       Swartz, in his paper "The Mean Residual Lifetime Function" [1973],


                establishes  conditions and properties on a function which characterize that


                function as  a mean  residual lifetime function.



                       The elementary  properties of hazard functions  are discussed by Watson


                and Leadbetter in "Hazard Analysis,  I" [1964].  Estimation of hazard rates


                using windows is presented with the estimator being

        $$h_n^*(x) = \sum_{r=1}^{n} \frac{\delta_n(x - X_{(r)})}{n - r + 1}$$

and

    $$E[h_n^*(x)] = \int_0^\infty \delta_n(x - y)\, h(y)\, dy
                    - \int_0^\infty \delta_n(x - y)\, F^n(y)\, h(y)\, dy ,$$

with an expression for  Var $(h_n^*)$  given also.  $\delta_n$  is a window
function.

Some less  sophisticated estimators are given:  a graphical derivative method,

a histogram estimator, and the classical actuarial estimator.  Death dates

on mice, exposed to gamma radiation, are used to obtain hazard estimates by

various methods.
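
        A window estimator of the hazard rate can be sketched in Python under
the assumption that it has the form
$h_n^*(x) = \sum_{r=1}^{n} \delta_n(x - X_{(r)}) / (n - r + 1)$,  where
$X_{(r)}$  is the  r-th  ordered death time.  The Gaussian window, its width,
the function names and the death times are hypothetical choices made for
illustration and are not taken from Watson and Leadbetter.

```python
import math


def delta_window(u, width):
    """Gaussian window of the given width; an assumed choice of window."""
    z = u / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))


def window_hazard_estimate(x, sample, width):
    ordered = sorted(sample)
    n = len(ordered)
    # The r-th ordered death contributes a window weighted by 1/(n - r + 1),
    # the reciprocal of the number still at risk just before that death.
    return sum(
        delta_window(x - obs, width) / (n - r + 1)
        for r, obs in enumerate(ordered, start=1)
    )


death_times = [4.0, 1.0, 2.5, 2.0]   # hypothetical, complete sample
step = 0.01
grid = [-10.0 + step * i for i in range(2501)]   # covers -10 to 15
total_hazard = step * sum(window_hazard_estimate(x, death_times, 0.5) for x in grid)
```

Since each window integrates to one, the estimate integrates to
1/4 + 1/3 + 1/2 + 1 = 25/12  for four deaths, which is the sum of the
reciprocals of the numbers at risk.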


        Zelen [1966] explores the role that statistics can play through the
increased use of probability models to guide the researcher in the types of
data to collect, to suggest further experiments, and to serve as a frame of
reference.  The time-to-failure is assumed to have a survival function

        $$S(t) = \begin{cases} e^{-\lambda (t - G)} , & t \ge G \\
                               1 , & t < G , \end{cases}$$

where  G  is the guarantee time.  The mean is  $G + (1/\lambda)$.  This model
is applied to animal tumor systems and acute leukemia, to the destruction of
tumor cells with laser energy, and to the analysis of survival data with
concomitant information.  In the last case, it is assumed that
time-to-failure has density

        $$f_i(t) = \lambda_i e^{-\lambda_i t}  \qquad  \text{for all}  \quad  t > 0$$

for the  $i$th  patient, and that  $1/\lambda_i = a + b x_i$,  where  $x_i$  is
the concomitant variable, such as the logarithm of white blood count, as in
the sample subjects with acute myelogenous leukemia.

-------
                                  CHAPTER III
                    Realistic Models for Mortality Rates and
                               Their Estimation
 Introduction

         In this chapter mathematical models  are advanced which,  as  special
 cases,  represent constant,  increasing and decreasing mortality rates, along
 with combinations of these  properties.   Bath-tub shaped mortality rate  curves
 are the general shape of these models.   The  first part  of the  tub corresponds
 to infantile mortality,  the second part  (more or less,  constant  mortality)
 corresponds to useful life,  and  finally,  the last part,  which  is increasing
 corresponds to decay,  aging, etc., culminating  in death. Their  corresponding
 probability distributions and  survivorship functions are obtained in closed
 form.   To  the best of our knowledge,  the  closed form representations of sur-
 vivorship  functions  corresponding to the wide class  of  forces  of mortality,
 considered herein are  new to the literature  relevant to  mortality rates and
 their estimation.  The variety of mortality  shapes derived are illustrated
 and applied.   Estimation is performed by  several methods:  empirical, various
 window-smoothing,  and  maximum  likelihood techniques.

 Estimation of the  Hazard Rate  Function

        Observed times-to-failure can be used to  estimate the hazard rate
 function (mortality rate  or risk will also be used,  interchangeably).  Non-
parametric methods for estimating hazard rates were  considered by Murthy [196
and Grenander  [1956].  However, the most they can do is to throw light on the
shape of the underlying hazard rate function and  cannot be used for decision-
making and prediction.  Thus, parametric procedures based on realistic models

-------
for the hazard rate  function are the only suitable approaches  for estimation



and prediction.   It  is generally accepted that the most general hazard rate



curve consists of an early decreasing phase, followed by a section of constant



failure (corresponding to the useful life period), followed by a rapidly  in-



creasing failure  rate as death approaches.





Theory





        Let the nonnegative random variable,  T,  be the time to death of the
subject.  Let  T  have a cumulative distribution function (cdf)

                    $$F(t) = P(T \le t)$$                                   (1)

and assume that  T  has a mean,  $\mu$,  variance,  $\sigma^2$,  and a
probability density function

                    $$f(t)\,dt = dF(t) .$$                                  (2)

The mortality rate,  $\lambda(t)$,  is the instantaneous chance of dying at
time  t,  given that the subject has survived to time  t,  and is defined as

                    $$\lambda(t) = \frac{f(t)}{R(t)} ,$$                    (3)

where

                    $$R(t) = 1 - F(t)$$                                     (4)

is the survivorship function.  That  $\lambda(t)$  is a mortality rate can be
seen by



finding the probability of death in the time interval  $(t, t + \Delta t)$,
where  $\Delta t$  is a small positive time increment, given survival to time
t.  We have from the definition of conditional probability that

    $$P(t < T \le t + \Delta t \mid T > t)
        = \frac{P(t < T \le t + \Delta t)}{P(T > t)}
        = \frac{F(t + \Delta t) - F(t)}{R(t)} .$$                           (5)

By the mean value theorem of calculus,
$F(t + \Delta t) - F(t) = f(\xi)\,\Delta t$  for some  $\xi$  between  t  and
$t + \Delta t$,  hence

    $$P(t < T \le t + \Delta t \mid T > t) = \frac{f(\xi)}{R(t)}\,\Delta t ,
        \qquad  t < \xi < t + \Delta t .$$                                  (6)

In the limit, as  $\Delta t \to 0$,  the conditional probability is
$\lambda(t)\,dt$.

        The mortality rate also uniquely defines the survivorship function,
as can be seen by writing the difference equation for the probability of
death by time  $t + \Delta t$:

        $$F(t + \Delta t) = F(t) + [1 - F(t)]\, \lambda(t)\, \Delta t .$$   (7)

The right-hand side is the sum of the probabilities of two mutually exclusive
events,
        1)  death by time  t  with probability  F(t);  and
        2)  no death until time  t  and death in the interval
            $(t, t + \Delta t)$,  with probability
            $[1 - F(t)]\, \lambda(t)\, \Delta t$.

Subtracting  F(t)  from both sides, dividing by  $\Delta t$  and taking the
limit as  $\Delta t \to 0$  gives

        $$\frac{dF(t)}{dt} = [1 - F(t)]\, \lambda(t) ,$$                    (8)

which can be written as

        $$\lambda(t) = -\frac{d}{dt} \ln R(t) .$$                           (9)

Solving this differential equation for  R(t)  and using the initial
conditions  R(0) = 1  and  $R(\infty) = 0$,  we obtain

        $$R(t) = e^{-\int_0^t \lambda(x)\,dx} .$$                           (10)

The integral  $\Lambda(t) = \int_0^t \lambda(x)\,dx$  is also called the
accumulated, or cumulative, hazard by time  t;  its derivative
$\lambda(t)$  is the hazard rate.
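
        The passage from a mortality rate to its survivorship function can be
illustrated numerically.  The Python sketch below approximates the cumulative
hazard by a midpoint rule and exponentiates its negative, following the
relation  $R(t) = \exp(-\int_0^t \lambda(x)\,dx)$.  The function names, the
two example rates and the integration scheme are hypothetical choices made
for illustration.

```python
import math


def cumulative_hazard(hazard, t, steps=10000):
    """Midpoint-rule approximation to the integral of hazard over [0, t]."""
    width = t / steps
    return width * sum(hazard((i + 0.5) * width) for i in range(steps))


def survivorship(hazard, t, steps=10000):
    """Survivorship function: exp of minus the cumulative hazard."""
    return math.exp(-cumulative_hazard(hazard, t, steps))


# Constant hazard 0.2: the survivorship function is exponential.
r_constant = survivorship(lambda x: 0.2, 5.0)
# Linearly increasing hazard 0.1 + 0.05 x: the cumulative hazard at
# t = 4 is 0.1 * 4 + 0.025 * 16 = 0.8.
r_linear = survivorship(lambda x: 0.1 + 0.05 * x, 4.0)
```

Any positive rate whose integral grows without bound can be passed to
`survivorship` in the same way, although closed forms are preferable when
they exist.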




        Another interpretation of the mortality rate is to compute the ratio
of expected deaths in the interval  $(t, t + \Delta t)$  to the expected
number of subjects alive at time  t.  Let  n(t) = N R(t)  be the expected
number alive at time  t  from an original population of  N = n(0),  then

        $$n(t + \Delta t) - n(t) = N [R(t + \Delta t) - R(t)] ;$$           (11)

dividing both sides by  $\Delta t$,  and taking the limit as
$\Delta t \to 0$,  we have

    $$\frac{dn(t)}{dt} = -N f(t) = -N \lambda(t) [1 - F(t)] = -n(t)\, \lambda(t)$$   (12)

or

        $$\lambda(t) = -\frac{1}{n(t)} \frac{dn(t)}{dt} .$$                 (13)

This differential equation may be solved for  n(t),  producing

        $$n(t) = N e^{-\int_0^t \lambda(x)\,dx} .$$                         (14)

        The mortality rate is sometimes referred to as a conditional density;
 however,  $\lambda(t)$  is not a density since
 $\int_0^\infty \lambda(x)\,dx = \infty$.  $\lambda(t)$  is, in fact,
 $f(x \mid T > t)$,  which is a conditional density, evaluated only at
 x = t;  it is not a conditional density as  t  varies.


        We are concerned in this chapter with selecting a mortality function

 that models a medical follow-up  study and that also can be integrated to give

 a survivorship function in closed  form.  Murthy and Swartz [1971] show that

 any function,  h(x),  which is positive  and  for which


                    \lim_{t \to \infty} \int_0^t h(x)\,dx                        (15)


 is unbounded, can be used as a mortality rate which will give a unique surviv-

 orship function


                    R(t) = e^{-\int_0^t h(x)\,dx} .                              (16)


Three models are discussed in the following section.



Realistic Models of the Mortality Rate


        MODEL I:  This class of distributions is characterized by the hazard
rate

            \lambda(t) = \frac{a}{1 + bt} + c\,d\,t^{d-1}, \qquad t \ge 0        (17)
where the parameters  a, b, c ≥ 0  and  d ≥ 1.  Some special cases of this
model are:  when  c = 0,

            \lambda(t) = \frac{a}{1 + bt} ,  a decreasing function of time.      (18)


The mortality rate reduces to a constant when  a = 0,  d = 1,  or  c = 0,
b = 0,  and we have an exponential distribution.  When  b = 0,  d = 2,  we have
the linear rate used by Kodlin [1967] to model death from breast cancer.

Finally,



            \lambda(t) = c\,d\,t^{d-1} \quad \text{when} \quad a = 0 ,           (19)


corresponds to a two-parameter Weibull distribution.


        The cumulative distribution function for Model I is

            F(t) = 1 - \frac{e^{-c t^{d}}}{(1 + bt)^{a/b}}                       (20)

and the density function is

   f(t) = \left( \frac{a}{1 + bt} + c\,d\,t^{d-1} \right) \left( \frac{e^{-c t^{d}}}{(1 + bt)^{a/b}} \right) .   (21)
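The Model I formulas (17), (20) and (21) can be sketched in Python as follows. The function names and the parameter values are illustrative assumptions; the survivorship function as written assumes  b > 0.

```python
import math

def model1_hazard(t, a, b, c, d):
    """Hazard (17): a / (1 + b t) + c d t^(d-1)."""
    return a / (1.0 + b * t) + c * d * t ** (d - 1.0)

def model1_survival(t, a, b, c, d):
    """Survivorship 1 - F(t) taken from (20); assumes b > 0."""
    return math.exp(-c * t ** d) / (1.0 + b * t) ** (a / b)

def model1_density(t, a, b, c, d):
    """Density (21), written as hazard times survivorship."""
    return model1_hazard(t, a, b, c, d) * model1_survival(t, a, b, c, d)

# Hypothetical parameter values, chosen only for illustration.
params = dict(a=0.5, b=0.3, c=0.01, d=2.5)
print(model1_hazard(2.0, **params), model1_survival(2.0, **params))
```

Writing the density as the product of the two other functions keeps the three formulas consistent with one another by construction.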

-------
         MODEL II:  This class of distributions has the hazard rate

      \lambda(t) = a\,b\,e^{-bt} + c\,d\,t^{d-1} ; \qquad d \ge 1, \quad a, b, c \ge 0 .      (22)


 The corresponding cumulative distribution function is

            F(t) = 1 - e^{-a(1 - e^{-bt}) - c t^{d}}                             (23)

 and the density is

      f(t) = \left( a\,b\,e^{-bt} + c\,d\,t^{d-1} \right) e^{-a(1 - e^{-bt}) - c t^{d}} .      (24)

 The special case of (24), where  a = 1/b,  may be of interest since it is a
 three parameter family of distributions with the bath-tub property.  This
 class of distribution is given by

            F(t) = 1 - e^{-\frac{1}{b}(1 - e^{-bt}) - c t^{d}}                   (25)

with density function

      f(t) = \left( e^{-bt} + c\,d\,t^{d-1} \right) e^{-\frac{1}{b}(1 - e^{-bt}) - c t^{d}} .      (26)
Note that the two-parameter Weibull and Kodlin's models are also special cases
of this general model.
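The bath-tub behaviour of Model II can be checked numerically with a short Python sketch of (22) and (23). The function names and the parameter values below are assumptions chosen only to produce a visible dip; they are not taken from the report.

```python
import math

def model2_hazard(t, a, b, c, d):
    """Hazard (22): a b exp(-b t) + c d t^(d-1)."""
    return a * b * math.exp(-b * t) + c * d * t ** (d - 1.0)

def model2_survival(t, a, b, c, d):
    """Survivorship 1 - F(t) taken from (23)."""
    return math.exp(-a * (1.0 - math.exp(-b * t)) - c * t ** d)

# Hypothetical parameters giving a bath-tub shape.
params = dict(a=1.0, b=1.0, c=0.01, d=3.0)
early, middle, late = (model2_hazard(t, **params) for t in (0.0, 2.0, 10.0))
```

With these values the hazard starts high, falls as the exponential term dies away, and rises again once the power term dominates.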

-------
        MODEL III:  This class has the hazard rate

      \lambda(t) = \frac{a}{1 + bt} + c\,d\,e^{dt} , \qquad a, b, c, d \ge 0 .   (27)


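The corresponding cdf and density are, respectively, given by

            F(t) = 1 - \frac{e^{-c(e^{dt} - 1)}}{(1 + bt)^{a/b}}                 (28)

and

   f(t) = \left( \frac{a}{1 + bt} + c\,d\,e^{dt} \right) \left( \frac{e^{-c(e^{dt} - 1)}}{(1 + bt)^{a/b}} \right) .   (29)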


When  a = 0  and  d > 0,  λ(t) = c d e^{dt},  which is a monotonically increasing
hazard rate called the Gompertz rate by Broadbent [1958] and applied to the
lifetimes of milk bottles.  This rate is studied by Garg, et al [1970] and is
used to model mortality of mice in oral contraceptive studies.
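A minimal Python sketch of Model III, with the Gompertz rate obtained as the special case  a = 0,  is given below. The function names are assumptions, and the survivorship includes the constant that makes  R(0) = 1;  it assumes  b > 0.

```python
import math

def model3_hazard(t, a, b, c, d):
    """Hazard (27): a / (1 + b t) + c d exp(d t)."""
    return a / (1.0 + b * t) + c * d * math.exp(d * t)

def model3_survival(t, a, b, c, d):
    """Survivorship 1 - F(t) taken from (28); assumes b > 0."""
    return math.exp(-c * (math.exp(d * t) - 1.0)) / (1.0 + b * t) ** (a / b)

def gompertz_hazard(t, c, d):
    """Special case a = 0 of (27); b is then irrelevant and set to 1."""
    return model3_hazard(t, 0.0, 1.0, c, d)
```

Setting  a = 0  removes the decreasing term, so the remaining hazard grows exponentially with  t.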
Application to Medical Follow-Up Studies


        The mortality rates corresponding to many real life situations fall

within the domain of our models.  From the practical point of view, we distinguish
among three categories of mortality rates:  BTR, DHR  and  IHR,

depending on their respective shape.  Category BTR is marked by a decreasing

risk of mortality function, followed by an increasing one.  Category DHR is

purely decreasing,  while that of IHR is increasing.   The classical example of

 a bath-tub hazard curve is the survival data of humans (as well as other
 animals).  It is characterized by a high infant mortality during the first
 year of life, slowly decreasing risk from ages 1 through 4, a low mortality
 rate from ages 5 to around 35, and increasing risk for older ages, as
 shown in Figure 1.




         Usually in statistical analysis of data a cumulative distribution



 function is assumed.   The random variable under study,  in this  case,  the



 time-to-death, can in many  situations be described by a mathematical model



 which also determines the functional  form of the distribution function.



 Zelen [1966] has  given several such models for  death  due  to  specific  types  of



 cancer.  These models lead  him to consider the  exponential distribution.



 Davis [1952] showed that the exponential model gave a good fit in the case of



 thirty nonhuman systems.  Unfortunately,  the exponential  distribution has  a



 constant mortality rate which  may be a  reasonable assumption in the case of



 certain short periods of time.   That it is not  the case for  time periods over



 five years, in medical follow-up studies  on humans, was shown by Berg and



 Robbins [1967].  In the absence of a mathematical model to guide us in the



 selection of a distribution, the concept  of the mortality rate has been



 adopted.  Buckland [1964] discusses many examples of hazard rates that correspond
 to the Weibull, gamma, normal, log normal, and extreme value distributions.
 Boag [1949] considered the Weibull, gamma, log normal and exponential
 distributions in a study on deaths from cancer after treatment.  He



 found the best fit to be given by the log normal distribution; and, for
 a large group,  suggested that  the  survivorship be  modeled by a mixture of c
 percent completely cured and  (l-c)  percent  whose  survival times follow a
 log normal distribution.  Gehan [1969] showed  that the mortality rate of  the
 log normal distribution was  inappropriate for  medical follow-up studies and
 that a bath-tub rate was  necessary.
         The DHR class of distributions characterizes situations where there is
 an initial high risk which decreases with time.  Over a limited time period,
 this might describe survival data  of patients  after  transplants or implanta-
 tions of mechanical devices, besides the usual infantile mortality in a
 "normal" situation.
         The IHR class of distributions fits situations where a very high mortality
 rate is reduced and the patient is given a longer time to live, but is
 not  "cured."  This  class  may be a useful approximation to the mortality rate
 over the   100   odd years  of  human life, because the  high mortality rate of the
 final five years is several  orders of magnitude greater than the initially
 high infant death rate.  Broadbent [1958], Garg, et al [1970], Kodlin [1967]
 and  Grenander [1956] consider this type of hazard.
        To demonstrate the versatility of the three models proposed in this
 chapter, with respect to their ability to describe the various mortality
 rates that  one  encounters in real life, an interactive graphics program was
written so  that a person at the console can select one of the models and the
parameter values.  The corresponding hazard function is displayed on the
 screen.  If desired, increments of the parameters can be typed in, and plots
of these curves are displayed, one after the other, as the parameters increase or
decrease.  Thus, the characteristics of each model can be fully illustrated.
Pictures of some of these plots are shown in Chapter 4 of this report.

[Figure 1.  Typical mortality rate as a function of age.  Horizontal axis:
length of life T (age).  Regions marked along the curve: infantile mortality;
useful life period (Poisson failures); chance and fatigue mortality.]






Life Table Model





        Assume a homogeneous collection of people.  Let the time-to-death of



a person in the group be a positive random variable,  T,  with hazard  z(t)
given by one of our Models I, II or III.  At a particular time, we have recorded  n
times-to-death,  t_1 < ⋯ < t_n,  of the first  n  people to die.  One problem



which arises is to estimate the parameters in the model.  Another is to com-



pare the hazard of one such group with another.





Follow-Up Model





        In this model, a fixed number of people,  n,  all at the same age,
are assumed to have similar mortality rates when observed for a fixed length
of time,  L,  and their times-to-death are recorded as  t_1 ≤ t_2 ≤ ⋯ ≤ t_r,
where  r  is an integer-valued random variable,  r ≤ n.  (If  r < n,  we refer



to the sample as being truncated.)  The problem is to estimate the model para-



meters and to compare the hazard with other groups.




        A probabilistic model of the preceding models is obtained by letting a
positive random variable,  T,  be the time-to-death of an individual and assuming
that the individual times-to-death are independent and identically distributed
random variables with common survivorship,  R(t).  A random sample
of size  n  of the random variable  T  will be denoted by  t_1, ..., t_n.  In the
first model, the times are observed in order, and hence correspond to the
order statistics,  t_(i),  where  t_(i)  is the  i-th  order statistic.  The three
models will be fitted, by a nonlinear least-squares technique, to the estimated
hazard.
        The hazard rate,  λ(t),  defined as

            \lambda(t) = \lim_{h \to 0} \frac{R(t) - R(t + h)}{h\,R(t)} ,

suggests the estimator  Z_n(t):

            Z_n(t_i) = \frac{n_i - n_{i+h}}{h\,n_i} = \frac{m_i}{h\,n_i} ,       (33)

where  n_i = No.(j : t_j > t_i)  is the number of persons alive at time  t_i,
n_{i+h} = No.(j : t_j > t_i + h)  is the number alive at time  t_i + h,  and
m_i = n_i - n_{i+h}  is the number who die in the time interval  (t_i, t_i + h).
Because the ratio of random variables is involved, it has not been possible to
evaluate the expected value of this estimator.
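The estimator (33) amounts to two counts and a ratio, as the following Python sketch shows. The function name and the zero-based index are assumptions made for illustration; returning NaN when nobody survives past  t_i  is also a choice made here, not one stated in the report.

```python
def counting_hazard(times, i, h):
    """Z_n(t_i) = (n_i - n_{i+h}) / (h * n_i) from (33); i is a 0-based index."""
    t_i = times[i]
    n_i = sum(1 for t in times if t > t_i)        # alive just after t_i
    n_ih = sum(1 for t in times if t > t_i + h)   # alive at t_i + h
    if n_i == 0:
        return float("nan")  # no one at risk, so the ratio is undefined
    return (n_i - n_ih) / (h * n_i)

# Hypothetical data: six times-to-death, window h = 2.5.
example = counting_hazard([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 0, 2.5)
```

In this example five subjects outlive the first death and three of them are still alive 2.5 time units later, so two deaths are attributed to the interval.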




        Another estimator is provided by the definition,

            \lambda(t) = -\frac{d}{dt} \ln R(t) ,                                (34)

            \lambda(t) = -\lim_{h \to 0} \frac{\ln R(t + h) - \ln R(t)}{h} .     (35)

This suggests the estimator, which we call the empirical estimator:

            Z_n(t_i) = -\frac{\ln n_{i+h} - \ln n_i}{h} .                        (36)

For small  m_i,  this estimate approaches  Z_n(t_i)  of formula (33).  The
following program segment will compute  Z_n(t_i)  for time-to-death dates, in
days, stored in order in array  T(I),  with  TOT = n_i  and  SUM = n_{i+h}.
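C     EMPIRICAL HAZARD ESTIMATE (36).  T(1..N) IS SORTED, H IS THE WIDTH.
                            DO 100 I=1,N
C     TOT COUNTS THOSE ALIVE AFTER T(I), SUM THOSE ALIVE AFTER T(I)+H.
                            TOT=0
                            SUM=0
                            DO 200 J=I,N
                            IF(T(J).GT.T(I)) TOT=TOT+1
                       200  IF(T(J).GT.(T(I)+H)) SUM=SUM+1
C     AVOID DIVISION BY ZERO WHEN NO ONE SURVIVES PAST T(I)+H.
                            IF(SUM.LE.0.) SUM=1
                       100  ZN(I)=ALOG(TOT/SUM)/H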
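For readers without a FORTRAN compiler, the same computation can be sketched in Python. This is a transliteration under assumptions: the function name is hypothetical, and the NaN returned when  TOT  is zero (the last observation) is added here because the logarithm of zero is undefined; the original segment has no such guard.

```python
import math

def empirical_hazard(times, h):
    """Empirical estimator (36) for sorted times; mirrors the FORTRAN loop."""
    n = len(times)
    z = []
    for i in range(n):
        # Only later observations can exceed times[i] because the data are sorted.
        tot = sum(1 for j in range(i, n) if times[j] > times[i])
        alive_later = sum(1 for j in range(i, n) if times[j] > times[i] + h)
        if alive_later <= 0:
            alive_later = 1  # same guard as IF(SUM.LE.0.) SUM=1
        z.append(math.log(tot / alive_later) / h if tot > 0 else float("nan"))
    return z

# Hypothetical data: six times-to-death, window h = 2.5.
z = empirical_hazard([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2.5)
```

The guard on the later count means that estimates near the end of the sample are biased toward zero, which is worth keeping in mind when reading the right-hand tail of a plot.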
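        We compute the expected value of the empirical estimator as

      E Z_n(t_i, h) = -\frac{1}{h} \left[ E \ln n_{i+h} - E \ln n_i \right] .    (37)

Since  n_i  has a binomial distribution,

      P(n_i = k) = \binom{n}{k} R^{k}(t_i)\,[1 - R(t_i)]^{n-k} ,                 (38)

we have

      E Z_n(t_i, h) = -\frac{1}{h} \sum_{k=1}^{n} \ln k \binom{n}{k} \left( G(t_i + h) - G(t_i) \right) ,   (39)

where

      G(t) = R^{k}(t)\,[1 - R(t)]^{n-k} .                                        (40)

        The expected value, as  h  is made arbitrarily small, is

      E Z_n(t_i) = \lim_{h \to 0} E Z_n(t_i, h) = -\sum_{k=1}^{n} \ln k \binom{n}{k} G'(t_i) .   (41)

Evaluating  G'(t),  we obtain

      E Z_n(t_i) = \frac{f(t_i)}{R(t_i)}\; b(n, t_i, \theta_1, \ldots, \theta_p)   (42)

where  b(·)  is a bias factor that depends on  n,  t_i  and the parameters
θ_1, ..., θ_p  of the survival and is expressed as:

      b(n, t_i, \theta_1, \ldots, \theta_p)
          = \sum_{k=1}^{n} \ln k \binom{n}{k} R^{k}(t_i)\,[1 - R(t_i)]^{n-k-1}\,(k - n R(t_i)) .   (43)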



        If the time-to-death has an exponential survival,  R(t) = e^{-t/θ},  t > 0,
it was shown by Epstein and Sobel [1953] that a minimum variance unbiased
estimator of  θ  is  θ̂_i,  where

      \hat{\theta}_i = \frac{T_{(1)} + \cdots + T_{(i)} + (n - i)\,T_{(i)}}{i} .   (44)

        Since the hazard  λ(t) = 1/θ,  we propose the estimator  Z_n(t_i),

      Z_n(t_i) = \frac{1}{\hat{\theta}_i} .                                      (45)


Preliminary investigation of this estimate, under  different survivorships, is

not encouraging as the estimator exhibits  severe bias.
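The estimator (45) is simple to compute from the ordered sample, as the following Python sketch suggests. The function names are assumptions, and  i  is counted from 1 to match the notation of (44).

```python
def epstein_sobel_theta(ordered_times, i):
    """theta_hat_i from (44): (T_(1) + ... + T_(i) + (n - i) T_(i)) / i, 1 <= i <= n."""
    n = len(ordered_times)
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    total = sum(ordered_times[:i]) + (n - i) * ordered_times[i - 1]
    return total / i

def exponential_hazard_estimate(ordered_times, i):
    """Z_n(t_i) = 1 / theta_hat_i from (45)."""
    return 1.0 / epstein_sobel_theta(ordered_times, i)

# Hypothetical ordered sample of four times-to-death.
theta_2 = epstein_sobel_theta([1.0, 2.0, 4.0, 7.0], 2)
```

The numerator is the total time on test up to the i-th death: the observed lifetimes of the first  i  subjects plus the exposure of the  n - i  subjects still alive at  T_(i).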

        Estimation of densities by using "windows" has been applied to  hazard

rate estimation by Murthy [1965].   A  window K(w)  is a function satisfying

the conditions
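       1.  K(w) = K(-w);
       2.  K(w) ≥ 0;
       3.  \int_{-\infty}^{\infty} K(w)\,dw = 1;  and
       4.  \lim_{|w| \to \infty} w K(w) = 0 .

Murthy [1965] shows that the estimator, called the window estimator,

      Z_n^*(t) = \frac{f_n^*(t)}{R_n^*(t)}                                       (46)

is a consistent estimator for  λ(t),  where

      f_n^*(t) = \frac{\ln n}{n} \sum_{j=1}^{n} K(\ln n\,(t_j - t))              (47)

and

      R_n^*(t) = \int_t^{\infty} f_n^*(x)\,dx .                                  (48)

For a rectangular window

      K(w) = \begin{cases} 1/(2h), & -h < w < h \\ 0, & \text{otherwise} \end{cases}

we have the rectangular window estimator

      Z_n(t) = \frac{ \frac{1}{2 n h_1}\,\text{No.}(A) }
                    { \frac{1}{n} \left[ \text{No.}(j : t_j - t \ge h_1) + \sum_{j \in A} \frac{t_j - t + h_1}{2 h_1} \right] }   (49)

where  Set A = (j : t - h_1 < t_j < t + h_1),  and  h_1 = h/(ln n).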




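        The expression is derived in the following way.  Let

      D(w) = \ln n\; K(w \ln n) .

Then

      D(w) = \begin{cases} 1/(2h_1), & -h_1 < w < h_1 \\ 0, & \text{otherwise} \end{cases}   (50)

hence,

      f_n^*(t) = \frac{\ln n}{n} \sum_{j=1}^{n} K(\ln n\,(t_j - t))              (51)

               = \frac{1}{n} \sum_{j=1}^{n} D(t_j - t) .                         (52)

Since  D(t_j - t)  is  1/(2h_1)  when  |t_j - t| < h_1,  the sum counts the number
of elements in the  Set A = (j : t - h_1 < t_j < t + h_1).  The estimate of  R(t)
is given by  ∫_t^∞ f_n^*(x)dx;  upon substituting  w = t_j - x  in each term,

      \int_t^{\infty} D(t_j - x)\,dx = \int_{-\infty}^{t_j - t} D(w)\,dw
          = \begin{cases} 1, & t_j - t \ge h_1 \\
                          \frac{t_j - t + h_1}{2 h_1}, & -h_1 < t_j - t < h_1 \\
                          0, & \text{otherwise} \end{cases}                      (53)

and we obtain the denominator of  Z_n(t).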
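The derivation above can be turned into a short Python sketch of the rectangular window estimator: the numerator counts observations within  h_1  of  t,  and the denominator adds each observation's share of the integrated window. The function name is an assumption, and the treatment of observations exactly on the window boundary is a choice made here.

```python
import math

def rectangular_window_hazard(times, t, h):
    """Window estimate f*_n(t) / R*_n(t) for the rectangular window, h_1 = h / ln n."""
    n = len(times)
    h1 = h / math.log(n)
    # Density estimate: each observation within h1 of t contributes 1 / (2 n h1).
    f_hat = sum(1 for tj in times if abs(tj - t) < h1) / (2.0 * n * h1)
    # Survivorship estimate: integral of the window from -infinity to t_j - t.
    r_hat = 0.0
    for tj in times:
        v = tj - t
        if v >= h1:
            r_hat += 1.0
        elif v > -h1:
            r_hat += (v + h1) / (2.0 * h1)
    r_hat /= n
    return f_hat / r_hat if r_hat > 0.0 else float("nan")

# Hypothetical data: times 1..10 with h chosen so that h_1 = 1.
sample = [float(k) for k in range(1, 11)]
z = rectangular_window_hazard(sample, 5.5, math.log(10))
```

In the example two observations fall inside the window and the smoothed count of survivors is five out of ten, so the estimate is 0.1 / 0.5.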



        A computational simplification of the window estimators was suggested
by Watson and Leadbetter [1964].  The denominator of the window estimator was
replaced by the estimator  R_n(t_i) = 1 - i/(n + 1).  Using the density
estimator  f_n^*(t)  suggested by Murthy [1965], we consider the window estimator

      Z_n(t_i) = \frac{f_n^*(t_i)}{1 - \frac{i}{n + 1}} = \frac{n + 1}{n - i + 1}\, f_n^*(t_i) .   (54)
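Since the denominator is not a random variable, we can write the expected
value of this as

      E Z_n(t_i) = \frac{n + 1}{n - i + 1}\,\frac{\ln n}{n}\; E \sum_{j=1}^{n} K(\ln n\,(T_{(j)} - t_i))

                 = \frac{n + 1}{n - i + 1}\,\frac{\ln n}{n} \sum_{j=1}^{n} \int_0^{L} K(\ln n\,(t - t_i))\, f_j(t)\,dt ,   (55)

where  f_j(t)  is the density of the  j-th  order statistic,

      f_j(t) = j \binom{n}{j} F^{j-1}(t)\, R^{n-j}(t)\, f(t) .                   (56)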





This is applied for rectangular and triangular windows to the three  sets of


model data.



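        A triangular window estimator is obtained when

      K(w) = \begin{cases} 0, & w < -h \\
                           \frac{w}{h^2} + \frac{1}{h}, & -h \le w \le 0 \\
                           -\frac{w}{h^2} + \frac{1}{h}, & 0 < w \le h \\
                           0, & h < w . \end{cases}

Let  D(w) = ln n K(w ln n);  then,

      D(w) = \begin{cases} 0, & w < -h_1 \\
                           \frac{w}{h_1^2} + \frac{1}{h_1}, & -h_1 \le w \le 0 \\
                           -\frac{w}{h_1^2} + \frac{1}{h_1}, & 0 < w \le h_1 \\
                           0, & h_1 < w . \end{cases}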
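        The reliability estimator is

      R_n^*(t) = \int_t^{\infty} \frac{1}{n} \sum_{j=1}^{n} D(t_j - x)\,dx
               = \frac{1}{n} \sum_{j=1}^{n} \int_{-\infty}^{t_j - t} D(u)\,du ,  (61)

where

      \int_{-\infty}^{t_j - t} D(u)\,du
          = \begin{cases} 0, & t_j - t < -h_1 \\
                          \frac{(t_j - t)^2}{2 h_1^2} + \frac{t_j - t}{h_1} + \frac{1}{2}, & -h_1 \le t_j - t \le 0 \\
                          -\frac{(t_j - t)^2}{2 h_1^2} + \frac{t_j - t}{h_1} + \frac{1}{2}, & 0 < t_j - t \le h_1 \\
                          1, & h_1 < t_j - t \end{cases}                         (62)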
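and the hazard rate estimator is

      Z_n(t) = \frac{ \frac{1}{n h_1} \sum_{j \in A} \left( 1 - \frac{|t_j - t|}{h_1} \right) }
                    { \frac{1}{n} \left[ \text{No.}(j : t_j - t > h_1)
                      + \sum_{j \in A} \left( \frac{1}{2} + \frac{t_j - t}{h_1}
                      - \text{SGN}(t_j - t)\,\frac{(t_j - t)^2}{2 h_1^2} \right) \right] } ,   (63)

where  Set A = (j : |t_j - t| ≤ h_1)  and  SGN(x) = x/|x|  (i.e., the sign of x).

        The simplified triangular-window estimator is

      Z_n(t_i) = \frac{n + 1}{n - i + 1}\;\frac{1}{n h_1} \sum_{j \in A} \left( 1 - \frac{|t_j - t_i|}{h_1} \right) .   (64)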
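The simplified triangular-window estimator combines the triangular density estimate with the fixed denominator  1 - i/(n + 1).  A minimal Python sketch follows; the function name is an assumption,  i  is the 1-based rank of the observation in the sorted sample, and the form of the sum follows the triangular window defined above.

```python
import math

def simple_triangular_hazard(times, i, h):
    """Simplified triangular-window estimate at the i-th ordered time (i from 1)."""
    n = len(times)
    h1 = h / math.log(n)
    t_i = times[i - 1]
    # Triangular weights fall linearly from 1 at t_i to 0 at distance h1.
    weights = sum(1.0 - abs(tj - t_i) / h1 for tj in times if abs(tj - t_i) < h1)
    f_hat = weights / (n * h1)
    # Fixed survivorship estimate R_n(t_i) = 1 - i / (n + 1).
    return (n + 1) / (n - i + 1) * f_hat

# Hypothetical data: times 1..10 with h chosen so that h_1 = 2.
sample = [float(k) for k in range(1, 11)]
z5 = simple_triangular_hazard(sample, 5, 2.0 * math.log(10))
```

Because the denominator depends only on the rank  i,  the estimate needs one pass over the data per evaluation point and no integration of the window.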
Test Results

        The empirical estimator, given in (36), the simple rectangular-window
estimator of (54), and the simple triangular-window estimator of (64) were
used on a set of randomly generated data from a known truncated exponential
distribution and on a set of 200 times-to-death of people from the Los
Angeles area.  The time-to-death data of people was obtained from the Los
Angeles Public Health Cooperative Data File.  We wish to thank Dr. Anne
Coulson of the School of Public Health, UCLA, for her assistance in obtaining
a random sample of 200 people from that file.  The date of birth and month
and day of death in 1970 were received from cards; if the year of birth was
not given, the age at death was subtracted from 1970 and used for the year of
birth.  The birth date and death date were converted to days and subtracted to
give time-to-death in days.  A FORTRAN subroutine,  CALL DATE (I YEAR, J
MONTH, K YEAR, N DAYS),  supplied by Steven Chasen of the Health Sciences
Computing Facility was used to make the conversion and account for leap years.
        The 200 exponential random numbers were generated by a FORTRAN
subroutine,  CALL R GAMMA (1.,15000.,200,V),  developed by Frane and Murthy
[197?].  The mean was 15,000 days and the variables were truncated at
43,800 days.  The hazard is constant at  λ = 6.7 × 10^-5,  with a spike at
120 years.
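A comparable test sample can be sketched in Python. This is not the R GAMMA routine named above: it uses the standard library generator, the function name and seed are assumptions, and truncation is interpreted as setting every value above the cutoff equal to the cutoff, which produces the point mass at 120 years.

```python
import random

def truncated_exponential_sample(n, mean, cutoff, seed=0):
    """Sorted exponential times-to-death with values above the cutoff set to the cutoff."""
    rng = random.Random(seed)  # seeded so that the sample is reproducible
    draws = (rng.expovariate(1.0 / mean) for _ in range(n))
    return sorted(min(x, cutoff) for x in draws)

# 200 times in days: mean 15,000 days, truncated at 43,800 days (120 years).
sample = truncated_exponential_sample(200, 15000.0, 43800.0, seed=1)
```

With a mean of 15,000 days the underlying hazard is 1/15,000 per day, about 6.7 × 10^-5, which is the constant value quoted in the text.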
        A sort routine,  CALL SORT (A,1,200),  developed by Steven Chasen was
used to put the data in ascending order.
        Programs for the estimators were tested for numerical correctness by
using uniformly generated data with  t_i = 182(i - 1),  1 ≤ i ≤ 200;  the
hazard is  1/(c - t),  0 ≤ t < c,  with  c = 36,218 days and  λ(0) =
2.75 × 10^-5.
        The results of the three estimators are shown in Figures 6-12.  The
time-to-death in days was divided by 365 to give time-to-death in years and
the hazard was multiplied by  10^5  to provide clarity in the presentation.
Various window widths,  h,  were used; the effect of  h  on the estimate is
shown in Figure 5.

-------
[Figure 2.  Model I, simple triangular window estimator, h = 10,000.  Printer
plot of hazard against age; p = estimated point, o = data point, * = both.]

-------
[Figure 3.  Model II, simple triangular window estimator, h = 10,000.  Printer
plot of hazard against age (years); p = estimated point, o = data point,
* = both.]

-------
[Figure 4.  Model III, simple triangular window estimator, h = 10,000.  Printer
plot of hazard against age (years); p = estimated point, o = data point,
* = both.]

-------
[Figure 5.  Z_n(t) (× 10^-5) against years; time-to-death in Los Angeles.]
-------
[Figure 6.  Z_n(t), empirical estimator, exponential data, h = 1,000, with the
theoretical hazard; Z_n(t) against years, time-to-death exponential.]

-------
[Figure 7.  Z_n(t), simple triangular window, exponential data, h = 10,000,
with the theoretical hazard; time-to-death exponential.]

-------
[Figure 8.  Z_n(t) (× 10^-5), simple rectangular window, h = 10,000 and
h = 1,000; Z_n(t) against years, time-to-death in Los Angeles.]

-------
[Figure 9.  Z_n(t), simple triangular window, h = 10,000; Z_n(t) against
years, time-to-death in Los Angeles, n = 200.]

-------
[Figure 10.  Z_n(t), empirical estimator, h = 1,000 days; Z_n(t) against
years, time-to-death data.]

-------
[Figure 11.  Z_n(t) (× 10^-5), simple triangular window; Z_n(t) against years,
time-to-death in Los Angeles.]

-------
[Figure 12.  Z_n(t) (× 10^-5), simple rectangular window, exponential data,
h = 10,000, with the theoretical hazard; Z_n(t) against years, time-to-death
exponential.]

-------
                                  CHAPTER 4

                 Applications  of the Models  and an Example
 Real Life  Examples  to Which the Model Can Be Applied

         The mortality rates corresponding to many real life situations fall
 within the domain of our models.  From the practical point of view, we distinguish
 among three categories of mortality rates,  A, B  and  C,  depending
 on the shape.  Category  A is marked by a  decreasing  risk function, followed
 by an increasing  one.  Category B is  purely decreasing, while category  C is
 increasing.
         In the biological field, the  classical example of a bath-tub-hazard
 curve — category A — is the survival data of humans (as well as other
 animals).  It is characterized by high infant mortality during the first
 year of life, with slowly decreasing risk from age 1 through 4, and the
 lowest mortality rate from age 5 to around 35; increasing risk
 is seen as age increases above the 35 year level (see Figure 1 of Chapter 3).
        This category can be used for survival data pertaining to many var-
 ied diseases, as  long as they have this general pattern.  For example, Gehan's
 [1969] study was an application of category  A,  in which  F(t)  is called the
distribution function associated with the mortality rate  λ(t)  and

            R(t) = 1 - F(t) = e^{-y(t)}                                          (1)

is called the survivorship function.  The density function corresponding to
(1) is given by

            f(t) = \lambda(t)\,e^{-y(t)} .                                       (2)




        It is clear from Equations (1) and (2) that there is a one-to-one



correspondence between the mortality rate on the one hand and the distribution



and survivorship functions on the other.  If we can discover mortality functions
λ(t)  with the bath-tub property and obtain the integral  y(t) = ∫_0^t λ(x)dx
in a closed form, Equations (1) and (2) will give the corresponding
distribution and survivorship functions, respectively, in closed form.  In



what follows, we obtain families of distributions whose corresponding mortal-



ity rates have the above desired properties.





Statistical Models with Bath-Tub Property





        MODEL I:  This class of distributions is characterized by the hazard
rate

            \lambda(t) = \frac{a}{1 + bt} + c\,d\,t^{d-1}, \qquad t \ge 0        (3)

where the parameters  a, b, c ≥ 0,  d ≥ 1.  Some special cases of this model
are:  when  c = 0,

            \lambda(t) = \frac{a}{1 + bt} ,

a decreasing function of time.

        When

                         a = 0,  d = 1,     λ(t) = c ;

                         c = 0,  b = 0,     λ(t) = a .






 Here, the mortality rate reduces to a constant and we have an exponential
 distribution.  When  b = 0,  d = 2,  we have the form described by Kodlin [1967]:
 that from the observed data on survival of males with angina pectoris, the
 death rate per year is highest in the first year after diagnosis.  For the



 next nine years, the hazard function remains fairly  constant  and increases



 somewhat  later.




        Category B curves characterize situations where there is an initial



 high risk which  decreases slowly as  time progresses.  Survival data on



 patients  after surgery, transplant operation,  or other medical treatments  that



 would improve the chances of survival, belong to this category.




        Category C is  marked by  an increasing  risk function.   Experiments  that



 introduce  harm to  initially  healthy  animals, or  sick control groups that are



 unattended are good  examples for this.   Garg,  et al  [1970] and Kodlin [1967]



 gave numerical examples of this  type in  their papers.





 Graphics Program






        We shall now demonstrate the versatility of the models considered in



this chapter, with respect to their ability to describe the various mortality



rates that one encounters in real life.  To this end, an interactive graphics



program was written in such a way that a person at the console can select one



of the models at a time, and type in different values of the parameters

-------
through a keyboard.  The corresponding hazard function will then be displayed


on the screen.  If desired, different increments of the parameters can also


be typed in and plots of these curves will be displayed one after the other,


as the parameters increase or decrease by the specified amount each time.

Thus, the characteristics of each model can be fully illustrated easily.





Maximum Likelihood Procedure for Estimation



        The method of maximum likelihood is discussed here.  Since similar

techniques are applied to all three models, more detailed explanations will


be given for Model I, while only brief summaries of equations will be given


for Models II and III.


                                  MODEL I



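        Let  t_1, t_2, ..., t_N  be a random sample of size  N  from the
distribution  F(t)  given by (4).  The likelihood of the observed sample is

      L = \prod_{i=1}^{N} \left( \frac{a}{1 + b t_i} + c\,d\,t_i^{d-1} \right) \frac{e^{-c t_i^{d}}}{(1 + b t_i)^{a/b}}   (6)

and

      \ln L = -\frac{a}{b} \sum_{i=1}^{N} \ln(1 + b t_i) - c \sum_{i=1}^{N} t_i^{d}
              + \sum_{i=1}^{N} \ln \left( \frac{a}{1 + b t_i} + c\,d\,t_i^{d-1} \right) .   (7)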

-------
The following equations can be solved for the values  â, b̂, ĉ  and  d̂  that
maximize (7); the second partial derivatives are given by:
   a2 in L     V2
   	s— ° ~ b
                        U2(b-l)
      In L

         -"
N


I
      In L
       ad
      In L
           « - a
                  ZN
                                            In
                            - ab
                                  f1   *2i°
                                 1 + b  In t
J
                                  ctt)
                                  ctt)
                          -1  In tj,(2 + b  In t±)
                                - a
      InL
    ob oc
             ad
                              b In
                              (1
                                     76

-------
*1.

InL
      l    2d
      ct
                                     .   d
                                  Ct . ^ HI- —

                                    1    c
               -H 2d
                                            [X(t1)](i+
                            f
                        + d  >
                                          
-------
 and
          In L
          3d
- c
tJ in t,
                                                  + d In
where
                      Mt±)
                                                         (8)
With the aid of a computer, these equations can be solved iteratively.  Initial
estimates can be obtained from a nonlinear least-squares fit to the empirical
hazard function.



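        Estimates of the asymptotic variances and covariances of the maximum
likelihood estimates may be obtained by inverting the matrix of negative second
partial derivatives of  $\ln L$,  evaluated at the maximum likelihood estimates:

$$\widehat{\mathrm{Cov}}(\hat\theta) = \left[ -\frac{\partial^{2} \ln L}{\partial\theta_j\,\partial\theta_k} \right]^{-1}_{\theta = \hat\theta} , \qquad \theta = (a, b, c, d) . \qquad (9)$$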
-------
                                   MODEL II
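        The equations involved in solving for Model II are, briefly:

$$\ln L = \sum_{i=1}^{N} \ln\left( ab\,e^{-bt_i} + c\,d\,t_i^{d-1} \right) - a\sum_{i=1}^{N}\left(1 - e^{-bt_i}\right) - c\sum_{i=1}^{N} t_i^{d} \qquad (10)$$

with

$$\frac{\partial \ln L}{\partial a} = \sum_{i=1}^{N} \frac{b\,e^{-bt_i}}{ab\,e^{-bt_i} + c\,d\,t_i^{d-1}} - \sum_{i=1}^{N}\left(1 - e^{-bt_i}\right) = 0 ;$$

$$\frac{\partial \ln L}{\partial b} = a\sum_{i=1}^{N} \frac{e^{-bt_i}(1 - bt_i)}{ab\,e^{-bt_i} + c\,d\,t_i^{d-1}} - a\sum_{i=1}^{N} t_i\,e^{-bt_i} = 0 ;$$

$$\frac{\partial \ln L}{\partial c} = d\sum_{i=1}^{N} \frac{t_i^{d-1}}{ab\,e^{-bt_i} + c\,d\,t_i^{d-1}} - \sum_{i=1}^{N} t_i^{d} = 0 ;$$

and

$$\frac{\partial \ln L}{\partial d} = c\sum_{i=1}^{N} \frac{t_i^{d-1}(1 + d\ln t_i)}{ab\,e^{-bt_i} + c\,d\,t_i^{d-1}} - c\sum_{i=1}^{N} t_i^{d}\ln t_i = 0 .$$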

-------
                            MODEL III
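$$\ln L = \sum_{i=1}^{N} \ln\left( \frac{a}{1+bt_i} + c\,d\,e^{dt_i} \right) - c\sum_{i=1}^{N}\left(e^{dt_i} - 1\right) - \frac{a}{b}\sum_{i=1}^{N} \ln(1+bt_i) \qquad (11)$$

with

$$\frac{\partial \ln L}{\partial a} = \sum_{i=1}^{N} \frac{1}{(1+bt_i)\left( \frac{a}{1+bt_i} + c\,d\,e^{dt_i} \right)} - \frac{1}{b}\sum_{i=1}^{N} \ln(1+bt_i) = 0 ;$$

$$\frac{\partial \ln L}{\partial b} = -a\sum_{i=1}^{N} \frac{t_i}{(1+bt_i)^{2}\left( \frac{a}{1+bt_i} + c\,d\,e^{dt_i} \right)} + \frac{a}{b}\sum_{i=1}^{N}\left[ \frac{\ln(1+bt_i)}{b} - \frac{t_i}{1+bt_i} \right] = 0 ;$$

$$\frac{\partial \ln L}{\partial c} = d\sum_{i=1}^{N} \frac{e^{dt_i}}{\frac{a}{1+bt_i} + c\,d\,e^{dt_i}} - \sum_{i=1}^{N}\left(e^{dt_i} - 1\right) = 0 ;$$

and

$$\frac{\partial \ln L}{\partial d} = c\sum_{i=1}^{N} \frac{e^{dt_i}(1 + dt_i)}{\frac{a}{1+bt_i} + c\,d\,e^{dt_i}} - c\sum_{i=1}^{N} t_i\,e^{dt_i} = 0 .$$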

-------
A Numerical Example

        A hypothetical cohort of 1,000 people born in the United States in
the year 1900 was constructed using information recorded in Vital Statistics of
the United States.  (The probability of dying for ages > 70 was approximated
using the values in the latest (1967) volume of the aforementioned reference.)
The survival data are presented in Table 1 below, and the empirical hazard
function was calculated using

$$\lambda(t_{mi}) = \frac{2\,q_i}{h_i\,(1 + p_i)} ,$$

where
               $t_{mi}$ = midpoint of the  ith  interval;
               $h_i$ = width of the  ith  interval;
               $q_i$ = conditional proportion dying in the  ith  interval; and
               $p_i$ = conditional proportion surviving in the  ith  interval;

as described by Kimball [1960].
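The empirical hazard formula is simple enough to check by hand.  The short
Python sketch below applies it to three of the age intervals of Table 1, using
the tabulated conditional proportions dying; the function name and the tuple
layout are assumptions made for this illustration.

```python
def empirical_hazard(q, h):
    """Kimball-type estimate 2q / (h (1 + p)) with p = 1 - q.

    q is the conditional proportion dying in the interval and h is the
    width of the interval in years.
    """
    p = 1.0 - q
    return 2.0 * q / (h * (1.0 + p))


if __name__ == "__main__":
    # (age interval, width in years, q) taken from Table 1
    rows = [("0- 0.99", 1.0, 0.1620),
            ("1- 4.99", 4.0, 0.0465),
            ("85-94.99", 10.0, 1.0000)]
    for label, h, q in rows:
        print(f"{label:>9s}  hazard = {empirical_hazard(q, h):.6f}")
```

For the first interval this gives 2(0.1620)/(1(1.8380)), or about 0.1763, and
for the last interval exactly 0.2, in agreement with the last column of Table 1.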

        Nonlinear least-squares fits to the empirical hazard function were
attempted for all three models.  (Computations were performed using BMDX85,
Nonlinear Least Squares, a package program written at the Health Sciences
Computing Facility, UCLA.)  The estimated parameters and fitted hazard functions
are shown in Table 2.
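To make the least-squares criterion concrete, the following Python sketch
evaluates the Model I hazard at the interval midpoints and forms the sum of
squared differences from the observed hazard.  It is only an illustration and
not the package program used in the report: the function names are assumptions,
the observed values and the Model I parameters are those listed in Table 2, and
no minimization is performed.

```python
def hazard_model1(t, a, b, c, d):
    """Model I hazard: a/(1 + b t) + c d t^(d-1)."""
    return a / (1.0 + b * t) + c * d * t ** (d - 1.0)


# Interval midpoints (years) and observed hazard values from Table 2.
MIDPOINTS = [0.5, 3.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
OBSERVED = [0.17630, 0.01190, 0.00292, 0.00502, 0.00485, 0.00540,
            0.00894, 0.01906, 0.04626, 0.13052, 0.20000]

# Least-squares estimates for Model I as listed in Table 2.
MODEL1_LS = (19.872, 227.64, 2.4398e-12, 6.1851)


def sum_of_squares(params):
    """Sum of squared residuals between observed and Model I hazards."""
    return sum((obs - hazard_model1(t, *params)) ** 2
               for t, obs in zip(MIDPOINTS, OBSERVED))


if __name__ == "__main__":
    for t in MIDPOINTS:
        print(f"{t:5.1f}  {hazard_model1(t, *MODEL1_LS):.5f}")
    print("sum of squares:", sum_of_squares(MODEL1_LS))
```

A least-squares routine would vary the four parameters to make this sum as
small as possible; the resulting estimates then serve as starting values for
the maximum likelihood iteration.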

        It was found that  Model I  seems to fit the data best,  so the method
of maximum likelihood was applied to it, giving estimates as follows:

                              $\hat a$  =  1.594  E 01
                              $\hat b$  =  2.612  E 02
                              $\hat c$  =  8.635  E-11                      (12)
                              $\hat d$  =  6.299  E 00

The value of the log likelihood (Eq. 7) using the maximum likelihood estimates
is  -0.4378E04,  compared to  -0.4413E04  using the least-squares estimates.
                                TABLE 1
                           U.S.  SURVIVAL DATA

    Age         N    Dead      q        p        pp        λ(t)*

  0- 0.99     1000    162   0.1620   0.8380   0.83800   0.176270
  1- 4.99      838     39   0.0465   0.9535   0.79903   0.011900
  5-14.99      799     23   0.0288   0.9712   0.77602   0.002922
 15-24.99      776     38   0.0490   0.9510   0.73800   0.005023
 25-34.99      738     35   0.0474   0.9526   0.70302   0.004855
 35-44.99      703     37   0.0526   0.9474   0.66604   0.005402
 45-54.99      666     57   0.0856   0.9144   0.60903   0.008942
 55-64.99      609    106   0.1740   0.8260   0.50306   0.019058
 65-74.99      503    189   0.3757   0.6243   0.31406   0.046259
 75-84.99      314    248   0.7898   0.2102   0.06602   0.130534
 85-94.99       66     66   1.0000   0.0000   0.00000   0.200000

 * λ(t) = 2q/(h(1+p)),  where h = year interval,  according to Kimball's method.

-------
                                TABLE 2
                           U.S. SURVIVAL DATA
                           (Hazard Function)

 Estimated parameters*

                Model I            Model II           Model III
            a = 1.9872E 01     a = 2.8035E-01     a = 3.2761E 01
            b = 2.2764E 02     b = 1.0781E 00     b = 3.7340E 02
            c = 2.4398E-12     c = 3.9204E-12     c = 1.1530E-20
            d = 6.1851E 00     d = 6.0841E 00     d = 4.9992E-01

 Midpoint
   Year      Observed     Model I     Model II    Model III

    0.5      0.17630      0.17307     0.17630     0.17453
    3.0      0.01190      0.02906     0.01191     0.02922
   10.0      0.00292      0.00873     0.00001     0.00877
   20.0      0.00502      0.00445     0.00010     0.00439
   30.0      0.00485      0.00360     0.00077     0.00292
   40.0      0.00540      0.00524     0.00333     0.00219
   50.0      0.00894      0.01147     0.01036     0.00175
   60.0      0.01906      0.02649     0.02617     0.00146
   70.0      0.04626      0.05693     0.05730     0.00126
   80.0      0.13052      0.11237     0.11299     0.00244
   90.0      0.20000      0.20591     0.20563     0.20087

 * Least-squares estimation on  λ(t).

-------
                                  CHAPTER 5





                       Computer  Program for Graphics
         The  computer program described in this section was used to display, on



 a single plot, both the mortality rate and the mean residual lifetime for



 three  realistic models of the mortality rate.  This program was written in



 FORTRAN IV and was run on an IBM 360/91 at the Health Sciences Computing



 Facility,  UCLA.




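         The  mean residual lifetime,  $v(t)$,  may be defined as

$$v(t) = \frac{\int_t^{\infty} R(x)\,dx}{R(t)} ,$$

where  $R(t) = 1 - F(t)$  is the survivorship function and  $F(t)$  the cumulative
distribution function.  For Model I, the mortality rate  $\lambda(t)$  and the
survivorship function  $R(t)$  are

$$\lambda(t) = \frac{a}{1+bt} + c\,d\,t^{d-1} , \qquad t \ge 0,\; a,b,c \ge 0,\; d \ge 1$$

and

$$R(t) = \frac{e^{-c\,t^{d}}}{(1+bt)^{a/b}} .$$

For Model II, the equations are

$$\lambda(t) = ab\,e^{-bt} + c\,d\,t^{d-1} , \qquad t > 0,\; d > 1,\; a,b,c > 0$$

and

$$R(t) = \exp\left\{ -a\left(1 - e^{-bt}\right) - c\,t^{d} \right\} .$$

Finally,  for Model III the equations are

$$\lambda(t) = \frac{a}{1+bt} + c\,d\,e^{dt} , \qquad t > 0,\; a,b,c,d > 0$$

and

$$R(t) = \frac{\exp\left\{ -c\left(e^{dt} - 1\right) \right\}}{(1+bt)^{a/b}} .$$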
        The mortality functions for the three models are computed in the sub-

routine HAZ listed below.  The value of  LM  may be  1, 2  or  3,  and refers

to the appropriate model number with HAZ being the mortality rate value at a

given time.


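      FUNCTION HAZ(T)
C     MORTALITY RATE AT TIME T FOR MODEL LM (1, 2 OR 3)
      DOUBLE PRECISION A,B,C,D,Z,X
      COMMON LM
      COMMON/PARS/A,B,C,D
      X=DBLE(T)
      GO TO (1,2,3),LM
C     MODEL I    A/(1+BT) + C*D*T**(D-1)
    1 CONTINUE
      Z=A/(1.+B*X)
      HAZ=Z+C*D*X**(D-1.)
      RETURN
C     MODEL II   A*B*EXP(-BT) + C*D*T**(D-1)
    2 CONTINUE
      Z=A*B*DEXP(-B*X)
      HAZ=Z+C*D*X**(D-1.)
      RETURN
C     MODEL III  A/(1+BT) + C*D*EXP(DT)
    3 CONTINUE
      Z=A/(1.+B*X)
      HAZ=Z+C*D*DEXP(D*X)
      RETURN
      END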


The survivorship functions for the three models are computed in the double
precision function FUN listed below.  The value of  LM  may be  1, 2  or  3
and refers to the appropriate model number, with FUN being the survivorship
function value at a given time.


      DOUBLE PRECISION FUNCTION FUN(X)
C     SURVIVORSHIP FUNCTION AT TIME X FOR MODEL LM (1, 2 OR 3)
C     EXPONENTS ARE CAPPED TO AVOID OVERFLOW AND UNDERFLOW
      DOUBLE PRECISION X,A,B,C,D,EX,BX
      COMMON LM
      COMMON/PARS/A,B,C,D
      GO TO (1,2,3),LM
C     MODEL I    EXP(-C*X**D)/(1+BX)**(A/B)
    1 CONTINUE
      EX=C*X**D
      IF(EX.GT.100.) EX=100.
      EX=DEXP(-EX)/(1.0D0+B*X)**(A/B)
      FUN=EX
      RETURN
C     MODEL II   EXP(-A*(1-EXP(-BX))-C*X**D)
    2 CONTINUE
      BX=B*X
      IF(BX.GT.170.) BX=170.
      EX=A*(1.0D0-DEXP(-BX))+C*X**D
      IF(EX.GT.100.) EX=100.
      FUN=DEXP(-EX)
      RETURN
C     MODEL III  EXP(-C*(EXP(DX)-1))/(1+BX)**(A/B)
    3 CONTINUE
      EX=D*X
      IF(EX.GT.100.) EX=100.
      EX=C*(DEXP(EX)-1.0D0)
      IF(EX.GT.100.) EX=100.
      EX=DEXP(-EX)
      FUN=EX/(1.0D0+B*X)**(A/B)
      RETURN
      END

        Within the main program, the appropriate model number is read, the

values for the parameters  a,b,c  and  d  of the model are read and values

for the mean residual lifetime are computed.
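The computation of the mean residual lifetime can be sketched as follows.  This
Python fragment is an illustration under stated assumptions, not a translation
of the main program: the function names, the step size and the tail cutoff are
choices made for this example.  It integrates the Model II survivorship
function with the trapezoidal rule up to the point where it falls below a small
cutoff and divides by the survivorship at the starting time.

```python
import math


def survivorship_model2(t, a, b, c, d):
    """Model II survivorship: exp(-a (1 - e^(-b t)) - c t^d)."""
    return math.exp(-a * (1.0 - math.exp(-b * t)) - c * t ** d)


def mean_residual_lifetime(R, t, step=0.02, tail=1.0e-6):
    """Approximate v(t) = (integral of R from t to infinity) / R(t).

    The infinite upper limit is replaced by the first grid point at which
    R drops to the tail cutoff or below; the integral uses the
    trapezoidal rule with a fixed step.
    """
    r_t = R(t)
    if r_t <= tail:
        return 0.0
    area, x, left = 0.0, t, r_t
    while left > tail:
        right = R(x + step)
        area += 0.5 * (left + right) * step
        x, left = x + step, right
    return area / r_t


if __name__ == "__main__":
    R = lambda t: survivorship_model2(t, a=1.0, b=0.1, c=0.0002, d=2.0)
    for t in (0.0, 20.0, 40.0):
        print(f"t = {t:5.1f}   v(t) = {mean_residual_lifetime(R, t):.3f}")
```

With  a = 1.0,  b = 0.1,  c = 0.0002  and  d = 2.0  the sketch gives a mean
residual lifetime at time zero of roughly 27.8, which rises over the first two
decades because the hazard is initially decreasing.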

        Finally,  the specified information is printed along with the computed

time values,  mean residual lifetime values,  v,  and the mortality rate values

Z;  the main program calls a subroutine GPLOT which plots  Z  and  v  at

various times.   The routines used in GPLOT were developed at the Health
Sciences Computing Facility,  U.C.L.A.

       Sample output from this program is  included after the listings of the

main program and subroutine GPLOT.
                           S ION  V ( 4 IHJ-H  Tl-ME-H 00)- » -G <
                              N  CMC 3)
      DOUBLE PRECISION DX,R,L,T,E,AREA
      DIMENSION Z(400), X(400), Y(400)
      DOUBLE PRECISION A, B, C, D
      DOUBLE PRECISION DTES
      COMMON /PARS/ A,B,C,D
      DATA CH/'V','Z','*'/
      COMMON LM
   10 READ(5,100) LM
      IF(LM .LE. 0) GO TO 90
   17 READ(5,111) A,B,C,D
  111 FORMAT(4F10.5)
      WRITE(6,120) LM,A,B,C,D
      IF (A .LE. -1.0) GO TO 10
      TMIN = 0.
      TMAX=40.
      STEP=2.
      AREA = 0.0D0
      MSTEP=100
      NNUM=(TMAX-TMIN)/STEP
      TU=TMAX
      DO 11 I=1,1000
      T=DBLE(TU)
      IF(FUN(T) .LE. 1.0D-6) GO TO 20
   11 TU=TU+STEP
      WRITE(6,110) TU
  110 FORMAT(' TAIL TOO BIG AT TU', F10.5)
   20 CONTINUE
      XX= STEP/MSTEP
      DX=DBLE(XX)
      R=FUN(T)
      IEND=100000
      DO 1 I=1,IEND
      TEST=TMAX-STEP*NUM
      DTES=DBLE(TEST)
      IF(DABS(T-DTES) .GT. 1.0D-3) GO TO 2
      J=NNUM+1-NUM
      IF(R .LE. 1.0D-6) GO TO 12
                     V< J)
      GO TO 13
   12 V(J)=0.
   13 CONTINUE
      Z(J)=HAZ(TEST)*1.0E2
      TIME(J)=TEST
      NUM=NUM+1
      IF(NUM .GT. NNUM) GO TO 69
      T=T-DX
      L=FUN(T)
      E = (L+R)/
-------
OOEL
I
1
2
3
4
5
6
7
8
9 -
10
11
12
13
14
15 	
16
17
ia
19
20
21
1 A, 6, C,
T IME
-U.C
2.GOOOO
4 . CO-JuiJ
6.COOOO
8.COOOO
10.00000
12.00000
14.COOOC
16. GOO 00
13.00030
20.00000
22.00000
24.COOOO
26.COOOO
	 2tfiOOOOO -
30.00000
32.00000
34.00000
36.00000
38.COOOO
40.00000
D= 0.0
V
IS; 375C2
17. 50595
it>. 88o43
14. 40230
13. 26U82
12. 19372
11. 257^1
10.43222
9. -70 t 74
9. C5235
8.47264
7.95308
7, 4356*
7. 06355
	 6nr6 81-10
6. 33339
6. C162d
5. 72620
5. 460U8
5. 2152)3
4,^6S5U
10.GCOOC
I
c.o
C. 74533
1.55460
2.39003
3.24256
-4.10819
4.98442
5.86955
	 —6. -76 23 8
7.6619S
8.56768
9.47886
10.39506
11.31589
	 1^-2-4101
13.17013
14.10301
15.03943
15.97916
16. 9220o
17.86797
2.06040

-------
MODEL
A,  6,  C,
D =
L. 00000
0. 1COOO
                                                    0.0002U
2.00000
I
1
2
3
4
6
6
7
8
9
10
11
12
13
14
-15
16
17
13
19
20
2-1
T iv, t
-O.G
2.COOOO
-4.GOOUO
6.COOOO
8.00000
1-0.00000
12.CCOOO
14.00000
-1-6.CGGGO
Id. 00000
20.COOOO
22.COOGG
24.CGGGG
26.COOGO
	 2-8.GG4KHJ
30eCOQQO
32.00000
34.00000
36.00000
38.00000
--- 4G.COOOG
V
2 7 . 7 5 S s S,
31. 11 boo
34.C2692
36.45024
3d. 38336
39. 65265
40. 9G19fc
41. 58397
41. S5340
42. C63Cb
41.96121
41.69G55
41. 28772
40. 78357
	 -4^,203o 7
29. 56L86
38. 89590
38. 19H52
. 37. 48705
36. 70991
- —-3-6.05354,
I
1C. 00000
6.26731
6.06319
5.7281^
4.81329
4.07879
3.4919^
3.02597
2.65896
2.37299
2.15335
1.98803
1.8671b
1.78273
	 1.72810
i.697b/
1.68762
1.69373
1.71324
i. 74371
1.78-316

-------
Z,V  VERSUS T

[Printer plot:  V  and  Z  plotted against  T.]

-------
Z,V VERSUS T

[Printer plot:  V  and  Z  plotted against  T.]

-------
                                  CHAPTER 6
                              CONCLUDING REMARKS
        During the next twelve months, the marginal and combined effects of
the basic pollutants on morbidity for the City of Los Angeles and the mile-high
city of Denver, Colorado, will be assessed using the methods developed in this
report.  Various ways of modeling the competing-risk situation will be
explored.  Both parametric and nonparametric methods will be considered.  As a
follow-on effort, we will show how prior knowledge of the empirical
distributions of the concentrations of the various pollutants can be used to
sharpen the statistical procedures of estimation and of testing for
goodness-of-fit, including applications of paramount importance to the
Environmental Protection Agency of the United States.

-------
                                 CHAPTER 7
                                Bibliography
Armitage, P.  [1959]
        "The Comparison of Survival 
-------
 Dixon, W.J.;  Longmire, W.P.;  and Holden,  W.D.   [January 1971]
         "Use  of Triethylenethiophosphoramide (thioTEPA) as  an  Adjuvant to the
         Surgical Treatment of Gastric  and Colorectal Carcinoma:   Ten-Year
         Follow-Up," Annals of Surgery,  Vol.  173,  No.  1,  pp. 26-39.

 Ederer,  F.; Axtell, L.M.; and Cutler,  S.J.  [Undated]
         "The  Relative Survival Rate:   A Statistical Methodology," in Cancer:
         End Results and Mortality Trends,  National Cancer Institute  Monograph
         No. 6,  U.S. Department of Health,  Education and Welfare,  National
         Institutes of Health.

 Epstein, B. and Sobel, M.   [1953]
         "Life Testing," Journal  of the American Statistical Association,  Vol.
         48, pp.  486-502.

 Fix, E.  and Neyman, J.   [1951]
         "A Simple Stochastic  Model of  Recovery, Relapse,  Death and Loss of
         Patients," Human Biology,  Vol.  23, pp. 205-241.

 Garg, M.L.; Rao,  B.R.; and Redmond, C.K.   [1970]
         "Maximum-Likelihood Estimation of the Parameters  of the Gompertz
         Survival Function," Journal of the Royal  Statistical Society, Applied
         Statistics, C, Vol. 19,  pp. 152-159.

 Gehan, E.A.   [1969]
         "Estimating Survival  Functions  From the Life  Table," Journal of
         Chronic  Diseases, Vol. 21, pp.  629-644.

 Ghare, P.M. and  Kim, Y.H.  [1970]
        An Approximating Function  for the Hazard  Rate.  Pittsburgh,  Penn.:
        1970  Technical Conference  Transactions, American  Society  of  Quality
        Control, pp. 367-373.

 Gompertz, B.  [1825]
        "On the Nature of the Function Expressive of the Law of Human Mortality,
        and on a New Mode of Determining the Value of Life Contingencies,"
        Phil. Trans. Royal Soc., A, Vol. 115, pp. 513.

 Grenander, U.   [1956]
        "On the Theory of Mortality Measurement," Skandinavisk Aktuarietid-
        skrift, Vol. 39, pp. 1-55.

Halperin, M.  [1952]
       "Maximum Likelihood Estimation in Truncated Samples," Annals of
        Mathematical Statistics, Vol. 23,  pp. 226-238.

-------
Harris, T.E.; Meier, P.; and Tukey, J.W.   [December 1950]
        "Timing of the Distribution of Events Between Observations," Human
        Biology, Vol. 22, No. 4.

Kale, B.K.   [1962]
        "On  the Solution of Likelihood Equations by Iteration Processes.  The
        Multiparametric Case," Biometrika, Vol. 49, pp. 479-486.

Kimball, A.W.   [1960]
        "Estimation of Mortality Intensities in Animal Experiments,"
        Biometrics, Vol. 16, pp. 505-521.

Kodlin, D.   [1967]
        "A New Response Time Distribution," Biometrics, Vol. 23, pp. 227-239.

Mantel, N.   [1966]
        "Evaluation of Survival Data and Two New Rank Order Statistics Arising
        in its Consideration,"  Cancer Chemotherapy Reports, Vol. 50,
        pp.  163-170.

Moeschberger, M.L. and David, H.A.  [1971]
        "Life Tests Under Competing Causes of Failure and the Theory of
        Competing Risks," Biometrics, Vol. 27, pp. 909-933.

Murthy, V.K.  [1965a]
        "Estimation of Probability Density," Annals of Mathematical Statis-
        tics, Vol. 36, pp. 1027-1031.

Murthy, V.K.  [1965b]
        "Estimation of Jumps, Reliability and Hazard Rate," Annals of Mathe-
        matical Statistics, Vol. 36, pp. 1032-1040.

Murthy, V.K.  [1968]
        "Study of Nonparametric Techniques for Estimating Reliability and
        Other Quality Parameters," NASA Contract Final Report, McDonnell-
        Douglas Corporation paper.

Murthy, V.K. and Swartz, G.B.  [March 1971]
        "Contributions to Reliability," Aerospace Research Laboratories Report
        ARL  71-0060, Wright-Patterson Air Force Base, Dayton, Ohio.

Sampford, M.R.  [1952]
        "The Estimation of Response-Time Distributions. I.  Fundamental Con-
        cepts and General Methods; II.  Multi-Stimulus Distributions,"
        Biometrics, Vol. 8, pp. 13-32 and 307-369.

-------
Swartz, G.B.   [June 1973]
        "The Mean Residual Lifetime Function,"  IEEE  Transactions on Reliabil-
        ity, pp. 108-109.

Watson, G.S. and Leadbetter, M.R.   [1964]
        "Hazard Analysis, I," Biometrika, Vol. 51, pp.  175-184.

Zelen, M.   [1966]
        "Application of Exponential Models to Problems  in  Cancer Research,"
        Journal of the Royal Statistical Society, A, Vol.  129, pp.  368-398.

-------
                                   TECHNICAL REPORT DATA
                            (Please read Instructions on the reverse before completing)

  1. REPORT NO.
        EPA-600/1-76-015
  2.
  3. RECIPIENT'S ACCESSION NO.
  4. TITLE AND SUBTITLE
        REALISTIC MODELS FOR MORTALITY RATES AND THEIR ESTIMATION
  5. REPORT DATE
        February 1976
  6. PERFORMING ORGANIZATION CODE
  7. AUTHOR(S)
        V. K. Murthy
  8. PERFORMING ORGANIZATION REPORT NO.
  9. PERFORMING ORGANIZATION NAME AND ADDRESS
        University of California at Los Angeles and
        University of Southern California
        Los Angeles, California
 10. PROGRAM ELEMENT NO.
        1AA601
 11. CONTRACT/GRANT NO.
        800Z30
 12. SPONSORING AGENCY NAME AND ADDRESS
        Health Effects Research Laboratory
        Office of Research and Development
        U.S. Environmental Protection Agency
        Research Triangle Park, N.C.  27711
 13. TYPE OF REPORT AND PERIOD COVERED
 14. SPONSORING AGENCY CODE
        EPA-ORD
 15. SUPPLEMENTARY NOTES
 16. ABSTRACT
             The objective of a medical follow-up study is generally to determine
        the effectiveness of each of several treatments by analyzing the responses
        of the patients.  Frequently the response data coming out of these
        investigations are the times to death of patients who are not otherwise
        lost to the follow-up of the investigation.  The statistical nature of
        these data is characterized in this report.
             By definition, the "force of mortality" or "mortality rate function"
        is the rate associated with the probability of the patient's death in
        a specified short interval of time, given that the patient has survived
        to this instant in time.
             Mathematical models are presented which, as special cases, represent
        constant, increasing and decreasing mortality rates, along with combinations
        of these properties.  Usually, these mortality rate curves are "U" shaped.
        The first part of the curve corresponds to infantile mortality, the second
        part corresponds to useful life, and finally, the last part corresponds
        to decay, aging, etc., culminating in death.  Their corresponding probability
        distributions and survivorship functions are obtained in closed form.
        Methods of estimating the parameters are developed, and procedures dealing
        with computational details and statistical properties are discussed.
 17. KEY WORDS AND DOCUMENT ANALYSIS
        a. DESCRIPTORS:  Mortality;  Mathematical Models
        b. IDENTIFIERS/OPEN ENDED TERMS
        c. COSATI Field/Group:  12A;  05B;  06G
 18. DISTRIBUTION STATEMENT
        RELEASE TO PUBLIC
 19. SECURITY CLASS (This Report)
        UNCLASSIFIED
 20. SECURITY CLASS (This page)
        UNCLASSIFIED
 21. NO. OF PAGES
        104
 22. PRICE
EPA Form 2220-1 (9-73)

-------