US EPA Office of Research and Development
          United States
          Environmental Protection
          Agency
           Office of Research and
           Development
           Washington DC 20460
EPA/600/R-01/104
September 2001
Human Health Metrics for
Environmental Decision
Support Tools:

Lessons from Health
Economics and Decision
Analysis

-------
HUMAN HEALTH METRICS FOR ENVIRONMENTAL
            DECISION SUPPORT TOOLS:

    LESSONS FROM HEALTH ECONOMICS AND
                DECISION ANALYSIS
                           BY
                       Patrick Hofstetter
                  ORISE Research Fellow at U.S. EPA
               National Risk Management Research Laboratory
                     26W Martin Luther King Dr
                      Cincinnati, OH 45220

                       James K. Hammitt
                      Center for Risk Analysis
                   Harvard School of Public Health
                      718 Huntington Ave.
                       Boston, MA 02115
               National Risk Management Research Laboratory
                  Office of Research and Development
                  U.S. Environmental Protection Agency
                      Cincinnati, Ohio, 45268

-------
                                       Notice

This report has been subjected to the Agency's peer and administrative review and has been
approved for publication as an EPA document. Mention of trade names or commercial products does
not constitute endorsement or recommendation for use.

-------
                                       Foreword

The U.S. Environmental Protection Agency is charged by Congress with protecting the Nation's land,
air, and water resources.  Under a mandate of national environmental laws, the Agency strives to
formulate and implement actions leading to a compatible balance  between human  activities and the
ability of natural systems to support and nurture life.  To meet this mandate, EPA's research program
is  providing  data  and technical support for solving environmental problems today  and building  a
science knowledge  base necessary to manage our ecological resources wisely, understand how
pollutants affect our health, and prevent or reduce environmental risks in the future.

The  National Risk Management Research  Laboratory  is  the Agency's center  for investigation of
technological and management approaches for preventing and reducing risks  from pollution that
threaten human health and the environment. The focus of the Laboratory's research program is on
methods and their cost-effectiveness for prevention and control of pollution to air, land, water, and
subsurface resources; protection of water quality in public water systems; remediation of contaminated
sites, sediments and ground water; prevention and control of indoor air pollution; and restoration of
ecosystems. NRMRL collaborates with  both  public and private sector partners to foster technologies
that reduce the cost of compliance and to anticipate emerging problems.  NRMRL's research provides
solutions to  environmental problems by:  developing and  promoting technologies that  protect and
improve the  environment; advancing scientific and engineering information to support regulatory and
policy  decisions;  and  providing  the  technical  support  and  information  transfer  to   ensure
implementation  of environmental regulations and strategies at  the national, state, and community
levels.

This publication has  been produced  as part of the Laboratory's strategic long-term research plan.  It is
published  and made available by EPA's Office of Research and Development to assist the user
community and to  link researchers with their clients.
                                    E. Timothy Oppelt, Director
                                    National Risk Management Research Laboratory

-------
                                            Abstract

Environmental decision support tools often provide information that predicts a multitude of different
human health effects due to environmental stressors. Medical decision making and health economics
offer many metrics that allow aggregation of these different health outcomes. This paper provides a
review of this literature with special attention to aspects relevant in the environmental context. Based
on a  characterization of  medical and environmental applications, recommendations for the use of
human health  metrics in  different environmental decision support tools have been  derived. Further,
three  metrics  (quality adjusted  life  years  (QALYs),  disability adjusted life  years  (DALYs)  and
willingness-to-pay (WTP)) have been used to compare a wide range  of different environmental risk
factors. In  this example, WTP tends to  reflect mortality outcomes only. QALYs  and DALYs are
sensitive to mild illnesses that affect large  numbers of people, which are difficult to assess in an
unbiased manner.  Since  health metrics tend to follow  the  paradigm of utility  maximization, these
metrics may be supplemented with a semi-quantitative discussion of distributional and ethical aspects.
Finally, the magnitude of age-dependent disutility due to mortality for both monetary and non-monetary
metrics may bear the largest practical  relevance out of a series of suggested research questions.

-------
                                     Table of Contents

Notice
Foreword
Abstract
Acronyms and Abbreviations
Acknowledgements

1.   Introduction
2.   Human health metrics: a review of the literature
    2.1   What to measure?
    2.2   A classification of approaches for health metrics
    2.3   Short Introduction to QALYs, DALYs, HYE and WTP
    2.4   Social welfare function
    2.5   Properties of scales, attributes and the QALY-equation
    2.6   Discounting
    2.7   Whose values?
    2.8   How to elicit values and utilities?
    2.9   Insights in elicitation methods
    2.10  How to measure premature death?
    2.11  Time proportionality of HALYs
    2.12  Short-term and chronic effects
    2.13  Multipathology/co-morbidity
    2.14  Utility maximization versus distributional/ethical considerations
    2.15  Beyond disutility: costs of illness and averting behavior
    2.16  What is not measured by health metrics?
    2.17  Practical aspects
    2.18  Authorization of health metrics
3.   Comparison of DALYs, QALYs and WTP based on an example
4.   Characterization of medical applications and environmental tools
5.   Consequences for the choice of metrics in different applications
6.   Discussion and Conclusions
References

-------
                            Acronyms and Abbreviations
$            U.S. Dollar
15D          quality of life measurement instrument using 15 attributes (or dimensions)
BCA          Benefit Cost Analysis (same as CBA)
CA           Conjoint Analysis
CBA          Cost-Benefit Analysis (same as BCA)
CEA          Cost-Effectiveness Analysis
COI          Cost of Illness
CUA          Cost-Utility Analysis
CV           Contingent Valuation
CVA          Cost-Value Analysis
DALYs        disability adjusted life years
DC           Dichotomous choice format
EPA          Environmental Protection Agency (same as USEPA or U.S. EPA)
EUR          European currency prior to the introduction of the Euro
EuroQol      European Quality of Life measurement instrument
GDP          Gross Domestic Product
HALYs        health adjusted life years
HALYS+       Health Adjusted Life Years with age-weighting
HUI          Health Utility Index
HYE          Health-Years Equivalent
ISO          International Standard Organisation
ME           Magnitude Estimation
O3           (tropospheric) Ozone
OE           Open-ended question format
PM10         Particulate Matter smaller than 10 µm
PTO          Person Trade-Off
QALYs        quality adjusted life years
Qm           chronic health state
QW           Quality Weight
QWB          Quality of Well-Being
r            risk aversion factor
SEYLL        standard expected years of life lost
SF36         short form with 36 questions/attributes
SG           Standard Gamble
t            time
TO           Tradeoff Method
TTO          Time Trade-Off
U.S. EPA     United States Environmental Protection Agency
UN           United Nations
USEPA        United States Environmental Protection Agency
UV-A/B       Ultraviolet radiation within spectrums A or B
VAS          Visual Analogue Scale
VSL          Value of a Statistical Life
WHO          World Health Organisation
WTA          Willingness to Accept
WTP          Willingness-to-Pay
YLD          Years Lived Disabled
YLL          Years of Life Lost

-------
                               Acknowledgements

We would like to thank Jane Bare, Gordon Evans, Matt Heberling, Glenn Rice (all U.S. EPA,
Cincinnati), Ruedi Muller-Wenk (University St. Gall, Switzerland), and John Evans (Harvard School of
Public Health, Boston) for their valuable comments on earlier drafts. Thanks also to Jean Dye for the
technical editing.  Patrick Hofstetter was supported, in  part, by an appointment to the  Postdoctoral
Research Program at the National Risk Management Research Laboratory administered by the Oak
Ridge Institute for Science and Education (ORISE) through an interagency agreement  between the
U.S. Department of Energy and the U.S. Environmental Protection Agency. This article may or may
not reflect the views of the supporting agencies.

-------
1. Introduction

Environmental impacts on human health (i) are relevant compared to other health impacts1, (ii) are
considered as important as damages to ecosystems (Goedkoop et al. 1999, Harada et al. 2000), and
(iii) often trigger a change of behavior and regulations (Morgenstern 1997). However, environmental
impacts cause a myriad of different health effects of different durations (Lippmann 2000, de
Hollander et al. 1999), and in environmental decisions they often have to be compared with a
different set of health impacts caused by competing decision alternatives (comparative risk
assessment or life cycle assessment), with regulation costs (benefit-cost analysis), or with impacts
on ecosystems and resources (life cycle assessment). Therefore, a common metric that allows a wide
range of different health outcomes to be added up would enable better-informed decisions.

Applications of health metrics in environmental decision support tools have been explored  in many
different ways. While pure mortality statistics and years of life lost were used in early energy studies
(e.g., Inhaber 1982), willingness to pay (WTP) has been used for some time now (e.g., Viscusi et al.
1991, ExternE 1995, ESEERCO 1995, USEPA 1999b). Quality adjusted life years (QALYs)  have
recently been used in USEPA (1998a), Hammitt et al. (1999a) and Ponce et al. (2000), and disability
adjusted life years (DALYs) have been used in Hofstetter (1998), Goedkoop et al. (1999), Mara et al.
(1999), de  Hollander et al. (1999), Muller-Wenk (1999), Havelaar et al. (2000), Frischknecht et al.
(2000).

In many environmental studies where human health metrics were used they have been selected based
on the  historical roots of the field2, the needed measurement unit3, or the authors'  background4. Only
a few  general investigations concerning human health metrics for environmental decision support
tools  have been identified  (see  e.g., O'Brien et al.  1994,  Hofstetter  1998,  Carrothers et al.
submitted). Exchange  between concepts and knowledge in health economics and medical  decision
making on the one hand and  environmental  decision making on the other hand has primarily  been
case- or application-specific and rarely based on a broader overview and analysis.

Therefore, this article provides a summary of the concepts and findings available from the fields of
health economics and medical decision making (Section 2). This summary should ease access to
these fields for researchers in environmental decision making, and also reflect the findings in the
light of environmental applications. Some practical implications of three widely used health metrics
are shown and discussed by applying them to a recent survey of environmentally caused health
impacts in the Netherlands (Section 3). In order to understand the different metrics that have been
suggested in the medical field and their potential transferability to the environmental arena, we
characterize both the medical and environmental decision support systems (Section 4). Based on this
characterization, sensitive elements of health metrics can be identified and recommendations can be
made for congruent health metrics for environmental decision support tools (Section 5).

1 De Hollander et al. (1999) estimate that health impacts due to particles, noise, lead, ozone, radon and environmental
tobacco smoke cause almost 5% of the Dutch burden of disease.
2 Most methods used in Life Cycle Impact Assessment for the assessment of human toxicity have their roots in Chemical
Risk Assessment (e.g., Guinee et al. 1993, Hertwich 1999, Huijbregts 1999). The chosen metric to compare impacts from
different pollutants is a derivative of the margin of safety concept (ratio of exposure increase and no-effect exposure limit).
The non-occurrence of health impacts is in this case the anchor of the health metrics scale.
3 Externality studies used methods to assign monetary values to different health outcomes because their aim was to value
environmental damages in monetary units (e.g., Frey et al. 1985, ESEERCO 1995, ExternE 1995).
4 A recent example may be the Comparative Risk Assessment study performed by U.S. EPA (USEPA 1998a), where
QALYs have been chosen to express human health impacts due to microbiological water pollution and the effects of
chlorination and its side-products, while another group at the RIVM, The Netherlands (Havelaar et al. 2000) chose to
use DALYs for a similar case study.

Further,  we  raise some  issues that are also  relevant within  medical applications (time-non-
proportionality, actual age-dependency of disutility due to premature death and distributional/ethical
attributes) or could be considered important aspects in environmental applications (importance of
mild  impairments,  appropriate  life  tables  for  intergenerational  and  international   impacts,
interpretation of costs of illness).

In this article we assume that the human health endpoint, survival and cure rate, age of onset, and the
duration of disability are known,  i.e., a complete prediction of health profiles  is possible.  This
assumption is often not met because conclusive epidemiological  studies are needed to supply this
data. Metrics that make less restrictive assumptions have been suggested by de Rosa et al. (1985),
ILSI (1996), and TERA (1999).

-------
2. Human health metrics: a review of the literature

The  simplest form of human health  metrics is to select health outcomes as reported in health
statistics.  Systematic  health statistics were  started in the  18th  century in Australia and 1837  in
England  and Wales (WHO 1993).  Many statistics have been  started as mortality records only,
separated by sex and age. Later, they were extended to include morbidity information. Today's
standard  list of diseases was  initiated in 1853  and is  revised  at the beginning of each decade
(Alderson 1988). About 100 internationally defined disease classes are adopted widely; in addition,
single countries or continents use classifications that are more specific. The human health metrics we
are interested in go beyond these health  statistics. We  are interested in a measure for the loss  of
health due to diseases and premature death.

Medical decision making and health economics have dealt with  questions like choosing treatment
methods  and resource allocation for the last 30 years. This section draws on the research of these
fields. Subsections 2.1 to 2.3 provide a short overview of the metrics that are of highest interest to
applications in environmental decision support tools. Subsections 2.4 through 2.7 address some of
the fundamental  assumptions and choices behind most metrics. Subsections 2.8 and 2.9 deal with the
elicitation  of quality  weights for morbidity  outcomes, while subsection  2.10  addresses  the
measurement of premature death. Finally, subsections 2.11 through 2.18 review the literature with
respect to aspects relevant to the application of health metrics in environmental and decision support
contexts.

2.1    What to measure?
The World Health Organisation defined health in 1946 as follows: "Health is a state of complete
physical, mental and social well-being and not merely the absence of disease or infirmity" (WHO 1947). This
broad definition captures  essential  elements of quality-of-life  and underlies most human health
metrics. Based on this definition, it is also clear that the loss of health cannot be solely measured by
statistical information  on mortality. It is commonly understood that mortality measures  alone
provide decision makers with incomplete and insensitive information about overall population health
(Murray et al. 1996a, Field et al. 1998). Summary measures therefore include information on
mortality and morbidity and are the primary focus of this article.

Figure 1 provides a useful distinction of different assessment levels for morbidity outcomes. The
further down the levels we go, the more relevant the information becomes to individuals and the
more relevant local factors become, such as the health care system, family/household structure,
economic development (farming versus service economy), and cultural and religious beliefs. The
metrics discussed here are mostly located on the disability or handicap level. While the former
allows for applications that are more generic and international, the latter can more appropriately
take into account a patient's environment.

-------
   Assessment level                     Example
   Disease or Disorder (intrinsic)      Brain injury, retardation at birth
   Impairment (exteriorized)            Mild mental retardation
   Disability (objectified)             Difficulty in learning
   Handicap (socialized)                Social isolation

Fig. 1: Possible assessment levels for human health metrics (after Murray et al. 1996a)

While health statistics provide a snapshot in time, we are more concerned with the consequences, the
disease history. This information is captured in so-called health profiles. They include information
on duration and disease stages, cure, remission and co-morbidities. If the health assessment concerns
a group of people1 or ex ante assessments of individuals, then the relative frequency or probability
for each disease stage is used as a basis to quantify the health profile.

Therefore,  human health  metrics  summarize mortality  and  morbidity  outcomes and attempt to
measure  physical, mental  and social well-being on the level of disability or handicap for health
profiles of individuals or the population at large. The application directs the assessment levels (see
Section 5).

2.2    A classification of approaches for health  metrics
Spilker et al. (1996) provide an overview of about 300 different instruments for the comparison of
different health states. This wealth of instruments can be classified by a few characteristics, which
reduces the number of relevant approaches for this article. The following distinctions can be made:
-  Time-based versus time-point related approaches; as introduced earlier, we are seeking
   instruments that assess health profiles, i.e., health state over time. Therefore, only time-based
   approaches have been considered.

-  Generic versus disease-specific versus problem-specific approaches; we are interested in the
   disabilities due to a broad range of diseases that are caused by environmental impacts. Therefore,
   the more restricted instruments that focus, e.g., on asthma (Anonymous 1994) or cancer
   treatments (Rusthoven 1997) alone, are not sufficient for our purpose.
1 See an example for asthma in Anonymous (1999a).

-------
-  Indicator versus single index approaches; not all  approaches allow for an aggregation of the
   indicator values on different health dimensions to a single index. Here we are interested in single
   index approaches rather than descriptive health state instruments like SF36 (Fryback et  al.
   1997)2.

-  Explicitly decomposed versus statistically inferred decomposed versus holistic approaches; All
   three approaches acknowledge the multidimensional nature of health that includes aspects such
   as: health  perceptions;  social function  (social relations, usual  social role, intimacy/sexual
   function, communication/speech); psychological function (cognitive  and  emotional function,
   mood/feelings);  physical  function  (mobility,  physical  activity) and impairment  (sensory
   function/loss, symptoms/impairment)3 (Gold et al. 1996:95). In the holistic approach, judges are
   confronted  with a full verbal description of a health outcome along  with some of the above
   dimensions and asked for a direct utility judgment. The explicitly decomposed approach, on the
   other extreme, uses multi-attribute utility theory (see below) to  make  judgments separately  on
   how health states influence single health dimensions and finally how  to combine the different
   health dimensions into a single  number. The statistically inferred decomposition derives the
   relative  importance  of  the  health  dimensions by  multiple regression analysis  of  holistic
   judgments  on health  states  and the  health states'  scores  on  each dimension.  While the
   decomposed methods reduce the cognitive load of the judges compared to the  holistic approach,
   the explicitly decomposed  approach may assume invalid  properties of  the aggregation structure
   and nature  of scales. Although Frohberg et al. (1989a) recommend  the  statistically  inferred
   approach because of its  superior validity, all three approaches are applied in medical decision
   making for  practical and historical reasons.

-  Composite versus whole profile; the need for time-based metrics that are able to measure health
   profiles suggests that the utility of a health profile over the time-span of interest is either
   holistically assessed or composed of a number of time slots, each multiplied by its
   health-state-specific utility.
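
The composite approach in the last item can be sketched in a few lines of Python. This is an illustration only: the `HealthSlot` type, the function name, and the example durations and utilities are assumptions introduced here, not values taken from the literature reviewed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSlot:
    """One time slot of a health profile (hypothetical helper type)."""
    years: float     # duration of the slot in years
    utility: float   # health-state utility, 0 = death, 1 = perfect health


def composite_profile_utility(profile: List[HealthSlot]) -> float:
    """Composite assessment: sum over slots of duration times utility."""
    for slot in profile:
        if slot.years < 0 or not 0.0 <= slot.utility <= 1.0:
            raise ValueError("years must be >= 0 and utility within [0, 1]")
    return sum(slot.years * slot.utility for slot in profile)


# Made-up profile: 40 healthy years, 10 years with a chronic condition,
# 2 years of severe illness before death.
example_profile = [
    HealthSlot(years=40, utility=1.0),
    HealthSlot(years=10, utility=0.8),
    HealthSlot(years=2, utility=0.5),
]
health_adjusted_years = composite_profile_utility(example_profile)
```

A whole-profile (holistic) assessment would instead assign a single utility to the entire sequence of states, which this additive form cannot represent.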

Based on this overview, the rest of the article will concentrate on time-based, generic, and single
index approaches. Whether holistic or decomposed and composite or whole profile approaches are
favorable or more feasible is less obvious.

2.3    Short Introduction to QALYs, DALYs, HYE and WTP
Figure 2 shows a hypothetical health profile4 of an individual. The gray and black areas represent the
quality adjusted life years (QALYs) and disability adjusted life years (DALYs), respectively. While
QALYs measure the actual health quality integrated over time, DALYs measure the loss compared
to a hypothetical profile.

2 Due to the large amount of available information on public health measured with such descriptive indicator systems,
many algorithms for their transformation to utility scales have been developed; see e.g., Patrick et al. 1993, Torrance et
al. 1996, and Fryback et al. 1997.
3 A selection of these dimensions is usually included in multi-dimensional quality of life instruments (EuroQol (Essink-
Bot et al. 1993), 15D (Sintonen 1981), QWB (Kaplan et al. 1988), HUI MI/III (Torrance 1986), Rosser Index (Rosser et
al. 1972)), but none includes all dimensions (Gold et al. 1996:108).
4 An illustration would be yellow fever at birth, a broken leg due to a skiing accident at the age of 12, a major accident
with a motor bike at the age of 18, burn-out syndromes at the age of 35, heart attack at the age of 45 with almost full
recovery, typical age-related morbidities between the age of 50 and 70 with a skin cancer surgery at the age of 58. Lung
cancer at the age of 70 leads to death at age 72.

Pliskin et al. (1980) describe QALYs as utility functions under a number of different assumptions.
The most general form is the risk-adjusted version:
\[ \mathrm{QALY}_m = U(Q_m, t) = \left[ H(Q_m) \cdot t \right]^{r} \qquad [\mathrm{a}] \tag{1} \]
where U is the utility function of the constant chronic health state Qm during the life years t. H(Q)
refers to the value function of quality (we will call it quality weight) and r is a risk-aversion factor5.
It is common practice to discount future health outcomes if QALYs are used in cost-utility analysis
(see Section 2.6).
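
As a minimal numerical sketch of equation (1), the function below evaluates the risk-adjusted form for a constant chronic health state. The function name and the example values are assumptions made for illustration; discounting is left out.

```python
def risk_adjusted_qaly(quality_weight: float, years: float, r: float = 1.0) -> float:
    """Risk-adjusted QALY form [H(Q) * t] ** r for a constant chronic state."""
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality_weight must lie in [0, 1]")
    if years < 0 or r <= 0:
        raise ValueError("years must be >= 0 and r must be > 0")
    return (quality_weight * years) ** r


# Illustrative values only: 10 years in a state with quality weight 0.8.
neutral = risk_adjusted_qaly(0.8, 10.0, r=1.0)  # r = 1: plain product H(Q) * t
concave = risk_adjusted_qaly(0.8, 10.0, r=0.5)  # r below 1: concave in life years
```

With r = 1 the expression reduces to the familiar product of quality weight and duration, which is the form most applications use in practice.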
[Figure 2: health profile of an individual. Horizontal axis: Lifetime. Vertical axis: Life Quality
Measure, shown on two opposed scales, one running from 1.0 (perfect health) to 0 (death) and the
other from 0 (perfect health) to 1.0 (death).]
Fig. 2: Graphical illustration of a health profile and its measurement by Quality Adjusted Life Years (QALYs, gray area) and
        Disability Adjusted Life Years (DALYs, black area).
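5 The following notation applies: r > 1 risk seeking, r = 1 risk neutral, r < 1 risk averse.

DALYs are the sum of the years of life lost (YLL) and the years lived with disability (YLD) (Murray et
al. 1996a):

\[
\begin{aligned}
\mathrm{DALY}_m &= \mathrm{YLL}_m + \mathrm{YLD}_m \qquad [\mathrm{a}] \\
&= \text{discounting} \cdot \text{age-weighting} \cdot \left( \mathrm{SEYLL}_m + \text{disability weight}_m \cdot \text{disability duration}_m \right)
\end{aligned}
\tag{2}
\]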

where m is the type of disease. The YLL are calculated as the standard expected years of life
lost (SEYLL). For both YLL and YLD a continuously falling discounting function of the form
$e^{-rt}$ is used, where r is the discount rate and t the time. Age-weighting is included by the
expression $C \cdot a \cdot e^{-\beta a}$, where C and $\beta$ are constants set equal to 0.1658 and
0.04, respectively (see Section 2.10), and a is the age. For YLD, similar to QALYs above, the
disability weight is multiplied by the disability duration. See Murray et al. (1996a:64ff) for the
detailed equations for continuous and Elbasha (2000) for discrete age of onset. DALYs assume risk
neutrality.
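
The ingredients of equation (2) can be combined numerically as sketched below. This is a simplified reading and not the detailed formulation of Murray et al. (1996a): the 3% discount rate is a placeholder assumption, each component is discounted from its own starting age, and the function names and example numbers are introduced here for illustration only.

```python
import math

C = 0.1658    # age-weighting constant quoted in the text
BETA = 0.04   # age-weighting constant quoted in the text


def age_weight(age: float) -> float:
    """Age weight C * a * exp(-BETA * a) at age a (in years)."""
    return C * age * math.exp(-BETA * age)


def weighted_years(start_age: float, duration: float, weight: float = 1.0,
                   rate: float = 0.03, use_age_weights: bool = True,
                   steps: int = 10_000) -> float:
    """Midpoint-rule integral of weight * age weight * discount factor.

    Discounting runs from start_age: the factor is exp(-rate * (a - start_age)).
    """
    dx = duration / steps
    total = 0.0
    for i in range(steps):
        age = start_age + (i + 0.5) * dx
        discount = math.exp(-rate * (age - start_age))
        weighting = age_weight(age) if use_age_weights else 1.0
        total += weight * weighting * discount * dx
    return total


# Made-up case: disability weight 0.5 from age 30 for 10 years, followed by
# death at age 40 with a standard expected loss (SEYLL) of 40 years.
yld = weighted_years(start_age=30, duration=10, weight=0.5)
yll = weighted_years(start_age=40, duration=40, weight=1.0)
daly = yll + yld
```

Setting `rate=0.0` and `use_age_weights=False` recovers the plain product of disability weight and duration, which makes the effect of the two optional adjustments easy to inspect.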

Murray et al. (1996a) allow DALYs to be calculated with or without discounting and age-weighting.
Therefore, there are only a few key differences between QALYs and DALYs, and we introduce the
term health adjusted life years (HALYs) as an umbrella term. Figure 2 and equation (2) make clear
that the DALY framework needs to define a reference life expectancy, while QALYs just quantify
changes from one health profile to another. This implies that the reference state used for DALYs
typically assumes perfect health until death (Figure 2). Differences in the elicitation of disability
and quality weights will be addressed in Sections 2.8 and 2.9.

While  both  QALYs and DALYs make  the  restrictive assumption on time-proportionality, the
Health-Years Equivalent (HYE) of Mehrez et al. (1989) does not decompose the health quality and
duration aspect.

\[ \mathrm{HYE}_m(t) = U(Q_m, t) \qquad [\mathrm{a}] \tag{3} \]

where U is the utility function of the health state Qm during the  life years t. Although Willingness to
Pay (WTP) is embedded  in welfare economics and measures loss in life quality in monetary units
that have an external reference, its  simplest form is similar to the HYE:

\[ \mathrm{WTP}_{m,t} = V(\Delta Q_m, \Delta t) \qquad [\$] \tag{4} \]

where V is the value function of the health state change $\Delta Q_m$ during the time interval
$\Delta t$. WTP should be understood as the rate of substitution between health and wealth. It is
typically used to evaluate small changes in health states rather than to construct a total burden of
disease (see Hammitt (2002) for a more detailed elaboration of the nature of WTP for premature death).

QALYs have traditionally been the most important summary measure in medical decision making.
However, WTP and more recently DALYs find widespread use as well. HYE or SAVE (Nord
1992b) have not been widely used, which may be due to the increased burden of deriving standard
values for all possible combinations of health states and their duration. One should also be
aware that Azimi et al. (1998) found that, of 109 cost-effectiveness and cost-utility studies6 published
between 1990 and 1996, only 18% used QALYs and 71% used no summary measure at all.
On the other hand, Bell et al. (1999) collected 228 studies that used QALYs (almost all 1990-1997),
which suggests a wide use of summary measures.

The underlying assumptions and problems of the chosen functions in the presented equations and the
questions that arise when the variables are derived are addressed in the subsequent sections.

2.4    Social  welfare function
In environmental and many medical applications,  it is the social, rather than individual welfare,
which must be optimized. The way social welfare is defined and assessed will influence the way
preferences for health qualities are elicited (Sections 2.7ff). Therefore, principles and construction of
a social welfare function need to be addressed here.

The neo-classical  approach in  economics suggests that the social welfare function  should be an
aggregate of individual preferences. This means that individuals are the best judges of their own
welfare  (consumer sovereignty),  that individuals  can choose rationally  among  options (utility
maximization), that only the outcome matters (consequentialism), and that the value of any situation
should  be  judged solely on the  basis  of the  utility  levels attained  (welfarism).  An important
distinction is between  individual and social choices.  Choices that affect groups  of people are
inherently more complicated than those that affect  an individual, because social choices can affect
the distribution of consequences across people. Neoclassical economics often assumes that it is not
possible to make  interpersonal utility comparisons; i.e., it is not possible to  say whether  one
individual  gains more or  less  than another from an  increase in  health  or wealth.  Without
interpersonal utility comparisons, it is possible to say  that a Pareto improvement (a change  that
benefits some  people and harms no one) improves social welfare, but one cannot  say  whether
changes that benefit some people but harm others improve welfare.

In benefit-cost  analysis, the interpersonal utility comparison problem is "solved" by measuring all
gains and losses in monetary terms - by the affected individuals' willingness to pay for the gains and
willingness to accept compensation for the losses - and assuming that one dollar gain contributes the
same to social welfare regardless of who receives it, be he rich or poor, healthy or ill. Formally, a
change  that benefits some people but harms others is assumed to improve social welfare if it satisfies
the "Kaldor-Hicks  criterion." This requires that those who benefit from the change could compensate
(with money) those who are harmed, so  that everyone benefits by the change plus the payment of
compensation.
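
The Kaldor-Hicks test described above amounts to comparing summed monetary gains with summed monetary losses. The sketch below illustrates this under stated assumptions: the function name, the dictionary layout, and the amounts are hypothetical and chosen only to make the criterion concrete.

```python
from typing import Dict


def passes_kaldor_hicks(net_benefits: Dict[str, float]) -> bool:
    """True if the winners could compensate the losers and still gain.

    net_benefits maps each affected person to a monetary amount: positive
    values are willingness to pay for a gain, negative values are the
    compensation the person would require to accept a loss.
    """
    total_gains = sum(v for v in net_benefits.values() if v > 0)
    total_losses = -sum(v for v in net_benefits.values() if v < 0)
    return total_gains > total_losses


# Made-up example: two winners and one loser.
change = {"A": 120.0, "B": 30.0, "C": -100.0}
result = passes_kaldor_hicks(change)
```

Only the totals enter the test, so the identity of winners and losers plays no role, and the compensation itself is hypothetical: the criterion asks whether it could be paid, not whether it is.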
6 Nord (1999) defines cost-effectiveness analysis (CEA) by its use of natural units (mortality, number of cases) to
quantify the health effects, cost-utility analysis (CUA) by its use of utility measures like QALYs to quantify the utility of
health improvements, cost-benefit analysis (CBA) by its use of the Willingness-To-Pay approach to quantify the health
benefits in monetary units, and cost-value analysis (CVA) by its use of a holistic assessment of the health benefits of a
whole program from a societal point of view.

-------
An alternative approach to interpersonal comparisons that is conventional in the medical cost-
effectiveness literature is to measure health benefits in some form of "health-adjusted life year"
(HALY, i.e., a QALY or DALY type of metric). In this case, health benefits and harms to different
people are evaluated by assuming a HALY contributes the same to social welfare, regardless of
whether it goes to a rich person or a poor person, to a healthy or a sick one.

Another alternative  approach  captures  societal or altruistic preferences. The  elicitation of these
preferences is  very difficult.  The  effect of altruism on  health values is somewhat  subtle  and
uncertain, because  altruism can take many forms. Altruism about  another person's welfare may
reflect concern for the other's  total welfare, as  the other person evaluates it ("pure" altruism), or it
may reflect concern for only one  aspect of the  other  person's welfare,  e.g., his mortality risk
("safety-oriented"  altruism, a form  of "paternalistic" altruism). Bergstrom (1982) shows that a
society's total willingness to pay for a publicly provided reduction in mortality risk is the same if
individuals care only about their own welfare, or if they  are pure altruists. Jones-Lee (1992) shows
that  the value is  also  the same in the case where individuals  are  paternalistic altruists.  For
intermediate cases where individuals care about  others' welfare, but give somewhat greater weight to
their physical health risks than to other aspects  of their well-being, willingness to  pay can  be
somewhat larger, on the order of 10% to 40% under reasonable assumptions.

The existence of approaches based either on individual (self-interest) or altruistic preferences may
suggest that the type of welfare function depends on the decision at  hand. For societal decisions in
medical decision making, both approaches have been suggested. Since altruistic preferences can only
be derived if self-interest can be ruled out, Nord (1999) suggests that this approach be used to support
decisions on ad hoc public programs for others, while choices for private or long-term public health
plans can well be based on self-interest. Environmental decision support tools may be confronted
with both situations. Air pollution affects everyone, so self-interest may be justified in a social
welfare function; lead poisoning, on the other hand, will only affect families with young children
who live in contaminated buildings and environments. Here, societal or altruistic preferences may
come into play. The same holds for impacts that will affect people on other continents (malaria due
to climate change) or future generations.

2.5    Properties of scales, attributes and the QALY-equation
The ideal metric for medical decision making and environmental decision support tools should be
measured on a utility scale that allows addition of different health episodes for the same
person, addition of health outcomes of different persons, and use in cost-utility analysis. All health
metrics presented in the previous paragraph implicitly assume that such an aggregation is possible
under expected utility theory. Although it is known that expected utility is not descriptive, there is
some debate whether it should be prescriptive or even normative (Raiffa 1961/1970, von Winterfeldt
et al. 1986, Cohen 1996a/b, Wu 1996, Baron 1996, Douard 1996, Eeckhoudt 1996). Here we assume
that, even if expected utility is not always normative, it is at least the most mature theory.

-------
The QALYs and DALYs make additional assumptions by splitting up the time duration from the
quality/disability attribute.  Pliskin et al. (1980)  show  that the  following conditions must  be
empirically  satisfied for QALY to represent a valid utility function  for health outcomes with a
constant health status level over time (based on von Neumann and Morgenstern 1943, Keeney et al.
1976):

    1.  The two attributes duration and quality shall be mutually independent in their contribution to
       the utility (i.e., H(Q) is constant for all t)

    2.  The proportion of remaining life that a person would be  willing to trade off for a specific
       health improvement shall be independent from the expected remaining life  time. This is
       called constant proportional trade off.

If it is assumed, for practical reasons, that the utility function is linear over time (r=1), then a third
condition is required (Pliskin et al., 1980):

    3.  Risk neutrality regarding life years shall hold for the individual values.

In real-life applications, the health status is not constant over time but follows a health path or health
profile. Therefore, distinct intervals of different health states should be additive. This requires a
fourth condition (Keeney et al. 1976):

    4.  The value of a health state in period A shall be independent of the value of another  health
       state in period B, i.e., additive utility independence.
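
As a minimal sketch of what these conditions permit, the following code assumes the commonly used power form U(Q, t) = H(Q) * t^r for a constant health state, with H(Q) the quality weight (0 = death, 1 = full health) and r the exponent referred to in the text. The function names and the example profile are hypothetical. The additive composition over a health profile is only shown for the linear case r = 1, which is where conditions 3 and 4 apply.

```python
from typing import Iterable, Tuple


def utility_constant_state(quality_weight: float, years: float, r: float = 1.0) -> float:
    """Utility of living `years` in one constant health state: H(Q) * t**r."""
    if not 0.0 <= quality_weight <= 1.0:
        raise ValueError("quality weight must lie between 0 (death) and 1 (full health)")
    if years < 0:
        raise ValueError("duration must be non-negative")
    return quality_weight * years ** r


def qalys_for_profile(profile: Iterable[Tuple[float, float]]) -> float:
    """QALYs of a health profile given as (quality_weight, years) intervals.

    Assumes r = 1 (risk neutrality over life years) and additive utility
    independence, so the intervals can simply be summed.
    """
    return sum(utility_constant_state(w, t) for w, t in profile)


# Hypothetical profile: 10 years in full health, 4 years at weight 0.5, 2 years at 0.75.
profile = [(1.0, 10.0), (0.5, 4.0), (0.75, 2.0)]
print(qalys_for_profile(profile))  # 10 + 2 + 1.5 = 13.5
```

If r differs from 1, or if the value of one interval depends on the states before or after it, the simple sum in `qalys_for_profile` is no longer a valid utility for the whole profile.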

Miyamoto et al. (1985) find r≠1 because risk neutrality is empirically not given and they confirm
that the  above assumptions  are  violated.  Fryback  (1998:42)  states, "The most fundamental
assumptions in the construction of HALY [which includes DALYs and QALYs] measures is that the
part of the measure dealing with weighting health state can be obtained separately from the [...] time
duration part of the measure." He  acknowledges that this major assumption may  well be wrong.
Nord (1999) makes clear that time-proportionality was introduced from the very beginning
but lacks empirical support. He also claims that time discounting is a different issue and does
not explain the full effect. Nord (1992a) also cites examples where, in one study, one day in bed
performing no major activities was weighted 0.61, while another study with a non-specified duration
for  the health state 'bedridden' found a weight of 0.09!7 Multi-attribute utility theory  says that simple
shapes of utility functions8 are only  applicable if at least utility independence is given  (Fischer
1979). However, empirical studies show that information  about the  expected  duration of a state has
an effect on the valuation of its severity (Sackett et al. 1978, Sutherland et al. 1982,  Dolan  1996).
McNeil et al. (1981) find that if a health state (e.g., less than perfect level of speech) is experienced
for  less than 5  years then individuals are unwilling to  trade longevity for  health improvements.
Loomes et al. (1989), Bala et al. (1996/1998), and Richardson (1994) provide more evidence against
the  four mentioned assumptions.
7 In both cases, the scale ranges from 0 to 1 where 0 equals death and 1 full health.
8 like multilinear, quasiadditive and additive models

-------
Richardson et al. (1996) and Kupperman et al. (1997) showed that composite  and whole profile
measurements show a poor accordance, i.e., that a known sequence of different health states over a
full lifetime is judged differently from the results of a calculated composition. Krabbe et al. (1998)
confirm this finding by showing that additive utility independence is not fulfilled. However,
MacKeigan et al. (1999) find good accordance between composite and whole profile methods for
relatively minor health impairments and Treadwell (1998) shows that preferential independence is
satisfied in the QALY model and argues that controversial results can  be explained by (negative)
time discounting and lacking independence of the health states.

Gafni et al. (1993) argue against QALYs for the above reasons and suggest HYE as an alternative,
since it need not fulfill the restrictive requirements of additive independence and constant
proportional trade-off (MacKeigan et al. 1999).

Despite the strong evidence against the validity of assumptions (1) through (4),
many authors consider that QALYs (and consequently DALYs) may still be useful because
distortions are small, the composition rule is simple, and the cognitive task in empirical studies is
easier than, e.g., with HYE.

Whether the distortions due to the violations of all  major assumptions behind QALYs (and DALYs)
are indeed small enough to be accepted has not been demonstrated on a sufficiently large set of case
studies.

2.6    Discounting
Discounting is generally used to account for two  factors: preferences for health at different dates,
and opportunities for providing health benefits at different dates. Much  debate has occurred on the
question whether health  outcomes should be time discounted, how large the discount rate should be,
and whether the rate should be the same as that used to discount costs (Weinstein et al. 1977, Gold et
al. 1996).

It is useful to distinguish the individual and social  choice problems. For  an individual, date and age
are perfectly correlated and so an individual's preferences  for health at  different  dates and at
different ages cannot be distinguished. In principle, an individual's preferences for health at different
ages are virtually unrestricted.  Some individuals  might  consider an increment to health  equally
valuable at all ages, while  others would consider a  health increment more valuable if it occurs when
they are young (positive  time  preference),  and still  others  would consider the increment  most
valuable if it occurs when  they are old (negative time preference). Moreover, preferences for health
might be related to age in some non-monotonic fashion. Apparent positive time preference may be a
defect of myopia. It might  also arise from the latent risk of death that makes it uncertain whether one
will  experience  future costs and benefits, or decreasing marginal utility of health (if health is
expected to  increase).  Zero and negative time preference can  be explained by  dread9  and by  a
preference for sequences that improve over time (Wathieu 1997).

Within the context of environmental decision support tools, we are usually interested in social time
preferences and also have to deal with interpersonal and intergenerational aspects.  In this setting,
risk of death would be translated to risk of extinction - which is very small. Pure myopia would not
be considered in a prescriptive tool that is concerned with intergenerational equity10.  This leaves the
argument of decreasing marginal utility of health. Since health is generally measured per capita and
not in number of individuals, the growth in health is best reflected by increasing life expectancy and
its  adjustment for  health state (health  adjusted life expectancy [HALE], see, e.g., Murray  et al.
(1996a) for disability adjusted life expectancy). While this growth in HALE can be measured, less is
known about the marginal utility of this growth. Since we are not aware of any study of
environmentally caused health effects that considers the growth in HALE for future effects, there is
no decrease in marginal utility that would need to be accounted for by discounting.

So  far, we  have argued  within  a closed non-monetary  health  market and we  found that no
discounting is justified, at least so long  as increases in HALE are  neglected as well. However,
restricting  attention to  a closed health  market is generally unrealistic, since  both individuals and
societies can shift the availability of market goods through time (by savings and investment). Given
this, a  second school of thought claims that the opportunity costs should determine the discount rate
(Weinstein et al. 1977, Keeler and Cretin 1983, Gold et al. 1996). To illustrate, let  us assume that
there is a pill on the market that sells at a real cost  of $100 and improves your health for the month
after taking it  from the state "good" to "very  good." Investing  the  $100 divided by one plus the
market interest rate (e.g., $97) now will return $100 in a year, which can then be spent to buy the pill
and experience the health  benefit.  Thus,  a one-month improvement in health next year can be
purchased by investing $97 this year.
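
The arithmetic of the pill example can be checked with a short sketch. The 3% interest rate is an assumption chosen because it reproduces the rounded $97 figure in the text; the function names are hypothetical.

```python
def present_cost(future_cost: float, interest_rate: float, years: int = 1) -> float:
    """Amount to invest now so that it grows to `future_cost` after `years`."""
    return future_cost / (1.0 + interest_rate) ** years


def implied_rate(future_cost: float, invested_now: float, years: int = 1) -> float:
    """Annual interest rate implied by investing `invested_now` to obtain `future_cost`."""
    return (future_cost / invested_now) ** (1.0 / years) - 1.0


# Assumed real market interest rate of 3% per year.
print(round(present_cost(100.0, 0.03), 2))   # about 97.09 dollars today buys the pill next year
print(round(implied_rate(100.0, 97.0), 4))   # investing exactly $97 implies a rate of about 3.09%
```

Delaying the purchase by more years lowers the amount that must be set aside today, which is why the cost-effectiveness of the pill appears to improve with waiting when health effects are left undiscounted.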

Since the health gain stays the  same in physical terms, the cost-effectiveness of the pill will improve
the longer you wait. Based on  the  same argument, a health plan may delay the inclusion of this pill
in the covered part of its services.  More generally,  delaying investments in health may improve the
cost-effectiveness of many health plans. To avoid this situation, Weinstein et al. (1977) suggest that
the marginal internal rate of return that could be achieved by investing in alternative  projects by the
same actor should be used as discount rate. Gold et al. (1996) suggest in their recommendations to
use the same discount rate for costs and health outcomes and to apply a social discount rate.
9 Van der Pol et al. (2000) present a literature review and show that subgroups of respondents have either a zero or even
negative time preferences. They also find that individuals in severe health state are more likely to have negative time
preference because they want to eliminate dread (=Loewenstein hypothesis).
10 Pigou (1932:29f) argued "there is a wide agreement that the State should protect the interests of the future in some
degree against the effects of our irrational discounting and our preference for ourselves over our descendants. The whole
movement for 'conservation' in the United States is based on this conviction. It is the clear duty of Government, which is
the trustee for unborn generations as well for its present citizens, to watch over, and, if need be, by legislative enactment,
to defend, the exhaustible natural resources of the country from rash and reckless spoliation."

-------
The opportunity cost argument is only correct if the rate at which money can be transformed into
health is constant (e.g., the cost and efficacy of the pill remain constant) and the relative social
benefit of monetary and health increments remains constant (e.g., the monetary value of health does
not change) (e.g., van Hout 1998). Otherwise, different discount rates for costs and health may well
make sense.  In  our example, the cost of the pill might increase or decrease next year, altering the
amount that would need to be invested now to purchase it then. Alternatively, one might prefer to
enjoy the health increment now rather than next year, and be willing to spend the additional  $3
(=$100 - $97) to get it now rather than next year. There is no reason to assume that the value of one
HALY or one statistical life stays the same while real income increases. In short, it appears that the
monetary value of health should be discounted at the market interest rate; if the value of health
changes  over time, the rate  at  which  health should be discounted  differs from  the market rate
(Cropper and Sussman, 1990; Hammitt, 1993). Therefore, we conclude that the literature has not
adequately considered the question of how much the value of a HALY or statistical life is changing
over time. Once this value increase is considered, discounting can be applied11. Available empirical
evidence does not yet allow  us to  suggest correction functions  for future values of HALYs or
statistical life12.
Therefore, we recommend the following discounting practice:

      1.  If health is measured as utility in HALYs and one HALY stays equally valuable
          independent of its timing and of who benefits, then these HALYs are discounted at a social
          discount rate, e.g., 3% (Murray et al. 1996, Gold et al. 1996).

      2.  If the value of health is measured, the following distinction is needed:

          -   If future increases in the value of HALYs and statistical life have been included in the
             analysis,  the  marginal internal  rate of return  that could be  achieved by investing in
             alternative projects should be used as discount rate.  For societal decision making this
             rate may be approximated by a social discount rate of 3% (Murray et al. 1996, Gold et
             al. 1996).
          -   If future increases in the value of HALYs and statistical life have been omitted in the
             analysis, one should discount by the difference between the (unknown) rate of value
             increase of HALYs and statistical life and the social discount rate. Absent other
             information, this net rate may be approximated by zero.

11 Johannesson et al. (1997) find an average marginal rate of time preference for health of about 1%. Murray et al.
(1996a) and Gold et al. (1996) suggest both a social rate of time discounting of 3%. Others suggest using the time
preference of the market only to discount close future but to use a minimal discount rate for distant future because a
damage occurring in 30 years or 40 years should not be valued much differently (Weitzmann 1998). Therefore, the
discounting with a constant rate is questioned. Since environmental decisions may have health effects in the distant
future (e.g., climate change) it may be appropriate to discount such health outcomes at very low or zero rates.
12 Most empirical estimates suggest VSL varies less than proportionately with income, although a few comparisons
between industrialized and developing countries suggest the variation may be greater than proportionate. Over a
time-span of 16 years, the value of a statistical life (VSL) increased in Taiwan by a factor of 10 while the income per capita
increased in the same period only by a factor of 2.5 (Hammitt et al. 2000).
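
The recommended discounting practice can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the function names are hypothetical, the net rate is taken as the simple difference between the social discount rate and the rate of value increase, and the 3% figures in the example are assumptions in line with the social discount rate cited above.

```python
from typing import Sequence


def discounted_sum(stream: Sequence[float], rate: float) -> float:
    """Present value of a stream where stream[t] occurs t years from now."""
    return sum(x / (1.0 + rate) ** t for t, x in enumerate(stream))


def net_discount_rate(social_rate: float, value_growth_rate: float) -> float:
    """Rate to apply when growth in the value of health is omitted from the analysis.

    Simple difference between the social discount rate and the (usually unknown)
    annual rate of increase in the value of a HALY or statistical life.
    """
    return social_rate - value_growth_rate


halys = [1.0, 1.0, 1.0]  # hypothetical: one HALY gained now and in each of the next two years

# Case 1: HALYs as utility, equally valuable over time -> social discount rate (e.g., 3%).
print(round(discounted_sum(halys, 0.03), 4))

# Case 2: value growth omitted from the analysis; if it is assumed to equal the
# social rate, the net rate is zero and the stream is simply summed.
print(discounted_sum(halys, net_discount_rate(0.03, 0.03)))
```

With a zero net rate, health effects in the distant future carry the same weight as present ones, which matters most for long-lived environmental problems.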

2.7    Whose values?
Before we turn to the description of methods to elicit values for quality weights needed in the
QALYs approach, disability weights needed in the DALYs approach, WTP or HYE, we need to ask
whose values should be considered in those elicitation procedures.

A recent review of 38 studies (de Wit et al. 2000) that included groups of patients and non-patients
to elicit quality weights found that 11 of these studies show no statistically significant differences
between  different groups (in many cases  due to small sample sizes).  22 studies reported higher
patient values, two studies showed lower patient values and three studies found contradictory results.
Therefore,  it matters which group  or how the  study  population is  selected.  In  the  course  of the
Global Burden of Disease study (Murray  et  al. 1996a), it has been questioned whether globally
universal disability weights make sense due to cultural differences in health perception and the very
different consequences of disabilities. An empirical study performed in 14 different  countries
suggests  a fairly stable rank ordering among 17 selected health conditions with the big exception of
HIV infection (Üstün et al. 1999). They also find that the differences in ranking of mental versus
physical  conditions are larger between different groups of physicians and care givers than between
countries.

Different groups that might provide preference information can be  positioned in a 3-dimensional
space (strength of relationship [self, family, friends, no experience], time with illness [immediate,
soon, distant future,  never], subjective probability of illness [certain, likely, unlikely, no chance at
all]) (Dolan 1999). Patients are positioned at the origin of this system of coordinates while physicians
and health professionals usually have a lot of experience but little chance of experiencing the illness
soon themselves. Elicitation of preferences of people with no experience with a disability and little
chance of experiencing the disability soon is a challenge. Therefore, preferences from either patients
or health professionals are widely used in CEA (Bell et al. 1999).

What are  the  reasons for different  (higher)  quality  weights  of patients  compared to  health
professionals or the public? It was found that

-  the given description to the general public did not correspond with what patients actually suffer
   (Jansen et al. 2000),

-  human beings are very flexible in adapting to new situations,

-  human beings tend to state relative preferences that probably compare to people of similar age or
   fate13 (Groot 2000),

-  aversion against disability only plays a role in ex ante situations, but patients are in ex post situations,

-  aversion against death (which is often used as scale end in elicitation methods) may be higher for
   patients because death is more real or closer (Gabriel et al. 1999), and

-  the whole meaning of quality of life is redefined14.

For medical decision making, most of the stated  reasons  for higher quality weights of patients are
not just plausible but also valid, i.e., not distortions to be controlled for. In environmental decision
making, the number of "cases"  can  be influenced, i.e., how many people get asthma attacks or die
prematurely. This means that aversion against the disability as shown by the public may make sense
and adaptation by comparing just with people of similar age or fate may  not. Health professionals,
on the  other  hand, may have  a good idea what patients  are  actually suffering  but may  have
systematic biases  related to their training, social  status and work experience (Field et al. 1998).
Practically speaking, the "true" weights for avoiding health cases may lie somewhere between
patients' values and the public's values, as the health professionals' values usually do.

It  appears from this  discussion  that  the application in environmental  decision support  is less
dependent on patients' values, but that it may be difficult to inform the  public  accurately enough
about the health outcomes to elicit their preferences. A two-step procedure, where (1) patients describe
their health states in multi-dimensional quality of life instruments and (2) the public provides
aggregated values (either with MAUT or holistically), could solve some of the problems
mentioned (De Wit et al.  2000, Nord  1999). Alternatively, some of the problems  with patients'
preferences  can be solved  by  eliciting  preferences  for  changes in health states rather than for
absolute health states.

We conclude that first, it is important to decide whether self-interest or altruism  should  be elicited.
Second, it is a crucial step to make sure that the health state is well understood which can be done by
choosing patients or health professionals or two-step procedures; and third - as we will discuss in the
next two sections - the phrasing of the elicitation question will influence which values are activated.

Finally, one could also ask whose values for whom?  Since the severity of disabilities also depends
on the relevance of certain handicaps to specific groups of individuals, it has been shown that quality
weights depend on the  patients'  occupation, gender and family status (Holmes 1997). However,
environmentally induced health  effects are not sensitive to these characteristics. The higher shares of
environmentally affected children, elderly and already sick people can be considered by
age-group-specific quality weights (Murray et al. 1996a) and co-morbidity factors, respectively. Other sensitive
subgroups are assumed to show no deviation from an average disability to handicap relationship.
Therefore, age group and co-morbidity of affected populations should be considered in
environmental decision support tools.

13 People tend to reduce cognitive dissonance by overstating their health state and psychological adaptations help them to
shift to a new anchor (Ubel et al. 2000).
14 Koch (2000a/b) argues that disabled people repeatedly confirm their good health because the physical disability is
indeed no handicap anymore in a chronic situation. Therefore, the high quality weights of chronically ill patients make
sense. Brickman et al. (1978) found that persons 1 year after winning a lottery or developing paraplegia show very little
difference in happiness.

2.8    How to elicit values and utilities?
Here we present the elicitation methods that are used in medical decision making to derive quality
weights for QALYs, disability weights for DALYs, and values for HYE and WTP. The use of the
terms 'preferences', 'values' and 'utilities' is not uniform. Here we use 'preferences' as the most
general term that does  not  imply certain  scale characteristics or  other  properties, 'values'  are
'preferences'  measured on a  cardinal scale,  and  'utilities'  denote 'values' under risk that fulfill the
requirements of von Neumann and Morgenstern (1943) as outlined in Section 2.5.

The following short descriptions present prototypical versions of each method (see also Nord
1992a, Patrick et al. 1993:143ff, Murray et al. 1996a:71).

•   Rating Scale/Visual Analogue Scale (VAS): A typical rating scale consists  of a line with clearly
   defined endpoints. The most preferred health state is placed at  one end of the line and the least
   preferred at the other. The remaining states  are placed between the two endpoints  so that the
   intervals between the placements correspond to the differences in preferences as perceived by the
   subject that is asked to determine the weights. This method is the easiest to  administer and to
   understand for respondents. However, the resulting preference weights have usually only ordinal
   meaning.

•   Magnitude Estimation (ME): Subjects are asked to provide the ratio of undesirability for pairs of
   health states. For instance, state A may be judged to be two times worse than state B. A
   series of questions allows the subjects to locate all the health states on one scale of undesirability,
   where at least one health state should be perfect health or death (similar to the procedure used in
   the Analytic Hierarchy Process (AHP) (Saaty 1980)).

•   Standard Gamble (SG): A subject is offered a choice between two alternatives. Alternative 1 is a
   treatment with two possible outcomes: probability p of being restored to normal health and living
   another t years, and probability (1-p) of dying immediately. Alternative 2 is the certain outcome
   of living in a given health state i for t years. The probability p is varied until the respondent is
   indifferent between the two alternatives. The probability p at the point of indifference is the
   utility weight for health state i. This method provides utilities that conform with von Neumann
   and Morgenstern requirements for decisions under risk. Since human beings have difficulties in
   dealing with (low) probabilities, it is suggested to use cumulative prospect theory (Tversky et al.
   1992) to transform elicited probabilities (Stalmeier et al. 1999, Bleichrodt et al. 2000).

•   Tradeoff Method (TO): A subject is asked to choose a health state i+1 so that he or she is indifferent
   between the gambles (p, r; 1-p, i+1) and (p, R; 1-p, i), where p is a constant probability, r and R are
   two reference health outcomes such that R>r, and i is first the starting health outcome and then
   the previously elicited health outcome. This procedure constructs an interval scale with a large
   number of trade-offs between similar outcomes of equal preferential difference (Wakker et al.
   1996). It would fulfill the von Neumann and Morgenstern requirements but health outcomes are
   not available on a continuum, therefore, this method has so far not  been applied in medical
   decision making (Bleichrodt et al. 2000).

•  Time Trade-Off (TTO): A subject is offered two alternatives. Alternative 1 is health state i for t
   years followed by death and alternative 2 is normal health for x years. The duration x is varied until the
   respondent is indifferent to the choice between the two alternatives, at which point the preference
   weight for state i is x/t. Torrance et al. (1972) introduced TTO and found good accordance with
   SG. Therefore, this method has been widely used, as it is less demanding than standard gamble
   and does not suffer from the difficulties of deriving (low) probabilities. Nevertheless, whether
   TTO works for minor health impairments is questioned because people have proven unwilling to
   trade life expectancy for minor disabilities (MacKeigan et al. 1999). Therefore, others choose to
   use the worst health outcome rather than death (Krabbe et al. 1998). Since TTO inherently
   incorporates time preference, Johannesson et al. (1994) show how QALYs that use
   TTO have to be calculated if additional time discounting is needed.

•  Person Trade-Off (PTO): A subject is offered two alternatives. Alternative 1 is to extend life for
   x individuals in normal health and alternative 2 is to extend life for y individuals in health state i.
   The number y is varied until the respondent is indifferent to the choice between the two alternatives, at which
   point the preference for state i is x/y. Other forms of person trade-offs can be constructed where
   subjects are asked to trade off restoring health to x individuals in health state i versus restoring
   health to y individuals in health state j. Patrick et al. (1973) introduced this method as
   "equivalence  of numbers technique" and Nord (1992a) gave  it the  name Person Trade-Off
   method. The  PTO most directly reflects resource allocation situations whereas SG, TTO, and
   VAS  do not ask this  question and respondents that are confronted with the implications confirm
   that they did not have resource allocation in mind (Nord 1995). While the methods mentioned so
   far are explicitly about one's own  health and health-preferences, PTO is  explicitly about other
   people's health. Pinto Prades (1997) finds that PTO is empirically superior compared to SG and
   VAS  for societal resource allocation. He defines three versions of PTO. PTO1 has a gain/gain
   framing, PTO2 a gain/loss framing  and PTO3 uses a number of health states that  are  close
   together and builds up a chain (similar to TO). He finds clear  differences between PTO1 and
   PTO2 and stresses that PTO3 may work best for mild illnesses because it is both cognitively
   easier and easier for users to make trade-offs between severe illnesses and premature death.

•  Attribute Based Stated Choice, Conjoint Analysis (CA): Paired comparisons of multidimensional
   alternatives with factorial regression  analysis are the basic features of this method  (Huber et al.
   1993). If the  comparison involves  just a statement choice one speaks  of a Conjoint Choice or
   Attribute Based Stated Choice method.  If rankings or ratings are used, this is called Conjoint
   Ranking or  Conjoint Rating Methods respectively or  more generally  Conjoint  Analysis
   (Adamowicz  et al. 1998). It is a very useful method when very different attributes matter in a
   decision and  it has  a  high  degree of realism  because potentially  similar alternatives are
   compared. For example, it was shown that the value of in-vitro fertilization can not be measured
   only on a health scale but the attitude of the staff, time on the waiting  list or follow-up support
   have been considered as  non-health outcomes of the medical treatment (Ryan 1999). Attribute
   Based Stated Choice methods are also gaining popularity in determining WTP (Johnson 1998, 2000).
   While earlier regression analysis was restricted to linear additive models (Ryan et al. 1997), more
   sophisticated models are available nowadays. It has to be noted that, although the judges compare
   realistic scenarios, the results of the regression analysis may not be acceptable to them.
   One should also limit the number of attributes that are presented in order to stay within
   the cognitive capacity of humans (Miller 1956).

•  Contingent valuation (CV) (monetary valuation, stated preferences): Subjects can be asked in at
   least four different ways to estimate their willingness-to-pay (WTP) or willingness-to-accept
   (WTA) for certain health states. One can measure the amount individuals would be willing to
   pay (1) to reach a better health state, or (2) to prevent a worse health state from occurring.
   Or, one can determine the payment they would accept in order (3) to give up the opportunity for
   achieving an improvement in their health, or (4) to accept a further decline in their health state
   (see also Jones-Lee et al. (1997), and Wenstøp et al. (1997)). The number of studies of type (1)
   and (2) increased rapidly in the 1990s for use in benefit-cost analysis (Diener et al. 1998).
   In addition to starting point biases, anchoring biases, strategic biases, information biases and framing
   biases, which are common pitfalls of all listed elicitation methods, monetary valuation also
   suffers from scope insensitivities, hypothetical biases, and payment vehicle biases (Viscusi et al.
   1987, Jones-Lee et al. 1995, Baron 1997, Beattie et al. 1998,  Willis et al. 1998, Blumenschein et
   al.  1999). Those additional problems are due to the fact that respondents  are not only  asked to
   weight different health states but also to relate these weights to a (health-external) monetary unit.
   An important property of CV values is their dependency on income15. Typical elicitation formats
   used for CV studies include open-ended  question format (OE), (bounded) dichotomous choice
   format (DC), and iterative bidding. DC was found to be the most incentive-compatible format and
   to give reasonable upper-bound estimates, while OE answers tend to stay within a comfortable range and
   to understate the maximum WTP (strategic bias). The observation that people prefer to say yes
   (yea-saying effect) and the  starting-point bias are  potential problems. A  debriefing may be
   important  to understand potentially relevant biases (Bennett  et al. 1998).  Deliberative and
   discursive methods have been developed to  deal with framing and  embedding biases (Sagoff
   1998);  calibration factors have been  suggested  to  adjust  too-high  WTP  values due to  the
   hypothetical bias16 (Fox et al. 1998); a chained approach has been suggested that first elicits the
   WTP for the certainty of a complete cure from a road injury and the WTA compensation for the
   certainty of sustaining the same injury and then a standard gamble question elicits the injuries'
   severity compared to death (Carthy et al. 1999). Guidelines for good practice in the derivation of
   willingness-to-pay (Arrow et al. 1993) and a recent guide to CV  (Carson  2000) are in place to
   improve the state-of-practice.

•  Wage-risk method, household production function method, hedonic price method (revealed
   preferences):  Instead of  asking  people hypothetical questions   one  can  also  observe  their
   behavior,  i.e., their willingness to accept increased job risks (wage-risk  approach) or their
15 Some critics oppose the assumption that individuals' WTP should be constrained by their ability to pay that is
generally dependent on their income (Gafni 1997). However, as mentioned in Section 2.4, the applicability of this
criticism solely depends on how we choose to compare utility between people.
16 Such adjustment factors may depend on the commodity and whether it is a private or public good, i.e., is not one
universal factor (Fox et al. 1998).
   willingness to pay for reducing individual risks (market approach). Viscusi  (1983/1993/1998)
   presents comprehensive  overviews on studies that calculate the value of statistical life (VSL)
   mostly from wage-risk studies and a few market-approach and CV studies. Although Viscusi controls for
   many confounders that may bias the ratio between the increased risk of dying on the job and the wage
   differences between high- and low-risk jobs, he admits that riskier jobs may be preferred by
   risk-seeking individuals, which means that the derived VSL may understate the true values. However,
   further confounders like the healthy worker effect17 and the fact that environmental risks are
   perceived very differently from job risks may limit the usefulness of wage-risk estimates in
   environmental decision support tools (Hammitt 2000b). Another basic assumption is that
   high-risk workers know their individual risk. Viscusi (1993) states that the valuation of morbidity is
   more difficult than that of mortality because revealed-preference methods do not work due to a lack of markets.
   People and society also make investments in safety features like seat belts and air bags or provide
   regulations to reduce risks that impose costs. These values vary widely between <0 USD and 20
   trillion USD (Tengs et al. 1995) and are poor proxies for perfect risk-cost markets.
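
The wage-risk arithmetic described in the last bullet can be sketched as follows. This is a minimal illustration, assuming that the VSL is read as the annual wage premium divided by the additional annual fatality risk it compensates; the function name and the numbers are hypothetical and are not taken from Viscusi's reviews.

```python
def vsl_from_wage_risk(annual_wage_premium, added_annual_fatality_risk):
    """Implied value of a statistical life (same currency as the premium).

    annual_wage_premium: extra pay per worker and year in the riskier job
        (hypothetical input).
    added_annual_fatality_risk: extra probability of dying on the job per
        worker and year (hypothetical input).
    """
    if added_annual_fatality_risk <= 0:
        raise ValueError("the added fatality risk must be positive")
    return annual_wage_premium / added_annual_fatality_risk


# Hypothetical example: 500 USD more per year for an extra risk of 1 in 10,000.
example_vsl = vsl_from_wage_risk(500.0, 1e-4)
```

With these illustrative inputs the implied VSL is about 5 million USD; the confounders listed above (risk-seeking workers, the healthy worker effect) would bias such a ratio.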
2.9    Insights in elicitation methods
The descriptions above imply that much research has been done to test the methods and that there is
no consensus on which method is preferable. However, there is some consensus that methods like
VAS and ME do not really ask the trade-off questions at stake, and that the VAS produces ordinal
rather than cardinal scales (Nord 1992a). Nevertheless, VAS is still in use since it is the cognitively
least demanding method. The  lacking interval  property of the scale can either be dealt with by
transformation functions that compress the upper and lower tails of the scale or by its exclusive use
for interpolations between health states that have been valued by trade-off methods (e.g. Murray et
al. 1996a).

In the other methods, subjects are faced with a choice between pairs of conditions. The question is:
how much are you willing to sacrifice of certainty (SG), life span (TTO), and health of others (PTO),
respectively in order to improve your own quality of life (SG&TTO) or that of an imaginary patient
(PTO) (Nord 1992a). Due to these different questions, it is not surprising that the derived quality
weights differ for the same judge and health condition if different elicitation methods  are used. By
relying  on  earlier studies (Froberg et al. 1989c) and closer investigations, Nord (1992a) offers a
number of reasons for the observed pattern of weights in empirical studies:

Differences in what is being valued/framing
-  In SG people may show risk aversion, death aversion or reluctance to gamble with their own
   health, all of which increase quality weights.
17 Wage-risk studies represent only a small part of the population, the working population in risky jobs (often males
aged 20-50). The 'healthy worker effect' means that workers who perceive the higher risk or who are involved in an accident
drop out to find less risky jobs and that the majority of the workers who stay in such jobs actually have lower risks
because of their skills. This last effect is a bias because the risk is calculated based on all events while the wage level
may be determined by this remaining high-skill majority.
-  People with positive time preference will trade life years that will be lost in the distant future for
   smaller health improvements right now. This effect leads in the TTO to lower weights the longer
   the time horizon chosen (violation of constant proportional trade-off).

-  The different versions of PTO are confounded by distributional considerations. If somebody
   prefers not to  spend all health care money on one person then the disability weights tend to be
   skewed to  high values with little difference between severe and mild  conditions. Others that
   prefer to invest in the persons with the worst state will produce different outcomes (inbuilt
   distributional criterion). Therefore, it is important whether only one or many lives will be saved
   in exchange for treating ill persons.

-  In PTO, one sacrifices the lives of others while in TTO and SG one's own life. People with an
   attitude that they should not sacrifice others' lives but give priority to saving those lives will state
   higher quality weights. However,  a test of this hypothesis could not reveal such differences
   between individual and altruistic values (Richardson et al.  1997).

-  It depends  whether one asks how good or desirable a health state is or one asks to compare
   different illnesses.

-  Since people show a status quo effect, they are averse towards changes (Dolan et al. 1996).
Differences in anchors18
-  It depends whether death or full health is used as a reference state (in those methods that do not
   use both).

-  If the worst and the best imaginable health states are used to label the 0 and 100 endpoints of a
   scale, respectively, then the upper state may be taken as the anchor and the scale interpreted as
   'percentage of fitness'. This leads to lower quality weights.

-  If only 'dead' or only 'perfect health' is mentioned as endpoint, this will anchor the results.

-  If the scales extend beyond the labeled endpoints, this influences the rating as well. Dolan et al. (1996)
   found that a large number of health outcomes score worse than death, while other instruments do not
   allow for such weights at all.
Labeling effects (Froberg et al. 1989c)
-  It depends whether elicitation under uncertainty is presented as insurance or gamble.

-  Whether one offers a cash discount or credit card surcharge matters,  i.e., the presentation as a
   gain or loss is important (Stalmeier et al. 1999, Bleichrodt et al. 2000)19.
18 When preferences are partly formed during the preference elicitation process, humans tend to state preferences relative
(and often close) to fixed values suggested by the elicitation procedure, i.e., are anchored by them. If other anchors
would lead to different preferences for the same question, anchoring is considered to introduce a bias.
19 They do not only show the importance of this bias but also show how to debias gain/loss and probability distortions.
Debiasing is a research field in decision analysis and may fertilize the development for environmental decision support
tools (see, e.g., George et al. 2000 for debiasing of anchoring and adjustment biases).
While some  of the effects  are intended because  they  show that indeed different types of health
outcomes are valued, the strong anchoring  and unintended framing effects suggest that individual
preferences for health are not pre-existent but  constructed during the task (Dolan 1997). Based on
the reasonably good test-retest reliability of the methods, one can conclude that preferences exist at
least partly20. However, focus groups prior21 to and between the elicitation procedures, together with
post-elicitation questions, may help to form preferences and to detect biases (Froberg et al. 1989c, Nord
1995, Dolan  1997, Johnson et al. 1998).

Another indication of preference construction rather than elicitation is the wide spread of the weights
found (Torrance 1986, Nord 1995). These authors consider random error an important source of the spread.
While most studies showed that the values are independent of socio-economic factors or
professional level (Torrance 1986, Froberg 1989c), a more recent study found a small but significant
dependence on age and sex (Dolan et al. 1996). Because the effect is small, present evidence allows
us to assume that the weights are independent of socio-economic factors.

Many criteria lists have been suggested to judge the different elicitation methods (see, e.g., Froberg
et al. 1989b, Richardson 1994, Gold et al. 1996, Field  et al.  1998,  Brazier et al.  1999) but the
recommendations show a broad variety. Nord (1992a) mentions that there are three reasons that the
different experts do not agree on the "best"  method22. First, they do not take into account that there
are different versions of each method; second, they  do not differentiate  between the  different
applications;  and third, they do not differentiate between utilitarian and preference interpretation  of
the outcomes of the methods.  Sections 4  and 5 will elaborate on  the specific applications  in
environmental  decision support tools and  make  recommendations based on application-specific
criteria.

Since we use health metrics to value present and future  health outcomes, it  is important to know
whether the  derived values are temporally  reliable. Research in WTP methods  suggests that the
temporal reliability is better than assumed (Reiling et al. 1990, Carson et al. 1997). However, Cutler
et al. (1998) report different QALY weights for 1970 and 1990 (although on an ordinal scale). Since
the importance of physical  disabilities decreases  in an information society and the amenities for
physically disabled people get better, this finding is not  surprising. To value health effects in the
future one may want to consider such predictable trends.

2.10   How  to measure premature death?

Everybody dies, but when is  it  premature  and by  how many years?  From  the individual's
perspective, premature may mean that,  e.g., one is mentally not ready to die, one wants to reach a
20 Only one-third of the judges have changed their values during interview process (Shiell et al. 2000). However, such
resistance to changing former values may also be explained by other psychological factors.
21 This is also called warm-up process (Froberg et al. 1989c).
22 This disagreement is not only shared by researchers but also by practitioners. Rating scale (21%), TTO (18%) and SG
(12%) have been found to be the most commonly used elicitation methods in a review of 228 published CUA (Bell et al.
1999).
certain round age (e.g., 80), one wants to outlive one's parents (or, more realistically, parents do not
want to outlive their children), one wants (not) to outlive one's husband or wife, or one wants to die of a
"natural" cause. However, from a statistical perspective all deaths are premature because
other individuals of the same age survive. Life expectancy tables can be used to calculate how
prematurely somebody died. Such tables need to be a) valid for states, nations, ethnic groups,
continents or world averages; b) either averages for all individuals in the chosen area, or differentiated
by sex, lifestyle factors, profession etc.; and c) based either on today's death statistics alone (by
calculating cohort life expectancies assuming that a child born today will at each future age be subject
to the currently observed age-specific mortality rates), or on estimates of the future age-specific
mortality rates that will apply when the subject cohort reaches those ages. Therefore, the question
"when is a death premature and by how many years?" is far from trivial.

The global burden of disease that attempts to estimate years of life lost on a globally comparable
level is the place where these questions are treated very explicitly. The following propositions were
made to  decide on  the above question (Murray et al.  1996a:6): "I: The burden calculated for like
health outcomes should be the same; and II: The non-health characteristics of the individual affected
by a health outcome that should be considered in calculating the associated burden of disease should
be restricted to age and sex". Based on these propositions they chose a standard expected years of
life lost that differentiates only  between age and sex and applied it worldwide. Although the chosen
model23 is very close to the demographics of Japanese women it was corrected for peculiarities that
are not health related  (like war). For Japanese men they derived a theoretical genetically caused sex-
gap of 2.5 years, which is less than today's observed difference24. Not surprisingly, this "closing of
the health  gap" has been criticized for its effect of increasing men's  years  of life  lost and the
potential shift of health resources to men (Anand et al. 1997). Since the life expectancy of Japanese
women is the highest worldwide and much higher than in developing countries, the chosen approach
has also been criticized as unsuitable for evaluating single health interventions
because it would introduce a bias towards saving the lives of the old (Williams 1999). Whether one agrees with
these objections depends on the application in mind25 and whether the propositions apply.

Risk assessment and life cycle assessment often assess marginal health increases or  decreases due to
specific interventions. In these  cases, non-affected risk factors are assumed to stay  constant and no
assumptions on "genetically based" life tables are necessary. However, since many health impacts
due to environmental pollution are global  and may concern future generations (when sex gap  and
inequalities in life expectancy may be  smaller, i.e., the assumption of ceteris paribus does not hold
anymore) the approach by Murray et al. (1996a) may serve as a prototype.
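
The life-table reasoning in this section can be sketched as follows. This is a minimal illustration, assuming a hypothetical toy table of one-year death probabilities and deaths that occur on average at mid-year; it is not the Coale and Demeny model used by Murray et al. (1996a), and the function name is our own.

```python
def remaining_life_expectancy(death_probs, age):
    """Expected remaining years of life at a given age.

    death_probs[x] is the probability of dying between exact ages x and x+1
    (hypothetical input). Deaths are assumed to occur at mid-year. If the
    table does not end with a probability of 1.0, survivors beyond the last
    tabulated age are ignored, so the result is truncated.
    """
    alive = 1.0   # share of the cohort still alive at the start of the year
    years = 0.0   # accumulated expected person-years
    for q in death_probs[age:]:
        years += alive * (1.0 - 0.5 * q)  # those who die live half the year
        alive *= 1.0 - q
    return years


# Toy table (purely illustrative): nobody dies in the first year, half of the
# survivors die in the second year, everybody remaining dies in the third.
toy_table = [0.0, 0.5, 1.0]

# Years of life lost for a death at age 1 under this table.
yll_at_age_1 = remaining_life_expectancy(toy_table, 1)
```

Choosing a different table (national versus global, by sex, current versus projected mortality) changes `death_probs` and therefore the years of life lost attributed to the same death, which is the point made in the text.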
23 The UN model 'Coale and Demeny West Level 26'.
24 After this adjustment, they used the 'Coale and Demeny West Level 25' model for men - although initially developed
for women.
25 National burden of disease studies used national rather than global life-tables (Melse et al. submitted, Anonymous
1999b).
Is each life year of equal value?
Here we ask whether the implicit assumption in equation (1), p.6 - that the value of one life year
depends on its health state only - holds empirically. At the other extreme,
estimates of the value of a statistical life (VSL) have often assumed a constant VSL
independent of the years of life lost (Viscusi 1993, ExternE 1995). Empirical studies show that in the
USA and Sweden saving 85 and 35 seventy-year-olds, respectively, is considered equivalent to saving one
30-year-old (see Johannesson et al. 1997a for references). This is strong evidence against a constant VSL,
but it is also inconsistent with the assumptions in equation (1) if typical age-specific health states
and life expectancies are assumed.

A simple consumption model that excludes dependents shows that the VSL is strongly dependent on
income and, in the case of a perfect market, increases slightly until the age of 25, then
decreases slightly until the age of 40, and decreases more steeply thereafter (Shepard et al. 1984). Based on a similar
model (see also Fig 3) it is concluded that the marginal utility of money decreases with increasing
age and that the real  rate of interest is crucial for knowing how much the  curve deviates from a
monotonically decreasing function (Ng 1992). The dependence on age in these economic models
occurs because the benefit of a unit decrease in mortality risk decreases with age and opportunity
costs of spending decline with age. The size of the utility discount rate compared to the interest rate,
the inclusion of dependents and the possibility to borrow money alter the shape and position of the
curves (Hammitt 2000b). These approaches fall short because they ignore the fact that humans are
social beings for whom friends and family matter. This last argument may work in both directions: the
social environment may raise the value placed on the end of life, but it may also prevent all
remaining money from being spent to delay the inevitable, i.e., the dead-anyway effect (Pratt et al. 1996) may
be less pertinent in a social environment. However, the inverse U-shaped curve for age-dependent
VSL has also been shown by two empirical WTP studies (Johannesson et al.  1997a, Carthy et  al.
1999)26.

DALYs include an inverse U-shaped age-weighting function that was included based on a number of
arguments given in Murray et al. (1996a). However, as also pointed out in a discourse in Barendregt
et al. (1996), Murray et al. (1996b), and Sayers et al. (1997), this age-weighting alters the life
expectancy-dependent utility of life only slightly, since the inverse U-shaped function does not replace
the life expectancy table but acts just as a multiplicative modifier with most factor values between
0.5 and 1.5. Doing so means that life years lived above the age of 50 are discounted slightly more
than the  life-expectancy tables already suggest.  This  contradicts  the above mentioned empirical
findings and the life cycle consumption model outcomes (see also Figure 3).  If an age-weighting
function should be combined in a multiplicative way with life expectancy, then this function should
have a U-shape rather than an inverse U-shape to reflect the finding that the value per year of life
lost increases with age. Therefore, we do not recommend using the age weighting suggested in Murray
et al. (1996a).
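
To make the shape of the multiplicative modifier concrete, the sketch below evaluates an inverse U-shaped age weight of the form C * x * exp(-beta * x). The functional form and the constants C = 0.1658 and beta = 0.04 are the commonly cited Global Burden of Disease values and are an assumption here, since the text above does not state them; the function name is hypothetical.

```python
import math


def daly_age_weight(age, c=0.1658, beta=0.04):
    """Relative weight of a life year lived at a given age.

    The form c * age * exp(-beta * age) and the default constants are
    assumed (commonly cited GBD values), not taken from this report.
    """
    return c * age * math.exp(-beta * age)


# Weights at a few illustrative ages, rounded for display.
weights = {age: round(daly_age_weight(age), 2) for age in (5, 25, 50, 70)}
```

Under these assumed constants the weight peaks at age 1/beta = 25 and stays roughly within the 0.5 to 1.5 band mentioned above for most ages, which is why it modifies the life-expectancy-based values only slightly.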
26 The VSL varies only by a factor of 1.5 between a 30- and a 70-year-old and is therefore closer to the predictions by Shepard
et al. (1984). However, the authors speculate that embedding and anchoring may have affected their results (Johannesson
et al. 1997c).
[Figure 3: value of remaining life plotted against age at premature death; curves labelled YLL(0,0) and VSL]
Examples of the different values of remaining life at different ages. The solid line is measured in years of life lost and
represents the statistical life expectancy used in the global burden of disease (Murray et al. 1996a), the dashed line at the
top is an estimate of age-independent VSL, and the age-dependent VSL is taken from Ng (1992).

Since a large share of premature deaths due to environmental pollution occur at high age, it is
important to know how to value these years of life lost at high age. Present evidence shows that the
assumption of an age-independent value of life is not supported. However, theoretical models that
produce inverse U-shaped functions are over-simplistic by ignoring social interactions and their
absolute values are based on a number of uncertain  assumptions.  The few empirical studies suffer
either from potential biases (Johannesson et  al. 1997c) or focus only on longevity and report much
lower values than expected by common sense (Johnson et al. 1998). Therefore, an interim solution
may be to rely on life expectancy alone with  an additional reporting of the age-profile of the affected
population or an age-weighting based on the most recent empirical findings in Carthy et al. (1999) as
done in Seethaler (1999).

So far, we concentrated on the fact that the value of a life year may be a function of age and health
state. However, from  research in risk perception it is well known that the cause of loss and its
psychometric characteristics matter when people judge risks (e.g.,  Fischhoff et al.  1978). Lives lost
due  to involuntary, unfamiliar, and catastrophic risk sources are found to be valued higher than
others and lead to different WTP per life lost (Tolley et al. 1994,  Ramsberg 1999, Cooksen 2000).
Many environmental risks belong to involuntary, unfamiliar but chronic risks which means that the
WTP is higher than average but more or less  similar within this group of risks. As mentioned before,
WTP is usually dependent on the individual's ability to pay. Finally, the value of an additional year
of life may also depend on the individual's assumption whether s/he is dying prematurely or not (i.e.,
whether an age goal has been achieved), or on the societal assumption of a fair innings (everybody
should achieve a certain age (Williams 1996), see also Section 2.14).

2.11   Time proportionality of HALYs
Section 2.5  summarized for morbidity outcomes some of the empirical evidence against the major
assumption in QALYs and DALYs (=HALYs), the time proportionality. The section above provides
the same evidence for mortality. Due to a lack of convincing alternatives, we concluded above that
the assumption of time proportionality might be a necessary interim  solution. For morbidity, the
same argument was made also claiming that the deviations are small  (Dolan 1996). However, for
both mortality and morbidity there are examples where the deviations are major and examples could
be constructed that show  preference  reversal if age  and duration-dependency are considered
respectively.

WTP studies using  stated preferences (Alberini et al.1997) and conjoint analysis (Johnson et al.
1998/2000)  show for short term outcomes like cough or asthma attacks strong non-proportionalities
if durations are 1, 5 or 10 days. For acute and/or short-term health effects due to air pollution
Johnson et al. find that ln(d+1), where d is the number of days, behaves approximately linearly
as the time factor in their attribute based stated choice analysis27. This is the only alternative
proposal we found in the literature to incorporate the duration in a non-linear way.
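
The difference between time-proportional and logarithmic duration scaling can be sketched as follows. This is a minimal illustration, assuming that the value of an episode scales with ln(d+1) as reported above; the function names are hypothetical, and the sketch does not reproduce the estimated coefficients of Johnson et al.

```python
import math


def log_duration_factor(days):
    """Duration factor ln(d + 1) for an episode lasting `days` days."""
    return math.log(days + 1)


def value_relative_to_one_day(days, time_proportional=False):
    """Value of a d-day episode as a multiple of the value of a 1-day episode."""
    if time_proportional:
        return float(days)  # HALY-style assumption: value grows linearly
    return log_duration_factor(days) / log_duration_factor(1)


# Compare both assumptions for the durations mentioned in the text.
comparison = {
    d: (value_relative_to_one_day(d, time_proportional=True),
        value_relative_to_one_day(d))
    for d in (1, 5, 10)
}
```

Under the logarithmic factor a 10-day episode is valued at roughly three and a half times a 1-day episode rather than ten times, which illustrates how strongly such short-term outcomes can deviate from time proportionality.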

A correction of time-proportionality for morbidity should be able to deal with two major effects:
change aversion and adaptation. Since the environmental context implies that we can prevent health
effects from occurring we are in an ex ante situation. The above findings can partly be explained by
this effect. There is a  strong aversion to get sick at all,  i.e., to change the health state (status quo
effect). Further, it seems to be important whether the health state is perceived to be fully reversible.
Whether reversibility is assumed  or not depends probably on the predicted time duration in the bad
health state (Sackett et al. 1978)28. Therefore, aversion against both change of health state and
perceived irreversibility should be accounted for. When we discussed the differences between patient
and non-patient values, we  already mentioned that adaptation to health outcomes increases the
perceived quality of life. This can also be seen as a marginal decrease  in dis-utility and is not to be
confused with time preference. The empirical effort to estimate the additionally needed parameters
to take into account the mentioned  deviations from time-proportional weights is huge. However,
instead of investing in further research that confirms the lacking time-proportionality one could
estimate these parameters and functions. These estimates could then be used in the screening phase
of applications to get an estimate whether the assumption of time-proportionality enters a relevant
bias or not. If the bias appears to be major, HYE, WTP or a program evaluation following proposals
27 Part of the found effect may also be caused by scope insensitivity, i.e., a fixed budget for averting mild illnesses that is
insensitive to the number of days.
28 Although this early study is often cited when the time-proportionality is questioned it has not been pointed out that the
study provides evidence for marginal increase rather than decrease of dis-utility.
of Nord (1999) may be most efficient  and useful. In all  other  cases, the much simpler time
proportional approaches may be acceptable.

2.12   Short-term and chronic effects
QALYs have  explicitly been developed for chronic health  outcomes (Pliskin et al.  1980) and
DALYs concentrate usually on  permanent  conditions (AbouZahr  et  al.  2000). However, the
application in medical decision making and environmental decision support tools makes it necessary
that both short-term and chronic health effects can be evaluated (Alberini et al. 1997, Johnson et al.
1998, Bala et al. 2000).

Stouthard et al. (1997) distinguish diseases with an  episodic pattern (e.g., asthma, migraine) and
short-term  conditions with  full recovery  (e.g., colds, gastroenteritis). The episodic diseases have
been  described as chronic outcomes while short-term  conditions with full recovery  have been
presented in an annualized profile, e.g., 50 weeks of perfect health  and 2 weeks of a cold. If time-
proportionality applies then the latter example would lead to a quality weight of at least 0.96, even if
the cold would be perceived as equally  severe as death.  However,  as discussed  above,  aversion
against changes in health states may justify different values. Therefore, the judges need not be forced
to comply  with time-proportionality  for short-term  conditions and the  procedure suggested by
Stouthard et al. (1997) may be a pragmatic solution.
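
The lower bound of 0.96 quoted above follows directly from time proportionality and can be checked with a short sketch. The function name and the 52-week year are assumptions made for illustration only.

```python
def annualized_quality_weight(weeks_ill, weight_while_ill, weeks_per_year=52):
    """Time-proportional quality weight of a one-year health profile.

    The year consists of `weeks_ill` weeks at `weight_while_ill` and the
    remaining weeks in perfect health (weight 1.0).
    """
    weeks_healthy = weeks_per_year - weeks_ill
    return (weeks_healthy * 1.0 + weeks_ill * weight_while_ill) / weeks_per_year


# Two weeks of a cold rated as badly as death (weight 0) still gives 50/52.
worst_case = annualized_quality_weight(2, 0.0)
```

Because the worst case is 50/52, any judge who rates the annualized profile below about 0.96 is expressing a preference that time proportionality cannot represent.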

2.13   Multipathology/co-morbidity
People often suffer not from one health outcome but from several (mild) disabilities at the same time. In
Beaver Dam, Wisconsin, a township in the USA, 1356 individuals above the age of 45 rated their
own health with different methods. About 20% of the individuals each had no, one, two or three health
conditions. The remaining 20% had as many as 4 to 10 different health conditions
(Fryback et al. 1993). Epidemiological studies that are used to estimate dose-response relationships
in environmental decision support tools  do  report all health endpoints that are considered to be
caused by mechanisms triggered by the specific agent. Therefore,  it does not matter whether the
different health outcomes are causally related or not.  However, the  question arises how  the quality
weights can be added if different health effects affect the same individual and if this individual
shows age-related deviation from perfect health? This question is rarely addressed in the literature
and has been mentioned as a shortcoming of the DALYs approach (Williams 1999, Sayers  1997).
Anonymous (1999) adjusts for co-morbidity by assuming a multiplicative model among morbidities.
They were interested in allocating the burden of disease to  different causes. Therefore, they also
assume  that the most severe state gets the full quality weight while the quality weights  of the less
severe co-morbidities are  adjusted.  If there are  two health  outcomes with QWa and  QWb, and
outcome (a) is the more severe outcome of (a) and (b) then

$QW_a^{comorbidity} = QW_a$   and   $QW_b^{comorbidity} = 1 - (QW_a - QW_a \cdot QW_b)$                   (5)

Due to the high share of correlated morbidities within mental disorders and within injuries, different
procedures have  been suggested  for these outcomes (Anonymous 1999a).  Since  the  purpose of
environmental  decision support tools  is not to find a just allocation to single  morbidities but to
estimate a decrease or increase in overall health state we only need guidance on how to calculate co-
morbidities and not on how to allocate disutility to single morbidities. For this purpose,  we suggest
using the multiplicative model. Instead of excellent health, many CUA studies use the absence of the
disease under study as the upper end for quality weight. Such quality weights have to be adjusted by
the age-related quality weight (Fryback et al.  1993). We suggest that the age-related quality weight
is QWa and the morbidity under study QWb and use equation (5) to adjust QWb, i.e., the age-related
quality weight  is kept constant.  Age-related quality weights can be found in Fryback et al. (1993)
and Bell et al. (1999).
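
The multiplicative adjustment of equation (5) can be sketched as follows. This is a minimal illustration, assuming quality weights on a scale where 1 means perfect health; the function names and the example values are hypothetical.

```python
def comorbidity_adjusted_weights(qw_a, qw_b):
    """Apply equation (5): keep QWa and adjust QWb.

    qw_a: weight that is kept constant (the more severe outcome, or the
        age-related quality weight as suggested in the text).
    qw_b: weight of the second outcome (the morbidity under study).
    """
    adjusted_a = qw_a
    adjusted_b = 1.0 - (qw_a - qw_a * qw_b)
    return adjusted_a, adjusted_b


def combined_quality_weight(qw_a, qw_b):
    """Overall quality weight under the multiplicative model."""
    return qw_a * qw_b


# Hypothetical example: age-related weight 0.8, morbidity under study 0.9.
adj_a, adj_b = comorbidity_adjusted_weights(0.8, 0.9)
total_loss = (1.0 - adj_a) + (1.0 - adj_b)
```

In this example the two adjusted losses sum to 1 - 0.8 * 0.9 = 0.28, so the allocation in equation (5) reproduces the overall multiplicative quality weight of 0.72.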
2.14  Utility maximization versus distributional/ethical considerations
Although none of the discussed health metrics empirically  satisfy the strong assumptions  of von
Neumann and Morgenstern utilities (see section 2.5) they have been developed under the assumption
that health measured by these metrics should be maximized; this is called utility maximization. This
policy is usually followed by consequentialists who are primarily concerned with the health outcome
attained. Other policy alternatives concentrate on the process by which health  is achieved or the
opportunities people have to obtain health (Holmes 1995). Since the maximization of all three policy
goals is usually not possible  (Rawls 1971),  a choice has to be made at this stage. Environmental
decision support tools considered here attempt to minimize health effects. Therefore, they require the
consequentalists' view, which will be discussed here in more detail.

A lot of research in medical ethics has analyzed  whether people agree to maximize QALY and HYE
or minimize DALYs and WTP  as a sole criterion for resource allocation. A number of deviations
from this sole reliance on metrics have been found:

•   People want to improve the situation for the worst-off first (behind a veil of ignorance, see e.g.,
    Rawls 1971, Andersson et al. 1999). This is also known as the severity criterion, see Nord (1999)
    for a review29.

•   Three groups of people can be differentiated: 1) utility maximizers who accept the health metric
    as the only criterion, 2) diffusers who prefer to spread health care resources among all with
    disabilities and not just the patients with the largest increase in health, and 3) concentrators
    who prefer to spend the resources on fewer patients with visible improvements30 (Olsen 2000,
    Richardson et al. 1997). Others call this the realization potential, i.e., the group with the larger
    improvement potential may (or may not) be treated first, see Nord (1999) for a review.
29 If the quality weights show a so-called upper-end compression, i.e., that only very severe health states get quality
weights below 0.65 but most health states are between 0.9 and 0.999, then this severity argument can in most cases be
fulfilled by the health metric. Due to death aversion, such upper-end compression is expected from utility measures (see
Nord (1999) for a review).
30 Olsen (2000) also finds that a threshold for minimal improvements may exist for the concentrators.

•  While 70% of the judges of a convenience panel mentioned that the maximization criterion
   should  be the most important allocation criterion for donor liver grafts only  0.7%  finally
   followed  a strict maximization of health outcome. All others also paid attention to age  (prefer
   younger), cause for liver disease (treat innocent first), waiting time and whether it is already the
   second transplant (Radcliffe 2000).

•  Survival is judged  by patients  as much more important than perfect health. The present health
   metrics may underestimate the importance of survival (Nord 1999, Cohen  1996).  However,
   Johnson et al. (1998) show that the prolongation of life at poor health gets very low or even zero
   WTP.

•  As mentioned earlier,  the fair innings  argument  claims that  everybody  should  enjoy  the
   healthiest life possible until a certain age  (70-75 years) (Williams 1996). This is also known as
   equality argument,  see Nord (1999) for a review.

•  When values of WTP are derived, one typically assumes that the current distribution of income
   among individuals is appropriate. Therefore, WTP has been criticized for violating equity
   principles. However, if WTP is used within a country and within the health sector alone, this
   assumption may be unproblematic or adjustments can be made (Kenkel 1997, Donaldson 1999).
   The finding that socio-economic factors have no influence on health quality weights supports
   this claim if the population is concerned by health outcomes to the same extent or if average
   WTP values are used for all population groups. On a global level, the application of local WTP to
   global consequences of environmental problems may lead to strong violations of the equity
   principle and result in giving less weight to health damages in poor countries.

•  The notion of double jeopardy was introduced to spotlight disabled people. It is argued that they
   are disadvantaged twice: first, they suffer the disability, maybe for their whole life, and second, if
   resource allocation follows QALY maximization, they are disadvantaged because a year of life
   saved counts less and - if co-morbidity is calculated following equation (5) - additional health
   outcome may count less as well (Singer et al. 1995, Koch 2000a/b). This problem was also found
   when the health loss of an HIV-infected subpopulation due to drinking water impurities was
   assessed31.

•  Due to  the limited  dimensionality of health metrics, it was found that the sensitivity for  certain
   groups  of health outcomes might be weak and therefore  set biased priorities.  This  point was
   made with respect to mental  health care (Chisholm  et al.  1997) and sexual  and reproductive
   health conditions (AbouZahr et al. 2000). However, several instruments consider non-physical
   disabilities and both Murray et al.  (1996a) and  Anonymous (1999a)  show major  shares of
   DALYs attributed to non-physical health outcomes.

This summary of arguments mostly against pure utility maximization leads to the question whether
health metrics are useful at all, whether they should be adjusted accordingly to account for the
mentioned points or whether these points should be considered in other phases of the decision
making process. Most authors, even the ones that are critical about many features of HALYs, agree
that health metrics are important and useful as long as they are not seen as ultimate measures of
quality of life and as long as other criteria are used as well in decision making (Dougherty 1994,
Singer et al. 1995, Holmes 1995, Williams 1996). Contrary to this, Leonard et al. (1986:41)
conclude "it is generally undesirable to include them [distributional considerations] in project
analysis". They feel that this would distort the CBA or CUA.

31 In this case it is even a triple jeopardy: they are already struggling with a disease, they show a higher susceptibility to
drinking water infections and their premature death would be counted less because of their lower quality weight and
shorter life expectancy (USEPA 1998a). Therefore, this subgroup was analyzed separately to allow for tailored risk
management.

Since environmental decision support tools may (risk assessment for regulation) or may not (life
cycle assessment) make protective decisions that are directed towards a specific social or patient
group, the consideration of the mentioned points will be revisited in Section 3. The share of
Norwegian politicians opting for pure utility maximization was, for the social democrats, about
half that of the conservatives (Nord 1999:130). Political orientation thus leads to different
distributional judgments among politicians, and we conclude that a transparent breakdown of total
HALYs or WTP or HYE has to be provided to allow for distributional judgments. Such breakdowns
should be made for severity, realization of potential, groups with pre-existing disabilities, age, and
timing of effect32.

2.15    Beyond disutility: costs of illness  and  averting behavior
We focused so far on the individually borne disutility associated with health outcomes. However,
Table I shows that there are also individually  borne costs due to morbidity and collectively borne
consequences. The individual WTP is supposed to include all individually borne  or private costs
while the social  costs  would include both individually  and collectively borne costs.  In  medical
decision making, the ratio between cost of a specific intervention  (medical and production cost of
illness (COI)33) and the gain  in health due to that intervention is used to identify the most efficient
treatments. However, in environmental decision making investments are made to avoid the cause of
adverse health outcomes. The benefit of these investments is the avoidance of 'cost of illness' due to
treatment and production loss, of 'cost of averting behavior', and 'intangible costs'.

External costs  due to  illnesses caused  by environmental  impacts are sometimes estimated as a
multiple of COI34 (see,  e.g., ESEERCO 1995, ExternE 1995). Table II presents  selected willingness
to pay values to avoid health conditions that  result from air pollution.  The calculated  WTP/COI
ratios span a wide range, suggesting this rule of thumb is not very accurate.
32 Nord (1999) suggests that in addition the following factors are important: number of people affected, size of perceived
loss in quality of life, duration of effect, responsibility of affected person, responsibility of affected person for caring for
others, effect on patient's productivity. He also suggests that sex, race, education and income should not be used as
criteria.
33 See Gold et al. (1996) and Weinstein et al. (1997) for guidance on which cost factors are included in the numerator and
denominator.
34 Sources for COI in the USA can be found in USEPA 1998b, USDL/BLS 1999, Leigh et al. 1997, Hoffman et al. 1996,
Elixhauser et al. 1999.

Tab. I:  Overview on the costs of morbidity (adapted from Seethaler 1999). Dark shaded indicates 'included in health metrics', light
       shaded indicates 'market prices are available' and no shading indicates 'usually neglected'.

Cost category                  Collectively borne                           Individually borne
Cost of illness (medical)      Treatment cost (health care,                 Treatment cost (health insurance,
                               infrastructure, medication etc.)             medication etc.)
Cost of illness (production)   Loss of production (GDP)                     Loss of production (household income)
Cost of averting behavior      Averting expenditures (noise protection      Averting expenditures (water and air
                               walls, water treatment plants etc.)          filters in private homes, no (cheap)
                                                                            outdoor sport during high ozone
                                                                            periods etc.)
Intangible costs               Disutility associated with health outcome    Disutility associated with health outcome
                               (effects on family, friends etc.)

If one wants to include collectively borne COI in the WTP estimates one could assume that about
half of the  medical  costs for hospital admissions  are  borne  collectively  and add them to the
individual WTP which would  increase these values by  3330 and 4080 EUR  for respiratory and
cardiovascular hospital admissions respectively. This large increase is not found for other conditions
where only  minor increases can be calculated.  Therefore, depending  on the  study's  goals35 and
endpoints,  each  cell  in  Table I  may  be  included  in the  calculation  of  health benefits for
environmental decision making.

Tab. II:  Values for willingness to pay (WTP) and cost of illness (COI) for five health conditions caused by air pollution (Seethaler
       1999).

Health condition                          WTP (1996 EUR)        COI (1996 EUR)        Ratio WTP/COI
Respiratory Hospital Admissions           7870 per admission    7910 per admission    1
Cardiovascular Hospital Admission         7870 per admission    9700 per admission    0.8
Chronic Bronchitis (adults >25 years)     209'000 per case      3300 per case         63
Bronchitis (children, <15 years)          131 per case          33 per case           4
Asthmatics: Asthma Attacks (person day)   31 per attack         0.55 per day          56
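
The ratios in Table II can be recomputed directly from the WTP and COI columns. The short Python sketch below does this; the variable names are illustrative, and the spread between the largest and the smallest ratio is an added summary figure that is not reported in the source.

```python
# WTP and COI in 1996 EUR, as listed in Table II (Seethaler 1999).
table_ii = [
    ("Respiratory hospital admission", 7870.0, 7910.0),
    ("Cardiovascular hospital admission", 7870.0, 9700.0),
    ("Chronic bronchitis (adults >25 years)", 209000.0, 3300.0),
    ("Bronchitis (children, <15 years)", 131.0, 33.0),
    ("Asthma attack (person day)", 31.0, 0.55),
]

# Ratio WTP/COI per health condition.
ratios = {name: wtp / coi for name, wtp, coi in table_ii}

for name, ratio in ratios.items():
    print(f"{name:<40s}{ratio:6.1f}")

# How far apart the ratios are: a single multiplier on COI cannot fit all rows.
spread = max(ratios.values()) / min(ratios.values())
print(f"largest ratio / smallest ratio: {spread:.0f}")
```

The ratios range from below 1 to above 60, which is the reason a fixed multiple of COI is a poor proxy for WTP here.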
2.16  What is not measured by health metrics?

Following the arguments of the previous sections, we can summarize that

1.  Health metrics generally follow the paradigm of utility maximization and incorporate only
    one out of many possible concepts of distributional and ethical justice.

2.  None of the major health metrics covers all of the cells presented in Table I. While WTP
    attempts to cover all individually borne costs, it usually neglects collectively borne costs
    altogether36. HALYs and HYE are concerned with the individually borne intangible costs. The
    DALY, based on the use of the PTO method, is the only one that may include some aspect of
    intangible costs that are borne by society.

35 According to ISO (1997), Life Cycle Assessment includes effects on human health, ecosystems and natural resources.
Therefore, only intangible costs would be included directly while environmental impacts due to treatment, production
loss and averting behavior would be separately considered if relevant.
36 However, such collectively borne costs could be listed when WTP values are derived and, e.g., intangible costs may
well be included when the elicitation instruments make clear that affected family members and friends shall be
considered as well. Free-rider-problems may be expected with other collectively borne costs.

3.  Quality of life probably has a broader meaning than is actually reflected in health metrics.

The following comments can be made with regard to the importance of these issues and how to deal
with them:

ad 1:  Part of the problem arises from the general difficulty of aggregating individual
preferences into a social welfare function. Whether altruistic or individual preferences are more
important is a question of paradigm and not a unique problem encountered only here. As suggested
earlier, providing a disaggregation of the damage score measured with a specific health metric will
ensure that distributional considerations can be taken into account in decision making.

ad 2:  This finding suggests that before a human health metric is chosen, the decision makers have
to specify which cells of Table I shall be covered. In many cases this would mean that HALYs
and HYE have to be complemented by information on COI and costs of averting behavior, while
WTP estimates may need to be complemented by collectively borne costs. Surprisingly few
research results are available on intangible costs borne by the patients' family and friends and by
people providing health care to the patient. This may lead to a systematic underestimation of health
damages.

ad 3:  A comfortable life, equality, an exciting life, happiness,  health,  individual  freedom, mature
love, pleasure, salvation,  security, self-preservation, self-respect, a sense of accomplishment, a sense
of community, social recognition, true friendship, wisdom, a world of beauty, a world at peace, inner
harmony are all  values that have been suggested to be important human values that contribute to a
high quality of life (Rokeach  (1973) and Kristiansen  (1985)). Although some of them are not, most
are related directly or indirectly to health conditions.  Their inclusion or exclusion may depend on the
information37 provided in the elicitation procedure.

37 Information is understood in its broad sense including warm-up sessions, focus groups or introducing these values as
explicit attributes.

2.17   Practical aspects
The availability of consistently derived quality weights for a large number of health states may be
considered as a practical advantage, especially if the decision support is needed within a short time
or with little resources. The following are sources for such tables known to us (see also Section 3 for
additional references to sources for environmental related diseases):

•  QALY weights (holistic and decomposed): Quality weights are published from the Beaver Dam
   study for 28 health conditions (Fryback et al. 1993), from the US health census for 10 health
   states (Cutler et al. 1998), from a comprehensive review of CUA studies including almost 1000
   quality weights measured by different instruments and different judge groups (Bell et al. 1999).

•   QALY recipe (explicitly decomposed): While some of the weights above are also based on
    decomposed approaches, Kaplan et al. (1988), Rosser et al. (1972), Patrick et al. (1993), Fryback
    et al. (1997) and Torrance et al. (1972/1986) provide overviews of decomposed approaches and
    their aggregation rules, and suggest weights to be used.

•   HYE: No compilation is known to us.

•   DALY weights:  Several hundred  consistent disability weights  are reported in  Murray et  al.
    (1996a)  and recommended for a worldwide application.  For 56 diagnostic groups separating
    more than  100 different disease stages disability weights for The Netherlands have been derived
    (Stouthard  et al.  1997/2000). Environmental   disease related  disability  weights have  been
    provided by de Hollander et al. (1999) based on Stouthard et al. (1997). Anonymous (1999a/b)
    build on Murray et al.  (1996a) and  Stouthard et al. (1997) and add some  additional disability
    weights (by interpolation) for the specific Australian context.

•   WTP: An  overview on morbidity costs for acute and chronic symptoms,  value estimates for
    dysfunctions and a list of cause dependent VSL  is provided by Tolley et al. (1994). Most sources
    are  old,  derived in  different  contexts  and  with different elicitation methods.  Environmental
    disease related WTP estimates have recently been published or re-compiled by Magat et al. 1996,
    Alberini et al.  1997, Johnson et al. 1998/2000,  Blumenschein et al. 1999, Seethaler 1999,
    ExternE 1999, USEPA 1999a.

This incomplete compilation suggests that there are two reasonably large and consistent published
data sets of disability weights (world-wide and Dutch) and that the explicitly decomposed systems to
calculate QALYs can be seen as another source of consistent information38 for different regions
(mostly for North America (QWB, HUI, Rosser-Index) and Europe (EuroQoL)).

The application of health metrics also requires knowledge of the age distribution of affected
individuals and the duration of diseases. For this purpose, information on incidence rates, prevalence
and additional disease-specific knowledge often has to be combined. Methodologies and simple
software tools have been developed for this matching process (Murray et al. 1996a, Anonymous
1999a and Hoogenveen et al. 2000).
38 Consistent refers to the internal consistency of the data set. However, the scales' cardinal property can often be
disputed and the health conditions to be valued need also to be consistently characterized by the quality of life scoring
instrument.

2.18  Authorization of health metrics
Decision makers may prefer to rely on health metrics that have been authorized as a standard or
state-of-the-art approach. The global burden of disease study and its disability weights, performed on
behalf of the World Health Organization and the World Bank, is probably the most authorized source
for a health metric39.

On a national level Gold et al.  (1996) tried to set a standard for the USA by making a number of
recommendations that narrow down the number of alternatives to HALY-type of approaches. They
also favor TTO as the  elicitation method and recommend using a social discount rate. Since EPA
performs benefit-cost analysis (BCA) rather than CUA, they use WTP. Such governmental use of an
approach to support policy making can also be seen as an  attempt to authorize a method. The same
may hold for the Dutch burden of disease study (Melse et al. submitted).
39 This is also reflected by the many attempts to criticize the approach. Most critiques focus on points that can be
criticized with almost all health metrics. More specific points have been the version of PTO used, age-weighting,
and the use of one standard life table for all countries (AbouZahr et al. 2000, Anand et al. 1997, Arnesen et al.
1999, Barendregt et al. 1996, Elbasha 2000, Hanson 1999, Mansourian 1996, Sayers et al. 1997, Williams 1999/2000).
However, the fact that Murray and Lopez replied to many critical articles (Murray et al. 1996b/1997/2000) may also be
an indication that the approach is still within the research sphere.

3. Comparison of DALYs,  QALYs and WTP based on an example

Table III presents a comparison of the three most widely used summary health metrics (DALYs,
QALYs and WTP) applying them to health effects  due to  environmental risk factors. From this
comparison, we expect further insights into the practical relevance of some of the theoretical aspects
discussed in Section 2. The health effects have been assessed for The Netherlands within the Fourth
National Environmental Outlook 1997-2020 and have been directly taken from de Hollander et al.
(1999).  For pragmatic reasons we excluded risk factors that are not strictly caused by (external)
environmental pollution  like accidents, environmental tobacco smoke or damp houses  and also
exclude a large number of carcinogens that contribute only little to the total health effects and add
little insight for the comparison. The remaining five risk factors1 are therefore neither a complete set
of all environmental health effects in the Netherlands nor necessarily the most important ones. For
mortality and acute morbidity, incidence data with additional estimates for the duration of diseases
have been used. The life years lost to premature death are estimated based on Dutch life tables that
are very similar to the standard table used by Murray et al. (1996a). For chronic morbidity,
prevalence data have been used (see columns 3 and 4 in Table III).

We  provide  here only  best estimates without additional  information  on  the  uncertainty and
variability.  However,  many of the used sources  like  de Hollander et al. (1999), Bell et al. (1999),
Tolley et al.  (1994)  and USEPA (1999a) provide additional  information  that would allow a
probabilistic analysis.

All three health metrics could be used with or without time discounting. Here, we analyze health
effects in the same year and discounting should therefore not alter the presented results. However,
there are two exceptions: (1) the neurocognitive effect of lead is the only chronic morbidity that has
been analyzed on an incidence basis and (2) for mortality the incidence rate was used
combined with estimates of years of life lost. We assume here that the prevalence for these
effects would roughly be the incidence cases multiplied by the assumed duration, i.e., that these
incidence rates have been constant over the last decades. Under this assumption, all
health effects happen in the same year and time discounting becomes a non-issue2.
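
The steady-state assumption amounts to a single multiplication. A minimal Python sketch, using the lead row of Table III (1764 incident cases per year, 70 years duration, disability weight 0.06); the function name is illustrative and not part of the original study:

```python
def steady_state_prevalence(incident_cases_per_year, duration_years):
    """Prevalent cases if incidence has been constant for at least `duration_years`."""
    return incident_cases_per_year * duration_years


# Neurocognitive effects of lead, inputs as listed in Table III.
lead_prevalent_cases = steady_state_prevalence(1764, 70)

# DALYs lived with the condition in one year = prevalent cases x disability weight.
lead_dalys = lead_prevalent_cases * 0.06

print(lead_prevalent_cases, round(lead_dalys))
```

If incidence had been rising or falling over the last decades, this product would over- or underestimate the true prevalence, and discounting would matter again.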

The disability weights for the calculation of DALYs (columns 5-7) have been taken from de
Hollander et al. (1999), where '0' stands for perfect health and '1' for death. They based their weights
on Stouthard et al. (1997), Murray et al. (1996a) and their own panel of environment-oriented
physicians, adjusting for the health consequences typical for environmental exposure. The resulting
numbers in column 6 are slightly different from the numbers in de Hollander et al. (1999) due to
rounding errors. Age weighting, as suggested by Murray et al. (1996a) for one version of DALYs,
has not been applied in de Hollander et al. (1999).
1 Long-term effects from particles smaller than 10 µm (PM10), short-term effects from increased tropospheric ozone levels,
impacts due to lead from drinking water pipes, traffic related noise, and health effects due to increased UV-A and UV-B
exposure caused by ozone-layer degradation.
2 Note that the time-tradeoff method (TTO) has an inbuilt time discounting that, in principle,
would need to be corrected for (Johannesson et al. 1994).

QALYs are calculated in columns 8-11 using quality weights from different sources and sometimes
using the same weight as provided by de Hollander et al. (1999) (perfect health '1', death '0'). The
quality weights are not consistent: different elicitation techniques and groups of judges have been
used and in some cases rough approximations had to be made. The most relevant assumption
concerns noise effects. The effective health state of 'severe annoyance' has been approximated by
'anxiety' and 'sleep disturbance' by 'sleep disorders'. These are obviously
different severity levels but are the only quality weights available in the literature. Since we evaluate
the decrease in health due to environmental risk factors, the decrease in QALYs has been calculated
(ΔQALYs). The values taken from Fryback et al. (1993) have been adjusted for co-morbidity. It is
assumed either that the other values have been adjusted as well, or that the effect under study is the
major health condition, or that the difference is minor. However, we did not account for the
decreased utility of life years lost at higher ages due to co-morbidities. To do so one would need
information on the age distribution of premature deaths, which was not provided in de Hollander et
al. (1999).
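
The DALY and ΔQALY columns of Table III appear to follow from cases × duration × weight. The Python sketch below is a reading of the table, not code from the study; it recomputes three rows with inputs copied from Table III, and the function and variable names are illustrative.

```python
def dalys(cases, duration_years, disability_weight):
    """DALYs: disability weight 0 = perfect health, 1 = death."""
    return cases * duration_years * disability_weight


def qaly_loss(cases, duration_years, quality_weight):
    """Decrease in QALYs: quality weight 1 = perfect health, 0 = death."""
    return cases * duration_years * (1.0 - quality_weight)


# (health effect, cases per year, duration [a], disability weight, quality weight)
rows = [
    ("PM10: chronic bronchitis, adults", 4085, 1.0, 0.31, 0.86),
    ("O3: hospital admission, respiratory", 4490, 0.038, 0.64, 0.56),
    ("Noise: severe annoyance", 1767000, 1.0, 0.01, 0.91),
]

results = {
    name: (dalys(n, t, dw), qaly_loss(n, t, qw))
    for name, n, t, dw, qw in rows
}

for name, (d, q) in results.items():
    print(f"{name:<38s} DALYs {d:10.0f}   QALY loss {q:10.0f}")
```

The noise row shows how strongly the result depends on the weight: a disability weight of 0.01 and a quality weight of 0.91 differ by a factor of nine in the implied health loss per case.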

The WTP values are effectively a mixture of WTP values (based on contingent valuation and/or
labor market studies, and hedonic price methods for noise) and COI or an estimate based on COI.
This inconsistency is slightly reduced by relying heavily on one compilation of values (USEPA
1999b). All values have also been transformed to 1990 USD. Since USEPA (1999b) uses the VSL
approach without adjustment for age in the baseline scenario, this assumption has been adopted.
More sophisticated approaches use age-adjusted VSL values (Seethaler 1999).

Tab. III:  Health consequences for five environmental risk factors evaluated by three different health metrics (only best estimates are
          shown, number of given digits does not suggest that these are significant digits)

Risk factor / health effect                      Cases  Duration  Disability   DALYs    DALYs  Quality Ref     QALY     QALY    WTP or COI Ref      WTP or COI  WTP/COI
                                              per year       [a]  weight (a)     (a)      [%]   weight                   [%]  [1990$/case]         [Mio 1990$]      [%]
PM10
  mortality total                                 7114      10.9           1   77543   40.94%        0        77543   19.28%       4800000 b,c           34147   35.99%
  mortality cardiopulmonary                       8041       8.2           1   65936   34.81%        0        65936   16.40%       4800000 b,c           38597   40.68%
  mortality lung cancer                            439        13           1    5707    3.01%        0         5707    1.42%       4800000 b,c            2107    2.22%
  chronic respiratory symptoms, children         10138         1        0.17    1723    0.91%     0.86 d       1419    0.35%         28946 b,e             293    0.31%
  chronic bronchitis, adults                      4085         1        0.31    1266    0.67%     0.86 d        572    0.14%         28946 b,e,f           118    0.12%
  Total                                                                       152176   80.35%                151177   37.59%                             75263   79.32%
O3
  mortality respiratory                            198      0.25     0.7 (r)      35    0.02%        0           50    0.01%       4800000 b,c             950    1.00%
  mortality coronary heart disease                1946      0.25     0.7 (r)     341    0.18%        0          487    0.12%       4800000 b,c            9341    9.84%
  mortality pneumonia                              751      0.25     0.7 (r)     131    0.07%        0          188    0.05%       4800000 b,c            3605    3.80%
  mortality other                                  945      0.25     0.7 (r)     165    0.09%        0          236    0.06%       4800000 b,c            4536    4.78%
  hospital admission, Respiratory                 4490     0.038        0.64     109    0.06%     0.56 g         75    0.02%          6000 b,h,i            27    0.03%
  ERV, Respiratory                               30840     0.033        0.51     519    0.27%     0.49 j        519    0.13%           194 b,k              72    0.08%
  Total                                                                         1300    0.69%                  1554    0.39%                             18531   19.53%
Lead (*)
  Neurocognitive development (1-3 IQ-points)      1764        70        0.06    7409    3.91%     0.94 j       7409    1.84%         10005 k                18    0.02%
Noise
  Psychosocial effects: severe annoyance       1767000         1        0.01   17670    9.33%     0.91 d,l   159030   39.54%           265 m,n             468    0.49%
  Psychosocial effects: sleep disturbance      1030000         1        0.01   10300    5.44%     0.92 d,o    82400   20.49%           265 m,n             273    0.29%
  Hospital admissions IHD                         3830     0.038        0.35      51    0.03%     0.56 g         64    0.02%          9000 b,h,i            34    0.04%
  Mortality IHD                                     40      0.25     0.7 (r)       7    0.00%        0           10    0.00%       4800000 b,c             192    0.20%
  Total                                                                        28028   14.80%                241504   60.05%                               968    1.02%
Ozone depletion
  Melanoma morbidity                                24       6.9         0.1      17    0.01%      0.7 p         50    0.01%          8218 q                 0    0.00%
  Melanoma mortality                                 7        23           1     161    0.09%        0          161    0.04%       4800000 b,c              34    0.04%
  Basal                                           2150      0.21       0.053      24    0.01%    0.947 j         24    0.01%          4696 q                10    0.01%
  Squamous                                         340       1.5       0.027      14    0.01%    0.973 j         14    0.00%          8218 q                 3    0.00%
  other mortality                                   13      20.2           1     263    0.14%        0          263    0.07%       4800000 b,c              62    0.07%
  Total                                                                          478    0.25%                   511    0.13%                               109    0.11%
Total (all risk factors)                                                      189390     100%                402155     100%                             94888     100%
  mortality                                                                            79.35%                         37.44%                                     98.61%
  morbidity                                                                            20.65%                         62.56%                                      1.39%
(*) from drinking water pipes
a) de Hollander et al. 1999
b) based on USEPA (1999) in 1990$. Most values are based on incidence cases and refer to health effects due to air pollution.
c) This central estimate is slightly higher than the values in Tolley et al. 1994 but derived based on a large body of literature reviewed in USEPA
(1999a). However, Seethaler (1999) argues that the underlying studies have been biased and uses values derived by a chained approach (Carthy et al.
1999) for road accident victims and adjusts those values for the higher age of air pollution victims, ending up with 0.9 million EUR (1996). This value is
about 5 times lower than the value suggested by USEPA (1999a).
d) Fryback et al. 1993 (TTO, general public >45a). Since the data allow correcting for co-morbidity, we subtract the mean for persons affected by the
condition from the mean for the persons unaffected by the condition.
e) USEPA (1999b) bases their values on incidence, therefore the value of Tolley et al. (1994) is taken for a yearly value and adjusted from 1991$ to
1990$ by multiplying by 0.965.
f) Viscusi et al. 1991 derived a total value of $516000-904000 (adjusted to 1990$, based on two different elicitation methods); considering discounting
of future years, this equals an assumed duration of CB of >20a, which confirms the order of magnitude. Krupnick et al. 1992 (see Viscusi 1993)
estimate a media value of $496800-$691200 in 1990$, again the same range.
g) Sackett et al. 1978, value based on 3 months of 'Hospital confinement for an unnamed contagious disease'. Based on data one expects higher quality
for shorter admission time and known rather than unknown cause. (TTO, general public)
h) Derived by dividing the total mean welfare benefits in Table H4 by the change in incidence of cases in Table D-21 which results in costs per
admission (not per day) (USEPA 1999b).
i) These values are comparable to what is used in other studies (Seethaler et al. 1999) but inconsistent with findings of Johnson et al. (2000) who find
much lower values below 1000$. They use conjoint analysis and different duration periods. One day alone would account for 535 1997 Can$.
Multiplied by typical durations of 11 to 14 days would result in slightly lower values than reported by USEPA (1999a). However, the COI given in
Seethaler et al. (1999) are alone about the same amount as the WTP reported.
j) No appropriate quality weights have been found in the literature, therefore the disability weight from de Hollander et al. 1999 has been used here.
k) Levin (1997) estimates the damage due to a decrease of one IQ point to be a loss in future earnings of 1.76% or $4600 (1988). We double this value
for an average loss of two IQ points and adjust for 1990$ with a factor of 1.0785.
k) The given value is the COI for one ERV. However, de Hollander et al. (1999) describe the health state as a weighted average of duration of
exacerbations requiring ERV or hospital admission. Therefore, we multiply the COI by the given duration.
l) Assumption that 'annoyance' can be described with 'anxiety', which is obviously a different severity level.
m) Banfi et al. (2000) estimate the traffic related WTP to avoid disturbance by noise for the Netherlands using both hedonic price methods and
contingent valuation and assuming a threshold of WTP at 55 dB(A). This results in 1087 million ECU(1995) per year (=740 Mio US$ 1990).
n) The total WTP to avoid disturbance from traffic noise is allocated to severe annoyance and sleeping disturbance assuming that these cases have an
equal severity (as suggested by QALY and DALY). This results in (740 Mio US$/2.797 Mio cases = 265 US$ per case and year).
o) Assumption that 'sleep disturbance' matches 'sleep disorder', which is obviously a different severity level.
p) Bell et al.  1999 cite 216 (author judgments) and 249 (clinician judgment). Metastatic conditions and recurrent melanoma get both an average of 0.5,
treatment causes quality weights of 0.7-0.8 and remission after surgery 0.9. An average weight of 0.7 is assumed.
q) Dickie et al. 1996 find WTP to avoid skin cancer cases in the range of $720-1200. However, they cite an EPA study that reports medical treatment
costs for basal and squamous cell carcinomas of $4000 and $7000 respectively. We adjust these values for 1990$ and take the higher value as well
for the melanoma. All costs are per case.
r) This disability weight applies to a period of disease before death plus the period of the premature death.
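
The arithmetic behind footnotes k) (loss of two IQ points) and n) (noise WTP per case) can be retraced with a short sketch. All input values are quoted from the footnotes; the function and parameter names are ours and purely illustrative.

```python
# Worked arithmetic for the IQ-loss and noise-WTP footnotes.
# All numbers are quoted from the footnotes; the names are illustrative.

def iq_loss_value_1990(value_per_iq_point_1988=4600.0, iq_points=2, factor_to_1990=1.0785):
    """Loss in future earnings for a given IQ decrement, expressed in 1990 US$."""
    return value_per_iq_point_1988 * iq_points * factor_to_1990


def wtp_per_case(total_wtp_million_usd=740.0, cases_million=2.797):
    """WTP per case and year (US$) when the total WTP is spread equally over all cases."""
    return total_wtp_million_usd / cases_million


if __name__ == "__main__":
    print(f"Loss of two IQ points: {iq_loss_value_1990():.0f} US$ (1990)")
    print(f"Noise WTP per case and year: {wtp_per_case():.0f} US$")
```

The equal split in `wtp_per_case` mirrors the assumption of footnote n) that severe annoyance and sleep disturbance cases have equal severity.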


Based on these assumptions it was possible to calculate the total DALYs, QALYs and cost
consequences due to the five risk factors and to compare their relative shares between the health
metrics.



The following insights are important:



•   The resulting DALYs and loss of QALYs can be compared to about 15 million years of life lived
    per year in The Netherlands. Therefore, the relative share of the burden of disease for these five
    environmental risk factors together, compared with the total years of life lived, lies between 1.3%
    (DALYs) and 2.7% (QALYs). The health risk costs of 95 billion USD (almost completely
    intangible costs) amount to about 30% of the Dutch GDP in 1990! The magnitude of this amount
    suggests either that major budget adjustments are warranted, or that the value of a statistical life is
    lower in this application, or that the estimate of particle-related health effects is too high.



•   The share of (premature) mortality in the total health burden varies from 37% for QALYs and 79%
    for DALYs to 98.6% for WTP/COI. The difference between QALYs and DALYs may be biased
    by our assumptions on the quality weights for noise, and the DALYs value may be the better
    estimate. Therefore, we can conclude that all health metrics are heavily influenced by mortality
    outcomes but that in this application WTP/COI seems to make a morbidity assessment
    unnecessary (last column of Table IV).



•   The assessment of the relative importance of noise is very different between the three metrics
    (DALYs 15%, QALYs 60%, WTP 1%). We already mentioned that the quality weight for
    QALYs was based on a crude assumption. A separate study to elicit such values or the use of an
    explicitly decomposed instrument would be needed to improve this estimate. The disability
    weights for sleep disturbance and severe annoyance derived by de Hollander et al. (1999) are
    0.01. Müller-Wenk (1999) derived disability weights for the same endpoints using a small
   convenience panel of six physicians. The mean weight was 0.048 for communication
   disturbances and 0.05 for sleep disturbance, and a larger, more representative study has
   been planned. The example of noise shows how the relative importance of a mild morbidity
   outcome is very sensitive to the quality weights and metric used. This special situation usually
   does not occur in medical decision making. The reasons for the high sensitivity are, first, the large
   number of affected people and, second, the large relative impact of uncertainties in small
   changes of the quality weights. In Sections 2.8 and 2.9 it was mentioned that most methods work
   worse for outcomes of low severity, since people are reluctant to trade premature death for mild
   disabilities at all and since the trade-off numbers get either very large (PTO) or very small (TTO,
   SG) or go beyond the possibilities of graphical methods (VAS). This will be further discussed
   below.

•  The increased mortality rate due to increased ozone levels is considered to affect old or already
   sick people.  This fact  is reflected in the DALYs and QALYs calculations and leads to minor
   health damages. However, if VSL is used without age-adjustment, increased  ozone levels are
   (probably wrongly) judged very relevant.

•  Increased UV-A and UV-B radiation is so far not a problem in The Netherlands; only a few cases
   occur and the mortality rate is very low. Uncertainty in the morbidity weights and costs hardly
   influences the outcome. The same holds true for most morbidity outcomes (but not for noise and
   neurocognitive effects), where uncertainty in the morbidity weights or costs hardly matters.

•  While the rank order is stable between DALYs and QALYs (only noise gets a different ranking,
   which may be an artifact), the WTP metric suggests that increased ozone levels should get high
   attention while lead exposure from drinking water pipes is a very minor problem (see Table IV).
   This rank order reversal is due to the dominance of mortality rates in the WTP approach.

Tab. IV: Rank order of the five environmental risk factors if evaluated by different health metrics

Risk factor                                                DALYs   ΔQALYs   WTP/COI   Mortality
Long term effects of PM10                                    1        2        1          1
Increased tropospheric ozone concentrations                  4        4        2          2
Lead from drinking water pipes                               3        3        5          5
Noise                                                        2        1        3          3
Increased UV levels due to stratospheric ozone depletion     5        5        4          4

•  The ranking of risk factors and the discussion above were based on the utility-maximizing
   paradigm. However, these health damages are not equally distributed among the population.
   Major health damages due to exposure to fine particles and ozone occur at  higher ages or in
   already  sick  people, lead poisoning  affects a  small number  of children with  life-long
   consequences, noise  affects those who cannot afford a living/working place free from traffic
   noise, and ozone depletion affects the group of people with fair skin or extensive exposure to the
   sun (sun-bathing, construction workers, farmers, etc.). Will this additional information on the
   affected population alter the ranking? Let us reconsider some of the arguments summarized  in
   Section 2.14:

      -  Improve situation for worst-off and support survival. This would suggest that the mortality
        rate  should be reduced and would support the ranking derived by WTP.

      -  Support high  realization-potential group. The  largest realization-potential can be found
        among  health risks causing premature death with many years of life lost like the cancer
        cases due to ozone depletion and  mortality by long-term  effects  of particulate exposure.
        This may give a higher priority to prevent ozone depletion than suggested by Table IV.

      -  Improve situation for young and innocent. Here we assume  that all subjects are  equally
        innocent since the considered  environmental  risk factors are only loosely attributed  to
        lifestyle factors (maybe with the exception of sun-bathing). All risk factors affect children
        and  young adults  to some extent. However,  neurocognitive  effects from lead poisoning
        may be considered as typical risk factors affecting children and should get a higher priority
        than suggested by the WTP metric.

      -  Allow for fair-innings. This criterion would need a reanalysis  of the data with a threshold-
        age  of 70 to 75 years beyond which  health loss would not be considered. Health damages
        due  to particulate and ozone exposure would drop dramatically in such an analysis. Other
        health risks would probably be less affected.

      -  Income should not matter. Since we assumed that impacts are distributed uniformly across the
        population, we assumed the distributional concerns "away". However, since the WTP
        values for noise have mostly been derived from hedonic price methods, we have an estimate
        of how much noise one socio-economic group (home-owners) is ready to trade for money,
        but the same information is not available for the other income groups.

      -  Correct for double jeopardy. None of the considered environmental risk factors is supposed
        to affect physically handicapped individuals more than non-handicapped individuals. However,
        respiratory symptoms and premature death due to particulate and ozone exposure are known
        to affect already sick people to a larger extent.

      -  Consider overlooked dimensions.  It is  not obvious that important  characteristics of the
        included health endpoints are overlooked by the used health metrics.

   These different distributional concerns point partly in different directions but may suggest that
   lead poisoning and ozone depletion deserve slightly more importance than suggested by all
   health metrics. We suggest here that similar discussions of results should be offered to the decision
   maker.  A more formalized  procedure would calculate the relative share of the health metric
   distribution among the different disadvantaged groups.

•  The data needs for quality weights (QALY) and WTP values could not be fully met by the
   literature, and the compiled data are inconsistent. The data basis for environmental health is
   presently probably best for DALYs.
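
The degree of agreement between the rank orders in Table IV can be summarized with a rank correlation. The following is a minimal sketch using Spearman's coefficient; the choice of this coefficient is ours and not part of the case study, while the ranks themselves are taken from Table IV.

```python
from itertools import combinations

# Ranks from Table IV, in the order:
# PM10, tropospheric ozone, lead, noise, UV (stratospheric ozone depletion).
RANKS = {
    "DALYs":     [1, 4, 3, 2, 5],
    "QALYs":     [2, 4, 3, 1, 5],
    "WTP/COI":   [1, 2, 5, 3, 4],
    "Mortality": [1, 2, 5, 3, 4],
}


def spearman_rho(a, b):
    """Spearman rank correlation for two tie-free rankings of equal length."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    for m1, m2 in combinations(RANKS, 2):
        rho = spearman_rho(RANKS[m1], RANKS[m2])
        print(f"{m1:>9} vs {m2:<9}: rho = {rho:+.2f}")
```

With only five risk factors such coefficients are descriptive at best, but they make the contrast visible: the two HALY-type rankings differ only in the position of noise, whereas the WTP/COI ranking coincides with the mortality ranking.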

In addition to the insights summarized above there are a few points worth mentioning that are
potentially important but did not show up in our example:

•  Time discounting was excluded by design.

•  Age weighting is often  applied in DALYs as a correction function of the statistically expected
   years of life  lost (Murray et al. 1996a). However, as discussed in Section 2.10, their proposal
   reflects neither empirical findings nor theoretical models. Age-dependent values or utilities  of
   life to be used in a prescriptive or even  normative setting may need to be based on a societal
   consensus.  It may well follow the ethical principles that either each year of life lost is of equal
   value or that each (remaining) life is of equal value.

•  The remaining statistical life expectancy at the time of death is chosen to be the same for DALYs
   and QALYs in our example. However, DALYs as suggested by Murray et al. (1996a) have been
   developed for international applications with the aim of attributing all health losses to diseases. To
   do so, they needed to state a number of equity assumptions that resulted in a life expectancy
   function that depends only on sex and age. This attribution mode differs from the change
   mode that is of interest in most environmental applications (e.g., reduction of health damages
   thanks to the Clean Air Act or net health benefits of improved drinking water treatment), where we
   are often interested in changes of risks. However, this is not an inherent limitation of DALYs but
   rather a matter of assumptions.

•  The QALY framework suggests not only controlling for co-morbidity when quality weights are
   developed for specific diseases but also considering age-specific co-morbidity of the general
   population when the years of life lost due to premature death are calculated. Due to the lack of
   access to the age-profiles of premature deaths, we did not correct for them. However, data in
   Fryback et al. (1993) suggest that a woman's year lost at the age of 65-74, 75-84 and 85+
   should be counted only as 0.83, 0.79 and 0.8, respectively (see footnote 3). This is probably the
   appropriate way to deal with the question of marginal changes addressed in the previous point and
   suggests that the number of QALYs due to mortality has been overestimated in our example.

•  Both the DALYs and the QALYs include only the individually borne intangible costs. At a minimum,
   collectively and individually borne costs of illness should be added in a comprehensive
   assessment. WTP based on individually borne costs may be complemented by information on
   collectively borne costs.
3 Measured by TTO. Men's values are 0.84, 0.84 and 0.82 respectively.
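
The co-morbidity adjustment of years of life lost mentioned above can be sketched as follows. The age-band weights are those quoted from Fryback et al. (1993) in the text and in footnote 3. The weight of 1.0 below age 65, the restriction to whole years, and the example case are assumptions made only for this illustration.

```python
# Age-specific quality weights of the general population (Fryback et al. 1993,
# measured by TTO), as quoted in the text and its footnote.
# Each tuple is (lower age inclusive, upper age exclusive, weight).
WEIGHTS = {
    "women": [(65, 75, 0.83), (75, 85, 0.79), (85, float("inf"), 0.80)],
    "men":   [(65, 75, 0.84), (75, 85, 0.84), (85, float("inf"), 0.82)],
}


def adjusted_years_lost(age_at_death, years_lost, sex, weight_below_65=1.0):
    """Sum the lost life years, each weighted by the age band in which it would have been lived.

    weight_below_65 is an assumption: the text gives no weights for ages under 65.
    age_at_death and years_lost are whole years in this sketch.
    """
    total = 0.0
    for age in range(age_at_death, age_at_death + years_lost):
        weight = weight_below_65
        for lower, upper, band_weight in WEIGHTS[sex]:
            if lower <= age < upper:
                weight = band_weight
                break
        total += weight
    return total


if __name__ == "__main__":
    # Hypothetical case: death at age 70 with 15 remaining life years.
    for sex in WEIGHTS:
        print(sex, round(adjusted_years_lost(70, 15, sex), 2))
```

In this hypothetical case the adjusted loss is clearly below the 15 unadjusted years, which is the direction of the overestimation noted in the text.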

-------
The two most stunning results directly derived from our example are the insensitivity of WTP to
morbidity outcomes and the huge effect of uncertainties in the assessment of mild diseases. Both
findings deserve further research:

    •  For the insensitivity of WTP, three main problems need to be resolved: (1) age-dependent
       VSL for environmental risks needs to be further explored and developed for different cultural
       and economic settings; (2) the valuation of acute and chronic morbidity outcomes due to
       environmental risks needs to be further explored; and (3) the often observed insensitivity of
       WTP to the magnitude of risk reduction (Hammitt et al. 1999) needs to be addressed. Promising
       developments that use chained approaches (Carthy et al. 1999, Viscusi et al. 1991) or
       attribute-based stated choice analyses (Johnson et al. 1998) might ease the dollar-risk trade-offs.

    •  For the disability and quality weights for mild illnesses, we need to address the findings that
       people are not ready to trade life for them and that some elicitation methods compel
       respondents to use very low probability numbers for mild illnesses, i.e., to quantify something
       at which human beings have been shown to fail. Asking for tradeoffs between different more or
       less mild illnesses may solve both problems, as suggested in Pinto Prades (1997). Further, for
       mild disabilities with long durations like noise or reduced neurocognitive development,
       time-non-proportionality due to adaptation and adjustment may have a decisive influence on the
       outcome.
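
The effect of uncertain weights for mild diseases can be made concrete with the noise example. The sketch below multiplies the number of noise cases from footnote n) by the two disability weights reported in Section 3. Applying one weight to all cases and a duration of one year per case are simplifying assumptions of this illustration, not the procedure of the case study.

```python
# Cases of severe annoyance and sleep disturbance per year (footnote n: 2.797 million).
NOISE_CASES = 2.797e6
# About 15 million years of life are lived per year in The Netherlands (Section 3).
LIFE_YEARS_PER_YEAR = 15e6


def yearly_burden(cases, weight, duration_years=1.0):
    """Health-adjusted life years lost per year: cases x weight x duration."""
    return cases * weight * duration_years


if __name__ == "__main__":
    weights = [
        ("de Hollander et al. (1999)", 0.01),
        ("Müller-Wenk (1999), sleep disturbance", 0.05),
    ]
    for label, weight in weights:
        burden = yearly_burden(NOISE_CASES, weight)
        share = 100 * burden / LIFE_YEARS_PER_YEAR
        print(f"{label}: weight {weight:.2f} -> {burden:,.0f} years "
              f"({share:.2f}% of life years lived)")
```

Because the number of cases is large, a change in the weight from 0.01 to 0.05 scales the estimated burden by a factor of five, although both weights describe a very mild health state.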

-------
4. Characterization of medical applications and environmental tools

The review of the literature in Section 2 revealed a tremendous number of different metrics, and
within these metrics different elicitation methods, judges and assumptions are used. One obvious
reason for this variety is the many different applications within medical decision making and health
economics. Table V attempts to characterize some of the major applications in medical and
environmental decision support (see footnote 1) using the following attributes:

-  Type of diseases: Since it makes a difference for a metric whether only chronic or mostly acute
   health outcomes have to be assessed, we use this attribute for characterization.

-  Need for monetary units: If it is likely that changes in health status will be evaluated in a cost-
   benefit framework, this favors monetization of health impacts.

-  Identifiability of victims and veil of ignorance: These two attributes are correlated and should be
   read  together.  These  attributes  determine  whether  additional characteristics  (disabilities,
   profession, family circumstances etc.)  that may influence the disability weights should/can be
   taken into account and whether a purely individual or societal perspective is more appropriate.

-  Authoritative status: If an assessment needs to be authoritative, then the metric used needs to be
   acceptable not only to single decision makers but to society at large.

-  Affected generations:  If future generations are affected then the debate on  appropriate time
   discounting becomes very relevant. Further, the disability weighting should be done disregarding
   any socialized handicaps that cannot be predicted for future generations.

-  Distributional requirements: Since the discussed health metrics follow the  paradigm of utility
   maximization it is important to see in which applications this maximization may be sufficient and
   when additional distributional/ethical requirements will need to be considered.

As demonstrated in Table V, these attributes differentiate well between the listed applications and
tools, and none of the medical applications fits exactly with one of the environmental tools. The
clinical decision support for a single patient does not fit at all with the environmental tools. Therefore,
health metrics developed for "bedside reasoning" may not be relevant for environmental
applications.
1 Descriptions and characterizations of the environmental tools can be found in Hofstetter et al. (2002).

-------
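Tab. V:  Medical and environmental decision support tools and their different attributes that may be relevant for the selection of
         congruent health metrics.

Medical decision support

   Clinical decision support for single patient
      Type of diseases:              In principle all, but per application only few
      Identifiability of victims:    yes
      Veil of ignorance:             lifted
      Authoritative status:          none

   Technology/product assessment for pharmaceutical companies and health care providers
      Type of diseases:              In principle all, but per application only few
      Need for monetary units?:      sometimes
      Identifiability of victims:    partly
      Veil of ignorance:             mostly lifted
      Authoritative status:          none
      Affected generations:          own
      Distributional requirements:   none

   Tool for resource allocation of health insurance or national health planning plan
      Type of diseases:              all
      Need for monetary units?:      sometimes
      Identifiability of victims:    no
      Veil of ignorance:             applies
      Authoritative status:          national/binding
      Affected generations:          own plus next
      Distributional requirements:   important (age, race, economic status, disabled)

   Global health monitoring and resource allocation (Global Burden of Disease)
      Type of diseases:              all
      Need for monetary units?:      no
      Identifiability of victims:    no
      Veil of ignorance:             applies
      Authoritative status:          international/not binding
      Affected generations:          own plus next
      Distributional requirements:   important (age, race, economic status, disabled)

Environmental decision support tools:

   Micro-tools: Life Cycle Assessment
      Type of diseases:              Many chronic diseases (including episodic)
      Need for monetary units?:      no
      Identifiability of victims:    no
      Veil of ignorance:             applies
      Authoritative status:          usually none
      Affected generations:          >100 years
      Distributional requirements:   Intra- and intergenerational

   Meso-tools: (Comparative) Risk Assessment for Technology Assessments
      Type of diseases:              Few, mostly chronic diseases
      Need for monetary units?:      no
      Identifiability of victims:    no
      Veil of ignorance:             applies
      Authoritative status:          none or limited
      Affected generations:          own, 50a, >100a
      Distributional requirements:   may be relevant, sensitive subgroups

   Macro-tools: (Comparative) Risk Assessment for regulation
      Type of diseases:              Few, mostly chronic diseases
      Need for monetary units?:      sometimes
      Identifiability of victims:    partly
      Veil of ignorance:             mostly lifted
      Authoritative status:          national/binding
      Affected generations:          own
      Distributional requirements:   relevant, sensitive subgroups

   Macro-tools: Cost-Benefit Assessment for regulation
      Type of diseases:              Few, acute and chronic diseases
      Need for monetary units?:      yes
      Identifiability of victims:    partly
      Veil of ignorance:             mostly lifted
      Authoritative status:          national/binding
      Affected generations:          own
      Distributional requirements:   relevant, sensitive subgroups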

-------
5. Consequences for the choice of metrics in different applications

How do the characteristics of the application or tool determine the choice for human health metrics?
Table V illustrated some of the differences between and within the medical applications  and the
environmental tools. How does this affect the choice of the metric, the elicitation method to derive
preferences, the group for preference elicitation, time discounting,  and the type of life tables to be
used? Table VI summarizes our recommendations for the choices to be made according to Section 2
based on the characteristics summarized in Table V. The following arguments were used to come up
with recommendations:

-  Life Tables: The need for appropriate spatial and temporal coverage and the  (im)possibility to
   identify subgroups with non-average mortality risks have been the guiding attributes to determine
   the appropriate life tables.

-  Whose values?: Patients' preferences about their own disease are always important but may
   become impractical when a large number of different health outcomes need to be evaluated. In
   such cases, health care professionals may provide the necessary relative comparison. Depending
   on how socially binding the metric needs to be, an additional representative panel
   may need to be formed (Nord 1999).

-  Time preference: The level of individual versus societal decision making and the importance of
   intergenerational aspects were the guiding principles. The mentioned discount rates are
   illustrative of the range and do not imply that an exponential discount function needs to be
   chosen. It is also assumed that the future increase in the value of HALYs and statistical life is
   considered. The zero discount rate for Life Cycle Assessment is based not only on the very long
   assessment horizon but also on present practice, where increases in future life expectancies are not
   considered.

-  Preferred elicitation method: The main difference here is whether monetary or non-monetary
   values are derived. Further, the time trade-off (TTO) method with an adequate time horizon and
   the person trade-off (PTO) method with application-compatible framing of the question have
   been judged to outperform other methods for the individual and societal application, respectively,
   although the standard gamble often provides a more realistic description of the choice.

-  Level of measurement: The better the social environment of the affected group is known, the
   more these parameters should be included in the elicitation step (handicap level). If a large
   number of different social environments have to be covered or if future environments are
   unknown, then a disability level is preferred.

-  Preferred metrics: Both monetary and non-monetary metrics have flaws for valuation of both
   mortality and morbidity. However, since monetary methods require not only a health/health but
   also a health/wealth tradeoff, they are cognitively more demanding than non-monetary metrics.
   Therefore, we suggest using them only when monetary units are desirable as a measurement
   unit, that is, for decisions where trade-offs between human health and monetary expenditures
   are at stake. "HALYs+" stands for Health Adjusted Life Years with age weighting. We use this notion
   because the column headings of Table VI specify most of the specific features that would differentiate
   between QALYs and DALYs and because the age weighting to be used deviates from the
   standard procedure in the DALYs framework. For environmental applications, we also suggest
   supplementing the HALYs+ with cost of illness. HYE are not considered preferable because
   empirical experience and data are lacking. However, this metric may well be developed for
   environmental applications where the number of relevant health outcomes is limited.

-  Marginal/average and distributional aspects: If we are interested in the analysis of changes due
   to an intervention compared to a reference situation, e.g., the present situation, then we call this a
   marginal analysis (where all other risk factors are kept constant). If the distributional aspects will
   play a major role in the decision making, we suggest calculating the health metric scores for all
   relevant sub-groups and adding a semi-quantitative discussion.
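
To illustrate how strongly the time preference item in the list above can influence a result, the sketch below discounts a stream of lost life years at several rates taken from the illustrative ranges. The exponential (constant-rate) form is an assumption of this sketch only, since the recommendations explicitly do not prescribe it, and the 30 lost life years are a hypothetical input.

```python
def discounted_life_years(years, rate):
    """Present value of `years` life years, one per year, the first one undiscounted.

    Exponential (constant-rate) discounting is an assumption of this sketch.
    """
    return sum(1.0 / (1.0 + rate) ** t for t in range(years))


if __name__ == "__main__":
    years_lost = 30  # hypothetical number of life years lost
    for rate in (0.0, 0.01, 0.05, 0.10):
        present_value = discounted_life_years(years_lost, rate)
        print(f"rate {rate:4.0%}: {present_value:5.1f} discounted life years")
```

At a zero rate, as suggested for Life Cycle Assessment, every lost year counts fully; at the upper end of the market range a large part of the later years is discounted away.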

We are aware that the recommendations in Table VI may be challenged in specific applications on the
basis of arguments that could not be captured on this generic level. We also expect major developments in
the areas of WTP that may alter our assessment within the coming years. Finally, we will list some
strengths and weaknesses of the suggested metrics in the concluding Section 6.

-------
Tab. VI:  Recommendations for the choice of human health metrics and their specific assumptions.

Medical decision support

   Clinical decision support for single patient
      Life table to calculate YLL:       Clinical estimate based on diagnosis
      Whose values:                      Patient
      Time preference (discount rate):   Individual (rates vary from -x% to plus 100%)
      Preferred elicitation method:      TTO, transformed VAS, decomposed
      Level of measurement:              Handicap
      Preferred metrics:                 Non-monetary
      Remarks:                           Marginal analysis

   Technology/product assessment for pharmaceutical companies and health care providers
      Life table to calculate YLL:       Disease group-specific, future-oriented
      Whose values:                      Patients or health care professionals
      Time preference (discount rate):   Market (1-10%)
      Preferred elicitation method:      TTO, CV, revealed preferences, attribute-based stated choice
      Level of measurement:              Combined disability/handicap
      Preferred metrics:                 HALYs+ or WTP
      Remarks:                           Marginal analysis

   Tool for resource allocation of health insurance or national health planning plan
      Life table to calculate YLL:       Regional/national life tables, present or future
      Whose values:                      Patients or combined patients/societal values
      Time preference (discount rate):   Market/societal (1-10%)
      Preferred elicitation method:      PTO
      Level of measurement:              Combined disability/handicap
      Preferred metrics:                 HALYs+
      Remarks:                           Distributional aspects important, mostly marginal analysis

   Global health monitoring and resource allocation (Global Burden of Disease)
      Life table to calculate YLL:       Universal life table for monitoring, future-oriented
                                         regional/national life tables for resource allocation
      Whose values:                      Health care professionals or large sample of combined
                                         patients/societal values
      Time preference (discount rate):   Societal (1-5%)
      Preferred elicitation method:      PTO
      Level of measurement:              Disability
      Preferred metrics:                 HALYs+
      Remarks:                           Average analysis for monitoring, distributional aspects and
                                         marginal analysis important for resource allocation

Environmental decision support tools:

   Micro-tools: Life Cycle Assessment
      Life table to calculate YLL:       Future-oriented regional life tables
      Whose values:                      Health care professionals or large sample of combined
                                         patients/societal values
      Time preference (discount rate):   None (0%)
      Preferred elicitation method:      PTO
      Level of measurement:              Disability
      Preferred metrics:                 HALYs+
      Remarks:                           Marginal analysis

   Meso-tools: (Comparative) Risk Assessment for Technology Assessments
      Life table to calculate YLL:       Group/area-specific (all levels possible)
      Whose values:                      Depends on context
      Time preference (discount rate):   Societal (1-5% or different for long term)
      Preferred elicitation method:      Depends on context
      Level of measurement:              Combined disability/handicap
      Preferred metrics:                 HALYs+ plus COI, WTP plus collectively borne costs
      Remarks:                           Distributional aspects important, marginal analysis

   Macro-tools: (Comparative) Risk Assessment for regulation
      Life table to calculate YLL:       Present/future national life tables
      Whose values:                      Patients or combined patients/societal values
      Time preference (discount rate):   Societal (1-5%)
      Preferred elicitation method:      PTO, CV, revealed preferences, attribute-based stated choice
      Level of measurement:              Combined disability/handicap
      Preferred metrics:                 HALYs+ plus COI, WTP plus collectively borne costs
      Remarks:                           Distributional aspects important

   Macro-tools: Cost-Benefit Assessment for regulation
      Life table to calculate YLL:       Present/future national life tables
      Whose values:                      Patients or combined patients/societal values
      Time preference (discount rate):   Societal (1-5%)
      Preferred elicitation method:      CV, revealed preferences, attribute-based stated choice
      Level of measurement:              Combined disability/handicap
      Preferred metrics:                 WTP plus collectively borne costs
      Remarks:                           Distributional aspects important, marginal analysis

-------
6. Discussion and Conclusions

This report's attempt to transfer insights from medical decision making and health economics into
environmental decision support tools has proven  to be fruitful. The summary and review  of the
respective literature  made clear that not only the choice of the metric is  important (whether time-
proportionality  is assumed (HALYs) or not (HYE, WTP) and whether the units  are  monetary in
nature or not) but that it is particularly  important which empirical choices (e.g., life table, time
discounting, elicitation method, elicitation question and elicited group) are finally  made within the
metric. A summary of strengths and weaknesses of three of the most often applied metrics is given in
Table VII.
Tab. VII: Major strengths and weaknesses of three often applied human health metrics

DALYs
   Strengths:
      -  consistent sets of disability weights for environmental endpoints readily available
      -  age-weighting
      -  societal perspective of disability weights and major assumptions
      -  metric unit is framed as health loss
   Weaknesses:
      -  assumption of time proportionality and risk-neutrality
      -  lacking consideration of COI and collectively borne intangible costs
      -  implementation and shape of age-weighting

QALYs
   Strengths:
      -  methods and limited set of quality weights available
      -  individual perspective of disability weights and major assumptions
      -  data for co-morbidity-adjustments at higher age available
   Weaknesses:
      -  assumption of time proportionality and risk-neutrality
      -  lacking consideration of COI and collectively borne intangible costs
      -  no age-weighting

WTP
   Strengths:
      -  metric is easier comparable to other attributes relevant in a decision process
      -  time-proportionality and risk-neutrality is not implied
      -  individually borne COI and intangible costs are considered
   Weaknesses:
      -  the methods and the set of values applicable for environmental endpoints need further
         research
      -  collectively borne COI and intangible costs are not included
      -  dollar/health-risk tradeoffs provoke protest bids and refusal (possible sign for non-
         compensatory nature of goods) and are more demanding than health/health tradeoffs

A case study that applied three different health metrics (DALYs, QALYs and WTP) to the example
of environmental health impacts in The Netherlands revealed the empirical relevance of the choice
of monetary versus non-monetary methods and the sensitivity of the results to mild distortions that
affect large shares of the population (e.g., noise impacts, allergies, effects of endocrine disruption).
Further, it has been noticed that the availability of databases with consistent preference values for
health outcomes differs among the three metrics, with DALYs presently offering the most
comprehensive publicly available database.


The characterization of both medical and environmental decision support systems showed that their
attributes vary largely  within and between  these groups. This  may explain the large number of
suggested health metrics and the different versions of the same metrics. Since the characteristics and
assumptions of an application or tool should be congruent with the characteristics and assumptions
of the health metrics, we indicate ranges of features of metrics that are compatible with the different
applications. These recommendations (Table VI) remain preliminary, as the science is still
developing and the chosen categorization of applications is probably too rough.

For the application in the environmental field we learn that the present state of the art in WTP leads
in our example to a pure mortality assessment, which may be an artifact due to the lack of reliable
values for age-dependent statistical values of life and due to insufficient studies that assess WTP
values for morbidity outcomes. Further, HALYs are highly sensitive to the preference weights for
mild health outcomes. Since many elicitation methods are unable to deal adequately with mild health
outcomes, this needs special attention in any analysis. Since the valuation of premature mortality has
been shown (empirically and theoretically) to be age dependent but not proportional to the years of
life lost, age weighting may be a relevant characteristic to be considered. The application of health
metrics in the environmental arena makes this point even more important since many environmental
risk factors affect old people only, while some affect children only or the average population.

A further implication of our analysis is that indeed - as criticized by many authors - most health
metrics follow the philosophy of utility maximizing. Since decision makers may want to base their
decisions not only on a utility metric but also on insights into how different ethical and distributional
modifications would affect the outcome, we suggest a semi-quantitative discussion that evaluates the
influence of the following aspects: who are the worst-off; which group could profit most
(realization-potential); what is the age-distribution; who are the innocent; what changes if patients
below the age of 70 or 75 are saved first (fair innings); does income matter; are already
disadvantaged subgroups concerned (double jeopardy); and have case-specific valuation attributes
been overlooked by the generic health metrics? (See Section 3 for such a discussion based on our
case study.) These considerations are usually already made in today's decision making. Therefore,
this semi-quantitative analysis will provide the hard data to support these considerations and does
not replace the purpose of deriving a utility measurement.
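
One of these modifications, the fair-innings criterion, lends itself to a simple sketch: years of life that would have been lived beyond a threshold age of 70 or 75 are not counted. The function below and the three example deaths are hypothetical illustrations and not data from the case study.

```python
def years_lost(age_at_death, remaining_life_expectancy, threshold_age=None):
    """Years of life lost for one premature death.

    With a threshold age (fair innings), years that would have been lived
    beyond the threshold are not counted.
    """
    expected_age = age_at_death + remaining_life_expectancy
    if threshold_age is not None:
        expected_age = min(expected_age, threshold_age)
    return max(0.0, expected_age - age_at_death)


if __name__ == "__main__":
    # Hypothetical deaths: (age at death, remaining life expectancy in years).
    deaths = [(40, 38.0), (68, 14.0), (80, 8.0)]
    for age, remaining in deaths:
        print(
            f"death at {age}: "
            f"no threshold {years_lost(age, remaining):.1f}, "
            f"threshold 70: {years_lost(age, remaining, 70):.1f}, "
            f"threshold 75: {years_lost(age, remaining, 75):.1f}"
        )
```

Deaths at high ages contribute little or nothing under such a threshold, which is why health damages from particulate and ozone exposure would be expected to drop most in a fair-innings reanalysis.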

Our analysis of the application of human health metrics to environmental decision support tools was
limited in detail and scope, which leaves open a number of potentially important questions:

•   For morbidity outcomes, we have not studied the empirical relevance of the fact that the time-
    proportionality assumption made in HALYs does not hold in practice. Potentially useful data
    collected within WTP studies are difficult to use because, first, morbidity outcomes do not (in our
    example) matter in WTP estimates and, second, the effects of scope insensitivities interfere
    with time-non-proportionality and are difficult to separate.

•   We have not investigated the practical relevance of differences in quality weights for
    environmental applications. These differences would be due to different elicitation methods,
    different question framings, or different groups of respondents. To investigate these possible
    differences, an empirical study would be needed that derives values for these different
    combinations on a reasonable number of environmentally relevant human health endpoints.

•  Since environmental decision support systems sometimes capture effects that are predicted to
   occur in the distant future one would need to develop life tables, trend estimates for population
   development, quality weights in a future world with new medical treatment possibilities, future
   increases in the value of HALYs and  statistical life, and probably  most importantly, a time
   discounting framework that would reflect intergenerational preferences held by concerned
   stakeholders.

•  Only a  relatively small number of environmental decision support tools have been considered.
   However, the chosen applications are probably those that attempt to estimate health impacts on a
   disease  or disorder level including duration and number of affected individuals.

•  The availability of information on  disease type and disorder, age  of onset  and duration of
   disease, and number of affected individuals has been assumed. However, we did not discuss how
   and when this information can be derived nor did we show how some of this data could be
   estimated.

•  We also did not include all types of environmentally caused human health impacts that may
   become important in single case studies. In particular, we left out issues such as developmental
   and fertility effects due to endocrine disrupters, hereditary effects due to ionizing radiation,
   and developmental effects in the fetus due to environmental causes.

•  Further, we did not address the question of whether simple exchange rates or transformation
   functions between different metrics exist. In medical applications, a rule of thumb says that a
   treatment or new drug should not cost more than 50,000 to 100,000 US$ per QALY (Hammitt
   2000b). Such rules of thumb suggest that such transferability does exist. However, the case study
   in Section 3 and the different assumptions on time-proportionality and age-weighting make clear
   that such a straightforward exchange rate does not exist.
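
The rule of thumb quoted in the last bullet can be illustrated with a minimal sketch. The function names, the cost figure, and the two QALY totals are hypothetical and chosen only for illustration; the sole number taken from the text is the band of 50,000 to 100,000 US$ per QALY. The sketch merely shows that the same intervention can fall on different sides of the band when the underlying health metric yields a different QALY total.

```python
# Hypothetical helper names; the band below is the rule of thumb quoted in
# the text (50,000 to 100,000 US$ per QALY), not a recommended threshold.
RULE_OF_THUMB_USD_PER_QALY = (50_000.0, 100_000.0)


def cost_per_qaly(cost_usd: float, qalys_gained: float) -> float:
    """Return the cost-effectiveness ratio in US$ per QALY gained."""
    if qalys_gained <= 0:
        raise ValueError("qalys_gained must be positive")
    return cost_usd / qalys_gained


def position_in_band(ratio: float, band=RULE_OF_THUMB_USD_PER_QALY) -> str:
    """Classify a ratio as below, within, or above the rule-of-thumb band."""
    lower, upper = band
    if ratio < lower:
        return "below"
    if ratio <= upper:
        return "within"
    return "above"


# Assumed, purely illustrative inputs: the same 1.2 million US$ intervention
# evaluated with two made-up QALY totals, standing in for two metrics that
# differ, for example, in age weighting.
for label, qalys in [("metric A", 30.0), ("metric B", 10.0)]:
    ratio = cost_per_qaly(1_200_000.0, qalys)
    print(label, round(ratio), position_in_band(ratio))
```

Because the ratio depends on how the QALY total was derived, a fixed monetary band cannot serve as a metric-independent exchange rate, which is the point made in the bullet above.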

In addition to research into the limitations listed above, we suggest work on the following research
questions because of their demonstrated relevance for environmental decision support systems:

1. An age-dependent value of statistical life or utility-adjusted years of life lost have been shown
   to best reflect both public values and outcomes of theoretical life-cycle models. These insights
   were applied, e.g., in Seethaler et al. (1999) to estimate age-dependent VSL, and some
   applications based on Murray et al. (1996a) take age weighting for DALYs into account as well.
   However, in both cases the underlying evidence for the shape of the age adjustments is weak,
   their slopes contradict each other, and, given their practical relevance, they require more
   investigation. Since these age-adjustment functions may look very different for single
   individuals, studies must either include large samples or define subgroups or contexts that allow
   more homogeneous answers.

2.  Quality and disability weights for distortions or mild illnesses that are caused by environmental
   risk factors need to be assessed with a special emphasis on the potential biases introduced by the
   commonly used VAS, TTO, SG and PTO elicitation methods.

We hope that this article will contribute to a better understanding of the differences between available
health  metrics and a more informed choice of metric by practitioners. In addition, we hope it will
stimulate additional research to help resolve some of the remaining conceptual and practical issues in
measuring health for use in environmental decision support tools.

-------
References
AbouZahr C & Vaughan JP. (2000). Assessing the burden of sexual and reproductive ill-health: questions regarding the
      use of disability-adjusted life years. Bulletin of the World Health Organization, 78 (5), 655-664.
Adamowicz W, Louviere J & Swait J. (1998). Introduction to Attribute-based Stated Choice Methods. Prepared for the
      National Oceanic and Atmospheric Administration, NOAA Purchase order 43AANC601388 by ADVANIS,
      Edmonton (Can).
Alberini A, Cropper M, Fu T-T, Krupnick A, Liu J-T, Shaw D & Harrington W. (1997). Valuing health effects of air
      pollution in developing countries: the case of Taiwan. Journal of Environmental Economics and Management, 34,
      107-126.
Alderson M. (1988). Mortality, Morbidity and Health Statistics. New York, NY: Stockton Press.
Anand S & Hanson K. (1997). Disability-adjusted life years: a critical review. Journal of Health Economics, 16,
      685-702.
Andersson F & Lyttkens CH. (1999). Preferences for equity in health behind a veil of ignorance. Health Economics, 8,
      369-378.
Anonymous. (1994). Asthma TyPE, Health Outcome Institute.
Anonymous. (1999a). Victorian Burden of Disease Study: Morbidity. Melbourne, Victoria: Public Health Division.
      http://www.dhs.vic.gov.au/phd/9909065/index.htm (11/09/00).
Anonymous. (1999b). Victorian Burden of Disease Study: Mortality. Melbourne, Victoria: Public Health Division.
      http://www.dhs.vic.gov.au/phd/9903009/index.htm (11/09/00).
Arnesen T & Nord E. (1999). The value of DALY life: problems with ethics and validity of disability adjusted life
      years. BMJ, 319, 1423-5.
Arrow K, Solow R, Portney PR, Leamer EE, Radner R & Schuman H. (1993). Report of the NOAA Panel on Contingent
      Valuation. Federal Register 58, 4601-4614.
Azimi NA & Welch HB. (1998). The effectiveness of cost-effectiveness analysis in containing costs. J Gen Intern Med,
      13, 664-669.
Bala MV & Zarkin GA. (2000). Are QALYs an appropriate measure for valuing morbidity in acute diseases? Health
      Economics, 9, 177-180.
Bala MV, Wood LL, Zarkin GA, Norton EC, Gafni A & O'Brien B. (1996). Testing constant proportional trade-off
      assumption using standard gamble.  Clinical Therapeutics, 18, 31.
Bala MV, Wood LL, Zarkin GA, Norton EC, Gafni A & O'Brien B. (1998). Valuing outcomes in health care: A
      comparison of willingness to pay and quality-adjusted life-years. J Clin Epidemiol, 51 (8), 667-676.
Banfi S, Doll C, Maibach M, Rothengatter W, Schenkel P, Sieber N & Zuber J. (2000).  External Costs of Transport:
      Accident, Environmental and Congestion Costs of Transport in Western Europe. INFRAS Zurich/IWW
      Karlsruhe.
Barendregt JJ, Bonneux L & van der Maas PJ. (1996). DALYs: the age-weights on balance. Bulletin of the World Health
      Organization, 74 (4), 439-443.
Baron J. (1996). Why expected utility theory is normative but not prescriptive? Med Decis Making, 16, 7-9.
Baron J. (1997). Biases in the Quantitative Measurement of Values for Public Decisions. Psychological Bulletin, 122 (1),
      72-88.
Beattie J, Covey J, Dolan P, Hopkins L, Jones-Lee M, Loomes G, Pidgeon N, Robinson A & Spencer A. (1998). On the
      contingent valuation of safety and the safety of contingent valuation: part 1- caveat investigator. Journal of Risk
      and Uncertainty, 17, 5-25.
Bell CM, Chapman RH, Stone PW, Sanberg EA & Neumann PJ. (1999). An Off-the-Shelf-Help List: A comprehensive
      catalogue of preference weights from published cost-utility analyses (CUAs), submitted December 9,  1999.
Bennett R & Tranter R. (1998). The dilemma concerning choice of contingent valuation willingness-to-pay elicitation
      format. Journal of Environmental Planning and Management, 41 (2), 253-7.
Bergstrom TC. (1982). When is a man's life worth more than his human capital? in Jones-Lee MW (ed), The Value of
      Life and Safety: Proceedings of a Conference held by the Geneva Association, 3-26, Amsterdam: The
      Netherlands.
Bleichrodt H, Pinto JL & Wakker PP. (2000). Making descriptive use of prospect theory to improve prescriptive
      applications of expected utility. Working Paper, Erasmus University Rotterdam, NL, March 24, 2000.

Blumenschein K & Johannesson M. (1999). Use of contingent valuation to place monetary value on pharmacy services:
      An overview and review of the literature. Clinical Therapeutics, 21 (8), 1402-1417.
Brazier J & Deverill M. (1999). A checklist for judging preference-based measures of health related quality of life:
      learning from psychometrics. Health Economics, 8, 41-51.
Brickman P, Coates D & Janoff-Bulman R. (1978). Lottery winners and accident victims: is happiness relative? Journal
      of Personality and Social Psychology, 36, 917-927.
Brock DW. (1998). Ethical issues in the development of summary measures of population health status, in Field MJ and
      Gold MR (eds.). Summarizing population health: Directions for the development and application of population
      metrics. Washington, D.C.: National Academy Press (73-85).
Carrothers TJ, Evans JS. & Graham JD. (2000). The lifesaving benefits of improved air quality: An uncertainty analysis.
      Submitted to Risk Analysis, May 2000.
Carson RT, Hanemann WM, Kopp RJ et al. (1997). Temporal reliability of estimates from contingent valuation, Land
      Economics, 73(2), 151-63.
Carson RT. (2000). Contingent Valuation: A user's guide. Environ. Sci. Technol., 34, 1413-1418.
Carthy T, Chilton S, Covey J, Hopkins L, Jones-Lee M, Loomes G, Pidgeon N & Spencer A. (1999). On the contingent
      valuation of safety and the safety of contingent valuation: part 2 - The CV/SG "Chained" Approach. Journal of
      Risk and Uncertainty, 17, 187-213.
Chisholm D, Healey A & Knapp M. (1997). QALYs and mental health care. Soc Psychiatry Psychiatr Epidemiol, 32,
      68-75.
Cohen BJ. (1996a). Is expected utility theory normative for medical decision making? Med Decis Making, 16, 1-6.
Cohen BJ. (1996b). Reply: Utilitarianism, risk aversion, and expected utility theory. Med Decis Making, 16, 14.
Cohen J. (1996). Preferences, needs and QALYs. Journal of Medical Ethics, 22, 267-272.
Cookson R. (2000). Incorporating psycho-social considerations into health valuation: an experimental  study. Journal of
      Health Economics, 19(3), 369-401.
Cropper ML & Sussman FG. (1990). Valuing future risks to life. Journal of Environmental Economics and
      Management, 19, 160-174.
Cutler D & Richardson E. (1998). The value of health: 1970-1990. American Economic Review, 88, 97-100.
de Hollander AEM, Melse JM, Lebret E & Kramers PGN. (1999). An aggregate public health indicator to represent the
      impact of multiple environmental exposures. Epidemiology, 10,  606-617.
de Rosa CT, Stara JF & Durkin PR. (1985). Ranking chemicals based on chronic toxicity data. Toxicology and Industrial
      Hygiene, 1 (4), 177-191
de Wit GA, Busschbach JJV & de Charro FTH. (2000). Sensitivity and perspective in the valuation of health status:
      whose values count? Health Economics, 9 (2), 109-126.
Dickie M & Gerking S. (1996). Formation of Risk Beliefs, Joint Production and Willingness to Pay to Avoid Skin
      Cancer. Review of Economics & Statistics, 78 (3), 451-63.
Diener A., O'Brien B & Gafni A.  (1998). Health care contingent valuation studies: a review and classification of the
      literature. Health Economics, 7, 313-326.
Dolan P, Gudex C, Kind P & Williams A. (1996). Valuing health states: A comparison of methods. Journal of Health
      Economics 15, 209-231.
Dolan P. (1996). Modelling valuations for health states: the effects of duration. Health Policy, 38, 189-203.
Dolan P. (1997). The nature of individual preferences: A prologue to Johannesson, Jonsson and Karlsson. Health
      Economics, 6, 91-93.
Dolan P. (1999). Whose preferences count? Med Decis Making, 19, 482-486.
Donaldson C. (1999). Valuing the benefits of publicly-provided health care: does 'ability to pay' preclude the use of
      'willingness to pay'? Social Science & Medicine, 49, 551-563.
Douard J. (1996). Is risk neutrality neutral? Med Decis Making, 16, 10-11.
Dougherty CJ. (1994). Quality-adjusted life years and the ethical values of health care. Am J Phys Med Rehabil,
      73, 61-65.
Eeckhoudt L. (1996). Expected utility theory - is it normative or simply "practical"? Med Decis Making, 16, 12-13.
Elbasha EH. (2000). Discrete time representation of the formula for calculating DALYs. Health Economics, 9, 353-365.
Elixhauser A & Steiner CA. (1999). Most Common Diagnoses and Procedures in U.S. Community Hospitals, 1996.
      Healthcare Cost and Utilization Project (HCUP) Research Note: Prepared by the Agency for Health Care Policy
      and Research (AHCPR), Rockville, MD. Publication No. 99-0046. http://www.ahcpr.gov/data/hcup/commdx.
      (11/13/00).
ESEERCO (Empire State Electric Energy Research Corporation). (1995). New York State Environmental Externalities
      Cost Study, Volume 1: Introduction and Methods. New York, NY: Oceana Publications Inc.
Essink-Bot ML, Stouthard ME & Bonsel GJ. (1993). Generalizability of valuation on health states collected with the
      EuroQol questionnaire. Health Econ, 2, 237-46.
ExternE. (1995). Externalities of Energy. European Commission EUR 16520 EN, Volume 1-6, Luxembourg: Office for
      Official Publications of the European Communities.
ExternE. (1999). Externalities of Energy. European Commission EUR 19083 EN, Volume 7-10, Luxembourg: Office for
      Official Publications of the European Communities.
Field MJ & Gold MR (eds.).  (1998). Summarizing population health: Directions for the development and application of
      population metrics. Washington, D.C.: National Academy Press.
Fischer GW. (1979). Utility models for multiple objective decisions: do they accurately represent human preferences?
      Decis Sci, 10, 451-479.
Fischhoff B, Slovic P, Lichtenstein S, Read S & Combs B. (1978). How Safe is Safe Enough? A Psychometric Study of
      Attitudes towards Technological Risks and Benefits, Policy Sciences, 9, 127-52.
Fox JA, Shogren JF, Hayes DJ & Kliebenstein JB. (1998). CVM-X: Calibrating Contingent Values with experimental
      auction markets. Amer.J.Agr.Econ, 80, 455-465.
Frey RL, Gysin CH, Leu RE & Schmassmann N. (1985). Energie, Umweltschäden und Umweltschutz in der Schweiz,
      Basler Sozialökonomische Studien Band 27, Grüsch, CH: Verlag Rüegger.
Frischknecht R., Braunschweig A, Hofstetter P & Suter P. (2000). Human Health Damages due to Ionising Radiation in
      Life Cycle Impact Assessment, Environmental Impact Assessment Review, 20,  159-189.
Froberg DG & Kane RL. (1989a). Methodology for measuring health-state preferences - I: Measurement strategies. J
      Clin Epidemiol 42, 345-354.
Froberg DG & Kane RL. (1989b). Methodology for measuring health-state preferences - II: Scaling methods. J Clin
      Epidemiol 42, 459-471.
Froberg DG & Kane RL. (1989c). Methodology for measuring health-state preferences - III: Population and context
      effects. J Clin Epidemiol, 42, 585-592.
Fryback DG, Dasbach EJ, Klein R, Klein BEK, Dorn N, Peterson K & Martin PA. (1993). The Beaver Dam health
      outcomes study: Initial catalog of health state quality factors. Med Decis Making, 13, 89-102.
Fryback DG, Lawrence WF, Martin PA, Klein R & Klein BEK. (1997). Predicting quality of well-being scores from
      SF-36: results from the Beaver Dam Health Outcomes Study. Med Decis Making, 17 (1), 1-9.
Fryback DG. (1998). Methodological issues in measuring health status and health-related quality of life for population
      health measures: a brief overview of the "HALY" family of measures in Field MJ and Gold MR (eds.).
      Summarizing population health: Directions for the development and application of population metrics.
      Washington, D.C.: National Academy Press, 39-57.
Gabriel SE, Kneeland TS, Melton LJ, Moncur MM, Ettinger B & Tosteson ANA. (1999). Health-related quality of life in
      economic evaluations  for Osteoporosis: Whose values should we use? Med Decis Making, 19, 141-148.
Gafni A & Birch S. (1993). Searching for a common currency: Critical appraisal of the scientific basis underlying
      European harmonization of the measurement of Health Related Quality of Life (EuroQol®). Health Policy, 23,
      219-228.
Gafni A. (1997). Alternatives to the QALY measure for economic evaluations. Support Care Cancer, 5, 105-111.
George JF, Duffy K & Ahuja M. (2000).  Countering the anchoring and adjustment bias with decision support systems.
      Decision Support Systems, 29,  195-206.
Goedkoop M & Spriensma R. (1999). The Eco-indicator '99: A damage oriented method for Life Cycle Impact
      Assessment. Nr. 1999/36A/B. Zoetermeer, NL: VROM, http://www.pre.nl/eco-indicator99/ei99-reports.htm
      (10/19/00)
Gold MR, Siegel JE, Russell LB & Weinstein MC (eds). (1996). Cost-Effectiveness in Health and Medicine. New York, NY:
      Oxford University Press.

Groot W. (2000). Adaptation and scale of reference bias in self-assessment of quality of life. Journal of Health
      Economics, 19 (3), 403-420.
Guinee J & Heijungs R. (1993). A proposal for the Classification of Toxic Substances within the Framework of Life
      Cycle Assessment of Products, Chemosphere 26 (10): 1925-44.
Hammitt JK & Graham JD. (1999b). Willingness to pay for health protection: Inadequate sensitivity to probability?
      Journal of Risk and Uncertainty, 18, 33-62.
Hammitt JK, Belsky ES, Levy JI & Graham JD. (1999a). Residential building codes, affordability, and health protection:
      A risk-tradeoff approach. Risk Analysis, 19(6), 1037-1058
Hammitt JK, Liu J-T & Liu J-L. (2000). Survival is a Luxury Good: The Increasing Value of a Statistical Life.
      Manuscript at the Harvard School of Public Health, Boston, MA. August 2000.
Hammitt JK. (1993). Editorial: discounting health increments. Journal of Health Economics, 12, 117-120.
Hammitt JK. (2000a). Evaluating Contingent Valuation of Environmental Health Risks: The proportionality test.
      Association of Environmental and Resource Economists Newsletter, 20 (1),  14-19.
Hammitt JK. (2000b). Valuing mortality risk: Theory and Practice. Environ. Sci. Technol., 34 (8), 1396-1400.
Hammitt JK. (2002). QALY versus WTP. To be published in Risk Analysis.
Hanson K. (1999). Measuring up: Gender,  Burden of Disease, and priority setting techniques in the health sector.
      August 1999. Harvard Center for Population and Development Studies.
      www.hsph.harvard.edu/organi.. .lthnet/Hupapers/gender/hanson.html (4/10/00).
Harada T, Fujii Y, Nagata K, Inaba A & Mettier T. (2000). Panel test for Japanese LCA experts aiming to weight
      safeguard subjects. Proceedings of The Fourth International Conference  on EcoBalance, 10/31-11/2/2000,
      Tsukuba, Japan, 201-204.
Havelaar AH, de Hollander AEM, Teunis PFM, Evers EG, Van Kranen HJ, Versteegh FM, van Koten JEM & Slob W.
      (2000). Balancing the risks and benefits of drinking water disinfection: disability adjusted life-years on the scale.
      Environmental Health Perspectives, 108 (4), 315-321.
Hertwich EG. (1999). Toxic Equivalency: Accounting for Human Health in Life-Cycle Impact Assessment. Berkeley,
      CA: University of California.
Hoffman C, Rice D & Sung H-Y. (1996). Persons with Chronic Conditions: their Prevalence and Costs. Jour. American
      Medical Association, 276 (18), 1473-1479.
Hofstetter P. (1998). Perspectives in Life Cycle Impact Assessment; A structured approach to combine models of the
      technosphere, ecosphere,  and valuesphere, Boston: Kluwer Academic Publishers.
Hofstetter P., Bare JC, Hammitt JK, Murphy PA, Rice GE. (2002). Tools for comparative analysis of alternatives:
      Competing or complementary perspectives? accepted for publication in Risk Analysis
Holmes AM. (1995). A quality-based societal health statistic for Canada, 1985. Soc Sci Med, 41 (10), 1417-1427.
Holmes AM. (1997). A method to elicit utilities for interpersonal comparisons. Med Decis Making, 17, 10-20.
Hoogenveen RT, Gijsen R, Genugten MLL van, Kommer GJ, Schouten JSAG & Hollander AEM de. (2000). Dutch
      DisMod. Constructing a set of consistent data for chronic disease modeling. RIVM report 260751001, Bilthoven,
      NL: RIVM.
Huber J, Wittink DR, Fiedler JA & Miller R.  (1993). The effectiveness of alternative preference elicitation procedures in
      predicting choices. J Market Res, 30, 105-114.
Huijbregts MAJ.  (1999).  Priority Assessment of Toxic Substances in the Frame of LCA: Development and Application of
      the Multi-Media Fate, Exposure and Effect model USES-LCA, University of Amsterdam,
      http://www.leidenuniv.nl/interfac/cml/lca2/index.html (11/10/00).
ILSI. (1996). Human Health Impact Assessment in Life Cycle Assessment: Analysis by an expert panel. Washington, DC:
      International Life Sciences Institute.
Inhaber H. (1982). Energy Risk Assessment. New York, NY: Gordon and Breach Science Publishers.
ISO. (1997). Environmental Management - Life Cycle Assessment - Principles and Guidelines, EN ISO 14040, Brussels,
      Belgium.
Jansen SJT,  Stiggelbout AM, Wakker P, Nooij MA, Nordijk EM & Kievit J. (2000). Unstable preferences: a shift in
      valuation or an effect of the elicitation procedure? Med Decis Making, 20, 62-71.

Johannesson M & Johansson P-O. (1997a). Quality of life and the WTP for an increased life expectancy at an advanced
      age. Journal of Public Economics, 65, 219-228.
Johannesson M, Johansson P-O & Lofgren K-G. (1997c). On the value of changes in life expectancy: Blips versus
      parametric changes. Journal of Risk and Uncertainty, 15, 221-239.
Johannesson M, Meltzer D  & O'Conor RM. (1997b). Incorporating future costs in medical cost-effectiveness analysis:
      Implications for the cost-effectiveness of the treatment of hypertension. Med Decis Making, 17, 382-389.
Johannesson M, Pliskin J & Weinstein MC. (1994). A note on QALYs, time tradeoff and discounting. Med Decis
      Making, 14, 188-193.
Johnson FR, Banzhaf MR & Desvousges WH. (2000a). Willingness to pay for improved respiratory and cardiovascular
      health: A multiple-format, stated-preference approach. Health Economics, 9, 295-317.
Johnson FR, Desvousges WH, Ruby MC, Stieb D & De Civita P. (1998). Eliciting stated health preferences: an
      application to willingness to pay for longevity. Med Decis Making, 18, 51-61.
Jones-Lee M & Loomes G. (1997). Valuing Health and Safety: some Economic and Psychological Issues, in Nau R. et
      al. (eds.), Economic and Environmental Risk and Uncertainty. Dordrecht, NL: Kluwer Academic Publishers,
      3-32.
Jones-Lee MW, Loomes G & Philips PR. (1995). Valuing the prevention of non-fatal road injuries: Contingent valuation
      vs. standard gambles. Oxford Economic Papers, 47, 676-695.
Jones-Lee MW. (1992). Paternalistic altruism and the value of statistical life, Economic Journal, 102,  80-90.
Kaplan RM & Anderson JP. (1988). A general health policy model: Update and applications. Health Serv Res, 23,
      203-35.
Keeler EB & Cretin S. (1983). Discounting of life-saving and other nonmonetary effects. Management Science, 29,
      300-306.
Keeney RL & Raiffa H. (1976). Decisions with multiple objectives: preferences and value tradeoffs. New York, NY:
      John Wiley & Sons.
Kenkel D. (1997). On valuing morbidity, cost-effectiveness analysis, and being rude. Journal of Health Economics, 16,
      749-757.
Koch T. (2000). Life quality vs the 'quality of life': assumptions underlying prospective quality of life instruments in
      healthcare planning. Social Science & Medicine, 51, 419-427.
Koch T. (2000). The illusion of paradox. Social Science & Medicine, 50, 757-759.
Krabbe PFM & Bonsel GJ.  (1998). Sequence effects, health profiles, and the QALY model: In search of realistic
      modeling. Med Decis Making, 18, 178-186.
Krabbe PFM, Essink-Bot M-L & Bonsel GJ. (1997). The Comparability and Reliability of Five Health-State Valuation
      Methods. Social Science and Medicine, 45 (11), 1641-1652.
Kristiansen CM. (1985). Value correlates of preventive health behavior. Journal of Personality and Social Psychology,
      49(3), 748-758.
Kuppermann M, Shiboski S, Feeny D, Elkin EP & Washington AE. (1997). Can preference scores for discrete  states be
      used to derive preference scores for an entire path of events? An application to prenatal diagnosis. Med Decis
      Making, 17, 42-55.
Leigh JP, Markowitz SB, Fahs M, Shin C & Landrigan PJ. (1997). Occupational Injury and Illness in the United
      States: Estimates of Costs, Morbidity, and Mortality. Archives of Internal Medicine, 157 (14), 1557-1568.
Leonard HB & Zeckhauser RJ. (1986). Cost-Benefit Analysis applied to risks: Its philosophy and legitimacy. In
      MacLean D. (Ed). Values at Risk. Totowa, NJ: Rowman & Allanheld Publishers, 31-48.
Levin R. (1997). Lead in drinking water, in Morgenstern R. (Ed). Economic analysis at EPA: Assessing Regulatory
      Impact. Washington, DC: Resources for the Future, 205-232.
Lippmann M (Ed). (2000). Environmental Toxicants: Human Exposures and their Health Effects. Second Edition. New
      York, NY: Wiley Interscience.
Loomes G & McKenzie L. (1989). The use of QALYs in health care decision making. Soc Sci Med, 28, 299-308.
MacKeigan LD, O'Brien BJ and Oh PI. (1999). Holistic versus composite preferences for lifetime treatment sequences
      for type 2 diabetes. Med Decis Making, 19, 113-121.
Magat WA, Viscusi WK & Huber J. (1996). A reference lottery metric for valuing health. Management Science, 42,
      1118-1130.

Mansourian BG. (1996). ACHR News. Bulletin of the World Health Organization, 74 (3), 333-337.
Mara DD & Feachem RGA. (1999). Water- and excreta-related diseases: unitary environmental classification, Journal of
      Environmental Engineering, 125 (4), 334-9.
McNeil BJ, Weichselbaum R & Pauker SG. (1981). Speech and survival: tradeoffs between quality and quantity of life
      in laryngeal cancer. N Engl J Med, 305, 982-987.
Mehrez A & Gafni A. (1989). Quality adjusted life years, utility theory, and healthy-years equivalents. Med Decis
      Making, 9, 142-149.
Miller GA. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing
      information. The Psychological Review, 63 (2), 81-97.
Miyamoto JM & Eraker SA. (1985). Parametric estimates for a QALY utility model. Med Decis Making, 5, 191-213.
Moore MJ & Viscusi WK. (1988). The quantity-adjusted value of life. Econ Inquiry, 26 (3), 369-388
Morgenstern R. (Ed). (1997). Economic analysis at EPA: Assessing Regulatory Impact. Washington, DC: Resources for
      the Future.
Müller-Wenk R. (1999). Life-Cycle Impact Assessment of Road Transport Noise, IWÖ-Diskussionsbeitrag Nr. 77, St.
      Gallen, CH: University of St. Gall, http://www.iwoe.unisg.ch/service/index-e.html (10/19/00).
Murray CJL & Lopez AD. (1996b). The incremental effect of age-weighting on YLLs, YLDs, and DALYs: a response.
      Bulletin of the World Health Organization, 74 (4), 445-446.
Murray CJL & Lopez AD. (1997). The utility of DALYs for public health policy  and research: a reply. Bulletin of the
      World Health Organization, 75 (4), 377-381.
Murray CJL & Lopez AD. (2000). Progress and directions in refining the Global Burden of Disease approach;  A
      response to Williams. Health Economics, 9,  69-82.
Murray CJL & Lopez AD. (Eds.). (1996a). The Global Burden of Disease, Volume I of Global Burden of Disease and
      Injury Series,  WHO/ Harvard School of Public Health/ World Bank. Boston, MA: Harvard University Press.
Ng Y.-K. (1992). The older the more valuable: Divergence between utility and dollar values of life as one ages. Journal
      of Economics, 55 (1), 1-16.
Nord E. (1992a). Methods for quality adjustment of life years. Social Science and Medicine, 34, 559-569.
Nord E. (1992b). An alternative to QALYs: the saved young life equivalent (SAVE). BMJ, 305, 875-7.
Nord E. (1995). The person trade-off approach to valuing health care programs. Med Decis Making, 15, 201-208.
Nord E. (1999). Cost-Value Analysis in Health Care: Making Sense out of QALYs. Cambridge, U.K.: Cambridge
      University Press.
O'Brien B & Viramontes JL. (1994). Willingness to Pay: A valid and reliable measure of health state preference? Medical
      Decision Making, 15, 132-137.
Olsen JA. (2000). A note on eliciting distributive preferences for health. Journal of Health Economics, 19, 541-550.
Patrick DL & Erickson P. (1993). Health status and health policy: Quality of life in health care evaluation and resource
      allocation. New York, NY: Oxford University Press.
Patrick DL, Bush JW & Chen MM. (1973). Methods for measuring levels of well-being for a health status index. Health
      Serv Res, 8, 228-45.
Pigou AC. (1932). The Economics of Welfare, 4th edition, London, U.K.: Macmillan.
Pinto Prades J-L. (1997). Is the person trade-off a valid method for allocating health care resources? Health Economics,
      6, 71-81.
Pliskin JS, Shepard DS & Weinstein MC. (1980). Utility functions for life years and health status. Operations Research,
      28 (1), 206-224.
Ponce RA, Bartell SM, Wong EY, LaFlamme D, Carrington C, Lee RC, Patrick DL, Faustman EM & Bolger M. (2000).
      Use of Quality-Adjusted Life Year weights with dose-response models for public health decisions: A case study
      of the risks and benefits of fish consumption. Risk Analysis, 20 (4), 529-542.
Poulos C & Whittington D. (2000). Time preferences for life-saving programs: evidence from six less developed
      countries. Environ. Sci. Technol, 34, 1445-1455.
Pratt JW &  Zeckhauser RJ. (1996). Willingness to pay and the Distribution of Risk and Wealth. Journal of Political
      Economy, 104 (4), 747-763.
Raiffa H. (1961). Risk, uncertainty and the Savage axioms: comment. Quarterly Journal of Economics, 75, 690-694.
Raiffa H. (1970). Decision Analysis: Introductory Lectures on Choices under Uncertainty. Reading, MA:
      Addison-Wesley.
Ramsberg J. (1999). Listening to the vocal citizens: how do politically active individuals choose between lifesaving
      programs? Journal of Risk Research, 2 (4), 355-367.
Ratcliffe J. (2000). Public preferences for the allocation of donor liver grafts for transplantation. Health Economics, 9,
      137-148.
Rawls J. (1971). Theory of Justice. Cambridge, MA: Harvard University Press.
Redelmeier DA, Heller DN & Weinstein MC. (1994). Time preference in medical economics: science or religion? Med
      Decis Making, 14, 301-303.
Reiling SD, Boyle KJ, Philips ML & Anderson MW. (1990). Temporal reliability of contingent valuation. Land
      Economics, 66(2), 128-134.
Richardson J & Nord E. (1997). The importance of perspective in the measurement of quality-adjusted life years. Med
      Decis Making, 17, 33-41.
Richardson J, Hall J & Salkeld G. (1996). The measurement of utility in multiphase health states. Int J Technol Assess
      Health Care, 12, 151-162.
Richardson J. (1994). Cost utility analysis: What should be measured? Soc Sci Med, 39 (1), 7-21.
Rokeach M. (1973). The nature of human values. New York, NY: The Free Press.
Rosser RM & Watts VC. (1972). The measurement of hospital output. Int J Epidemiol, 1, 361-68.
Rusthoven JJ. (1997). Are quality of life, patient preferences, and costs realistic outcomes for clinical trials? Support
      Care Cancer, 5, 112-117.
Ryan M & Hughes J. (1997). Using conjoint analysis to assess women's preferences for miscarriage management.
      Health Economics 6, 261-273.
Ryan M. (1999). Using conjoint analysis to take account of patient preferences and go beyond health outcomes: an
      application to in vitro fertilisation. Social Science & Medicine, 48, 535-546.
Saaty TL. (1980). The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. New York, NY:
      McGraw-Hill.
Sackett DL & Torrance GW. (1978). The utility of different health states as perceived by the general public. J Chron
      Dis, 31, 697-704.
Sagoff M. (1998). Aggregation and deliberation in valuing environmental public goods: A look beyond contingent
      pricing. Ecological Economics, 24, 213-230.
Sayers B & Fliedner TM. (1997). The critique of DALYs: a counter-reply. Bulletin of the World Health Organization,  75
      (4), 383-384.
Seethaler R. (1999). Health costs due to road traffic-related air pollution; an impact assessment project of Austria,
      France and Switzerland. Synthesis Report. Prepared for the WHO Ministerial Conference on Environment and
      Health, London. GVF-Bericht 1/99, Bern, Switzerland: Federal Department of Environment, Transport, Energy
      and Communications.
Sen AK. (1979). Personal utilities and public judgements or what's wrong with welfare economics? Economic Journal,
      89,  537-558.
Shepard DS & Zeckhauser RJ. (1984). Survival versus consumption. Management Science, 30(4), 423-439.
Shiell A, Seymour J, Hawe P & Cameron S. (2000). Are preferences over health states complete? Health Economics, 9,
      47-55.
Singer P, McKie J, Kuhse H & Richardson J. (1995). Double jeopardy and the use of QALYs in health care allocation.
      Journal of Medical Ethics, 22, 144-50.
Sintonen H. (1981).  An approach to measuring and valuing health states. Soc Sci Med, 15C, 55-65.
Spilker B. et al. (1996). Quality of Life, Bibliography and Indexes. Medical Care, 28 (Suppl 12), D51-77.
Stalmeier PFM & Bezembinder TGG. (1999). The discrepancy between risky and riskless utilities: a matter of framing?
      Med Decis Making, 19, 435-447.
Stouthard MEA, Essink-Bot M-L & Bonsel GJ, on behalf of the Dutch Disability Weights Group. (2000). Disability
      weights for diseases: a modified protocol and results for a Western European region. Eur J Public Health, 10, 24-
      30.

Stouthard MEA, Essink-Bot M-L, Bonsel GJ, Barendregt JJ, Kramers PGN, van de Water HPA, Gunning-Schepers LJ &
      van der Maas PJ. (1997). Disability weights for diseases in The Netherlands. Rotterdam, NL: Department of
      Public Health/Erasmus University Rotterdam.
Sutherland HU, Llewellyn-Thomas H, Boyd NF & Till JE. (1982). Attitudes toward quality of survival. The concept of
      "maximum endurable time". Med Decis Making, 2, 299-309.
Tengs TO, Adams ME, Pliskin JS, Safran DG, Siegel JE, Weinstein MC & Graham JD. (1995). Five-hundred life-saving
      interventions and their cost-effectiveness. Risk Analysis, 15 (3), 369-89.
Tolley G, Kenkel D & Fabian R (eds.). (1994). State-of-the-art health values. In Tolley G, Kenkel D & Fabian R (eds.).
      Valuing Health for Policy; An Economic Approach. Chicago, IL: The University of Chicago Press, 323-344.
Tolley G, Kenkel D & Fabian R. (1994). Valuing Health for Policy; An economic approach. Chicago, IL: The
      University of Chicago Press.
Torrance GW, Feeny DH, Furlong WJ, Barr RD, Zhang Y & Wang Q. (1996). Multiattribute utility function for a
      comprehensive health status classification system; Health Utility Index Mark 2. Medical Care, 34 (7), 702-722.
Torrance GW, Thomas WH & Sackett DL. (1972). A utility maximization model for evaluation of health care
      programmes. Health Serv Res, 7, 118-33.
Torrance GW. (1986). Measurement of health state utilities for economic appraisal; A review. Journal of Health
      Economics, 5, 1-30.
Toxicology Excellence for Risk Assessment (TERA). (1999). Comparative Dietary Risks: Balancing the Risks and
      Benefits of Fish Consumption, http://www.tera.org/news/ (11/13/00)
Treadwell JR. (1998). Tests of preferential independence in the QALY model. Med Decis Making, 18, 418-428.
Tversky A & Kahneman D. (1992). Advances in prospect theory: cumulative representation of uncertainty. Journal of
      Risk and Uncertainty, 5, 297-323.
Ubel PA, Richardson J & Menzel P. (2000). Societal value, the person trade-off and the dilemma of whose values to
      measure for cost-effectiveness analysis. Health Economics, 9(2), 127-136.
Udo de Haes HA, Jolliet O, Finnveden G, Hauschild M, Krewitt W, Müller-Wenk R. (eds.) (1999). Best available
      practice regarding impact categories and category indicators in Life Cycle Impact Assessment, Background
      document for the second working group on Life Cycle Impact Assessment of SETAC-Europe (WIA-2),
      Int.J.LCA, 4, 66-74, 167-174.
USDL/BLS. (1999). Lost-Worktime Injuries and Illnesses: Characteristics and Resulting Time Away from Work.
      Publication No. 99-102, Washington, DC: USDL/BLS. http://stats.bls.gov/oshhome.htm. (4/22/1999).
USEPA.  (1998a). Comparative Risk Framework; Methodology and Case Study. National Center for  Environmental
      Assessment (NCEA-C-0135), SAB External Review Draft http://www.epa.gov/ncea/frame.htm (11/9/00).
USEPA.  (1998b). Cost of Illness Handbook. Draft prepared by Abt Associates, Cambridge, MA for the Office of
      Pollution Prevention and Toxics, Washington, D.C. Electronic copy provided to MSB by Dr. Nicolaas Bouwes,
      EPA Project Manager on January 6, 2000.
USEPA.  (1999a). Valuation of human health and welfare effects of criteria pollutants. Appendix H in USEPA. The
      benefits and costs of the Clean Air Act 1990-2010, EPA Report to Congress, EPA-410-R-99-001
      http://www.epa.gov/oar/oacLcaa.html (11/13/00)
USEPA.  (1999b). The benefits and costs of the Clean Air Act 1990 to  2010, EPA Report to Congress, EPA-410-R-99-
      001 http://www.epa.gov/oar/oacLcaa.html (11/13/00)
Üstün TB, Rehm J, Chatterji S, Saxena S, Trotter R, Room R, Bickenbach J & WHO/NIH Joint Project CAR Study
      Group. (1999).  Multiple-informant ranking of the disabling effects of different health conditions in 14 countries.
      The Lancet, 354, 111-15.
Van der Pol MM & Cairns JA. (2000). Negative and zero time preference for health. Health Economics, 9, 171-175.
Van Hout BA. (1998). Discounting costs and effects: A reconsideration. Health Economics, 7, 581-594.
Viscusi WK. (1983). Risk by Choice, Regulating Health and Safety in the Workplace. Cambridge, MA: Harvard
      University Press. 93-113.
Viscusi WK. (1993). The value of risks to life and health. Journal of Economic Literature, XXXI, 1912-1946.
Viscusi WK. (1998). Rational Risk Policy. Oxford, U.K.: Clarendon Press, 45-68.
Viscusi WK, Magat WA & Huber J. (1987). An Investigation of the Rationality of Consumer Valuations of Multiple
      Health Risks. The RAND Journal of Economics, 18, 465-479.


Viscusi WK, Magat WA & Huber J. (1991). Pricing Environmental Health Risks: Survey Assessment of Risk-Risk and
      Risk-Dollar Trade-Offs for Chronic Bronchitis. J. Environmental Economics and Management, 21, 32-51.
Von Neumann J & Morgenstern O. (1943). Theory of games and economic behavior. Princeton: Princeton University
      Press, (3rd edition and second printing by Science Editions, John Wiley & Sons, New York, 1967).
von Winterfeldt D & Edwards W. (1986). Decision Analysis and Behavioral Research. Cambridge, U.K.: Cambridge
      University Press.
Wakker P & Deneffe D. (1996). Eliciting von Neumann-Morgenstern utilities when probabilities are distorted or
      unknown. Management Science, 42 (8), 1131-1150.
Wathieu L. (1997). Habits and the anomalies in intertemporal choice. Management Science, 43 (11), 1552-1563.
Weinstein MC & Stason WB. (1977). Foundations of cost-effectiveness analysis for health and medical practices. New
      England Journal of Medicine, 296(31), 716-721.
Weinstein MC, Siegel JE, Garber AM, Lipscomb J, Luce BR, Manning WG & Torrance GW. (1997). Productivity costs,
      time costs and health-related quality of life: A response to the Erasmus group. Health Economics, 6, 505-510.
Weitzman ML. (1998). Why the far-distant future should be discounted at its lowest possible rate. Journal of
      Environmental Economics and Management, 36, 201-208.
Wenstøp F, Carlsen AJ, Bergland O & Magnus P. (1997). Valuation of Environmental Goods with Expert Panels, in
      Climaco J. (Ed), Multicriteria Analysis, Berlin, D: Springer.
WHO. (1947). The Constitution of the World Health Organization. WHO Chronicle, 1, 29.
WHO. (1993). International Statistical Classification of Diseases and Related Health Problems - ICD-10, Tenth
      Revision, Geneva, CH: WHO.
Williams A. (1996). QALYs and ethics: a health economist's perspective. Soc Sci Med, 43 (12), 1795-1804.
Williams A. (1999). Calculating the Global Burden of Disease: Time for a strategic reappraisal? Health Economics, 8,
      1-8.
Williams A. (2000). Comments on the response by Murray and Lopez. Health Economics, 9, 83-86.
Willis KG & Powe NA. (1998). Contingent valuation and real economic commitments: A private good experiment.
      Journal of Environmental Planning and Management, 41(5), 611-619.
Wu G. (1996). The strengths and limitations of expected utility theory. Med Decis Making, 16, 9-10.

-------