Interagency Task Force on Environmental Cancer and Heart and Lung Disease
EPA/600/9-90/054
January 1991

Evaluation and Effective Risk Communication Workshop Proceedings

Edited by
Ann Fisher
Maria Pavlova
Vincent Covello

EPA/600/9-90/054
January 1991

EVALUATION AND EFFECTIVE RISK COMMUNICATION WORKSHOP PROCEEDINGS

Editors: Ann Fisher, Maria Pavlova, Vincent Covello

Interagency Task Force on Environmental Cancer and Heart and Lung Disease
Committee on Public Education and Communication

Environmental Protection Agency • National Cancer Institute
National Heart, Lung, and Blood Institute • National Institute for Occupational Safety and Health
National Institute of Environmental Health Sciences • National Center for Health Studies
Centers for Disease Control • Food and Drug Administration
Department of Energy • Consumer Product Safety Commission
Occupational Safety and Health Administration • Department of Agriculture
Department of Defense • Department of Veterans Affairs
Agency for Toxic Substances and Disease Registry • National Library of Medicine

Printed on Recycled Paper

NOTICE

The information in this document has been funded wholly or in part by the Federal Task Force on Environmental Cancer and Heart and Lung Disease. It does not necessarily reflect the views of the Task Force or its individual member agencies and no official endorsement should be inferred. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

TABLE OF CONTENTS

FOREWORD  vii

EVALUATION AND EFFECTIVE RISK COMMUNICATION: INTRODUCTION  xi
Vincent Covello, Ann Fisher, Elaine Bratic Arkin

COMMISSIONED PAPERS

RISK COMMUNICATION: ON THE ROAD TO MATURITY  3
Milton Russell

EVALUATION FOR RISK COMMUNICATORS  11
Elaine Bratic Arkin

THE TWELVE LAWS OF EVALUATION RESEARCH  25
Highlights from A Guide to Evaluation Research Theory and Practice
Peter H. Rossi

PRESENTATIONS

INTEGRATING EVALUATION INTO THE DEVELOPMENT AND DESIGN OF RISK COMMUNICATION PROGRAMS  33
June A. Flora

MARKETING RESEARCH AND RISK COMMUNICATION: Corporate and Public Sector Roles  41
William D. Novelli

EVALUATING RISK COMMUNICATION PROGRAMS: A Catalogue of "Quick and Easy" Feedback Methods  45
Mark Kline, Caron Chess, Peter M. Sandman

COMMENTARIES ON EVALUATION ISSUES

DEVELOPING THE MESSAGE
Selecting Appropriate Strategies  65
Mildred Zeldes Solomon
Tailoring the Message to the Audience  73
James W. Swinehart
Focusing on the Audience  83
Marilyn Rice

TRACKING PROGRESS
Issues to Consider for Evaluation Design  89
Judy Shaw, Jeanne Herb
Tracking the Health Objectives for the Nation  91
James A. Harrell
The Purpose of Tracking Progress  93
James L. Regens
Benefits to Conducting Midcourse Reviews  97
Max Lum

DECIDING ON THE EXTENT OF EVALUATION  99
Elaine Bratic Arkin

MATCHING YOUR NEEDS WITH AN EVALUATOR'S CAPABILITIES  103
James W. Swinehart, Shelagh Smith, Vicki S. Freimuth, Charles Darby

MEASURING ACCOMPLISHMENTS
Considerations for Planning Risk Communication  111
Robert W. Denniston
Four Factors in Designing Evaluation Strategies  115
David McCallum
Integrating Evaluation: A Seven-Step Process  119
William H. Desvousges

UNDERSTANDING OMB PROCEDURES
OMB Survey Clearance Procedures  127
Richard Eisinger
OMB Regulatory and Approval Requirements  129
Susan E. Dudley

USING EVALUATION CASE STUDIES
Introduction  135
Elaine Bratic Arkin
The National Cancer Institute  137
Shelagh Smith
New Jersey Department of Environmental Protection  141
Jeanne Herb, Judy Shaw, Henry L. Garie
CIBA-GEIGY Corporation, Toms River (NJ) Plant  147
Thomas A. Chizmadia
National Heart, Lung, and Blood Institute  151
John C. McGrath
New York City Health Department  159
Robert W. Denniston
Environmental Protection Agency  163
Ann Fisher
Maryland Department of the Environment  165
Nancy Zahedi, Carol Deck
U.S. Council for Energy Awareness  169
Ann S. Bisconti
Food and Drug Administration  171
Louis A. Morris
Cancer Prevention Awareness Program  173
Shelagh Smith
EPA Office of Toxic Substances  175
Maria Pavlova
National Cholesterol Education Program  177
John C. McGrath
EPA Superfund Program  181
Maria Pavlova
Cancer Information Service  183
Roswell Park Memorial Institute
National High Blood Pressure Education Program  187
John C. McGrath

CONCLUSION

DOES RISK COMMUNICATION MAKE A DIFFERENCE?  191
John F. Ahearne

WHAT ELSE DO YOU NEED TO KNOW ABOUT EVALUATION?  195
Roger E. Kasperson

APPENDIX

A GUIDE TO EVALUATION RESEARCH THEORY AND PRACTICE  201
Peter H. Rossi and Richard A. Berk

PARTICIPANTS  257

INDEX  273

FOREWORD

Many agencies and other organizations communicate with the public about risk. How can these agencies and organizations learn whether they are communicating effectively? Are their messages appropriate and clear to the intended audience? Are their messages reaching that audience? Is the audience understanding and internalizing the message? To explore questions like these, the Workshop on Evaluation and Effective Risk Communication brought together experts from academia, government agencies, and the private sector under the auspices of the federal Task Force on Environmental Cancer and Heart and Lung Disease and its subcommittee, the Interagency Group on Public Education and Communication. These proceedings of the Workshop provide an overview of the principles and methods of evaluation and of their application to risk communication programs.

The Task Force on Environmental Cancer and Heart and Lung Disease

The Task Force on Environmental Cancer and Heart and Lung Disease was established because Congress believed that federal environmental and health agencies, i.e., the Environmental Protection Agency (EPA) and the Public Health Service agencies, should be cooperating on a regular, formal basis. Thus, the impetus for the Task Force from the beginning has been communication: communication among federal agencies about what is being done to elucidate the relationships between environmental factors and human disease and to prevent or reduce the incidence of environmental disease.

The Task Force has sponsored various activities to examine specific issues related to environmental disease:

The effects of exposure to toxic substances, especially how they are metabolized and their mechanisms of toxicity
Exposure assessment
Non-oncogenic lung disease related to the environment
Women's occupational health
Air pollutants and respiratory cancer
Environmental toxicity and the aging process
Health professionals' awareness of environmental diseases

The Task Force's current activities focus on how pesticides affect human health and on environmental and occupational asthma. Task Force reports on these activities have been widely disseminated among the scientific community, within Task Force member agencies, and to Congress.
Task Force recommendations are used by federal agencies in planning research and establishing regulatory priorities and by Congress in drafting legislation. But, while the Task Force has promoted communication between federal agencies and with the outside scientific community, it has not had great success in communicating with the public. In fact, government in general, although it has initiated research and formulated policies that respond to public concern about the environment, is often deficient when it comes to communicating with the public.

Interagency Group on Public Education and Communication

Recognizing this deficiency, the Task Force sponsored a Workshop on the Role of Government in Risk Communication and Public Education in January 1987. This Workshop recommended the establishment of an Interagency Group on Public Education and Communication, under the auspices of the Task Force, to enhance collaboration on public education and risk communication efforts. The Interagency Group includes representatives from all fifteen Task Force agencies as well as from other government agencies and from private organizations. Some of the members are scientists, some educators, some policymakers, but all have an interest in communicating with the public. The missions of Task Force member agencies, either formally prescribed in legislation or implicitly derived from evolving programs, require that the public receive information and be able to participate in decisions that affect overall health and welfare, as well as to make personal decisions concerning risk.

The Interagency Group first concentrated on identifying federal risk communication programs that already exist and defining the role that public and private groups could play in this area. It found that many federal agencies do have risk communication and public education programs of one kind or another. These exist for a variety of reasons: the requirements of new legislation, such as that requiring public disclosure of information on release of toxic substances; the increasing interest in disease prevention; and public demand for information on health and environmental risks. Having identified these programs, the Interagency Group realized that many agencies knew little about the actual effectiveness of their risk communication efforts. The Interagency Group also saw the need to share information about their risk communication activities and to move increasingly toward collaborative efforts.

The Workshop on Evaluation and Effective Risk Communication was one step in that direction. It was designed to share information about what had worked—and what had not worked—when communicating about risk to allow agencies to avoid pitfalls that already have been encountered by others, and to avoid costly re-invention of approaches already proven to be effective.

This record of the Workshop includes papers given in presentations, panels, and individual working sessions. Read the Introduction to gain an overview of the meeting's issues and conclusions and the commissioned papers for more detailed presentations of current knowledge in the field of risk communication and its evaluation. Subsequent sections address specific aspects of evaluation and provide case studies. In cases where sessions addressed similar topics, individual authors' papers are grouped together under one title. The appendix includes a background paper developed for the Workshop, A Guide to Evaluation Research Theory and Practice, by Peter Rossi and Richard Berk.
A list of Workshop participants is also appended.

Throughout this volume, readers will encounter the term "research" applied to evaluation. In this context, the term refers to investigation in some systematic way; it is not meant to imply that evaluation need be expensive, time consuming, or sophisticated. In reality, the importance of each evaluation determines the level of resources and sophistication needed.

The Workshop required many months of planning and hard work on the part of a few dedicated individuals. The Task Force extends its sincere gratitude to Co-Chairpersons Vincent Covello, Ann Fisher, and Rose Mary Romano and to Frederick Allen, Elaine Bratic Arkin, Jean French, and David McCallum, who were also instrumental in planning the Workshop. In addition, the Task Force appreciates the special contributions of the Environmental Protection Agency, the Agency for Toxic Substances and Disease Registry, the Food and Drug Administration, and the National Cancer Institute. We hope that the insights and ideas that were shared in these two days will be of use to communicators in many agencies as they plan, and plan to evaluate, health and environmental communication programs.

Maria Pavlova, M.D., Ph.D.
Chairperson
Interagency Group on Public Education and Communication

EVALUATION AND EFFECTIVE RISK COMMUNICATION: INTRODUCTION

Vincent Covello, Ann Fisher, and Elaine Bratic Arkin

The papers in this volume review and summarize much of what is known about evaluating risk communication activities. The papers were presented at the Workshop on Evaluation and Effective Risk Communication held in Washington, D.C., in June 1988. The purpose of the Workshop was to bring together experts from academia, government agencies, and the private sector to review the current state of knowledge in evaluation research and the ways this knowledge can be applied to risk communication. Specific objectives were to:

Improve understanding of evaluation problems and tasks;
Survey principles and methods of evaluation relevant to risk communication;
Illustrate the practice of evaluation through examples;
Provide guidance for organizations engaged in planning and coordinating the evaluation of risk communication;
Derive recommendations for improving risk communication; and
Identify future needs.

Definitions

For purposes of the Workshop, risk communication was defined as any purposeful exchange of information about health or environmental risks between interested parties. More specifically, risk communication was defined as the act of exchanging information about levels of health or environmental risks; about the significance or meaning of health or environmental risks; about the data and methods used in deriving estimates of risk; or about decisions, actions, or policies aimed at managing or controlling health or environmental risks.

Evaluation, in the context of risk communication, was defined as any purposeful effort to determine the effectiveness of risk communication programs. Evaluation, according to this definition, encompasses a wide range of activities, from diagnosing risk communication problems to measuring and analyzing program effects and outcomes.

Why Evaluate?

One fundamental question dominated initial workshop discussions: Why is it important to evaluate risk communication programs?
In response to this question, participants agreed that evaluation is critical to effective risk communication; without evaluation, there is no way to determine whether risk communication activities are achieving (or have achieved) their objectives. Evaluation should be an integral part of the risk communication process. When carried out at each stage of program development, evaluation provides information that is critical to program effectiveness. For example, it provides essential planning information, it provides program direction, and it can help demonstrate program accomplishments. Most fundamentally, evaluation can signal the need for timely modifications.

When viewed in this way, evaluation has much to offer organizations that have risk communication responsibilities. During the planning and pre-production phase, evaluation can provide data critical to effective program design, including information about health, environment, and lifestyle needs and concerns, information about risk management needs and concerns, and information about how to meet those needs and concerns. Through surveys, questionnaires, focus groups, and other research tools, evaluation can be used to identify stakeholders and other relevant audiences, to assess audience opinion or reaction, to find out what people see as important problems, to find out what issues and events people are aware of, and to find out how people react to different sources of information. Pretesting and pilot testing can be used to forecast the effectiveness and feasibility of alternative risk communication activities, to determine the kinds of information needed by target audiences to understand risk communication material, to examine how people process and interpret risk communication information, and to obtain feedback on draft materials. Estimates of the effectiveness of alternative risk communication activities can be combined with information about their costs to show which risk communication strategy will be most cost-effective.

Once the risk communication program is operational, evaluation can be used to address questions of accountability and performance. For example, evaluation studies can determine whether the risk communication program is reaching the intended audience, provide feedback on the performance of risk communicators, identify program strengths, suggest ways these strengths can be used to communicate more effectively, and determine whether the program is being implemented appropriately (for example, what material was produced, how much was produced, how long it took, what it cost, and what audiences received the material).

Once the risk communication program has been implemented, evaluation can provide information on program impact and outcome. For example, evaluation can determine what members of the audience actually received, what they learned, and whether change occurred in the way they feel, think, or behave. The results can be used to answer the most important question: Did the program achieve its goals?

One major reason for evaluating risk communication activities is the general lack of resources for development of comprehensive risk communication strategies and programs. Few organizations have the resources needed to launch state-of-the-art risk communication programs that address multiple audiences through multiple channels. As a result, managers need to be able to choose messages and channels that use their limited resources most effectively.
Problems and Difficulties

These advantages raise a second question: If evaluation is so valuable, why are so few risk communication activities formally evaluated? The answer to this question appears to lie in a variety of problems and difficulties that affect the conduct of evaluation. These include problems and difficulties stemming from conflicts and disagreements about values, goals, resources, and usefulness. Each is briefly discussed below.

Values. Many difficulties in evaluation arise from its nature as a normative, value-laden undertaking that carries important policy, ethical, and practical implications. The value-laden nature of evaluation derives in part from the many stakeholders interested in the conduct and effectiveness of any given risk communication activity or program. These include government agencies, corporations and industry groups, unions, the media, scientists, professional organizations, public interest groups, and individual citizens. Each of these groups has varying and often conflicting needs, interests, and perspectives.

Evaluators are often asked to respond to the needs and concerns of each of these constituencies. However, different audiences have different goals; different audiences need different types of information; and different risk communication activities require different types of evaluation studies. As a result, an initial difficulty in any evaluation study is determining the perspective from which the evaluation will be conducted. Once a perspective has been chosen, several reporting implications follow, including the evaluator's responsibility to be explicit about the chosen perspective and to acknowledge the existence of other perspectives. Several practical implications also follow, including limits on the relevance and role that evaluation can play in affecting risk communication programs, and an increased likelihood that evaluation results will be criticized, even by the sponsors of the evaluation.

Goals. A second problem affecting evaluation is the difficulty in identifying goals for risk communication. What goals are appropriate? For example, should the primary goal of risk communication be to help people become aware of an issue, make more informed decisions, take action, seek information, seek help, protect themselves, change their behavior, or participate more effectively in the decisionmaking process? For some, the goal of risk communication is narrowly defined as personal or organizational survival and damage control; for others, it is to overcome opposition to decisions; for still others, it is to achieve informed consent, enhanced public participation, constructive dialogue, and citizen empowerment.

Meaningful evaluation is possible only when the program's goals, intended audience, and expected effects can be specified clearly. However, for many risk communication programs, such specification is extremely difficult and sometimes impossible. In many cases, evaluators and those who commission the evaluation are not able to agree on what the goals of the risk communication program should be, let alone which goals should be assessed or what kinds of success should be measured (e.g., through measures of knowledge, attitudes, and perceptions; measures of message awareness, comprehension, and acceptance; measures of information demand; or measures of behavioral intentions or actual behavior).

One practical requirement for evaluation is thinking through communication goals at the beginning.
Program and evaluation activities should be based on a set of clear risk communication goals. Even the most basic risk communication activity, such as responding to a telephone inquiry from a concerned citizen, should have a specific goal. Without clear communication goals—be they informational, organizational, legally mandated, or process goals—it is impossible to know whether the interaction and exchange have been successful. Once risk communication goals have been determined, they should occupy a key role in the planning and implementation process. At each stage of the program, activities should be evaluated in light of these goals. If warranted, program goals should be reviewed and changed as the program develops.

Resources. Effective risk communication requires a determined effort to ascertain whether the program is working as intended. Ideally, this should be done while there is still time to change direction. Feedback is essential to ensure that the communication effort is achieving its goals; if done early enough, it can save time by identifying places where mid-course corrections may be effective.

In practice, however, evaluation is often neglected in favor of more urgent tasks—especially if evaluation has not been planned and budgeted in advance. In most cases, the amount of money spent on evaluation represents an extremely small percentage of the total amount spent on the risk communication effort.

There are several reasons for the reluctance of managers to evaluate. One reason is that many program managers believe that evaluation is prohibitively expensive and that only a few organizations have the resources and skills to carry out evaluation. Another reason is the tendency for program managers to exhaust all available resources producing and distributing more risk communication materials (in the hope of increasing effectiveness by reaching more people), rather than to conduct evaluation studies that ask whether the message has reached the target audience and whether the target audience has received and internalized the message.

There also is an understandable reluctance on the part of many program managers to support research that has the potential for showing that the time, resources, and effort they have invested in a risk communication activity or program have not produced the desired results. Program managers may not want to be told that their programs have shortcomings, because this may have implications for career advancement, for intra-organizational decisions about the allocation of resources, and for program survival. Whenever an evaluation is conducted, there is a chance that it will reveal (serious) shortcomings. Thus, not evaluating avoids the potential for evidence of failure. On the other hand, if a program manager is convinced that evaluation can demonstrate success, according to what he judges to be appropriate measures, then evaluation may be viewed very differently; it becomes a tool to justify promotions, bonuses, or increases in financial resources and staff.

Another factor that may affect the decision to evaluate is the limited success of previous risk communication programs aimed at changing risk-related attitudes and behaviors. These planned risk communication activities make up only a small share of the many factors that impinge on people's perceptions and behavior.
Most evaluation studies conducted to date suggest that even when the message is clearly communicated and appears to be in the audience's best interest, the goals and expectations for such programs should be realistic. For example, a successful risk communication program might change the behavior of only a small percentage of the population. Agencies that have a public health mandate may view a small percentage change as insignificant even if the number of individuals affected is large. However, from the perspective of competing for attention and recognizing the complexities of behavioral change, risk communication endeavors should be compared with marketing efforts. For example, a marketing effort that produced an increase of a few percentage points in market share would be judged a big success. Beyond this lack of understanding of what level of impact should be considered a success, program managers may prefer formative and process evaluation over outcome and impact evaluation because the former affords opportunities to make changes in response to findings.

All of these factors suggest that increased attention needs to be given to understanding organizational and other barriers to evaluating risk communication activities. Equally important is the need to develop strategies to overcome these barriers.

First among these strategies is planning risk communication efforts early in the program planning stage so that evaluation activities can be integrated into the effort from the beginning. Evaluation is less likely to be resisted when it is built into each stage of the risk communication process, when adequate resources are available for evaluation, and when changes implied by evaluative data can be made. Evaluation also is less likely to be resisted when funds for evaluation have been set aside and built into the risk communication budget in the beginning.

Second, greater attention needs to be given to the use of informal, quick, and simple evaluation methods, many of which can produce extremely valuable planning and program information. When more rigorous, systematic evaluations are required, these ideally should be carried out by parties other than those who control and conduct the risk communication activity or program.

Third, greater attention needs to be given to developing incentives for program managers to fund evaluations for the purpose of better understanding which risk communication activities are most effective, not solely for justifying what has been done.

Fourth, program managers should be encouraged to develop well-articulated evaluation plans with clear goals and clear explanations of what the evaluation is designed to achieve.

Finally, program managers should be encouraged to document and share risk communication successes, including cases in which community feedback was solicited and used to enhance the risk communication activity or program.

Usefulness. A common criticism of many evaluations is that the results are seldom used. Implicit in this criticism is the notion that use means direct and immediate changes in risk communication policies or programs. However, there are several different types of use, not all of them immediately apparent. For example, results may be used to confirm that changes in the risk communication program are not needed. In some cases, evaluation may indicate directions for risk communication that are inappropriate or not feasible.
Even when there is no immediate discernible use of the information derived from an evaluation, results may accumulate over time and be absorbed slowly, eventually leading to changes in risk communication concepts, perspectives, and programs.

In assessing the usefulness of evaluation research, an important consideration is that the forces and events impinging on risk communication programs are often more powerful than the results derived from evaluation studies. The environment in which risk communication programs are developed seldom permits swift and unilateral changes; new information may actually slow down the change process, because it may make decisions more complicated.

Summary Recommendations

Several recommendations can be derived from these observations and from those found in the papers in this volume. The recommendations are divided into those for the short term and those for the long term. Consistent with the goals of the Workshop, most of these recommendations are oriented toward policymakers in public sector agencies that have risk communication responsibilities. However, the recommendations apply equally well to risk communication efforts in private sector organizations, such as public interest groups and industrial corporations.

Short-term Recommendations

1. Agencies and organizations should be encouraged to use evaluation methods that are appropriate to the scale and importance of the risk communication effort. Small-scale efforts may require only quick and easy evaluation methods. In contrast, more resource-intensive, statistically reliable methods may be appropriate for large-scale efforts.

2. Agencies and organizations should be encouraged to integrate evaluation strategies and results into program planning and decisionmaking: evaluation should become a routine part of risk communication practice.

3. Mechanisms are needed to permit agencies to share evaluation methods and the results of evaluation research.

4. Agencies and organizations should develop guidelines to help managers choose the most suitable evaluation methods. Workshops or other training mechanisms are needed to build the skills required to design and implement evaluation strategies.

5. Agencies and organizations should be encouraged to evaluate risk communication programs so that mid-course corrections can be made and program impact can be assessed.

Long-term Recommendations

1. Agencies and organizations should support research aimed at measuring the effectiveness of risk communication activities as well as the cost-effectiveness of alternative approaches. Examples of research questions that need to be answered are: How can we evaluate the role of risk communication in changing behavior? Are risk management decisions better made as a result of more effective risk communication? Is it more cost-effective to extend the time period for existing risk communication activities, to intensify their use in the originally scheduled time period, or to combine multiple risk communication activities?

2. Agencies and organizations should sponsor forums for public and expert debate on issues related to the appropriateness of using different kinds of motivational and persuasive messages within risk communication programs. For example, what guidelines are needed on ethical issues related to using different types of motivational and persuasive messages to help foster a more informed public?
3. Agencies and organizations should support development of guidebooks and manuals for practitioners on how to apply evaluation techniques. Guidebooks and manuals should include information on how to tailor an evaluation program to the scope and importance of a risk communication activity, as well as how to recognize the limitations of alternative methods. Guidebooks and manuals should also include case studies demonstrating the value and importance of evaluation research in risk communication.

COMMISSIONED PAPERS

Risk Communication: On the Road to Maturity

Milton Russell

The focus on evaluation of risk communication efforts is striking evidence of the growing maturity and self-awareness of this field. The sponsors and practitioners of risk communication are beginning to take evaluation seriously. It is an extraordinary thing for practitioners to ask themselves and others such questions as:

• What is success, and have we measured up?
• What can we do better?
• Is there any evidence that some of our efforts are failures?

The following issues are important in evaluating risk communication programs:

• Opportunities for improving public health lie in changing lifestyles, and risk communication is necessary to achieve those improvements.
• A prerequisite for the evaluation process is to be clear about what risk communication goal is appropriate under what circumstances; i.e., is the goal to inform or to change behavior?
• Risk communication that is designed to change individual behavior imposes some serious value conflicts, specifically between the duties of the state and the rights of the individual; however, there are principled bases for resolving these conflicts.
• Risk communicators must develop the professional skills to perform as well as the confidence to insist on effective evaluation in order to make risk communication a useful tool for improving public health.

Both risk communication and the protection of public health have come a long way in this country. Risk communication was born when professional risk managers, whether they were environmental protectors, public health officials, physicians, or safety trainers, realized that the post-Vietnam generation would not respond automatically when told to jump. This generation wanted to know why and on whose authority. Risk communication began to grow up when there were requirements for public hearings and when aggressive special interest groups and an activist press started demanding that decisions be explained, so that those who made them could be judged. Risk communication made a leap to its present state of near-maturity when the "decide and announce" model, which had alienated a generation whose credo had become participatory democracy, started to be replaced with interactive consultation with the people whose interests were affected. On the threshold of adulthood, risk communication is now attempting to empower ordinary citizens to take control of the risk situations encountered in everyday life.

Risk communicators are still far from successful in informing people about the consequences of alternatives in collective decisions. Further, there are many examples of mismatches between what people believe and what experience has shown and science confidently asserts about risks.
Finally, some of those in positions of authority remain stuck in an earlier time warp where "doing good on the behalf of others" was sufficient.

Risk communication will not receive the respect or support that it deserves until its practitioners subject themselves to a rigorous standard of performance. This performance must be demonstrated in terms of the usefulness of the messages actually received by the intended audience. Therefore, developing and using evaluation techniques is of preeminent importance at this stage in risk communication.

Three Historical Stages in Public Health Protection

In terms of protecting public health, simplistic categorization suggests that public health protection has experienced two stages and is embarking on the third. The first stage was the province of the engineer, and victories were gained against infectious diseases and premature death by cleaning up the drinking water, disposing safely of garbage and sewage, and ridding the country of insect-carried scourges such as yellow fever. The reduction of conventional air pollutants that caused respiratory distress was accomplished as well. One also could place enforced inoculation programs such as those for smallpox and polio in the same category. These actions were carried out by government acting as an agent for its citizens.

The second stage came when physicians developed both the skills and the tools to intervene in the course of disease, rather than simply diagnose it. Insulin was one such breakthrough, and developments leading to safer childbirth were another. The advent of antibiotics also made it possible for physicians to do more for patients than offer them comfort and prognosis. In addition, psychoactive drugs, modern surgical techniques, and crisis intervention tools lengthen and improve the quality of life of many individuals. In this stage, physicians act as agents for their patients.

These two stages are now in a maintenance and marginal improvement phase in terms of public health gains. For example, the purity of drinking water cannot be allowed to deteriorate, but little improvement to overall health is likely to occur if it becomes cleaner. Improvements in the environment can continue as government acts for its citizens, but the large opportunities for improving public health now appear to be elsewhere.

In this third stage of risk communication, major improvements will come not from government in collective actions, nor from physicians treating individual patients, but from ordinary citizens acting on their own behalf as they carry on their daily lives. The skills of the physician are no match for lung cancer and heart disease caused by smoking. There is little that government regulation of industry can do to improve indoor air quality when radon seeps from the basement and chemical fumes leak from under the kitchen sink. Moreover, a lifetime of poor diet and inadequate exercise cannot be reversed by a pill; there is no way to reverse the course of AIDS once the virus strikes; and only so much can be done at the water plant to prevent lead from leaching into the drinking water from household pipes. Finally, hospitals have limits on what they can do for alcohol- or drug-damaged infants or for patients with cirrhosis or pancreatic cancer.

Ethical Issues

However, all of these health risks can be prevented by individual behavior. All that is required is having the appropriate information and choosing to act on it.
This is where risk communication can play a central role. It is a means by which society fulfills its obligation to protect public health by empowering individuals to make informed decisions about the hazards within their individual control. This philosophy implies that success is measured by whether individual decisions are informed, not by what the decisions are or by how much risk they may impose. Yet moving away from the informed decision standard can conflict with values regarding the rights of individuals and the limits of the state.

One source of difficulty with the informed decision standard is that the harm rarely is limited to the one making the decision. Self-inflicted illness is an economic burden to all of us, since we share in its cost in medical insurance and taxes as well as other losses that society bears when its members are not as healthy and productive as they might be. The risks are often imposed on others, such as innocent persons hurt by drunk drivers and children afflicted by the insults their development suffered as fetuses. In addition, lung cancer not only shortens the smoker's life, but also lessens the quality of life for anguished family members and friends when the victim suffers and dies. These external effects of individual behavior mean that others inevitably have a stake in whether the behavior is changed.

External effects also reach across time. What responsibility does the teenage boy who starts on a course of addictive smoking have to the middle-aged man he will become? That man may find himself not only with a wife and children but also with heart disease. Or what responsibility does the teenage girl with a fast-food diet deficient in calcium have to the osteoporosis-ridden grandmother she will become forty years later? These "others" also have a stake in whether harmful behavior is changed.

Beyond external effects, another difficulty is that we think we know what is best for someone else. This argument is simple: science has shown that smoking shortens life spans. Longer, healthier lives are better than shorter, sicker ones. Therefore, people should not smoke, and if they won't quit, even when fully informed, we should make them.

However, dictating individual decisions rather than assuring that they are informed has two problems: the practical and the philosophical. One practical consideration is that the ability to control individual behavior is limited. Would one monitor diet and exercise and enforce healthy habits? What about private, consensual sexual practices? The strict enforcement of laws may have reduced drunk driving but has failed so far to halt the spread of drug use.

Beyond these practical matters are philosophical questions. Where should the line be drawn between the power of the state and the rights of the individual? Each of us would probably draw it in a different place, but there are few who would draw no line at all. This matter has embroiled political philosophers and ethicists for millennia, and it is not likely to be resolved soon. Yet it is a central issue in modifying individual behavior, and its implications need to be clarified for an evolving set of risk communication ethics and guidelines.

Those with access to information about changes in individual behavior that may improve health have an implicit social duty to make it available to those who can use it.
Government agencies and researchers supported by public funds bear the burden of making an effort to disseminate the results of research as widely as possible so that it can be used most effectively. Communicating about risk reduction opportunities in ways that will inform effectively is therefore part of the social contract.

However, effective communication about individual risks absorbs resources that could be used elsewhere. No researcher ever has enough money or time, and diverting resources from research is asking some individuals to go against both incentives and personal proclivities. One answer may be to build a communication requirement into the support for the research, so that this element of the social contract is clear and enforceable. In addition, researchers are not usually trained or equipped to inform those who could use the information. Risk communication specialists could take on this task, which is integral to the research, not ancillary to it.

Another important fact is that little effort is placed on evaluating the effectiveness of messages about individual risk reduction. With few exceptions, public health professionals have failed to get messages about individual risk reduction behavior across to the public. Communication about major issues may be adequate for the reasonably intelligent, educated, medically alert portion of the public that reads newspapers and magazines, watches the news on television, has regular medical and dental check-ups, and attends PTA meetings. However, this is only part of the public. When numerous surveys indicate that many teenagers are uninformed about sex, what is the likelihood that they know enough to make informed decisions about the health risks of smoking, the effects of alcohol or drugs during pregnancy, or the long-term dangers of obesity or inadequate diet? It is a tremendous challenge to reach those persons who fall outside the usual information network, and it will not happen by putting more effort into the same techniques. Good evaluation procedures are likely to demonstrate that new communication strategies are needed.

Risk Communication Guidelines

The first set of risk communication guidelines should address the responsibilities of scientists and health professionals to inform the public, evaluate whether the messages are being received, and develop alternate tools when necessary. However, sometimes information alone is not enough to change behavior, and there may be good reasons to go further.

While different communication techniques actually form a continuum, rough distinctions are possible. Closest to informing is the use of persuasion. Persuasion goes beyond supplying the facts to conveying the information in ways designed to encourage the individual, through reason, to make the behavior change desired. In contrast, manipulation bypasses reason to work on the emotions. By presenting material in forms that tap unrelated emotions, behavior that would resist appeals to reason can be changed. Dr. Koop with all his charts and medical authority is a marvelous spokesman against smoking, but who among us would choose his recent press conference over a manipulative Michael Jackson video for changing teenage behavior? At the other end of the continuum is deception—lying, presenting partial truths, or omitting clearly relevant facts. Deception is the antithesis of communication because it rejects the values of the recipients and seeks to change actions by coercive means.
A free society depends on trust, especially between government officials and the public they serve. Deception, even with the best of motives, erodes trust at its core. Finally, deception rarely works, and when the deception comes to light, lost credibility is difficult to regain.

Further guidelines for the use of risk communication to protect public health are as follows. First, deception cannot be tolerated. Second, efforts to inform and even persuade those who are reachable by the usual form of messages represent a powerful tool to reduce public health risks and offer no cause for objection. Third, there is a difference between using manipulative devices to get a message across and using manipulation to change behavior. If today's youth tune out Dr. Koop and tune in Michael Jackson, having the latter try to persuade teenagers to protect themselves also seems unobjectionable.

The difficult choices start at the next level of communication with the use of manipulation to change behavior. When external effects are sufficiently large and where direct intervention is practical, such as with drunk driving, society does not hesitate to employ coercive sanctions. It may be acceptable to attempt to change risky behavior, which society would otherwise constrain, by manipulation as long as appropriate safeguards are in place. However, establishing those safeguards is not easy, nor is deciding where to draw the line with regard to the degree of external effect. Unlike cases in which coercive actions are taken and due process is clear, guidelines for acceptable manipulative behavior are difficult to define and enforce. By definition, manipulative risk communication is subtle. Moreover, the "watchers" in government are often those doing the manipulating in the first place, and they often believe that they have a high moral purpose. In these circumstances, the appeal of elitism and the belief that those in positions of authority know best, even about choices that informed individuals clearly are competent to make, is strong.

The major safeguard against the erosion of individual choice is to demand greater degrees of political legitimacy as government moves beyond simply informing its citizens about risk. This legitimacy can range from the responsibility implied by the existence of an agency, to clear statements of executive intent that Congress does not see fit to reject, to executive orders, to explicit legislation relating to members of the executive branch who attempt to bring about certain behavior. Thus, both expressed and implied legitimacy may be the best vehicle for justifying actions that are designed to change behavior in ways not expressly commanded by law.

However, this conclusion leaves an opening for those in government to manipulate and control those whom they are supposed to serve. When such power is wrongly used, it violates the basic premise of the consent of the governed. Therefore, the test of legitimacy should be a stringent one that is supported by the professionalism of those involved in the process as well as by the vigilance of those subjected to it. More professionalism among risk communication practitioners offers one avenue for developing guidelines and codes of ethics. Greater attention by the political system to the means as well as to the goals of public policy may offer another restraint. Also, the pluralism of this society should not be underestimated.
Those who value individual rights and restraints on the power of the state have ready access to publicity, political power, and the courts to hold egregious violations at bay.

As the behavior in question moves toward more truly individual impacts, the justification for manipulative intervention shrinks and then disappears. Other values take precedence in our society, and fully informed adults must have the right to make their own decisions. In the abstract, most individuals would agree that this value is an important building block for a free society, although they might abhor its practical implications in particular situations.

Summary

Risk communication is not merely a means of supporting the protection of public health. It has an important role of its own. When it is individual behavior that causes risks, which is ever more the case, then it must be individual changes in behavior that reduce them. Risk communication, aggressively and effectively pursued, can raise the quality of public health at this stage of our history in the same way that clean drinking water did at the turn of the century or antibiotics did forty years ago. However, it is critical that other values are not sacrificed in the process; therefore, much attention needs to be paid to how and for what purpose communication techniques are used.

Consider for a moment how drugs are tested for safety and efficacy. Cautious steps are taken leading from testing in animals, to rigorously monitored human trials, to larger double-blind tests on a few individuals, to broader trials, and only then to general availability. Even then, there are carefully articulated contra-indications and injunctions about side effects. At each step, careful protocols are followed so that the processes can be replicated and the results judged by peers. Also, consider how scientists and physicians are trained: their performance and judgment are monitored by experienced professionals who can intervene if necessary.

Then consider the usual risk communication effort by a government agency: haphazard would often be an apt description of the quality of the process. This is true, even when the message involves a public health hazard that may affect millions of people or where appropriate behavior change has the potential for saving thousands of lives.

In comparison with the money and effort spent on research, risk communication is frequently treated as an afterthought or as a sideline or diversion. As noted earlier, this is because the process is considered by many scientists to be beneath their attention or somehow suspect. Informing the public is not considered a professional scientific activity; also it is hard work and absorbs time and resources. Decisionmakers tend to lose interest in issues that have been resolved and turn to the next item on their agenda. Alternatively, decisionmakers may consider communicating with the public about scientific issues a simple process that their political experience equips them to do with no special help. There are many exceptions, but those who have the formal responsibility for dealing with the public on risk matters are often recruited from other fields, are unfamiliar with the science they are communicating, or have little expertise, training, or incentive for doing this part of their job well.
In short, this society has all sorts of controls and safeguards, tests for safety and efficacy, and professional standards and codes of ethics regarding who can do what to an individual's brain, but it pays scant attention to what goes into the collective mind. At a time when known changes in individual behavior could bring about the first significant improvements in the quality and length of life since antibiotics, the tools to communicate this information are rudimentary, the research is poorly supported, and many of the front line troops lack training and the support of those who send them into the field.

There is little reason to believe that this situation will change as matters now stand. If public officials are judged on the decisions they make rather than on the effectiveness of their messages, why should they devote a great deal of effort to informing the public? If applause comes simply because Dr. Koop appears on television, where is the incentive to develop community-based peer groups to persuade teenagers not to start smoking? If scientists working on public health issues are judged exclusively by the papers they publish, where is the incentive for them to transform that research into information on which people can act? There will be no reason to take the risk communication process seriously until evaluations are made about whether people are truly empowered to make important choices about the way they lead their lives or about the collective decisions that others are making for them.

In summary, it is the risk communication professionals who have the largest stake in both facilitating and demanding evaluation of their efforts. They have both the professional responsibility and the personal incentive to determine what has been successful and what further efforts will be required over time to fulfill the promise of risk communication as a major element in improving public health.

Evaluation for Risk Communicators

Elaine Bratic Arkin

"In the broadest sense, evaluations are concerned with whether or not a program or policy is achieving its goals and objectives." (Rossi and Berk, 1988)

While it is true that there are specialists who use sophisticated techniques to perform program evaluation, it is also true that evaluation is a natural process. We all assess actions by consciously or unconsciously reviewing the available facts, considering them in the light of the original intent, and drawing a conclusion. For example, you might find that the news media rarely report your agency's news as you think they should. A close look at the situation—the content of your news releases, how and when they are released, and the reactions of the reporters receiving them—might identify and help solve the problem.

The purpose of any evaluation is to learn from actions so that improvements can be made. Everyone reaches conclusions about the relative success or failure of programs and activities. Formal evaluation helps assure that those conclusions are based on objective data. Formal evaluation takes the natural process and makes it a conscious, orderly effort by using objective techniques for gathering and analyzing data and reaching conclusions. The purposes of evaluation are to improve current and future efforts, certify the degree of change that has occurred, and identify programs, or elements of programs, that are not working. Evaluation is one of many tools available to help risk communication professionals and other decisionmakers do their jobs well.
However, it is important to recognize that there are many kinds of evaluation, from the very informal and simple to the very formal and complex.

Evaluation should not be tacked onto the end of a program. Assessment and careful planning are interdependent, integral functions of program development and implementation. Just as each step of a program contributes to its effect, each step can be subjected to evaluation. Even before program development begins, evaluative discipline demands that the desired program outcome be described as specifically as possible. Once set, these goals and objectives direct how each aspect of the program will be developed.

It is important to note that evaluation is not a substitute for sound judgment, creativity, or decisionmaking. Once evaluation results are available, they must be interpreted and a determination made about how and to what extent they will be used.

Types of Evaluation

This chapter describes four basic types of evaluation. Some of these concepts and definitions conform to standard textbook terminology; others do not. These types of evaluation are designed to predict results of a program, measure results of a program, or help determine why certain results occur. Examining why specific results occur helps determine which strategies or tasks work well and provides direction for improving a program's functioning. Although there are many barriers to undertaking formal evaluation projects, it is important to consider using evaluation tools to assess work performed. The four types of evaluation discussed are: formative, process, outcome, and impact.

Formative Evaluation. Formative evaluation consists of determining the strengths and weaknesses of messages, materials, or program strategies before full production, distribution, or implementation. It permits revisions before the full effort goes forward and before the communications strategy is fully developed. Its basic purpose is to maximize a program's chance for success. Formative evaluation does not guarantee that a program will have a certain effect. However, it does minimize the possibility that a program will fail due to developmental flaws, such as a confusing message, inappropriate strategy, or ineffective educational materials.

Examples of evaluation strategies that are used during the planning and developmental stages of a program include needs assessments, pretesting, and field testing. A needs assessment may be undertaken to reveal the habits, needs, resources, and interests of the target audience, the community, or both. This kind of study takes the problem to be addressed and relates it to the existing situation, providing a basis for designing risk communication and other strategies that will positively affect the problem.

Pretesting ideas (concepts) helps ensure that messages or draft materials will have the intended effect and answers questions about whether they are understandable, relevant, attractive, attention-getting, credible, and acceptable to the target audience. These factors can determine whether messages and materials work with a particular (or target) audience. Pretesting should not be used to determine whether the message is accurate and complete—this requires expertise and professional judgment. Instead, pretesting assures that the target audience will interpret and accept the information as it was intended.
Most pretesting involves a few persons chosen as representatives of the target audience, and they do not constitute a statistically valid sample in number or selection method. Pretesting is generally considered qualitative research—research that can be interpreted somewhat loosely to provide clues about audience reactions, acceptance, and direction regarding materials production and use. This kind of informal evaluation is fast and affordable; therefore, it is easier to fit into risk communication program budgets and schedules.

There is no prescribed methodology for pretesting. Rather, a technique is chosen to fit each pretesting requirement according to the objectives of and available resources for ------- Evaluation for Risk Communicators 13 each project. The most frequently used methods include self-administered questionnaires, central location intercept interviews, focus group interviews, theater testing, and readability testing. Table 1 indicates which of these techniques is best suited to pretest specific risk communication products.

Table 1. Applicability of Pretesting Methods
[Matrix relating pretesting methods to products. Methods (grouped in the original as nonparticipatory, qualitative, and qualitative or quantitative): readability tests, self-administered questionnaires, focus groups, individual interviews, central location intercept interviews, mail questionnaires, and theater tests. Products: concept development, poster, pamphlet, booklet, notification letter, storyboard, radio PSA, television PSA, and videotape.]

------- 14 Evaluation for Risk Communicators

Field testing (pilot testing) focuses on the strategies for communicating risks, rather than the messages. For programs that will be implemented on a large scale or over a long period of time, or have a potentially vital impact, field testing can help assure that the message dissemination and other program activities will work by testing them on a smaller scale (e.g., within a limited geographic area) before full program implementation resources are committed. Also, a field test can offer a smaller, more controllable setting for conducting outcome evaluation.

Examples of information that might come from a formative evaluation include: comprehension and understanding of the message by the target audience; appeal or relevance of materials to a particular audience; and feasibility of a mode of distribution for reaching the target audience.

Process Evaluation. Process evaluation examines the procedures used to implement an activity. This type of evaluation monitors the administrative and organizational aspects of a program in progress, providing information about whether activities are on track; which strategies are most successful; which aspects of the program need more attention, alteration, or elimination; whether time schedules are being met; and whether resource expenditures are acceptable. Tracking the number of materials distributed, meetings attended, articles printed, or inquiries received will determine how the program is operating and whether the target audience is responding. These measures explain how a program works, but not whether it is having the intended effect. Although the effect or outcome is the reason for a program's existence, it is also important to document what is happening, which elements are working, and what needs to be changed or improved during the implementation period of a program to maximize its chances of success.
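The kinds of counts mentioned above (materials distributed, meetings attended, articles printed, inquiries received) can be tracked with very modest tooling. The following is a minimal sketch only; the activity names and figures are hypothetical and not drawn from any program described in these proceedings.

    from collections import Counter

    # Hypothetical process-evaluation log: one entry per tracked activity.
    events = [
        {"activity": "materials_distributed", "count": 250, "audience": "parents"},
        {"activity": "materials_distributed", "count": 400, "audience": "clinics"},
        {"activity": "inquiries_received", "count": 12, "audience": "general public"},
        {"activity": "articles_printed", "count": 3, "audience": "general public"},
        {"activity": "meetings_attended", "count": 2, "audience": "community groups"},
    ]

    # Tally each activity type so progress can be reviewed against the program plan.
    totals = Counter()
    for event in events:
        totals[event["activity"]] += event["count"]

    for activity, total in sorted(totals.items()):
        print(f"{activity}: {total}")

A running tally of this sort answers only the "how much is happening" questions of process evaluation; it says nothing about whether the program is having its intended effect.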
Data from routine record keeping and other tracking measures should be reviewed on a regular basis, so that program tasks and schedules can be modified as necessary to improve progress. Information from a process evaluation might include: number of educational materials distributed and to whom; number of events and how many attended; print media coverage and estimated readership; number of inquiries; number of organiza- tions, businesses, and media outlets participating; effectiveness of the working relation- ships among key personnel; and degree of adherence to budget and deadlines as well as reasons for deviations. Outcome Evaluation. Outcome evaluation is used to obtain descriptive data about the results of a project and to document short-term results. Sometimes, these measures may appear to overlap process measures, but they should provide more information about the value than about the quantity of the activity. Project-focused results describe the output of the activity (e.g., the number of organizations, businesses, or media outlets participating and what and how much they are doing). Short-term results describe the immediate effects of the project on the target audience (e.g., the percent of the target audience showing increased awareness of the subject or taking a simple action). An example of an outcome evaluation methodology is a comparison between the target audience's awareness, attitude, and behavior before and after the program. Unlike the qualitative methods used for pretesting, outcome evaluation generally calls for quantitative measures that are necessary to draw conclusions about the program effect. These measures may be self-reported (e.g., interviews with a statistically valid sample of ------- Evaluation for Risk Communicators 15 the target audience) or observational (e.g., a study of changes in public inquiries or town meeting attendance). Comparisons between a control group that did not receive the program and the target audience receiving the program are desirable. It is also useful to accumulate data relevant to the desired outcome from the target audience prior to the intervention (baseline data) and again following the intervention to study changes. However, one problem that must be addressed in comparing pre- and post- intervention data is the role that factors other than the intervention being evaluated (e.g., extensive media attention) may have played. The existence of a control group can lessen this problem. Information that can result from an outcome evaluation includes knowledge and attitude changes; expressed intentions or simple actions taken by the target audience; and policies initiated or other institutional changes made. Impact Evaluation. Impact evaluation is the most comprehensive and difficult to obtain of the four evaluation types. It is desirable for some long-term programs because it focuses on the long- range results of the program, such as changes or improvements in health status. It may also be problem focused, that is, the results of the evaluation relate directly to the problem being addressed. For example, a program designed to make local residents more aware of the risk of toxic chemicals, increase participation in local decisionmaking processes, and ultimately strengthen local governance of toxic waste could be evaluated in terms of changes in awareness of residents' own risk, changes in participation in town meetings (outcome evaluation), and changes in local governance (impact evaluation). 
Impact evaluations are rarely possible because they are often costly, involve an extended commitment, and the results are difficult to attribute to the effects of a single activity or program when compared with other influences on the target audience over extended periods of time. This is especially true for risk communication programs, because there may be more compelling influences on an individual's behavior. For this reason, impact studies are rarely initiated as part of a communication activity, except when communication is one aspect of a larger intervention. Information obtained from an impact study may include changes in morbidity and mortality; changes in absenteeism from work; long-term maintenance of desired behavior; and rates of recidivism. Exhibits 1 and 2 give further information on designing and evaluating risk communication programs. ------- 16 Evaluation for Risk Communicators Exhibit 1. Elements of an Evaluation Design Every formal evaluation design, whether formative, process, outcome, impact, or a combination of elements, must contain certain basic elements. These are briefly described below. 1. A Statement of Risk Communication Objectives—Unless there is an adequate definition of desired achievements, evaluation cannot measure them. Evaluators need clear and definite objectives in order to measure program effects. 2. Definition of Data to be Collected—The determination of what is to be measured in relation to the objectives. 3. Methodology—A study design is formulated to permit measurement in a valid and reliable manner. 4. Instrument—Data collection instruments are designed and pretested. These instruments range from simple tally sheets for counting public inquiries to complex survey and interview forms. 5. Data Collection—The actual process of gathering data. 6. Data Processing—Putting the data into usable form for analysis. 7. Data Analysis—The application of statistical techniques to discover significant relationships. 8. Reporting—Compiling and recording evaluation results. These results rarely categorize a program as a complete success or failure. To some extent all programs have good and bad elements. It is important to realize that lessons can be learned from both, if the results are properly analyzed. These lessons can be applied to either altering an existing program or as a guide to planning new efforts. ------- Evaluation for Risk Communicators 17 Exhibit 2. Risk Communication Assessment Questions How many people were reached? (process evaluation) • Amount of time on radio and television and estimated audience at those times • Print coverage and estimated readership • Number of educational materials distributed • Number of speeches/presentations and size of audience • Number of other organizational and personal contacts Did they respond? (process evaluation) • Number of in-person, telephone, and mail inquiries (location of inquirers, where they heard of the program, and what they asked) • Number of new organizations, businesses, and media outlets participating in the program • Response (e.g., filled-out evaluation forms) from presentations. Who responded? (outcome evaluation) • Demographics of responders (e.g., gender, education, and income) • Geographic residence of responders. Was there change? (outcome evaluation) • Changes in knowledge and/or attitudes • Changes in intentions (e.g., intentions to modify diet) • Actions taken (e.g., increased enrollment in smoking cessation clinics) • Policies initiated or other institutional changes made. 
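The "Was there change?" questions in Exhibit 2 ultimately reduce to comparing measurements taken before and after the program, ideally against a comparison group that did not receive it. A minimal sketch of that arithmetic follows; the community labels and awareness percentages are invented for illustration.

    # Hypothetical percent of survey respondents aware of the risk, before and after the program.
    baseline = {"program_community": 31.0, "comparison_community": 30.0}
    followup = {"program_community": 54.0, "comparison_community": 37.0}

    change_program = followup["program_community"] - baseline["program_community"]
    change_comparison = followup["comparison_community"] - baseline["comparison_community"]

    # Netting out the comparison community's change guards against crediting the program
    # for shifts caused by other influences (e.g., extensive media attention).
    net_change = change_program - change_comparison
    print(f"Change in program community: {change_program:.1f} percentage points")
    print(f"Change in comparison community: {change_comparison:.1f} percentage points")
    print(f"Net change: {net_change:.1f} percentage points")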
------- 18 Evaluation for Risk Communicators Constraints to Risk Communication Evaluation Every program manager faces constraints to undertaking optimal evaluation tasks, just as there are constraints to designing other aspects of a risk communication program. These constraints may include: • Limited funds • Limited staff time and capabilities • Length of time allotted to the program • Limited access to computer facilities • Agency restrictions to hiring consultants or contractors • Policies limiting the ability to gather information from the public • Management perceptions regarding the value of evaluation • Levels of management support for well designed evaluation activities • Difficulties in defining (or establishing agency consensus) regarding the objec- tives of the program • Difficulties in designing appropriate measures for risk communications pro- grams, and • Difficulties in separating the effects of program influences from other influences on the target audience in "real-world" situations. These constraints make it necessary to accommodate existing limitations as well as the requirements of a specific program. However, it is not always true that "something is better than nothing." If an evaluation design, data collection, or analysis must be compromised to fit limitations, the program must decide whether: • The required compromises will make the evaluation results invalid • An evaluation strategy is essential for the particular situation, compared with other compelling uses for existing resources Some questions for program managers to consider when deciding whether to evaluate a risk communication program include: • Is the program entirely new, or does it incorporate messages and methods that have been previously tested? • Have program strategies already been formally evaluated or well documented and accepted? • How long will the program last? Will the implementation phase be long enough to permit significant adjustment? • Will the program be repeated? • Are the objectives measurable in the foreseeable future? • Which program components are most critical? • What aspects of the program fit best with the agency's mission or goals? • Is there management support or public demand for program accountability? • Will an evaluation component help risk communication efforts to compete with other agency priorities for future funding? The answers to these questions should help identify what kind of evaluation should be included in the program. ------- Evaluation for Risk Communicators 19 Determining the Type of Evaluation Listed below are examples of how evaluation can fit into the seven consecutive stages of program development and implementation. 1. State Problem or Need—A well conceived statement of the problem or the need for risk communication is essential, regardless of the size or extent of the program to be developed. Conducting a formal or informal needs assessment at this point can provide objective verification of the need and added understanding of, or new dimensions to, the need to be addressed. 2. Formulate Goals and Objectives—All risk communication programs should be founded on carefully composed goals and objectives. The goals describe the overall change, such as a specific improvement in the health of the specified population; in most cases, activities beyond risk communication will be neces- sary to reach stated goals. The objectives describe the intermediate steps that must be reached to accomplish the broader goal. 
These objectives should be as specific as possible, obtainable through risk communication activities, and measurable. Once the goals and objectives are written and approved, these statements serve as a kind of agreement or contract regarding the program's purpose; all aspects of the risk communication program should relate specifically to them. Without clear, measurable goals and objectives, there is no clear direction for program development and no basis for evaluation. Plans for outcome or impact evaluation should be developed at this point also to permit the collection of baseline data, if possible, prior to program intervention. 3. Develop Risk Communication Strategies and Message Concepts—The com- munication strategy statement outlines the benefits and information to be com- municated to the target population; the message concepts are how the information will be communicated (e.g., the information, the appeal, and the spokesperson) as opposed to the fully composed messages. Concept testing at this stage will confirm whether the proposed benefits, appeals, spokesperson, and information are considered clear, understandable, culturally acceptable, and relevant by the target audience. 4. Draft Risk Communication Materials—Pretesting prior to the expenditure of production funds can help diagnose any problems and indicate whether the materials are likely to be effective. 5. Develop Distribution and Implementation Plans—These plans will indicate through what mechanisms, in what quantities, and when messages and materials will be directed to the target audience. For broad-scale or long-term programs, a field test (pilot test) of these mechanisms for a short period with a smaller audience segment (or smaller geographic area) can help identify any potential ------- 20 Evaluation for Risk Communicators problems before full implementation begins. A field test may be designed to test several options for risk communication delivery and assess their relative effec- tiveness to permit full- scale implementation of the most successful methods. 6. Implement Programs—Process evaluation measures used during program implementation can provide feedback in time for modifications to be made, if necessary. This accountability evaluation is designed to identify and correct problems, not necessarily to determine the extent to which the problems exist. These measures can also provide indications of intermediate progress and justify program expansion. 7. Assess Program Effects—Outcome evaluation, based on measurable goals and objectives, is designed to document what changes occurred. Often it is difficult or impossible to credit the risk communication activities as the direct cause of the effects. However, outcome evaluation can help program managers determine whether the specific program or some of its methods should be continued or replicated. An analysis of outcome measures combined with resource costs can yield some measure of the efficiency of the process (cost effectiveness) and the importance of the change relative to its cost (cost benefit). Rarely does anyone have access to adequate resources for an ideal risk commu- nication program, much less an ideal evaluation component. Nevertheless, there are practical benefits to including an evaluation component, such as to determine whether the program is on track and how well or why it worked. With a little creative thinking, some form of evaluation can be included in almost any budget. 
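Step 7 notes that outcome measures combined with resource costs yield a measure of cost effectiveness. A minimal, purely illustrative calculation is shown below; all figures are assumed, not taken from any program discussed here.

    # Assumed figures for illustration only.
    program_cost = 48_000.00   # total program expenditure, in dollars
    people_reached = 20_000    # estimated audience exposed to the messages
    people_acting = 1_200      # e.g., number enrolling in a smoking cessation clinic

    cost_per_person_reached = program_cost / people_reached
    cost_per_person_acting = program_cost / people_acting

    print(f"Cost per person reached: ${cost_per_person_reached:.2f}")
    print(f"Cost per person taking the desired action: ${cost_per_person_acting:.2f}")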
However, resources other than program funds, such as professional staff time and skills, computer time, and evaluation consultants, also should be considered when determining evaluation strategies. Table 2 includes examples of different evaluation tasks for programs with minimal, modest, or substantial resources. The matrix is additive from left to right. That is, each ascending program level could be expected to include the evaluation techniques described at lower levels in addition to those described at the higher level.

------- Evaluation for Risk Communicators 21

Table 2. Evaluation Options Based on Available Resources

Formative
• Minimal resources: Readability test
• Modest resources: Central location intercept interviews
• Substantial resources: Focus groups, individual in-depth interviews

Process
• Minimal resources: Recordkeeping (e.g., monitoring activity timetables)
• Modest resources: Program checklist (e.g., review of adherence to program plans)
• Substantial resources: Management audit (e.g., external management review of activities)

Outcome
• Minimal resources: Activity assessments (e.g., numbers of health screenings and outcomes or program attendance and audience response)
• Modest resources: Progress in attaining objectives (e.g., periodic calculation of percentage of target audience aware, referred, or participating)
• Substantial resources: Assessment of target audience knowledge (pretest and posttest to measure change in audience knowledge)

Impact
• Minimal resources: Print media monitoring (e.g., monitoring of content of articles appearing in the media)
• Modest resources: Public surveys (e.g., telephone surveys of self-reported knowledge or behavior)
• Substantial resources: Studies of public behavior/health change (e.g., data on physician visits or changes in the public's health status)

------- 22 Evaluation for Risk Communicators

SUGGESTED READINGS

Environmental Protection Agency. 1987. Evaluating and Improving EPA's Risk Advisory Programs. Washington, DC: The Agency, Program Evaluation Division, Office of Policy, Planning and Evaluation, May.

Fink, A., and J. Kosecoff. 1987. An Evaluation Primer and Workbook: Practical Exercises for Health Professionals. Beverly Hills, CA: Sage Publications.

Fitz-Gibbon, C.T., and L.L. Morris. 1978. How to Design a Program Evaluation. Beverly Hills, CA: Sage Publications.

French, J.F., C.C. Fisher, and S.J. Costa, Jr. Working with Evaluators: A Guide for Drug Abuse Prevention Program Managers. U.S. Department of Health and Human Services. Rockville, MD: Alcohol, Drug Abuse and Mental Health Administration, Publication No. (ADM) 83-1233.

Green, L.W., and F.M. Lewis. 1986. Measurement and Evaluation in Health Education and Health Promotion. Palo Alto, CA: Mayfield Publishing Co.

Morris, L.L., and C.T. Fitz-Gibbon. 1978. How to Measure Program Implementation. Beverly Hills, CA: Sage Publications.

National Cancer Institute. 1989. Making Health Communications Programs Work: A Planner's Guide. Bethesda, MD: The Institute, NIH Publication No. 89-1493.

National Heart, Lung, and Blood Institute. 1986. Measuring Progress in High Blood Pressure Control: An Evaluation Handbook. NIH Publication No. 86-2647. April.

Rossi, P.H., and H.E. Freeman. 1985. Evaluation. Beverly Hills, CA: Sage Publications.

------- Evaluation for Risk Communicators 23

GLOSSARY

Baseline study—collection and analysis of data regarding a target audience or situation prior to intervention.

Control group—a group randomly selected and matched to the target population according to characteristics identified in the study to permit a comparison of changes between those who receive the intervention and those who do not.
Formative evaluation—evaluative research conducted during program devel- opment (e.g., state- of-the-art reviews, pretesting messages and materials, and pilot testing a program on a small scale before full implementation). Goal—the overall improvement the program will strive to make; this usually requires efforts beyond risk communication. Impact evaluation—research designed to identify whether and to what extent a program contributed to accomplishing its stated goals (more global than outcome evaluation). Objective—a quantifiable statement of a desired program achievement necessary to reach a program goal; for risk communication programs, specific objectives can relate directly to desired outcomes of communication activities. Outcome evaluation—research designed to obtain data about the results of a program (short term or intermediate changes). Pretesting—a type of formative research that involves systematically gathering target audience reactions to messages and materials before they are produced in final form. Qualitative research—research that is subjective in that it involves obtaining information about feelings and impressions from small numbers of respondents. The information gathered usually should not be described in numerical terms, and generalizations about the target population should not be made. Quantitative research—research designed to gather objective information from representative, random samples of respondents. Results are expressed in numerical terms and are used to draw conclusions about the target audience. ------- The Twelve Laws of Evaluation Research Peter H. Rossi This paper highlights some of the major ideas in A Guide to Evaluation Research Theory and Practice (Rossi and Berk, 1988). The full text of this larger paper, reprinted in the Appendix, introduces the reader to the central substantive and technical issues in evaluation research and to the important literature in this field. Presented here are some principles derived from that paper. Major Evaluation Modes At first, evaluation research assessed whether or not programs were succeeding in reaching their stated goals, but it soon became clear that there was a strong need to use social research in designing the programs as well. As a result, there are now two major evaluation modes: • Formative evaluations, consisting of research to improve programs at the pro- gram design stage • Summative evaluations, consisting of efforts to assess the success of existing programs. Although formative and summative evaluations resemble each other in some ways, there are important differences. Operating agencies with the responsibility for implement- ing programs are usually more interested in formative than in summative research. Policymakers and oversight groups, such as the Congress and the executive branch agencies, are usually more interested in summative research. Most of the evaluation laws discussed below are applicable primarily to one mode or the other. A few are general laws that apply both to formative and summative evaluations. Some of these laws appear to embody simple common sense and therefore may seem hardly worth stating. Yet, a large proportion of failed programs and inconclusive findings are the result of not following these common-sense laws. 25 ------- 26 The Twelve Laws for Evaluation Research Three General Laws of Evaluation Practice LAW GI: There is no such thing as a free evaluation. This law states that there are costs to every evaluative effort. 
It implies that there is a rough proportionality between quality and price.

LAW GII: Evaluations should not cost more than the program being evaluated. This law emphasizes that evaluation is not an end in itself but is necessarily subservient to the programs to which it is applied. An implication of this law is that trivial programs do not merit elaborate evaluations and that important programs ought to be evaluated more elaborately.

LAW GIII: Evaluation starts at the very beginning of a program. Ex-post-facto evaluations can never attain the same degree of validity as evaluations that are planned at the outset of a program and conscientiously pursued throughout the planning, design, and implementation stages.

The Laws of Formative Evaluation

LAW FI: Proper design requires prior knowledge. This law states that a program cannot be designed properly without having some valid knowledge about the nature, extent, and location of the problem in question. It means that one of the first steps in the design of programs is to learn about the nature of the problem to which the program is addressed. Of course, the information needed is not simply the opinions and guesses that one can find in the op-ed pages of the national media or depicted in television documentaries. What is needed is valid data, firmly based on rigorous social research.

LAW FII: Proper evaluation design requires specific program goals. Or, if you don't know where you are going, you can't figure out how to get there. Stated this way it sounds obvious, but it is one of the most frequently ignored rules of program design. There are all too many examples of legislation that simply provides funds for programs without specifying what the programs are to accomplish, a sure invitation to the design of frivolous programs.

LAW FIII: Response to dosage is usually curvilinear. Another way of stating this law is that a reduction in the amount of a treatment of some sort does not usually produce a proportional reduction in response. For example, if an eight-page educational pamphlet produces a certain amount of knowledge change, a four-page version does not necessarily produce half that amount (but usually considerably less).

LAW FIV: Pilot programs usually work better than production programs. This law means that it is easy for program designers to produce and run a program that is effective when they run it but not so easy to fashion a program so that YOAA—"Your Ordinary American Agency"—can carry it out. A critical design issue is the need to create a program that, when turned over to an operating agency, will perform as well as when under the control of designers.

The Laws of Summative Evaluation

The main purpose of a summative evaluation is to estimate a program's impact; that is, those effects that are over and above what would have occurred naturally, or net effects. A homely illustration: an effective remedy for the common cold should produce recovery ------- The Twelve Laws for Evaluation Research 27 in patients in time periods appreciably shorter on the average than the typical two weeks that it takes for untreated patients. Summative evaluations, which are quite tricky and difficult to carry out well, require high levels of technical skill. A summative evaluation is usually commissioned by an agency with oversight responsibility; in the case of the federal government, this is Congress, an agency in the executive branch, or the central policymaking unit of an agency.
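To put a number on the "net effects" idea, the common-cold illustration above can be reduced to a few lines of arithmetic. The recovery times below are invented for the example and are not drawn from any study cited here.

    # Assumed average recovery times, in days, for illustration only.
    untreated_recovery_days = 14.0  # roughly what would have occurred naturally
    treated_recovery_days = 9.5     # hypothetical average among patients given the remedy

    # The net effect is the improvement over and above natural recovery.
    net_effect_days = untreated_recovery_days - treated_recovery_days
    print(f"Estimated net effect of the remedy: {net_effect_days:.1f} days shorter recovery")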
Summative evaluators often find themselves regarded as antagonists by program managers. In contrast, formative evaluators typically work closely with program designers and managers.

LAW SI: Impact assessments are not substitutes for the political process. The first law of impact assessment states that policymaking is a political not a technical function. The fact that a program has been found effective or ineffective usually does not dominate decisionmaking about that program, nor should it. There are many reasons for establishing and continuing a program, among which effectiveness may be only one of the major criteria. Correspondingly, there are many examples of programs that have been shown to be ineffective or weakly effective that are nevertheless continued; prime examples are job training programs.

LAW SII: The impact of a program can be assessed only comparatively. This law states that in order to estimate the impact of a program it must be compared to the absence of that program. This is the law that mandates the use of comparison groups. It also implies that randomized controlled experiments are the preferred means to make such comparisons, although they are frequently impractical. Most of the art of impact assessment lies in defining and using the best possible and most practicable comparison groups or situations. The full paper provides a charted inventory of the nine most commonly used approaches to the construction and utilization of comparison groups, ranked in rough order of credibility of the resulting impact assessments. This chart is the most important item of information in the long section on impact assessment and should be given serious consideration.

LAW SIII: Programs that do not have clear and consistent goals cannot be evaluated. This third law of impact assessment is a restatement of the second law of formative evaluations. In other words, if you don't know where you are going, not only can you not figure out how to get there, but if you do get there, you don't know where you are. Designing an impact assessment requires specifying in advance what are to be indicators of success, a process that involves translating program goals into concrete measures of success. A program, such as a community block grant, that has only the vague goal of improving the quality of living in urban areas, simply cannot be assessed. Evaluators can avoid a lot of aggravation by simply refusing to undertake evaluations of such programs.

LAW SIV: The expected outcome of an impact assessment is an estimate of zero impact. This fourth law of impact evaluation often is called the "iron law of evaluation" and is misunderstood as an argument against having any programs. The law is based on the fact that most evaluations find programs to be, at best, only marginally effective in reaching their goals. In part, these findings reflect the fact that designing effective programs is not an easy task. Ours is a society that has moved a long way toward improving the level of living of most members; making additional improvements is usually increasingly more difficult. ------- 28 The Twelve Laws for Evaluation Research For example, it is relatively easy to move a society rapidly from 10 percent literacy to 60 percent literacy, and indeed, there are many examples of nations that have accomplished such changes in the space of one or two generations. In contrast, it is much more difficult to move from 80 percent literacy to 90 percent.
To be illiterate in a society in which most people are literate is quite a different matter from being in that condition when a majority is illiterate. Similarly, most of the decline in mortality experienced in our society was accomplished relatively easily and inexpensively by such public health measures as sanitary sewers and supplying reasonably good drinking water. Today's problems in further reducing mortality require more resources at a higher level of effort with poorer prospects of success.

The somewhat discouraging message of the fourth law reflects the fact that we usually evaluate only those programs whose success is problematic. There are no evaluations of our social security old age pension system or of our public schools, because there are very few doubts that mass education is effective compared to no education at all or that retired persons are better off under the social security benefit system compared to no benefit system at all. In other words, we set about to estimate the effects of those programs whose effectiveness we believe is problematic. It is no surprise that when we do so, our findings are that they are indeed problematic.

LAW SV: There are three main reasons for the failure of programs.
• The problem was not correctly understood and that misunderstanding was built into the structure of the program.
• The program was improperly designed.
• YOAA could not deliver the program with sufficient fidelity and at the correct dosage level.

This law states that the interpretation of an impact assessment is a complicated matter. In the first place, it has to take into account the fit between a program and the existing valid knowledge concerning the problem in question. Clearly, if there is no reasonable correspondence, that is reason enough for the program's failure. The example that comes most easily to mind is the assumption in the design of the housing voucher experiment that families that lived in substandard housing, as defined in the experiment, would agree that they lived in such housing and would welcome change to better housing. In fact, many of the standards used by the program were irrelevant to participants.

The second main reason for ineffective programs is design defects in the programs themselves. For example, a famous California program providing group therapy for prisoners was designed to utilize prison guards as therapists, a design feature that ensured that an atmosphere of trust between therapy group members and therapists would not be attained. This example can be quite misleading, however; most program design defects are not as obvious.

The third main reason for program failure lies in program implementation. It is all too often the case that an agency is given a mission for which it is unsuited. Police departments have been given the mission of counseling in domestic disputes when called ------- The Twelve Laws for Evaluation Research 29 to quell a family quarrel. Schools in inner cities have been given the task of providing for recreation for drop-outs. The military has been asked to release unused facilities to house the homeless. The U.S. Department of Agriculture's Extension Service was asked to set up urban extension programs to teach proper nutrition practices to inner city mothers. An additional failure in implementation can occur because the agency assigned to implement a program is not given enough resources to accomplish that end. The result is a fatally weakened version of the program.
Conclusion

These are only a few central rules for the proper design and conduct of evaluation research. Additional laws might be formulated and could, indeed, make up a fat compendium. What makes these twelve laws important is that they link the technical skills of evaluation research with substantive knowledge about problems and programs. Evaluations, whether formative or summative, are not just technical exercises; they need to be informed by substantive knowledge.

REFERENCE

Rossi, P.H., and R.A. Berk. 1988. A Guide to Evaluation Research Theory and Practice. Paper prepared for the Workshop.

------- PRESENTATIONS -------

Integrating Evaluation into the Development and Design of Risk Communication Programs

June A. Flora

Risk communication researchers and professionals have long acknowledged the importance of evaluation in message development, intervention implementation, and program dissemination. Interventions can suffer from a lack of planned, systematic, and comprehensive evaluation that incorporates preproduction research, intervention and dissemination monitoring, and measuring program effectiveness (Flay, 1987). In addition, evaluation results often are not incorporated into message development, intervention implementation, and program revision. This lack of integration of results into program plans can be due to lack of time to fully incorporate the resulting feedback. In other cases evaluations are initiated late in the intervention planning process (e.g., intervention monitoring or program outcome evaluation). Other problems include evaluations that are limited to superficial objectives (e.g., liking, reading, listening) and exclude more in-depth evaluation (e.g., audience segmentation, needs analysis) and measurement of objectives closer to the desired outcomes (e.g., behavior change).

This paper describes the components of a framework for comprehensive risk communication evaluation and provides suggestions for integrating evaluation results into the intervention planning process. The framework includes evaluation during the design and development phases of an intervention. This preproduction phase includes planning research, concept testing, pretesting of messages, and pilot studies. We call this first phase formative evaluation. The second phase, which roughly corresponds with the second phase of program development, is called process evaluation. Process evaluation includes monitoring message dissemination, implementation quality, and participant utilization and satisfaction. The final, most often discussed phase of evaluation, outcome evaluation, will be reviewed only briefly. The emphasis here will be on the underutilized and often neglected area of research in the preproduction and dissemination phases of a risk communication program. The first two phases of evaluation research will be illustrated with examples from the Stanford Five City Project (FCP). Finally, a set of principles for increasing the utility of results will be presented. 33

------- 34 Integrating Evaluation Into Risk Communication Programs

Framework for Comprehensive Evaluation

Table 1 presents each of the three phases of a comprehensive evaluation plan, which correspond roughly to the stages of intervention development. The first phase of intervention development is called preproduction planning (identification of target audiences, concept development, audience analysis, specification of intervention outcomes) and production (message design, refinement, and final production).
The second stage of program development encompasses implementation and dissemination (intervention delivery). The final stage is intervention refinement and revision (understanding what worked and what failed). The three corresponding stages of evaluation are discussed in more detail in the following sections.

Table 1. Phases of a Comprehensive Evaluation

Evaluation Sequence (Phase of Evaluation):
1. Audience segmentation (Formative)
2. Asset and needs analysis (Formative)
3. Concept testing (Formative)
4. Message pretesting (Formative)
5. Pilot studies (optional) (Formative)
6. Dissemination (Process)
7. Utilization (Process)
8. Implementation effectiveness (Process)
9. Intervention effectiveness (Summative)

Intervention Sequence:
1. Identify audiences
2. Specify objectives
3. Develop concepts
4. Construct messages
5. Refine messages
6. Implement programs
7. Disseminate products/programs
8. Follow-up programs

Formative Evaluation

Formative evaluation is defined as the sum of evaluation activities that occur prior to the final production of a risk communication intervention. Formative evaluation encompasses activities that serve three functions relevant to intervention design: planning research, concept testing, and message pretesting, respectively.

Planning research is one of the most important evaluation activities in this sequence. Planning research activities set the stage for intervention conceptualization. Prior to these planning evaluation activities, risk communication planners have determined their theoretical orientation, broadly identified their target group (e.g., smokers, sexually active ------- Integrating Evaluation Into Risk Communication Programs 35 adults, or overweight men), and specified target outcomes (e.g., changes in morbidity, mortality, behavior). These planning requirements set the stage for planning research.

Planning research can be categorized into three separate sets of activities; however, in reality all may be carried out by one survey or other research activity. The first planning research activity is audience segmentation. The goal of this step is to identify target audiences that differ by variables that are relevant for intervention design. These relevant variables include demographic factors that may be indicative of differences in message format (e.g., easy to read), access to information (e.g., cost of programs, membership, mobility), or message appeal (e.g., messages embedded in cultural context, gender differences, experience with outcome behavior). Other relevant segmentation variables are differences in extent of involvement in the outcome behavior (e.g., the sedentary, low level exercisers, and vigorous exercisers), lifestyle factors (e.g., cognitive, social, and behavioral factors), and information processing (e.g., high information seekers). Whatever the segmentation variables, subgroups must differ by factors important to intervention designers, i.e., channel of communication and message.

The second planning research activity is audience needs analysis. Once audience segments are identified, their needs and assets must be determined. This step is traditionally labeled "needs analysis," but this unfortunate title masks the fact that needs as well as resources (e.g., skills, networks, motivation, regulations) must be identified. Needs analysis combined with the third and final planning activity, channel analysis, set the stage for creative and effective message/intervention design.
Risk communication interventions often require a variety of channels through which messages, products, and services can be delivered to target groups. These channels range from the mass media (e.g., television, radio, and newspapers) to more narrowcast media (e.g., mail, newsletters, small audience radio and newspapers, specialized magazines) to interpersonal communication (e.g., community leaders, social opinion leaders, professionals) (Lefebvre and Flora, 1988). Audiences differ greatly in their use of channels, the communication functions of channels (e.g., information, entertainment, socialization, and surveillance), and the extent to which they select within channels (e.g., reading only news or sports in the newspaper). These audience communication pattern data and practical information on feasibility of channel use, cost, time, and access are necessary for the risk communication intervention designer (Flora, Maibach, and Slater, 1989).

Concept assessment and testing is often a more qualitative followup of audience segmentation and needs analysis. Researchers may observe members of the target audience in natural settings (e.g., ethnographic research), conduct intensive interviews with audience members about issues germane to the targeted outcome, or determine intervention preferences by examining behavior patterns in related areas (e.g., self change, health, social skills). Concepts also can be assessed quantitatively. In the Stanford Five City Project (FCP), we assessed the appeal of a range of intervention possibilities such as media programs, self-help print kits, correspondence courses, groups, and classes. This information was used to set product development priorities for the next year of intervention.

Once target groups are identified and concepts refined, further research is necessary to ensure that the final products will achieve interventionists' objectives. This next, more detailed step requires that samples of the target audience be exposed to rough forms of the ------- 36 Integrating Evaluation Into Risk Communication Programs final product. Message pretesting incorporates all evaluation activities that assess the extent to which risk communication products meet their informational objectives. Message pretesting also can determine the likelihood of success of products through evaluation of acceptability, comprehension, familiarity, memorability, and credibility.

Two methods of data collection are commonly used for message pretesting research: audience response analysis and focus group discussions. It is always useful when these two methods are conducted with the same samples of audience members. Audience response techniques, such as those used by the Health Message Testing Service (Office of Cancer Communications, 1984) at the National Cancer Institute and the National Heart, Lung, and Blood Institute, utilize structured approaches to message presentation, audience response, and analysis. Focus group discussions can vary in their purpose and process (Basch, 1988) but typically incorporate an unstructured discussion of the draft health product. More extensive discussions of formative evaluation are available in publications by Palmer (1980), LaRose (1980), and Atkin and Freimuth (1989).

A final practical issue created by experience is the relationship of the degree of finality of the product to the relevance of the message testing results.
For example, a television PSA is first a script; then a script with description of visual and auditory elements; a storyboard (cartoon-like pictorial sequences that show the message from beginning to end); a draft production (a roughly edited version of the message, often using different actors, convenient locations, and music); a rough cut (a draft form of scenes from the final product, perhaps without final music, transitions, and edits); and finally, a finished product. Message testing conducted early in this message development process is important for determining if the message concept is promising. However, information gathered early in the development will be less accurate in regard to production characteristics that have not yet been finalized (e.g., actors, music, editing, setting). Yet, changes later in the sequence usually cost more. Interventionists and evaluators must often make tradeoffs between the quality of evaluation results and the expense (dollars and time) of incorporating changes. A similar analogy is possible with print media, although desktop publishing has reduced considerably the cost of near-final printed products.

Audience segmentation: An example from the FCP. Smokers identified in the FCP baseline survey were segmented into three motivational groups (highly committed to quit, moderately committed, not at all committed). Comparisons of the three groups showed that those not committed to quit: a) were more likely to be male and less likely to be high school graduates; b) more often had a heavier smoking history—they smoked more cigarettes a day, quit fewer times, and had increased the number of cigarettes smoked over the past two years; c) had poorer health habits and were less interested in changing habits to avoid coronary heart disease (CHD); d) used smoking to cope with life stresses and held attitudes that reflected a poor sense of control over their smoking; and e) perceived fewer pressures to quit.

In general, less committed smokers were early in the change process (e.g., less aware, informed, motivated, and skilled). Increasing knowledge about the effects of smoking, increasing perceptions of the benefits of change, and improving self-efficacy about quitting are important requisites to quitting. More highly motivated quitters fell later in the change process, needing cessation skills rather than motivation. Further, constructing a social ------- Integrating Evaluation Into Risk Communication Programs 37 environment supportive of staying quit (e.g., having nonsmoking friends, family support for quitting, a smoke-free environment) is crucial to the continued success of motivated quitters.

FCP campaign designers initially focused on the more motivated smokers, reasoning that they could be helpful to those who were less motivated. Television programs supplemented with classes and self-help quit kits constituted some of the first efforts in smoking cessation (Sallis, Flora, Fortmann, et al., 1985). Later in the campaign, quit smoking contests that offered incentives (a trip to Hawaii) for quitting and staying quit were implemented on an annual basis (King, Flora, Fortmann, et al., 1987). Efforts to reach poorly motivated smokers included PSAs utilizing fear appeals to persuade smokers about the need to quit, combined with the promotion of telephone numbers to call for more intensive instruction.
A program teaching physicians to prescribe Nicorette gum (to cope with the addiction to nicotine) and counsel patients about staying quit was also an important component of a comprehensive smoking effort. That segmentation analysis combined with needs analysis of the identified segments shaped the sequence and construction of programs for the length of the FCP. Other formative research (i.e., message testing) supplemented the segmentation analysis and further shaped and refined individual program products.

Process Evaluation

Process evaluation is distinguished from formative evaluation by its concern with the processes of dissemination and implementation of the intervention. These processes of implementation can be described by three broad concepts: (1) identification and definition of the "do's" of intervention, (2) determination of the integrity of intervention delivery, and (3) detection and description of direct and indirect (intended and unintended) program effects.

The first concept concerns evaluation from a message sender or program delivery perspective. The goal of this first aspect of process evaluation is to identify program components, their intended outcomes, their intensity, repetition, and potency. The second aspect of process evaluation, integrity of intervention, also is concerned with the extent to which the actual implementation of the intervention meets the expectations of program designers. The third aspect of process evaluation is concerned with the participants' (and non-participants') responses to the program.

The objectives of this level of evaluation are to: (1) investigate the qualitative aspects of the program, (2) determine the amount of intervention, (3) provide explanatory links in cause and effect relationships, (4) determine any unintended or indirect effects of the intervention, and (5) provide supplemental data that may augment the interpretation of outcome evaluations. In addition, program monitoring data are useful for effective administration and management of programs through increased information for intervention goal setting, intervention development, establishment of priorities, and refinement of programs. This is accomplished through provision of feedback on the intervention to program staff. Both intervention integrity and program development are enhanced by process evaluation. Finally, intervention monitoring can yield archives of program efforts that become the database for cost and cost-effectiveness analysis. Thus, program management, planning, ------- 38 Integrating Evaluation Into Risk Communication Programs program description, and cost analyses are important additional outcomes of intervention monitoring.

Process evaluations most often require the capability to: (1) collect similar types of information over time, (2) collect data about participants as well as non-participants, (3) allow for collecting information so that the amount of intervention exposure per recipient can be calculated, and (4) monitor progress of interventions.

The Stanford FCP developed an education monitoring system that accounts for the number, type (e.g., print, television, face-to-face), and objective (awareness, information, skills) of messages sent to a target audience. This time-based system provided feedback to program planners on progress towards implementation goals, amount of effort devoted to promotion and behavior change, number and type of channels of communication used, and estimates of individuals reached by programs (Flora, Goode, and Farquhar, 1985).
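An education monitoring system of the kind just described is essentially a time-stamped log of messages classified by channel and objective. The sketch below is only a guess at the general shape of such a system—it is not a description of the Stanford FCP's actual database, and every field name and figure is hypothetical.

    from collections import defaultdict

    # Hypothetical message log: one record per message or program delivered.
    message_log = [
        {"month": "1986-01", "channel": "television",   "objective": "awareness",   "est_reach": 40_000},
        {"month": "1986-01", "channel": "print",        "objective": "information", "est_reach": 12_000},
        {"month": "1986-02", "channel": "face-to-face", "objective": "skills",      "est_reach": 300},
        {"month": "1986-02", "channel": "television",   "objective": "awareness",   "est_reach": 35_000},
    ]

    # Summarize estimated reach by month and channel to track progress toward implementation goals.
    reach = defaultdict(int)
    for record in message_log:
        reach[(record["month"], record["channel"])] += record["est_reach"]

    for (month, channel), total in sorted(reach.items()):
        print(f"{month} {channel}: estimated reach {total}")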
Supplemented with auxiliary studies of audience response and implementation integrity, FCP staff were able to detect problems in implementation and to make revisions both during the ongoing campaign and in future campaigns.

Summative Evaluation

Summative, or outcome, evaluation describes the impacts (direct, indirect, intended, unintended) of programs. Its main objectives are to determine whether the program goals are achieved and whether there are alternative explanations for results. There are several excellent discussions of summative evaluations of communication campaigns (Cook and Flay, 1981; Cook and Flay, 1989). These references review summative evaluations that simply monitor programs along with more costly causal designs.

Using Evaluation Results

Literature on the history of the use of evaluations reveals a generally pessimistic picture (Windsor, Baranowski, Clark, and Cutter, 1984). Evaluation results are often not incorporated into program planning and implementation. Four general kinds of reasons are given for the lack of utilization of evaluation results (Wholey et al., 1970; Windsor et al., 1984):

(1) Organizational inertia. Organizations are typically slow to incorporate recommendations for change. Organizations are much better at maintaining the status quo than changing. Thus, evaluations suggesting changes in planning and program development are likely to be given less attention than those reinforcing current efforts.

(2) Methodological weakness. Poorly conducted studies are subject to internal as well as external criticism. This criticism weakens the credibility of results and interpretations. Thus, decisionmakers are likely to use their own judgments under conditions of poor research.

(3) Design irrelevance. Often evaluations are conducted without the input of program planners and decisionmakers. This lack of input from those who are the consumers of the research does not facilitate the active use of results.

(4) Lack of active dissemination. Lack of dissemination includes two issues. First, evaluation results are at times simply not disseminated to a wide range of ------- Integrating Evaluation Into Risk Communication Programs 39 individuals within an organization. Perhaps a more common dissemination problem with the types of research discussed here is the lack of tailoring of research reports to the skills and needs of program staff.

These four concerns are most commonly offered in the context of summative evaluation. However, the last three reasons—methodological weakness, design irrelevance, and lack of dissemination—are relevant for formative and process research. Evaluation can be made more useful if a few fairly simple suggestions are followed:

(1) Organizational prioritization. Decisionmakers within organizations planning risk communication programs must be committed to evaluation research throughout the lifespan of a program, including conceptualization, implementation, and refinement. This commitment has to be accompanied by allocation of time to conduct research, resources to carry out evaluation plans, funds for staff, the necessary equipment and other logistical needs, as well as evaluation expertise to guide planning and implementation of the evaluation.

(2) Evaluation planning. This second suggestion is a logical consequence of organizational prioritization. Yet, it is so important that it deserves special attention. Planning for formative, process, and summative evaluation has many components.
These planning components include input from staff concerned with program development; input based on review of relevant research literature; input derived from the theories that guide program devel- opment; and input based on a consideration of evaluation objectives in each phase (e.g., formative evaluation should include considerations of target outcomes to be assessed in summative evaluation). (3) Staff training and education. In order to facilitate input in evaluation planning from a range of program staff (e.g., evaluation and risk communication program planners, media professionals, content experts and evaluators), all staff members should be fluent with the basic tenets of evaluation. Training should be supplemented by staff discussions about planning, data analysis, and interpretation. This group discussion process is invaluable in facilitating the integration of research and program design and implementation. (4) Reporting of evaluation research. Program planners often require simple, direct answers to complex questions. Put differently, they need usable data from evaluations. Thus, while the evaluation design, methods, and analysis may be complex, interpretation and presentation of the data generally need to be straightforward to be useful in the intervention planning context. Summary A comprehensive risk communication evaluation framework is one that includes: 1. Formative evaluation for the design and production of risk communication messages/products/programs; 2. Process evaluation to monitor implementation and dissemination of interven- tions; and 3. Summative evaluations, preferably those that are able to determine causality. ------- 40 Integrating Evaluation Into Risk Communication Programs An often neglected area of concern in evaluation is linking evaluation results to program development, implementation, and revision. Organizational prioritization, adequate evaluation planning, enhanced communication between program and evaluation staff, and evaluation reports tailored to intervention staff needs will increase the likelihood of building risk communication campaigns based on evaluation research data. REFERENCES Cook, T.D. and Reichardt, C.S. (1992). Qualitative and Quantitative Methods in Evaluation Research, Beverly Hills, CA: Sage. Flay, B.R. (1987). Evaluation of the development, dissemination, and effectiveness of mass media health programming. Health Education Research. 123-129. Flay, B. and Cook T.D. (198)1. Evaluation of mass media prevention campaigns. In Rice R.E., Paisley W. (Eds.), Public Communication Campaigns. Beverly Hills: Sage. Flora, J.A., Maccoby, N.M, Farquhar, J.W. (1985). A Prototype Education Monitoring System. A paper presented at the American Public Health Association meeting, Wash- ington, D.C. Flora,J.A.,Maccoby,N.,Farquhar,J.W. (1989). Cardiovascular disease prevention: The Stanford Studies. In Rice, R. and Atkin, C., Public Communication Campaigns. Beverly Hills, CA: Sage. King, A.C., Flora, J.A., Fortmann, S.P. and Taylor, C.B. (1987). Smokers challenge: Immediate and long term findings of a community smoking cessation contest. American Journal of Public Health. 77. 1340-1341. La Rose, R. (1980). Formative evaluation of children's television as mass communication research. In Dervin, B. and Voigt, T. (Eds.), Progress in Communication Science. Vol. II, 275-297. Palmer, E. (1981). Shaping persuasive messages with formative research. In R.E. Rice and W.J. Paisley (Eds.), Public Communication Campaigns. Beverly Hills, CA: Sage. Reicharts, C.S. 
and Cook, T.D. (1979). Beyond qualitative vs. quantitative methods. In Qualitative and Quantitative Methods in Evaluation Research. Beverly Hills, CA: Sage.

Wholey, J., Scanlon, J., Duffy, H., Fukumoto, J., and Vogt, L. (1970). Federal Evaluation Policy: Analyzing the Effects of Public Programs. Washington, D.C.: Urban Institute.

Windsor, R.A., Baranowski, T., Clark, N., and Cutter, G. (1984). Evaluation of Health Promotion and Education Programs. Palo Alto, CA: Mayfield.

-------

Marketing Research and Risk Communication
Corporate and Public Sector Roles

William D. Novelli

Elements of marketing research, for purposes of planning, tracking, and evaluation, have found their way into many of today's health communication programs. In one sense, this is to be expected, since marketing research and evaluation research stem from common social science antecedents. In addition, some of the research methods in health communication, such as the use of focus groups for qualitative research, are borrowed directly from marketing.

The evaluation research described by Rossi and Berk (1988), however, is more precise, more comprehensive, and usually more expensive than most marketing studies. Nonetheless, their concept of "the best possible strategy" is certainly common to marketing, since questions of cost, timeliness, political feasibility, and other pragmatic considerations are essential to decisionmaking. Marketing research, at least in the commercial world, is supposed to have an impact on the bottom line.

There was a time when corporations could rely almost solely on technical expertise in research and development or production to carve out successful businesses. Most of the great corporations became great because they excelled in some technology: DuPont in polymer chemistry, PPG in glass, General Electric in electrical products, and IBM in computers. But today, superior technology is virtually ubiquitous. It is no longer so much a competitive edge as it is the price of being able to compete at all. If one company moves out ahead in a technological area, its technologically adept competitors are apt to copy it quickly.

The competitive edge, therefore, frequently lies in marketing. The key to success in business today is often the ability to direct technical efforts into producing and delivering products and services for which certain targeted markets will pay and with which they will be satisfied. That, of course, is what marketing is all about. And that is why marketing research is so essential to playing the game as well as to keeping score.

While technology is changing, so is marketing research. For example, split cable technology now enables marketers to compare the perceptions of households receiving one set of television commercials with next-door neighbors who are viewing a different set of TV messages. This split cable research technology is being combined with universal product code scanners, which read and register items and their prices as they are checked out of supermarkets. Using them together, marketers can measure the purchase patterns of household members who are viewing the TV commercials being tested. Also, pricing strategies, coupon use, in-store promotions, and other techniques can be assessed, all combined with demographic, purchase, and media patterns.

What does all this have to do with risk communication? Marketing research is relevant to the evaluation of risk communication because their essential purposes are the same.
Both aim to plan well integrated programs, to assess progress in interventions, to measure intended and unintended effects, to track and engage in surveillance for making program adjustments, and to respond to market place change. Also, both marketing managers and health communicators must develop long-term and annual plans and must measure performance regularly to assess effectiveness in reaching objectives and effi- ciency regarding budget expenditures. The necessary ingredient for all this is relevant, quality information, regularly available in a form that is useful for management decisionmaking. Not all data gathering is expensive, although some information can be quite costly to collect. Robert Waterman, a co-author of In Search of Excellence, was asked how the companies he studies anticipate marketplace change. His answer was deceptively simple: They have close ties with their customers. Waterman explained that successful companies have a wealth of "listening devices" to keep tabs on consumers and suppliers, as well as on the competition. This concept of keeping tabs is directly applicable to health communications research. Numerous listening devices can be put in place to track at-risk consumers and other targets. Some of these devices can be quite inexpensive, and no one or two would be sufficient; but collectively they can be an affordable, effective means for tracking change. Years ago, in the Stanford Program's original community studies, investigators employed what they called informal "snoops" to assess what was going on in their test communities. Research on Social Issues While marketing and its research techniques have much to offer health commu- nicators, corporate America appears to have a particular blind spot. This blind spot presents a need and an opportunity for public health professionals to lead the way. As expert as they may be in studying the marketplace, corporate marketers seem to be content to do little or no research on social issues as long as their current marketing strategies appear to be working well. As a result, the corporate marketing research system all too often is insensitive to unfolding social needs. Two New York University marketing professors, Larry Rosenberg and Robert Shoemaker, expressed some thoughtful views on this problem some years ago in the Sloan Management Review (Rosenberg and Shoemaker, 1980). There appear to be a variety of reasons for myopia when it comes to emerging social issues. One is the widespread corporate practice, especially at the product manager level, to rotate managers every two to three years. Under these conditions, managers feel compelled to fund research that will be completed quickly, at the lowest possible cost, with ------- Marketing Research and Risk Evaluation 43 direct impact on short-term sales. Their brief tenure reduces any incentive to be innovative in conducting marketing research. A second reason is that, in most cases, managers are judged by annual sales, profit, and return on investment. Actually, in today's marketing world, "annual" means "long- term," and quarterly assessments are typical. Senior level managers tend not to reward social issue concerns, since they may not be related to immediate profit and loss. Third, higher-level executives also may be loath to study social issues if the findings may adversely affect their companies. 
For instance, Rosenberg and Shoemaker cite the case of senior managers of a major cosmetics manufacturer who did not research consumer attitudes on ingredient labeling because they anticipated some level of demand for improved labeling. Fourth, large companies usually market a broad product line. Problems involving potentially deceptive advertising, harmful ingredients, or inadequate safety warnings may involve many products, in several divisions. The average manager may ignore the issues, seeing them as larger, company- wide problems beyond his or her control. Fifth, the cost of research is an important consideration in corporate marketing, and this, too, contributes to a lack of surveillance of social needs. For example, convenience samples may be used, which cut costs and speed results, but which may not uncover social discontent. Finally, high- and low-socioeconomic segments often are inadequately represented in corporate research. This means that the early warnings of a social issue may be underestimated or missed completely. On one hand, studies show that consumers who complain about company practices and products tend to be atypical and often in higher socioeconomic strata. These types are often underrepresented in marketing research samples. On the other hand, disadvantaged, inner city people also are usually under- represented in marketing studies. Much marketing research is done in suburban locations, often by interviewing people in malls or through mail surveys with sizeable non-response levels. Such methods can limit detection of social issues. For all these reasons, studies of social issues are given low priority and often are the first to be dropped when research funds are tight. Thus, an awareness of health and safety issues usually comes to corporate managers from sources other than their own—consumer advocates, the media, social researchers, and government officials. Yet social issues, such as health and safety risks, may be of importance to the company. Awareness of such risks among corporate executives is a necessary first step in their understanding of what they must do to be good corporate citizens as well as to protect their business from social and political pressures. To summarize, marketing research has a great deal to offer public health practitio- ners in planning, development, implementation, and assessment of programs. In turn, public health professionals can use similar research techniques to sensitize, inform, and educate American industry. This can help begin and accelerate the process of corporate change in areas related to health and safety, before problems reach high-risk proportions. ------- 44 Marketing Research and Risk Evaluation REFERENCES Rosenberg, L.J., and R.W. Shoemaker. 1980. SMR Forum: Is Marketing Research Sensitive to Social Issues? Sloan Management Review. Winter. Rossi, P.H., and R. A. Berk. 1988. A Guide to Evaluation Research Theory and Practice. Paper prepared for the Workshop. ------- Evaluating Risk Communication Programs1 A Catalogue of "Quick and Easy" Feedback Methods Mark Kline, Caron Chess, and Peter M. Sandman Agencies that deal with environmental health issues are paying greater attention to how they can communicate with the public more effectively. There is also an increasing body of literature directed to agency practitioners, suggesting how risk communication principles might be translated meaningfully into reality. As these principles are integrated into practice, agencies should also be evaluating their efforts. 
Communication efforts, like technical ones, can improve with feedback. The lack of such feedback may lead the agency to repeat the same communication mistakes and fail to duplicate successes. Unfortunately, it may be difficult for agencies to identify evaluation strategies that are practical, useful, and affordable. The term "evaluation" has multiple meanings, including making critical judgments about the worth of a program. Therefore, evaluation activities may seem threatening to agencies already immersed in "crisis" communication efforts, usually with limited resources. In addition, some forms of evaluation may seem too elaborate and difficult to implement in this context.

The goal of this catalogue, which was funded by a contract from the Division of Science and Research of the New Jersey Department of Environmental Protection, is to identify and recommend specific evaluation methodologies with the greatest potential for agency use in small-scale communication efforts where a full-scale evaluation may not be feasible. These tools are also likely to have application in risk communication efforts by industry and advocacy groups.

1 Submitted to the Division of Science and Research, New Jersey Department of Environmental Protection, September 22, 1989, by the Environmental Communication Research Program, New Jersey Agricultural Experiment Station, Cook College, Rutgers University, 122 Ryders Lane, New Brunswick, New Jersey 08903; this paper summarizes the full report.

Strengths and Limitations of Quick and Easy Evaluation

In its most general sense, the term "evaluation" refers to a process of interpreting and judging events, a process that human beings engage in much of the time. Evaluation ranges along a continuum, from informal, subjective impressions at one end, to formal, scientifically conducted and controlled evaluation research at the other (Rossi and Berk, 1988). In the middle of this continuum are assessment and feedback methods that are more structured and systematic than subjective impressions, but less rigorous than evaluation research. Because these intermediate methods require much less time, resources, and expertise than evaluation research, we call them "quick and easy" methods.

In our view, when most people think of evaluation they tend to think of approaches that give an overall assessment of a program's worth. Such approaches, including "summative evaluation" (Rossi and Berk, 1988) and "impact evaluation," lie at one end of the previously mentioned continuum. Many programs go without any evaluation whatsoever because impact evaluation is seen as the only form of evaluation and these efforts are beyond agency capabilities and resources. Practitioners may be left with only their own impressions of how they fared in a communication effort, with no basis beyond intuition and guesswork for correcting communication errors and repeating communication successes.

Evaluation experts have generally accepted this state of affairs because of their conviction that data from poorly designed evaluation research studies can be misleading. Rossi (1988) has noted that a bad evaluation can be worse than not doing one at all. Proponents of rigor have seen less rigorous research badly abused, leading them to conclude that agencies are better off knowing nothing than obtaining questionable feedback. We believe that partial feedback can be better than none at all if the strengths and limitations of this feedback are fully understood.
Agencies should not, for example, rely on feedback from "quick and easy" approaches for impact evaluation. Drawing reliable causal inferences about the effects of a communication effort requires scientific evaluation research. This catalogue focuses on approaches that we feel are useful when practitioners face limitations on time, expertise, and other resources. These approaches can be practical for less resource- intensive communication efforts, where impact evaluation is not appropriate or possible. In lieu of formal impact evaluation, agencies can rely on feedback from quick and easy approaches to guide the development of their risk communication programs. This is called "process evaluation," and it examines the ongoing processes and procedures of a risk communication effort. "Formative evaluation" techniques, which assess the strengths and weaknesses of materials before full implementation of a program, can also be adapted to suit less resource-intensive communication efforts. Some techniques used in "outcome evaluation," which explores the reactions of audiences after a phase of a communication effort, can also be adapted for quick and easy use. Since the use of "quick and easy" methods generates feedback which is more systematic and disciplined than that found in typical practice, the use of these methods creates programs that may be ultimately more amenable to rigorous impact evaluation, should resources become available. ------- Evaluating Risk Communication Programs 47 If "quick and easy" approaches are viewed as a means of obtaining a snapshot— rather than a full picture—they can provide useful input to agency risk communication efforts. Practitioners can use quick and easy strategies to gather some information that will inform their practice in the absence of a full study. In particular, quick and easy strategies can yield information that can lead to mid-course corrections and bring new ideas into the process. This feedback can be even more critical to agency efforts than retrospective analyses. (It may be ultimately more useful for practitioners to know they are about to light communications fires than to evaluate their firefighting efforts.) Information gathering of this type is common in the public relations field, where it is viewed as "developmental" input for generating hypotheses rather than as conclusive data that are reliable and generalizable. Feedback can be viewed as an opportunity to turn bad news into good. Agencies can use feedback suggesting that a program is off-course to put the program back on track. Even scathingly negative remarks can be fodder for making a program more effective. When viewing feedback as information to succeed rather than as justification, superficial praise about a meeting or brochure may be less useful than critical remarks that include suggestions for change. The latter provide the agency an opportunity for improving its materials and the added benefit of being responsive to the public. Agencies should not abandon rigor entirely when gathering information. Quick and easy methods can be more valuable if agencies attempt to be as rigorous as possible within the constraints of their resources. For example, keep in mind basic principles of objective data gathering, carefully defining target groups, choosing representatives typical of the target groups, and asking questions in a consistent and unbiased manner. More rigorous methods increase the strength of conclusions that can be drawn from feedback. 
Awareness of the need for rigor can also allow agencies to refrain from drawing sweeping and misleading conclusions from developmental feedback.

Barriers to the Use of Quick and Easy Evaluation

We believe these strategies can help communicators develop and maintain an open channel to those outside the agency. However, even the best feedback is of little value if it is not heeded. Audiences may already be skeptical about whether agencies will use their input and respond to their needs. If practitioners gather evaluative feedback, they must be open to using it. Furthermore, they should be prepared to assess how the feedback was used—what role it played in the decision that was ultimately made—and also to demonstrate any positive effects to the public. Agencies, in short, should be accountable not only for getting input from the public, but also for using it and showing that they used it. If audiences sense that their time and effort have gone to waste, they may be even more disenchanted with agencies than they would have been if no feedback had been solicited.

Agencies that operate as closed systems may have little organizational investment in this kind of feedback. In such an agency, decisions are made on the basis of an internal process. Staff are accountable to their supervisors, who are in turn accountable to higher-ups. Communication efforts may be designed to take into account this internal input and keep things running smoothly. Staff who attempt to bring in new ideas based on public input may not be supported. Agencies of this kind may attempt to lend an occasional ear, pass out an occasional survey, and make an occasional telephone call in an effort to solicit public input, but the system's incentives make it unlikely that such input will be used constructively. Even the best evaluation tool can be subverted by this sort of agency process.

For quick and easy tools to function well in maintaining an open channel, they must be supported by agency management and policy. Without this support, front-line practitioners may gather information only to have it ultimately ignored, leaving them with an even more irritated public than in the first place. Part of quick and easy evaluation involves agency management encouraging staff to be creative in opening the channel with the public—even when what emerges from the channel is critical of the agency staff members conducting the communication program. Agencies, therefore, must be prepared to turn bad news into good. Critical feedback provides an opportunity to improve a communication effort and a chance to be responsive. Agencies that are not willing to make mid-course corrections in response to feedback from the public will have little use for these tools.

Agencies may be tempted to use quick and easy strategies to justify what they did rather than to find out what they can do differently. Aside from being a tedious exercise, using these tools in this way defeats their very purpose—to introduce new ideas and feedback through an open channel. Risk communication and quick and easy evaluation are both value-laden processes. The values and climate of an agency can have great impact on whether these tools help open the door to the public or help keep it shut. We have attempted to identify tools that support commonly accepted risk communication principles, hopeful that agencies will use them in the spirit of an open, ongoing dialogue with the public.
Development of This Catalogue This investigation took the form of a scavenger hunt. Through telephone and personal interviews, literature reviews, networking, and a computer database literature search, we attempted to identify feedback approaches that we could recommend for agency practice. We looked for techniques that: • Are easy to use • Can be implemented inexpensively • Yield results quickly • Are relatively non-threatening to both the audience and the agency • Give feedback which translates to behavioral change • Reinforce commonly accepted risk communication principles Our search was intensive but by no means exhaustive. We talked to a large group of people, including risk communication practitioners, those with evaluation experience, consultants, public relations specialists, industry practitioners, and academics. We looked into their suggestions and reviewed literature they recommended in addition to literature we were unearthing. From this rich mix of sources, we identified the evaluation methods and instruments reviewed in this catalogue. We recognize that we may have missed some instruments, though our networking efforts did yield confirmation of many of the tools we describe from a variety of different ------- Evaluating Risk Communication Programs 49 sources. This catalogue is not intended to be the final word on quick and easy evaluation strategies. We encourage agencies to continue to look for and develop tools for this kind of feedback. How to Use This Catalogue Our review of quick and easy evaluation methods is not in the form of a quick and easy evaluation manual. After agencies have some experience with the instruments we recommend, development of a step-by-step guide may well be appropriate. We assume this catalogue will be of most interest to those who have a fair amount of commitment to and expertise in risk communication. We hope they will use the catalogue as a resource for assisting policy-makers and technical staff with evaluation. Nonetheless, we recognize that most agency staff may not have the time to read a full review of each tool before deciding which one will be useful to their risk communication efforts. The following summaries of twenty-two tools give a brief overview of each. Readers can use these summaries to decide which tools might prove useful to their communication effort. However, readers will want to review the detailed reports about instruments that interest them in order to get more in-depth information. (See the full report, as listed on page 45.) These reports include a) detailed descriptions, including examples of how the instruments have been used; b) discussion of strengths and limitations; and c) how to order the instruments. ------- 50 Evaluating Risk Communication Programs OVERVIEW OF EVALUATION METHODS I. Planning The key to effective risk communication is effective planning. Just as scientific research without planning can slow down an assessment due to the need to rethink and resample, it is ultimately more wasteful and time consuming to develop a brochure or presentation without planning. It is quite difficult, if not impossible, to evaluate a risk communication effort unless you have planned a program so that you know what you want to achieve and how you are going to achieve it. Because planning is so critical we have developed a separate document on planning entitled, "Improving Dialogue with Communities: A Risk Communication Workbook" (Hance et al., 1988). 
This workbook, available in 1989 from NJDEP's Division of Science and Research or the Rutgers Environmental Communication Research Program, includes checklists and worksheets to help those with little communication background to identify communication goals, audiences, audience concerns, methods of reaching people, key content points, and other components of successful planning. Our research for this evaluation catalogue did locate some comprehensive planning systems (Green, 1980; National Cancer Institute, 1989) that could have application in risk communication efforts, but they are not "quick and easy" tools appropriate for this catalogue. Other planning tools we located needed significant modification to be useful in agency settings.

2. Audience Analysis

One of the keys to successful communication is understanding your audiences in advance. Agencies need to identify the audiences involved in their communication efforts and get a sense of what groups already know, what they need and want to know, and what they expect from the agency. Audience analysis tools provide a means for practitioners to clarify their perceptions of audiences in organized ways or to solicit feedback from key audiences before, during, and after a communication program. Such feedback can help practitioners maintain an open channel between the audience and the agency throughout the communication effort. These strategies are common in public relations and advertising practice, where ongoing feedback from an audience is important to respond to changes rapidly.

2A. Conceptual/Organizing Techniques

These techniques do not involve any data collection from audiences. Rather, they are frameworks to help communicators systematically organize and analyze their impressions about different types of audiences.

2A-1. Policy Profiling Questionnaire

Purpose: To identify stakeholders in an issue and organize agency perceptions of them.
Lead Time: Low
Staff Time: Brief—might include a meeting of involved staff.
Budget: Low

This tool helps agencies assess their perception of the potential impact that important actors can have on a decision or course of action. Agency staff identify stakeholders and numerically rate each of them in three categories: issue position, power, and salience. These ratings allow a calculation to determine whether the stakeholder might oppose, support, or be neutral toward a decision. This tool guides the agency's internal assessment of relevant stakeholders and involves no formal data collection. It is a means for organizing and comparing perceptions of stakeholders to anticipate reactions to a decision or issue. However, the ratings are based solely on the perceptions of agency staff and are only as valuable as those perceptions. (A brief illustrative tally appears at the end of this subsection.)

2A-2. Audience Analysis Matrices

Purpose: To identify relevant audiences and organize agency perceptions of their reactions, involvement, or position in a communication effort.
Lead Time: Low
Staff Time: Brief
Budget: Low

Matrices are developed which identify relevant audiences and cross-reference the audience with another important variable—such as issue position, anticipated reactions, or issue importance. These matrices allow a graphic representation of groups in a communication effort while also encouraging greater awareness of the specific audiences and their qualities. These matrices are based only on the perceptions of agency staff—they involve no data collection. The instrument may be limited by the degree of knowledge, intuition, and sensitivity present within the agency.
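To make the internal tally described in item 2A-1 concrete, the short sketch below scores a few hypothetical stakeholders on issue position, power, and salience. The catalogue does not give the instrument's actual scoring rule, so the combination used here (position weighted by the average of power and salience), the stakeholder names, and the threshold are illustrative assumptions only, not a description of the Policy Profiling Questionnaire itself.

    # Illustrative only: a simple stakeholder tally in the spirit of item 2A-1.
    # The scoring rule is an assumption for demonstration, not the actual
    # Policy Profiling Questionnaire formula.

    def classify(position, power, salience, threshold=2.0):
        """Turn staff ratings into a rough oppose/neutral/support call.

        position: -5 (strongly opposed) to +5 (strongly supportive)
        power, salience: 1 (low) to 5 (high)
        """
        score = position * (power + salience) / 2.0
        if score > threshold:
            return "support"
        if score < -threshold:
            return "oppose"
        return "neutral"

    # Hypothetical ratings assigned by agency staff during a planning meeting.
    stakeholders = {
        "Local homeowners' association": (-3, 4, 5),
        "County health officer": (4, 3, 4),
        "Plant management": (2, 5, 2),
    }

    for name, (position, power, salience) in stakeholders.items():
        print(name, "- likely to", classify(position, power, salience))

The same kind of table, with stakeholders as rows and a second variable such as anticipated reaction as the columns, is essentially what the audience analysis matrices of item 2A-2 record on paper.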
2B. Preliminary Audience Feedback

These techniques involve collecting information about an audience in advance of communicating, to help anticipate the audience's needs and interests.

2B-1. Audience Information Needs Assessment

Purpose: To gather questions from relevant audiences in advance of public meetings so a response can be organized and presented.
Lead Time: Moderate to high—requires a number of weeks to mail out the inquiry, receive responses, and organize the information. Lead time may be decreased if telephone contacts are used instead of a mailed inquiry.
Staff Time: Moderate
Budget: Low to moderate

Questions from an audience are gathered in advance of a public meeting so agency staff can develop a meaningful response. The agency response may involve both written and verbal answers to the questions. This approach, which helps agencies meet community needs, establishes a precedent of listening to the audience and responding to its concerns. However, it may require too much lead time for a crisis situation, and the answers generated in advance may still meet with disagreement and dissatisfaction from the audience.

2B-2. Analysis of News Clippings

Purpose: To identify audiences and their concerns. To develop some historical knowledge of a community to help in planning future phases of a communication effort.
Lead Time: Variable, depending on how far back in time the analysis goes.
Staff Time: Variable, depending on the extensiveness of the review.
Budget: Low

Background information about ongoing issues is obtained by locating appropriate newspapers and clipping articles relevant to the issue in question. The clippings can be analyzed for a variety of factors, including perceptions of prior agency behavior, public concerns, principal actors, key events, and community mood. While a useful source of input and background information, news clippings may reflect media biases, journalistic sensationalizing, and the inaccuracies of the rush of daily reporting.

2B-3. Public Opinion Polling

Purpose: To assess audience opinion or reaction; to find out what people see as important problems, what issues and events they are aware of, and how they evaluate social and political institutions.
Lead Time: Moderate, depending on how formal a poll is required.
Staff Time: Moderate
Budget: Moderate to high—may involve contracting with a polling firm to obtain useful results. A low estimate for a very brief formal poll with a relatively small sample is about $2000. Informal telephone surveys may require fewer resources.

Polling can give agencies a sense of public attitudes and perceptions so the agency can better target its communications. Carefully constructed polls can help prevent surprises and provide a baseline for the later evaluation of the communication effort. Agencies may hire firms to design and conduct polls on specific issues. These polls benefit from careful development of the polling questionnaire and random sampling to increase the reliability of the data. They may also be quite expensive. Informal telephone surveys involve briefer questionnaires and smaller samples. Informal surveys may be more practical and less expensive, but also less reliable. Polls and surveys tend to consist of
Polls and surveys tend to consist of ------- Evaluating Risk Communication Programs 53 closed-ended questions that limit the richness of the data and can fail to convey the complexity of public perception. 2B-4. Public Opinion Polling/Pollstart Purpose: To organize and analyze polling data on personal computers available within agencies. Lead Time: Moderate to high, depending on extensiveness of the poll, expertise in polling design available, and knowledge of personal computers. Staff Time: Moderate—depends on previous expertise and skills. Budget: Moderate. Pollstart software costs $98.00; Public OpinionPolling.abook that guides useof the software, costs $19.95. Pollstart is apiece of computer software which allows agency staff to tabulate and analyze polling data on a typical office personal computer. The manual for Pollstart provides step-by-step guidance on how to encode the data within computer files and how to generate "frequency reports" and "cross-tabulations." Public Opinion Polling provides useful background on polling and a useful outline of the steps in planning and developing a poll. The book was written as a companion volume for the software. While this system provides an excellent review of polling issues, it does not make the reader a survey design expert, and less experienced readers may still have difficulty designing appropriate surveys. The software is also not capable of doing more complex data analysis. 25-5. Qualitative Questionnaires Purpose: To collect information from people whom agencies have involved in a communication effort. Lead Time: Low to high, depending on the complexity of the questionnaire and the time needed to develop it. May also require at least two weeks to receive responses to mailed questionnaires. Staff Time: Low to moderate—depends complexity of feedback to be tallied. Budget: Low to moderate Questionnaires are developed, usually in-house, to assess audience positions on issues or responses to agency process. Because they may involve a small sample, the feedback may not be statistically accurate or generalizable. These questionnaires can still provide early input about specific directions an agency might take, or reasonably rapid assessment of audience reactions. Questionnaire development, distribution, and tallying can take considerable effort. 3. Message Pretesting Agencies can obtain useful feedback on written materials by having them reviewed (pretested) in advance of production and distribution. This input can significantly ------- 54 Evaluating Risk Communication Programs improve materials so they are more easily understood and communicate the intended message more effectively. Message pretesting may involve surveys and questionnaires, discussion groups, and/or reviews of the language used in a document. Agencies can assess whether the document is too complicated for the intended audience, the amount of jargon, and other aspects of the writing style. We found the work of the National Cancer Institute (1984,1989) to be of great value in exploring and assessing these techniques. 3 A. Brief Approaches These techniques give feedback in a short amount of time. 3A-L Rightwriter Purpose: To review documents written on computer word- processing programs for errors in grammar, style, usage, and punctuation. Lead Time: Low Staff Time: Low Budget: Rightwriter software currently costs $95.00. 
3. Message Pretesting

Agencies can obtain useful feedback on written materials by having them reviewed (pretested) in advance of production and distribution. This input can significantly improve materials so they are more easily understood and communicate the intended message more effectively. Message pretesting may involve surveys and questionnaires, discussion groups, and/or reviews of the language used in a document. Agencies can assess whether the document is too complicated for the intended audience, the amount of jargon, and other aspects of the writing style. We found the work of the National Cancer Institute (1984, 1989) to be of great value in exploring and assessing these techniques.

3A. Brief Approaches

These techniques give feedback in a short amount of time.

3A-1. Rightwriter

Purpose: To review documents written on computer word-processing programs for errors in grammar, style, usage, and punctuation.
Lead Time: Low
Staff Time: Low
Budget: Rightwriter software currently costs $95.00.

Rightwriter reviews documents on computer and creates a "mark-up" copy, including feedback on grammar, style, usage, and punctuation in the text, as well as a summary of the analysis. This summary includes a readability quotient, a strength index, a descriptive index, a jargon index, and a sentence structure analysis. The summary also includes a list of words which readers might find difficult to understand. The program is easy to use and quite rapid. While it can provide a useful feedback mechanism for written materials, Rightwriter does not "understand" the content of the text and can give no feedback about tone or appropriateness. In addition, some Rightwriter feedback may be confusing, difficult to understand, or irrelevant.

3A-2. SMOG Readability Grading Formula

Purpose: To evaluate the level of reading comprehension a person must have to be able to understand a piece of written material.
Lead Time: Low
Staff Time: Low
Budget: Low

This approach involves reviewing a sample of text from a written piece and performing some simple mathematical calculations to obtain a SMOG grade, which represents the reading grade level a person must have reached in order to understand the text. The higher the grade level, the more sophistication is necessary to understand the material. Assessment of readability, along with a knowledge of the target audience's level of sophistication, can allow agency staff to produce materials that will be more accessible to their audiences. Readability quotients are useful as a "first cut" in reviewing drafts of materials for the public, but they give no feedback on style, format, tone, or content. In addition, frequent use of long terms that may be necessary in scientific reports may inflate the SMOG grade. (An illustrative calculation appears at the end of this subsection.)

3A-3. Signaled Stopping Technique

Purpose: To examine how readers process information as they read written materials and through this procedure to get feedback on those materials.
Lead Time: Low
Staff Time: Low
Budget: Low

In this approach, respondents read through a document and put slash marks where they stop. They are then provided with a coding scheme to notate why they stopped at each slash. These reasons for stopping provide feedback to the writer. Respondents may stop due to being confused, needing to re-read, having a question, wanting to think about the idea, or agreeing or disagreeing with the writer. This technique can help writers recognize confusing or controversial statements within a piece of text and consider revisions, but its value may be diminished if the reader is unmotivated or uninterested.
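As a concrete illustration of the arithmetic behind item 3A-2, the sketch below applies the commonly published SMOG procedure: count the words of three or more syllables in a sample of roughly thirty sentences, take the square root of that count, and add three. The syllable counter is a crude vowel-group heuristic, so the whole function should be read as an approximation for screening drafts, not as a validated readability instrument.

    # Rough sketch of the SMOG calculation described in item 3A-2.
    # The syllable count below is a simple heuristic, good enough to show the
    # idea but not a substitute for the published counting procedure.
    import math
    import re

    def count_syllables(word):
        """Approximate syllables as runs of vowels (a crude heuristic)."""
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def smog_grade(text):
        """Estimate the SMOG grade for a sample of about thirty sentences.

        Formula assumed here: 3 plus the square root of the number of words
        with three or more syllables in the thirty-sentence sample.
        """
        words = re.findall(r"[A-Za-z]+", text)
        polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
        return 3 + math.sqrt(polysyllables)

    sample = ("Residents should ventilate basements regularly. "
              "Elevated radon concentrations warrant professional mitigation. ") * 15
    print("Estimated SMOG grade: %.1f" % smog_grade(sample))

A draft that scores several grade levels above the intended audience's reading level is a candidate for shorter words and sentences before any further pretesting.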
3B. More Extensive Feedback Methods

These methods give richer feedback but also take more time to administer.

3B-1. Self-administered Pretest Questionnaires

Purpose: To get feedback on pretest materials.
Lead Time: Moderate—allow at least two weeks if the questionnaire is mailed.
Staff Time: Moderate
Budget: Low to moderate

Questionnaires about written material are developed to elicit both quantitative and qualitative feedback from readers representative of the intended audience. The questionnaire may include questions about format, comprehension, reaction, interest in the materials, and any other relevant opinions. Questionnaires may include open-ended or closed-ended questions, depending on the items being pretested and the type of feedback desired. The approach may be limited by low response rates to mailed questionnaires and the amount of follow-up time needed to ensure a meaningful response.

3B-2. Central Location Intercept Interviews

Purpose: To get feedback on pretest materials or to examine an audience's attitudes and opinions.
Lead Time: Moderate
Staff Time: Moderate to high
Budget: Low to moderate

Interviewers are stationed at a place frequented by a target audience. They recruit participants who review materials and then respond to a series of multiple-choice or closed-ended questions. The structured interviews provide feedback that can be summarized quantitatively. Careful planning when using this approach can increase the reliability and generalizability of the data, but central location interviews typically reflect a non-random sample weighted in favor of those who are able to get to the particular site. In addition, the necessity of using closed-ended questions may deprive the agency of richer feedback from a more extended discussion.

3B-3. Theater Testing

Purpose: To get feedback on visually presented pretest materials.
Lead Time: Moderate
Staff Time: Moderate
Budget: Moderate to high

Films, public-service announcements, slide shows, or other audio-visual materials are observed by a group of respondents in a theater or auditorium. After watching the film, participants fill out a pretest questionnaire to provide the agency with feedback. While very useful to improve visually presented messages, this approach may require a great deal of time and logistical arrangements, in addition to design of the message itself and the questionnaire.

3B-4. Focus Groups

Purpose: To get feedback on and generate ideas about pretest items. To get a "feel" for the attitudes and beliefs of a target audience.
Lead Time: Moderate to high
Staff Time: Moderate
Budget: Moderate to high

A focus group is a discussion session run by a trained moderator. It may include six to twelve participants, who discuss pretest materials or issues of importance to a communication effort. Areas covered in a focus group discussion are outlined in the moderator's guide, which is developed before the session. Focus group discussions generally yield qualitative feedback as summarized in a report by the moderator. These reports can give an in-depth sense of participants' language, their reactions to the materials, and suggestions for improvement. Formal focus groups require careful planning and moderation and may therefore be too resource-intensive for the average agency. "Target audience meetings," involving brief informal discussions with a neutral moderator, a group typical of the target audience, an agenda planned in advance, and some procedure for note-taking, can be useful and less expensive.

4. Assessment of Communicator Style

Although agency staff may traditionally focus on "facts" as opposed to relationships, conflict in styles can lead to tremendous frustration as well as impasses in a given communication. Armed with the facts alone, practitioners may be doomed to skirmish with audiences whose very style of perceiving the world and communicating about it differs from theirs. Tools in this category can help communicators examine what they bring to the communication process. Most of these tools are self-assessment surveys that are completed and then scored, providing a profile of the respondent's style, type, and/or motivational pattern.
This profile provides a model for understanding communication situations, which in turn can help practitioners gain flexibility within their own style, recognize their strengths and limitations, identify the communication styles of people in their audiences, and recognize and deal with communication impasses resulting from a clash in styles. 4-1. Myers-Briggs Type Indicator Purpose: To provide feedback on the communication styles of agency staff. Lead Time: Moderate to lengthy, due to time needed to secure services of consultant. Staff Time: Low Budget: Moderate The Myers-Briggs Type Indicator (MBTI) is a self-report inventory consisting of 126 questions. It provides feedback on respondents' communication styles in terms of four scales: Extraversion-Introversion, Sensing-Intuition, Thinking-Feeling, and Judging- Perceiving. The profiles generated in terms of these four scales include feedback about communication strengths and weaknesses. Communicators can become aware of their own strengths and weaknesses while learning to recognize differing communication styles in their audiences. The MBTI model has been used in consultation with risk communi- cators and has helped foster flexibility in communication style. However, the psychological theory of type underlying the tool may not fully capture the diversity of personality styles, and the feedback from this tool is of limited value without a consultation to set it in context. 4-2. Strength Deployment Inventory Purpose: To identify the strengths of agency staff and suggest ways these strengths can be used to communicate more productively with others. Lead Time: Moderate to lengthy, due to time needed to secure services of contractor. Staff Time: Low Budget: Moderate. Each Inventory form costs $3.45; con- sultation is additional. The Strength Deployment Inventory (SDI) consists of twenty questions, some of which refer to situations where things are going well, and some of which refer to situations where things are going wrong. The SDI is self-scoring, and respondents identify whether ------- 58 Evaluating Risk Communication Programs they are characterized by any of seven style patterns, each of which implies different strengths, weaknesses, and motivations which may be reflected in interpersonal com- munication. The inventory is easy to complete and provides quick feedback about an individual's style. The SDI model is one way of understanding differences in personal styles and their impact on communication. A consultation should accompany the tool for maximum benefit. 4-3. Conflict Management Survey Purpose: To provide feedback about a respondent's approach to conflict. Lead Time: Moderate to lengthy, due to time needed to secure services of consultant. Staff Time: Low Budget: Moderate. Each survey form costs $5.60 and con- sultation is additional. The Conflict Management Survey presents scenarios in each of the following areas: personal views of conflict, interpersonal conflicts, the handling of conflict in task groups, and conflict in relationships among groups. Respondents note how they would respond to each conflict scenario, and after a self-scoring exercise, a style preference is determined, which represents the respondent's preferred mode of managing conflict. Through consultation, respondents become able to understand the implications of their style preference and develop the flexibility to use other styles if situations dictate this. Feedback from this tool may seem threatening if not accompanied by a good consultation. 4-4. 
Communication Style Survey

Purpose: To provide feedback on the respondent's style of interpersonal communication.
Lead Time: Moderate to lengthy—surveys need to be mailed to Chicago for scoring, and a consultation should be arranged.
Staff Time: Low
Budget: Moderate—standard fee of $140 per person, which is negotiable.

The Communication Style Survey consists of a self-assessment form and "other-assessment" forms to be filled out by people who know the respondent well. The survey involves choosing, from a set of words, the term that most aptly describes the respondent. The data are processed to yield an assessment of communication style as some combination of Analyzing, Facilitating, Advocating, and Controlling. This Style Profile is accompanied by feedback on the respondent's oral communication competency and adaptability. Consultation is needed to help respondents understand the strengths and weaknesses of each communication style and develop flexibility.

5. Outcome Assessment

Agencies typically view evaluation as a means of finding out whether what they did worked or not. As suggested earlier, carefully designed scientific evaluation research is required to draw these kinds of conclusions. When agencies have little time and few resources, however, they may still need to find out how audiences have reacted to phases of the communication effort and to the effort as a whole. The outcome tools we recommend provide strategies for getting feedback on audience reaction and communicator performance.

5A. Audience Reaction

Audiences are asked what their reaction is to a presentation.

5A-1. Meeting Reaction Form

Purpose: To get feedback about participants' reactions to a public meeting.
Lead Time: Low to moderate, depending on whether the form developed by the Environmental Communication Research Program needs modification for specific agency use.
Staff Time: Moderate—includes preparation of the form, distribution, and data analysis.
Budget: Low

The Environmental Communication Research Program has developed a form for distribution at public meetings which examines whether information was understood, whether presenters were perceived as honest, whether people felt their concerns and issues were understood, whether people felt their input would be used in decision-making, etc. Other relevant issues can also be addressed. The particular form described in this catalogue was designed to get feedback from various constituencies involved in a public participation program run by the Bureau of Water Quality Standards and Analysis (BWQSA) of the New Jersey Department of Environmental Protection. While it provides a quick, easy, and inexpensive way to get feedback about a public meeting, the form is not standardized or scientifically validated, and some feedback could be difficult to interpret.

5A-2. Verbal Meeting Feedback

Purpose: To get direct feedback from participants at a meeting.
Lead Time: Low
Staff Time: Low
Budget: Low

Time for a structured feedback discussion is planned in a meeting agenda. The meeting chairperson actively solicits and may even record this feedback on a chart for everyone to see. Participants should feel free to comment on any aspect of the meeting, and conflicting statements are allowed. The goal is to generate as many ideas as possible rather than going into detail on any one idea.
This approach is highly dependent on the skill of the chairperson in creating a comfortable environment for feedback and inviting participation. Less verbal members may not be heard, and it is difficult to know whether this kind of feedback is in any way representative of the views of the group as a whole.

5B. Performance of Presentation

These techniques provide feedback more specific to how the communicator performs than to how the audience reacts.

5B-1. Speech Evaluation Checklist

Purpose: To get feedback on how a speech or presentation went.
Lead Time: Low to moderate—depending on design of the form.
Staff Time: Low
Budget: Low

The Speech Evaluation Checklist is a simple form to get feedback on a speech or presentation. It may include statements about the physical setting of the speech, the speaker's appearance, rapport, comprehensibility, and other important areas. The forms can be completed by one or a number of evaluators who observe the speech. Alternatively, a speech can be audio- or video-taped for later scoring by the presenter. The form is not intended as a "report card," but as a chance to get some input on a speech that will improve future presentations. This approach can provide immediate, relevant written feedback, but the perceptions of other agency staff may differ markedly from the perceptions of the audience.

5B-2. Observation and Debriefing

Purpose: To get feedback on speeches and presentations.
Lead Time: Low to moderate—time needed to develop an observer checklist.
Staff Time: Low
Budget: Low

One or a number of observers attend a presentation and take organized notes, using their perceptions of the event and some kind of observer checklist based on the goals of the presentation. An informal verbal debriefing session may be held after the presentation to review important strengths and weaknesses with regard to both the speaker's performance and the audience's reactions. The presenter can also use an audiotaped or videotaped version for self-assessment. While this is a quick and easy way to provide feedback on a speech, it should not substitute for finding out the audience's actual reactions, and it can be uncomfortable for the observers or the presenter depending on their roles within the agency.

REFERENCES

Briggs, K.C. and Myers, I.B. 1976. Myers-Briggs Type Indicator, Form G. Palo Alto, CA: Consulting Psychologists Press, Inc.

Green, L.W., Kreuter, M.W., Deeds, S.G., and Partridge, K.B. 1980. Health Education Planning: A Diagnostic Approach. Palo Alto, CA: Mayfield Publishing Company.

Hance, B.J., Chess, C., and Sandman, P.M. 1988. Improving Dialogue with Communities: A Risk Communication Manual for Government. Trenton, NJ: New Jersey Department of Environmental Protection, Division of Science and Research.

National Cancer Institute. 1984. Pretesting in Health Communications: Methods, Examples, and Resources for Improving Health Messages and Materials. Washington, DC: National Institutes of Health, NIH Publication #84-1493.

National Cancer Institute. 1989. Making Health Communication Programs Work: A Planning Guide. Bethesda, MD: National Institutes of Health, NIH Publication #89-1493.

Rossi, P.H. and Berk, R.A. 1988. A Guide to Evaluation Research Theory and Practice. Discussion Draft prepared for the Workshop.
-------

COMMENTARIES ON EVALUATION ISSUES

Developing the Message

-------

Selecting Appropriate Strategies

Mildred Zeldes Solomon

This paper describes some guiding principles for the development of risk reduction messages, and, like most recommendations on the design of effective messages, these are based on research in the health promotion field. Professionals in both public health and the environment face risk communication challenges that are similar in some ways and different in others. It is hoped that environmental professionals will be stimulated to test the usefulness of the following suggestions for their own work in environmental risk reduction.

Recent reviews of the effectiveness of several different kinds of risk reduction efforts in the public health field (Wallack and Corbett, 1987; Robertson, 1983) make it clear that environmental and legislative changes are powerful forces for change and that their effects often have been greater than those of education directed to individuals in isolation. Wallack and Corbett (1987), for example, point to the effective role that legislative changes have had on cigarette consumption: bans on cigarette advertising and increases in excise taxes have reduced demand. Similarly, Grossman, Coate, and Arluck (1984) found that both raising the drinking age and increasing alcohol prices lowered alcohol consumption among youth. In the area of injury control, state mandates for the use of infant restraint seats were much more successful than education programs, even ones that included the provision of free car seats (Robertson, 1983).

Examples of effective environmental changes in public areas include such diverse measures as highway redesign to cut down on automobile crashes; handgun control; point-of-purchase access to condoms for young people who might be too embarrassed to ask the clerk for condoms stored out of sight; and safety caps for electrical outlets that pose a threat to infants and toddlers. Researchers in the United Kingdom even found that they were able to influence the suicide rate dramatically in that country by redesigning gas stoves.

Clearly, there are at least three leverage points: education aimed at persuading individuals to change their behaviors; environmental redesign (sometimes called "passive measures" because they do not require that individuals take action); and legislation. Public health experience suggests that, whenever possible, program planners should eschew methods that require people to change their behaviors in favor of passive measures. To promote water conservation, for example, it would be wiser to install self-shutting faucets in public facilities than to post signs exhorting people to use less water. However, passive measures are not always an option. We will probably always need to communicate information about new hazards and to encourage the adoption of new recommendations.

Preliminary Research

What, then, does research say about designing effective educational or persuasional programs? The most salient feature of effective programs is that they are informed by rigorous preliminary research that is used actively to design the program. Figure 1 presents the most critical questions that this research should seek to answer. They are questions that help clarify whom we should be targeting, with what messages, in what medium, at what time, and in what ways. They help us come to a better understanding of the target audience's beliefs, values, and current behaviors.
They provide information about the social and physical context in which our audience lives, and they help us predict how best to incorporate and deliver our messages in those settings. The point that preliminary research is vital to the design and implementation of risk reduction programs is so simple and straightforward that it may appear facile. But the literature is full of examples of programs that failed precisely because appropriate preliminary research was not conducted. Early efforts to encourage the use of oral birth control pills among women in one developing country, for example, resulted in women inserting the pills vaginally! During the second World War, the U.S. Armed Forces mounted expensive and essentially ineffective campaigns against venereal disease that relied on excessive appeals to fear without providing the men with clear messages about what they could do to protect themselves. And until very recently, many health promotion advocates thought it would be sufficient simply to give adolescents information about the harmful effects of drugs, without taking into account the psychosocial dimensions of drug experimentation by youth. As painful as these failures have been, they also have been useful. We now know, for example, that fear arousal is effective only when coupled with concrete recommen- dations for what the individual can do to reduce the risk. People must feel that they have it within their power to eliminate or modify the potential source of harm. Without such assurances and the knowledge and skill to implement the changes, arousing fear is likely to engender defensive reactions that lead to denial (Berkanovice, 1976; Leventhal, 1970; Mewborn and Rogers, 1979). Furthermore, many current health and environmental problems are better served by reducing fear than by elevating it. For example, without more concerted efforts to reduce community fears about the siting of waste disposal facilities, the dreadful situation of unlimited and unregulated midnight dumping will persist. ------- Selecting Appropriate Strategies 67 Figure 1 Questions to Consider in Message Design 1. Have you determined the leverage points most likely to yield the best results, given your goals and resources: education aimed at behavior change; environ- mental redesign; legislation? 2. Who exactly is your audience? How can your audience be most usefully segmented? 3. What is it your audience needs to know? 4. What do you want them to do? 5. How do you want to make them feel? 6. What relevant existing values and beliefs do they have? Which values and beliefs are ones onto which you might "piggyback" your message? 7. Which beliefs are likely to run counter to your message? While providing counterclaims, have you acknowledged and respected the group's current beliefs? 8. How can you establish that the desired behavior is normative or valued by a respected elite? 9. What role can significant others play in promoting the behavior? 10. What obstacles exist to the adoption of the target behavior? 11. How can you acknowledge the obstacles without overstating their importance? 12. What support/incentives (both social and material) can you offer to overcome obstacles? 13. In what ways can existing social networks be used to convey the message? 14. How can existing social networks be enhanced to support your goals? 15. Have you focused on underlying attitudes, behavior change, and skill develop- ment rather than disease etiology or other facts for their own sake? 16. Is the message simple? 17. 
Have you found multiple ways to deliver the message and to repeat it over time? 18. What strategies have you employed to enhance the group's perception or its susceptibility to the risk and its ability to do something about it? 19. Are you satisfied that you have found the right use of fear, combining fear arousal with concrete recommendations people can carry out to eliminate or modify the risk? 20. How will you measure success? Will you be able to make use of evaluation information to revise your message and/or your implementation strategies? ------- 68 Developing the Message Strategies And there are other lessons. We know, for example, that people are more likely to adopt a recommended change if it is a simple one-time act, such as installing a water-saving shower head, than if it is a complex behavior or one that must be repeated and maintained over time (Robertson, 1983). If the target group is misinformed or holds strong misperceptions about the risk, it is important to acknowledge and respect its beliefs even while countering them (McGuire, 1968). If the new information is not presented in relation to existing beliefs, people are more likely to rely on their current understanding, dismissing the new information as no more authoritative than their own conceptions. Creating what social psychologists call "cognitive dissonance" is another important strategy for accomplishing shifts in perception. Cognitive dissonance refers to the sense of imbalance people experience when they perceive contradictions in a set of beliefs. Program planners can create dissonance by introducing new information that clashes with current beliefs. If that information is introduced in a way that allows it to be heard—that is, if existing beliefs are not ridiculed, if popular myths are acknowledged, if the target audience feels the program planners understand its perspective- -the resulting dissonance creates a demand for re-thinking that can lead to changes not only in perception but in behavior as well. Audience segmentation, a concept derived from social marketing (Kotler, 1971), is an important characteristic of well designed educational and persuasional programs. Audience segmentation refers to the process of dividing the target audience into subgroups on the basis of their beliefs, needs, and other salient features such as age, occupation, or ethnic identity. The anti-hypertension campaign, sponsored by the National Heart Lung and Blood Institute (NHLBI) of the National Institutes of Health over the last decade, is an excellent example of successful market segmentation. At first blush, one might assume the audience for a hypertension control program to be essentially homogenous, butNHLBI developed separate messages and separate delivery mechanisms for a number of different groups. For example, they distinguished between those who had risk factors for hyperten- sion but did not know it, those who knew they had hypertension but were not in therapy, and those who were in therapy but were not complying with medical recommendations. Another way to segment one's audience is to think not only about the ultimate target group but about appropriate "intermediaries" who can provide access to the audience, lend credibility to the campaign's messages, and help create the sense that the targeted behavior is accepted by one's social group (Solomon and Dejong, 1986). Indeed, successful health promotion programs recognize that the unit of focus is not the individual but the group (Berkanovice, 1976). 
If we want to effect and sustain behavior change, we have to mobilize and influence groups of peers and significant others. Who should be mobilized and how they should be involved will depend upon the target group: only preliminary research and/or experience with the group will reveal appropriate answers. In an Hispanic community, for example, it is likely that respected authority figures will have to agree with, endorse, and/or convey messages. A campaign with similar goals but targeted to, say, adolescent runaways, might focus less on authority figures and more on the peer group.

Successful campaigns also recognize obstacles that impede the adoption of the targeted behavior, and they attempt to build in incentives for those who make the recommended change (Green et al., 1980). For example, one obstacle confronting homeowners who would like to dispose of their household hazardous wastes safely is the lack of convenient alternatives to simply putting them out with the trash. Some communities acknowledge this obstacle and help overcome it by designating a "Hazardous Waste Disposal Week" when special home pick-ups are made. A creative approach might include an incentive for participating in the program, such as a rebate on town dump fees.

The Health Belief Model

A discussion of the principles of good message design would not be complete without reference to the Health Belief Model (Janz and Becker, 1984; Strecher et al., 1986), which attempts to predict health behavior change on the basis of five key variables:
• Perceived susceptibility to the risk
• Perceived severity of the harm associated with the risk
• Effectiveness (the perception that something can be done about the risk)
• Self-efficacy (the perception that it is within the person's power to do something about the risk)
• Perceived benefits of making the change outweighing the burdens of the status quo.

Let's consider this model and some of the recommendations made above in relation to a current environmental risk: household radon levels. Let's imagine that our goal is to encourage homeowners to install radon detection devices and, if elevated levels of radon are discovered, to make the necessary corrections. According to the Health Belief Model, before people would do either of these things they would have to believe that a) they (or their house) were susceptible—they would have to feel that it was likely that radon was present, b) the effect of indoor radon pollution was serious enough to warrant their attention, c) something could be done about it, if it were discovered, d) it was within their power to do something about it, and e) the inconvenience, psychological disturbance, cost, and other burdens of detecting and correcting the radon problem were outweighed by the benefit of eliminating (or modifying) the risk.

Shrewd program planners would want to assess what the homeowners' current perceptions of these issues are. Do homeowners at risk realize that they are? Are they aware of radon? Do they think it likely that radon would be discovered where they live? Do they consider it a significant health risk? Do they believe that anything (short of moving away) can help? If the answers to these questions are "no," the risk reduction program ought to begin by raising homeowners' awareness of their susceptibility and the severity of risk, and by providing information about the effectiveness of corrective measures.
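To make the five variables concrete, the following sketch, added for illustration and not part of the original paper, encodes a homeowner's radon-related perceptions as simple ratings and suggests where a program might place its emphasis; the class and function names, the 1-to-5 scale, and the cutoff are hypothetical assumptions rather than an established instrument.

# Illustrative sketch only: the five Health Belief Model variables for the radon
# example, expressed as survey-style ratings. Names, the 1-5 scale, and the
# cutoff are hypothetical assumptions, not an established instrument.
from dataclasses import dataclass

@dataclass
class HomeownerPerceptions:
    susceptibility: int   # "radon is likely to be present in my home" (1 = disagree, 5 = agree)
    severity: int         # "indoor radon is a serious health risk"
    effectiveness: int    # "something can be done about radon if it is found"
    self_efficacy: int    # "it is within my power to test for and fix the problem"
    net_benefit: int      # "the benefits of acting outweigh the cost and inconvenience"

def suggested_emphasis(p: HomeownerPerceptions, low: int = 3) -> list:
    """Return the program emphases suggested by low-scoring beliefs."""
    emphasis = []
    if p.susceptibility < low or p.severity < low:
        emphasis.append("raise awareness of susceptibility and severity")
    if p.effectiveness < low or p.self_efficacy < low:
        emphasis.append("show that corrective measures work and are within reach")
    if p.net_benefit < low:
        emphasis.append("address burdens (cost, inconvenience) and offer incentives")
    return emphasis or ["reinforce favorable beliefs and prompt action"]

# A homeowner who is aware of radon but doubts anything can be done:
print(suggested_emphasis(HomeownerPerceptions(4, 4, 2, 2, 3)))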
But program planners should recognize that simple awareness will not lead to the desired behavior changes. Later iterations of the program must consider the obstacles (in Health Belief Model terms, the burdens) and potential incentives (in Health Belief Model terms, the benefits): Where do I get a radon detection device? How much will it cost? How hard will it be to operate? What's in it for me, if I go to all this trouble? ------- 70 Developing the Message So far, we have focused only on the personal beliefs of the homeowner, as if she or he were not part of a larger community. But we also must recognize the increased leverage we have if we conceptualize our homeowner as a social creature. Instead of relying, say, on home mailings (which might be a very legitimate component of our campaign), we may wish to consider also reaching Mr. and Mrs. Homeowner through existing social networks of importance to them, such as church groups, schools, and civic associations. In addition, we always must ask whether an education or persuasion program is really the best choice. Is it the only choice? In the case of radon control, we also ought to ask if there is any way to redesign the environment to modify the risk. Should we, for example, be urging the use of different kinds of building materials to provide greater protection? Is there any other "passive measure" available to us? What about legislative or regulatory leverage points? For example, early efforts to promote the use of smoke detectors relied on direct promotion to homeowners. But the reason that smoke detectors are commonplace today is not because promotional efforts were effective, but because state and local laws now mandate their installation by landlords and house sellers. Furthermore, house inspections of smoke detectors are now a routine part of fire departments' responsibilities. Landlords, house sellers, and firemen may be inappropriate leverage points for encouraging radon control, but is it too farfetched to consider the builder? Some day will a satisfactory radon reading be as obligatory for contractors as a satisfactory water percolation test? These are examples of the types of interventions available to risk reduction program planners. Too often, we stop short of such brainstorming, assuming that only one kind of intervention is available to us. Instead we need to ask in as open-ended a way as possible and from the very start of our work: What kinds of changes in personal behavior, environmental redesign, and in law or regulation are most likely to accomplish our goals? Careful program planning will consider all the options and proceed with the best strategy or combination of strategies. When education or persuasion is to be part of the mix, we now have the benefit of considerable experience in health promotion to help design, based on preliminary audience research, messages that work. REFERENCES Berkanovice.E. 1976. Behavioral Science and Prevention. Preventive Medicine5:92-105. Green, L.W., et al. 1980. Health Education Planning: A Diagnostic Approach. Palo Alto, CA: Mayfield Press. Grossman, M., D. Coate, and G.M. Arluck. 1984. Price Sensitivity of Alcoholic Beverages in the United States. Paper presented at Control Issues in Alcohol Abuse Prevention II: Impacting Communities Conference, 7-10 October, Charleston, South Carolina. Janz, N., and M. Becker. 1984. The Health Belief Model: A Decade Later. Health Edu- cation Quarterly 11:403-418. ------- Selecting Appropriate Strategies 71 Kotler, P., and G. Zaltman. 1971. 
Social Marketing: An Approach to Planned Social Change. Journal of Marketing 35:3-12. Leventhal, H. 1970. Findings and Theory in the Study of Fear Communications in Advances in Experimental Social Psychology: ed. L. Berowitz, Vol. 5. [Location?] Academic Press. McGuire, W.J., 1968. The Nature of Attitude and Attitude Change. In Handbook of Social Psychology, ed. G. Lindzey and E. Aronson, Vol. 3. Reading, MA: Addison Wesley. Mewborn,C.R.,andR.W. Rogers. 1979.EffectsofThreateningandReassuringComponents of Fear Appeals on Physiological and Verbal Measures of Emotion and Attitudes. Journal of Experimental Social Psychology 15:242-253. Robertson, L.S. 1983. Control Strategies: Educating and Persuading Individuals. In Injuries: Causes. Control Strategies, and Public Policy. Lexington, MA:Lexington Books. Solomon, M.Z., and W. Belong. 1986. Recent Sexually Transmitted Disease Prevention Efforts and their Implications for AIDS Health Education. Health Education Quarterly. 13(4):301-316. Strecher, V.J., et al. Spring 1986. The Role of Self-efficacy in Achieving Health Behavior Change. Health Education Quarterly 13(1):73-91. Wallack, L., and K. Corbett. Summer 1987. Alcohol, Tobacco, and Marijuana Use Among Youth: A Overview of Epidemiological, Program, and Policy Trends. Health Education Quarterly 14(2):223-49. ------- Tailoring The Message to the Audience James W. Swinehart When planning a risk communication campaign, it is useful to bear in mind that public information is only one part of comprehensive efforts to improve health and safety. Information provided through the mass media and other means can improve people's knowledge, attitudes, or skills, and thus their risk-related behavior, which also is influ- enced by laws, regulatory actions, and technology. None of these should be expected to do the job alone, but appropriate combinations should produce lower risks of various kinds and lead to reductions in morbidity and mortality. A campaign plan should be based on answers to several questions: • What audiences are we trying to reach? • How large is each of these target audiences? • How many people in each audience are already taking the actions we recom- mend? • What barriers (e.g., ignorance, fear, cost) are keeping other people from taking these actions? • What do we want to communicate? • How will it be said? • Who will say it? • What combination of media available will reach people most efficiently and effectively? • How will the results be measured? 73 ------- 74 Developing the Message Setting Objectives Assuming that a campaign is not seeking only to inform people or to remind them of something, but also to persuade them to take a particular action, the intended result should be stated in behavioral terms. Stating the desired outcome helps to sharpen the message and increases the likelihood that any evaluation will be appropriate. Of course, some outcomes are harder to produce than others; for example, starting or stopping an activity is usually harder than changing it, and taking an action repeatedly is harder than doing it only once. Knowing the level of difficulty in advance makes it possible to set more realistic expectations. Table 1 provides some examples of audience segments and of objectives, but these are necessarily somewhat abstract. The objectives for an actual campaign should be practical as well as appropriate. If a message recommends an action that costs money, can people afford it? If the action requires access to facilities, are the facilities available? 
Do people believe it will do any good? Would doing it conflict with their personal values or self-image? Would their friends oppose their doing it? Is the action painful, boring, or inconvenient? Any such constraints should be known at an early stage of planning.

Table 1
Some Categories of Health Related Behavioral Messages
1. Start doing X (a particular action with health consequences)
2. Don't start doing X (or continue not doing it)
3. Continue doing X
4. Stop doing X
5. Do more of X
6. Do less of X
7. Do X differently (in such a way as to reduce risk)
8. Do X once
9. Find out about X
10. Get someone else to (start, stop, continue, etc.)

Examples of Topics and Target Audiences by Messages

Smoking
  nonsmokers: 2, 10
  former smokers: 2, 10
  current smokers: 4, 6, 7, 9
Breast self-examination
  those who have not done it: 1
  those who have done it: 3, 9, 10
  those who do it incorrectly: 7, 9
Safety belt use
  nonusers: 1, 9
  current occasional users: 5, 9
  current consistent users: 3, 10
  families or friends of nonusers: 10
Nutrition
  people with excessive intake of sugar, sodium, saturated fat, etc.: 4, 6, 7, 9
  people with insufficient vitamin A: 1, 5, 9, 10
Alcohol/drugs
  non-abusers: 2, 10
  current abusers: 4, 6, 9
  former abusers: 2, 10
Immunizations
  parents of preschool children: 8, 9, 10
Hypertension
  people with undetected HBP: 1, 8, 9
  people with detected HBP: 1, 3, 10
  families of people with detected HBP:
Prenatal care
  pregnant women (regarding nutrition, substance abuse, physical exams, etc.): 1, 2, 4, 6, 9
Exercise
  people who already do it: 3, 5
  people who don't do it: 1, 9
  people who overdo it: 6, 10
Radon, lead paints, etc.
  people who have not checked their homes for possible contamination: 8, 10

Designating Target Audiences

The target audiences listed as examples in Table 1 were chosen because they had some obvious connection with the topics, but greater differentiation should be used in planning an actual campaign. In the case of smoking, for instance, "current smokers" could be divided into several sub-groups depending on their desire to quit, previous efforts to quit, knowledge of risks related to smoking, social support for quitting, and other factors. Regardless of the topic, answers should be sought to the following kinds of questions:
• What specific population groups are most affected by the problem?
• Are they also the ones in the best position to do something about it, or should the campaign be addressed mostly (or at least in part) to others?
• How accessible are they, and how susceptible to influence?
• What proportion of the people affected have tried previously, and unsuccessfully, to do something about the problem?
• How much do they know about it?
• How many people hold incorrect beliefs about its seriousness, its causes, or intervention methods?
• How many are afraid of it, or apathetic about it, or merely resigned?
• How much public interest is there in the problem, and what is its perceived importance in relation to other problems?
• How many people feel that the importance of the problem, coupled with the prospects for successful intervention, can justify individual or collective actions as control measures?
• How many people, in what population groups, will be receptive to information about the problem? Will they have the opportunity and the ability to influence others?
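As a minimal sketch, added for illustration and not part of the original paper, the pairings in Table 1 can be carried into campaign planning as a lookup from topic and audience segment to the recommended message categories; the dictionary names, the helper function, and the example entries are hypothetical.

# Illustrative sketch only: encoding Table 1 so each audience segment is tied to
# its recommended message categories. Structure and names are hypothetical.
MESSAGE_CATEGORIES = {
    1: "Start doing X", 2: "Don't start doing X", 3: "Continue doing X",
    4: "Stop doing X", 5: "Do more of X", 6: "Do less of X",
    7: "Do X differently", 8: "Do X once", 9: "Find out about X",
    10: "Get someone else to act",
}

CAMPAIGN_PLAN = {
    ("Smoking", "current smokers"): [4, 6, 7, 9],
    ("Smoking", "former smokers"): [2, 10],
    ("Hypertension", "people with undetected HBP"): [1, 8, 9],
    # ... remaining topic/audience pairs from Table 1
}

def messages_for(topic: str, audience: str) -> list:
    """Look up the message categories recommended for one audience segment."""
    return [MESSAGE_CATEGORIES[n] for n in CAMPAIGN_PLAN.get((topic, audience), [])]

print(messages_for("Smoking", "current smokers"))
# ['Stop doing X', 'Do less of X', 'Do X differently', 'Find out about X']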
For each target audience identified, a summary should be prepared which gives the following information: • Description of audience • Objectives • Barriers to recommended action(s) • Communications strategies/themes/appeals • Spokespersons • Media/channels/vehicles • Methods of measuring results Some of the information for this summary can be derived from three worksheets, each with a matrix showing all of the target audiences and various choices made regarding them. One worksheet should indicate the media/channels/vehicles through which the campaign will reach each audience; another should show the themes or appeals chosen as likely to influence each group; and the third should show the kinds of spokespersons thought to be effective with each group. Preparing these worksheets can be difficult and time-consuming, since it involves making several hundred decisions. Moreover, many of these decisions will have to be guesses if the needed background information (e.g., on audience beliefs, media usage) is not available from such sources as the Roper Center for Public Opinion Research. The worksheet preparation is worth the effort, however,because it imposes focus and some degree of rationality on the process of campaign planning. Designing Messages The fifteen recommendations that follow, concerning the content and style of messages, are necessarily somewhat general, because they are intended to apply to a wide ------- Tailoring the Message to the Audience 77 variety of topics—health risks related to personal habits, environmental hazards, occu- pational safety conditions, and so on. The suggestions given should be used or adapted as desired to suit particular circumstances. Be careful when using fear as an appeal. Communications aboutrisks typically arouse some amount of fear or anxiety, and this emotion may lead people to avoid or distort a message. Reactions will depend upon the situation, the audience's initial level of concern about the topic, the number and seriousness of threats posed, the perceived effectiveness of actions that can be taken, and several other factors. It is agreed generally that some amount of fear arousal makes people more likely to act, but specifying (and inducing) the optimum amount is very difficult; in some instances, it is better to offer reassurance than to emphasize danger, to allay fear rather than arouse it. Strong fear appeals seem to work best when they pose a threat to the audience's loved ones (rather than a direct personal threat), come from a highly credible source, deal with a topic that the audience knows little about, and are directed to people with relatively low income and education, high self- esteem, and low perceived vulnerability to danger. When it is necessary to emphasize risks, the audience should be in a position to act at once on the recommendations and should be given specific advice to help them do so. There are no firm rules about the choice of information to convey in campaign messages, and it is often hard to decide which points of information are most likely to lead people to take the recommended action. In some cases there is a risk that by emphasizing a point regarded as important, a message may actually result in a decrease in the number of people taking a recommended action. For example, by mentioning that a problem has its greatest impact on certain population groups, a message may lead people in other groups to feel that it does not concern them. 
Care should be taken to minimize the chance that any information points or appeals will produce a negative reaction in some people while producing a positive reaction in others. Rather than trying to use a single message for everyone, it is better to use a series of specialized messages for different audiences. Ideally, each person in your intended audience should feel that the message applies to him or her personally. This is especially true for people in high-risk categories, who may tend to deny the personal relevance of the message.

Emphasize the usefulness of the information to the person receiving it. Make the recommended actions as specific as possible and explain why or show how the action can help a member of the target audience.

Be sure the information is current and technically accurate. When feasible, have it checked independently.

As appropriate, try to identify a particular problem and offer a specific way to handle it, but don't try to convince people that this is the best or only solution. Rather, seek to convey an understanding of the problem and the reasons for taking the kind of action suggested.

When people are initially hostile to a position, or are likely to hear conflicting views, present both sides of the issue rather than only one. Doing this has two benefits: it increases credibility, and it prepares the audience to resist arguments that may be presented later by the other side.

------- 78 Developing the Message

Avoid exaggeration and moralizing, because either can make people reject the message. In general, the same applies to exhortation, although this is often a function of how the message is given. Most people are willing to accept advice but resent being told what to do.

Distinguish between established facts and guesses or assumptions. People may react negatively to an entire message if they recognize an assumption stated as a fact.

Make the language and style appropriate to the intended audience. Whenever possible, avoid using technical terms. When such terms have to be used, explain them clearly and briefly. (This is one of the areas in which showing an item to a few lay people and asking for their reaction—the simplest kind of pretest—can tell communicators whether their message is getting across.)

In general, make the tone of messages serious rather than flip or frivolous. A humorous approach can attract attention and be entertaining, but special care should be taken to ensure that the tone is consistent with the topic, the information presented, and the public image of the sponsoring agency.

Use the power of group pressure to reinforce a message. Ascertain the normative beliefs and actions of people with whom the target audience identifies, and use them as appropriate. The fact that "everybody's doing it" can prompt certain people to take a protective action that they would not have undertaken on their own.

If presenting a series of messages, let each one seek to convey very limited information. A pamphlet or other printed piece that people can read and review at their own pace can carry several points, but as a rule only one point should be made in a poster or public service announcement.

Identify the intended audience in materials whenever it is feasible to do so. This makes it more likely that the "right" people will pay attention and perceive the message as personally relevant.

Find ways to elicit the active participation of the audience, such as writing a slogan, taking notes, role-playing, voting on issues discussed, or taking a self-test.
Active involvement facilitates both learning and recall of message content.

Choosing Media, Channels, and Vehicles

Since no medium will reach everyone in an intended audience, it is important to use multiple channels or media and tailor the choices to the habits and preferences of the kinds of people the program aims to reach. Look for answers to these questions about each medium considered:
1. How do people rate the credibility of this medium versus others?
2. Will getting access to this medium be relatively hard or easy?
3. How much will it cost to produce materials for this medium?
4. How much will it cost to distribute or place materials?
5. How much staff time will be needed for production and placement?
6. Do we have, or can we get, the production capabilities required?
7. How much control will we have over the final product?
8. How much flexibility will we have about the timing of placements?
9. What tie-in possibilities exist with regard to other media?
10. How effective is this medium versus others in conveying a message that people will notice, recall, and act upon?
11. How much repetition (frequency of exposure) will our message have?
12. How efficient is this medium in reaching the kind(s) of people we are addressing in the campaign? (reach + selectivity)
13. In what context will messages appear? What other material will surround or accompany them?
14. What is the probable "set" or frame of mind of the audience when using this medium?
15. How does this medium compare with others in its ability to convey complex information?

General answers to these questions can be found in such sources as The Media Book and current textbooks on advertising planning. For answers pertaining to a particular campaign, it may be necessary to consult media planners in an advertising agency or organizations that specialize in placement of public service materials.

Any media plan should specify the particular vehicles to be used within the general categories of broadcast, print, and out-of-home kinds of media. Vehicles in each of these categories are listed below.

TV and Radio: PSAs; paid commercials; talk/interview shows; news items; news program inserts; editorials; documentaries; specials; station break tags/slides; call-in shows; entertainment programs

Newspapers, Magazines: public service ads; paid ads; feature articles; interviews; editorials; news items; cartoons; letters to editor; health/advice columns

Out-of-Home: billboards; transit cards; posters

It is also important to differentiate among specialized magazines. For example, the content and style of submissions should differ greatly across these 13 categories of magazines:
• automotive
• news
• business/financial
• Sunday
• outdoor
• shelter
• sports
• general
• men's
• women's
• national weeklies
• fashion/beauty
• special appeal (e.g., Esquire, National Geographic, New Yorker,
Psychology Today)

The same is also true of submissions to radio stations, which normally use one of these formats:
• album-oriented
• rock, jazz
• agriculture and farm
• middle of the road,
• all news
• adult contemporary
• soul and blues,
• news, weather, Afro-American information
• music, instrumental
• oldies, popular classics, nostalgia
• country and western
• public or community affairs
• classical, concert, fine arts
• religious, gospel, inspirational
• disco
• rock and roll, folk, progressive
• educational, cultural
• discussion, interview, personality
• ethnic music and topics
• hit parade
• foreign language
• variety, diversified

Important considerations in selecting channels and vehicles are not only how many members of the target audience will have the opportunity for exposure to the message, but also how many are actually likely to be exposed, will pay attention, will learn from it, and so on. Table 2 indicates the factors that help determine channels' and messages' effectiveness. Measuring these variables is an important part of program evaluation.

Summary

The goals of any campaign should be explicit and realistic. Detailed data about target audiences—their beliefs, feelings, actions, habits, perception of risks, use of mass media, and so on—should be considered when choosing the content and style of messages and the media through which they will be distributed. Use mass media in combination with interpersonal communications and efforts to obtain organizational support. Finally, use appropriate research, from pretesting of materials in the developmental stage to evaluating the results of the campaign as implemented.

-------

[Table 2, presented in the original as a figure, is not reproduced here. It shows the estimated percentage of the population meeting each condition for a given message (hypothetical data): opportunity for exposure, actual exposure, attention, motivation, recall, learning, and opportunity for action.]

-------

Focusing on the Audience

Marilyn Rice

Although the topic of this paper is developing the message, 50 percent of what risk communicators actually develop is a description of the audience. Their role is to stimulate an interaction between the audience and the message in order to achieve some outcome, perhaps a behavior change. In this light, the initial task of a risk communicator is not to identify materials to develop, but rather to ascertain what needs to be achieved.

Four factors motivate people to listen, learn, and take action:
• Perception of need—unless the audience perceives that it needs the program benefits, motivation will be difficult.
• Foreseeable risk or benefit—there are three levels of motivation:
  - Individual: A risk or benefit to the individual from taking or failing to take a given action.
  - Peers and role models: The persons who influence the individual to take an action.
  - Broader social context: The cultural norms and traditions that might influence an individual's actions.
• Previous experience and habits—Communicators need to be cognizant of individuals' habits and routines. When some new desired action is introduced, the motivation to change existing habits must be strong.
• Attitudes and values—What do the people whom risk communicators are attempting to reach hold important? The values of the audience, or program recipients, may be different from those of the communicator.

Channels

Two major channels of communication are mass media and interpersonal communication.
Mass media (such as television, radio, and newspapers) attempts to reach many people in a short time. The benefits of this approach must be balanced against the cost. 83 ------- 84 Developing the Message Another consideration is whether the message can be tailored effectively to different audiences on such a large scale. Conversely, interpersonal channels (counseling, group sessions, question and an- swer sessions, and so on) are valuable in clarifying and reinforcing information, but limited in their ability to reach large numbers of people. Designing Messages Key points to consider in designing messages are these: • Who is the audience, and what are their differentiating characteristics (e.g., educational level, cultural background, age, sex, and socioeconomic background)? • What is the purpose of the message? Is it expected to stand alone or to serve as part of a broader program? • What options does the message present? It is important to give the target population options to avoid the perception that they are being controlled. Specify the consequences of these options, which need to be simple and accessible. For example, responding to a survey or going to a meeting are relatively simple and straightforward options. Risk communicators also need to spell out the incentives for each option. Developing Materials There are two types of evaluation of educational materials: 1) evaluation of materials currently being developed, and 2) evaluation of materials previously developed. Five principles guide the development of materials: • Develop educational materials from the community perspective. This relates to development of materials for a given audience. It may be beneficial to sample the community to ascertain its perspective. • Ensure that materials arean integral part of ahealth education program. Note that the materials should be part of a program, not the entire program. Materials by themselves are not a program. How will the materials reinforce each other and contribute to the objectives of the program? Conflicting information should never be disseminated; the audience's receptivity to further materials will be damaged. • Relate materials to health service delivery. If the risk communication materials are informing of the availability of a service, ensure that it is in fact available when and how it is advertised, or credibility will be damaged. Be aware of the potential response, and plan to have adequate supplies of whatever services are being advertised. • Pretest all materials. A formal pretest is important to ensure the usefulness of the materials. This entails exposing a sample of the actual audience for the materials and incorporating any revisions suggested by the feedback. The pretest should yield feedback on the following: ------- Focusing on the Audience 85 - Attractiveness: do the materials gain and keep interest? - Comprehension: is the message understandable? - Acceptability: are materials in concurrence with the beliefs and norms of the audience? - Ownership: does the audience identify with the message? - Persuasiveness: does the message convince the audience to make an attitudi- nal and behavioral change? Include instructions for use when distributing materials. This point seems simple and straightforward, yet instructions are often understated or not included. Much time and effort will be wasted if materials are not used properly or not used at all. Exhibit 1 includes some questions to help evaluate printed materials. 
------- 86 Developing the Message Exhibit 1 CRITERIA TO EVALUATE PRINTED MATERIAL On a scale of 1 to 5, indicate the extent the criteria are met, with 5 being totally and 1 not met at all. SPECIFIC CRITERIA 12345 1. Does it fully present one specific theme? 2. Is the content or message easily understood? 3. Do the illustrations clarify or complement the written parts? 4. Is the size of the letters easy to read? 5. Does it provide a synopsis of the message or content? 6. Does it have aspects that emphasize important ideas, such as type size, style or color of certain parts? 7. Are the writing style, grammar, and punctuation appropriate for the audience? 8. Does it avoid information overload or too much writing in one place? 9. Does it use language easily understood by the target audience? ------- TRACKING PROGRESS ------- Issues to Consider for Evaluation Design Judy Shaw and Jeanne Herb Tracking progress in risk communication means tracking changes in public involve- ment and control: Has the public become part of the decision process? An example of change in public control can be observed in the doctor-patient relationship. Today, patients ask the doctor questions pertaining to an upcoming operation, go to another doctor for a second opinion, or even refuse the operation. For the public to become involved in the decisionmaking process, a dialogue is needed. Dialogue entails: • Education about risk • Institutional mechanisms • An understanding of what citizens think With these three components in a dialogue, evaluation can occur throughout the process of decisionmaking. Unfortunately, many organizations are afraid of evaluation because it may reveal flaws in policies and planning. Evaluation provides the following benefits: • Awareness of the public's response (e.g., are they understanding the message?) • Awareness of the behavioral change in the public and what caused the change • The option of improving a communication strategy (and sometimes policies within the organization). When a communication fails to change public behavior, it may be for a number of reasons: • The message was not conveyed appropriately. • The public does not trust the person/organization delivering the message. • The public feels the organization is dealing with a small problem (such as oven gas wastes) instead of tackling what it perceives as the major problem (such as the incinerator about to be constructed down the block). 89 ------- 90 Tracking Progress However, the specific reasons for success or failure cannot be determined without evaluation. The evaluation design determines the type of information that results from the evaluation tasks. Some kinds of evaluation will result in information about changes or outcomes. Or evaluation can be designed to look at the process, i.e., the factors that produced the results. In either case, the analysis needs to consider both desired outcomes and unexpected outcomes in determining whether the program was "successful." Knowing an organization's goals is the first step towards message design and delivery. Indeed, a careful review of an organization's goals may reveal that a risk communication program is not timely or appropriate. If a risk communication program is desirable in the context of the goals, then both its evaluation and expected outcomes should relate back to the goal. The goal should remain constant, but objectives or intermediate steps to reach the goal may change as risk communication effects change. A tracking system to monitor any such changes is essential. 
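One minimal sketch of what such a tracking system might record is given below; it is added for illustration and is not drawn from the workshop, and the class names, field names, and example entries are hypothetical. Each entry is tied back to the fixed goal so that changes in objectives, audience, or message reception are visible over time.

# Illustrative sketch only: a simple log for tracking a risk communication
# program against a fixed goal. Names and the example entries are hypothetical.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class TrackingEntry:
    when: date
    objective: str            # intermediate step currently being pursued
    audience: str             # audience segment observed
    message_received: bool    # feedback evidence that the message is understood
    notes: str = ""

@dataclass
class ProgramTracker:
    goal: str                                   # stays constant over the program
    entries: list = field(default_factory=list)

    def log(self, entry: TrackingEntry) -> None:
        self.entries.append(entry)

    def objectives_over_time(self) -> list:
        """Show how intermediate objectives shift while the goal stays fixed."""
        return [(e.when, e.objective) for e in self.entries]

tracker = ProgramTracker(goal="Residents take part in siting decisions")
tracker.log(TrackingEntry(date(1990, 3, 1), "Explain the permit process",
                          "neighborhood groups", True))
tracker.log(TrackingEntry(date(1990, 6, 1), "Recruit citizen advisory panel",
                          "neighborhood groups", False, "fact sheet not reaching renters"))
print(tracker.objectives_over_time())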
When evaluating a risk communication, one should know whether goals are changing (and, perhaps, incorporate new goals accordingly); the audience is changing; and the message is being received as intended. Pretesting (e.g., asking people if they think an important question or issue has been missed or left unanswered) ensures that an important aspect of the project is not overlooked. If an evaluated risk communication does not have measured success, it does not mean the communication effort was not successful. It could mean that there were other overriding effects that ran counter to the objectives of the effort. Under other circum- stances, the same communications effort may have worked. In some cases the public will not listen to any communications, spoken or written. One such example was a lake infested with arsenic; some people did not care what was in the lake and wanted to use it regardless of the arsenic. Furthermore, communications vary with each risk; what works in one instance may not work in another. ------- Tracking the Health Objectives for the Nation James A. Harrell In general, objectives are used for evaluation, planning, or management purposes, and they help focus, structure, and mobilize a program or activity. Objectives are a valuable planning tool because they are both measurable and specific. Objectives should translate abstract ideas into something concrete; specifically, they are used to: • Establish priorities (e.g., reach a consensus on which issues to address) • Manage programs by answering questions such as —When? (By a certain date or year) —How much? (What percentage) —Who? (Target audience) —What? (Topic) • Identify concrete signs of progress • Indicate challenges (strengths and weaknesses of a program initiative) Management by objectives (MBO) is a decisionmaking process that assists in the planning, implementing, and evaluating of a program or activity. The MBO process has five classes of objectives: outcome, strategy, productivity, marketing, and innovation. Disease prevention and health promotion activities use outcome (e.g., morbidity and mortality reduction) and strategy (e.g., controllable risk factors) objectives. The 1990 Health Objectives for the Nation: A Midcourse Review, coordinated by the U.S. Department of Health and Human Services, Office of Disease Prevention and Health Promotion, examined the status of 226 health objectives. These objectives were issued in 1980 asaresultof the 1979 publication. Healthy People: The Surgeon General's Report on Health Promotion and Disease Prevention. The 226 objectives were the result of a consensus by a multitude of groups. These groups chose objectives that followed trends but also posed challenges. In addition, the objectives were required to address a problem that was preventable or controllable. These objectives addressed problems at the national, state, and community levels and were used to build disease prevention and health promotion programs by many agencies. 91 ------- 92 Tracking Progress Healthy People: The Surgeon General's Report on Health Promotion and Disease Prevention announced five national health goals for enhancing the health of the U.S. population among five major age groups: infants, children, adolescents/young adults, adults, and older adults. The 1990 Health Objectives for the Nation: A Midcourse Review assessed the status of 226 health objectives developed as a result of Healthy People, and found the following: • 34.5 percent of the objectives were on track. 
• 26.5 percent of the objectives were unlikely to be achieved.
• 13 percent of the objectives had been achieved.
• 26 percent of the objectives could not be assessed because of lack of data.

The negative findings are important for judging progress and identifying remaining needs. We learn most from those objectives for which we are not on track because the assessment indicates what still needs to be done as well as what has not worked.

Besides setting specific and measurable goals, the national health objectives have provided a blueprint or frame of reference for state and local disease prevention and health promotion activities. Their use at state and local levels has been idiosyncratic; in some places it is very structured, and in others a more grassroots approach is used. What is important is that the national objectives have been used at all levels.

The national health objectives are now being revised to serve as a guide for designing intervention and evaluation strategies between now and the year 2000. The twenty priority areas for preventive interventions for the year 2000 objectives are:
1. Reduce tobacco use
2. Reduce alcohol and other drug abuse
3. Improve nutrition
4. Increase physical activity
5. Improve mental health and prevent mental illness
6. Reduce environmental health hazards
7. Improve occupational safety and health
8. Prevent unintentional injuries
9. Reduce abusive and violent behavior
10. Improve oral health
11. Improve maternal and child health
12. Immunize against and prevent infectious diseases
13. Prevent and control HIV infection and AIDS
14. Prevent and control sexually transmitted diseases
15. Reduce teenage pregnancy and improve reproductive health
16. Prevent, detect, and control high blood pressure and high blood cholesterol
17. Prevent, detect, and control cancer
18. Prevent, detect, and control other chronic diseases

------- The Purpose of Tracking Progress James L. Regens

Increasingly, public officials are using program evaluation techniques in an effort to monitor the effectiveness of risk communication strategies. Because program evaluation involves systematic attempts to measure consequences (i.e., outcome or impact evaluation) or operations (i.e., process evaluation), it offers an attractive option for producing information about how decisionmakers can produce deliberative changes in risk factors as part of an overall plan for managing environmental hazards. For example, process evaluation encompasses a variety of considerations with respect to program operations. Were the pamphlets distributed? What problems were discussed? How many people attended the meeting? Answers to such questions are helpful in assessing the needs of the target population and ascertaining the most effective means of distributing materials to that audience. Tracking progress during the implementation phase of risk communication activities draws attention to significant structural elements—program components, outputs, objectives, and effects—so that the program can be modified as needed. Other informational objectives of tracking exercises include answering outcome or impact evaluation questions: Can the program be repeated and/or did the program make a difference? A serious examination of the actual content of the risk communication program being evaluated can prevent or reduce the occurrence of some of the obvious but often repeated failures of prior programs. There are a number of reasons for tracking progress. First, evaluation helps explain choices.
The products or endpoint of evaluation can be used to clarify responses to risk. In addition, a well designed monitoring system gives better ongoing evidence about program accomplishments. For example, if information about the program components—such as the message being communicated, mechanisms for evaluating target audiences, and impact/outcome—is incorporated into a tracking program, evaluators can obtain informa- tion about how program accomplishments are achieved and about the program's impact for purposes of modification. That is, both program intention and content can be evaluated. 93 ------- 94 Tracking Progress Second, tracking can demonstrate the kinds of problems that may arise as a risk communication program is implemented. Tracking allows practitioners to maintain relevance throughout the life of the program. Moreover, tracking directs attention to data needs. Finally, progress or lack of it can be tracked and necessary changes or adjustments made. Information about progress can be used to obtain several kinds of information about a program: • Better evidence about the program's usefulness and its context • Better information about the kinds of problems that arise during program implementation • Information on the nature of outcomes • Ideas for alternative strategies for dealing with the situation In designing a plan to evaluate risk communication efforts, the following series of key questions can help focus attempts to track progress: • Why conduct a risk communication program? • What am I trying to obtain from my risk communication program? What kinds of information do I need? What do I want to know? What kinds of questions will lead to the information needed? • Is the information to be used to help inform the decisionmaking process and clarify goals or objectives? For whom and through what mechanism? • Is the design appropriate for the study? Careful attention to framing responses to each question before initiating risk communication activities can increase the likelihood of program success. Moreover, ambiguous results and lack of understanding between the messenger and audience can occur unless there is clarity of presentation and timeliness. This underscores the need for evaluators to recognize that effective risk communication is a continuum, not a dichotomy. Clearly, evaluation can be an important part of a management information system. There are several objectives in tracking that illustrate this point. First, a well designed tracking system makes it possible to detect important events and interactions among events. Second, it can generate information during the course of the program in order to identify the nature and significance of such event. Third, tracking systems provide continuous awareness and evaluation of trends to guide choices of action. Moreover, as part of a comprehensive management information system, the risk communication program's tracking activities direct attention to data needs which might otherwise be overlooked. Finally, the insights obtained from such ongoing, systematic appraisal can inform decisionmakers of the need for anticipatory action and stimulate proactive instead of reactive management. 
In summary, tracking progress: • Helps decisionmakers identify what they are giving up for the sake of accommo- dating organizational and political pressures • Maintains study relevance • Provides early warning about things that are going wrong • Aids in making mid-course corrections ------- Tracking the Health Objectives for the Nation 95 Planning on Evaluation For tracking, the kinds of resources potentially available to agencies include: internal staff specialists, the general pool of employees in the organization, an internal ad hoc team, internal management personnel, or outside contractors. For example, obvious sources for obtaining resource materials include experts in the Environmental Protection Agency or the General Accounting Office. Such resources can provide insights into the following considerations for planning and evaluating a risk communication program: • What criteria are to be used to judge a program? What can be proved or disputed? • What kinds of outcomes are likely to emerge? • What may be alternative strategies in dealing with a situation? • What is most easily evaluated? • In monitoring a program for ecological effects, what kind of data are needed, how can the data be manipulated, and what do the data tell? • To inform decisions, what minimum kinds of data are needed? • Is there internal validity of program design? (This can help sidestep a major problem: uncertainty about whether or not the perceived change is a result of the program.) • Will results be quantifiable? If evaluators are to keep their work relevant to tracking progress, they need ongoing information about a variety of program elements. For instance, it is important to monitor what is happening in day-to-day activities. Continuous monitoring helps to identify new questions that the evaluation will be expected to answer. It also can aid in pinpointing changing conditions, which can create a situation in which the program should try to achieve different goals from those originally set. Finally, continuous monitoring makes it somewhat easier to detect unexpected developments and changes that are significant from a scientific standpoint. Making the best, direct use of evaluation results requires the following: • Clear decision points illuminated by specific questions • An evaluation design appropriate to the purpose and a completed study supplying evidence on questions identified for study • Unambiguous results • Clarity and timeliness of presentation to appropriate audience • Congruence of values • Relevance of results to contemporary situation • Lack of external pressures that constrain choices made by decisionmakers • Sufficient resources to apply findings in the context of the risk communication program • Authority to change or to modify the program as indicated. ------- 96 Tracking Progress In summary, tracking makes sense if only to monitor a program and even if it consists only of process evaluation. It is a good management tool, which tells the practitioner whether a program is working as planned. In addition, tracking is helpful for informed error correction. The monitoring system should be crafted to match the communication program. Tracking can provide the information needed to make decisions about whether or not continued allocation of resources is justified, by answering these fundamental questions: Is the program working? Is it something we should continue to do? 
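As a minimal sketch, added here for illustration and not part of the original commentary, tracking data might feed those two questions through a simple decision rule; the indicator names and thresholds below are hypothetical assumptions, not criteria from the workshop.

# Illustrative sketch only: turning simple process and outcome indicators into a
# continue/adjust/stop suggestion. Indicator names and thresholds are hypothetical.
def resource_decision(pct_activities_completed: float,
                      pct_audience_reached: float,
                      observed_change: float,
                      target_change: float) -> str:
    """Suggest a resource decision from process and outcome tracking indicators."""
    if pct_activities_completed < 0.5:
        return "fix implementation before judging the program"   # process problem
    if observed_change >= target_change:
        return "continue: program appears to be working"
    if pct_audience_reached < 0.5:
        return "adjust delivery: message is not reaching the audience"
    return "reconsider the strategy or reallocate resources"

print(resource_decision(0.9, 0.4, 0.02, 0.10))
# 'adjust delivery: message is not reaching the audience'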
------- Benefits to Conducting Midcourse Reviews Max Lum An obvious but often overlooked point with regard to evaluation is that a program must be implemented before it can be judged. The realities of implementation are: • The program may not actually exist in the community where it has been "implemented" in the form originally intended. • Final acceptance by the community is never certain, even if program imple- mentation has occurred. • Implementation always contains unknowns that may change the original objectives and the intent of the evaluation design. Studies of federal programs have concluded that implementation problems are the reasons that programs are most often unsuccessful. The main problem in program implementation is that managers often make the assumption that the process is rational and quantitative. However, in reality, implementation usually does not fit the clear research and development model; it lacks specificity and there may be little active "user"—or public—involvement in the model. The types of information one can obtain about and during implementation are routine management information (e.g., costs, numbers of people involved); process information (e.g., what happens during implementation); and treatment information (e.g., how it is being implemented, what treatments are being used, and what effects have occurred). Barriers to effective risk communication program implementation include: • Personnel turnover; understaffing • People refusing to give up their own ideas and conform with the planned program strategies • Emotional outbursts; conflicts between staff and the public • Muddled communication about what implementation entails • Lack of anticipation of problems and plans for handling them • Poorly composed objectives 97 ------- 98 Tracking Progress • Undue haste in implementing program without sufficient planning, training, or agenda setting • Compulsion to spend money before fiscal year ends without attention to planning or realistic expectations for implementation • Management conflicts, differing points of view and goals • Insufficient or unskilled planning Although midcourse review cannot, of course, prevent these problems from occur- ring, such a review can identify problems, both anticipated or unanticipated, that can prevent a program from reaching its objectives. A major benefit of midcourse review is identifying problems at a time when corrections can be made, to try to assure that program objectives can be met. Whether the problems stem from incorrect execution of plans, faulty judgment in planning, or unexpected circumstances, the purpose of this kind of evaluation is problem identification. The design of a midcourse evaluation must consider the transition of the program from the design stage, which may have reflected a logical, or "ideal," situation into the "real" world, where influences within and outside of the program manager's control will affect the program outcomes. Therefore, in designing and implementing a midcourse review, program managers should consider factors such as these: • Is the process of implementation formal or informal? • Is control centralized or decentralized? • Is management authoritarian or participatory? • Is the program structure hierarchical or egalitarian? • Is the community divisive or cohesive? • Is the program isolated or community oriented? • Are the methods of communication standardized or individualized? • Is response and interaction controlled or expressive? • Are strategies partitioned or integrated? 
Finally, it must be recognized that midcourse review is but one of many useful evaluation strategies. If the program is a one-time effort with sufficient depth, length, and resource to make corrections, this might be the most important strategy choice to assure that program objectives are met. ------- Deciding on the Extent of Evaluation Elaine Bratic Arkin There is no one answer to what kind of—or how much—evaluation should be included in a risk communication program. A number of factors contribute to the decision about what evaluation tasks to undertake. It is essential that evaluation considerations be included in the planning phase of a program to assure that adequate time and resources are allocated and that any preintervention tasks, such as collecting baseline data, can be accommodated. Why Evaluate? Evaluation offers a number of benefits to risk communication managers. Formative evaluation (such as pretesting program message strategies or draft materials) promotes effectiveness, indicates potential problems, and permits revisions prior to expending final production budgets or moving a program into the field. Formative research or evaluation also can provide a more complete understanding of the problem and the population affected, building a stronger rationale for the interventions that will follow. Process evaluation, such as tracking the effects of a program underway, can alert the manager to the strongest and weakest program components, allowing mid-course ad- justments. Therefore, both formative and process evaluation help managers determine whether an activity or program can be improved. These kinds of measures can provide some predictors of program effects as well. Adding outcome evaluation tactics to a risk communication program can provide evidence of whether the program or activity works. Outcome measures also can uncover other program consequences (unexpected effects); provide the basis for deciding whether additional interventions— and what kind—are needed to reach program goals; justify expenditures for similar activities; and generate ideas for new interventions and programs. Outcome and impact evaluation data can provide a strong response to the need for institutional, public, or political accountability, and can help the program manager, agency policymakers, or others determine the cost versus benefit of the program. 99 ------- 100 Deciding on the Extent of Evaluation Sometimes, evaluation components are included in a program because of agency requirements, public demand, or political or interest group pressure. No matter what the incentive or requirements are, there can be strong benefits to evaluation. Risk program managers should be aware of these benefits, carefully assess which kinds of evaluation will be of greatest value, and incorporate the most appropriate evaluation tasks into their risk communication efforts. How Can Barriers to Evaluation Be Overcome? A number of barriers to conducting appropriate evaluation exist. Some can be overcome; some cannot. Managers can develop strategies to avoid or deal with such barriers, but also must be prepared to assess whether an obstacle or problem will so undermine the integrity of a planned evaluation that the evaluation should be reconsidered, restructured, or abandoned. Some obstacles may be integral to the risk communication problem to be addressed or the activities planned. 
For example, the need for an emergency response will probably prevent conducting formative evaluation, or any evaluation requiring the collection of baseline data. Emergencies also may tax an agency's ability to respond, and time and resources may not be allocated to evaluation tasks. Nevertheless, an analysis of evaluative data after the fact (such as reviewing the effect of agency actions, the quality and extent of media coverage, or public response) can help the agency determine how well it responded to the emergency and help plan for similar situations in the future. Although many emergencies appear to be unique, there are few instances where lessons cannot be learned about staff capabilities to respond, intra-agency coordination, and logistical procedures that work or need to be rethought.

Although not deemed emergencies, some other situations may prevent or handicap evaluation. Sometimes a program manager faces a short deadline, and optimal evaluation must be sacrificed rather than shortchanging implementation tasks. In this case, a sound communication program plan with evaluation tasks intertwined may provide justification of the need for more time. Similarly, a lack of trained staff or sufficient resources can hamper evaluation attempts. Staff knowledge sometimes can be supplemented with advice from other agency offices, other agencies, institutions, or universities. Even if an evaluation position cannot be supported on staff, staff can be trained to conduct some evaluation tasks, perhaps with the guidance of evaluation experts. Program managers must be able to judge whether simple evaluation steps can be undertaken by their staff, when outside help is needed to plan or carry out an evaluation, and when a lack of expertise or resources should lead to a decision not to evaluate. A poorly designed, administered, or analyzed evaluation can result in a waste of resources, faulty conclusions, and an agency bias against evaluation. Staff or agency interest, political needs, or investment in an intervention also may make self-evaluation unwise, even if the staff has the necessary skills. In this case, the program manager may need to decide whether funds can be secured to underwrite evaluation by a more neutral party.

The risk communication program design itself may preclude certain types of evaluation. For example, the intervention may be too shallow or the time frame too short to expect a measurable impact. On the other hand, the intervention may be based on well-established, previously evaluated strategies; in this case, resources may be allocated more usefully to other aspects of the program. Program schedules or resources may not allow collection of baseline data, eliminating the possibility of some kinds of evaluative measures. And even the most careful evaluation plans can be laid waste by unexpected events affecting the intervention or the sponsoring agency. However, process or formative measures still may be valuable.

Institutional resistance is one of the most frequently cited reasons for not evaluating. Agency decisionmakers may not appreciate the value of evaluation, may have internal policies that make conducting an evaluation difficult, may have other spending priorities, may disagree about a program's objectives, or may not want to find out the effects of an intervention (and be held accountable). These barriers usually can be overcome, although perhaps not in a short time.
Presenting sound, clear, and understandable justification, including program accountability, for evaluation tasks, and showing examples of how other evaluation findings have been useful and applied can overcome agency resistance. What Kind of Evaluation? Decisions regarding the kinds of evaluation to conduct are based on agency support, understanding, and needs; resources; the program design; desired outcome; and related future agency activities addressing the same or similar problems. An ideal risk communication program would be designed to accommodate a balance of formative, process, and outcome evaluation measures, because each kind of measure contributes differently to the quality of the program and an understanding of its effects. However, few programs are structured to accommodate all of these kinds of evaluation, and few agencies enjoy the resources and commitment to support elegant schemes. Each type of evaluation serves a different purpose. Formative evaluation helps identify potential problems and refine program elements before full-scale implementation. Formative evaluation techniques may be more useful than other kinds of evaluation when there is sufficient time allocated for program pretesting and revision, and when a program is a one-time effort, with no opportunity for refinement in the field. Process evaluation strategies provide some indications of program effects and are particularly useful for program management. Process evaluation can identify logistical and other program problems in time for correction while a program is underway. Such indicators also supply some evidence of success and failure when other kinds of evaluation are not feasible or affordable. Outcome evaluation assesses the effect of a program or strategy after it has been implemented. Outcome measures are important because they go beyond how a program worked to address what changes occurred in the target population as a result. Such measures help a program manager decide whether a particular program or strategy was sufficient to resolve the problem, to decide whether and what kind of additional efforts will be needed, and to justify support for using such strategies with similar situations in the future. ------- 102 Deciding on the Extent of Evaluation How Much? Deciding how much effort and resources to devote to evaluation is frequently the most difficult question to answer. Sometimes a manager must forfeit desired evaluation tasks to the urgency of an intervention or lack of resources to support both implementation and evaluation (although with a little creative thinking, some kind of evaluation is affordable for almost every budget). At the other extreme, a risk communication program may be designed to test a specific strategy, or series of intervention strategies, with the expectation that a successful model could be widely used. In this case, it is not unusual for the evaluation costs to exceed the communication development and intervention costs, with these expenditures regarded as an investment in future risk communication program efforts. Thus one major determinant of how much evaluation to include in a program design revolves around the program purpose. If the program elements are on trial for replication and application to similar situations, a more elaborate evaluation may be justifiable as a wise investment. 
Summary Deciding upon the type and extent of program evaluation should be an integral part of risk communication program planning, when there is an opportunity to consider resource allocation for evaluation. Some kinds of evaluation can serve as a powerful management tool for a program manager; frequently, these evaluative tasks are conducted prior to or during program implementation and cannot be relegated to last minute decisionmaking. Other evaluation tasks are designed to measure the extent of program success and failure, and why efforts did, or did not, work. Evidence of success is important to justify current and future risk communication efforts. Perhaps even more valuable is the identification of failures, so that strategies can be altered, mistakes corrected, and future failure avoided. Considering these questions can help the risk communication program manager make the difficult and important evaluation choices: • How urgently must the risk communication problem be addressed? • What would be the consequences of failure? • Is there management support or public demand for program accountability? • How long will the program be and how much will the total effort cost? • Are the program objectives measurable in the foreseeable future? • How else might the problem be addressed in the future? Will an analysis of program effects be used for planning additional efforts? • What aspects of the program best fit with agency priorities? • Will an evaluation report help communication efforts compete with other agency priorities for future funding? ------- Matching Your Needs with an Evaluator's Capabilities James W. Swinehart, Shelagh Smith, Vicki S. Freimuth, Charles Darby Several types of services are available to the evaluator from academia as well as small and large consulting firms. It is valuable to know how to choose the best services for optimum effectiveness. Academic Services The partnership between the risk communication practitioner and the academic can work to the advantage of both, but certain constraints, such as time and cost, do apply. The advantages to using the services of academics include assistance in overcoming these constraints and access to expertise not always present in the practitioner's own agency. Academics are conversant with the latest literature on a subject and can help translate theory into action. Furthermore, students can be a source of low-cost or free help for labor intensive projects. For the academic, these evaluation projects can be a source of valuable data and are useful instructional tools. When students are employed on projects outside the university, they gain realistic experience and begin to forge their own networks. An example of an effective evaluation service provided by an academic is a recent analysis of the Cancer Information System database. This proved to be a good match between the practitioner's needs and the evaluator's special expertise. The evaluation's objectives should always be included in the planning process for a risk communication program. The evaluation strategies and the type of help needed from academia should be based on those objectives. Consultant Services Evaluation contractors can provide assistance with the technical development of an evaluation plan as well as the implementation of the plan. Many companies have the staff and flexibility to assist with data collection as well as design and analysis for qualitative and quantitative evaluation methods. 
Design and analysis services can include setting objectives, sampling, developing the instrument, choosing methodology, securing Office of Management and Budget clearance, statistical and qualitative analysis, and interpretation and reporting. Qualitative data collection is used in concept or materials testing. The most commonly employed techniques are focus groups, central location intercepts, small-scale executive interviewing, gatekeeper reviews, and needs assessments. Quantitative data collection involves analysis of existing data or large-scale surveys conducted in person, by telephone, or by mail.

Some suggestions for making the relationship between practitioner and contractor work include:
• Recognizing the advantage of a contract
• Setting objectives jointly and with clarity
• Making the contractor a technical partner
• Tracking costs and progress and assuring accountability
• Allowing the contractor the latitude necessary to do the job

Selecting Assistance

The choice of evaluation assistance can be made from among academia, full-service providers, or firms that specialize in certain facets of evaluation, e.g., focus groups, central location intercepts, market research, analysis of program issues, special populations, and executive interviewing. The appropriate selection will depend, to some extent, on the risk communicator's determination of the type and extent of evaluation services needed. To make such a determination, the risk communicator first poses the question(s) to be addressed through evaluation and then considers the options for answering the question (and their respective costs). Here are some examples:

1. QUESTION: WERE THE CAMPAIGN MATERIALS REGARDED AS APPEALING AND UNDERSTANDABLE BY THE TARGET AUDIENCES?
METHOD: Testing with persons representative of designated audiences
COST: Estimated range of $10,000 to $40,000, depending upon number of items and number of audiences

2. QUESTION: WERE THE CAMPAIGN MATERIALS REGARDED AS APPEALING AND APPROPRIATE BY MEDIA GATEKEEPERS?
METHOD: Interviews and/or questionnaires with appropriate persons at TV networks and stations, cable systems, radio networks and stations; magazine and newspaper editors; others as appropriate
COST: Estimated range of $5,000 to $20,000, depending upon number and location of persons interviewed

3. QUESTION: DID THE CAMPAIGN MATERIALS ACHIEVE A LEVEL OF DISTRIBUTION AND PLACEMENT THAT GAVE THE TARGET AUDIENCES ADEQUATE OPPORTUNITIES FOR EXPOSURE TO THEM?
METHOD:
TV spots: reports from persons in agencies distributing these items, plus monitoring of on-air placements by a tracking firm such as Broadcast Advertisers Reports
Radio spots: interviews/reports from persons in agencies distributing spots, and radio station personnel (reports on placement to include, where feasible, printouts from stations regarding dates and times of airings)
Print ads and articles: interviews/reports from persons in agencies distributing ads and articles (reports to include copies of ads or tear sheets of articles as published)
Other materials: interviews/reports from persons in agencies distributing these items, and, as appropriate, from other intermediaries
COST: Estimated range of $25,000 to $75,000, depending upon number/type/exclusivity of monitoring reports purchased, use of clipping services, number and location of persons interviewed as distributors or users of materials

4. QUESTION: HOW SATISFACTORY WERE THE ARRANGEMENTS AND PROCEDURES THAT WERE USED IN MAKING CAMPAIGN DECISIONS (SETTING OBJECTIVES, CHOOSING TARGET AUDIENCES AND COMMUNICATION STRATEGIES, DEVELOPING MATERIALS, ETC.)? WHAT CHANGES, IF ANY, SHOULD BE MADE IN PLANNING FUTURE CAMPAIGNS?
METHOD: Using a set of specific questions, interviews and invited commentary from participants in the process (with assurances of anonymity)
COST: An estimated $5,000 to $10,000

5. QUESTION: HOW SATISFACTORY WERE THE PROCEDURES USED TO OBTAIN COOPERATION FROM OTHER ORGANIZATIONS (IF ANY WERE USED), AND TO COORDINATE ACTIVITIES WITH THEM? WHAT CHANGES, IF ANY, SHOULD BE MADE IN THESE PROCEDURES FOR FUTURE CAMPAIGNS?
METHOD: Using a set of specific questions, interviews and invited commentary from staff, coordinators for the campaign, and from representatives of organizations involved in this campaign or parallel ones (with assurances of anonymity)
COST: An estimated $5,000 to $10,000

6. QUESTION: TO WHAT EXTENT DID THE TARGET AUDIENCES NOTICE THE CAMPAIGN MATERIALS AND PAY ATTENTION TO THEM?
METHOD 1: Three surveys conducted at appropriate intervals among samples representative of the designated target audiences, assessing both unaided and aided recall of campaign materials
COST: Estimated range of $15,000 to $50,000, depending upon the samples used, data collection procedures, and the possible opportunity to share costs with others (for example, through participating in multi-sponsor surveys)
METHOD 2: Analysis of sources of inquiries prompted by the campaign (made possible by use of keyed box numbers or other identification on campaign materials — e.g., Box A on radio spots, Box B on TV spots, etc.)
COST: Estimated range of $6,000 to $15,000
METHOD 3: A continuing survey of persons submitting inquiries or requests prompted by the campaign (data to be obtained by enclosing a reply card or brief questionnaire when fulfilling information requests—perhaps in a tenth or fewer of the packages sent out, depending on the total number of requests received)
COST: An estimated $8,000 to cover reply postage and some analysis time on the part of staff

7. QUESTION: TO WHAT EXTENT DID THE CAMPAIGN ALTER THE BELIEFS, ATTITUDES, AND BEHAVIORAL INTENTIONS OF THE TARGET AUDIENCES IN THE DIRECTION INTENDED? DID ANY CHANGES OCCUR THAT WERE OPPOSITE TO THE ONES INTENDED?
METHOD: A series of three surveys (conducted one month before the campaign begins, at its midpoint, and one month after it ends) with panels representative of the designated target audiences. Each of the panels should include in the second and third surveys two identifiable subsets of people: (1) those who are able to recall seeing or hearing campaign materials pertaining to them, and (2) those who have not seen any campaign materials pertaining to them and/or are unable to recall these materials even with prompting.
COST: An estimated range from $15,000 to $50,000, depending upon the samples used, data collection procedures, and the possible opportunity to share costs with others

8. QUESTION: TO WHAT EXTENT DID THE CAMPAIGN ALTER THE ACTIONS OF THE TARGET AUDIENCES IN THE DIRECTION INTENDED? DID ANY CHANGES OCCUR THAT WERE OPPOSITE TO THE ONES INTENDED?
METHOD 1: A series of surveys with samples representative of the designated target audiences, scheduled appropriately for each one (e.g., one month before the campaign, at its midpoint, one month after it ends, and one year later), with exposure to the campaign either controlled or assessed
COST: Local or regional surveys may run from $5,000 to $25,000; national surveys $50,000 to $150,000 (factors affecting cost: size and nature of samples, data collection methods, number of open-end questions, etc.)
METHOD 2: Final analysis and summary of volunteered comments from persons noticing or using campaign materials (Note: Such anecdotal evidence of campaign impact is weak in comparison with the kinds of data produced by the surveys indicated above, but it can be extremely useful in illustrating the points made in more representative studies.)
COST: An estimated $3,000 to $5,000, depending upon the number of items reviewed

9. QUESTION: TO WHAT EXTENT DID THE CAMPAIGN STIMULATE INQUIRIES FOR INFORMATION ABOUT THE TOPIC(S) COVERED?
METHOD: A month-by-month tally of inquiries, classified by topic, source, and other relevant characteristics; such tracking to begin at least a month prior to the start of the campaign and continuing for a year after it ends
COST: No additional cost, assuming that tasks of this kind are already being performed (and will continue to be performed) by staff on a regular basis

Once the evaluation needs have been outlined, criteria for selecting contractor support should be determined. The criteria should be based on the kinds of experience, skills required for the methods chosen, cost, and other considerations. These include:
• Academic or other training
• Experience with chosen methodology
• Age, sex, ethnicity, other characteristics if important for working with a selected target audience
• Samples of reports and/or references from previous clients
• Population characteristics and recruiting procedures
• Possible client conflicts
• Contractual matters: schedule, confidentiality of findings (and topic, if desired), fees, review and possible revision of report, etc.

Resources

The following guides and directories are valuable to anyone involved in risk communication evaluation.

The Green Book—directory of market research firms and services; new edition available in April each year; cost: $50.00
Available from: American Marketing Association 310 Madison Avenue New York, NY 10017

The Blue Book—directory of some agencies and organizations represented in the membership of the American Association for Public Opinion Research (AAPOR); new edition published each year; free
Available from: AAPOR P.O.
Box 17 Princeton, NJ 08542 Directory of Focus Group Facilities and Moderators: new edition published each year; cost: $25.00 Available from: National Focus Group Network 33 Junction Road Brookfield Center, CT 06805 What is a Survey?—provides flow chart of process and checklist of items to budget; free Available from: American Statistical Association 806 15th Street, NW, Suite Washington, DC 20005 Newsroom Guide to Polls and Surveys: cost $10.00 Available from: American Newspaper Publishers Association 11600 Sunrise Valley Drive Reston, VA 22091 ------- MEASURING ACCOMPLISHMENTS ------- Considerations for Planning Risk Communication Robert W. Denniston Measuring accomplishments begins at the beginning: with a thorough understanding of the risk communication program objectives and a program foundation based on realistic expectations for risk communication messages within a broader risk abatement program. Without adequate consideration in the program planning stage, appropriate evaluation cannot be designed, and measurement may later reveal basic, avoidable faults in program design. Planning risk communication messages necessitates a review of the problem, public knowledge, attitudes and behavior related to the problem, and an understanding of how the public views risk messages. Public Perceptions of Risk Messages The following are some of the general obstacles to public understanding of risk messages. • Risk is an intangible concept, and requires effort on the part of the public to understand. • The public does not understand relative risk and may underestimate or overesti- mate their personal vulnerability. • People seek absolute answers, and risk messages address intangible, invisible hazards without concrete outcomes. • The public reacts unfavorably to fear, and fearful messages may result in uncalled for outrage or, conversely, denial of an important risk. • The public has a strong tendency to underestimate personal susceptibility. • Individuals have contradictory beliefs that interfere with their understanding of risk messages; they can, at the same time, believe that "it can't happen to me" and "everything causes cancer." • Most people lack a future orientation, and threats that may materialize far in the future are easy to put aside. • The public does not understand science. Technical data, risk models and the variables involved in calculating risk, and the fact that scientific knowledge is not 111 ------- 112 Measuring Accomplishments static but evolving over time, all add obstacles to public understanding of risk messages. In addition, environmental risk messages are more difficult for the public to accept than personal health messages. For example: • People grasp easy solutions, and easy solutions to environmental risk problems may conceal many complexities and obstacles. • Individuals desire personal control over their well-being, and most environmental risks are more amenable to governmental or institutional control than individual control. • Individuals seek guidance adapted to the personal level, and most environmental risk messages look at the problem on a community or societal level. • Most people have more pressing priorities, including more immediate threats to their health. Assessment of Available Data Risk communication planning begins with a review of available data about the risk problem. The decision to develop messages for the public should be based on the answers to questions like these: • Is sufficient information available to explain the problem to the public? 
• Will more data be forthcoming within a reasonable amount of time?
• Are there compelling reasons (e.g., issues of right-to-know, public or political pressure, need for public action) for informing the public even if insufficient information is available?
• What is the purpose of developing messages for the public?
• Are these expectations realistic?
• Is the government or other responsible body prepared to handle public reaction or response?

A realistic review of the answers to these questions will help shape risk communication messages.

What Risk Communication Can—and Cannot—Do

Risk communication by itself is not the answer but one component of the response to environmental and other risk situations. Well-designed communications can increase awareness of a problem as well as the options for resolving it; can empower citizens to make changes in their own behavior, or work towards community change; and can provide support for effecting policy or institutional changes to resolve a risk situation. For risk situations in which there is no opportunity for personal action, particular care in message design is needed to reduce frustration and fatalism. In the long term, using risk communication to inform and motivate the affected public is crucial to ameliorating risky situations and assuring the public's health.

Too often, risk messages are judgmental, rather than neutral or supportive. The public may hear contradictory messages from different sources about a situation. Although there may be different points of view about a risk, and different audiences need different types of information, clear, direct messages can reduce unnecessary conflict, misunderstanding, and mistrust in all cases.

The design of risk messages must include consideration of what the public knows and understands about the issue, and what interests and concerns it has. The complexity of most risk communication messages, juxtaposed with the public's desire for answers and solutions, poses a challenge for designing risk messages that will further the program's purpose and not merely enrage the recipients. Formative evaluation, including message pretesting, is essential to assuring that messages are appropriate and effective prior to their dissemination.

Conclusions

Risk communication should be an integral part of larger risk programs, but will not produce a positive effect without careful planning and development. Careful planning includes a thorough understanding of the intended audience, including its knowledge, attitudes, and interest in the subject; a clear statement of the intent of the communication (e.g., to inform, persuade, and/or influence attitudes or behavior); and formative evaluation prior to public dissemination to reduce the chances of unintended effects.

-------

Four Factors in Designing Evaluation Strategies

David McCallum

In planning environmental risk communication programs, it is often difficult to define intended program outcomes, a difficulty that becomes evident in the developmental stages of these programs. Outcomes are more definable in some other kinds of health programs, albeit hard to measure (e.g., increasing the percentage of the population that has controlled blood pressure or eats certain foods). The difficulty lies in determining what kinds of responses and attitudes risk communicators want people to adopt.
Frequently agencies set goals, such as "the public should make more informed decisions," but translating goals into specific, directive messages—as well as measurable objectives—is difficult if not impossible. This paper presents four planning considerations that can help overcome these difficulties.

Defining the Problem

First, communicators should begin with a definition of the problem and a statement of desired outcomes. It is helpful to differentiate between goals and objectives. Overall goals can then remain constant regardless of the vested interests of program objectives. For example, with Superfund, the goal is to clean up waste sites and reduce exposure; the objectives will consist of intermediate steps toward this goal. Some of these steps may include risk communication strategies to produce specific outcomes, such as public awareness and participation. Most infectious diseases cause adverse effects sooner than do exposures to chemicals; because of the long latency period after exposure to toxic chemicals before the manifestation of adverse effects, articulating goals can be very difficult. Risk communication program goals might include behavioral changes that people should make, with objectives addressing the intermediate steps toward behavior change (e.g., awareness and attitude change). If people decide not to follow certain risk-reducing recommendations, is it due to a change or difference in their values or a lack of access to information and other supporting services?

Frequently the public is characterized as irrational; that usually means that the public does not agree with the risk manager's explanation. Program planners need to be cognizant of and track attitudes, because these precede behavior change. Furthermore, they should determine the definition of risk communication program success (e.g., an informed public, informed decisionmaking, public support, or changed behavior).

Initially, do not focus too heavily on deriving measurable objectives. Although these are important and should be included where possible, the emphasis in program design should be on what information is needed to understand whether and how well the program is working. Program designers should understand the kinds of questions other people are likely to ask. For example, the framers of the Superfund Amendments and Reauthorization Act (SARA) Title III intended the overall goal to be a reduction in toxic emissions; but the communicators' objectives are to provide incentives for people to work together.

Then, specify a numerical or percentage change in a measure as a result of program intervention, and be realistic. A 3- to 4-percent increase in the number of homes that are owner-tested for radon may sound low; restating this as a "statistically significant change" (which it is) may improve the perception of the value of the program. The 3- to 4-percent increase can represent, in some cases, a large absolute number.
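To make the point about statistical significance concrete, the following minimal sketch (written in Python, with entirely hypothetical survey counts rather than figures from any program discussed here) applies a standard two-proportion z-test to a baseline and follow-up survey:

from math import sqrt, erf

def two_proportion_z(x1, n1, x2, n2):
    # z statistic and two-sided p-value (normal approximation) for the
    # difference between two independent sample proportions
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))   # standard normal CDF at |z|
    return z, 2 * (1 - phi)

# Hypothetical figures: 6 percent of 1,500 baseline respondents had tested
# their homes for radon; 10 percent of 1,500 follow-up respondents had.
z, p = two_proportion_z(90, 1500, 150, 1500)
print("z = %.2f, two-sided p = %.4f" % (z, p))   # roughly z = 4.0, p < 0.001

With these hypothetical counts, a four-point rise is highly significant; with samples of only a few hundred respondents, the same change would not reach conventional significance levels, which is one reason the sample size decision belongs in the planning stage.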
The risk communicator must provide a link between the communication objectives and the program goals. This link, the program rationale, answers the questions of policymakers and others about the purpose of the risk communication program.

Risk communication is often undertaken after a population has been exposed to a risk factor, such as after exposure of workers to a toxic chemical. In these cases, initial intervention is not possible. In the short term, there may be an increased reporting of adverse effects from the exposure. Therefore, criteria for measuring objectives must be established in light of the natural history of the population so that, if there is an increase in adverse effects in the short term, the risk communication program is not criticized undeservedly.

In program design, it helps to preserve the greatest number of options when measuring outcomes. Design "triggers" to tell whether unforeseen outcomes or results are taking place. This is important because in setting up objectives, an unforeseen outcome that is just as important as the stated objectives may be missed. For example, a program might be designed to increase people's knowledge about a health issue. While this objective is very hard to quantify and observe, another unforeseen program outcome could take place: People exposed to the risk communication program might visit a health clinic as a result of increased awareness. This is, in fact, a more desired outcome than increased knowledge alone. Yet, it could be missed if the evaluation design did not include ways to track this type of behavior change.

Program planners should identify early in the program design stage what the confounding variables could be. That is, there may not be the ability to control some factors that might influence the outcome. Program administrators need to know not only whether the objectives were achieved, but why (or why not). "What happened" should be identified, regardless of whether everything that happened was intended or not. If what went right—or wrong—cannot be identified, then the public next asks, "who went wrong." Evaluation is a useful tool to identify program dysfunctions.

Planning How to Use Results

Program administrators should know how they will use either a positive or a negative result. In government programs, the value of a "success" may not be weighted as highly (in terms of desirability) as problems that could arise if the risk communication were unsuccessful. Therefore, situations that could be created as a result of the communication (e.g., public outrage, new legislation, a change in resource commitments) should be considered for inclusion in the evaluation design.

It is also important to understand what program administrators should do with the results of the measurement of outcome or accomplishment. The results can be used to answer questions like these:
• Should the strategy of the program be changed?
• Should the program be marketed to increase the use of its effective components?
• Can positive evaluation data be used to leverage additional resources from the community or other program supporters?

Establishing Timeframes

A third factor to consider in designing programs so that accomplishments can be measured is the timeframe. It is important to know the decision timeframe of, for example, a remedial investigation, so that evaluation of the outcomes of a risk communication program can be built into the plan. The timeframe for political, technical, and social processes may not coincide with the evaluation timeframe. For example, legislation often imposes timetables on a program; the effect is that program goals must be established to meet the specified timetable, rather than addressing a desired outcome of the program or being consistent with the natural timing of communication programs. Interim objectives that can be accomplished within the legislative timeframe are essential in this case.
When designing the evaluation, it is very important to identify the timeframes of these processes. Failure to do so can lead to a situation in which, for example, the required time to achieve a change is longer than the timeframe for the evaluation, so results cannot be shown. Because of the inflexible nature of some of the political, technical, or social timeframes, it may be necessary to establish interim measures that can take place within the established timeframe. Using Resources A fourth factor to consider in the design of risk communication programs is the effective use of evaluation resources. What level of validity can be achieved with the resources available? This needs to be made explicit during the planning program. The greater the level of social controversy surrounding an environmental risk, the more resources will need to be allocated for evaluation. In such cases, data produced must be able to withstand intense scrutiny. "Cost saving" is a particularly tricky evaluative measure. For example, the cost- effectiveness of disease prevention programs became an issue with the emphasis on health care cost containment in the late 1970s and early 1980s. Results of studies showed that live people cost more than dead people; unless the "social good" of people staying alive was considered, it appeared to be more costly to prevent deaths. The programs, however, were cost-effective using broader measures. ------- 118 Measuring Accomplishments In summary, a number of factors must be considered in planning program evaluation and incorporated into the design of the program to assure that the needed measurements will be possible. These factors include: 1) a careful definition of the problem so that an appropriate intervention and outcome can be planned; 2) a clear purpose for the evaluation results; 3) a consideration of the varying timeframes involved; and 4) the best use of available evaluation resources. ------- Integrating Evaluation: A Seven-Step Process William H. Desvousges An Environmental Protection Agency (EPA) study has shown that the Agency has given too little attention to evaluating the effectiveness of its risk communication activities (EPA, 1987). Recently, its Risk Communication Program has taken some strides to address the lack of evaluations, but clearly, more needs to be done. Evaluating the effectiveness of a risk communication effort involves subtle consid- erations. For example, former EPA Deputy Administrator Milton Russell persuasively argues that the main challenge facing risk communicators is empowering individuals to make informed choices about the hazards that are under their control (Russell 1988). Yet, he acknowledges that there are legal, institutional, philosophical, and even cognitive limits to influencing how individuals make decisions involving risks, especially environmental risks. Recent studies for EPA have found that two factors in evaluating risk communica- tion effectiveness are often overlooked. First, the benchmarks used to measure effective- ness can have an important effect on the final assessment of a risk communication activity. For example, Johnson et al. (1988) have shown that an "informed consent" or "informed choice" criterion—one in which individuals have access to the best information but make their own choices—yields a different measure of effectiveness than an assessment based on individuals' following an agency's recommendations. 
Second, perceptual, cognitive, and behavioral measures of effectiveness are more reliable than simply asking people for their evaluations of risk communication materials. In one study, for example, more than 85 percent of homeowners gave a fact sheet high ratings, while the learning and risk perception measures showed much lower levels of effectiveness (Smith et al. 1987).

Even so, evaluations of risk communication programs must be practical. Common sense suggests that little can be gained from spending more on the evaluation than on the entire risk communication effort. This paper presents a comprehensive framework and a seven-step approach for evaluating risk communication. It argues that integrating evaluation with risk communication will increase substantially the overall effectiveness of risk communication activities.

Risk Communication Framework

The most important step in developing a framework for evaluating risk communication is to develop a clear definition of risk communication itself. Several experts expressed the need for such a definition at the first major conference on risk communication sponsored by the Conservation Foundation (Davies et al., 1987). However, presentations at the 1988 Society for Risk Analysis meetings, which devoted several sessions to risk communication topics, showed that many participants used strikingly different definitions. A comprehensive definition for risk communication would include three important aspects of the communication process:
• Perceptions: How do people perceive environmental risks?
• Practices: What messages about risk are developed and how are they communicated?
• Process: To what extent are various groups involved in the communications process?

The communications process may affect how risks are perceived as well. Creighton (1988) suggests that involving key groups in the process early and providing several means of resolving conflicts can improve the chances for successful communications.

Success with communications can be measured in several ways and from several perspectives. The following criteria commonly are used in evaluations:
• Information delivery: Did the target audience receive the message?
• Information processing: Did the target audience process the message "correctly"?
• Information impacts: Did the target audience take the recommended actions? And, did they make informed choices?

While these criteria are not exhaustive, they illustrate the subtle considerations that arise in evaluating risk communications. Information delivery simply asks whether people were exposed to the message. Under this criterion, the larger the exposure, the more successful the program. The second criterion adds an additional consideration to the evaluation. Is "correctly" mainly a cognitive criterion? That is, does it require only that the target audience understand the risk communication message? Or, does it also imply that the audience should interpret the message in the way that the communicator considered correct? The same distinctions arise in the assessment of information impacts, except that behavioral changes are the main focus. The "informed consent" evaluation criterion may be the most appropriate (Johnson et al., 1988). Under this criterion, people are assumed to make their decisions on the basis of the best information available. A radon risk communication program may be judged successful even when homeowners, using sound information, choose not to test their homes for radon.
Nevertheless, implementing such a criterion can be complex (Desvousges et al., 1988). ------- Integrating Evaluation: A Seven-Step Process 121 Despite the subtleties involved in developing a risk communication framework, the need for such a framework is critical. It is necessary to provide a clear definition of risk communication and a sound basis for evaluating its effectiveness. Seven Steps of Evaluation Two separate campaigns concerning radon used seven steps to evaluate risk communication effectiveness. One study, which took place in New York State, aimed to communicate with 2,300 homeowners throughout the state who had already tested for radon (Johnson et al., 1988). The other study was carried out in Maryland, and its objective was to inform homeowners in two communities about radon tests (Desvousges et al., 1988). EPA cooperated with the states in both studies. The seven steps used to evaluate risk communication effectiveness are as follows: 1. Define risk communication objectives 2. Design communication program 3. Determine measures of program effectiveness 4. Design effectiveness evaluation 5. Develop implementation plan 6. Evaluate 7. Determine communication effectiveness To define the risk communication objectives (Step 1), the purpose, significance, and constituency must first be defined. For example, what is the risk communication expected to accomplish? Why is the risk communication important, and how can importance be shown? Who will benefit from the risk communication, and why is it important to reach those people? Once these questions have been answered, clearly defined objectives can be stated. Designing the risk communication program (Step 2) demands attention to integra- tion and workability. Features include such activities as mailing informational brochures, setting up toll-free numbers for questions, or providing diagnostic assistance. Integration involves defining a message (e.g., radon is a serious health risk; you may be at risk), publicizing that message through the media (print, radio, or television), and targeting a specific audience for the message. Workability involves pretesting to see whether the program works. This can be done through focus groups, expert review, or one-on-one evaluations. Step 2 can be envisioned as a funnel. Many ideas and alternatives are narrowed down, and the best ones are chosen for use in the risk communication program. To determine how to measure program effectiveness (Step 3), participant evaluation or perceptual/behavior indicators are useful. However, participant evaluation is often misleading because, although participants might respond that a program was effective, the actual measured perceptual and behavioral changes might be small. Participant evaluation provides good qualitative but not good quantitative information. Step 4, designing effectiveness evaluation, involves defining the target population(s), identifying the experimental design, and setting controls or limits. Developing the implementation plan (Step 5) involves choosing a plan such as a sampling plan (e.g., stratified random sampling or random digit dialing) or a survey plan (e.g., baseline and follow-up telephone survey; mail follow-up survey). ------- 122 Measuring Accomplishments To do the evaluation correctly (Step 6), the program activities and the evaluation must be integrated; evaluating only after completing the activity does not allow valuable feedback to be incorporated into the activity to improve its results. 
Evaluation activities can include developing and using a questionnaire, training interviewers, and establishing quality assurance.

Analyzing the data and summarizing the findings are necessary to determine communication effectiveness (Step 7). Simple or complex measures can be used to analyze the data. The most informative measures involve giving a pre- and post-survey questionnaire and then estimating changes in behaviors and intentions. Simple measures usually involve changes in means or proportions. For example, in the Maryland study, EPA evaluated changes in the proportions of people aware of radon, changes in the mean number of correct answers on a radon quiz, and changes in the proportions of people testing their homes for radon. More complex measures involved developing models to describe changes in knowledge, attitudes, and behavior (Desvousges et al., 1988). The same basic data were used for both types of analysis, indicating the importance of carefully planning the overall evaluation process.
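As an editorial illustration of these "simple measures," the short sketch below (in Python, with hypothetical tallies rather than the actual Maryland survey results) computes the three quantities named above from baseline and follow-up survey summaries:

# Minimal sketch of Step 7 "simple measures" from hypothetical baseline
# and follow-up survey tallies (not the actual Maryland data).
baseline = {"n": 1200, "aware": 660, "quiz_correct_total": 4800, "tested": 72}
followup = {"n": 1150, "aware": 920, "quiz_correct_total": 6900, "tested": 115}

def rate(survey, item):
    # proportion of respondents with the given characteristic
    return survey[item] / survey["n"]

changes = {
    "proportion aware of radon": rate(followup, "aware") - rate(baseline, "aware"),
    "mean correct answers on radon quiz":
        followup["quiz_correct_total"] / followup["n"]
        - baseline["quiz_correct_total"] / baseline["n"],
    "proportion testing their homes": rate(followup, "tested") - rate(baseline, "tested"),
}

for measure, change in changes.items():
    print("change in %s: %+.3f" % (measure, change))

The "more complex measures" mentioned above would instead fit models to the individual responses, for example relating testing behavior to campaign exposure and household characteristics, but they would draw on the same underlying survey data, which is why the overall evaluation must be planned as a whole.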
Implications for Risk Communication Evaluation

The radon experiences suggest four benefits that an agency gains from evaluating its risk communication activities:
• Determining what works and what does not
• Providing ideas for program changes
• Establishing credibility
• Enhancing program effectiveness

The following recommendations are based on experiences in several evaluations of radon risk communication:
• Make objectives explicit
• Use attitudinal, perceptual, and behavioral indicators
• Establish experimental controls
• Pretest program materials and evaluation materials
• Integrate evaluation design and analysis

This paper has drawn primarily from experiences gained with evaluating risk communication for radon. Whether its conclusions apply to other risk communication experiences is an important issue that needs to be addressed in future studies. Another important need is an evaluation guidebook based on a comprehensive evaluation framework.

REFERENCES

Creighton, J.L. 1988. A Comparison of Successful and Unsuccessful Public Involvement: A Practitioner's Viewpoint. Paper presented at the Annual Convention of the Society for Risk Analysis, October 31-November 2, Washington, D.C.

Davies, J.C., V.T. Covello, and F.W. Allen. 1987. Risk Communication. Washington, D.C.: The Conservation Foundation.

Desvousges, W.H., V.K. Smith, and H.H. Rink. 1988. Communicating Radon Risk Effectively: Radon Testing in Maryland. Overview and Summary of Survey Results. Final report prepared for the Office of Policy, Planning and Evaluation, U.S. Environmental Protection Agency. Washington, D.C., and Research Triangle Park, North Carolina: Research Triangle Institute, October.

Johnson, F.R., et al. 1988. Informed choice or regulated risk? Lessons from a social experiment in risk communication. Environment 30:12-15, 30-35.

Russell, M. 1988. Risk Communication: On the Road to Maturity. Paper presented at the Workshop.

Smith, V.K., et al. 1987. Communicating Radon Risk Effectively: A Mid-Course Evaluation. Prepared for the Office of Policy Analysis, U.S. Environmental Protection Agency, under Cooperative Agreement No. CR-811075, by Vanderbilt University, Nashville, Tennessee, and Research Triangle Institute, Research Triangle Park, North Carolina.

U.S. Environmental Protection Agency. 1987. Unfinished Business: A Comparative Assessment of Environmental Problems. Washington, D.C.: The Agency, February.

-------

UNDERSTANDING OMB PROCEDURES

-------

OMB Survey Clearance Procedures

Richard Eisinger

The federal government's Office of Management and Budget (OMB) grants clearances for surveys, including those performed as part of a program evaluation. Any federally sponsored survey of ten or more individuals must go through the OMB clearance process.

The Office of Information and Regulatory Affairs (OIRA) in OMB is responsible for granting survey clearances. The OMB is given the authority for survey clearance under three separate legal authorities, one of which is the Paperwork Reduction Act of 1980. The purposes of this legislation are to reduce the burdens placed on individuals and to coordinate the government's surveying activities. The primary authority for survey clearance comes from the President's Executive Order on Regulations, requiring OMB to approve all federal rules, many of which have reporting requirements. The OIRA is legally required to follow the President's orders. However, most OMB decisions are affected more by questions of survey design than by political policy. The OIRA staff are mostly economists, lawyers, and public policy experts, with a few social scientists. Therefore, there is a small staff with expertise in surveys and data collection.

The factors that OMB staff consider for survey clearance are:
• Duplication: Are there other sources of the same data?
• Burden on the public: Will the survey require an unfair or unnecessary effort on the part of the public?
• Cost: Can the survey be done for less money?
• Practical utility: How will the results be used? This can be called the "So What?" factor. The OMB must be convinced that the results of an evaluation will be used to change something. This is a primary criterion for risk communication evaluations.

Clearly, these are important questions to be answered in designing any survey, whether or not OMB clearance is required.

Another important consideration is cost versus potential benefit. This was important, for example, in an evaluation done by the Food and Drug Administration (FDA) on Patient Package Inserts (PPIs), information sheets for consumers on the effects and possible adverse reactions of specific drugs, which the FDA Commissioner wanted all pharmacists to distribute at the point of prescription drug sale. The national cost would have been $20 to $100 million. The Rand Corporation studied the effects of PPIs and found there was an increase in knowledge but no change in behavior. Therefore, other less costly alternatives had to be considered as a result of the study. In this case, investment in evaluation prevented a requirement that would have cost society millions of dollars with unproven benefits to the consumer.

-------

OMB Regulatory and Approval Requirements

Susan E. Dudley

The objective of OMB's Office of Information and Regulatory Affairs (OIRA) is to ensure that government activities do more good than harm. Two of the primary governmental activities that affect the public are 1) regulations, and 2) paperwork and reporting requirements. The OIRA is responsible for weighing the effects of these activities against their intended results to ensure net benefits to society.
Regulations With respect to regulations, the OMB operates according to procedures outlined in the following two Executive Orders: Executive Order 12291—Federal Regulation (Federal Register 2/17/81); and Executive Order 12498—Regulatory Planning Process (Federal Register 1/4/85) The "general requirements" of Executive Order 12291 are itemized in Section 2, which stresses that net benefits should be maximized whenever any agency promulgates new regulations, reviews existing regulations, or develops legislative proposals concerning regulations. This Executive Order requires agencies to prepare Regulatory Impact Analyses (RIA) for all major rules. Executive Order 12498 builds upon the previous Executive Order by requiring each agency, subject to the provisions of 12291 (which includes EPA), to submit to the OMB an annual statement of its "regulatory policies, goals, and objectives for the... year and information concerning all significant regulatory actions underway or planned" (Section 1). When a regulation is proposed by EPA, the agency is required to send four copies of the proposed regulation to OMB. Within OMB, these copies are distributed to: 1) the desk officer; 2) the budget examiner; 3) the regulatory analyst; and 4) the public file. The copy in the public file is not actually accessible to the public until the rule is published in the Federal Register. The OIRA's review is based solely on the regulatory agency' s record and comments from other agencies in the Executive Branch. The OIRA staff do not communicate with anyone outside the government regarding regulations and thus can focus on consumer welfare without being influenced by special interests. When the rule is approved, OMB notifies the Agency. 129 ------- 130 Understanding OMB Procedures The Executive Order anticipates the following review time requirements within OMB: • For proposed major rules, 60 days • For final major rules, 30 days • For nonmajor rules, 10 days However, the review period may be extended. The secret to ensuring a smooth review process is to demonstrate that the draft rule would make society better off; that demonstration includes an examination of alternative approaches to addressing the problem and a good RIA. Risk communication programs are not covered by the Executive Orders per se. However, because risk communication can be a good substitute for less efficient command-and-control approaches to regulation, an RIA may determine that such regulation is unnecessary. For example, with radon, perhaps individual homeowners can help ameliorate the problem, minimizing the need for federal regulation. The requirements of Executive Order 12291 (and 12498) are not affected by the recent court decision on vinyl chloride, according to which the Agency's determination of what is "safe" cannot consider cost, because the requirements of the Executive Orders do not override statutory requirements. Nevertheless, under the Executive Order, agencies are required to choose the least costly approach to meeting a statutory goal. Therefore, once a decision to regulate is made, then cost is considered. This is consistent with the court decision. 
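The net-benefit comparison that a Regulatory Impact Analysis is meant to support can be illustrated with a minimal sketch (in Python; the alternatives and all dollar figures are hypothetical, and an actual RIA involves far more than a single subtraction). Each option for addressing a problem, including an information or risk communication program offered as a substitute for a command-and-control rule, is compared on the benefits and costs it would produce:

# Minimal sketch of a net-benefit comparison across hypothetical alternatives
# (figures in millions of dollars per year; a real RIA would also treat
# uncertainty, distributional effects, and statutory constraints).
alternatives = {
    "command-and-control rule":   {"benefits": 120.0, "costs": 95.0},
    "performance standard":       {"benefits": 110.0, "costs": 60.0},
    "risk communication program": {"benefits":  70.0, "costs": 15.0},
}

for option, figures in alternatives.items():
    figures["net_benefit"] = figures["benefits"] - figures["costs"]
    print("%-28s net benefit = %6.1f" % (option, figures["net_benefit"]))

preferred = max(alternatives, key=lambda o: alternatives[o]["net_benefit"])
print("largest net benefit:", preferred)

Under this logic, the option with the largest net benefit that still meets the statutory requirement is preferred; and once a decision to regulate has been made, cost enters in choosing the least burdensome way of meeting the statutory goal, as noted above.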
Paperwork and Reporting Requirements Three federal documents are especially pertinent to paperwork and reporting requirements: the Paperwork Reduction Act of 1980; the Code of Federal Regulations (Volume 5, Part 1320): Control of Paperwork Burdens on the Public; Regulatory Changes Reflecting Amendments to the Paperwork Reduction Act; Final Rule (5/10/88); and "Information Collection Requests (ICRs)" (Fact sheet published by the EPA Office of Policy, Planning and Evaluation, dated 11/87). The ICR fact sheet summarizes the Paperwork Reduction Act of 1980 and the rule development process for EPA. The topic of paperwork and reporting requirements is more applicable than regulatory requirements to evaluation and risk communication. Paperwork and reporting require- ments pertain to notifications, surveys, questionnaires, and other types of information collection. The Paperwork Reduction Act directs the OMB to review and approve all collections of information from the public based on the following criteria: • The collection has practical utility. • It is not duplicative. • It is the least burdensome method to the government and the public of obtaining the information. The requirements for OMB approval pertain to any survey or questionnaire requesting information from ten or more persons. This includes mandatory and voluntary requests for information (see 5 CFR 1320, Subpart 1320.7: "Definitions," Paragraphs C(l), C(2), and ------- OMB Regulatory and Approval Requirements 131 Agencies are required to list the estimate of burden hours in the ICR. If the actual respondent burden of the approved survey is excessive, respondents may comment to the agency or to OMB. This mitigates somewhat against an agency underestimating the burden hours. One method of more accurately predicting burden hours is to pretest the survey instrument and observe how long it takes respondents to complete the survey. Pretest instruments need to be submitted for approval only if they will be used to collect data from ten or more respondents. The pretest instrument should be submitted as part of the ICR, which contains the draft instrument to be fielded. The OMB desk officer may request revisions in the pretest instrument even though it is going to fewer than ten respondents and does not require formal OMB clearance. When the ICR is sent from EPA to OMB, a copy goes to the desk officer, who may request review by a budget examiner. Normally, the OMB is required to respond within 60 days of its receipt of the agency ICR, but this can be extended to 90 days. If OMB takes no action within 60 days, and there has been no extension, EPA can request an OMB control number, which OMB is obligated to provide. At the time of submission of the ICR, the EPA also places a notification in the Federal Register, informing interested persons to contact the OMB desk officer or the Agency for further information or comments. This is unlike the regulatory review process, where public comments are not solicited for pending rules. Usually OMB does not receive any public comments on ICRs. As a general rule, the OMB does not complete its review of ICRs for 45 days to allow time for the interested public to comment. However, there are expedited review procedures (fully explained under 5 CFR, Part 1320, Subpart 1320.18: "Emergency and Expedited Processing"). The procedures for approval are somewhat flexible. 
As an example of this flexibility, in one recent case OMB approved one segment of a survey, a technical portion, while disapproving another segment that required further refinement. This enabled the regulation development process, which was dependent on the results of the first segment, to commence sooner than would otherwise have been the case. The ICRs in EPA are processed through the Information Policy Branch of the Office of Policy, Planning and Evaluation (OPPE) prior to their transmittal to OMB. In addition to the three criteria noted earlier, those cited in "Evaluation for Risk Communicators" (Arkin, 1988) are good guidelines to help ensure prompt review and approval. The EPA does have the expertise to provide assistance to agencies in designing surveys to meet the required OMB criteria. The Statistical Policy Branch of OSR in EPA serves as an information resource for the Agency in the design of instruments, and EPA's OPPE also has statistical expertise. There is an apparent tension between the regulatory review and the information collection review functions of OMB; the goals appear to conflict. That is, with regulatory review, choosing the best alternative for society requires an agency to collect information to analyze the situation, while the information collection review focuses on minimizing the burden of information collection on the public. However, providing good information to the public is important and, if a survey is well designed, it can meet the overarching requirement of the Executive Order of doing more good than harm. As a result, OMB is supportive of risk communication programs and evaluations of their success. The OMB publishes compendia of the regulatory programs and agendas of federal agencies. The Regulatory Agenda is a compilation of all regulations that agencies plan to promulgate within six months; the Regulatory Program is an annual compilation of forthcoming significant regulatory actions.
REFERENCE
Arkin, Elaine Bratic. 1988. Evaluation for Risk Communicators. Paper prepared for the Workshop on Evaluation and Effective Risk Communication.
------- USING EVALUATION CASE STUDIES ------- Introduction Elaine Bratic Arkin Whether the issue is smoking or a Superfund site or pesticides in the drinking water, one challenge for risk communicators is that these risk messages, as warnings, come laden with extra burdens. What we tell people may enlighten them, but it may have an equal chance of confusing them; it may be frightening or reassuring; it could cause denial or alarm or anger. Messages about risk may motivate people to action or to frustration. At a previous conference, one panelist from a government agency said, "Why complicate the issue? Why talk about public frustration and anger and all of those things?" He said, "My job at the federal level is just to get the word out. Why is this such a big deal?" Although some people may think of the communication challenge as "just getting the word out," in reality we know that there is always a purpose for the communication. The purpose may be to encourage someone to seek information or help or protection, to change behavior, or to participate in policymaking or in the enforcement of existing laws and policies. So we need to examine the reasons for communicating about risks prior to designing risk messages, and then to look at the results. We need answers to questions such as: Did anyone hear what was said? Who listened? Did they understand? Did they agree? What happened as a result?
Beyond getting the risk information out, there is a compelling need to answer questions of evaluation to justify risk communication efforts to taxpayers or to the sponsoring agency or company. Risk communication frequently has been considered an auxiliary activity to risk assessment and risk abatement. Answering questions about the results and value of risk communication is necessary to prove that this is a separate discipline, and that it requires professional knowledge and skills. Evaluation efforts are necessary to decide whether risk communication works as intended, to make sure that it does no harm, to know how it works, with whom, and how it should be altered in the future. In addressing evaluative questions, we must consider not only how to find the answers, but the obstacles and barriers to evaluation and how to overcome them. Such obstacles include agency restrictions, resource limitations, and the fact that the discipline of risk communication is still under development. 135 ------- 136 Case Studies This group of papers explores experiences with risk communication evaluation. Some are descriptions of entire programs, while others discuss experiences with particular methods or aspects of evaluation. Together they provide an illustrative range of perspectives from federal, state, and municipal agencies and from the private sector. ------- The National Cancer Institute Shelagh Smith Frequently, a risk communicator also must wear the hat of an evaluator. That can be both an advantage and a disadvantage. Many people think that evaluation is difficult, or a burden, and they panic. At the National Cancer Institute (NCI), there is a staff person in charge of evaluation, and this can lead to one of the primary problems in evaluation: Often, program staff are asked, at the end of a program, to determine whether it was successful. A prime task of the evaluator is to educate program managers about what kind of evaluation is feasible—and what is not—and about the need to build evaluation into a program from the beginning. Evaluation cannot be tacked onto a program once it is completed. In the federal government, there is a certain amount of commitment, funding, and support for evaluation, and for that reason, some evaluations conducted at the National Cancer Institute (NCI) may not be feasible under other circumstances. NCI staff have the expertise to plan and to conduct surveys and to obtain clearance from the Office of Management and Budget (OMB). These tasks may be obstacles for risk communication staff at other agencies. On the other hand, the disadvantage of having greater evaluation resources is that there is a temptation to design evaluation strategies that are more elaborate than necessary. It is helpful to remember that evaluation exists only to support a program. Without the program, evaluation is not necessary. Therefore, the evaluation design should fit the context and scale of the program it supports. Also, not all risk communication results are measurable, and not everything can or should be evaluated. For example, there are some programs for which pretesting (formative evaluation) is more applicable than evaluation of the program results. In order to make the best decisions about what and how to evaluate, it is important to review the reasons for evaluation. 137 ------- 138 Case Studies Why Evaluate? First, evaluation tasks can provide information for future planning. Evaluation provides program direction. 
Evaluation can demonstrate accomplishments and help to answer the questions of program managers, policymakers, and others. Evaluation can answer questions at the pre-production, production, and results stages. In pre-production, questions might include: Who is the audience? What are their needs (needs assessment)? What revisions are needed in draft messages (pretesting)? In the production and program implementation stages, questions that process evaluation can help answer include: What was produced? How many were produced and distributed? How long did it take? What did it cost? Who was the audience? Were they exposed to the message? At the final stage, questions to be answered include: Did members of the audience receive the message? Did they learn? Did they change the way they think or behave (outcome evaluation)? Evaluation can be used to apply successful methods to new programs, to revise current programs, or to plan. Evaluation and planning go together. Measurable objectives and defined goals are essential if one is to be realistic and objective about what kind of program—and evaluation—is feasible. Formative and Process Evaluation The first evaluative step in the program planning process is needs assessment. For example, at NCI, we conducted a survey of needs for educational materials among 100 hospital-based patient educators. Another evaluative activity is testing message concepts to identify the best way to communicate about a risk. For example, NCI conducted six focus groups to classify profiles of people to help shape "Eat for Health," a joint NCI/Giant Food consumer nutrition education program designed to change people's behaviors in buying, consuming, and preparing high-fiber, lower-fat food. The focus groups helped formulate ideas to kick off and shape this program. A third kind of activity is pretesting. One pretest at NCI showed that we were using illustrations and a title for discouraging use of chewing tobacco that were not appealing to adolescents. As a result, the booklet, "Chew or Snuff is Real Bad Stuff," was revised to be more appropriate for the intended audience. Another important activity is process evaluation. At NCI, we collect data routinely, but it was not being organized regularly into summary reports for staff to use in assessing questions such as "Where are we? How many phone calls are we receiving? How many brochures are being requested? Are we reaching our target audience?" This process data is useful to make sure that a program is on course and to permit any adjustments necessary while the program is still underway. NCI recently has prepared a plan for analyzing this process data. These are the types of evaluation most traditionally used by programs. They are inexpensive and use existing, accessible data and resources. They can be undertaken on a small scale, without large population-based surveys, which are very expensive and require considerable expertise. Sometimes a combination of methods will provide the best evaluative picture. For example, pretesting draft materials using focus groups and intercept interviews can provide two kinds of data to analyze for a more complete assessment. A new undertaking at NCI is an audience segmentation survey. Nationally representative, it is called a "psychographic survey." The NCI will send seventy-five value statements on separate cards to approximately 2,000 randomly chosen people.
Respondents will be asked to sort the cards in order of priority, i.e., by how much a respondent agrees with the value statements. There will be value statements on issues related to cancer and health, but also unrelated subjects such as "I like to watch television on Sunday nights." NCI hopes to use the results to segment the national population into subgroups using factors other than simple demographics (e.g., age, sex, education and income). The intent is to target different types of people based on their different values. Although commercial marketers and political pollsters target according to psychographic characteristics, this is innovative for health communications. Outcome Evaluation The next type of evaluation—outcome evaluation—is more difficult. It takes more time, and is more expensive. It may be something imposed upon a program by a policymaker or through public demand. At NCI, a true experimental design is being used to evaluate one communication program. The difference between the true experimental design and the quasi-experimental design is randomization. In the true experimental design there are specified criteria for subjects who may volunteer. These subjects are then randomly assigned to an intervention or control group. Quasi-experimental design is used when randomization is not possible. For example, NCI's "Eat for Health" program with Giant Food is being tested in Washington, D.C., using Baltimore as the control site. Obviously, NCI could not randomize people going to supermarkets in one city, so a different city was chosen as the control site. Although not randomized, this test is a quasi-experimental design because there is the comparison group in Baltimore. A more feasible variation of outcome evaluation is field testing, or pilot testing. A few years ago NCI conducted a breast cancer education pilot test on a small scale with AT&T in New Jersey. The results showed that the program resulted in changes in both knowledge and practice of breast self-examination 5 months after intervention. As a result, the program was implemented on a larger scale. Obstacles to Evaluation There are a number of obstacles to evaluating risk communication programs. For federal agencies, one is OMB clearance. Also, evaluation can be expensive, although qualitative methods are often affordable. In addition, evaluation is time-consuming, and policies governing evaluation may be predetermined by the agency involved. Further, not all risk communicators have sufficient skills to design and conduct evaluations. If this is the case, agencies can contract for assistance or consultation or tap into university-based talent. There are evaluation methods that are not as difficult or complicated, including those used for pretesting and process evaluation. However, it is necessary for risk communication program managers to become familiar with the options that are available. One source of help is NCI's new publication Making Health Communications Work. What is realistic? An impact evaluation may not be. It may not be possible to say that a program was successful and resulted in changes in behavior. But some type of evaluation usually is feasible. Many people have unrealistic expectations of what evaluation can do, especially if they are not well versed in risk communication or if they are not familiar with evaluation. And, of course, evaluation is not something performed as a program's last step.
So, one obstacle may be the difficulty of deciding how to measure a program's effects if evaluation was not anticipated and planned early in the program. In conclusion, some recommendations include: Keep it simple. Concentrate on qualitative methods, if quantitative methods are not practical. Use secondary data from existing sources if possible to simplify the work. And finally, consider the many technical resources and sources of data that exist to help make some type of evaluation possible.
------- New Jersey Department of Environmental Protection J. Herb, J.A. Shaw, H.L. Garie Government environmental regulators make management decisions using a variety of tools and mechanisms. Within the past ten years, the scientific community has recognized the value that risk assessment can contribute when used as a tool in environmental decisionmaking. Applied appropriately and with protective assumptions, risk assessment provides a logical, scientific basis for protecting public health through environmental management. However, while the utilitarian and scientific aspects of risk assessment are clear, uncertainties and assumptions inherent in the process often cause skepticism in the affected community. This makes environmental decisions based on risk assessment particularly difficult to explain to the general public. These uncertainties prompt many citizens to consider risk assessment with suspicion and disbelief and, as a result, they may not accept or understand regulators' decisions. The case study outlined in this paper specifically concerns communicating with the public about an environmental health risk that was determined through the use of risk assessment. All too often, risk communication is tagged on as a final piece to an overall risk management strategy and, when put into practice, the risk communication effort is more a risk "telling" strategy; that is, the agency informs the public of the decision regarding an environmental health risk rather than actually "communicating" with the public about the situation. It is the belief of the regulators in the case described below that risk assessment is best communicated to the public when a proactive communication effort is integrated early in the risk management process. In addition, the communication effort should be designed to allow two-way communication with the public throughout the process. Background Union Lake in Millville, Cumberland County, is the second largest freshwater lake in New Jersey. With a statewide reputation as a sailing and fishing lake, it is a popular recreation area. The lake is south of the Vineland Chemical Company (Vi-Chem), now a Superfund site, located along the Blackwater Branch of the Maurice River in the city of Vineland. The water and lakebed sediments are known to be contaminated with arsenic, believed by the New Jersey Department of Environmental Protection (NJDEP) to have originated at the Vi-Chem site. Up until the spring of 1987, analyses of the lake indicated that recreational exposure to the water itself did not pose a health threat. During the spring of 1987, the state began to undertake plans to reconstruct the lake's 119-year-old dam, which posed imminent hazards to life and property. At this time, NJDEP learned that it was necessary to lower the lake level considerably in order to perform the reconstruction work.
It was the pending lowering of the lake that prompted the NJDEP Division of Science and Research (DSR), in conjunction with the New Jersey Department of Health (NJDOH), to conduct a risk assessment of the potential health risks posed by recreational use of Union Lake. Lowering the lake waters would increase the amount of arsenic-contaminated sediment exposed. The health risk assessment concluded that the arsenic contamination in exposed lake bottom sediments would result in an unacceptable level of risk. As a result of the risk assessment, NJDEP, as well as the Cumberland County Health Department, decided to close Union Lake for all uses. The branch of DSR that undertook the risk assessment and risk communication program at Union Lake is known as the Office of Environmental Health Assessment. This Office is the central component of an overall environmental assessment program initiated by New Jersey Governor Kean in spring 1986. The overriding philosophy of the office is that integration of risk assessment, risk communication, and risk reduction allows the public to be included in the decisionmaking process, which significantly increases an agency's ability to protect public health. Recognizing the Need for Communication As a result of the preliminary data from the Union Lake risk assessment, it was clear to the regulators that the contaminated sediments posed a potential health risk. They immediately consulted with the Risk Communication Unit (RCU) staff and integrated them into the NJDEP-NJDOH team that was assessing the situation. From that point, the role of communications was an integral part of the overall strategy at Union Lake. Matters of what to communicate to the public and how to communicate it were weighed equally with technical aspects of the risk assessment and long-term policy issues. This strategy was manifested in several ways: RCU staff were directly involved in the NJDEP-NJDOH planning meetings; technical staff were directed to devote as much time and energy as necessary to respond to requests for assistance from RCU staff; resources needed to implement communication strategies were made readily available; and RCU concerns were addressed in the development of policy. It is particularly important to note that it is often the communications staff that the public confronts with policy questions regarding coordination within the agency, long-term plans for the site in question, and the process under which environmental health issues are addressed. For example, in the case of Union Lake, it was difficult for staff to explain the health risk assessment to the public without addressing concerns about why the dam reconstruction could not be accomplished without lowering the lake. This demonstrates the necessity of having the communications staff identify technical and policy issues that concern the public and bring these to the attention of the agency. In other words, the communications staff can serve as the voice of the public within the agency. In the Union Lake case, the communications staff were given the freedom and authority to raise concerns about policy issues, such as dam reconstruction, enforcement, and future research. However, this can work only if the communications staff are considered equal and important contributors to the overall agency effort.
It also should be noted that a deliberate effort was made to communicate with the public about the risk assessment process as well as the outcome of the Union Lake risk assessment. Strategy The following considerations guided RCU's planning.
• What is the purpose, or goal, for communicating an issue to the general public? In the Union Lake case, the purpose for communicating with the general public was to generate public support for and adherence to the ban on use of the lake.
• Who are the audiences affected and who are the audiences that we want to reach? In order to assess the Union Lake audience, RCU staff immediately moved to establish contacts in the community. First, the staff met with the Director of Health and Human Services for Cumberland County who, in addition to being a local government leader, is an active and well-known leader in the community. Second, RCU staff visited the site and surrounding neighborhoods. Civic, recreational, educational, and religious organizations and leaders were subsequently identified and contacted. Third, RCU staff reviewed past documents and newspaper articles regarding the site to further identify persons or groups that might be affected. In addition to identifying affected or interested audiences, RCU targeted specific audiences that needed to be informed. These audiences included local fishing and boating enthusiasts and the community schools. Two primary audiences were identified: a private sailing club located on the lake and a lakeside housing development that provided homeowners with direct access to the lake shores.
• What is the message that must be conveyed to each audience in order to reach it? In some communications efforts, there is a need to convey different messages to different audiences. However, the message in the Union Lake case was single and clear: contact with the sediments may pose a significant health risk. The RCU decided that this same message should be conveyed to all audiences, including people boating, swimming, and fishing, schoolchildren, and curiosity-seekers.
• What strategies should be used to get the message across? The RCU determined from the outset that the most effective way to reach the public and generate trust in the decision to ban use of the lake was to use the county health department as the contact agency for local citizens. This allowed the public to acknowledge that the local government supported the actions of the state and increased public acceptance of the ban. The RCU also strived to make interpersonal communications the preferred form of interaction between the agency and the public.
The following table summarizes the specific strategies used to communicate the potential risks posed by use of Union Lake to the general public. Several strategies are particularly important to note. First, the RCU staff planned and implemented a proactive two-way communications strategy with the community and also acted to raise the community's concerns within the agency. Second, through discussions with local leaders, the RCU recognized the importance of local newspapers in conveying information to the general public and developed background papers for the media on technical topics.
Steps Used to Deliver Union Lake Message
• Established contact with local officials
• Prepared factual briefing materials for press, officials, and others
• Arranged a press conference to announce ban on lake use
• Assisted regulatory personnel in developing language for signs posted around Union Lake
• Coordinated distribution of written materials to local audiences
• Arranged follow-up meetings with key local interest groups
• Contacted local schools, churches, and hospitals to offer educational assistance about the lake ban
Summary: Elements of Success The Union Lake case points out several key factors that must be included in efforts to integrate risk communication and risk assessment into a comprehensive risk management approach: First, the communications staff were involved as early as possible and were encouraged to raise concerns of the local community with technical staff and policymakers. Second, communication was considered an integral part of the overall agency program pertaining to Union Lake. Communications staff were considered part of a team, which also included policymakers and technical staff. Third, communications staff were given time and resources not only to plan a proactive communications strategy, but also to respond to public concerns. The communications staff's persistence in spreading the word and responding to the public in a timely and responsible manner increased the credibility of the agency and, in turn, led to increased adherence to the ban. Fourth, communications and technical staff conveyed not only the results of the risk assessment, but the process of risk assessment as well. Fifth, communications staff did some of the actual communicating but also acted as a liaison between the community and technical staff and also established mechanisms and forums for the technical staff to communicate directly with the local community. Sixth, the communications staff worked within existing networks in the community and relied on local groups and government agencies to funnel information to individuals. This strategy allowed the community to understand that local agencies endorsed the state's actions and, by bringing the communications strategy down to a local scale, it increased the opportunities for local citizens to voice concerns, to which either local or state agencies eventually could respond. Seventh, the communications staff were committed to developing a communications strategy that was interactive or two-way, rather than simply telling the local community of its findings and decisions. The RCU is able to identify specific factors that allowed the communications strategy to successfully influence the overall risk management program at Union Lake. However, it recognizes that the Union Lake case is not an ideal model for two-way, up-front risk communication. Specifically, there are two aspects of an ideal communications approach that the RCU would have liked to integrate into the Union Lake case. First, although the RCU staff were brought into the Union Lake case as soon as a risk assessment began to show the presence of potential public health risks, this timing cannot be considered "up-front" in the ideal sense. The RCU intends to explore the potential for integrating affected communities into the process before the risk assessment is conducted so that communities actually are involved before the problem is identified.
Second, although the RCU has evaluated the effectiveness of its communications efforts at Union Lake informally, no formal evaluation mechanisms were built into this communications strategy. Without formal evaluation, an assessment of the effectiveness of communications strategies is not fully reliable. The primary responsibility of the RCU is to conduct research and case studies to identify effective strategies for communicating environmental health risks to the public and for integrating the public into decisionmaking. Both communications evaluation and up-front integration of the public are the subjects of research investigations currently underway by the Risk Communication Unit.
READINGS
Faust, S.D., A. Winka, T.J. Belton, and R. Tucker. 1983. Assessment of the chemical and biological significance of arsenical compounds in a heavily contaminated watershed: Part II, The distribution of several arsenical species in a watershed. Journal of Environmental Science and Health A18(3): 389-411.
Faust, S., A. Winka, and T. Belton. 1987a. An assessment of chemical and biological significance of arsenical species in the Maurice River drainage basin (N.J.). Part I: Distribution in water and rivers and lake sediments. Journal of Environmental Science and Health 22(3) [need page numbers].
Faust, S., A. Winka, and T. Belton. 1987b. An assessment of chemical and biological significance of arsenical species in Maurice River drainage basin (N.J.). Part II: Partitioning of arsenic into bottom sediments. Journal of Environmental Science and Health 22(3) [need page numbers].
Hazen, R., L. Jowa, and J. Savrin. 1987. Risk Assessment for Recreational Use of Union Lake. New Jersey Department of Environmental Protection, Division of Science and Research.
------- CIBA-GEIGY Corporation, Toms River (NJ) Plant Thomas A. Chizmadia The Toms River Plant began operating in 1952 as the Toms River division of CIBA States Limited. It was later consolidated with Cincinnati Chemical Works Inc., which was owned by CIBA Limited, J.R. Geigy, S.A., and Sandoz Limited (all of Basel, Switzerland). By 1960, after the consolidation with Cincinnati Chemical, the site was known as Toms River Chemical Corporation. CIBA and Geigy merged in 1970. After Sandoz sold its remaining interest to CIBA-GEIGY in 1981, the site became the Toms River Plant of CIBA-GEIGY Corporation. Over the 35 years of the site's existence, the plant has manufactured a wide variety of dyes, additives, and adhesives in largely batch processes. Environmental facilities and procedures have evolved over this time period, as has the entire chemical industry, and consequently a variety of waste treatment and disposal techniques have been used at the facility—all of which are no longer considered advisable. Due to the nature of some of these techniques, contamination of parts of the site and some underlying aquifers has occurred. CIBA-GEIGY already has undertaken remediation of a portion of the contaminated area through the use of recovery wells, and full remediation through the Record of Decision under Superfund is anticipated shortly. To put the Superfund risk communication plan in its proper perspective, one must realize that the site was concurrently addressing many other environmental and waste disposal issues.
These issues included use of an ocean discharge pipeline that carried treated waste water from the plant's waste water treatment plant to the Atlantic Ocean (the heightened awareness of which was triggered by a leak in April 1984); a planned site expansion that involved reduction of the current synthesis of dyes and plastics and the construction of a pharmaceutical active ingredient manufacturing facility; and 1985 regulatory and legal issues that involved a $1.45 million fine by the New Jersey Department of Environmental Protection and an indictment issued by the State Attorney General in October of that year. Consequently, since the pipeline leak in 1984, the plant has been the subject of virtually daily coverage in the news media, the focal point of many environmental and citizens' groups, and the subject of various state legislative initiatives that have attempted to end its discharge of treated effluent into the ocean. This high level of public concern and interest is likely to continue into the foreseeable future. In regard to Superfund, the site was placed on the national priorities list in December 1982. The Remedial Investigation began in 1985 and the draft Feasibility Study was released in June 1988. Communications Goals Because of the many issues facing the plant, its overall communications plan addressed both short-term and long-term objectives, as well as segregating the issues and developing communications plans for each. This paper will describe the plan implemented for Superfund-related activity. The short-term goal of the overall communications plan was to improve the public understanding, image, and credibility of the Toms River Plant and CIBA-GEIGY by better educating the public about the plant's products, environmental activities, and improvements, and about its many contributions to the local community/economy. The plan's long-term goal was to restore the public's confidence in the plant's environmental protection efforts and technology by communicating that the plant could be a viable production site without creating or causing any adverse impact on the environment. The two key messages from the plant to the public at large about Superfund were that there was no health risk as a result of the groundwater contamination that had been identified both under and off the site, and that CIBA-GEIGY was committed to cleaning up the contamination without the use of public funds. The first message was critical for a successful risk communications program and was based upon an independent quantitative risk assessment conducted for CIBA-GEIGY by ENVIRON Corporation. ENVIRON concluded that no significant risk existed as a result of the Superfund site, and recognized that direct exposure to the contaminated groundwater was very limited since it was not used for domestic purposes. Activities To implement the plan, the plant undertook an aggressive informational campaign based on scientific data without ignoring the general human concern of whether health was affected. One of the biggest challenges of implementing the plan was to distribute technical data effectively in language easily understood by the lay public, and in a manner that clearly represented CIBA-GEIGY's concern for residents' interests and its ability to handle such a complex issue. Audiences were identified in the risk communication plan; priorities were given to regulatory and elected officials, the media, employees, retirees, and residents who lived adjacent to the site.
With messages and audiences identified, vehicles were established for communicating the messages. These included briefings for public officials (on an almost weekly basis, both formal and informal); editorial board meetings; plant briefings for the media; neighborhood meetings on the plant site; participation on a citizens' advisory committee established by the county governing body to discuss issues related to the plant site; and a door-to-door campaign conducted in August 1986, in Oak Ridge, the neighborhood adjacent to the eastern boundary of the plant. In addition, the plant periodically mailed information directly to the residents in Oak Ridge related to the status of the Remedial Investigation and what CIBA-GEIGY was doing to address the contamination. These items included the company's own material as well as information prepared by the EPA and independent consultants. Door-to-door campaign. The most effective outreach efforts were those that allowed face-to-face contact with residents. The two-day, door-to-door campaign in the Oak Ridge neighborhood represents one of the best examples of a coordinated effort between the communications staff and other corporate staff departments, and participation by the neighbors themselves. Specifically, this program set out to communicate with those neighbors who, through geography alone, were perceived to be at the highest potential risk related to the CIBA-GEIGY Superfund site. One of the questions that the door-to-door campaign set out to address was how many of these residents had functioning irrigation wells and how these wells were being used. (All the residents in the affected portion of this neighborhood were on municipal water service for domestic use. The wells servicing the municipal water supply were not affected by the plant.) Prior to this effort, various critics and opponents of the plant had stated that irrigation wells were in abundance in this neighborhood and that neighbors were coming in direct contact with the contaminated groundwater through the use of these wells for filling pools, watering lawns, washing cars, and other activities. After identifying the portion of the neighborhood under which the contaminated plume of groundwater flowed, a four- to five-block buffer zone was added, and this total area encompassed the contact points for the door-to-door campaign. Approximately 220 homes were in this area. Residents were contacted by two CIBA-GEIGY employees who explained that they were conducting the campaign to share information about Superfund and to inquire whether a homeowner had an irrigation well. In addition, print material related to both Superfund and drinking water standards was distributed. If the residents indicated they had an irrigation well, they were told that there would be follow-up contact regarding a well testing and/or closure program in which the company would be involved. After the two-day period, fourteen homeowners were identified as having irrigation wells. The subsequent program on testing and closure was managed by the plant communications staff who maintained direct contact with the residents. Anyone with a well was offered an opportunity to have his or her well sampled on a periodic basis by an independent certified laboratory (paid for by CIBA-GEIGY) or to have the well permanently closed according to a state-certified procedure (paid for by CIBA-GEIGY) with additional compensation for the inconvenience of losing the use of the well.
The objective of the program was to address directly any concerns that neighbors had about the purity or quality of the water they were using for non-domestic purposes. Approximately half of the well owners chose to close their wells (some of which, through sampling, proved to have no contaminants present) and half chose to have their wells monitored three times per year at the company's expense during the period of peak water use. Coordinated by the communications department, this program not only provided more data related to the extent of off-site contamination, but also met the objective of addressing concerns the homeowners had about the relative risk of continuing to use their wells for non-domestic purposes. Technical assistance grant. A further effort to address citizen interest in the CIBA-GEIGY Superfund issue was undertaken by the plant and the Ocean County Citizens for Clean Water (OCCCW), a local environmental organization formed in 1984. The two organizations set a national precedent when the plant contributed and the OCCCW accepted a $50,000 grant to retain professional consultants to study the Toms River Plant Superfund site. CIBA-GEIGY independently awarded the grant to the OCCCW to meet the intent of the Superfund Amendments and Reauthorization Act (SARA) of 1986, which provides for such grants from the EPA to environmental organizations. The plant offered to contribute the funds when it became apparent that the regulations allowing the EPA to provide such grants to citizens' organizations would not be promulgated prior to the resolution of the plant's Superfund issues. The awarding of the grant was made at a public ceremony on August 8, 1987 at the office of the late Congressman James Howard in Toms River. To date, the process has allowed CIBA-GEIGY, the OCCCW, consultants, and the EPA a forum for continuing an active and productive dialogue on Superfund. Evaluation Public opinion polling commissioned by the plant since 1984 has indicated a great deal of success in the outreach efforts related to Superfund. Of the five surveys conducted between December 1984 and January 1988, the most significant positive change attributed to communications efforts by the plant was seen in the Oak Ridge area. According to the results, residents here not only exhibit more knowledge about the company and more confidence in the environmental protection efforts of the plant, but they want to receive much more information. CIBA-GEIGY is continuing its efforts to meet the neighbors' desire for more information by continuing its Superfund communications efforts with the Oak Ridge community as a primary audience. Realistically, the company also understands that the issues will not be resolved overnight. As more information becomes available through environmental studies, the final Feasibility Study, and the Record of Decision to be issued by the EPA, CIBA-GEIGY will continue to recognize the social, political, and technical aspects of risk communication under Superfund.
------- National Heart, Lung, and Blood Institute John C. McGrath The primary function of the National Heart, Lung, and Blood Institute (NHLBI) is to support biomedical research in the area of heart, lung, and blood diseases. But in addition to supporting research, a high priority of this Institute is to transfer and translate the results of that research for the benefit of the American public. One important means of transferring research results is through National Risk Factor Education Programs.
The Office of Prevention, Education and Control coordinates three cardiovascular disease risk factor education programs: the National High Blood Pressure Education Program, the National Cholesterol Education Program, and the NHLBI Smoking Education Program. These three risk factor education programs were established because it is known that high blood pressure, high blood cholesterol, and smoking are the major modifiable risk factors for cardiovascular disease. This paper describes how NHLBI uses research data to evaluate the feasibility, progress, and results of its national education programs. It will pose and then answer four questions:
1. How do we know that high blood pressure, high cholesterol, and smoking are the three major modifiable risk factors for cardiovascular disease?
2. How do we know that modifying these risk factors will reduce the risk of cardiovascular disease?
3. How can we intervene so that those at risk modify their behavior in a way that reduces their risk?
4. How do we know that the intervention is effective?
Before addressing these questions, it is important to understand the theoretical framework for risk reduction education programs and some of the main sources of data used in the development, implementation, and assessment of the Institute's national education programs. Theoretical Framework for Risk Reduction Programs The science base is the firm foundation upon which risk reduction programs must be built. Basic research and applied research are conducted to investigate the cause and nature of disease. Knowledge validation supplies much of the basis for risk factor education programs, first through clinical investigations and then through clinical trials. When the results of clinical trials provide firm evidence of the benefits of controlling a risk factor, the Institute transfers that knowledge, first through demonstration and education research, and then through national education programs such as the National High Blood Pressure Education Program and the National Cholesterol Education Program. Sources of Data While several sources of data are used in the development, monitoring, and assessment of risk reduction education programs, the Institute relies most heavily on four sources.
• The Framingham Study. This study, begun in 1949, gathers data that describe and quantify the relative risk associated with specific risk factors. In this longitudinal study of residents of Framingham, Massachusetts, participants have been followed for forty years, their risk measured, and cardiovascular disease outcomes monitored.
• Data From Clinical Trials. Clinical trials are an important component of the biomedical research spectrum because their results can have immediate applicability to medical practice. Many of these large-scale clinical experiments are used to validate the efficacy of risk reduction.
• The National Center for Health Statistics (NCHS). This is the federal agency responsible for collecting much of the nation's health data. NCHS conducts several large-scale cross-sectional health surveys, including the National Health and Nutrition Examination Survey (NHANES), which gathers information on the health and nutritional status of Americans. The NHANES study is unique because it not only uses an interview in which medical history is obtained, but also an extensive physical examination including measurements of blood pressure, blood cholesterol, height, and weight.
NCHS also sponsors an annual National Health Interview Survey to determine self-reported health status and public knowledge and behaviors related to certain diseases and/or risk factors.
• NHLBI Surveys. The Institute has conducted several large-scale, national surveys of public and professional knowledge, attitudes, and practices related to high blood pressure and high blood cholesterol. Surveys of the public and high blood pressure were conducted in 1973, 1979, and 1982. A survey of physicians and high blood pressure was conducted in 1977, and an update has just been completed. Surveys of public and physician knowledge, attitudes, and practices related to high blood cholesterol were conducted in 1983 and 1986 and another will be conducted in 1989. In addition, the Institute soon will conduct a survey of nurses and dietitians.
Using the Data How can we use the above data to answer the four questions posed above? First, how do we know that high blood pressure, high blood cholesterol and smoking are the three major modifiable risk factors for cardiovascular disease? Data from the Framingham Study show that persons with high blood pressure have three to four times the risk of developing coronary heart disease and as much as seven times the risk of developing a stroke as do individuals with controlled or normal blood pressure. Likewise, the Framingham Study shows that the risk of developing coronary heart disease increases as blood cholesterol levels rise. The Framingham Study also shows that the more cigarettes one smokes, the greater the risk of developing cardiovascular disease, including stroke, atherogenesis, and vascular disease. The Framingham data provide a clear and causal link between three risk factors—high blood pressure, high blood cholesterol, and smoking—and cardiovascular disease. However, these data do not demonstrate that lowering these risk factors will reduce the disease risk. This prompts the second question: How do we know that lowering these risk factors will reduce the risk of cardiovascular disease? Data from several clinical trials have demonstrated that reducing high blood pressure, high blood cholesterol, and smoking will reduce morbidity and mortality caused by these conditions. For instance, the Veterans Administration Cooperative Study was initiated in 1963 to determine whether treating and controlling high blood pressure with antihypertensive medication would reduce resulting morbidity and mortality. The results were so positive that the trial was halted before its scheduled conclusion so those in the control group could benefit from the treatment. This trial was followed by the Hypertension Detection and Follow-up Program, which determined that controlling high blood pressure, particularly mild hypertension, through a vigorous treatment program would reduce morbidity and mortality. Much of the evidence concerning cholesterol comes from the ten-year Lipid Research Clinics Coronary Prevention Trial. Data from this Trial indicated that in individuals at high risk due to high blood cholesterol levels, a one-percent reduction in risk of coronary heart disease resulted from every two-percent reduction in cholesterol levels. The evidence of the benefit of smoking cessation comes from a variety of studies over the last twenty years. Many of those that deal with cardiovascular disease are reported in the 1983 Surgeon General's report on smoking: The Health Consequences of Smoking: Cardiovascular Disease.
These studies show that ex-smokers have a risk of death from cardiovascular disease substantially less than that of continuing smokers. In addition, the risk of developing chronic obstructive pulmonary disease is significantly reduced after smoking cessation. Having established that high blood pressure, high blood cholesterol, and smoking are significant risk factors for cardiovascular disease, and that reducing the risk factor reduces the risk, we come to the third question: How can we intervene so that those at risk change their behavior in a way that reduces their risk? As mentioned earlier, the Institute coordinates three education programs: the National High Blood Pressure Education Program, the National Cholesterol Education Program, and the NHLBI Smoking Education Program. These programs comprise a network of several federal agencies, more than 150 national organizations, fifty states, and several thousand community programs. At the core of the programs is a leadership entity called the coordinating committee. It consists of representatives of organizations with a diversity of interests including professional associations, voluntary organizations, hospitals, and citizens' groups. These education programs coordinate a wide variety of programs directed towards public, patient, and professional audiences in an effort to intervene and assist those at risk to lower their risk behaviors. Key components of the high blood pressure program and the cholesterol program are mass media campaigns. In developing the campaigns, the Institute follows a social marketing process described in detail in Making Health Communications Work, from the National Cancer Institute. An important first step in these campaigns is a strategy statement that provides the broad guidelines concerning target audiences and messages. In developing these strategy statements, data from NCHS, from NHLBI surveys, and from other sources are used extensively. For instance, data from the National Health and Nutrition Examination Survey reveal that prevalence rates of high blood pressure increase with age and are greater among blacks than whites. In addition, compliance with medication regimens is lower in men than in women and lower in younger than in older persons. Furthermore, data from the NCHS National Health Interview Survey showed that knowledge of high blood pressure is extremely high and that 92 percent of those surveyed had their blood pressure checked within the last 24 months. Based on these data showing high awareness combined with low compliance, particularly among men, the Institute identified a communication goal as compliance with therapy and the target audience as those whose risks are high and whose compliance with therapy is low, particularly younger men, both black and white. The strategy statement for the National Cholesterol Education Program identifies a different target audience and a different message strategy. Data from the 1983 and 1986 consumer awareness surveys sponsored jointly by NHLBI and the Food and Drug Administration showed that:
• Recognition of high blood cholesterol among adults was high; 81 percent of respondents had heard of the condition in 1986 compared to 77 percent in 1983.
• More people in 1986 believed that reducing elevated cholesterol levels would have a large effect on heart disease. The figure was 72 percent in 1986, up from 64 percent in 1983.
• But less than half of adult Americans reported having their blood cholesterol checked and only about seven percent knew their blood cholesterol level.
Based on these data, as well as data from NCHS indicating that approximately 50 percent of adult Americans have cholesterol levels above the desirable range, the cholesterol campaign identified the general public as the target audience. The message of the campaign urged people to get their cholesterol levels checked and to ask their doctor what the results mean. This is all part of an extensive process that eventually leads to products: radio and television public service announcements (PSAs), print ads, posters, and collateral material, all with specific risk reduction messages targeted to specific audiences. But before developing these messages, the Institute tests concepts in focus groups. Focus group participants are selected from among the target audiences (i.e., the general public, aware hypertensives), and can be selected to reflect the socioeconomic status of that audience. For example, in developing the cholesterol mass media campaign, NHLBI examined three different, potentially motivating concepts:
• Curiosity, a time-tested method of motivating people
• Getting on the health bandwagon—everyone else is doing it, so should I
• Taking control of one's health—taking responsibility
The focus groups revealed that curiosity was a much stronger motivator than either getting on the health bandwagon or taking responsibility for one's health. Participants were extremely curious about the fact that they might have high blood cholesterol and not know about it. On the other hand, taking responsibility for one's health was not particularly motivating because these participants generally thought of themselves as being in control of their health. In the next step of this process, the Institute developed a series of scripts based on the results of the focus groups. Before going into final production, however, the scripts were tested. In this phase of testing, it is common to use an animatic, a detailed storyboard placed on videotape with voice-over. Scripts can be tested in two ways: through focus groups and through central location intercept interviews. Finally, the Institute circulated the proposed scripts, along with a report documenting the process, to approximately 150 key constituents for field review. Reviewers include the liaisons at State departments of health, members of the coordinating committees, members of the Institute's ad hoc minority committee, as well as other interested individuals. When a reviewer makes a particularly salient comment, the script is modified accordingly. The final question is this: How do we know the intervention is effective? One way is to use national data on the status of awareness, treatment, and control of high blood pressure and high blood cholesterol. By comparing baseline data to subsequent survey data, the effectiveness of an intervention can be measured. For instance, adjusted death rates for coronary heart disease, stroke, and noncardiovascular causes declined beginning in 1972 when the high blood pressure program was initiated. Other survey data show an increase in awareness that high blood pressure can cause stroke: from 29 percent in 1973 to 38 percent in 1979 to 66 percent in 1982. While the causal link cannot be proved, most experts agree that the National High Blood Pressure Education Program had a significant impact on these developments.
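The kind of baseline-to-follow-up comparison described above can be sketched with the published stroke-awareness figures. The percentages are those given in the text; the per-wave sample size is a hypothetical placeholder, since the actual NHLBI survey sizes are not reported here.

    # Sketch of comparing awareness across survey waves (percentages from the text;
    # the sample size per wave is a hypothetical placeholder).
    from math import sqrt

    awareness = {1973: 0.29, 1979: 0.38, 1982: 0.66}   # "high blood pressure can cause stroke"
    n = 1500                                            # assumed respondents per wave

    years = sorted(awareness)
    for earlier, later in zip(years, years[1:]):
        p1, p2 = awareness[earlier], awareness[later]
        change = p2 - p1
        # standard error of the difference between two independent proportions
        se = sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
        print(f"{earlier} -> {later}: {change:+.0%} (margin of about +/-{1.96 * se:.1%})")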
To assess the impact of mass media campaigns, the Institute relies on several indicators. First, it includes a bounce-back card with each radio and television PSA it distributes. Typically, about ten percent of the cards are returned, and in a sense they are self-selective. The people who like the PSAs the most and the least tend to answer. While the comments are not generalizable to all stations, they provide an indication of what gatekeepers, i.e., the public service directors, think of the PSAs. Another indicator is provided by a monitoring service, Broadcast Advertisers Reports (BAR). This company monitors the Institute's PSAs (as well as those of several other Public Health Service agencies) in 75 of the country's largest markets. A monthly report includes the number of times the PSA was aired in each city, the times, and the value of that time had it been purchased. It is particularly useful to monitor this information from month to month. The Institute intervenes in some way if the numbers go down and congratulates itself if the numbers go up. Summary In summary, the NHLBI uses various sources of research data in the development, implementation, and assessment of its national education programs. These sources include data from the Framingham Heart Study, from the National Center for Health Statistics, from clinical trials, and from NHLBI surveys. It also uses all of these data in developing and assessing mass media campaigns. In addition, it supplements the existing quantitative data with qualitative data. The latter, frequently obtained in focus groups, tell how well specific concepts communicate and how well various messages communicate. Finally, the Institute assesses the impact of programs through national survey data and commercial monitoring services. Exhibit A summarizes these steps.
------- Exhibit A Using Evaluation in Developing PSAs The National Heart, Lung, and Blood Institute follows a twelve-month campaign development process consisting of six major steps:
1. Data Review—The Institute staff looks at the data on the prevalence, awareness, treatment, and control of high blood pressure and high blood cholesterol. Several sources of data are used including the second National Health and Nutrition Examination Survey (NHANES II) and the National Health Interview Survey. The purpose is to identify target audiences, and to identify the most effective information and education strategies to reach the target audience.
2. Concept Development—The Institute holds a one-day meeting with representatives from constituent groups such as State health departments, members of its coordinating committees, public health practitioners, and health professionals dealing with members of the target audience on a regular basis. The purpose of the meeting is to develop a series of concepts that can be used to reach the target audience. These concepts are then tested with members of the target audience.
3. Draft Scripts/Test Messages—Based on the most successful concepts, a series of scripts is developed and tested with members of the target audience. Testing procedures include focus groups and central location intercept interviews.
4. Field Review and Clearance—Scripts and storyboards for the PSAs are sent to approximately 300 key constituents along with a description of the message development and testing process. These people are asked to review and comment on the scripts.
If a pattern of comments emerges on some aspect of the script, the issue is resolved before production. 5. Production—Production for all of the high blood pressure and high blood cholesterol television PSAs distributed during the year takes place during a concentrated 2- to 4- day period. 6. Distribution—The Institute sends its television PSAs to a designated media coordinator in each state health department who then distributes the PSAs, often through personal delivery, to television stations throughout the state. Bounce- back cards and commercial monitoring services help evaluate stations' use of the PSAs. ------- New York City Health Department Robert W. Denniston Most people are aware that the consumption of alcohol can involve some risk, whether it is drinking associated with driving, high blood pressure, or alcoholism. For most healthy adults who drink, these are preventable problems. But for some people, in particular those in certain high-risk groups, such problems can be especially severe. The discovery of fetal alcohol syndrome (FAS) and the resulting risk communication messages is one example of how risk communication can affect the public's health. It was only about fourteen years ago that scientists established that alcohol consumption during pregnancy presents a high risk for birth defects. In fact, it is the number one preventable cause of birth defects; about 2 percent of all births (about 50,000 each year) involve fetal alcohol syndrome. This syndrome is lifelong, irreversible, and costly to both families and society. Following the publication of research identifying FAS, there was scientific consen- sus resulting in a Surgeon General's Advisory Statement in 1981 that "the safest choice is not to drink during pregnancy." The Advisory was supported by the American Academy of Pediatrics, the March of Dimes, and many other health related organizations. In addition, the 1990 Health Objectives for the Nation identified the need to increase public awareness about the risks of alcohol consumption during pregnancy. The under- lying assumption was that increased public awareness is necessary, but probably not sufficient, to assure behavior change. That is, improved knowledge is a logical antecedent of behavior change, empowering individuals to make informed decisions about matters within their individual control. The first step in designing a risk communication program is to identify the prevalence of the problem: in this case, a survey of high school seniors nationwide revealed that about two-thirds identify themselves as "current drinkers." About two-thirds of the high school senior girls said that they drink five or more drinks on an occasion about every two weeks. From other data, we know that certain populations, particularly women of child-bearing age, are generally heavier drinkers than older women. The next step in developing the risk message about FAS was to examine the knowledge of the target audience. About two-thirds of women are aware that alcohol 159 ------- 160 Case Studies consumption during pregnancy can cause birth defects. However, upon closer examina- tion, it appears that this awareness is very superficial. In fact, about one-third of respondents to a recent survey said that alcohol is a hazard that can lead to fetal alcohol syndrome, but only at the level of three drinks or more a day. There are other myths and misconceptions as well. For example, many people believe that distilled spirits are more hazardous than beer or wine. 
This belief is especially prevalent among younger audiences. Although the public health community's response to the need for risk communica- tion messages about FAS has been multifaceted, including mass media campaigns and information provided to health care professionals, one specific case study will be presented here. The City of New York responded with a nontraditional public policy decision—to require warning posters in establishments that sell or serve alcoholic beverages. New York's program was somewhat unusual also because it included an outcome evaluation; polls conducted before and after the campaign allowed planners to gauge the program's effect. The warning poster is at point-of-purchase, within some 8,000 commercial estab- lishments in New York City, and reads: "Warning: Drinking alcoholic beverages during pregnancy can cause birth defects." This policy is very controversial; it required 15 months to be approved by the city council. In fact, it was probably the controversy reported in the media that raised public awareness of FAS more than the warning posters. This is one positive outcome of good public policy discussion and open public debate. In order to evaluate the effects of this new policy, a Gallup poll was commissioned before the warning signs were posted; however, the poll was conducted after the publicity had begun. A second poll was conducted after the posting of the warning. In the first survey, 54 percent of respondents spontaneously mentioned alcohol as a risk factor for birth defects. This increased to 68 percent in less than a year as a result of the warning poster policy. Even more important was the increase in knowledge among those at risk. Seventy- six percent of women of childbearing age mentioned alcohol as a risk factor for birth defects, and 74 percent of those women who said that they had consumed alcohol in the last 30 days identified it as a risk factor. This public policy did result in increased public awareness. Drinkers became far more aware of FAS than nondrinkers. Also, there were big gains in refuting some myths and misconceptions, particularly those related to differing risks from consumption of different types of alcoholic beverages. For example, because beer is the beverage of choice in the United States, it was important to make the public aware through the warning signs, and through risk messages in other media, that beer is as likely as wine or distilled spirits to cause FAS. Awareness of this fact increased from 60 percent in the first year to 71 percent. For wine, awareness increased from 60 to 66 percent, and for hard liquor, from 90 to 92 percent. The number of women who said that wine is an unlikely cause of birth defects decreased from 25 to 11 percent. This is important because it shows progress in breaking the strongly held conviction that wine is less harmful than distilled beverages. In summary, the program was effective in the sense that it made good use of the available research evidence on public opinion, responded to myths and misconceptions, developed persuasive risk messages, and after a year had a positive effect in increasing ------- New York City Health Department 161 public awareness as the logical basis for behavior change. There were positive results not only due to the warning labels, but also to the programming efforts, including the discussion in the media of the controversy. 
Of course, all of these positive changes cannot be ascribed to the campaign, the public discussion, or the warning signs alone, but rather to the synergistic effects of all components. As a result, other cities including Columbus, Ohio, Philadelphia, Pennsylvania, and Washington, D.C., have adopted this measure. In California, it stimulated Proposition 65 and other new policies. The message has continued to be controversial, causing responses from the beverage industry in particular. ------- Environmental Protection Agency Ann Fisher This paper describes two risk communication evaluations undertaken by the Environmental Protection Agency (EPA). The first illustrates the process of formative evaluation and the second, the process of summative evaluation. Formative Evaluation: Lead in Drinking Water Lead in drinking water became a news item before the EPA had decided what to do about this issue. With time at a premium, the Agency's Water Office drafted a booklet on lead in drinking water, assisted by the Office of Public Affairs, and then asked the Risk Communication Program to help evaluate the booklet. The Risk Communication Program contacted half a dozen experts, some of whom knew about lead and some of whom knew about drinking water. All knew something about risk communication. They were asked to take a quick look at the draft booklet and provide comments within two weeks. These experts' comments helped identify some definite problems. For example, it was not clear what the draft booklet's objectives were. It also was not clear who the target audience was: operators of water companies or households? This first version of the booklet also failed to give readers an action to take after they had become aware of their risk. The Water Office responded to these points and revised the draft. The resulting booklet, although it did not go through a formal evaluation, is certainly much closer to what most risk communication experts would consider sensible. Summative Evaluation: Radon Information The EPA estimates that radon causes between 5,000 and 20,000 lung cancer deaths per year in the United States, posing the greatest environmental risk that the Agency now deals with. A major concern was how to raise awareness of this risk without creating unnecessary panic. Research, however, indicates that it is hard to scare people about something that they can't see, smell, or taste and that is in their own homes (Weinstein et al., 1986). These findings suggested that the EPA needed to concentrate on raising awareness. The Agency has some research underway to evaluate alternative ways of doing this (Desvousges, Smith, and Rink, 1988). There was another basic question with respect to radon communications: Once the program had raised awareness, how could it get the right people to take action? Of course, some judgments have to be made about who the right people are, but after selecting criteria for determining that, it is possible to set up an appropriate research design. This the Agency did with the cooperation of New York State. The State, to find out how serious the radon problem was in New York, had put canisters in about 2,000 homes. It had not given much thought, however, to how it would tell people what the readings meant. EPA worked with New York to set up an experimental design that included six different approaches to providing information. Everyone in the study was assigned to one of the six approaches.
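A skeleton of this kind of design, in which each participating household is assigned to one of the six information approaches and outcomes are later compared across approaches, is sketched below. The approach labels and the knowledge-score measure are hypothetical placeholders; the study's actual materials and instruments are not described here.

import random
from statistics import mean

# Skeleton of the experimental design described above. The six approach labels
# and the follow-up knowledge scores are hypothetical placeholders only.
APPROACHES = ["booklet A", "booklet B", "fact sheet", "booklet A + phone line",
              "booklet B + phone line", "basic letter"]

def assign(households):
    # Balanced random assignment of households to the six approaches
    households = list(households)
    random.shuffle(households)
    return {h: APPROACHES[i % len(APPROACHES)] for i, h in enumerate(households)}

def mean_outcome_by_approach(assignments, followup_scores):
    # Average follow-up knowledge score for each approach
    groups = {a: [] for a in APPROACHES}
    for household, approach in assignments.items():
        groups[approach].append(followup_scores[household])
    return {a: mean(scores) for a, scores in groups.items() if scores}

Balanced random assignment is what allows later differences in knowledge, risk perception, or mitigation across approaches to be attributed to the information materials rather than to preexisting differences among the households.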
The study initially gathered data at three points: at baseline; after homeowners got their first readings; and after they received annual readings. These data were examined to determine which approaches were most effective with respect to a) satisfying respondents' need for information, b) increasing their knowledge about radon and how its risk can be reduced, and c) helping respondents form risk perceptions consistent with the radon levels in their homes. Later, a fourth data set was gathered to learn which approaches were most effective in encouraging mitigation for those with high radon levels. In summary, this evaluation compared the impact of a variety of communication methods on a particular target audience. Although not as comprehensive or expensive as the ideal might be, it illustrates many of the principles of proper design and conduct of evaluation research. REFERENCES Desvousges, W.H., V.K. Smith, and H.H. Rink, III. 1988. Communicating Radon Risk Effectively: Radon Testing in Maryland. Final Report to Office of Policy Planning and Evaluation, U.S. EPA, October. Weinstein, N.D., P.M. Sandman, and M.L. Klotz. 1987. "Public Response to the Risk From Radon, 1986." Final Report to New Jersey Department of Environmental Protection, January 1987. ------- Maryland Department of the Environment Nancy Zahedi and Carol Deck Federal and state agencies often work with limited resources in their efforts to communicate to the public about risk, with little extra time or money available to devote to evaluating the impacts of those communications. The Environmental Protection Agency (EPA), in an effort to fill the gap in evaluation, recently carried out a pilot radon risk communication effort, and an integral component of the effort was an evaluation of its effectiveness. This campaign took place from January to March 1988 in conjunction with the Maryland Department of the Environment. The goal was to evaluate the effectiveness of a number of innovative and cost-effective radon risk communication methods and materials. Based on insights gained through past risk communication efforts and focus group sessions, the following messages were emphasized: • Radon is a serious health risk. • You may be at risk, and the only way to find out is to test. • Testing is easy and inexpensive. • If your home has a radon problem, it can be fixed. • The State of Maryland will provide information and a list of testing companies through its toll-free radon hotline. These messages were communicated through a combination of methods in two different Maryland communities. The communication methods used were: • Radio public service announcements • Newspaper print advertisements • Newspaper articles • Utility bill inserts • Community radon presentations • Other community events publicizing radon Defining Success In designing the evaluation of the communications efforts, it was first necessary to define the purpose of the evaluation. Was the purpose to identify whether testing (the outcome encouraged by the communications) had increased; to learn where people obtained their radon information; to determine which methods were most effective; or to measure the impact of specific messages? Discussions on this subject indicated that none of these measures would adequately define success for the communications efforts.
Measuring changes in testing alone, for example, might understate the impact of the communications, since the project had a short time frame, and radon poses long-term rather than short-term risks, thereby making it less imperative that people test immediately. However, the communications may have educated people about radon and resulted in their taking action to test at some later time. Thus, awareness, knowledge, and attitudes about radon, as well as testing behavior, were considered to be important measures of success. Where people obtained their information as well as socioeconomic characteristics that might influence how they responded to risk communications were also considered to be useful in understanding the impact of the communications. Evaluation Methodology Having defined the purpose of the evaluation and how success would be measured, different options were considered for evaluating the radon communication efforts. Based on the kinds of information needed to assess the effects of the radon communications, the main evaluation method selected was the collection of data through a pre-outreach and post-outreach survey. A survey questionnaire was developed and used to establish first the existing levels of awareness, knowledge, attitudes, and testing, and then measure changes that took place following the communication activities. An additional means of assessing which communications activities and materials were most successful was also built into the project design. This consisted of keeping a log of all calls to the State of Maryland's radon hotline during the project's three months of outreach and recording where callers had heard about radon. Limitations of the Evaluation Design The survey questionnaire relied on respondents to recall their sources of radon information. However, it can be difficult for people to remember where they heard about a given subject. Thus the survey did not accurately reflect where people heard about radon, relying as it did on imperfect recall. The hotline data, however, were likely to be a somewhat more accurate measure of where people had heard about radon than the surveys, since those calling the hotline were motivated by a specific communication or combination of communications to seek additional information. The disadvantage of the hotline data was that they included only those who called the hotline number and not those who may have received the project's communications but did not call. Another limitation of the evaluation design was that it assumed that the changes measured would be attributable primarily to the project's communications. However, during the same time period, a local television station carried out a major radon public awareness campaign that also reached the communities included in the study. As a result, while the surveys were useful in identifying changes in the effectiveness measures—awareness, knowledge, attitudes, and testing—they were less useful in attributing the causes of the observed changes to a specific information source—in this case, the EPA's communications versus the local television station's communications. Thus, although interesting changes were observed between the two surveys, which showed that respondents had increased their awareness, knowledge, and testing for radon, it was difficult to determine how much of the observed increases was due to which communication activities.
Finally, the value of survey data depends on applying the appropriate statistical tools in analyzing the data. However, low response rates in both the pre- and post-outreach surveys made it difficult to use rigorous statistical techniques as a means of generalizing from the survey population to the population at large. Survey findings could be used only to describe the respondent population, and not to predict how other individuals would react. Use of Evaluation Data Despite the problems encountered in evaluating the EPA/State of Maryland radon risk communication efforts, this evaluation provided much information of value to risk communicators. As a result of the evaluation, it was possible to offer recommendations to EPA regional and state radon offices based on both the process of communicating about radon and the outcome. This information can be adapted by these radon risk communicators in their communications efforts. The survey also provided data that can be used to understand how people process risk information and where they turn for such information. The record of calls to the Maryland radon hotline helped supplement data from the survey to provide a more accurate picture of where people heard about radon during the project period. The data further allow for the quantification of the changes observed during the project period. It is possible not only to say that changes occurred, but also to indicate the magnitude and direction of changes. By evaluating risk communications, important factors that might not be otherwise obvious can be understood and incorporated into future communications. For example, an interesting finding from this survey was that speaking to someone else about radon (e.g., a friend, relative, or co-worker) was as important as exposure to other communication sources in explaining why some people were more concerned or knowledgeable about radon than others. Also, while more respondents were aware and knowledgeable about radon as a result of the communications during this period, many of them did not test because they were able to avoid personalizing the risk. They acknowledged the risks posed by radon but did not perceive being at risk themselves (EPA, 1988). Thus, it is not adequate just to educate people about radon risks; efforts also must be made to convince people that they are personally at risk. Final Thoughts on Evaluation Evaluating the effectiveness of risk communication efforts is an important aspect of communications that is often overlooked, but it can provide important lessons for improving communications. The cost of conducting an evaluation can be a major drawback, particularly when resources are scarce. However, there is also a cost involved in producing an ineffective risk communication campaign, in terms of wasted resources and lives affected. A number of evaluation options are available and can be selected according to resource availability and evaluation needs. The information gathered from such evaluations allows risk communicators to build on successes and avoid repeating failures. REFERENCE Environmental Protection Agency (EPA). 1988. Region 3/OPPE/State of Maryland Radon Risk Communication Project: An Evaluation of Radon Risk Communication Approaches. Washington, DC: The Agency, November. ------- U.S. Council for Energy Awareness Ann S.
Bisconti If communications are aimed at changing behaviors, improving attitudes, or just informing, program managers may be able to achieve their objectives without continuous evaluation and change. But the odds are against it. Think of the difficulties in communicating effectively. Even if the sponsor's name is Coca Cola, the program will not have unlimited resources. Planners must select the audiences with whom the program's limited resources will do the most good, and that means knowing about the potential audiences, getting their attention, and communicating messages that are meaningful and believable, and do not raise undue concern. Most important, program managers must be alert to the need for adapting to change, lest the program stagnate or die. The old dichotomy of dividing evaluation into two stages, formative and summative, is certainly inadequate for long-term communications projects and probably inadequate for most short- term projects as well. It is not enough to ask: "What should we do?" and then "How well did we do?" Conceptually, that sequence of questions treats evaluation as an add-on to the program, a response to the requirement for accountability. Instead, when evaluation is integrated into the program, the additional relevant questions are "How well are we doing and why?" and "How can we do it better?" When summative research and formative research become part of the program process, the program is more likely to start strong and then evolve and improve. A Case Study in Using Evaluation Results Evaluation research at the U.S. Council for Energy Awareness (USCEA) has helped its communications evolve. For example, in 1983, USCEA launched a national advertising campaign as part of a multifaceted communication program on the need for energy security and the prominent role of electric power from nuclear energy and coal in achieving that goal. The advertising agency, Ogilvy and Mather, one of the best in the business, began admirably but without the benefit of solid research. The initial television advertising, which did derive from research on audience attitudes, was not tested until later. It was attractive, appealing, and included the memorable song "Tomorrow" from "Annie." But 169 ------- 170 Case Studies early in the program the Council began to test the advertisements and found that this series of commercials was not getting its intended message across. The other half of the campaign, two-page magazine advertisements aimed at information seekers, derived from the body of facts about the electrification of America. For instance, the advertisements described how the use of electric energy had grown substantially since the oil embargo in 1973, while nonelectric energy use had declined. The new electricity, largely from coal and nuclear energy, replaced oil in many uses and helped reduce U.S. dependency on foreign oil. Because it was more efficient, it also helped reduce overall energy consumption. This informational series proved to be far more effective than the television advertising. It was attention-getting, clear, and favorably received. Based on follow-up research, the Council made significant changes. The television and magazine advertising were changed to match, with the same visuals, the same basic message, and approximately the same audience. Subsequent research consistently has shown the wisdom of these changes, as the television and magazine advertising are synergistic, i.e., they reinforce each other. 
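The kind of copy-test tabulation implied by that finding can be sketched in a few lines; the coded responses, execution names, and the 50 percent benchmark below are hypothetical illustrations, not USCEA's actual procedure.

from collections import Counter

# Minimal sketch of a main-message playback tabulation: coders classify each
# viewer's open-ended "main idea" response, and executions whose intended-message
# playback falls below a chosen benchmark are flagged for revision.
def playback_rate(coded_responses, intended="intended message"):
    counts = Counter(coded_responses)
    return counts[intended] / len(coded_responses)

tv_spot = ["intended message", "other", "liked the song", "other"]
magazine_ad = ["intended message", "intended message", "partial", "intended message"]

for name, responses in [("TV spot", tv_spot), ("magazine ad", magazine_ad)]:
    rate = playback_rate(responses)
    flag = "revise or drop" if rate < 0.5 else "acceptable"
    print(f"{name}: {rate:.0%} played back the intended message ({flag})")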
Since those initial program changes, many other research-based decisions and improvements have been made. For instance, each advertisement is tested before it is placed. Based on this testing, the Council may decide to use the advertisement as is, revise it, or drop it altogether. USCEA has a continuous, experimental-design study of overall advertising impact using large national panels. This research has shown that those who see the advertisements improve their knowledge and attitudes significantly more than those who do not see the advertisements. The research also helps identify population segments that the communications should be reaching more effectively, and this information is considered in both media placement decisions and creative approaches. The People Part Good integrated evaluation requires a communications team that appreciates research and knows how to apply the findings intelligently. Many professional communicators feel threatened by research until they learn how helpful it can be; therefore, launching an integrated evaluation program requires strong direction from the top. Once the evaluation component becomes familiar, it can be seen as an aid in improving the effectiveness of communications and in demonstrating that decisions were made scientifically and not by the seat of the pants. ------- Food and Drug Administration Louis A. Morris Reye's syndrome (RS) is a rare but severe disease associated with influenza and other viral diseases. It affects primarily children under the age of 18 years. Although its pathogenesis is unknown, the mortality rate is estimated at 20 to 30 percent, and permanent brain damage may also occur. For the past decade, evidence has been accumulating supporting the association between RS occurrence and the use of salicylates, such as aspirin. In the early 1980s, the Surgeon General recommended that doctors advise parents to "use caution" when administering salicylates to treat children with viral diseases such as chicken pox and influenza. In 1985, hearings were held before Congress to support a bill that would require the makers of aspirin to include a distinctive section on the product's label warning consumers not to administer aspirin to children with flu or chicken pox. Under mounting pressure, some manufacturers had voluntarily included the warning information while others had not. Some aspirin manufacturers feared that the warning would cause a substantial drop in their sales and increase the sales of competitors who make acetaminophen products. Surveys of parents undertaken in Houston, Texas, in 1981 and 1983 indicated a growing trend among parents to avoid administering aspirin to children with flu or chicken pox. In the 1983 survey, 60 percent of the parents surveyed had heard of RS and 42 percent knew of the association between RS and aspirin. Of the 103 children who had the flu, 14 percent received aspirin, 42 percent received acetaminophen, and 20 percent received both. The issue addressed in this case was the need for a study to determine if the Food and Drug Administration (FDA) should require a warning label on aspirin to tell parents not to administer it to children with flu or chicken pox. The first challenge is to define the question or objective accurately. In the case of the warning label, two issues needed to be addressed: the consequence of a communication (e.g., behavior change) and the communication itself. What would be the public impact of an aspirin warning label? What would be the best label format and design?
Was a label the appropriate risk communication response? Are multiple communications—beyond labels—necessary? In defining communication objectives, the following questions can be formulated: • Is a warning label effective at changing behavior? • What would be an effective communication mechanism (or a combination of mechanisms)? • How do people decide to take aspirin? When is the decision made? Will a label affect that? • How do parents learn about medicines, and how do people use medicines? Methods to find answers to these questions include field studies, focus groups, panel studies, and national surveys. Field studies in most cases are not appropriate because they require too much time. Focus groups, panel studies, and national surveys each have strengths and weaknesses. Probably the best way to gather the most information, considering cost and time constraints, is to use a combination of these methods. As information becomes available, the communications objectives might need to be redefined. The FDA frequently chooses national telephone surveys to explore, quickly and relatively inexpensively, public awareness and response to issues and problems. When structuring a questionnaire for a national survey, areas to be considered for questioning include what people know, what people do, and what people intend to do. The question sequence as well as the question structure should be carefully constructed. For example, how the questionnaire begins is important because it can influence how a respondent will answer the rest of the questionnaire. In this case, Reye's Syndrome should not be mentioned in the beginning of the interview to prevent a bias in the responses. One obstacle to conducting surveys in the federal government is the requirement for Office of Management and Budget approval. Time and cost considerations also make many large-scale surveys unrealistic. However, a telephone survey can produce the needed results in about a month with a fairly good response rate. In summary, the fears, threats, and obstacles to evaluating effective risk communication seem to be 1) defining the question or objective accurately, 2) time, and 3) costs. Effective risk communication is an evolving process as public awareness, opinion, and action change. The program or activity must develop mechanisms to reassess the situation readily as it evolves. ------- Cancer Prevention Awareness Program Shelagh Smith The National Cancer Institute's (NCI) Cancer Prevention Awareness Program (CPAP) is a national public information and education program that was launched in 1984. Its purposes are to increase public awareness of cancer risks and promote changes in lifestyle to help people reduce their own risks. The program was planned following the scientific quantification of the potential for cancer prevention. NCI determined that • About 80 percent of cancers are potentially preventable. The concept of prevention flowed naturally from a decade of positive life-style trends in the United States that support and sustain NCI's prevention messages. • Survey research consistently showed that the public was confused and skeptical about cancer and cancer prevention. NCI selected risk factors for inclusion in the CPAP based on three criteria: • The risk factor affects a significant percentage of Americans. • It poses a substantial threat by itself or in combination with other factors. • Its identification presents an opportunity to reduce or control exposure through individual effort.
NCI conducted the following activities to select the risk factors to be addressed and establish a sound foundation for program development: • Prepared research summaries on each of seven selected cancer risk factors • Reviewed the existing literature on public knowledge, attitudes, and practices (KAP) related to cancer • Reviewed recent federal and state health promotion campaigns (e.g., Healthy Mothers, Healthy Babies; NIA Prevention Campaign; High Blood Pressure Education Program; Healthy Older Americans) to identify elements of successful programs • Analyzed communications, social marketing, and health education models to provide a conceptual and practical foundation for the program 173 ------- 174 Case Studies • Conducted thirteen focus groups to explore consumer perceptions of cancer risk and prevention and to test alternative messages and formats • Convened sixteen working groups to provide guidance on messages related to specific cancer risk factors and on communication strategies for various channels and audiences • Conducted a national survey of public KAP related to cancer prevention and risk Based on this research, the following objectives were set: 1. Improve public knowledge and attitudes regarding cancer prevention, incidence, and treatment. 2. Increase public awareness and knowledge of cancer risks that can be modified. 3. Increase public awareness and knowledge of healthful behaviors that afford a measure of personal control over cancer risk. 4. Promote changes in behaviors and practices that will help individuals to reduce their cancer risks. These objectives were established specifically for the prevention awareness pro- gram, designed as one contributor to the overall NCI goal for the year 2000—to reduce the cancer death rate by up to half from the 1980 level. This goal cannot be achieved without major gains in prevention. The following key evaluation questions were developed: • Are the program messages and materials effective (believable, interesting, persuasive, understandable, memorable, and personally relevant)? • Are the program networks (media and intermediaries) functioning effectively (amount of coverage, level of activity)? • Are people's knowledge and attitudes improving or changing with regard to cancer and cancer prevention? • Are people seeking information about cancer prevention from the appropriate sources? • Are people changing their life-styles on the basis of NCI's cancer prevention messages? • Is progress being made toward NCI's year 2000 goals? The following evaluation approaches and activities were designed to address the evaluation questions: • Formative evaluation: concept and message testing; pretesting of products. • Process evaluation: tracking calls to the toll-free telephone service and publica- tions distributed; case studies; tracking PSA use; tracking news coverage (news- clip content analyses); and tracking of secondary data. • Outcome evaluation: national knowledge, attitudes, and practices surveys (1983, 1985); tracking secondary data (especially other surveys, e.g., NHANES, NHIS); and ongoing surveillance of communications literature. ------- EPA Office of Toxic Substances Maria Pavlova Thousands of facilities in the United States are required to report environmental releases of over 300 toxic chemicals annually to the U.S. Environmental Protection Agency (EPA) and to the states as of July 1988. These new data augment existing information about the presence and effects of toxic chemicals. 
This information is available to the public and allows for more informed participation by the public on related issues. However, this information will be helpful only if the public also is provided with a context for understanding and using these data as a result of education efforts sponsored by EPA and others. To design messages and materials that are responsive to the public's questions and concerns regarding the presence of toxic substances, EPA commissioned a needs assessment through a cooperative agreement with the Institute for Health Policy Analysis, Georgetown University. The needs assessment was designed to: • Identify current awareness, knowledge, perceptions, concerns, needs, and wants of various publics (e.g., affected citizens, employees, environmentalists, com- munity leaders, local government staff, health and media professionals, educa- tors, and students) about toxic chemicals • Identify credible sources of information and potential delivery channels (e.g., League of Women Voters chapters, homeowners associations) to guide the design of communications activities • Identify and evaluate existing educational materials for use in EPA's program to prevent duplication of effort and assure optimal use of EPA resources • Test messages used to explain the meaning and implications of toxic emissions (e.g., public understanding of terms such as emission, risk, toxicity, dose, exposure, and health effects) Beyond this needs assessment, the project was designed to: • Produce guidelines for effective educational messages and materials on toxic exposure and health effects 175 ------- 176 Case Studies • Develop criteria to evaluate the subsequent risk communication activities The needs assessment revealed how specific segments of the public and opinion leaders think about issues related to environmental risks, specifically about the presence of toxic substances in their communities, through the following activities: Analyzing public perception data available from related projects. The EPA has sponsored related activities (e.g., the Toms River, New Jersey, Superfimd community education program, focus groups on radon, and the National Pesticide Survey), which have produced information about public awareness and perceptions of environmental risks. Analyzing national polling data. An analysis of related questions asked in national public opinion polls over the previous three years was conducted to provide a quantitative perspective on public knowledge and attitudes. Gathering information about the perceptions of environmental professionals. Telephone and in-person interviews were being conducted with environmental leaders, EPA headquarters and regional staff, and state and local officials to assess their perspectives regarding what the public needs and wants to know, as well as what assistance these professionals need to communicate effectively with their constituents. Identifying and evaluating existing educational materials. Letters to 1,200 envi- ronmentally related agencies and businesses solicited copies of related public education materials; existing inventories, libraries, clearinghouses, and databases were also checked. Potentially relevant, useful materials were reviewed for readability, accuracy, appropri- ateness, and availability. Conducting focus groups in potentially affected communities. Fifteen focus groups were conducted in communities where the presence of business and industry provides for the potential release of toxic chemicals. 
Focus groups were being held with citizens who live near affected industries, employees of those industries, environmental group members, local officials and potential intermediaries with public credibility, and business representatives. The results of all components of the needs assessment were analyzed in a report to EPA to: • Recommend priority messages, risk communication strategies, and target audiences. • Identify existing or modifiable educational materials or the criteria for developing new educational materials. • Provide guidance for developing assistance and training programs for local emergency preparedness committees and for use in the community by interested organizations. • Recommend how to develop communication networks within a concerned or affected community. • Suggest criteria and methodologies for providing feedback to EPA. ------- National Cholesterol Education Program John C. McGrath The National Cholesterol Education Program (NCEP) of the National Heart, Lung, and Blood Institute (NHLBI) pretested materials prepared for a public education campaign. Persons with high blood cholesterol were asked to comment on two brochures in two focus groups in the Washington, D.C., area and two groups in Providence, Rhode Island. Persons who were aware that they had a high blood cholesterol level were recruited through the cooperation of medical facilities conducting blood cholesterol screenings. One group with male respondents and another group with female respondents were conducted in each of the two locations. All of the respondents had at least a high school education, worked in either a nonprofessional or professional job, and had never worked in the health or medical field. In addition, those selected had never had a heart attack or stroke and did not consider themselves very knowledgeable about cholesterol. Focus groups are a form of qualitative research, so the findings cannot be projected statistically to a larger population. However, the groups provide reactions to educational materials as well as insights regarding potential confusion in meaning, difficulty with the level of language used, and the need to highlight information to communicate critical points. Participants' needs for additional information (i.e., the questions left unanswered after reading the material) can be explored in the focus group session. To provide consistency across groups, a moderator's topic guide was developed. This guide was designed to examine participants' reactions toward: • Learning that they have high blood cholesterol, • The general content and format of two public education brochures ("So You Have High Blood Cholesterol" and "How To Lower Your High Blood Cholesterol"), and • The specific contents of each brochure For ease of discussion, the booklet "So You Have High Blood Cholesterol" will be referred to as "the red booklet" and "How To Lower Your High Blood Cholesterol" will be called "the green booklet," based on the color of their covers. All participants were given the two booklets in advance and asked to read them prior to attending the focus groups. At the beginning of the discussion, participants were asked to discuss their reactions to learning that they had elevated blood cholesterol.
This discussion served two purposes: • To explore target audience feelings, perceptions, knowledge, attitudes, and misconceptions to help plan appropriate messages • To help the individual group members become comfortable with the moderator and the other participants Following this discussion, the participants were asked to discuss their reactions to the two booklets. To facilitate recall and discussion, the specific sections of each booklet being discussed were shown on overheads. Participants were asked a series of questions to assess their reactions to the amount of information provided, its level of complexity, attitudes regarding the tone and style of the publications, and whether this information would make a difference in their subsequent behaviors. Responses included the following: • Despite their interest in receiving the information, a number of participants felt that the level of information made it difficult to understand the brochures. Many participants described having read and re-read sections in an attempt to comprehend the information. • Although participants found portions of the booklets to be too densely written, they felt that the booklets were appropriate in tone. They described the general tone of both booklets as "serious," "straightforward," and "generally informative." • Although the booklets were considered similar in tone, several differences were apparent. The red booklet was perceived as being "more clinical" in tone, "hard- hitting," and "concise," while the green booklet was described as "less technical" and "more hands-on." • Despite their sense of having received a great deal of information, several participants raised questions that they felt had not been addressed. For the red booklet, these questions were more "clinical" in nature, while those directed at the green booklet were on a more "nitty gritty" level. • In general, participants felt that the material had given them enough information to know how to lower blood cholesterol. Several described the influence that this information had already had on their shopping and eating behaviors. However, others felt that the material was not as "accessible" as it could be. A few described feeling "inhibited" by their lack of complete understanding of blood cholesterol information. When asked if they would be able to describe this information to a friend, several participants mentioned their own "lack of comfort" with the material. Participants also responded to a number of questions developed to assess their reactions to the placement of information on the page, use of graphics, print size, and structure of appendices. They were also asked to comment on the way in which they perceived the two booklets as fitting together. Responses included the following: ------- National Cholesterol Education Program 179 • Respondents liked the placement of the material, with boldface questions or labels in the left-hand margin serving to provide a framework for the text. In the red booklet, especially, participants appreciated being able "to look at the questions and then move in to find the answers." • The red booklet was perceived as "too densely formatted." A number of participants suggested that more graphics be included in the final version of the brochures. Graphics were described as a means of "catching attention" and "breaking up the flow." Participants liked the idea of highlighting important portions of the text. They felt that the use of color would add excitement to the materials. 
Several also remarked that a picture can often be an aid to memory, if it is relevant to the information. Although participants thought that additional charts, graphs, and pictures would be quite useful, they did not want to include images that would not "add meaning" to the text. • The print size was acceptable to all participants. • As a means of facilitating access to information, respondents favored the idea of adding a glossary to each booklet. • Women, more than men, were interested in a format that would allow them to pull out various sections from the green booklet to carry around as references. In addition, specific comments about the content, wording, and format of each booklet were discussed and recorded on a page-by-page basis. As a result of focus group findings, both booklets were revised prior to publication. The changes included the following: • Adding more illustrations and simplifying the format to make the booklets less dense • Substituting summary charts for large blocks of text • Adding a "pull-out" summary suitable for posting on the refrigerator • Adding a glossary to one booklet ------- EPA Superfund Program Maria Pavlova In 1985, EPA Region II developed a pilot public education program to run concurrently with the CIBA-GEIGY Superfund Remedial Investigation/Feasibility Study (RI/FS) in Toms River, New Jersey. The Toms River community was selected because of public concern over contamination from the waste disposal areas of the CIBA-GEIGY plant, a designated Superfund site. The program was designed to assess the levels of awareness and concern among local citizens and to provide accurate information about health risks associated with potential exposure to environmental contaminants. Following completion of a community needs assessment for risk information, a series of fact sheets was developed, pretested, and modified to respond to citizen interests and concerns. The fact sheets were produced, and a field test was conducted to assess the best methods for reaching the public and ascertain public response to the fact sheets. Components of the field test included: • Asking community leaders who were members of a network established for the program to distribute a sample fact sheet and assessment questionnaire to their constituents • Reviewing requests for the fact sheets received by EPA's toll-free telephone number • Reviewing answers to the question "where did you get the fact sheet you just read" from readers requesting additional fact sheets via an order form • Analyzing responses to the assessment questionnaire Because of the limited number of participants in the informal field test, the results of the study were not reviewed for the purposes of quantifying demand or projecting responses of the community as a whole. Rather, the results of the field test were intended to indicate the most promising routes of distribution and to verify that the fact sheets were responsive to identified public interests in risk information. Following two months of the study, the findings regarding the routes of distribution included: • The community leaders' network was very willing to distribute the fact sheets. • Most of the organizations included in the network did not meet during the summer months (the time of the field test), so there was no way for community leaders to distribute the fact sheets to their constituents. • Although a teacher was willing to help, schools were out of session.
• Community leaders were willing to mail to their constituents, but only if EPA would pay for postage. • Nineteen requests for additional fact sheets were received by the EPA toll-free number (advertised in the media), and eleven requests were received through mail order forms. These requests indicated that fact sheets had been picked up at: —a community leaders' network meeting —the county public information office —the N. J. Department of Environmental Protection —a local college —the county fair —the county library • About 1500 sets of ten fact sheets were distributed at the county fair. Two conclusions were reached about routes of distribution: • The fact sheets needed to be placed in more popular locations in the community (e.g., grocery and convenience stores, shopping malls) rather than the library and county information office. • The summer months are not the optimal time to release information to the public, because schools are on recess, families are on vacation, and most community groups do not hold regular meetings. Regarding the utility of the sample fact sheet distributed: • Most of the participants responded very favorably, indicating that some of the information was new, it was appropriate for the public, and they would recom- mend it to a friend. • They said that the question-and-answer format made it easier to read, and it was very interesting, informative, useful, and clear. • They said that the information was understandable, although not easy to read, and somewhat complete. As a result of the field test, the sample fact sheets were determined to be appropriate for broad-scale distribution. ------- Cancer Information Service Roswell Park Memorial Institute In May 1985, Dr. Frank Field broadcast a series of reports on the relationship between dietary practices and cancer risk on four consecutive week-night segments of the WCBS-TV evening news. During each segment, Dr. Field promoted the National Cancer Institute's (NCI's) booklet "Diet, Nutrition and Cancer Prevention" and provided the toll- free Cancer Information Service (CIS) number (1-800-4-CANCER) so that viewers could order it by telephone. After the second night, he also provided an address at NCI so that viewers could write for a free copy of the booklet. The WCBS- TV evening news is one of the most frequently watched news programs in the nation's largest television market, encompassing New York City, Long Island, Southern New York State, Northern New Jersey, and Connecticut. This promotion resulted in the largest response in CIS's ten-year history; a total of 75,000 booklet requests were received. Approximately 15,000 phone requests were handled by the New York City, New York State, and Connecticut CIS offices, and 60,000 mail requests were handled by NCI. The large response to the booklet promotion provided a good opportunity to characterize the population who requested the booklet, determine its usefulness to readers, and assess its impact on dietary behavior and knowledge. In November 1985, a survey was undertaken of persons who had called the Roswell Park Memorial Institute CIS (covering New York State callers) to request the diet booklet. The purpose of the survey was to (1) describe the characteristics and cancer-related attitudes of those requesting the booklet; (2) assess the callers' perceptions and uses of the booklet; and (3) determine the impact of the booklet on the callers' dietary practices and knowledge about diet and cancer risk. 
Between May 2 and May 31, the Roswell Park CIS office received 3,725 orders for the diet booklet. The name, address, age, sex, educational status, and race were recorded for each caller. From the 3,725 callers to the Roswell Park CIS, a random sample of 1,842 callers was selected to participate in the survey. This sample was demographically representative of callers to the Roswell Park CIS and similar to callers requesting the diet booklet from the Connecticut CIS office. (New York City data were not available.) Based on previous experience with CIS surveys, a 70 percent return rate of questionnaires was anticipated, which would have yielded approximately 1,300 completed questionnaires. The questionnaires were mailed with a cover letter signed by Dr. Field and a prepaid return envelope; two weeks after the initial mailing, a reminder postcard was sent to those who had not returned the questionnaire. One week later, nonrespondents were sent a second questionnaire and cover letter. Finally, in January 1986, a third mailing consisting of a questionnaire and cover letter was sent to nonrespondents. A total of 1,106 usable questionnaires were returned. The four-page questionnaire included questions about the booklet, the respondent's dietary practices, changes in dietary habits and food preparation methods as a result of reading the booklet, and beliefs about the relationship between specific dietary practices and cancer. In addition, respondents were asked to respond to several attitudinal items on cancer prevention and treatment and to indicate their usual sources of information about cancer topics. A pretest version of the questionnaire was mailed to fifty individuals; based on their responses to the pretest questionnaire, a few changes were made and a final version of the questionnaire was constructed. Three indices for analysis were constructed to: • Provide a measure of the respondent's knowledge about the relationship between dietary practices and cancer risk • Provide a measure of positive changes made by respondents in food consumption practices since receiving the booklet • Provide a measure of positive changes in food preparation practices The findings were as follows: • Sixty-five percent said they read all of the booklet; 32 percent reported reading some of it. • More than 90 percent said the booklet motivated them to try to change their diet, was easy to understand, and provided useful diet suggestions. • Seventy-one percent indicated that they made changes in their eating and/or food preparation habits after receiving the diet booklet. • Men and women did not differ with regard to reported changes in food consumption habits except for consumption of skim milk and low-fat dairy products, where women were more likely than men to report increased consumption. • In general, more positive changes in food consumption habits were reported by older respondents, whites, and those with more education. • Those who reported reading all or some of the booklet were significantly more likely to report positive changes in food consumption habits than those who did not read the booklet. • A substantial percentage of respondents incorrectly reported that consumption of salt, coffee, eggs, and food additives are associated with cancer risk.
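The return-rate arithmetic reported above can be checked with a few lines of Python; all of the figures below come from the text.

# Return-rate figures for the Roswell Park CIS mail survey, as reported above.
callers = 3725            # booklet orders received May 2-31
sample_size = 1842        # callers randomly selected for the mail survey
anticipated_rate = 0.70   # return rate expected from previous CIS surveys
usable_returns = 1106     # usable questionnaires actually returned

print(f"Anticipated returns: about {round(sample_size * anticipated_rate)}")
print(f"Actual return rate: {usable_returns / sample_size:.0%}")
print(f"Sampling fraction: {sample_size / callers:.0%} of callers surveyed")

The actual return rate of about 60 percent fell short of the anticipated 70 percent, which is worth keeping in mind when generalizing the findings to all callers.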
In summary, the results of the survey suggest that the diet booklet was associated with positive changes in food consumption and food preparation practices and higher levels of knowledge about the relationship between diet and cancer risk. However, the lack of a control or comparison group makes it difficult to attribute the reported changes in dietary habits to the booklet. The overwhelmingly positive response to the television promotion of the diet booklet may be tempered somewhat by the fact that the characteristics and health habits ------- Cancer Information Service 185 of those who received the booklet suggest that they had access to the same information from other sources, and they were more interested and better informed about the issue than the general public. While mass media promotions may be useful in triggering a response from those most ready to make dietary changes, the majority of the population (and probably those most in need of altering their dietary habits) is not likely to respond to this type of promotion. For the majority of the population, efforts to persuade them about the importance of diet as a factor in cancer risk are necessary prior to attempts to educate them about specific dietary changes. ------- National High Blood Pressure Education Program John C. McGrath The National High Blood Pressure Education Program (NHBPEP) was estab- lished in 1972. Administered by the National Heart, Lung, and Blood Institute (NHLBI), the program was conceived as a cooperative effort of various federal agencies and major national health care organizations. The goal of the NHBPEP is to reduce death and disability related to high blood pressure through professional, patient, and public education. Strategies to achieve this goal include health promotion activities and the dissemination of information on the latest and most effective modes of treatment. Throughout its history, the NHBPEP has employed a comprehensive strategy of mobilizing, educating, and coordinating the resources of all interested groups in govern- ment and the private sector. The NHBPEP has developed a network of approximately fifteen federal agencies, 150 national organizations, all state health departments, and more than 2000 community-based programs. At the core of the program is the NHBPEP coordinating committee, a body composed of representatives from thirty-two national organizations. The NHBPEP's examination of major blood pressure control issues encompasses: • Appropriate roles of health care professionals • Most effective treatment practices in medical management • High blood pressure control at the worksite • Health needs of rural communities • High blood pressure in the elderly • The relationship between diet and high blood pressure • Special problems in controlling high blood pressure in minority populations Several factors have contributed to permitting an evaluation of the impact of the NHBPEP: • The longevity of the program • The level of resource commitment 187 ------- 188 Case Studies • The breadth of commitment by many organizations • The range of strategies to address the problem including but extending beyond risk communication • The existence of tracking measures Three measures of the NHBPEP's success—hypertensive patients' awareness, treatment, and control rates—indicate that substantial progress has been made toward its goal. By 1980, almost three out of four persons with hypertension were aware of their condition—a 50 percent increase since 1971- 72. 
During this same period, rates of hypertensive persons undergoing treatment (on medication) were one and one half times greater, and rates of hypertension control more than doubled (National Center for Health Statistics data). Although current national estimates will not be available until 1990, preliminary analyses of the data collected by the seven states that participated in the NHLBI Demonstration Grant study suggest that the rates of awareness, treatment, and control continue to improve among the hypertensive population. Because uncontrolled high blood pressure is the major risk factor for stroke, the mortality rate for cerebrovascular disease is another indication of NHBPEP progress. Age- adjusted death rates for cardiovascular disease in general have been on a downward trend since the 1950s. However, mortality rates for coronary heart disease and stroke sharply declined in the early 1970s. The NHBPEP was instituted in 1972; by 1985, age-adjusted mortality rates had declined 34.6 percent for coronary heart disease and 50.2 percent for stroke (National Center for Health Statistics data). The improvements made in public knowledge and patient behavior are highly encouraging; however, studies of these and other survey results indicate new challenges for the NHBPEP. Thus, the program continues to use both survey data and analysis methodologies to evaluate its multifaceted strategies and to refine its educational empha- sis. For example, in light of recent findings demonstrating a higher mortality rate from stroke in the southeastern United States, the NHBPEP has launched a "Stroke Belt Initiative," a major effort to target the population of this region. ------- CONCLUSION ------- Does Risk Communication Make a Difference? John F. Ahearne The purpose of this paper is to examine some mistakes in risk communication, describe the role of risk communicators, and pose some challenges that are facing risk communicators. Examples of bad risk communication exist in both the Love Canal and Three Mile Island (TMI) incidents. Within the scientific community, Love Canal is considered an example of government incompetence. After the problem erupted, the Environmental Protection Agency (EPA) decided to conduct a study of chromosome damage among residents of the Love Canal area. However, the study did not include a control group, primarily because it was ordered by the legal office, which did not understand how to do a valid evaluation. Only thirty-six people were examined, and they were selected to maximize the likelihood of finding chromosome damage. After the study, five reviews of the study were conducted to determine whether any valid conclusions could be drawn. The study only added to the agency's problems. When the TMI reactor was destroyed, neither the operating company nor the Nuclear Regulatory Commission (NRC) was prepared for such an emergency. TheNRC had only a semblance of an emergency procedure, which quickly broke down in the face of a major emergency. The agency was unable to get accurate information about what was happening, and had happened, at the reactor. Consequently, even if an effective plan to deal with the media had existed, the NRC would not have had good material to use. But the agency was not prepared to deal with the media. It had no knowledgeable official spokesperson, and it was several days before the director of the NRC's reactor division went to Pennsylvania and became the federal government spokesperson. 
Several post-accident reviews evaluated plant management, the nuclear industry, and the NRC response. The NRC took many actions as a result, including three actions directly relating to communications. First, the NRC made a major revision of its emergency response organization, establishing who would be in charge; who would handle contacts with the plant, other governmental agencies, and the press; and how to form evaluation teams. The revised system is a substantial improvement and has been tested in many drills and in at least two accidents. 191 ------- 192 Does Risk Communication Make a Difference? Second, the NRC passed an emergency planning rule, which requires a coordinated plan between plant operating staff and local governments. These plans must include dissemination of information, installation of warning systems, and drills to check com- munications links and ensure that participants know their roles during an emergency. The rule has upgraded emergency planning in many locales, and the plans have been used successfully to respond to non-nuclear emergencies, such as chemical spills from railroad accidents. Third, the NRC recognized that many people in the media covering TMI wrote confusing stories because they were unfamiliar with nuclear plants. Therefore, the NRC established a program in which the five NRC regional offices hold day-long sessions annually for regional science reporters and others in the media located near nuclear power plants. At these meetings, speakers review important information about the plants, including how they work, what their hazards are, and how an emergency would be handled. Using engineering terms, risk communication can be seen as a smart circuit with feedback loops. The decisionmakers are on one side, separated from the communications channel by a barrier, which can be called a buffer. At the other end of the channel is another barrier, or buffer. The buffers can be other agency staff, media representatives, repre- sentatives of public interest groups or industry, congressional staff, and the like. On the far side of that buffer are the recipients, who are the media, the public, and Congress. Some decisionmakers and people in communications see the channel and the buffers as one-way transmission devices, to transfer information from the decisionmakers to the communi- cations channel, and from the communications channel to the recipients. However, the buffers should be seen as two-way, with the smart channel providing information back to the decisionmakers. This information feedback can improve decisions by letting decisionmakers know what recipients think about proposed actions, what they are angry about, what their concerns are, and what information they want Risk communication today often must address complex scientific and engineering issues. Unfortunately, to be a smart channel the communicator must understand the technology. If the channel is a dumb channel, a one-way transfer of anything put into it can create the following problems: • A smart buffer—a knowledgeable media person or a skilled public interest group—will reject the transmission. • The concerned public will attempt to communicate via the channel, but will become frustrated because a dumb channel cannot become a two-way smart channel. The actions at Love Canal demonstrated these problems. Of course, this model assumes that the decisionmakers understand the science or engineering involved, which is not always the case. 
However, a knowledgeable com- municator, a smart channel, may be able to force a lazy decisionmaker to work to understand the issues. To understand the proper role of a communicator, the concept of Thomas Jefferson is helpful: "I know of no safe depository of the ultimate powers of society but the people themselves; and if we think them not enlightened enough to exercise their control with a ------- Does Risk Communication Make a Difference? 193 wholesome discretion, the remedy is not to take it from them, but to inform their discretion." Jefferson did not endorse manipulating or even persuading people. He endorsed informing them. Using this Jeffersonian concept and the model of a smart channel, successful risk communication can be defined as raising the level of understand- ing of relevant issues among participants, including decisionmakers. Rossi andBerk (1988) explain the importance of evaluation. However, one can read more into their paper. They stress understanding the problem to be solved, the planned solution, and the goals of the program, and then objectively assessing progress toward meeting these goals. By requiring understanding, Rossi and Berk mean to push program planners to identify clearly the problem, the solution, and the program's goals. This analysis is necessary for developing sound programs. Therefore, if risk communicators use the Rossi and Berk approach and have access to the decisionmakers, the communicators will be checking on whether the programs are sound. This is in line with the appropriate role of a smart channel. The NRC changed its programs after evaluations resulting from the TMI accident. The EPA also may have changed as a result of Love Canal, but the Agency's actions during the event would have flunked the Rossi and Berk checklist. Two key concepts that stem from Jefferson' s "informed discretion" are accuracy and completeness. It should be remembered that risk communication is not a one-time affair, a brief skim of a pamphlet The rationale for evaluation is that programs continue and similar situations will arise. Evaluations enable the system to improve because risk communicators learn from mistakes and successes. Two groups in particular need to be addressed: • The technical professionals, who need to disseminate their knowledge and un- derstanding. They have a responsibility to see that decisionmakers and com- municators understand the technical issues and that messages are accurate and complete. A large problem for persons in government is that they lack credibility. Credibility is necessary for successful communication; it is easily lost and, once lost, is never restored completely. Maintaining credibility requires continuing efforts to be accurate and complete, which in turn requires understanding the technology involved. • The communicators, who are willing and able to be smart channels, using the Rossi and Berk approach to improve the entire risk communication process. Much of the U.S. public mistrusts its government, unfortunately with reason. This can be seen in the fact that the U.S. public does not participate in elections at the levels seen in other democracies. It is doubtful that this is because the public believes everyone in government is doing an excellent job. Rather, the public believes government is not influenced by single voters, a symptom of a growing gap between the government and its people. This is not healthy. 
Risk communicators can affect this problem for better or worse, because they are involved in situations where the public comes in contact with government, where there is high interest, strong emotions, and the potential for strengthening the public's positive or negative attitudes about its government.

-------

194 Does Risk Communication Make a Difference?

To be more than a dumb channel is a large responsibility.

REFERENCE

Rossi, P., and R. A. Berk. 1988. A Guide to Evaluation Research Theory and Practice. Paper prepared for the Workshop.

-------

What Else Do You Need to Know About Evaluation?

Roger E. Kasperson

This book is one outcome of the first specialized conference on risk communication in which evaluation was the central topic and was treated generically, a sign of the beginning maturity of discussions in a rapidly growing and changing field. There is a remarkable array of new initiatives and programs in risk communication being undertaken by government agencies and other organizations. Since 1987, many guides and manuals on risk communication have become available. Many of these, however, have preceded a sound base of research to support their prescriptions. Risk communication is still in its first generation of effort and encompasses diverse subjects and issues. The challenge now is to move beyond the notion of risk communication as simply the transfer of scientific information that has been amassed and move toward the creation of genuinely interactive communication processes and more comprehensive risk education.

Need for Better Theory and Understanding

Milton Russell (1988) has identified a central issue in risk communication: the extent to which society focuses on trust. People are concerned about the performance of social institutions and, in many instances, have lost their trust in them. It is important to focus on how the loss of trust occurred and to understand how it can be regained. The question of whether trust, once lost, can be regained in the short term remains to be answered. If it can, we need a better understanding of how to accomplish this. Likewise, it is important that we understand how to communicate with the public about risk under conditions of high social distrust.

The proper design of programs requires intelligence about the nature of basic problems. Requirements include:

• Careful definition of program objectives and goals
• A formal assessment of communications needs
• Baseline studies of public concerns and perspectives
• Conceptualization of the nature of communications problems

195

-------

196 What Else Do You Need to Know About Evaluation?

Contrary to much of the flavor of discussions at these meetings, theory must be integrated with practice. The quality of a risk communication program depends on the quality of the underlying theory. It is also important to study the reasons for any changes in public understanding and behavior that occur, because, to the extent that the causes for change can be identified, risk communication can improve. Conversely, to the extent that we do not really know whether it was the intervention or confounding forces that produced the change (whether desired or not), the substance of risk communication as well as its evaluation will not advance.

It is often stipulated that evaluation requires definition of clear goals and objectives, but goals and objectives, in reality, are never as clear as might be desired. This situation is likely to continue.
Planners also must consider whether their program goals and objectives are appropriate or whether they have been manipulated or revised to achieve the hidden objectives of institutions. In the latter case, when the program is evaluated solely according to the stated program objectives, the evaluation will be shallow and insufficient. It also is possible to evaluate risk communication programs without goals and objectives, using normative criteria. Indeed, it is necessary to go beyond the stated objectives because risk communication is fundamentally a value laden activity and a political act. Risk communication that does not include rigorous evaluation is unethical and should be avoided. A continuing problem is the large gap between what is known and what is practiced. It is essential to narrow that gap so that risk communication programs embody the best of current knowledge, theory, and experience. This requires that agencies and institutions commit themselves not only to risk communication but to practice that is anchored in state- of-the-art understanding and that strictly observes the limits of what is known. Evaluation Within a Social Context It is important to consider risk communication programs and their related evaluation within the realities of public acceptance and understanding. Some consider- ations are outlined here. • Public judgments about problems. Statistical risk is an abstraction that has little meaning for the public. Members of the public do not think about risk in the same way as scientists. The public makes judgments about the more tangible technologies, not about intangible risks. If a risk communication program is organized around risk, it may be out of tune with the needs and concerns of various publics. A key job for formative evaluation is to determine whether the communication program is actually adapted to the nature of public concerns and the breadth of relevant considerations. • Knowledge versus information. The public seeks and receives risk information from many sources and, in turn, sends information to diverse parties. All have an impact on how a problem is perceived and the way that the decision process is ------- What Else Do You Need to Know About Evaluation? 197 structured. These many confounding variables present methodologic problems in evaluating the effects and success of risk communication. People will have a better understanding of a risk problem if information is adapted to their mental models of the risk and what they feel they need to know. Correspondingly, risk managers will enlarge their understanding if they know the nature of the public experience of risk. • The unintended consequences of risk communication. Although communi- cators tend to focus primarily on the objectives they seek to realize, unintended consequences of risk communication also occur. These unintended consequences may be greater in impact than the intended consequences. They may also be harmful. The control of unintended consequences requires that they be identified and assessed. Risk communication programs themselves should undergo a risk assessment to determine potential harm as well as benefits. • Holism and integration. As the practice of risk communication continues to develop and to become more elaborate, there is a danger of specialization of expertise and division of labor. If risk communication becomes isolated from risk assessment and from other aspects of risk management, it will lose its quality and integrity. 
Examples of failures in risk communication due to its abstraction and isolation from the science of risk assessment already are apparent. • Risk communication as a humane activity. The conduct of risk communication is embedded in its mission as a humane enterprise. At base, the goals of risk communication are those of risk management more generally—to anticipate the harms that may be inflicted on people, to reduce these harms whenever that can be done reasonably, and to reduce overall human suffering. All individuals involved in risk communication need to work to ensure that risk communication remains a caring and humane activity and forms an integral part of a broader risk management process. REFERENCE Russell, M. 1989. Risk Communication: On the Road to Maturity. Paper prepared for the Workshop. ------- APPENDIX ------- A GUIDE TO EVALUATION RESEARCH THEORY AND PRACTICE Peter H. Rossi Stuart A. Rice Professor of Sociology and Acting Director Social and Demographic Research Institute University of Massachusetts at Amherst Richard A. Berk Department of Sociology University of California at Los Angeles ------- TABLE OF CONTENTS Introduction 205 Key Concepts in Evaluation Research 206 Policy Space 206 Effectiveness: Three Meanings 207 Validity 208 Measurement Error 208 Causality 208 Generalizability 209 Chance 210 The Best Possible Strategy 213 Policy Formation and Program Design Issues 213 Policy Issues and Evaluation Research 213 Fitting Strategy to Problem 213 The Policy Contexts of Evaluation 215 Policy Formation and Design Stage 215 Defining the Problem 216 Needs Assessment Where Is the Problem and How Big Is It? 217 Estimating Problem Parameters 218 Qualitative Needs Assessment Approaches 219 Forecasting Needs 220 Policy-Oriented Research: 221 Developing Promising Ideas into Workable Programs 222 The YOAA Problem 224 Will Some Particular Program Work? The Effectiveness Issue 225 Practical Developmental Evaluation Approaches 227 The Assessment of Ongoing Programs: Accountability Evaluation 227 Is the Program Reaching the Appropriate Beneficiaries? 228 Program Integrity Research: Are Benefits Being Delivered? 229 Are Funds Being Used Appropriately? Fiscal Accountability. 231 Program Assessment Evaluation 231 Can Effectiveness Be Estimated? The Evaluability Question. 232 Did the Program Work? The Effectiveness Question. 233 Designs Frequently Used For Estimating Effectiveness 237 Was the Program Worth It: The Economic Efficiency Question. 248 Evaluation in Evolution 249 REFERENCES 249 ------- Introduction Program evaluation is not something new, having been undertaken since the time when social policies and programs became recognized as secular matters. Judgments have always been made about whether prospective or ongoing programs are worth the effort and resources expended. However, evaluation research—the use of social science research methods to aid such judgments—is relatively new, becoming common only in the last two decades. Its use has grown because the assessments of policies and programs have become more complicated and because social science research methods have matured sufficiently to handle the technical issues involved. If one of the parents of evaluation research is policy makers' uncertainty about how best to determine the success of public policy and programs, then the other parent is the technical development that made evaluation research credible (at least in principle). To evaluate something means to make judgments about its value or worth. 
Evaluation research is research in support of judgments about public programs, usually social programs. It involves the application of a complex set of research procedures, mainly based on social science research methods, to the questions generated by the policy problems arising in the course of program development, implementation, and assessment. At its best, evaluation research can help policymakers make judgments about the relative success or failure of programs and policies, whether prospective or in operation. Evaluation research is not, however, a substitute for policymakers' judgments, and responsible evaluators have no interest in either circumventing the political process or becoming central players. Put another way, evaluation research is essentially about the provision of the most accurate information possible in an even-handed manner. Thus, an evaluation might determine the likely impact of a program providing information about sexually transmitted diseases to adolescent school children, but leave unaddressed the political question of whether the schools should make such programs mandatory.1 Likewise, an evaluation might estimate the degree to which charges imposed for the treatment of wastewater, proportionate to the degree of pollution, would deter manufacturers from polluting, but be silent on the fairness of such pricing policies. Or an evaluation might determine that bottle-ban initiatives really reduce litter, but take no position on whether such bans are an unreasonable interference with a free market.

This paper provides a detailed introduction to the variety of purposes for which evaluation research may be used and to the range of methods that are currently employed. Specific examples are given to provide concrete illustrations of both the goals of evaluation researchers and the methods they use. Although this paper is intended to be comprehensive in the sense of describing major uses of evaluation research, it cannot pretend to be encyclopedic. Citations to more detailed discussions are provided. In addition, there are several general references that survey the field of evaluation in a more detailed fashion (Suchman, 1967; Weiss, 1972; Cronbach and Associates, 1980; Rossi and Freeman, 1985; Cronbach, 1982; Guba and Lincoln, 1981; Guttentag and Struening, 1975; Cook and Campbell, 1979).

1 For purposes of this paper, the deeper issues surrounding the possibility, or even desirability, of true objectivity can be sidestepped. Suffice it to say that we do not hold to the conventional positivist position (see, for example, Berk et al., 1985; Berk, 1988).

205

-------

206 A Guide to Evaluation Research Theory and Practice

Key Concepts in Evaluation Research

One needs to know some of the specialized language of evaluation research to understand this paper. This section introduces the key concepts. The main intellectual roots of evaluation research are found in the social sciences. Social science concepts and research methods dominate the field and, correspondingly, most evaluation specialists have had some social science training. All social science fields have contributed to the development of evaluation research methods. The best evaluation research and the best evaluators are multidisciplinary, using an eclectic repertory of concepts and methods drawn from all of the constituent disciplines.

Policy Space. The substantive roots of evaluation research are deep in policy concerns.
Evaluations are almost entirely confined to issues that are encompassed in whatever may be the current "policy space." In other words, this means that evaluations are almost always concerned with making judgments about policies and programs that are on the current agenda of policymakers (broadly construed to include a wide variety of "players," not just public officials). Clearly, policy space is time-bound and does not encompass a permanently fixed set of policies and programs. It shifts and changes over time. It is the almost exclusive attention to matters that are included in policy space that distinguishes the evaluation researcher from the academic social scientist. A good evaluation researcher knows how to find out what is included in the policy space and what is not. Stakeholders. By virtue of its engagement in policy space matters, evaluation research is saturated with political concerns. The outcome of an evaluation can be expected to attract the attention of persons, groups, and agencies who hold stakes in the outcome. These "stakeholders" include policymakers at the executive and legislative levels, the agency officials who administer the policies or programs under scrutiny, the persons who deliver the services in question, groups representing the targets or beneficiaries of the programs, or the targets or beneficiaries themselves, taxpayers, and citizens generally. In almost all program issues, stakeholders may be aligned on opposing sides, some favoring the program and some opposed to it. Whatever the outcome of the evaluation may be, there usually are some who are pleased and some who are disappointed: it is not usually possible to please all of the stakeholders. For the most important political issues, all or nearly all of the groups listed above may appear among the vocal stakeholders. The vocal stakeholders, composed of those who make their views known, may be more narrowly restricted on typical issues. As a consequence, an evaluation report ordinarily is not regarded as a neutral document; rather, it is scrutinized, often minutely, by stakeholders who are quick to discern how its contents affect their activities. Even when an evaluation is conducted "in house"— by an agency concerned with its own activities—stakeholders may appear within the agency to appraise the report's impact on their activities. A clear implication is that evaluation research should not be undertaken by persons who prefer to avoid controversy, or who have difficulty facing criticism. A consequence of the ubiquitous presence of stakeholders is that much greater care needs to be taken in the conduct of evaluation research than in the conduct of its academic cousin, basic research. Loose procedures that border on the slipshod will surely come to ------- A Guide to Evaluation Research Theory and Practice 207 the attention of critical stakeholders and may render an evaluation report vulnerable. Another consequence is that the conduct of evaluation research often involves careful prior negotiations with stakeholders. For example, it can impede the evaluation of a school's educational program if a teachers' organization recommends that its members not cooperate with the evaluator. Effectiveness: Three Meanings. 
In the broadest sense, evaluations are concerned with whether or not a program or policy is achieving its goals; discerning these goals is an essential part of the evaluation process, and almost always its starting point This tends to be difficult because goals and purposes often are vaguely stated, typically in an attempt to garner as much political support as possible. Programs and policies that do not have clear and consistent goals cannot be evaluated. (A subspecialty of evaluation research, evaluability assessment, has developed to uncover the goals and purposes of policies and programs to judge whether or not they can be evaluated.) A key concept in evaluation is effectiveness—the extent to which a policy or program is achieving its goals and purposes. In practice, it should be emphasized that the concept of effectiveness must always address "compared to what." For marginal effectiveness the issue is dosage; the consequences of more or less of some intervention are assessed. For example, one might study whether a long-term program produces correspondingly more cancer screenings in comparison to a short-term educational campaign. For relative effectiveness, the contrast is between two or more program options.2 For example, one might compare the impact of public service announcements on cancer screenings versus that of mass pamphlet mailings, where both contain the same educational information. Finally, it is common to consider effectiveness in dollar terms: cost-effectiveness. Comparisons are made in units of outcome per dollar. For example, while mass mailings of pamphlets may increase cancer screenings more than public service announcements, the latter may be more cost- effective because it may cost less to produce an additional cancer screening using public service announcements. Validity. All research activities need to achieve validity—results that will stand up under the scrutiny of the harshest critics. Of course, validity is actually a bundle of goals; the four that follow are the most critical: Primarily, valid evaluation research uses valid measures. One must consider whether the measurement procedures used are likely to measure accurately what they are intended to, a topic that is sometimes considered under the rubric of "construct validity" (Cook and Campbell, 1979). For example, a study measuring the impact of the Center for Disease Control's pamphlet on AIDS, recently mailed to all U.S. households, must use measures that properly capture what CDC intended to affect in the way of behavioral, attitudinal, and cognitive responses among the public. 2 While "nothing" may be one of the options (serving as a comparison group, it cannot be overemphasized that nothing is not nothing (pardon our Zen). At the very least, "nothing" is likely to be the status quo. Moreover, subjects exposed to the status quo may react in a variety of ways (e.g., resentment, depression) if they know that others have been exposed to some innovative intervention. In this instance, the status quo becomes a treatment in the conventional sense; it does something new to its subjects. ------- 208 A Guide to Evaluation Research Theory and Practice It is important to stress that questions about measurement quality apply not only to program outcomes such as "learning," but also to measures of the program (intervention) itself and to other factors that may be at work (e.g., a child's motivation to learn). 
For example, an experiment on the effects of income support payments on criminal recidivism considered the payments to be the intervention, an incomplete description of the total caring support that the experimenters gave to the released prisoners along with the payments.3 Measurement Error. Space limitations prevent a thorough discussion of the mea- surement issues of evaluation research. At a minimum, evaluation researchers should be aware of the critical distinction between two kinds of measurement errors: those that are systematic and those that are random. Measurement may be subject to bias, consisting of systematic disparities between a measure used and the "true" attribute that is being measured. This is at the heart of the perennial controversy over whether standardized IQ tests really measure "general intelligence" without cultural bias. Measures also can be flawed because of random error or "noise." Whether approached as an "errors in variables" problem as in econometric literature (e.g., Kmenta, 1971:309-22), or as a "latent variable" problem as in psychometric literature (Lord, 1980), or as the "underadjustment" problem in the evaluation literature (e.g., Campbell and Erlebacher, 1970), random error can lead to decidedly nonrandom distortions in evaluation results. The role of random measurement error is sometimes addressed through the concept of "reliability." Systematic errors lead a measurement device to produce biased readings, by either over- or underestimating. In contrast, random errors lead to readings that are variable but unbiased, just as likely to over- as to underestimate. Causality. Many evaluation questions concern causal relations, e.g., whether or not a proposed program encouraging people not to use wood-burning stoves on high-air- pollution days will cause reductions in air pollution. The literature on causality and causal inference is large and currently fraught with controversy (e.g., Pratt and Schlaifer, 1984; Holland, 1986; Holland and Rubin, 1988; Berk, 1988). Suffice it to say that by a "causal effect" we mean a comparison between the outcome (following an intervention) compared to what the outcome would have been had the intervention not been introduced. For example, the causal effect of a ban on diesel-powered automobiles might be the amount of nitrogen based pollutants in the air after banning diesel automobile engines compared to the amount had the ban not been put in place. From the definition of a causal effect, it should be apparent that in practice, causal effects cannot be directly observed. One cannot observe the amount of nitrogen based pollutants in the air simultaneously with and without the ban in place. Rather, causal effects must be inferred. Thus, one might try to estimate the causal effect of the ban by comparing air quality before and after the ban. Or one might try to estimate the causal effect of the 3 For example, experimenters often escorted subjects to their employment, checked to see that they picked up their payment checks, and provided advice on how to retain employ- ment. Although these additional treatments may not have affected the results, valid measurement of the treatment should have included these measures in addition to the payment. (See Rossi, Berk, and Lenihan, 1980.) ------- A Guide to Evaluation Research Theory and Practice 209 ban by comparing air quality in an area with the ban to the air quality in an area without the ban. 
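To make the counterfactual definition concrete, the brief sketch below (written in Python; every number is hypothetical and is not drawn from any study cited here) contrasts the causal effect of such a ban (the with-ban outcome compared with the outcome that would have occurred without the ban) with a simple before-and-after estimate; the assumptions such comparisons require are taken up in the discussion that follows.

# Toy illustration of the counterfactual definition of a causal effect.
# All values are hypothetical; the units are an arbitrary index of nitrogen-based pollutants.

pollution_with_ban = 42.0      # outcome observed after the diesel ban is in place
pollution_without_ban = 55.0   # outcome that would have occurred without the ban (never observable)

causal_effect = pollution_with_ban - pollution_without_ban   # -13.0: the change the ban itself produced

# A before/after comparison estimates this effect only if nothing else changed in the interval:
pollution_before_ban = 57.0    # earlier reading, which also reflects, say, colder weather
before_after_estimate = pollution_with_ban - pollution_before_ban   # -15.0, not -13.0

print(f"true causal effect: {causal_effect}")
print(f"before/after estimate: {before_after_estimate}")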
However, in the first case one must assume that no other changes have occurred that could affect air quality in the interval between the earlier and later observational periods. In the second case, one must assume that the two areas are otherwise identical with regard to all factors that could influence air quality. In short, the need to infer causal effects opens the door to inferential errors. In practice, therefore, whenever a causal relationship is proposed, alternative explanations must be addressed and presumably discarded. If such alternatives are not considered, one may be led to make spurious causal inferences; the causal relationship being proposed may not in fact exist. Sometimes this concern with spurious causation is addressed under the heading of "internal validity" (Cook and Campbell, 1979) and, as in the case of construct validity, is relevant regardless of the stage in a program's life history (assuming causal relationships are at issue). For example, anyone who claims that an educational TV program improved the knowledge of those who viewed it must also consider the alternative explanation that viewers were self-selected persons interested in the topic who would have picked up the same amount of information in some other way, if the program were not available.

The consideration of alternative causal explanations for the success of programs is an extremely important research design consideration (Heckman and Robb, 1985). In the wood-burning example, an observed change in air pollution after the program went into effect could be the result of milder weather, improved wood-burning equipment, or a rise in cordwood prices that led people to shift to other fuels, rather than changes produced by the program. In addition, programs that deal with humans are all more or less subject to problems of self-selection; often the persons who are most likely to be helped or who are already on the road to recovery are those most likely to participate in a program. Thus, vocational training offered to unemployed adults is likely to attract those who would be most apt to improve their employment situation in any event. Also, program operators sometimes choose the best among target populations to participate in programs, thereby assuring that such programs appear to be successful. In other cases, events unconnected with the program produce improvements that appear to be the result of the program; an improvement in employment for parents, for instance, may make it more likely that their adolescent children will stay in and complete their high school training. In any case, we will have more to say about causal inference later.4

Generalizability. Whatever the empirical conclusions resulting from evaluation research, it is necessary to consider how broadly one can generalize the findings in question; that is, are the findings relevant to other times, other subjects, similar programs, and other program sites?

4 To anticipate a bit, the evaluator has two sorts of tools at his/her disposal. First, the data may be collected in a manner that greatly simplifies causal inference. Experiments in which subjects are assigned at random to experimental and control conditions are a good example. Second, the data may be analyzed in a fashion that explicitly addresses a set of specified, alternative causal explanations. Analysis of covariance is a common example. A good rule of thumb, however, is that a strong data analysis will almost never overcome a weak research design.
-------

210 A Guide to Evaluation Research Theory and Practice

Sometimes such concerns are raised under the rubric of "external validity" (Cook and Campbell, 1979), and again, the question is germane to all program stages regardless of the evaluation method. Thus, even if a quantitative assessment of high school cholesterol education programs indicates that they do not change the eating patterns of high school students, this does not mean that adult education programs would be ineffective. Similarly, a descriptive account of why the cholesterol education program did not work for teenagers may or may not be generalized to apply to adult education programs. The high school cholesterol education example used here obviously is limited in generality because health educators know that teenagers are motivated by different things than adults. However, for other topics, limitations on generalization may not be as obvious. Standard questions that can be raised about most evaluations are whether the findings are applicable to other age groups, ethnic groups, cities, regions, agencies, or school systems besides those in which they were found. Or are the results specific only to the organizations in which the program was tested?

Another issue that arises is whether a program's results would be applicable to persons who are different in abilities or in socioeconomic background. For example, Sesame Street was found to be effective with respect to preschool children from lower socioeconomic families, but was more effective with children from middle-class families (Cook et al., 1975). Similarly, curricula that work well in junior colleges may not be appropriate for students in senior colleges. Programs that worked well with adults in their middle years may not be effective for the aged.

There is also the problem of generalizing over time. For example, Maynard and Murnane (1979) found that transfer payments provided by the Gary Income Maintenance Experiment apparently increased the reading scores of children from the experimental families. One possible explanation is that with income subsidies, parents (especially in single-parent families) were able to work less and therefore spend more time with their children. Even if this is true, it raises the question of whether similar effects would be found at present, when inflation is taking a smaller bite out of the purchasing power of households.

Finally, it is impossible to introduce precisely the same treatment(s) when studies are replicated or when programs move from the development to the demonstration stage. Hence, one is always faced with trying to generalize across treatments that are rarely identical. In summary, external validity surfaces as a function of the subjects of an evaluation, the setting, the historical period, and the treatment itself. Another way of describing this issue of generalization is to consider that programs vary in their "robustness"; that is, in their ability to produce the same results under varying circumstances, with different operators, and at different historical times. Clearly a "robust" program is highly desirable.

Chance. It is always important, whatever one's empirical assessments, that the role of chance be properly taken into account. When formal, quantitative findings are considered, this is sometimes addressed under the heading of "statistical conclusion validity" (Cook and Campbell, 1979), and the problem is whether tests for "statistical significance" have been undertaken properly.
For example, perhaps people who have viewed television programs about the risks incurred by excessive exposure to the sun appear subsequently to lower their sun exposure, when compared to persons who have not seen the television program in question. But no two groups are ever identical: The observed ------- A Guide to Evaluation Research Theory and Practice 211 differences in sun exposure may have resulted from chance factors having nothing to do with the television program. Unless the role of these chance factors is formally assessed, it is impossible to determine if the apparent program effects are real or illusory. Similar issues concerning the role of chance appear in non-quantitative work as well, although formal assessments of the role of chance are difficult to undertake in such studies. Nevertheless, it is important to ask whether the reported findings rest on observed behavioral patterns that occurred with sufficient frequency and regularity to warrant the conclusions that they are not simply the result of chance. Three types of factors play a role in producing apparent (chance) effects that are not "real." The first reflects sampling error and occurs whenever one is trying to make statements about some population of interest from observations gathered on a subset of that population. For example, one might be studying a sample of students from the population attending a particular school, or a sample of teachers from the population of teachers in a particular school system, or even a sample of schools from a population of schools within a city, county, or state. Yet although it is typically more economical to work with samples, the process of sampling necessarily introduces the prospect that any conclusions based on the sample may differ from conclusions that might have been reached had the full population been studied instead. Indeed, one can well imagine obtaining different results from different subsets of the population. Although any subset that is selected from a larger population for study purposes may be called a sample, some subsets may be worse than having no observations at all. The act of sampling must be accomplished according to rational selection procedures that guard against the introduction of selection bias. A class of such sampling procedures that yield unbiased samples are called "probability samples," in which every element in a population has a known chance of being selected (Sudman, 1976; Kish, 1965). Probability samples are difficult to execute and are often quite expensive, especially when dealing with populations that are difficult to locate. Yet there are clear advantages to such samples, as opposed to haphazard and potentially biased methods of selecting subjects, that probability samples are almost always to be preferred over less rational methods. (See Sudman [1976] for examples of relatively simple and inexpensive probability sampling designs.) Fortunately, when samples are drawn with probability procedures, disparities between a sample and a given population can only result from the "luck of the draw." With the proper use of statistical inference, one can place "confidence intervals" around estimates from probability samples, or ask whether a sample estimate differs in a statistically significant manner from some assumed population value. In the case of confidence intervals, one can obtain a formal assessment of how much "wiggle" there is likely to be in one's sample estimates. 
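As a concrete illustration of this "wiggle," the sketch below (Python; the sample size and count are hypothetical) places an approximate 95 percent confidence interval around a proportion estimated from a probability sample, using the familiar normal approximation.

import math

# Hypothetical probability sample: 400 students drawn at random from a school system,
# of whom 136 report the behavior of interest.
n = 400
successes = 136
p_hat = successes / n                              # sample proportion (0.34)

se = math.sqrt(p_hat * (1 - p_hat) / n)            # standard error of the sample proportion
margin = 1.96 * se                                 # half-width of an approximate 95 percent interval
lower, upper = p_hat - margin, p_hat + margin

print(f"estimate {p_hat:.2f}, approximate 95 percent confidence interval ({lower:.2f}, {upper:.2f})")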
In the case of statistical significance tests, one can reach a decision about whether a sample statistic (e.g., a mean reading score) differs from some assumed value in the population. For example, if the mean reading score from a random sample of students differs from some national norm, one can determine if the disparities represent statistically significant differences.

A second kind of chance factor stems from the process by which experimental subjects may be assigned to experimental and control groups. For example, it may turn out that the assignment process yields an experimental group that on the average contains brighter students than the control group.

-------

212 A Guide to Evaluation Research Theory and Practice

As suggested earlier, this may confound any genuine treatment effects with a priori differences between experimentals and controls; here the impact of some positive treatment such as self-paced instruction will be artificially enhanced because the experimentals were already performing better than the controls. Much as in the case of random sampling, in controlled experiments in which the assignment to treatment group or control is undertaken with probability procedures, the role of chance factors can be taken into account. In particular, it is possible to determine the likelihood that outcome differences between experimentals and controls are statistically significant. If the disparities are statistically significant, chance (through the assignment process) is eliminated as an explanation, and the evaluator can begin making substantive sense of the results. It is also possible to place confidence intervals around estimates of the treatment effect(s) indicating the likely range of the effects, given that any estimate is subject to sampling variation.

A third kind of chance factor has nothing to do with research design interventions undertaken by the researcher (i.e., random sampling or random assignment). Rather, it surfaces even if the total population of interest is studied and no assignment process or sampling is undertaken. In brief, one proceeds with the assumption that, whatever the program processes at work, other forces are also at work that will have some impact, though not systematically, on the outcomes of interest. Typically, these are viewed as a large number of small, random perturbations that on the average cancel one another. For example, performance on a reading test may be affected by a child's mood, the amount of sleep the previous night, the content of the morning's breakfast, a recent quarrel with a sibling, distractions in the room where the test is taken, anxiety about the test's consequences, and the like. While these each introduce small amounts of variation in a child's performance, their aggregate impact is taken to be zero on the average (i.e., their expected value is zero). Yet since the aggregate impact is only zero on the average, the performance of particular students on particular days will be altered. Thus, there will be chance variation in performance that needs to be taken into account. As before, one can apply tests of statistical inference or confidence intervals. One can still ask, for example, if some observed difference between experimentals and controls is larger than might be expected from these chance factors, and/or estimate the "wiggle" in experimental-control disparities.
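The sketch below (Python, with simulated data and a hypothetical treatment effect) pulls these ideas together for a randomized experiment: subjects are assigned at random to treatment and control groups, and the estimated treatment effect is reported together with a large-sample test statistic and an approximate 95 percent confidence interval.

import math
import random
import statistics

random.seed(1)

# Hypothetical randomized experiment: 200 subjects assigned at random,
# half to a treatment (e.g., self-paced instruction) and half to a control condition.
subject_ids = list(range(200))
random.shuffle(subject_ids)                        # random assignment
treated_ids, control_ids = subject_ids[:100], subject_ids[100:]

def test_score(is_treated):
    # Scores vary by chance; the (hypothetical) treatment adds 3 points on average.
    return random.gauss(70, 10) + (3 if is_treated else 0)

treated_scores = [test_score(True) for _ in treated_ids]
control_scores = [test_score(False) for _ in control_ids]

effect = statistics.mean(treated_scores) - statistics.mean(control_scores)
se = math.sqrt(statistics.variance(treated_scores) / len(treated_scores) +
               statistics.variance(control_scores) / len(control_scores))
z = effect / se                                    # large-sample test statistic
ci = (effect - 1.96 * se, effect + 1.96 * se)      # approximate 95 percent confidence interval

print(f"estimated effect {effect:.1f} points, z = {z:.2f}, "
      f"95 percent CI ({ci[0]:.1f}, {ci[1]:.1f})")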
In case it is not clear, statistical conclusion validity speaks to the quality of inferential methods applied and not to whether some result is statistically significant. Statistical conclusion validity may be high or low, independent of judgments about statistical significance. (For a more thorough discussion of these and other issues of statistical inference in evaluation research, and statistical inference in general, see Berk and Brewer, 1978; Barnett, 1982; Pollard, 1986.)

It is important to understand that the critical issues outlined above apply to all varieties of evaluation research, whether highly quantitative in approach or highly qualitative. In summary, evaluation research involves a number of key concepts, each corresponding to critical questions linked to evaluation design issues. Evaluations are concerned with policies and programs that are on the public agenda, suffused with political concerns, relevant to policies and programs that have clearly formulated goals, often obsessed with effectiveness issues, and designed to enhance outcome validity.

-------

A Guide to Evaluation Research Theory and Practice 213

The Best Possible Strategy. In the next sections, the general issues just raised will be addressed in more depth. However, before proceeding it is important to note that practical constraints may intervene in the real world of evaluation research, even when an ideal marriage is made between the evaluation questions posed and the empirical techniques employed. Questions of cost, timeliness, political feasibility, and other difficulties may prevent the ideal from being realized. This in turn will require the development of a second-best evaluation package (or even a third-best), more attuned to what is possible in practice. On the other hand, practical constraints do not in any way justify a dismissal of technical concerns; if anything, technical concerns become even more salient when less desirable evaluation procedures are employed.

Policy Formation and Program Design Issues

Policy Issues and Evaluation Research. Virtually all evaluation research begins with one or more policy questions in search of answers. Evaluation research may be conducted to answer questions that arise during the identification of policy issues, the formulation of policy responses to such issues, the design of programs, the improvement of programs, and the testing of the efficiency and effectiveness of programs that are in place or are being considered. Specific policy questions may be concerned with how widespread a social problem is, whether any program can be enacted that will ameliorate a problem, whether programs are effective, whether a program is producing enough benefits to justify its cost, and so on.

Fitting Strategy to Problem. A given evaluation problem may be tackled at levels varying in intensity and thoroughness. When exquisite precision is needed and ample resources are available, state-of-the-art evaluation procedures may be employed. When the occasion demands approximate answers and when resources are in short supply, "rough-and-ready" (and, usually, speedier) procedures can be used. Correspondingly, the answers supplied by evaluations vary in quality: the findings of some evaluations are more credible than others, but all genuine evaluations produce findings that are better than haphazard guesses. This does not mean that any evaluation can use any means available.
The principle that should be upheld in selecting evaluation procedures is the principle of "best possible," given available resources and constraints.5 Given the diversity of policy questions to be answered and the wide variations in available resources, it should not be surprising that there is no single best way to proceed. Evaluation research must draw on a variety of perspectives and procedures. Thus, approaches that might be useful for determining what activities were actually undertaken under some educational program, for instance, might not be appropriate when the time comes to determine whether the program was worth the money spent. Similarly, techniques that may be effective in documenting how a program is functioning on a day- to-day basis may prove inadequate for the task of assessing the program's ultimate impact. 5 This principal requires far more than lip service. It is all too common to hear in response to criticism of a slipshod evaluation the lame excuse that "it was the best that could be done under the circumstances" when in fact technically superior (and sometimes less wasteful) procedures easily could have been employed. ------- 214 A Guide to Evaluation Research Theory and Practice The choice among evaluation methods depends in the first place on the particular question posed; appropriate evaluation techniques must be linked explicitly to each of the policy questions posed. While this point may seem simple enough, it has been overlooked far too often, resulting in force- fits between an evaluator's preferred method and the particular questions at hand. Another result is an evaluation research literature padded with empty, sectarian debates between warring camps of "true believers." For example, there has been a long and somewhat tedious controversy about whether assessments of the impact of social programs are best undertaken with research designs in which subjects are randomly assigned to experimental and control groups or through theoretically derived causal models of how the program works. In fact, the two approaches are complementary and can be effectively wedded (e.g., Rossi, Berk, and Lenihan 1980; Heckman and Robb, 1985). Secondly, the choice among evaluation methods is conditioned by the resources available and by the degree to which precision is needed. It is probably overkill to devote more resources to an evaluation than to the program being evaluated.6 Nor does it make sense to plan an evaluation that will take several years to complete when the answers it will supply are needed within a few weeks. The evaluation effort must be tailored to fit the circumstances; that is, the need for precision in information and the amount of resources and time that are available. Finally, evaluations must be tailored to the degree of importance of the issue under scrutiny. At the one extreme, routine issues concerning potentially low impact programs probably do not deserve to be evaluated with any degree of care. For example, it would make very little substantive difference whether soft-steel paper clips were superior or inferior to plastic paper clips: Hence, it is not worthwhile to invest many (if any) resources toward evaluating their comparative merits.7 Similarly, it would make little sense to evaluate a media campaign involving one 30-second television spot broadcast over a small local station; we know in advance that the campaign would not be sufficiently strong to leave any appreciable residual effect. 
In contrast, policies dealing with central issues and programs that are very expensive deserve the most careful evaluation possible. A program designed to reduce exposure to the AIDS virus by saturating the media with messages deserves careful evaluation both because the issue is a critical one and because the program in question would require a major allocation of resources. 6 However, one must carefully judge what is at stake. For example, while the costs of an evaluation may loom larger compared to the costs of the particular program under consideration, the evaluation findings may have vital implications for many more pro- grams and for a larger program. In the context of the universe of programs potentially affected, the evaluation budget may be relatively small. 7 On the other hand, while they might perform similarly, they may have different environmental implications. Much would depend on the ways the two kinds of clips are manufactured and on what happens to them when they are thrown away. And, of course, the issues might be extremely salient to the competitive needs of paper clip manufacturers. ------- A Guide to Evaluation Research Theory and Practice 215 The Policy Contexts of Evaluation. To obtain a better understanding of the fit between evaluation questions and the requisite evaluation procedures, it is useful to distinguish between two broad evaluation contexts: 1) policy and program formation contexts, in which policy questions are being raised about the nature and amount of some identified problem, whether appropriate policy actions can be taken, and whether programs that may be proposed are appropriate and effective; and 2) existing policy and program contexts, in which the issues are whether appropriate policies are being pursued and whether existing programs are achieving their intended effects. Although these two broad contexts may be regarded as stages in a process that starts with the recognition of policy needs and ends with the installation and testing of programs designed to meet those policy needs, the unfolding of the policy process may bypass some evaluation activities. There are many examples of major programs that have had truncated policy formation stages, going straight from the drawing boards of the executive or legislature to full-scale operation. For example, Head Start and school lunch programs were started with minimum amounts of program testing beforehand. The issue of whether Head Start was or was not effective did not surface until some years after the program had been in place. Similarly, many programs apparently never get beyond the testing stage, either by being shown to be ineffective or politically troublesome (e.g., contract learning, Gramlich and Koshel, 1975) or because the policy issues to which they were addressed shifted in the meantime (e.g., the case of negative income tax proposals, Rossi and Lyall 1974). We do not mean to imply—by the organization of this section—that policymakers always ask each of the questions raised in the order that we addressed them. The questions are arranged from general to specific, but that is an order we have imposed and it is not intended to be a description of typical sequences. For example, research that uncovers the extent and depth of a social problem may spark the need for policy change, rather than vice versa, as may appear to be implied in this section. Policy Formation and Design Stage. 
Proposals for policy changes and new programs presumably arise as the result of dissatisfaction with existing policy, existing programs, or out of the realization that a problem exists for which a new policy and program may be an appropriate remedy. Policymakers and administrators need information that would make the policy and accompanying programs relevant to the problem and efficacious in providing at least some relief from the burdens imposed by the problem. It is important that the previous paragraph not be misunderstood. For example, we do not presuppose that the solutions sought by policymakers will solve the social problems in question as seen in some objective sense, but only that the problem as experienced and understood by the policymaker is to be addressed. Thus, from the perspective of policymakers, eradicating poverty may not be the goal so much as lowering the level of expressed concern with the problem of poverty, as experienced by the decisionmakers. It is also important to stress that defining a social problem is ultimately a political process whose outcomes do not simply flow from an assessment of available information. While it would be difficult to argue against providing the best possible data for potential areas of need, there is no necessary correspondence between patterns in those data and what eventually surfaces as a subject of concern. For example, in an analysis of pending legislation designed to reduce adolescent pregnancy, the General Accounting Office ------- 216 A Guide to Evaluation Research Theory and Practice (GAO, 1987) found that none of the legislation defined the problem as involving the fathers of the children in question. Every proposal addressed adolescent pregnancy as if it were an issue involving only young women. Another example concerns varying definitions of the problem of water pollution, each with different emphases placed on sources of pollution, counteracting technical solutions, and end-user solutions. (See also Berk and Rossi [1976] for a more thorough discussion of problem definition issues.) In principle and in practice, no useful distinction can be made between the formation of new policies and programs and the improvement of existing policies and programs. A proposed improvement is nothing but a proposed change. Correspondingly, the same evaluation procedures applicable to entirely new policies and programs are suitable for proposed changes to existing policies and programs. Therefore, the discussion that follows does not distinguish between them. Defining the Problem . A political or program issue is a social construction. That is, a condition that is defined as problematic thereby becomes a problem. Consequently, the beginning of a political issue consists of defining the problem in question. The preambles to proposed legislative actions usually recognize this principle by defining the conditions for which the legislation is designed as a remedy. A legislation program designed to address a particular problem is necessarily based on some definition or understanding of the issue involved. For example, two contending legislative proposals may address the issue of homeless persons, one identifying the homeless as needy persons who have no kin upon whom to be dependent, and the other defining homelessness as the lack of access to conventional shelter. The first definition centers attention primarily on the social isolation of potential clients and the second focuses on their housing arrange- ments. 
It is likely that the ameliorative actions that follow will be different as well. The first might emphasize a program to reconcile alienated persons with their relatives, while the second might propose a subsidized housing program. The two definitions lead to quite different proposed programs. To pursue another example, the presence of hazardous substances in water supplies may be defined either as a user problem or as a production problem. In the first instance, appropriate programs might educate users about how to best avoid contaminated water sources or, alternatively, how best to purify water before consumption. The second definition might lead to devising surveillance programs of potential polluters and the setting of sanctions for allowing pollution to take place. Note that these two definitions are not contradictory: rather, each highlights an aspect of the problem. The explication of definitions is, of course, not a task for which evaluators are uniquely trained. Lawyers and judges, textual analysts, and others are trained in laying open the logical structure and probing the inclusiveness and exclusiveness of definitions. Yet there is a special role that evaluators can play in this process by analyzing the implications of definitions for substantive concerns. It is clear that the two definitions of water pollution given above focus on slightly different (albeit overlapping) phenomena, but they also contain clues to the underlying factors that are considered to be driving those processes. Thus, judgments about definition issues may require substantive knowledge that evaluators often have (or can get easily). Especially critical in the explication of problem definitions is the relationship between what is popularly considered to be the problem and the implicit or explicit ------- A Guide to Evaluation Research Theory and Practice 217 definitions in the legislation addressing the problem. In this connection, the evaluator ordinarily would refer to legislative proceedings, including committee hearings and floor debates, journals of opinion, newspaper and magazine editorials, and other sources in which discussions of the problem may appear. The purpose of this review of sources is to examine how the problem has been formulated and to delineate as clearly as possible the set of alternatives that define the policy space for the issue in question. Certainly an important role for evaluators to play at this stage is to provide for policymakers' critiques of problem definitions inherent in the proposed policies and accompanying programs, and to propose alternative definitions that may be more ap- propriate. For example, an evaluator might point out that defining the teenage pregnancy problem as primarily one of illegitimate births ignores the large number of births that occur among married teenagers. Needs Assessment: Where Is the Problem and How Big Is It? The proper design of a public program and the projection of its costs requires accurate information on the density, distribution, and overall size of the problem in question. For example, in providing financial support for emergency shelters for homeless persons, it would make a significant difference if the total population of homeless is of a magnitude of 3.5 million or 350,000 (both estimates have been advanced). Whether the problem of homeless persons is located primarily in large central cities or can be found in equal amounts in both small and large places also would make an important difference in program design and planning. 
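To make the point about magnitude concrete, a purely illustrative calculation is sketched below in Python; the per-person cost is an invented figure used only for exposition and does not describe any actual program.

    # Purely illustrative: how a budget projection scales with the two
    # homeless-population estimates cited above. The per-person cost is an
    # invented figure used only for exposition.
    LOW_ESTIMATE = 350_000       # persons
    HIGH_ESTIMATE = 3_500_000    # persons
    COST_PER_PERSON = 2_500      # hypothetical annual cost per person, in dollars

    for label, size in [("low estimate", LOW_ESTIMATE), ("high estimate", HIGH_ESTIMATE)]:
        print(f"{label}: {size:,} persons -> projected annual cost ${size * COST_PER_PERSON:,}")
    # The two projections differ by a factor of ten; uncertainty about the
    # magnitude of the problem translates directly into uncertainty about the
    # scale of the program budget.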
An identified problem often is a complex mix of related conditions; planning requires information on all related factors. For example, the proportions of the homeless suffering from chronic mental illness, chronic alcoholism, or physical disabilities need to be known in order to design an appropriate mix of programs. It is much easier to identify and define a problem than to develop valid estimates of its density and distribution. For example, a handful of battered children may be enough to establish that a problem of child abuse exists. However, to know how much of a problem exists and where it is located—geographically and socioeconomically—involves obtaining detailed information about the population of abused children and its distribution throughout the political jurisdiction in question. Ordinarily, such exact knowledge is much more difficult to obtain. Through knowledge of the existing literature (consisting of government reports, published and unpublished studies, and limited distribution reports) and an understanding of which designs and methods lead to conclusive results, evaluation researchers are able to collate and assess whatever information exists on the issues in question. Note that equal emphasis is given to both collating and assessing: unevaluated information often can be as bad as no information at all. For some issues, existing data sources may be of sufficient quality to be used with confidence. For example, data that are routinely collected either by the Current Population Survey or the decennial Census often are accurate and trustworthy information sources to use. Likewise, data available in many of the statistical series routinely collected by federal agencies are often trustworthy.8

8 Unfortunately, there are exceptions. For example, it is widely acknowledged that the U.S. Census undercounts the number of blacks and Hispanics. For the nation as a whole, the undercount is relatively small and for most purposes can be ignored. However, for some jurisdictions with large populations of blacks and Hispanics, the undercount translates into substantial losses of federal funds (since many programs are tied to the size of particular populations). This has led to a lawsuit by the State of New York in which statistical adjustments for the undercount have been proposed (Ericksen and Kadane, 1985). In short, how good the data must be always depends on how those data will be used.

------- 218 A Guide to Evaluation Research Theory and Practice

But when data from other sources are used, it is always necessary to carefully examine how the data were collected. The assessment of data quality is another task for which evaluators are eminently qualified. A good rule of thumb is that existing data sources will provide contradictory estimates on any issue. But even chaos sometimes can be reduced to some order. Seemingly contradictory data collected by opposing stakeholders can be especially useful for needs assessment purposes. For example, both the Coalition Against Handguns and the National Rifle Association have sponsored sample surveys of the American population concerning approval or disapproval of gun control legislation. Although the reports issued by the Coalition and the NRA differed widely in their conclusions—one finding much popular support for more stringent gun control measures and the other finding the opposite—a close inspection of the data showed that many of the specific findings were nearly identical in the two surveys (Wright et al., 1983). Those findings upon which both surveys agreed substantially can be regarded with greater credibility. In many instances, there may be no existing information that can provide estimates of the extent and distribution of a problem. For example, it is likely that there are no sources of information about how households use pesticides or about the level of popular knowledge concerning how such substances can be safely used.
Any instance of household pesticide misuse identifies a problem, but how serious the problem is—in households with children, for example—may be unclear. It may or may not be the case that the problem is a lack of knowledge concerning the toxic properties of certain pesticides, or a lack of knowledge about alternatives used to control household or garden pests. Ordinarily, there are no sources from which information on such issues can be obtained. Under these circumstances, an evaluator may wish to undertake a special preliminary study to estimate the amount and distribution of household pesticide use and the level of popular knowledge concerning the toxic properties of household pesticides.

Estimating Problem Parameters. There are several ways of estimating a problem's parameters. Perhaps the easiest to undertake, but also the least reliable, is to rely on expert testimony. Most of the larger estimates of the size of the homeless population are essentially compilations of local experts' guesses of the numbers of homeless in their localities (see U.S. Conference of Mayors, 1987). Another source of information that can be reliable—but is often unavailable—is the records from organizations that provide services to the population in question. For example, the extent of drug abuse may be extrapolated from the records of persons treated in drug abuse clinics. To the extent that the drug-using community is fully covered by these clinics, such data may be quite accurate.9

9 It is also the case that if drug-abuse clinics did not cover all or most of the drug-abusing population, drug-abuse treatment programs may not be an issue. Hence, to the extent that the problem is being adequately handled by existing programs, data from such programs may be useful, but that is not the situation in which data are usually needed.

------- A Guide to Evaluation Research Theory and Practice 219

In many cases, it may be necessary to conduct research to assess the extent and amount of a problem. To illustrate, the Robert Wood Johnson Foundation and the Pew Memorial Trust were trying to plan a program for making medical care more accessible to homeless persons. Although there was an ample amount of evidence that serious medical conditions existed among the homeless population in urban centers, there was virtually no precise information on either the size of the homeless population or the extent to which medical problems existed in that population. Hence, the foundations funded a research project to devise technical advances in sample survey methods, in order to collect the missing information. The result was a research project that influenced most of the subsequent research on homelessness and has led to changes in plans for the 1990 Census that will make it possible to arrive at reasonable estimates of the homeless population on a national basis (Rossi, Fisher, and Willis, 1986). Needs assessment research is usually not as elaborate as the pilot research described above. In many cases, straightforward sample surveys can provide most of the necessary information.
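A minimal sketch, in Python and with invented figures, of the kind of estimate such a survey yields for a single question under the assumption of a simple random sample; the planning example that follows shows where estimates of this sort enter.

    # Illustrative only: estimating the share of households aware that a class of
    # household pesticides is hazardous, with a conventional 95% margin of error.
    # The counts are invented; an actual survey would also apply design weights.
    import math

    n = 1_200        # hypothetical completed interviews
    aware = 780      # hypothetical respondents classified as knowledgeable

    p = aware / n                                  # estimated proportion
    margin = 1.96 * math.sqrt(p * (1 - p) / n)     # 95% margin of error (normal approximation)

    print(f"Estimated share knowledgeable: {p:.1%} plus or minus {margin:.1%}")
    # Subgroup estimates (for example, households with young children) use the
    # same arithmetic on smaller samples, so their margins of error are wider.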
For example, in planning for educational campaigns to increase public understanding of the risks associated with hazardous substances, it would be necessary to have a good understanding of what the current level of public knowledge is and which population subgroups pose special problems. A national sample survey would provide the necessary information.10 The number of local needs assessments covering single municipalities, towns, or counties done every year must now be in the thousands. For example, the 1974 Community Mental Health legislation called for community mental health needs assessments to be undertaken periodically. Last year's McKinney Act mandating aid to the homeless calls for states and local communities to undertake needs assessments as the basis for planning programs for the homeless. Also, social impact statements, to be prepared in advance of large-scale alterations to the environment, often call for estimates of the numbers of persons or households to be affected or to be served. The quality of such local needs assessments varies widely but is most likely quite poor on the average. Especially difficult obstacles lie in the need to devise valid measurements of relatively subtle social problems (e.g., distrust of food additives, or mental health). For such problems, unusually high-quality surveying methods are needed, the resources for which are often simply lacking on the local level.

Qualitative Needs Assessment Approaches. It should be noted that the research associated with needs assessments can be as inexpensive as copying the relevant information from printed volumes of the U.S. Census, or as costly as several years of effort in designing, fielding, and analyzing a large-scale sample survey. Moreover, needs assessments do not have to be undertaken solely with quantitative techniques.

10 There are many national survey organizations that have the capability to plan, carry out, and analyze such surveys under contract. In addition, it is often possible to add questions to an existing national survey, thereby (possibly) reducing costs. It should be noted that for surveys of a given sample size, national surveys are slightly more expensive than local surveys.

------- 220 A Guide to Evaluation Research Theory and Practice

Qualitative research—ranging in complexity from interviewing a few people in group discussion sessions, as in the focus group approach, to the more elaborate ethnographic research employed by anthropologists—may also be instructive, especially in getting detailed knowledge of the specific nature of the needs in question. For example, the development of educational campaigns may be considerably aided by qualitative data on the structure of popular beliefs. What, for instance, are the tradeoffs people believe exist between the pleasures of cigarette smoking and the resulting health risks? On the other hand, when the time comes to assess the extent of a problem, there is usually no substitute for formal quantitative procedures. Stated a bit starkly, qualitative procedures are likely to be especially effective in determining the nature of the need. Quantitative procedures are, however, essential to determine the extent of the need. An especially attractive feature of qualitative approaches is that they appear inexpensive.
Certainly conducting three or four focus group sessions is less costly than conducting a sample survey. However, the information obtained from focus groups usually cannot be generalized accurately beyond the highly self-selected focus group participants. Although needs assessment research is ordinarily undertaken for the descriptive purpose of developing accurate estimates of the amounts and distribution of a given problem, such research also can yield some understanding of the problem's underlying mechanisms. For example, a search for information on how many high school students study a non-English language may reveal that many schools do not offer such courses; therefore, part of the problem is that there are not enough opportunities to learn foreign languages. As another example, the fact that many primary school children of low socioeconomic backgrounds appear to be tired and listless in class may be explained by a finding that many eat no breakfast. Carefully, sensitively conducted qualitative studies are particularly important for uncovering process information of this sort. Thus, ethnographic studies of disciplinary problems within high schools may suggest why some schools have fewer disciplinary problems than others, in addition to providing some indication of how widespread such disciplinary problems are. The findings on why schools differ might suggest useful ways in which new programs could be designed. Another example concerns how qualitative research on household energy-consumption uncovered the fact that few households had any information on the energy consumption characteristics of their appliances. Without knowing how they consumed energy, households could not develop efficient strategies for reducing consumption. Indeed, the history of ups and downs in public concern for social problems provides many examples of how qualitative studies (e.g., Lewis, 1965; Liebow, 1967; Riis, 1890; Carson, 1955), and sometimes novels (e.g., Sinclair, 1906; Steinbeck, 1939) have raised public consciousness about particular social problems. Sometimes the works in question are skillful combinations of the qualitative and quantitative information, as in the case of Harrington (1962), whose Other America contained much publicly available data inter- laced with graphic descriptions of the living conditions endured by the poor. Forecasting Needs. For program planning purposes, it is often important to be able to project current circumstances into the future. A problem that is serious at present, for instance, may be more or less serious years later. Yet forecasting future trends can be quite risky, especially as the time horizon lengthens. There are a number of technical and ------- A Guide to Evaluation Research Theory and Practice 221 practical difficulties, which derive in part from the necessary assumption that the future will be much like the past. For example, a projection of the number of persons aged 18 to 30 a decade later at first seems easy to construct; the number of persons of that age ten years hence is almost completely determined by the current age structure of the population. However, had demographers in central Africa made such a forecast 10 years ago, they would have been substantially off the mark. They would have failed to anticipate the tragic impact of the AIDS epidemic, which is most prevalent among young adults. 
Projections with longer time horizons would have been even more problematic because trends in fertility as well as mortality would have to have been included.11 Note that we are not arguing against forecasting. Rather, we are concerned by the uncritical acceptance of forecasts—acceptance of the information without a thorough examination of how the forecasts were produced. For example, examining the forecasting assumptions is a task that can vary considerably in complexity. For simple extrapolations of existing trends, the assumptions may be relatively few and easily ascertained. However, even if the assumptions are known, it is often unclear how to determine if the assumptions are reasonably met. For projections developed from multiple-equation, computer-based models, examining the assumptions may require the skills of an advanced programmer and the insight of a sophisticated statistician. All forecasts should be reported both as a point and an interval estimate. The former is typically a single "best" guess, while the latter is a range of values in which the true (future) value is likely to lie. Yet for a large number of forecasting models, it is not apparent how proper confidence intervals may be constructed.

Policy-Oriented Research: Can We Do Anything About a Problem? Diagnosis may be the first step on the road to treatment. The second step is to understand enough about the problem and its setting to devise appropriate remedies. That is, knowing a considerable amount about the distribution and extent of a problem does not automatically lead to solutions. In order to design programs one must call on two sorts of knowledge. First, one needs valid knowledge on the leverage points and interventions that are useful for changing the distribution and extent of a problem. Second, one needs to know—from a variety of sources—something about the institutional arrangements that are implicated so that workable policies and programs can be designed.12

11 There are a number of other problems that forecasters face. For example, suppose that a utility company wanted to forecast the demand for electricity 10 years in the future. Since there is obviously a strong relationship between the number of residential, industrial, and agricultural customers and the demand for electricity, knowing the numbers of each kind of customer would provide a basis for instructive forecasts. However, those numbers would have to be forecasted themselves, since the number of customers affects demand contemporaneously. These and other problems are discussed in a broad social science context by Berk and Cooley (1987).

12 This conception of policy-oriented research apparently causes considerable misunderstanding about the relationships between basic and applied social research. Policy-oriented research tries to learn how changes in policy can affect the phenomenon in question. In contrast, knowledge about the phenomenon per se (the province of basic disciplinary concerns) may have no ready links to what can be done about it. For example, a convincing study finding that violent criminals often were abused as children does not by itself lead to rehabilitation programs for the violent criminals or to concrete interventions into the homes of abused children. However, such a study might stimulate ideas for the kinds of policy-oriented research necessary to develop sensible responses. That is, basic research may provide general clues about where and how to intervene.

------- 222 A Guide to Evaluation Research Theory and Practice

For example, applied research in microeconomics has shown repeatedly that consumers typically will respond to price. All else held constant, they will generally buy less of a commodity if its price increases. This lesson can be applied to conservation of all sorts.
Yet it has been virtually impossible in many states to institute marginal cost pricing for water because of political opposition from agricultural users who are being subsidized by residential and industrial users (Berk et al., 1981). To take another illustration from water conservation, applied research in social psychology indicates that people who are likely to conserve believe that others drawing on the same resource are conserving as well. Yet it is unclear how water consumers who believe that other consumers are not conserving can be convinced that they are not alone in their support for conservation efforts. The only consumption data they usually see are their own (on their bill). One strategy employed by some water districts in California has been to enclose in each consumer bill a short newsletter reporting aggregate trends in consumption by different segments of the community (Berk et al., 1981). It should be emphasized that to construct a program likely to be adopted by an organization, one needs to know how to introduce new procedures that would be undertaken at an appropriate level of effort. Large-scale organizations—schools, factories, social agencies, and the like—are resistant to change, especially when the changes do not involve accommodations in reward systems. For example, an educational program that is likely to work provides positive incentives for school systems, particular schools, and individual teachers. Inadequate attention to program organization is one of the more frequent causes of program implementation failure. Mandating that a particular program be delivered by an agency that is insufficiently motivated, poorly prepared, or without personnel with the appropriate skills to do so is a sure recipe for degraded and weakened interventions. Indeed, sometimes programs are not delivered at all under such circumstances.

Developing Promising Ideas into Workable Programs. The act of transforming a promising idea into a workable program is essentially the practice of art rather than of science. Evaluation research has little to say about how to be creative although it may be useful as an aid to program development. For example, during the severe energy crisis of the late 1970s it became clear from needs assessment research that consumers had little specific knowledge on how their use of electrical appliances affected their energy consumption levels. Of course, nearly every consumer knew that keeping their refrigerator doors closed saved electricity and turning off their electrical burners when not in use would lower electrical consumption, but few knew that there are wide variations in the energy consumption characteristics of different refrigerators and electrical stoves. Needs assessment research also showed that most consumers were quite concerned about energy costs. In short, there was a considerable reservoir of motivation to adopt energy conservation measures and notable gaps in popular knowledge about how best to conserve.

------- A Guide to Evaluation Research Theory and Practice 223
Given these circumstances, there are a variety of programs that could be constructed to remedy the situation; for example, price changes that would reward consumers for shifting consumption away from the high-demand periods of the day, and educational programs urging consumers to lower their thermostat settings during the winter. Furthermore, within each of these broad categories of programs there are a variety of specific measures. Pricing schemes, for instance, might be built on marginal price, average price, increasing block pricing, or other similar ideas. The point is that developing program ideas is not the outcome of evaluation research, but of artful innovation that links what is known about a problem (the outcome of evaluation research) to what is known about how to bring about change. In contrast, evaluators should feel right at home with pilot studies of how different pricing mechanisms might be instituted and whether there is any evidence that they might work. For example, if consumers are to pay at a higher per-unit rate as the amount consumed increases (e.g., under increasing block pricing), some means must be found to allow consumers to monitor their electricity use in real time. Moreover, the delivery of that information needs to be studied. These are the kinds of tasks that allow evaluators to earn their keep. Likewise, evaluation skills are not relevant to the design of educational television programs. However, evaluators can make contributions at the development stage. Pilot testing (pretesting) television programs is a standard procedure in program development. Educational programs must demonstrate that they can get the intended audiences' attention, be understood, and produce a predisposition to act in a desired fashion. Pilot versions of new programs are often tested on small audiences whose responses are carefully monitored. The elements of a program that repel audiences, lead to misunderstanding, or lead to undesired behavior can be changed. The program can be finely tuned until pretest-audience responses are acceptable. Pretesting can be informal, with pretest audiences selected haphazardly, or involve more formal research programs. An example of fairly extensive informal pretesting is that conducted by the Children's Television Workshop, producers of the highly popular educational television program, "Sesame Street." The producers monitor volunteer pretest audiences of preschool children to measure the attention-getting abilities of its episodes. The producers watch how closely pretest audiences follow the action of the episode being tested. In addition, the audience is interviewed after each showing to ascertain whether or not the message of the program was understood clearly. The program's deficiencies are then rectified and the process is repeated until the program is acceptable to the producers. At the other extreme, the Lodge Program developed by Fairweather and his associates employed a highly formalized development testing procedure. The goal of the program was to return mental patients to life outside the hospital in a way that would reduce their chances of being rehospitalized. Drawing upon social science findings about the importance of informal group supports, Fairweather and his colleagues took two decades to develop a technique that could be used by most mental hospitals and was effective in lowering the patient return rates.
The development process consisted of a series of ------- 224 A Guide to Evaluation Research Theory and Practice randomized field experiments in which version after version of the program was tested until an effective version was achieved. Thorough pretesting during the development phase can increase the chances of developing a worthwhile program. However, it is one thing to have a program that works well with a test audience and quite another to have a program that will work well in practice. A "Sesame Street" episode that does well in a studio atmosphere has none of the competition for attention that exists in an ordinary living room. Indeed, an adult-oriented health information program, "Feeling Good," that was developed by the Children's Television Workshop (the producers of "Sesame Street") did well with pretest audiences but failed to achieve significant audience shares when aired on public TV stations during prime viewing hours. The test audiences in the studios liked the episodes they viewed, but the unconstrained audience preferred programs on other channels that were competing with "Feeling Good." The YOAA Problem. Moving from the development phase to the operational phase usually means moving responsibility from a developing organization that is highly committed to the program to an operating agency whose commitment level may be much lower. This has been called the "Can YOAA Do It?" problem: Can Your Ordinary American Agency carry out the program with fidelity? Often the YOAA problem has been identified as a problem of dealing with large- scale bureaucracies, a diagnosis that obscures the problem as much as illuminates it. The issue is whether an operating agency has the appropriately trained personnel, a sufficiently motivating reward system, and the resources to carry out the program at the desired level of fidelity. Asking an agency to perform an additional task when its current tasks are straining its resources, or to undertake a task for which its personnel are not trained, are both recipes for failure. Therefore, it is vital to study program implementation, and descriptive accounts may be especially valuable. For example, just a few field visits to high schools—which were supposed to have in operation a widely publicized program designed to raise the academic motivation levels of poor black children—revealed that the programs existed mainly on paper and in the public relations releases of the main sponsor (Murray, 1980). Similarly, careful qualitative visits to the sites of the celebrated Cities in Schools Project brought to light the fact that the projects, as implemented, fell far short of original designs and intentions (Murray, 1981). It is at this point that it may make sense to start up demonstration programs in which operating agencies attempt to implement the program. Demonstration programs can be viewed as another step in development in which attention is centered on the problems that operating agencies encounter in carrying out a program. A prime example is the "administrative experiment" (a misnomer since these demonstrations were not truly experiments) carried out in connection with the proposed housing voucher program. Ten municipalities were selected to work out procedures for administering housing voucher programs in their localities and to carry them out for a period of years. 
The demonstrations were closely monitored by researchers who carefully noted all the difficulties each of the ten cities encountered in administering their versions of the program (Struyk and Bendick, 1981).

------- A Guide to Evaluation Research Theory and Practice 225

Will Some Particular Program Work? The Effectiveness Issue. Once a program has been fine-tuned and its operational kinks ironed out through demonstrations, the problem remains of whether it will be effective. To this point, all one has managed to do is document that the program in question can be implemented with sufficient fidelity as a "prototype." It is important to realize that effectiveness goes far beyond implementation and revolves around whether a program produces the changes anticipated.13 Effectiveness is rarely obvious for at least two reasons. First, it is often difficult to distinguish program effects from other major forces affecting the outcome. We addressed this earlier under internal validity. Second, it is often difficult to distinguish program effects from chance variation, which as "noise" may mask any program impact. (We addressed this earlier under statistical conclusion validity.) Furthermore, both problems are exacerbated by interventions that are typically weak and for that reason unlikely to produce strong effects. The reasons most interventions are weak raise issues beyond the scope of this paper (Rossi, 1987). Nevertheless, among the most important explanations is that the social processes in which interventions are likely to be introduced usually are shaped by a large number of forces, while the programs introduced rarely address more than one of these forces. Nutritional behavior, for example, is affected by upbringing, ethnic background, disposable income, local availability of food products, information about nutritional issues, subjective estimates of the risks to health and well-being of the nutritional behavior in question, household composition, the nutritional practices of family members and peers, chemical dependencies, and many other influences. Yet programs intended to improve nutrition rarely target more than one of the many possible influences. To make matters worse, there appears to be no single developmental stage that, if interrupted, will improve nutritional practices effectively.14 There are many handles controlling eating habits but each handle can control only a small part of this complex behavior. When a particular program has been identified that appears to be sensible according to current basic knowledge in the field, and a reasonable working version has been developed, the next step is to determine whether it is effective enough to become part of an agency's ongoing responsibilities. It is at this point that we recommend the use of randomized controlled experiments in which the candidate programs are tested. Randomized experiments are desirable (some would say mandatory) because randomly allocating persons or other units (e.g., classes) to an experimental group (to which the tested program is administered) or to a control group (from whom the program is withheld) assures that all the factors that ordinarily affect the educational process in question are, on the average, distributed identically among those who receive the program and those who do not.15

13 Keep in mind that effectiveness may be relative or marginal, and also may take cost into account.

14 In contrast, consider diseases transmitted by insects (e.g., typhus, malaria, bubonic plague). If the insect hosts are destroyed, human infection is prevented. An effective way of eliminating hosts is also an effective way of eliminating the disease.
15 Randomization also means that the assumptions for routine significance tests are likely to be met.

------- 226 A Guide to Evaluation Research Theory and Practice

Therefore, randomization, on the average, eliminates causal processes that may be confounded with the intervention and enormously enhances internal validity. That is, the problem of spurious interpretations can be addressed quite effectively. We advocate the use of randomized experiments at this stage in program development because of their scientific merit. (For other assets of randomized experiments see Berk et al., 1985.) However, this commitment in no way undermines the complementary potential of more qualitative approaches such as ethnographic studies,16 particularly to document why a particular intervention succeeds or fails. For example, in designing educational campaigns involving workshops, qualitative studies can uncover those organizations whose sponsorship can be most easily obtained. Workshops held after working hours under the sponsorship of employers may appear to be an efficient strategy except that interviews with employees may uncover the fact that few would remain after hours for any purpose. Indeed, a program of proposed workshops to teach better health habits to persons at risk of coronary heart disease that was based on this strategy attracted no more than a handful of participants instead of the hundreds that had been planned for. Ordinarily, developmental experiments should be conducted on a relatively modest scale, and are most useful to policy needs when they test a set of alternative programs that are intended to achieve the same effects. For example, it would be useful for an experiment to test several ways of motivating people to have their homes tested for radon since the end result would be to provide information on the relative effectiveness of several equally attractive (a priori) methods. There are many good examples of field testing through randomized experiments of promising programs. The five income-maintenance experiments were devised to test the impact of negative income tax plans, under varying conditions, as substitutes for existing welfare programs (Kershaw and Fair, 1976; Rossi and Lyall, 1976; Robins, 1982; Hausman and Wise, 1985). The Department of Labor tested the extension of unemployment benefit coverage to prisoners released from state prisons in a small, randomized experiment conducted in Baltimore (Lenihan, 1976). Randomized experiments have also been used to test national health insurance plans and direct cash subsidies for housing to poor families. At issue in most of the randomized experiments was whether the proposed programs would produce the intended effects and whether undesirable side effects could be kept to a minimum. Thus the Department of Labor's LIFE experiment (ibid.) was designed to see whether released felons would be aided in adjusting to life outside prison through increased employment and lowered arrest rates. The most extended series of developmental experiments is that reported by Fairweather and Tornatzky (1977). These involved more than two decades of consistent refinement and retesting, resulting in an efficacious and replicable treatment that can be implemented in a variety of conditions.
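The logic of random assignment can also be sketched with simulated data. The short Python illustration below uses entirely invented values and does not describe any of the experiments cited; because assignment is random, the difference between group means estimates the program effect and departs from it only by chance variation.

    # Illustrative simulation of a randomized experiment with invented numbers.
    # Units are assigned to program or control at random; the difference in mean
    # outcomes then estimates the program's effect.
    import random

    random.seed(1)
    TRUE_EFFECT = 2.0                      # effect built into the simulation

    units = list(range(200))
    random.shuffle(units)                  # random assignment
    program_group = set(units[:100])

    outcomes = {}
    for unit in units:
        baseline = random.gauss(50, 10)    # everything else that shapes the outcome
        outcomes[unit] = baseline + (TRUE_EFFECT if unit in program_group else 0.0)

    def mean(group):
        return sum(outcomes[u] for u in group) / len(group)

    control_group = [u for u in units if u not in program_group]
    estimate = mean(program_group) - mean(control_group)
    print(f"Estimated effect: {estimate:.2f} (true effect set to {TRUE_EFFECT})")
    # A significance test would ask whether an estimate this large could
    # plausibly arise from chance variation alone.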
16 An ethnographic study proceeds by careful observation of persons as they function naturally in their environment. Such observations might require detailed interviewing, or simply living in that environment. The art of ethnography, especially as practiced by anthropologists, is a highly disciplined approach including linguistic skills as well as training in precise and accurate recording of behavior and speech of the persons or groups being observed.

------- A Guide to Evaluation Research Theory and Practice 227

Currently underway in three cities are several extensive tests designed to evaluate alternative ways to lower the incidence of heart disease through improving nutrition. In the environmental area, six alternative approaches to communicating information about radon have been tested in New York (Smith et al., 1987).

Practical Developmental Evaluation Approaches. If all of the research activities described in the preceding pages were undertaken for each and every proposed program or policy shift, the pace of change in American public programs would be appreciably slowed. One is forced to admire the devotion, care, and diligence of Fairweather and his colleagues, who spent two decades designing and testing an effective treatment for released mental patients. But it is instructive to note that when the Lodge approach finally had been perfected, psychopharmacological developments and the community mental health movement had so drastically changed the treatment of mental health patients that the Lodge approach had become largely irrelevant.17 While Fairweather and his associates labored carefully and at great length to perfect the Lodge approach, the content of policy space had shifted to highlight other concerns about the treatment of the mentally ill. Clearly, practical approaches to program development must take into account all the constraints on time and resources that are ordinarily confronted. Decades-long development efforts may be the "right" way, but the practical way must deliver the best possible information in a timely fashion. There are no hard and fast guidelines about how best to proceed, although a few general principles may be stated. In general, the greater the potential impact of the proposed program, the more carefully it should be evaluated before being implemented. This means that programs that promise to be costly, that may affect targets adversely, or that deal with the central gnawing problems of society, deserve the best possible evaluation. Minor programs, in which the loss to society of implementing an ineffective program is slight, demand less thorough treatment. It is probably the case that most prospective programs up for evaluation deal with relatively minor changes to existing programs and therefore deserve lighter prospective evaluations. Evaluation in support of development has been described above as a chronological list of procedures. However, that need not be the case. A set of experiments conducted simultaneously on several alternative programs can telescope the total amount of time necessary to arrive at a conclusion. Demonstrations of programs can be used for fine-tuning purposes. In addition, randomized experiments may be foregone when there are strong indications of effectiveness from nonexperimental evidence.

The Assessment of Ongoing Programs: Accountability Evaluation Once a program has been enacted and is functioning, one of the main questions to ask is whether or not the program is appropriately in place.
Here the issues are not so much whether or not the program is producing its intended effects, but whether the program is simply running in ways that are appropriate, and whether or not problems have arisen in the field that need to be solved. Programs often have to be fine-tuned in the first few years or so of operation. (Therefore, estimates of effectiveness should be made only when any necessary "shakedown period" is over.)

17 Fairweather's efforts were not totally in vain. The basic understanding gained concerning what is needed to sustain chronically mentally ill persons outside institutions has made important contributions to the treatment of deinstitutionalized former patients.

------- 228 A Guide to Evaluation Research Theory and Practice

Is the Program Reaching the Appropriate Beneficiaries? Assuring that the appropriate beneficiaries are covered by a program is often difficult. Sometimes a program is so poorly designed that it simply does not reach significant portions of the intended beneficiary population. For example, an educational program designed to reach intravenous drug users through community institutions such as churches and schools may simply miss its target population because they do not use the community institutions. A program to provide food subsidies to children who spend their days in child care facilities may fail to reach a large proportion of such children if the subsidy regulations exclude child care facilities that are serving fewer than five children. A very large proportion of children who are cared for during the day outside their own households are cared for by women who take a few children into their homes (Abt Associates, 1979). A thorough needs assessment of child care problems would have brought to light the fact that so large a proportion of child care was furnished by small-scale vendors, a fact that should have been taken into account in drawing up administrative regulations. However, the needs assessment might not have been thorough enough. In addition, patterns of the problem might change over time, sometimes in response to the existence of a program. For example, it is quite likely that the existence of shelters for battered women increases the demand for such shelters because the existence of alternatives to remaining in an oppressive living arrangement lowers the tolerance threshold of battered women. These examples show the need to review from time to time how many of the intended beneficiaries are being covered by a program. Another example concerns the labelling of consumer products. Labels that are printed in extremely fine print or that use professional jargon may satisfy agency regulations but may be ignored by most consumers. The labelling program in its implementation simply does not reach many of its intended beneficiaries. Experience with social programs over the past two decades has shown that there are few, if any, programs that achieve full coverage or near full coverage of intended beneficiaries, especially where coverage depends on positive actions on the part of beneficiaries. Thus, not all persons who are eligible for Social Security payments actually apply for them; estimates indicate that up to 15% of all eligible beneficiaries never apply. AFDC programs only reach about one-half of the families who are eligible. Some intended beneficiaries may not be reached because the facilities delivering the services are not accessible to them.
A single job training program for the entire state of Iowa that is located only in Dubuque does not exist, for all practical purposes, for those who live 50 or more miles from that city. There is another side to the coverage problem. Some programs cover and extend benefits to persons or organizations that were not intended to be served. Such unintended coverage may be impossible to avoid because of the ways in which the program is delivered. For example, although "Sesame Street" was designed primarily to reach disadvantaged children, it also turned out to be attractive to advantaged children and to many adults. There is no way to keep anyone from viewing a television program once broadcast (nor is it entirely desirable to do so in this case); hence, a successful TV program ------- A Guide to Evaluation Research Theory and Practice 229 designed to reach some specific group of children may reach not only them but also many others (Cook et al., 1975). Although the unintended viewers of "Sesame Street" are reached at no additional costs, there are times when unintended coverage may severely drain program resources. For example, while Congress may have wished to provide educational experiences to returning veterans through the GI Bill and its successors, it was not clear whether Congress had in mind the subsidization of the many new proprietary educational enterprises that came into being primarily to supply "vocational" education to eligible veterans. Or, in the case of the bilingual education programs, many primarily English- speaking children were found to be program beneficiaries, as some school systems discovered that the special bilingual classes were an excellent place to tuck away their trouble-making English- speaking students. Studies designed to measure coverage are similar in principle to those discussed under "Needs Assessment" studies earlier. In addition, overcoverage may be studied as a problem through program administrative records. However, undercoverage often involves commissioning special surveys. Program Integrity Research: Are Benefits Being Delivered? When program ser- vices depend heavily on the agencies' ability to recruit and train appropriate personnel, retrain existing personnel, or to undertake significant changes in standard operating procedures, it sometimes affects whether a program will manage to deliver to its target population that which had been intended. For many reasons the issue of program integrity often becomes a critical one that may require additional fine- tuning of legislation or administrative regulations. Several examples highlight the importance of this issue. Although informational pamphlets can be provided to medical personnel, pharmacies, and hospitals, the distribu- tion of such literature to patients is always problematic. It is difficult to motivate medical personnel to add distribution of these pamphlets to their existing duties. When the educational program requires special equipment, such as video and audio cassettes, delivery of the program can be even more difficult. In other cases, the right services are being delivered—but at a level that is too low to make a significant impact on beneficiaries. Thus, a supplementary reading instruction program that, on the average, results in only a mere additional 40 minutes per week of reading instruction, is hardly being delivered at sufficient strength and quantity to make any difference in reading progress. 
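At their simplest, the coverage and delivery questions discussed above reduce to a pair of ratios. The following Python sketch uses wholly hypothetical administrative figures to show the two calculations side by side.

    # Hypothetical figures only: two accountability checks, coverage and dosage.
    eligible_population = 40_000          # estimated from a needs assessment
    enrolled = 17_500                     # from program administrative records
    intended_minutes_per_week = 150       # service level called for in the design
    delivered_minutes_per_week = 40       # average observed in the field

    coverage_rate = enrolled / eligible_population
    dosage_ratio = delivered_minutes_per_week / intended_minutes_per_week

    print(f"Coverage: {coverage_rate:.0%} of intended beneficiaries enrolled")
    print(f"Dosage:   {dosage_ratio:.0%} of the intended service level delivered")
    # A program reaching well under half of its targets at roughly a quarter of
    # the intended strength cannot be expected to show its intended effects.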
We have used medical services as an illustration because they involve loosely coupled organizations in which the lines of authority are clear but weak because of the autonomy given to the professional personnel. Similar situations exist in almost all human service organizations, such as police departments, courts, welfare departments, and schools. In all such organizations it is difficult to control what is occurring at the point of service delivery because of the discretion and autonomy given to service workers. For example, exhorting or even requiring doctors and nurses to educate their patients about the proper use of pharmaceuticals is difficult. It is much easier to regulate the pharmaceutical industry, a more tightly coupled institutional complex. Evaluation research designed to measure what is being delivered may be accomplished easily or may involve problems of considerable complexity.

------- 230 A Guide to Evaluation Research Theory and Practice

Thus, it may be very easy to learn from hospitals how many persons are served each week by their various outpatient services, but very difficult to learn precisely what goes on in those contacts between medical personnel and patients. If one is interested in the kinds of information provided by physicians and nurses in outpatient care, direct observation would be necessary, and could be very expensive to implement on a large scale. For example, the second author was recently involved in an evaluation of efforts to teach literacy as a part of vocational training. Although only six classes were being studied, two full-time observers were needed to conduct classroom observation. In addition to the cost problem, there is the possibility that the presence of observers may alter the behavior of teachers and students. One of the best examples of systematic studies in difficult-to-observe situations is Reiss's (1971) study of police-citizen encounters. Research assistants were assigned to ride with police on patrol in order to systematically record each encounter between these police and members of the public. Reiss's study provides basic descriptive accounts of how such encounters are generated, how the behavior of citizens affected police responses, and so on. A recent example of an excellent implementation study is one conducted on the mental hospitals that serve the Chicago metropolitan area (Lewis et al., 1987). The main issue of the study was to describe how the legislation and rules in place since the 1970s concerning involuntary commitment to the mental hospitals were working out in practice. The researchers discovered that fewer than 1% of the patients admitted over a year's time were involuntarily committed. Observing the court procedures, it was found that many persons brought to the attention of the police because of their bizarre or aggressive behavior were offered the choice between voluntarily committing themselves for periods of up to 30 days or being involuntarily committed for 60 days or more. The courts and prosecutors offered these alternatives because involuntary commitment involves lengthy procedures that appreciably slow down the transactions that the court processes. Given the choice, most persons brought in under complaint by the police chose the more lenient alternative. These practices averted what might have been a very heavy burden on the courts and prosecutors. To fine-tune a program, it may not be necessary to proceed on a large scale.
For instance, it may not matter whether a particular implementation problem occurs frequently or infrequently, because it is not desirable that it occur at all. Thus, small-scale, qualitative observational studies may be the most fruitful for program fine-tuning. For example, if qualitative interviews with welfare recipients reveal any instances in which a husband and wife separated solely to retain or increase their benefit eligibility, one might judge that this was sufficient evidence that the program rules should be altered to remove the incentive for separation. Programs that depend heavily on personnel for delivery, involve complicated programs, or that call for individualized treatments for beneficiaries are especially likely candidates for careful and sensitive fine-tuning research. Each of these characteristics— either alone or in combination—can produce difficulties during implementation. (See Fairweather and Tornatzky 1977, for an outstanding example of the problematic nature of complicated individualized human services.) ------- A Guide to Evaluation Research Theory and Practice 231 Are Funds Being Used Appropriately? Fiscal Accountability. The accounting profession has been around considerably longer than has program evaluation; therefore, the procedures for determining whether or not program funds have been used responsibly and as intended are well established and not problematic. However, fiscal accountability measurements cannot substitute for the studies mentioned above. The fact that funds appear to be used as intended may not mean that program services are being delivered as intended, but only that proper documentation existed for funds expended. The conven- tional accounting categories used in a fiscal audit are ordinarily sufficient to detect, say, fraudulent expenditure patterns, but may be insufficiently sensitive to detect whether services are being delivered at the requisite level of substantive integrity. Indeed, it is worthy of note that the General Accounting Office has set up a separate section called the Program Evaluation Methodology Division. One of this Division's major roles is to instruct GAO personnel in appropriate evaluation procedures and to undertake evaluations of programs upon request by Congress. It is also important to keep in mind that the definition of costs under accounting principles differs from the definition of costs used by economists. For accountants, a cost reflects conventional bookkeeping entries such as out-of-pocket expenses, historical costs (i.e., the purchase price of an item), depreciation, and the like. Accountants focus on the value of current stocks of capital goods and inventories of products, coupled with cash flow concerns. When the question is whether program funds are being appropriately spent, the accountant's definition will suffice. However, economists stress "opportunity costs" defined in terms of what is given up when resources are allocated to particular purposes. More specifically, opportunity costs reflect the next best use to which the resources could be put. For example, the opportunity cost of raising teachers' salaries by 10% may be the necessity of foregoing the purchase of a new set of textbooks. While opportunity costs may not be especially important from a cost-accounting point of view, they become critical when cost- effectiveness or benefit-cost analyses of programs are undertaken. We will have more to say about these issues later. 
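The distinction between accounting costs and opportunity costs matters most when two uses of the same resources are compared. The following Python sketch of such a cost-effectiveness comparison is wholly hypothetical; every program name and figure is invented for illustration.

    # Hypothetical comparison of two programs pursuing the same outcome.
    programs = {
        # name: (total annual cost in dollars, households reached with the message)
        "Program A (mass media)": (400_000, 16_000),
        "Program B (workshops)":  (250_000, 5_000),
    }

    for name, (cost, reached) in programs.items():
        print(f"{name}: ${cost / reached:,.0f} per household reached")
    # A fiscal audit could certify that both budgets were spent as documented;
    # only a comparison like this speaks to which use of the resources buys more
    # of the intended outcome, that is, to the opportunity cost of choosing one.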
Program Assessment Evaluation The evaluation tasks discussed under accountability studies are directed mainly toward how well a program is running. Whether or not a program is effective is a different question, one to which answers are not easily provided. Essentially, the question is whether or not a program achieves its goals over and above what would be expected to happen without the program. Many evaluators consider the effectiveness question to be quintessential evaluation. There is some justification for this position because effectiveness assessment is certainly more difficult to accomplish, requiring higher levels of skills and ingenuity than any of the previously discussed evaluation activities. However, there is no justification for interpreting every evaluation task as calling for an effectiveness assessment. Apparently, some evaluators have done this in the past, aided in their misinterpretation by imprecise requests for help from policymakers and administrators. Evaluative information on implementation and coverage can often suffice. Nevertheless, in the final analysis, a program that has been successfully placed might still be ineffective. Estimating a program's degree of effectiveness is the main task to be described in this section. Can Effectiveness Be Estimated? The Evaluability Question. A program that has gone through the stages described earlier in this chapter should provide few obstacles to evaluation for effectiveness in accomplishing its goals. However, many human-service programs present problems for effectiveness studies because one or more of several criteria for evaluation are absent. Perhaps the most important criterion, and one that is frequently not met, is the presence of well-formulated goals or objectives for the program. For example, a program that is designed to raise the level of learning among certain groups of school children through the provision of per capita payments to schools for that purpose cannot be evaluated for its effectiveness without further specification of its goals. "Raising the level of learning" as a goal must be defined to indicate what is meant by "levels" and the kinds of learning achievements that are deemed relevant. Goals can often be clarified by helping program personnel articulate them. This step must be accomplished before proceeding with an effectiveness evaluation. A second criterion is that the program in question be well specified. Thus, a program that is designed to make health education agencies more effective by encouraging innovations cannot be evaluated for effectiveness: the goals are not well specified, and neither are the means for reaching them. Innovation as a means of reaching a goal is not a method, but a way of proceeding. Anything new is an innovation; hence, such a program may encourage the temporary adoption of a wide variety of specific techniques and is likely to vary widely from site to site. Third, a program can be evaluated from an effectiveness point of view only if it is possible to estimate what the expected state of the targeted recipients would be in the absence of the program. As we will discuss below, the critical hurdle in effectiveness studies is to make comparisons between persons who experienced a program and those who did not. Hence, a program that is universal in its coverage and has been going on for a long period of time cannot be evaluated for effectiveness.
For example, we cannot evaluate the effectiveness of the public school systems in the United States because it is impossible to make observations about American cities, towns, counties, and states that do not have (or recently have not had) public school systems. Finally, effectiveness evaluations are the most difficult evaluation tasks undertaken by evaluators, requiring the most highly trained personnel and often considerable sums of money for data collection and analysis. Such evaluations should not be undertaken unless sufficient resources and appropriately trained professionals are available to undertake the evaluations at the appropriate level. Legislatures and administrators have often mistakenly requested effectiveness evaluations by agencies that are not prepared to undertake them, and assumed that the costs of the evaluations would be slight (Raizen and Rossi, 1981). Unfortunately, there are no hard and fast rules about how much an effectiveness evaluation should cost or about how much skill may be needed for a given task. Effectiveness evaluability is discussed here because we believe that evaluators are often asked to undertake tasks that are impossible or nearly impossible. For example, the second author was recently asked to design an evaluation of a prosecutorial effort in a particular county to increase the likelihood that serious drug offenders would be sanctioned severely and swiftly. One of the evaluation outcomes was citizens' fear of crime; presumably swift and severe sanctions would bring down the crime rate, at least for drug-related offenses. Unfortunately, the evaluation was being designed after the program began; therefore, no pretest of citizen attitudes was possible. Without a pretest measurement, it is simply impossible to tell whether the program made any difference. As emphasized earlier in this paper, there is no substitute for planning evaluations during the program design phase. Techniques have been developed (Wholey, 1977) to determine whether a program is evaluable in the senses discussed above. Decisionmakers are well advised to commission such studies as a first step rather than to assume that all programs can be evaluated. Evaluability assessments essentially determine whether program goals are sufficiently well articulated, whether the program is uniform enough throughout the agency in question to be treated as a single program, and whether the evaluation results are going to reach the attention of decisionmakers. Finally, it is worth mentioning that questions of evaluability have in the past been used to justify "goal-free" evaluation methods (e.g., Scriven, 1972; Deutscher, 1977). The goal-free advocates have contended that since many of a program's aims evolve over time, the "hypothetico-deductive" approach to impact assessment (Heilman, 1980) is at best incomplete and at worst misleading. In our view, impact assessment necessarily requires some set of program goals, although whether they are stated in advance and/or evolve over time does have important implications for one's research procedures (Chen and Rossi, 1980). In particular, evolving goals require far more flexible research designs (and researchers). In other words, there cannot be such a thing as a "goal-free" impact assessment. At the same time, we have stressed above that there are other important dimensions to the evaluation enterprise in which goals are far less central.
For example, a sensitive monitoring of program activities can proceed productively without any consideration of ultimate goals. Thus, goal-free evaluation approaches can be extremely useful as long as the questions they address are clearly understood. Did the Program Work? The Effectiveness Question. As discussed above, any assessment of whether or not a program was successful assumes that what the program was supposed to accomplish is known. For a variety of reasons, the legislation establishing programs often sets relatively vague objectives for the program, making it necessary to develop specific goals during the design phase. The goals for such general programs may be developed by program administrators through consideration of social science theory, past research, and/or studies of the problem that the program is supposed to ameliorate. However the goals are established, the important point is that it is not possible to determine whether a program was successful without developing a limited and specific set of criteria for establishing the condition of "having worked." For example, it would not have been possible to assess whether "Sesame Street" worked without having decided that its goals were to foster reading and number-handling skills. Whether these goals existed before the program was designed or whether they emerged after the program was in operation is less important for our purposes than the fact that such specific goals existed at the time of evaluation. By the same token, a public education campaign intended to raise public consciousness about environmental hazards—but also to have the contradictory goals of reassuring the public and making them worried about such hazards—probably should not be evaluated until these contradictions are resolved. Programs rarely succeed or fail in absolute terms. Success or failure is always relative to some benchmark. Hence, an answer to "Did the program work?" requires considering "compared to what?" Appropriate comparisons can be made in at least three dimensions: 1) comparisons with different subjects, 2) comparisons with different settings, and 3) comparisons with different times. In the first instance, one might compare different sets of persons, varying the setting and the times of comparison. In the second instance, one might compare the performance of the same set of persons in different settings—for example, at home and at work. In the third instance, one might compare the same students in the same setting, but at different points in time. Because everyone is familiar with the different levels of aggregation involved in school settings—individual students, classes, and schools—and the time structure of schools—class periods, terms, and academic years—we will use examples in which students, classes, and classroom periods figure strongly. However, it is important to keep in mind that the concepts being illustrated are generally applicable; for example, in the adult population, we can distinguish individuals, households, neighborhoods, and cities for different levels of aggregation and life cycle stages for the time periods of adult life.18 As Figure 1 indicates, it is also possible to mix these three fundamental dimensions to develop a wide variety of comparison groups.19 For example, comparison group C2 varies both the subjects and the setting although the time is the same. Comparison group C6 varies the subjects, the setting, and the time.
However, with each added dimension by which a comparison group differs from the experimental group, the validity of the resulting effectiveness estimates necessarily decreases.

Figure 1. A Typology of Comparison Groups

                          Same Subjects                  Different Subjects
                    Same Setting   Different Setting   Same Setting   Different Setting
  Same Time             xx*             xx*                C1              C2
  Different Time        C3              C4                 C5              C6

  * Although logically possible, these two boxes imply comparison groups that are not sensible with human subjects.

18 The convenience of using schools lies in the typical school organization, which assigns students to classes and classrooms, and instruction to periods. Adults are sometimes found outside of households, neighborhoods often do not have distinct boundaries, and human life cycle stages have only fuzzy boundaries.

19 We have used the term "comparison group" as a general term to be distinguished from the term "control group." Control groups are comparison groups that have been constructed by random assignment.

For example, the use of comparison group C4 (different setting and different time period) requires that the assessment of program effectiveness simultaneously take into account possible confounding factors such as differences in student background and motivation, or the "reactive" potential of different classroom environments. This in turn requires either an extensive data collection effort to obtain measures of these confounding factors coupled with the application of appropriate statistical adjustments (e.g., multiple regression analysis), or the use of randomization and, thus, true control groups. Of course, randomization will, on the average, eliminate confounding influences in the estimation of impact. For analytic simplicity alone, it is easy to see why so many expositions of impact assessment strongly favor research designs based on random assignment. In addition, it should be emphasized that appropriate statistical adjustments (in the absence of randomization) through multivariate statistical techniques require a number of assumptions that are almost impossible to meet fully in practice.20 For example, it is essential that measures of all confounding influences are included in a formal model of the program's impact, that their mathematical relationship to the outcome is properly specified (e.g., a linear additive form versus a multiplicative form), and that the confounding influences are measured without error. Should any of these requirements be violated, one risks serious bias in any estimates of program impact. At the same time, random assignment is often impractical or even impossible. Furthermore, even when random assignment is feasible, its advantages rest on randomly assigning a relatively large number of subjects. To randomly assign only two schools to the experimental group and two schools to the control group, for example, will not produce, on the average, equivalence between experimentals and controls.21 Consequently, one is often forced to attempt statistical adjustments for initial differences between experimental and comparison subjects. Whether or not such adjustments succeed in performing their function is always questionable. The use of multivariate statistical adjustments raises a host of questions that cannot be addressed in detail here.
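The adjustment problem just described can be made concrete with a small simulation. The sketch below (Python, with simulated data; the variable names and numbers are illustrative and not drawn from the text) lets prior achievement influence both program participation and the outcome, so the naive participant-versus-nonparticipant difference is biased, while a regression that includes the confounder recovers the effect, but only because the confounder is measured and the assumed linear additive form happens to be correct.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 2000

    prior = rng.normal(0.0, 1.0, n)                  # confounder: prior achievement
    p_treat = 1.0 / (1.0 + np.exp(-1.5 * prior))     # higher prior -> more likely to participate
    treated = rng.random(n) < p_treat
    true_effect = 0.30
    outcome = 0.8 * prior + true_effect * treated + rng.normal(0.0, 1.0, n)

    # Naive comparison of participants and nonparticipants (biased upward here).
    naive = outcome[treated].mean() - outcome[~treated].mean()

    # Regression adjustment: outcome ~ intercept + treatment + prior achievement.
    X = np.column_stack([np.ones(n), treated.astype(float), prior])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

    print(f"true effect:       {true_effect:.2f}")
    print(f"naive difference:  {naive:.2f}")
    print(f"adjusted estimate: {beta[1]:.2f}")

If the confounder were omitted or badly measured, the adjusted estimate would inherit the same bias as the naive one, which is the point the surrounding discussion makes about unmet assumptions.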
Suffice it to say that there is a growing consensus among statisticians that various social scientists have routinely pushed statistical procedures well beyond where they are designed to go.22 However, as a general rule, multivariate adjust- ments are justifiable to the extent that appropriate measures are used for the adjustment, and that such measures are highly reliable and valid—criteria that are not easily satisfied. To assess the usefulness of impact evaluations not resting on random assignment, consider a recent evaluation (Robertson, 1980) of the effectiveness of driver education programs in reducing accidents among 16 to 18-year-olds. The evaluator took advantage of the fact that the Connecticut legislature decided not to subsidize such programs within local school systems. In response to this, some school districts dropped driver education 20 There are a number of nonrandomized designs that yield effectiveness estimates of high validity without random assignment (Cook and Campbell, 1979). One of the strongest is the regression- discontinuity design, which, under very modest assumptions, guarantees unbiased estimates of treatment effects (Berk and Rauma, 1983). A discussion of such a "quasi-experimental" design is beyond the scope of this paper, but it is an important option when true experiments cannot be conducted. In general, the better randomized designs cannot be used because the conditions for their proper use are not often met. 21 Since classes are randomized, it is necessary to have relatively large numbers of classes randomly allocated to the experimental and control conditions to be assured that the two sets of classrooms are tending to equivalence. 22 See, for example, the Summer 1987 issue of The Journal of Education Statistics. Sta- tistical procedures have far too often been applied to data that are not even remotely appropriate, using models that have virtually no convincing justification. Coming under particular criticism is the use of structural equation models (especially with latent variables) which regularly outstrip social science data and theory. At this juncture, perhaps the best advice is to keep one's statistical analyses simple and as close to the data as possible. For example, multivariate matching, when feasible, may be superior to statistical adjustment (often based on techniques such as multiple regression) because matching assumes no functional form between the explanatory/control variable and the outcome. ------- A Guide to Evaluation Research Theory and Practice 237 from their high school curriculum and others retained it. Two sets of comparisons were possible: accident rates for persons of the appropriate age range in the districts that dropped the program, computed before and after the program was dropped; and accident rates for the same age groups in the districts that retained driver education were compared with the accident rates in districts that dropped the driver education program. It was found that the accident rates dropped significantly in those districts that dropped the program, a finding that might lead one to interpret that the program increased accidents because young people were enticed to obtain licenses earlier than otherwise. 
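As an illustration only, the sketch below works through the two comparisons just described with invented accident rates (these are not Robertson's figures). It also shows a simple difference-in-differences contrast, one common way of combining a before-and-after change with a comparison across districts.

    # Hypothetical accident rates per 1,000 licensed 16-18-year-olds.
    dropped = {"before": 42.0, "after": 33.0}    # districts that dropped driver education
    retained = {"before": 41.0, "after": 40.0}   # districts that retained it

    before_after_change = dropped["after"] - dropped["before"]      # comparison 1
    cross_district_gap = dropped["after"] - retained["after"]       # comparison 2
    diff_in_diff = (dropped["after"] - dropped["before"]) - (
        retained["after"] - retained["before"]
    )

    print(f"Change in dropping districts:           {before_after_change:+.1f}")
    print(f"Dropping vs. retaining districts after: {cross_district_gap:+.1f}")
    print(f"Difference-in-differences estimate:     {diff_in_diff:+.1f}")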
The use of non-randomized comparison groups is justified in this research because there was some knowledge about the selection process involved in some school boards dropping the program; in most cases school boards did so on the basis of financial considerations rather than because the program was successful or unsuccessful. It is sometimes possible to either solve or partially bypass comparison group problems by resorting to some set of external criteria as a baseline. For example, it is common in studies of desegregation or affirmative action programs to apply various measures of equity as a"comparison group" (Baldus and Cole, 1977). Thus, an assessment of whether schools in black neighborhoods are being funded at levels comparable to schools in white neighborhoods might apply the criterion that disparities in excess of plus or minus 5% in expenditures per pupil indicate inequality and, hence, failure (Berk and Hartman, 1972). However, the use of such external baselines by themselves still leaves open the question of causal inference. It may be difficult to determine if the program itself or some other set of factors produced the observed relationship between outcomes of interest and the external measurement. It is also important to understand that distinguishing between success and failure is not a clear-cut decision because there are usually degrees of success and failure. While decisionmakers may have to make binary decisions, for example, to fund or not to fund, the evidence provided on effectiveness usually consists of statements of degree that have to be translated into binary terms by the decisionmakers. Thus, a program that succeeds in raising the average level of reading by half a year more than one would ordinarily expect—not an inconsiderable gain—may be less successful than one that has effective- ness estimates of a full year. This quantitative difference has to be translated into a qualitative difference when the decision to fund one rather than the other program comes into question. At this point, other considerations may surface, including costs, potential negative effects, public acceptability, and so on. In short, passing a statistical significance test does not necessarily mean that a program's effects are substantively significant. Designs Frequently Used For Estimating Effectiveness. The preceding discussion of comparison group strategies has, of necessity, been couched in relatively abstract terms. The actual practice of choosing among such strategies leads to a large variety of practical research designs; a typology of research designs commonly used for assessing the effectiveness of programs is shown in Figure 2. There are two main bases for the typology: 1) how the comparison and treatment groups are constituted, and 2) whether or not the comparison groups are reflexive (i.e., involve comparisons of the subjects with them- selves). The data-collection strategies usually associated with each research design also are shown. The last column on the right indicates whether the research design is applicable to full-coverage programs or to partial-coverage programs. 
Figure 2. A Typology of Research Designs for Impact Assessment

I: "True" or randomized experiments
   Intervention assignment to targets: researcher-controlled random assignment
   Type of controls: randomization, often with statistical controls
   Outcome data collection points: minimum = after intervention; usually before and after, often many measures during intervention
   Applicability: ONLY partial-coverage programs

II: Regression discontinuity
   Intervention assignment to targets: controlled, biased, but known selection (b)
   Type of controls: statistical controls modeling the known selection bias
   Outcome data collection points: minimum = before and after intervention
   Applicability: partial-coverage programs

III: Time series
   Intervention assignment to targets: uncontrolled selection
   Type of controls: reflexive
   Outcome data collection points: many measures before and after intervention
   Applicability: partial- and full-coverage programs

IV: Quasi-experiments with non-random assignment
   Intervention assignment to targets: uncontrolled selection (a); non-random assignment
   Type of controls: constructed, statistical, and/or generic (c)
   Outcome data collection points: minimum = after intervention; usually before and after, often many measures during intervention
   Applicability: ONLY partial-coverage programs

V: Panel studies
   Intervention assignment to targets: uncontrolled selection
   Type of controls: reflexive
   Outcome data collection points: more than two measures during intervention
   Applicability: partial- and full-coverage programs

VI: Before-and-after studies
   Intervention assignment to targets: uncontrolled selection
   Type of controls: reflexive
   Outcome data collection points: minimum = before and after intervention
   Applicability: partial- and full-coverage programs

VII: Retrospective before-and-after studies
   Intervention assignment to targets: uncontrolled selection
   Type of controls: retrospective reflexive
   Outcome data collection points: after intervention, with retrospective measures of the before state
   Applicability: partial- and full-coverage programs

VIII: Cross-sectional surveys
   Intervention assignment to targets: uncontrolled selection
   Type of controls: statistical
   Outcome data collection points: after intervention only
   Applicability: partial-coverage programs

IX: Judgmental assessments
   Intervention assignment to targets: uncontrolled selection
   Type of controls: shadow controls
   Outcome data collection points: after intervention only
   Applicability: partial- and full-coverage programs

(a) In a few quasi-experiments, the control over who will receive the treatment is exercised by the researcher.
(b) The selection process must be clearly stated and faithfully carried through.
(c) Generic controls are known standards, such as average IQ or reading skills of the population.

As indicated earlier, ongoing programs intended to cover all of the targeted subjects present special difficulties; in general, only reflexive controls are applicable. It is not possible to go into detail here on how each of the designs can be implemented appropriately. They are ranked from top to bottom roughly in the order of their ability to produce unbiased effectiveness estimates. Although the more powerful research designs are generally more expensive, there are notable exceptions including time-series designs.23 The several common approaches to establishing comparison groups are sketched below:

Randomized Comparisons: Targets are randomly divided into an experimental group, to whom the intervention is administered, and randomized controls, from whom the intervention is withheld.

Constructed Comparisons: Targets to whom the intervention is given are matched with an equivalent group—constructed comparison groups—from whom the intervention is withheld.

Statistical Comparisons: Participant and nonparticipant targets are compared, holding differences between participants and nonparticipants statistically constant.

Reflexive Comparisons: Targets who receive the intervention are compared with themselves, as measured before the intervention.
Repeated Measures Reflexive Comparisons: A special case of reflexive controls in which targets are observed repeatedly over time. Also called panel studies.

Time-Series Reflexive Comparisons: A special case of reflexive controls in which rates of occurrence of some events are compared before and after the start of an intervention.

Generic Comparisons: Intervention effects among targets are compared with established norms of typical changes occurring in the target population.

Shadow Comparisons: Targets who receive the intervention are compared with the judgments of experts, program administrators, and/or participants about what changes are to be ordinarily expected for the target population.

The most severe restriction on strategy choice is whether or not the intervention in question is being delivered to all (or virtually all) members of a target population. For programs with total coverage, as in the case of long-standing, ongoing, fully funded programs, it is not usually possible to identify a group that is not receiving the intervention and that is essentially comparable with the subjects who are beneficiaries. In such circumstances, the main strategy available is the use of reflexive comparisons. In contrast, interventions that are to be tested on a demonstration basis ordinarily will not be delivered to all of the target population. Hence, in the start-up phase new programs are, by definition, programs with partial coverage. In all likelihood, no program has ever achieved total coverage of its intended target population. Even in the best programs, there are some persons who refuse to participate, others who are not aware that they can participate, and still others who are declared ineligible on technicalities. Nevertheless, many programs achieve almost full coverage. The Social Security Administration's retirement payments, for example, reach most of the eligible portions of the population. Fortunately for our purposes, there are enough programs with full coverage that are also not uniform over time or over localities. These differences, over time and across administrative subdivisions, provide the evaluator with some limited opportunities to assess the effects of variations in the program. Thus, one might not be able to assess what the net impact of elementary schooling is (as compared to no schooling at all) but one can assess the differential impact of various kinds of schools and of changes in schools over time. These variations in ongoing, established programs occur in a variety of ways; policies change over time along with their accompanying programs. A program's administrators may also institute modifications in order to meet some new condition or to make administration easier. Thus, from time to time, Social Security benefits have been increased to take into account new conditions or to add new services (e.g., Medicare). Similarly, sufficient local autonomy may be given to states and local governments so that a program (e.g., Aid to Families with Dependent Children) may vary somewhat from place to place.

23 Time-series designs are typically possible when some agency has collected time-series data over some lengthy period. Typical time series include stock market prices, unemployment measures, crime rates, and the like. If the full cost of collecting the series is included, time-series designs would be among the more expensive.
With proper precautions, such "natural variation" may provide a leverage point for the estimation of program effects. For partial-coverage programs, a larger variety of strategies are available. If the program is under the control of the evaluator (as may be the case in new or prospective programs), the ideal solution is to use randomized comparisons. A set of potential target subjects, representative of those who might be served if the program goes into effect, are selected in some unbiased way and randomly sorted into an experimental group and a comparison or control group. This process of randomization assures probabilistic equivalence of the beneficiaries receiving the intervention (the experimental group) to others who are not (the randomized controls). When an evaluator cannot employ randomization by forming experimental and control groups or conditions, adequate comparison groups may be formed by uncovered target subjects, if the proper precautions are taken. The simultaneous consideration of comparison group strategies, intervention features, and data collection strategies produces the schematic classification of impact assessment research designs shown in Figure 2. Each of the research designs shown in the table is discussed below: • DESIGN I: Randomized "True" Experiments. "True" experiments are applicable only to partial-coverage programs. The essential feature of true experiments is the random assignment of treatments to targets and the random withholding of treatment from targets, so that these constitute, respectively, an experimental and a control group. ------- 242 A Guide to Evaluation Research Theory and Practice The most elaborate true experiments are longitudinal studies consisting of a series of periodic observations of experimental and control groups. Most of the large-scale field experiments undertaken over the past two decades to test proposed programs have been longitudinal, randomized experiments in which data on participants were collected over periods of years. For example, the several negative income tax experiments have all employed the same basic longitudinal design, varying one from the other in the kinds of treatments tested and in the length of time over which the intervention treatments were given, ranging from three to ten years. However, preintervention measures often are simply indefinable. For example, prisoner rehabilitation experiments that are designed to affect recidivism can be based only on postintervention measures, since recidivism cannot be measured before release from prison. Similarly, intervention efforts designed to reduce the incidence of disease or accidents have undefined preintervention outcome measures. • DESIGN II: Regression-Discontinuity Studies Some programs are administered using a definite and precise set of rules for selecting participants. For example, some college fellowship programs allocate fellowships on the basis of scores received on standardized tests (e.g., The National Merit Scholarship Test) and food stamp eligibility is determined by income eligibility rules. If such rules are followed with reasonable fidelity, it is possible to derive fairly accurate estimates of the net effects of the program in question by statistical analyses that focus on persons who are at the cutting points used in selection. The analyses require that the rules of selection be administered uniformly and that valid and reliable measures of outcomes be employed. 
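The following sketch (Python, simulated data, illustrative parameter values) shows the logic of a regression-discontinuity analysis in miniature: eligibility is assigned strictly by a cutoff on a selection score, separate lines are fit to observations near the cutting point on each side, and the program effect is estimated as the jump between the fitted lines at the cutoff.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4000
    score = rng.uniform(0.0, 100.0, n)        # selection score (e.g., a standardized test)
    cutoff = 60.0
    eligible = score >= cutoff                # selection rule applied uniformly
    true_effect = 5.0
    outcome = 20.0 + 0.4 * score + true_effect * eligible + rng.normal(0.0, 3.0, n)

    # Use only observations near the cutting point, and fit a line on each side.
    bandwidth = 15.0
    near = np.abs(score - cutoff) <= bandwidth
    below = near & ~eligible
    above = near & eligible

    def fit_line(x, y):
        slope, intercept = np.polyfit(x, y, 1)
        return intercept, slope

    a_below, b_below = fit_line(score[below], outcome[below])
    a_above, b_above = fit_line(score[above], outcome[above])
    jump = (a_above + b_above * cutoff) - (a_below + b_below * cutoff)

    print(f"estimated jump at the cutoff: {jump:.2f} (true effect {true_effect})")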
Although this approach to studying impact is free of many of the problems associated with nonexperimental designs, it is of limited usefulness because few programs are administered by selecting participants in a clear and precise fashion. In addition, the required statistical analysis is considerably sophisticated and cannot be used by persons with only an elementary knowledge of statistics. A detailed discussion of the regression-discontinuity design may be found in Trochim (1984). DESIGN III: Time-Series Designs Time-series designs are based on the analysis of repeated measures taken on an aggregate unit (usually a political jurisdiction) with many data points surrounding a point in time when a new, full-coverage intervention was introduced or an old program was substantially modified. By aggregate statistical series we mean periodic measures taken on a relatively large population, such as vital statistical series (births, deaths, migrations), usually defined as rates for fairly large popula- tions.24 24 Whether the time-series data concern a city, state, or the nation as a whole, only one entity is under study. Indeed, the basic strategy of time series has been used to study single cases, as in clinical studies of individual persons. (See Kadzin, 1982.) ------- A Guide to Evaluation Research Theory and Practice 243 Time-series analyses are especially important for estimating the net impacts of full- coverage programs, which present especially difficult problems in impact assess- ment because they lack an uncovered target population that might serve as a control. However, if extensive, over-time, before-program-enactment observations on out- come measures exist, it is possible to use the powerful techniques of time-series analyses. Thus, it may be possible to study the effect of the enactment of a gun- control law in a particular jurisdiction, but only if the evaluator has access to a sufficiently long-term series consisting of crime statistics that track long-term trends in gun-related offenses. Of course, for many ongoing interventions, such long-term measures do not exist; for example, there are no long-term, detailed time series on the incidence of certain acute diseases, making it difficult to assess the impact of Medicare or Medicaid on them. Although the technical procedures of time-series analyses are quite complicated, the ideas underlying them are quite simple. The trend before a treatment was put into place is analyzed in order to obtain a projection of what would have happened without the intervention. The trend after the intervention is then compared to the resulting projections, and statistical tests are used to determine whether or not the observed postintervention trend is sufficiently different from the projection to infer that the treatment had an effect. For example, the effects of changing the pricing policies on household water consumption can be studied using time-series analysis by analyzing the consumption trends before the pricing policy changes, projecting water consumption trends on that basis, and comparing actual consumption with the projections (Berk et al., 1981). Perhaps the most serious limitation on time-series designs is that many prein- tervention observations are needed in order to model preintervention time trends accurately (more than 30 points in time are recommended). For this reason, time- series analyses are usually restricted to outcome concerns for which governmental or other groups routinely collect and publish statistics. 
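A minimal interrupted time-series sketch follows (Python, simulated monthly data, hypothetical numbers), in the spirit of the water-consumption example above: the preintervention trend is fit and projected forward, and the projection is compared with what is actually observed after the intervention. Real applications require long series and proper time-series models rather than the simple trend line used here.

    import numpy as np

    rng = np.random.default_rng(2)
    months = np.arange(60)                    # 36 pre- and 24 post-intervention points
    intervention_month = 36
    trend = 100.0 + 0.5 * months              # underlying preintervention trend
    effect = -8.0                             # hypothetical drop after the policy change
    observed = trend + effect * (months >= intervention_month) + rng.normal(0.0, 2.0, 60)

    # Fit the preintervention trend and project it over the whole period.
    pre = months < intervention_month
    slope, intercept = np.polyfit(months[pre], observed[pre], 1)
    projection = intercept + slope * months

    # Compare observed postintervention values with the projection.
    post = ~pre
    estimated_effect = (observed[post] - projection[post]).mean()
    print(f"estimated post-intervention shift: {estimated_effect:.1f} (true {effect})")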
• DESIGN IV: Quasi-Experiments with Constructed, Generic, and/or Statistical Controls A large class of impact assessment designs consists of nonrandomized "quasi-experiments," all of which have in common comparisons between experimental groups, created out of targets who have elected (in some fashion) to participate in a program (or have been selected administratively as participants), and constructed comparisons, groups of nonparticipants who are in some critical ways comparable with the participants. Such comparisons may be made through the assembly (or construction) of groups of nonparticipant targets who resemble closely the group of participants. It is critical that the comparison group be very similar to the intervention group. Comparison groups may be constructed by matching each unit in the intervention group with a similar unit or by matching aggregate features of the intervention group with another group with the same aggregate features. For example, cities in the intervention group may be matched each with another city similar in size, regional location, and economic base. Or individual persons may each be matched in age, gender, and ethnic background. An example of aggregate matching is to select school classes with the same averages in age, IQ scores, and ethnic proportions as the group of students used in the intervention. Key to the construction of appropriate comparison groups is some prior knowledge of the important factors on which the intervention group and the comparison group are to be matched. Closely related to constructed comparisons are those defined through statistical analysis. Persons who have not participated in a program are compared to those who have, using statistical techniques that hold constant known differences between participants and nonparticipants. Statistical controls are often used, along with pre-, ongoing, and postmeasures of outcomes in constructed comparison groups. Indeed, the combined use of constructed controls and statistical controls can often increase the power of a quasi-experiment considerably. If statistical controls are used with postmeasures only, then the design is really that of a cross-sectional survey. (See discussion of Design VIII.) In short, the line between nonrandomized experiments with constructed controls and one-shot surveys is often obscure. The important point is that the reasoning involved in both is much the same: both attempt to estimate net effects by creating control groups that presumably represent potential targets who were not exposed to the intervention. Another approach to the comparison group construction problem is to use generic controls, usually consisting of measurements purporting to represent the typical performance of targets or the population from which targets may be drawn. Thus, in judging the performance of school children enrolled in a new learning program, the participants' scores on a standardized achievement test may be compared with published general norms for school children of that age or grade. Generic controls are widely available for some subjects, such as IQ and achievement tests, but for most subjects are not easily at hand. In any event, generic controls are rarely suitable; targets are often selected because of the ways in which they differ from the general population.
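The sketch below (Python; the covariates, sample sizes, and effect size are all assumptions made for illustration) shows a constructed comparison group built by nearest-neighbor matching: each participant is paired with the most similar nonparticipant on measured background characteristics, and the effect is estimated from the matched differences. As the discussion above emphasizes, this works only to the extent that the factors that matter are known and measured.

    import numpy as np

    rng = np.random.default_rng(3)
    n_t, n_c = 200, 1000
    # Covariates: age and prior score; participants skew older and higher-scoring.
    treated_X = np.column_stack([rng.normal(40, 5, n_t), rng.normal(55, 8, n_t)])
    control_X = np.column_stack([rng.normal(35, 8, n_c), rng.normal(50, 10, n_c)])
    true_effect = 4.0
    treated_y = 0.2 * treated_X[:, 0] + 0.5 * treated_X[:, 1] + true_effect + rng.normal(0, 2, n_t)
    control_y = 0.2 * control_X[:, 0] + 0.5 * control_X[:, 1] + rng.normal(0, 2, n_c)

    # Standardize covariates, then match each treated unit to its nearest control.
    mu, sd = control_X.mean(axis=0), control_X.std(axis=0)
    tz, cz = (treated_X - mu) / sd, (control_X - mu) / sd
    matched_diffs = []
    for i in range(n_t):
        d = ((cz - tz[i]) ** 2).sum(axis=1)   # squared distance to every control
        j = int(np.argmin(d))                 # nearest control (controls may be reused)
        matched_diffs.append(treated_y[i] - control_y[j])

    print(f"matched estimate: {np.mean(matched_diffs):.2f} (true effect {true_effect})")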
DESIGN V: Panel Studies Panel studies are ones in which the same units are repeatedly measured over time, the period in question spanning the introduction of an intervention. For example, a panel may be established before the beginning of an educational campaign, and queried repeatedly before, during, and after the campaign is put in place. Panel studies are based on a reflexive control strategy in which the changes in individuals occurring during the intervention are attributed to be the effects of the intervention. Although panel studies appear to be a simple extension of before-and-af ter designs (see Design ------- A Guide to Evaluation Research Theory and Practice 245 VI) through the addition of more data collection points, these studies enjoy a considerably higher standing in the plausibility order of impact assessments. The additional time points, properly employed, allow the researcher to begin to specify the processes by which an intervention has impacts upon targets. This design is especially important in the study of full-coverage programs. A prominent but controversial example of how this design was used is a study of the impact of children's viewing of violence and aggression shown in television programs on their manifestations of aggression toward their classmates. Given the circumstances of almost universal television viewing among children and, hence, the virtual impossibility of establishing controls who do not watch television, the best approach was to study how varying amounts of watching violence and aggression affected the display of aggression at some subsequent point in time (Milavsky et al, 1982). In some circumstances, because subjects are repeatedly contacted, panel studies risk affecting the subjects through the research effort itself. Thus, in the New York State study of communication to households about the dangers of radon gas, subjects became more sensitive to the problem simply because they were repeatedly asked questions about the topic (Smith et al, 1987). DESIGN VI: Before-and-After Studies Although few designs have as much intuitive appeal as before-and-after studies, they are among the least valid of assessment approaches. The essential feature of abefore- and-after study is a comparison of the same targets at two points in time, separated by a period of participation in a program, the differences between the two measurements being taken as an estimate of the net effects of the intervention. The main deficiency of such designs is that ordinarily they cannot disentangle the effects of extraneous events occurring during that period from the effects of the intervention. For example, a mass educational campaign's effects cannot be easily separated from those caused by ordinary media coverage of the same topics. DESIGN VII: Retrospective Before-and-After Studies The principal feature of this design is that it is based on retrospective reconstructions of the state of targets before an intervention along with postintervention measures. Typically, people are selected who have participated in a program, and they are asked to reconstruct what their circumstances were before they participated in the program. For obvious reasons, this design yields estimates of the net effects of programs that are even less plausible than straight before-and-after studies based on Design VI. 
In addition to the ambiguities of interpretation caused by uncontrolled-for extraneous events, this design also suffers from the problems of using fallible reconstructions of the situation before the intervention, relying as it does on possibly faulty recall. For these reasons, this design is not recommended for use in any evaluation. ------- 246 A Guide to Evaluation Research Theory and Practice DESIGN VIII: Cross-Sectional Surveys Cross-sectional surveys are single censuses or sample surveys. They are cross- sectional in the sense of providing a set of measures—as of aparticular point or cross- section in time. The typical cross-sectional survey used to provide estimates of net effects is usually a sample survey of some target population, part of whom have received a treatment (or participated in a program) and part of whom have not. In some cases, the cross sectional survey is of target population members who have received differing amounts of a treatment or who have experienced several variations of the treatment. Those who received the treatment are compared with those who did not on postintervention outcome measures, using statistical techniques to hold constant differences between the two groups. Although cross-sectional designs are among the less expensive ways to estimate impact, they are also among the more difficult to carry out rigorously. The critical problems center on whether sufficient knowledge exists concerning which are the important factors to hold constant in making statistical comparisons between persons who have been exposed to an intervention and those who have not. Indeed, in most cases, an important case can be made that exposure itself, being selective and voluntary, is an indication of important intervention and comparison group differ- ences. By definition, this self-selection cannot be held constant in any comparison. When cross-sectional surveys are used with partial-coverage programs, they are to be considered a variant of constructed comparison groups. However, using them to gauge the effectiveness of full-coverage programs that vary from place to place constitutes a unique application. For instance, there are several studies that attempt to gauge the effectiveness of gun-control legislation by examining the levels of restrictions on licensing and gun usage (Krug, 1967; Geisel et al., 1969; Seitz, 1972). In this case, the states constitute the units, with the observations being rates for various sorts of crime in a particular year. These studies are not analyses of time series, but use rates at only one point in time. Note that such impact assessments lead to estimates of how much of a net effect one variation in the treatment has compared with others. In the case of gun-control legislation studies, the variations being assessed are degrees of stringency in state laws. If applied to the study of whether Medicaid plans of varying levels of generosity affect medical care usage, the same kind of state comparison assesses the effects of varying levels of generosity but will not be able to tell whether Medicaid per se has any effect on medical care consumption. A variant of the cross-sectional survey may be seen in a design that uses constructed controls with after-only measures. 
One of the best known of such studies is the controversial evaluation of Head Start (Cicirelli et al., 1969), which was based on a comparison of children in the first grade who had participated in Head Start at nursery-school-age with first-graders of comparable background in the same or nearby schools who had not participated. Whether or not a cross-sectional evaluation was carried out properly is a question that centers on the types of statistical controls that were employed, which is almost always a matter subject to disagreement. The ------- A Guide to Evaluation Research Theory and Practice 247 issues involved in the proper design and analysis of one-shot surveys of existing, full- coverage programs with treatment that varies by site are especially complicated. • DESIGN IX: Judgmental Assessments The final design considered in Table 2 is one in which the judgments of some presumed experts, program administrators, or participants play the largest role in estimating net impact. In connoisseurial impact assessments, an expert—or connois- seur—is employed to examine a program, usually through visits to the program site. The expert gathers data in an informal way and then makes a judgment. Such judgments may be aided by the use of generic controls- -that is, existing estimates of what the population as a whole usually experiences—or "shadow" controls, which are educated guesses about what normal progress is considered to be. Needless to say, such assessments are the shakiest of all impact assessments. Equally suspect are impact assessments that rely upon the judgments of program administrators. Because of their obvious interests in making their efforts appear successful, such judgments are far from impartial. In the assessment of some programs, participants' judgments of program success have been used. These judgments have some validity, especially for programs that seek to increase participant satisfaction. However, it is usually difficult, if not impossible, for participants to make judgments about net impact because they do not have appropriate knowledge to bring to bear on their judgment. We do not mean to argue against all judgmental assessments; there are circumstances in which the evaluator can use nothing else. Furthermore, although some might advise against undertaking any assessment, we believe that some assessment is better than none. Judgmental designs may be the only type that can be used when few funds are available; when no preintervention measures exist; when no reflexive controls can be used; and when everyone is covered by the program and the program is uniform in place and time—so that neither randomized nor constructed controls can be used. Choosing which design to use in an evaluation is difficult. As a general rule, one should employ the best design possible, given the time and resources available. In addition, programs that rely heavily on voluntary self-selection should not use designs that cannot adequately handle self-selection biases. Was the Program Worth It: The Economic Efficiency Question. Given a program of proven effectiveness, the next question one might reasonably raise is whether the opportunity costs of the program are justified by the gains achieved. The same question might be more narrowly raised in a comparative framework: Is Program A more "efficient" than Program B?—both being otherwise equally acceptable ways of achieving a particular goal? The main problem in answering such questions focuses on establishing a yardstick for such an assessment. 
For example, would it be useful to think in terms of dollars spent ------- 248 A Guide to Evaluation Research Theory and Practice for units of achievement gained, in terms of students covered, or in terms of classes or schools that come under the program? The simplest way of answering efficiency issues is to calculate cost-effectiveness measures, dollars spent per unit of output. Thus, in the case of the "Sesame Street" program, two cost- effectiveness measures were computed: 1) dollars spent per child-hour of viewing, a measure of the cost of running the program, and 2) dollars spent per each additional letter of the alphabet learned, a cost-effectiveness measure that takes into account the resulting increase in learning. Note that the second measure implies knowing the effectiveness of the program as established by an effectiveness evaluation. The most complicated way to answer the efficiency question is to conduct a full- fledged cost- benefit analysis in which all the costs and benefits are computed. Relatively few such analyses have been made of social programs because it is difficult to convert all the costs and all the benefits into the same yardstick terms. In principle, it is possible to convert into dollars all the costs and benefits of a program; in practice, it is rarely possible to do so without some disagreement on the value, say, of learning an additional letter of the alphabet. An additional problem with full-fledged cost-benefit analyses is that they must take into consideration the long-run consequences; notonly of the program,butalsoof the long- term consequences of the next best foregone alternative. This immediately raises the question of "discounting": the fact that resources invested in a social program today may produce, over a number of years, consequences that have to be compared with those that might have resulted from the next best alternative. For example, a vocational program in inner-city high schools should address (among other things) the program's long-term impact on students' earnings over their lifetimes. This in turn requires that the costs and benefits of the program and the next best alternative be phrased in terms of today' s dollars. Without going into the arcane art of discounting, the problem is to figure out what might be a reasonable rate of return over the long run for current program investments and competing alternatives. One can obtain widely varying assessments, depending on what rate of return is used (Thompson, 1980). Evaluation in Evolution The field of evaluation research is scarcely out of its infancy as a social science activity. The first large-scale field experiments were started in the mid- 1960s. The interest in large-scale national evaluations of programs had its origins in the War on Poverty. The art of designing large-scale implementation and monitoring studies is still evolving rapidly. Concern with the validity status of qualitative research has just begun. Neverthe- less, the demand for sound program evaluations is growing. In this context, perhaps the best overall message is to keep it as simple as possible. Typically, simple programs will be hard enough to design and implement. Simple research designs usually will be sufficiently demanding. And simple data analyses will likely tax even the best evaluators. In other words, there is no such thing as a routine evaluation. To add unnecessary complexity to the burden is to turn a promising opportunity into an almost certain failure. 
REFERENCES

Abt Associates. 1979. Child Care Food Program. Cambridge, Massachusetts: Abt Associates.

Baldus, D.C., and J.W.L. Cole. 1977. Quantitative proof of intentional discrimination. Evaluation Quarterly 1(1):53-86.

Barnett, V. 1982. Comparative Statistical Inference. New York: John Wiley.

Becker, H.S. 1958. Problems of inference and proof in participant studies. American Sociological Review 23(6):652-60.

Berk, R.A. 1988. The role of subjectivity in criminal justice classification and prediction methods. Criminal Justice Ethics 6(1), in press.

Berk, R.A. 1988. Causal inference for sociological data. In Handbook of Sociology. N. Smelser (ed.). Beverly Hills: Sage Publications.

Berk, R.A., R. Boruch, D. Chambers, P. Rossi, and A. Witte. 1985. Social policy experimentation: a position paper. Evaluation Review 9(4) (August):387-429.

Berk, R.A., and M. Brewer. 1978. Feet of clay in hobnailed boots: an assessment of statistical inference in applied research. In Evaluation Studies Review Annual, Vol. 3. T.D. Cook, ed. Pp. 90-214. Beverly Hills: Sage.

Berk, R.A., and T.F. Cooley. 1987. Errors in forecasting social phenomena. Climatic Change 11(2):247-265.

Berk, R.A., T.F. Cooley, C.J. LaCivita, and K. Sredl. 1981. Water Shortage: Lessons in Water Conservation Learned from the Great California Drought. Cambridge, Mass.: Abt Books.

Berk, R.A., and A. Hartman. 1972. Race and class differences in per-pupil staffing expenditures in Chicago elementary schools. Integrated Education 10(1):52-57.

Berk, R.A., and D. Rauma. 1983. Capitalizing on nonrandom assignment to treatments: a regression-discontinuity evaluation of a crime control program. Journal of the American Statistical Association.

Berk, R.A., and P.H. Rossi. 1976. Doing good or worse: evaluation research politically reexamined. Social Problems 23(4):337-49.

Campbell, D.T., and A. Erlebacher. 1970. How regression artifacts in quasi-experimental evaluations make compensatory education look harmful. In Compensatory Education: A National Debate. J. Hellmuth, ed. Pp. 185-210. New York: Brunner/Mazel.

Carson, R. 1955. The Silent Spring. New York: Bantam Books.

Chen, H., and P.H. Rossi. 1980. The multi-goal, theory-driven approach to evaluation: a model linking basic and applied social science. Social Forces 59(1):106-22.

Cicirelli, V.G., et al. 1969. The Impact of Head Start. Athens, Ohio: Westinghouse Learning Corporation and Ohio State University.

Coleman, J., et al. 1967. Equality of Educational Opportunity. Washington: GPO.

Conant, James B. 1959. The American High School Today. New York: McGraw-Hill.

Cook, T., et al. 1975. Sesame Street Revisited. New York: Russell Sage.

Cook, T., and D. Campbell. 1979. Quasi-Experimentation. Chicago: Rand McNally.

Cronbach, L.J. 1975. Five decades of controversy over mental testing. American Psychologist 30(1):1-14.

Cronbach, L.J., and Associates. 1980. Towards Reform of Program Evaluation. Menlo Park, California: Jossey-Bass.

Cronbach, L.J. 1982. Designing Evaluations of Educational and Social Programs. Menlo Park, California: Jossey-Bass.

Deutscher, I. 1977. Toward avoiding the goal trap in evaluation research. In Readings in Evaluation Research. F. Caro, ed. Pp. 221-38. New York: Russell Sage.

Ericksen, E.P., and J.B. Kadane. 1985. Estimating the population in a census year: 1980 and beyond. Journal of the American Statistical Association 80:98-131.
Fairweather, George, and Louis G. Tornatzky. 1977. Experimental Methods for Social Policy Research. New York: Pergamon.

Franke, R.H. 1979. The Hawthorne experiments: review. American Sociological Review 44(5):861-67.

Franke, R.H., and J.D. Kaul. 1978. The Hawthorne experiments: first statistical interpretation. American Sociological Review 43(5):623-43.

Geisel, M.S., R. Roll, and R.S. Wettick, Jr. 1969. The effectiveness of state and local regulation of handguns: a statistical analysis. Duke Law Journal (August):647-676.

Gramlich, E.M., and P. Koshel. 1975. Educational Performance Contracting. Washington: The Brookings Institution.

Guba, E., and Y. Lincoln. 1981. Effective Evaluation. Menlo Park, California: Jossey-Bass.

Guttentag, M., and E. Struening, eds. 1975. Handbook of Evaluation Research. 2 vols. Beverly Hills: Sage.

Harrington, Michael. 1962. The Other America. New York: Macmillan.

Hausman, J.A., and D.A. Wise. 1985. Social Experimentation. Chicago, Illinois: The University of Chicago Press.

Heckman, J., and R. Robb. 1985. Alternative methods for evaluating the impact of interventions. In J.J. Heckman and B. Singer, eds., Longitudinal Analysis of Labor Market Data. New York: Cambridge University Press.

Heilman, J.G. 1980. Paradigmatic choices in evaluation methodology. Evaluation Review 4(5):693-712.

Holland, P.W. 1986. Statistics and causal inference. Journal of the American Statistical Association 81:945-960.

Holland, P.W., and D.B. Rubin. 1988. Causal inference in retrospective studies. Evaluation Review: in press.

Kazdin, A.E. 1982. Single Case Research Designs. New York: Oxford University Press.

Kershaw, D., and J. Fair. 1976. The New Jersey Income Maintenance Experiment. New York: Academic Press.

Kish, L. 1965. Survey Sampling. New York: John Wiley.

Kmenta, J. 1971. Elements of Econometrics. New York: Macmillan.

Krug, A.S. 1967. The relationship between firearm licensing laws and crime rates. Congressional Record 113, Part 15 (July 25):200060-200064.

Lenihan, K. 1976. Opening the Second Gate. Washington: GPO.

Lewis, D.A., T. Pavkov, H. Rosenberg, S. Reed, A. Lurigio, Z. Kalifon, B. Johnson, and S. Riger. 1987. State Hospitalization Utilization in Chicago. Evanston, Illinois: Center for Urban Affairs and Policy Research.

Lewis, Oscar. 1965. La Vida. New York: Random House.

Liebow, Elliot. 1967. Tally's Corner. Boston: Little, Brown.

Lord, F.M. 1980. Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Erlbaum.

Mathematica Policy Research. 1980. Job Corps Evaluated. Princeton: Mathematica.

Maynard, R.A., and R.J. Murnane. 1979. The effects of the negative income tax on school performance. Journal of Human Resources 14(4):463-76.

Mensh, I.N., and J. Henry. 1953. Direct observation and psychological tests in anthropological field work. American Anthropologist 55(4):461-80.

Milavsky, J.R., H.H. Stipp, R.C. Kessler, and W.S. Rubens. 1982. Television and Aggression: A Panel Study. New York: Academic Press.

Murray, Sandra A. 1980. The National Evaluation of the PUSH for Excellence Project. Manuscript. Washington: American Institutes for Research.

Murray, William A. 1981. Final Report: Evaluation of Cities in School Program. Manuscript. Washington: American Institutes for Research.

Nathan, R., F.C. Doolittle, and Associates. 1983. The Consequences of Cuts. Princeton, NJ: Princeton Urban and Regional Research Center.
Pollard, W.E. 1986. Bayesian Statistics for Evaluation Research. Beverly Hills, California: Sage Publications.
Pratt, J.W., and R. Schlaifer. 1984. On the nature and discovery of structure. Journal of the American Statistical Association 79(1):9-21.
Raizen, S., and P.H. Rossi. 1981. Program Evaluation in Education: When? How? To What Ends? Washington, DC: National Academy Press.
Reiss, Albert J. 1971. The Police and the Public. New Haven: Yale University Press.
Riis, Jacob A. 1890. How the Other Half Lives. New York: C. Scribner.
Robertson, L.S. 1980. Crash involvement of teenaged drivers when driver education is eliminated from high school. American Journal of Public Health 70(6):599-603.
Robins, P.K., et al. 1980. A Guaranteed Annual Income: Evidence from a Social Experiment. New York: Academic Press.
Rossi, P.H. 1978. Issues in the evaluation of human services delivery. Evaluation Quarterly 2(4):573-99.
Rossi, P.H. 1987. The iron law of evaluation and other metallic rules. In Research in Social Problems and Public Policy, Vol. 4. J. Miller and M. Lewis, eds. Pp. 3-20. Greenwich, CT: JAI Press.
Rossi, P.H., and B. Biddle. 1966. The New Media and Education. Chicago: Aldine.
Rossi, P.H., and Robert Dentler. 1961. The Politics of Urban Renewal: The Chicago Findings. New York: Free Press of Glencoe.
Rossi, P.H., G. Fisher, and G. Willis. 1986. The Condition of the Homeless of Chicago. Amherst, MA, and Chicago, IL: Social and Demographic Research Institute, University of Massachusetts, and NORC: A Social Science Research Institute, University of Chicago.
Rossi, P.H., and K. Lyall. 1976. Reforming Public Welfare. New York: Russell Sage.
Rossi, P.H., J.D. Wright, E. Weber-Burdin, and J. Pereira. 1983. Victims of the Environment: Loss from Natural Hazards in the United States, 1970-1980. New York: Academic Press.
Rossi, P.H., Richard A. Berk, and Bettye K. Eidson. 1974. The Roots of Urban Discontent. New York: John Wiley.
Rossi, P.H., R. Berk, and K. Lenihan. 1980. Money, Work, and Crime. New York: Academic Press.
Rossi, P.H., and H. Freeman. 1985. Evaluation: A Systematic Approach. Third Edition. Beverly Hills: Sage.
Scriven, M. 1972. Pros and cons about goal-free evaluation. Evaluation Comment 3(4):1-4.
Seitz, S.T. 1972. Firearms, homicide and gun control effectiveness. Law and Society Review 6 (May):595-613.
Sinclair, Upton. 1906. The Jungle. New York: Doubleday.
Smith, V.K., W.H. Desvousges, A. Fisher, and F.R. Johnson. 1987. Communicating Radon Risk Effectively: A Mid-course Evaluation. Washington, D.C.: Environmental Protection Agency. (Publication #EPA-230-07-87-029.)
Steinbeck, John. 1939. The Grapes of Wrath. New York: Viking.
Struyk, R.J., and M. Bendick. 1981. Housing Vouchers for the Poor: Lessons from a National Experiment. Washington, D.C.: The Urban Institute.
Suchman, E. 1967. Evaluation Research. New York: Russell Sage.
Sudman, S. 1976. Applied Sampling. New York: Academic Press.
Thompson, M. 1980. Cost-Benefit Analysis. Beverly Hills: Sage.
Trochim, W.M.K. 1984. Research Design for Program Evaluation: The Regression Discontinuity Approach. Beverly Hills, California: Sage Publications.
U.S. Conference of Mayors. 1987. The Continuing Growth of Hunger, Homelessness, and Poverty in U.S. Cities: 1987. Washington, D.C.: U.S. Conference of Mayors.
Wardwell, W.L. 1979. Comment on Kaul and Franke. American Sociological Review 44(5):858-61.
Weiss, C. 1972. Evaluation Research. Englewood Cliffs, New Jersey: Prentice Hall.
Wholey, J.S. 1977. Evaluability assessment. In Evaluation Research Methods. L. Rutman, ed. Pp. 49-56. Beverly Hills: Sage.
Wright, J.D., P.H. Rossi, and K. Daly. 1983. Under the Gun: Weapons, Crime and Violence in America. New York: Aldine Press.

-------

PARTICIPANTS

-------

JOHN AHEARNE
Vice President and Senior Fellow
Resources for the Future

Resources for the Future is an independent, nonprofit research organization specializing in natural resources, energy, and the environment. In addition, Dr. Ahearne is Chairman of the Department of Energy Advisory Committee on Nuclear Facility Safety; Chairman of the National Research Council Committee on Risk Perception and Communication; and is on the National Research Council Steering Committee for the Workshop on Chemical Processes and Products in Severe Reactor Accidents. Formerly, Dr. Ahearne served in varying posts as Deputy Assistant Secretary of Defense for General Purpose Programs; Principal Deputy Assistant Secretary of Defense for Manpower and Reserve Affairs, including Acting Assistant Secretary; Assistant to the Secretary of Energy; Deputy Assistant Secretary of Energy for Power Applications; and Commissioner and Chairman of the Nuclear Regulatory Commission. Recent publications include "Nuclear Power after Chernobyl," in Science; and "Three Mile Island and Bhopal: Lessons Learned and Not Learned," in Hazards: Technology and Fairness. Dr. Ahearne received his Ph.D. in physics from Princeton University.

FREDERICK W. ALLEN
Associate Director
Office of Policy Analysis
Office of Policy, Planning and Evaluation
U.S. Environmental Protection Agency

Mr. Allen is Associate Director of EPA's Office of Policy Analysis. In this capacity he helps to manage an office working on a variety of issues concerning environmental and health risk. In the past several years he has managed a number of projects designed to improve the manner in which the Agency communicates with the public about environmental and health risks. He was also the lead staff member on a major agency task force which published the widely discussed report, Unfinished Business: A Comparative Assessment of Environmental Problems. Mr. Allen has been at EPA since 1978. He has been Acting Director of the Energy Policy Division, Chief of the Energy Development Branch, and Staff Director of the Interagency Resource Conservation Committee. He has also worked on the staff of the Secretary of Labor, at the Federal Energy Administration, the Cost of Living Council, and VISTA. Mr. Allen earned his B.A. with Honors at Yale University and his M.B.A. at the Harvard Business School.

ELAINE BRATIC ARKIN
Health Communications Consultant

Following 16 years with the U.S. Public Health Service devoted to developing communications programs, Mrs. Arkin left her position as Deputy Director of Public Affairs to become an independent consultant. She provides marketing and communications assistance to Federal and other public sector clients, including EPA, the National Cancer Institute, the National Heart, Lung, and Blood Institute, and the Institute for Health Policy Analysis at Georgetown University.

ANN STOUFFER BISCONTI
Vice President, Research and Program Evaluation
U.S. Council for Energy Awareness (USCEA)

As Vice President, Research and Program Evaluation, Dr. Bisconti is responsible for public attitude tracking, advertising testing, and evaluation of program effectiveness.
She has over twenty years' experience directing projects ranging from advertising research to survey research on energy, higher education, human resource development, and health. She is author of five books and over 30 other publications. Dr. Bisconti received her bachelor's degree in sociology and anthropology from McGill University and her Ph.D. in social science research from Union Graduate School. Before joining USCEA, she held the positions of Director of the National Center for Allied Health Leadership, Director of the Washington Office of Higher Education Research Institute, and Vice President, Human Resources Policy Corporation.

CARON CHESS
Associate Director
Environmental Communication Research Program
Cook College, Rutgers University

Ms. Chess coauthored Improving Dialogues with Communities: A Risk Communication Manual for Government and has given a variety of presentations and workshops on the subject. Before moving to academia, she coordinated programs for both advocacy organizations and government agencies. She played a leadership role in the campaign for the country's first right-to-know law, which gave the public access to information about toxic hazards, and she has written a book and many articles about the development of such laws. She was founding Executive Director of the Delaware Valley Toxics Coalition, linking environmental and labor constituencies. As Right-to-Know Coordinator for the New Jersey Department of Environmental Protection, Ms. Chess led the effort to implement the state's new right-to-know law. She also laid the groundwork for New Jersey's innovative Risk Communication Unit.

THOMAS CHIZMADIA
Director of Corporate Communications and Public Policy
CIBA-GEIGY Corporation

Mr. Chizmadia has been with CIBA-GEIGY Corporation since 1979. In his present position, he directs internal and external communication for the Corporation. Addressing environmental issues constitutes a large part of his effort due to their importance to the chemical industry and CIBA-GEIGY throughout the United States. Prior to assuming his present position at corporate headquarters, Mr. Chizmadia was Director of Government Affairs and Communications for the CIBA-GEIGY plant in Toms River, New Jersey. His tenure there coincided with the EPA Region II pilot public education program on Superfund.

VINCENT T. COVELLO
Executive Director
Health Effects Institute
Boston, Massachusetts

In addition to his work at the Health Effects Institute, Dr. Covello is a professor at Columbia University and Director of the Center for Risk Communication. Prior to his current positions, Dr. Covello was Director of the Risk Assessment Program at the National Science Foundation and a senior scientist on detail to the White House Council on Environmental Quality. He has also been a study director at the National Academy of Sciences and a professor at Brown University. Dr. Covello received his Ph.D. from Columbia University and his B.A. with honors from Cambridge University in England. He has edited or authored over twenty books and numerous articles on various aspects of risk assessment, risk management, and risk communication. Dr. Covello is on the editorial board of several journals and is currently President of the Society for Risk Analysis.

CHARLES DARBY
Director
Survey Research, Evaluation, and Analysis
Prospect Associates

In his present position, Mr.
Darby provides design and data collection, analysis, and interpretation services in response to program and communications evaluation needs of government agencies and private-sector organizations. He has over twenty years experi- ence in survey and evaluation research focusing on health-related issues. Mr. Darby has directed a range of quantitative and qualitative research projects from large-scale national surveys to small-scale message testing and in-depth interviewing projects. ------- 260 Participants CAROL DECK Program Evaluation Division Office of Policy, Planning and Evaluation U.S. Environmental Protection Agency Ms. Deck has been on the staff of the Program Evaluation Division since October 1985. Before joining EPA, she was employed by ICF, Incorporated, and the National Science Foundation. Ms. Deck received her Bachelor's degree from Kalamazoo College and her Master's degree from Georgetown University. She is a recipient of a 1988/89 Fellowship from the Bosch Foundation to study environmental planning in the Federal Republic of Germany. ROBERT W. DENNISTON Director, Division of Communication Programs Office for Substance Abuse Prevention Alcohol, Drug Abuse, and Mental Health Administration U.S. Public Health Service Mr. Denniston received his M.A. degree in mass communications and is currently pursuing a Ph.D. in public communications at the University of Maryland. Prior to assuming his present position at the Alcohol, Drug Abuse, and Mental Health Administra- tion (ADAMHA), Mr. Denniston was Director, Division of Prevention and Research Dissemination, National Institute on Alcohol Abuse and Alcoholism. Before that, he held positions as Chief, Information Projects Branch, National Cancer Institute, and Com- munications Director, Mayo Comprehensive Cancer Center, Mayo Clinic. Mr. Denniston is Chair-elect of the American Public Health Association's Section on Alcohol and Drugs. His other professional affiliations are with the National Council on International Health and the International Communication Association. WILLIAM H. DESVOUSGES Senior Research Economist, Environmental Economics Department Center for Economics Research Research Triangle Institute As a Senior Research Economist at RTI for the past eight years, Dr. Desvousges' particular area of expertise has been risk communication. His recent projects include measuring the risk-related impacts of siting a high-level nuclear waste repository, measuring the effectiveness of alternative radon risk communication materials sent to 2,300 homeowners and supervising 7,000 interviews to measure radon risk perceptions and to track expenditures to reduce those risks, and measuring the effectiveness of a community-based radon risk communication effort. For these studies, he developed print ------- Participants 261 and radio public service announcements and designed survey questionnaires, coordinated data collection, and analyzed survey results. Dr. Desvousges is also an acknowledged expert in benefits analysis, having published a book and articles on the subject. He holds a B.A., M.A., and Ph.D. in economics and was a professor at the University of Missouri-Rolla for five years after receiving his Ph.D. from Florida State University. RICHARD A. EISINGER Assistant Branch Chief of the Human Resources and Housing Branch Office of Information and Regulatory Affairs Office of Management and Budget Possessing extensive knowledge and experience in survey research, program evalu- ation and regulatory issues, Mr. 
Eisinger heads a staff that is responsible for overseeing these functions for various government departments and agencies, including the Department of Health and Human Services, the Education Department, the Veterans Administration, and the National Science Foundation. Mr. Eisinger has a bachelor's degree in psychology from the University of Colorado and a master's in social psychology from the University of Missouri. He was a Ph.D. candidate in business and industrial psychology at the University of Maryland. ANN FISHER Manager, Risk Communication Program Office of Policy Analysis U.S. Environmental Protection Agency Dr. Fisher joined the U.S. Environmental Protection Agency's Benefits Staff in 1980. Her initial work at EPA was on methods for measuring the benefits from improved water quality and the role of risk assessment in the decision process. Later she switched emphases to valuing changes in morbidity and mortality and measuring the benefits of regulating hazardous wastes. In 1986, she set up EPA's Risk Communication Program for exploring the use of information programs as potential alternatives to—and complements of—regulation for reducing risk. Dr. Fisher is author of numerous articles on risk communication topics in a variety of publications. She has served as Associate Editor for the Journal of Environmental Economics and Management and is currently a member of that publication's Editorial Council. She taught for nine years at the State University of New York—College at Fredonia and continues to lecture extensively on environmental and risk communication issues. Dr. Fisher holds a B.A. in mathematics and an M.A. and Ph.D. in economics, all from the University of Connecticut. ------- 262 Participants JUNE FLORA Assistant Professor, Institute for Communication Research Department of Communication Stanford University In addition to her teaching responsibilities, Dr. Flora holds two positions in the Department of Medicine—Associate Director, Stanford Center for Research in Disease Prevention, and Director of the Educational Program, Stanford Heart Disease Prevention Program. She has been coauthor of numerous monographs, book chapters, and presenta- tions—most recently, "Indicators of Societal Action to Promote Physical Health" in Individual and Societal Actions for Health Promotion: Strategies and Indicators. Dr. Flora holds an M. A. and Ph .D. in educational psychology (with sub-specialization in child development) from Arizona State University, and a B.A. in psychology from Bridgewater College in Virginia. VICKI S. FREIMUTH Director of Health Communication Associate Professor, Department of Communication Arts and Theatre University of Maryland, College Park At the University of Maryland, Dr. Freimuth teaches courses in health communica- tion, diffusion of innovations, and research methods. Her research focuses on the dissemination of health information in this country and in developing countries. She is lead author of a forthcoming book from the University of Pennsylvania Press, Searching for Health Information: The Cancer Information Service Model. Her publications have ap- peared in Human Communication Research. Journal of Communication, American Journal of Public Health, Health Education Research: Theory and Research. In addition, Dr. Freimuth consults regularly for the National Cancer Institute, the National Heart, Lung and Blood Institute, the National Institute on Alcohol Abuse and Alcoholism, and the Agency for International Development. 
She is Chairperson of the Health Communication Division of the International Communication Association. Dr. Freimuth holds a Ph.D. from Florida State University.

HENRY L. GARIE
Assistant Director
Office of Science and Research
New Jersey Department of Environmental Protection

In his present position, Mr. Garie directs the activities of NJDEP's Demographic Information System and the Office of Environmental Health Assessment, which includes the Risk Assessment Unit, the Risk Communication Unit, and the Risk Reduction Unit. Prior positions in the Office of Science and Research included Acting Assistant Director, Research Scientist I and II, and Technical Assistant to the Director. Mr. Garie also served as Principal Biologist in the Office of Cancer and Toxic Substances Research, NJDEP. Mr. Garie has been author or coauthor of numerous publications, including a recent article, "Overview of the Implementation of a Statewide Worker and Community Right-to-Know Act," in Hazard Communication: Issues and Implementation, ASTM STP 932. He has a B.S. in biology and an M.S. in environmental science from Rutgers University.

JAMES A. HARRELL
Deputy Director
Office of Disease Prevention and Health Promotion
Office of the Assistant Secretary for Health
U.S. Department of Health and Human Services

Mr. Harrell has been with the Department of Health and Human Services since 1975, having held positions as Senior Science Analyst; Chief, Program, Policy and Planning; and Director, National Center on Child Abuse and Neglect. Immediately prior to assuming his present position, Mr. Harrell served as Director, Planning, Research and Evaluation Division, Administration for Children, Youth and Families. Not surprisingly, his publications focus on child abuse and day care issues as they relate to Federal initiatives and information dissemination. Mr. Harrell holds master's degrees from Yale University and the University of Maryland. He has done graduate work in public administration at George Washington University and participated in the Senior Executive Service Candidate Development Program at DHHS.

JEANNE HERB
Research Scientist
Division of Science and Research
New Jersey Department of Environmental Protection

Ms. Herb is currently manager of the Risk Reduction Unit, which focuses on studying technical and policy options for hazardous waste source reduction. She has been with the New Jersey Department of Environmental Protection for three years and during that time has participated in implementing a community Right-to-Know program, assisted in establishing an environmental health assessment program within the Division, and directed the initial activities of the Risk Communication Unit. Ms. Herb holds an M.A. in science and environmental journalism from New York University and a B.S. in environmental science from Rutgers University. Her previous professional experience includes magazine editing and school teaching.

ROGER KASPERSON
Member, Hazard Assessment Group
Center for Technology, Environment, and Development (CENTED)
Clark University

Dr. Kasperson, who holds his Ph.D. from the University of Chicago, is co-author of Participation, Decentralization and Advocacy Planning, and co-editor of The Structure of Political Geography, Water Re-Use and the Cities, Equity Issues in Radioactive Waste Management, and Nuclear Risk Analysis in Comparative Perspective (in press).
He has written widely on issues connected with risk assessment and risk management, nuclear energy policy, and radioactive wastes. For the past seven years, Dr. Kasperson has directed a series of research projects, funded by the National Science Foundation and the Russell Sage Foundation, dealing with technological risk management, industrial management of hazards, and ethical and policy issues involved in occupational safety and health management. His current research projects deal with emergency planning around nuclear power plants, the risk issues and social impacts associated with the siting of radioactive waste repositories, and risk communication. Dr. Kasperson has served as consultant to several public and private agencies on energy and environmental issues. He was a member of the National Research Council's Board on Radioactive Waste Management and chaired its panel on Social and Economic Issues in Siting Nuclear Waste Repositories. He has also been Visiting Senior Scientist at the Beijer Institute in Stockholm, Sweden. Currently, he is on the editorial boards of Environment and Risk Analysis.

MARK KLINE
Research Associate
Environmental Communication Research Program
New Jersey Department of Environmental Protection

Mr. Kline is in the final phases of the doctoral program in clinical psychology at the Graduate School of Applied and Professional Psychology at Rutgers University. He brings experience as an individual, marital, and family therapist in community mental health settings to the fields of risk communication and evaluation. For the past year, Mr. Kline has worked with Caron Chess and Peter Sandman at the Environmental Communication Research Program on a project for the New Jersey Department of Environmental Protection, which involves assessing and recommending "quick and easy" evaluation strategies. His clinical background has been most helpful in assessing attitudes, emotions, and motivations without the benefit of research tools. As a clinician, he is frequently called upon to deal with emotional reactions to difficult interpersonal situations in an empathic and productive manner. This perspective has been of great value in understanding the dilemmas of risk communicators. Mr. Kline will continue his doctoral program as a psychology intern at Dartmouth Medical School in 1988-89.

MAX LUM
Program Manager, Health Education Programs
Agency for Toxic Substances and Disease Registry
Centers for Disease Control

Dr. Lum entered the government as a White House Fellow with the Office of Economic Opportunity. He worked for ten years in the Office of Program Evaluation and Research of the Department of Labor and participated in various evaluation programs for AID and the World Health Organization. He is currently Program Manager for ATSDR's Health Education Programs. Dr. Lum holds a master's degree in public administration, with an emphasis on systems, and a doctorate in education, with a specialty in medical education.

DAVID McCALLUM
Senior Fellow
Institute for Health Policy Analysis
Georgetown University Medical Center

As senior fellow at the Institute for Health Policy Analysis, Dr. McCallum conducts and develops research on health policy and risk communications and directs the Institute's program on risk communication. Formerly, Dr. McCallum served as a senior analyst in the Office of Technology Assessment of the U.S. Congress, where he worked on a study of the impact of technology on aging in America.
He has served in a variety of other governmental and private agencies examining technology, disease prevention, and public health. Dr. McCallum received an M.S. in chemical engineering and a Ph.D. in biomedical engineering from the University of Virginia.

JOHN C. McGRATH
Chief, Communications and Marketing Section
Communication and Public Information Branch
National Heart, Lung, and Blood Institute
National Institutes of Health

Mr. McGrath is currently Chief of the Communications and Marketing Section at the National Heart, Lung, and Blood Institute. He serves as the co-project officer on the Institute's Cardiovascular Risk Factor Public Education Program. His area of emphasis is public communication campaigns. At the Institute he is also responsible for dealing with the press. He has worked with several consulting firms supporting federal public education programs. Mr. McGrath has a master's degree in communications.

LOUIS A. MORRIS
Psychologist and Acting Director
Division of Drug Advertising and Labeling
Food and Drug Administration

In addition to his work at the FDA, Dr. Morris is an Adjunct Professor of Marketing at the American University and teaches part time at the University of Maryland and Johns Hopkins University. Dr. Morris is a graduate of Tulane University and has authored over 75 articles, chapters of books, and reports on topics related to drug information for consumers and health professionals. Recently, Dr. Morris served as a scholar-in-residence at the Center for Marketing Policy Research at the American University.

WILLIAM D. NOVELLI
President
Porter/Novelli

Mr. Novelli is President and Co-founder of Porter/Novelli, lead agency of the Omnicom PR Network, which is part of the Omnicom Group (a global organization of marketing communications agencies). In addition, he is a member of the Communications Planning Board of the National Cancer Institute and a national board member of CARE. Mr. Novelli regularly teaches a marketing management course in the MBA program at the College of Business and Management, University of Maryland. He holds undergraduate and graduate (master's in communication) degrees from the University of Pennsylvania and did post-graduate work at New York University. Mr. Novelli's past experience in marketing and advertising includes positions as marketing manager for Lever Brothers Company and account manager for Wells, Rich & Greene advertising agency. He also served as director of advertising and creative services for both the Peace Corps and ACTION.

MARIA PAVLOVA
National Expert on Toxicology
Office of Toxic Substances
U.S. Environmental Protection Agency

Dr. Pavlova has both a medical degree and a Ph.D. in microbiology and public health. She practiced medicine in Bulgaria, specializing in disease prevention activities. Since coming to the United States in 1969, she has been involved in cancer research and the study of environmentally and occupationally related disease at the University of Massachusetts and the Medical Department of Brookhaven National Laboratory. At Brookhaven, Dr. Pavlova also conducted research on interactions between chemical carcinogens and tumor viruses. At EPA, Dr. Pavlova is Project Coordinator of the EPA Program on Community Needs Assessment and Educational Resource Development related to Emergency Preparedness and Community Right to Know (SARA, Title III).
She is also a member of the Working Group of the Task Force on Environmental Cancer and Heart and Lung Disease and serves as Chairperson of the Interagency Group on Public Education and Communi- cation. In addition to her many presentations and scientific publications, Dr. Pavlova was Program Coordinator of a pilot education and communication program, "Communicating Risks," in Toms River, New Jersey. JAMES L. REGENS Associate Professor of Political Science Associate Director, Institute of Natural Resources University of Georgia Dr. Regens specializes in policy analysis, environmental regulation, and energy policy. His professional activities and honors include: Science Advisory Board, Living Lakes, Inc.; Research Fellowship, North Atlantic Treaty Organization, Scientific Affairs Division, Committee on the Challenges of Modern Society; Acting Director, Center for Science and Public Policy, University of Georgia; Recipient, James E. Webb Award, American Society for Public Administration; Recipient, Bronze Medal for Commendable Service, U.S. Environmental Protection Agency; Chairman, Group on Energy and the Environment, Organization for Economic Cooperation and Development; Joint Chairman, Interagency Task Force on Acid Precipitation; EPA Representative, Committee on Materials of the Office of Science and Technology Policy, Executive Office of the President; Assistant Director for Science Policy, Office of International Activities, U.S. EPA; Senior Technical Advisor to the Deputy Administrator, U.S. EPA; Senior Policy Analyst, Office of Research and Development, U.S. EPA; Public Administration Fellow, National Association of Schools of Public Affairs and Administration; Faculty Research Fellowship, U.S. Department of Energy, Oak Ridge National Laboratory; Visiting Research Fellow, Energy Division, ORNL; Member, State and Local Government Subcommittee of the Committee on Science, Engineering and Public Policy of the American Association for the Advancement of Science. Dr. Regens has also served as consultant to various public and private groups and is author/coauthor of over 50 publications and 40 conference presentations. MARILYN RICE Regional Advisor in Health Promotion, Health Education and Community Development Pan American Health Organization/World Health Organization Ms. Rice has extensive experience in designing, initiating, and executing local, national and international public health, primary health care, and health promotion and ------- 268 Participants education programs. With fluency in Spanish, French, and Portuguese, Ms. Rice has provided political and technical leadership in the development of health promotion and education programs for the thirty-eight countries in the Americas including a plan of action for women, which she developed and promoted. In addition she has published guidebooks, newsletters, and technical health manuals for national and international distribution. Ms. Rice holds a master's degree in health education from Columbia University and a B.A. in English and sociology from the University of Wisconsin. ROSE MARY ROMANO Chief, Information Projects Branch Office of Cancer Communications National Cancer Institute Ms. Romano is a graduate of Manhattanville College, with a master's degree in community health education from New York University. As a Public Health Educator at the National Cancer Institute, she is responsible for designing, implementing, and evaluating programs to reach the public and professionals with cancer information. 
She has presented numerous workshops on market research and evaluation. She serves as a marketing and promotional resource, organizing national conferences and teleconferences and providing communication consultation within NCI and to outside agencies, organizations, and groups. Ms. Romano also serves on the American Red Cross Corporate Communication Advisory Committee.

PETER H. ROSSI
S.A. Rice Professor of Sociology and Research Associate
Social and Demographic Research Institute
University of Massachusetts at Amherst

Dr. Rossi has extensive experience as a social science researcher in addition to his responsibilities in the classroom as S.A. Rice Professor of Sociology. Selected recent publications include the Handbook of Survey Research; Evaluation: A Systematic Approach; "The Iron Law of Evaluation and Other Metallic Rules" in Research in Social Problems and Public Policy; Armed and Considered Dangerous: A Survey of Felons and Their Firearms; The Condition of the Homeless of Chicago; and "Homelessness: The Nature and Origin of the Problem" in Homelessness and Health. He has received the Common Wealth Award for contributions to sociology and awards from the Evaluation Research Society for technical contributions to evaluation research. Dr. Rossi has taught at Harvard, the University of Chicago, and Johns Hopkins and has been Director of the National Opinion Research Center at the University of Chicago. He has served as President of the American Sociological Association, as Editor of the American Journal of Sociology and of Social Science Research, and as a Fellow of the American Academy of Arts and Sciences.

MILTON RUSSELL
Professor of Economics and Senior Fellow
Energy, Environment and Resources Center
Waste Management Research and Education Institute
The University of Tennessee

In addition to his present work in academia, Dr. Russell is Senior Economist at the Oak Ridge National Laboratory. Prior to that, he was Assistant Administrator for Policy, Planning and Evaluation, U.S. Environmental Protection Agency. During his years at EPA, Dr. Russell wrote and presented extensively on issues relating to the economic implications of environmental protection and the role of risk management and assessment in environmental policy-making. Dr. Russell holds both an M.A. and a Ph.D. in economics and was Professor of Economics for many of the 18 years that he taught the subject. Dr. Russell has served on numerous energy and economics advisory committees over the years, as well as holding the positions of Senior Fellow and Director, Center for Energy Policy Research, Resources for the Future; Senior Staff Economist, Council of Economic Advisers; and Staff Economist, Federal Power Commission.

JUDITH A. SHAW
Research Scientist
Division of Science and Research
New Jersey Department of Environmental Protection

Ms. Shaw is in her second year with the New Jersey Department of Environmental Protection. She currently manages the Risk Communication Unit, which focuses on developing communication models and assisting in the integration of risk communication strategies into overall management and practice within the NJDEP. Her previous professional experience includes education, community organizing, and public relations. Ms. Shaw holds an M.A. in education and community development from the University of Michigan, a B.S. in elementary education from the University of North Dakota, and a B.A. in zoology/sociology from Indiana University.

SHELAGH A.
SMITH Public Health Educator/Evaluator Office of Cancer Communication National Cancer Institute National Institutes of Health In her present position at NCI, Ms. Smith is primarily responsible for designing and monitoring evaluation of the Office of Cancer Communication's mass media programs in cancer prevention and patient education. This includes developing guidelines for pretest- ing and interviewing, as well as evaluation of programs through surveys, case studies, pilot ------- 270 Participants programs, and focus groups. In addition to responding to public inquiries regarding NCI survey data, marketing and communications research, and tobacco education materials, Ms. Smith serves as a liaison with other government as well as non-government groups. Ms. Smith was previously employed at the Health Care Financing Administration in Baltimore, Maryland, as a social science research analyst in the Office of Research and Demonstrations, Division of Health Services and Special Studies. She has given numerous presentations on issues ranging from funding preventive services to public knowledge of such illnesses as cancer and sexually transmitted diseases. Ms. Smith received her B.S. in education from the University of Tennessee and her M.P.H. in health services adminis- tration from Johns Hopkins School of Hygiene and Public Health. MILDRED Z. SOLOMON President, Solomon Associates A specialist in the design and development of health communications, Ms. Solomon has over 12 years' experience in developing educational programs for use in diverse settings, including schools, community organizations, hospitals, and clinics, on subjects as diverse as nutrition, drug abuse, stress, occupational health, injury control, and sexually transmitted diseases. She is particularly committed to designing health education interventions that result in measurable behavior changes as well as changes in knowledge and attitudes. Ms. Solomon is currently a doctoral candidate in human development at Harvard University. She has also taken graduate courses in filmmaking and is a producer of award-winning health education audiovisual materials. JAMES W. SWINEHART President, Public Communication Resources, Inc. In his present position, Dr. Swinehart assists various organizations in planning, producing, and evaluating mass media programs or campaigns. Prior to that he was Director of Research for a 24-program television series on health broadcast nationally by PBS. His major professional interest is planning, production, and evaluation of public service communication programs (social psychological approaches to communication and influence and the use of audience research to develop and appraise media campaigns). At the University of Michigan, where he received a Ph.D. in social psychology, Dr. Swinehart held faculty appointments in the Survey Research Center, School of Public Health, and Highway Safety Research Institute. His publications pertaining to evaluation include titles such as "News about Science: Channels, Audiences, and Effects," "Creative Use of Mass Media to Affect Health Behavior," and the "Feeling Good" series. He has been involved in producing and evaluating TV and radio spots, TV programs, films, print ads, and supplementary materials for many campaigns. ------- Participants 271 NANCY ZAHEDI Program Analyst, Program Evaluation Division Office of Policy, Planning and Evaluation U.S. Environmental Protection Agency Ms. 
Zahedi received a bachelor's degree from Stanford University and a Master of Public Policy from the John F. Kennedy School of Government. Before assuming her current position in the Program Evaluation Division at EPA, she served as a Peace Corps volunteer for two years and worked for the Save the Children Federation as a Planning and Evaluation Coordinator for two years.

-------

INDEX

Audience analysis. See also Needs assessment; Surveys
   Overview of methods 50-53
   Using data 154
Audience Analysis Matrices 51
Audience Information Needs Assessment 51
Audience segmentation 35, 36, 68, 74, 139
Audience motivations 83
Bounce-back cards 155, 157
Broadcast Advertisers Reports (BAR) 155
Cancer Information Service (CIS) 183
Causality 208-209
Central location intercept interviews. See Intercept interviews
CIBA-GEIGY Corporation, Toms River (NJ) Plant 147, 181
Communication Style Survey 58
Communicator assessment
   Overview of methods 57-58
Comparisons. See also Study design; Impact evaluation
   Comparison groups 23, 234-235
   Overview of methods 240
Concept development 35; See also Message design
Concept testing 138, 154
Conflict Management Survey 58
Consultant services 103
   Academic partnerships 103
   Criteria for selecting 104
   Directories 108
Cost-effectiveness 247-248
Data. See also Needs assessment; Study design; Surveys
   Importance for planning 27
   Sources 152, 158, 217-218
   Use in message development 153
Eat for Health program 138
ENVIRON Corporation 148
Environmental changes
   Role in risk reduction 65
Environmental Protection Agency (EPA) 163, 165
   Office of Toxic Substances 175
   Region II 181
Ethical issues
   Code of ethics 9
   Duty to inform public 6
   Elitism 7
   Individual rights 5, 6
   Manipulation vs. deception 7
   Self-inflicted illness 5
Evaluability assessment 207, 231-233
   Goal-free evaluation 233
Evaluation
   Benefits of xii-xiii, 89, 93, 99, 135, 137
   Costs 104-107
   Criteria 206
   In relation to policy xiii-xiv, 206, 213, 215-216
   Interpreting findings 28, 117, 128, 155
   Levels of 21, 66, 102, 117, 137, 213, 227
   Obstacles xiii-xvi, 18, 38, 47, 100, 139
   Social context of 196
   Timeframes 117
   Using results xvi, 38
Fear in messages 66, 77
Fetal alcohol syndrome 159
Field review (by experts) 155, 163
Field testing 181, 226; See also Pilot testing
Focus groups 13, 56, 108, 138-139, 154, 157, 173-179
Food and Drug Administration (FDA) 171
Formative evaluation 12, 21, 25-26, 33-37, 99-101, 138, 163; See also Focus groups; Concept testing; Pretesting
Generalizability of findings 209
Goals and objectives xiv-xv, 26, 27, 115, 143
Health Belief Model 69
Health Objectives for the Nation 91
Impact evaluation 15, 27-28, 46, 232-237; See also Summative evaluation
Individual interviews 13
Intercept interviews 55-56
Intermediaries 68
   Local groups 145, 149
Interpersonal communication 35, 58, 59
   Door-to-door campaign 149
   Vs. mass media 83-84
Lead 163
Legislation
   Role in risk reduction 65
Love Canal 191
Marketing research 41
   Directories 108
Maryland Department of the Environment 165
Mass media. See also Public service announcements
   Selecting 78-79
   Types 79-80
   Vs. interpersonal communication 83
Materials development 84-86
   Stages of evaluation 157
Measurement error 208
Meeting evaluations
   Overview of methods 59-60
Meeting Reaction Form 59
Message design
   Cognitive dissonance 68
   Principles of 76-79, 84
   Use of fear 66, 77
   Using data 154
Midcourse reviews 97
Myers-Briggs Type Indicator 57
National Cancer Institute (NCI) 137
   Cancer Prevention Awareness Program 173
National Cholesterol Education Program 151, 177
National Heart, Lung, and Blood Institute (NHLBI) 151, 177, 187
National High Blood Pressure Education Program 151, 187
Needs assessment 12, 35, 138, 217-220; See also Audience analysis
   Forecasting 220
   Qualitative 219
   Sources of data 217-218
New Jersey Department of Environmental Protection (NJDEP) 141
News clippings (for audience analysis) 52
NHLBI Smoking Education Program 151
Observation and Debriefing 60
Ocean County Citizens for Clean Water 150
Office of Management and Budget (OMB) 127, 129
Outcome evaluation 14-15, 20, 99, 101, 139, 160; See also Summative evaluation
Pilot programs 26
Pilot testing 14, 140
Planning tools 50, 77
Policy Profiling Questionnaire 50
Pollstart 53
Pretesting 12-13, 223-224; See also Focus groups; Intercept interviews; Readability; Theater testing
   Overview of methods 13, 53-56
Process evaluation 14, 20, 37-38, 93-96, 99, 138
   Delivery of program 224, 227-230
   Fiscal accounting 231
   Using results 39
Psychographic survey 139
Public opinion poll. See Surveys
Public Opinion Polling (software) 52
Public service announcements 155-157
Qualitative research 23, 219-220; See also Focus groups; Pretesting
Quantitative research 23; See also Data; Surveys; Study design
Radon testing 121, 163, 165
Readability 13, 54
   SMOG formula 54
Recommendations of Workshop xvi-xvii
Reye's syndrome 171
Rightwriter (software) 54
Risk assessment 144
   Integral to risk communication
Sampling 211; See also Study design; Surveys
   Probability samples 211
   Random sampling 211, 235-237
Self tests 13
Signaled Stopping Technique 55
Speech Evaluation Checklist 60
Strength Deployment Inventory 57
Study design 237-247; See also Comparisons; Surveys
   Cross-sectional surveys 245-246
   Before-and-after studies 245
   Judgmental assessments 246-247
   Panel studies 244-245
   Quasi-experiments 243-244
   Randomized experiments 241-242
   Regression-discontinuity studies 242
   Time-series designs 242-243
Summative evaluation 26-29, 38, 46, 163; See also Outcome evaluation; Impact evaluation
Superfund 141, 147, 181
Surveys. See also Study design
   As outcome measure 155, 160, 166, 183-185
   Cross-sectional 246-247
   For audience analysis 152-156, 183-185, 218-219
   Limitations 166-167
   Questionnaire design 172
   Using survey data 104-107, 167, 176
Theater testing 13, 56
U.S. Council for Energy Awareness (USCEA) 169
Union Lake, NJ 141
Validity 212
   Construct validity 207
   External validity 210
   Internal validity 209
   Statistical conclusion validity 210-212
Verbal Meeting Feedback 59
Vineland Chemical Company 141

-------