EPA OIG PROGRAM EVALUATION PILOT TEAM TRAINING
January 16-18, 2001
808 17th St NW
Suite 400
National Center for Environmental Assessment
Washington, DC 20006
AGENDA
Tuesday, January 16, 1:00 PM - 5:00 PM
1:00 - 1:30 PM   Welcome, Introductions, Agenda   Rick
1:30 - 2:30 PM   "What's Up With This?"   Rick
2:30 - 2:45 PM   Break
2:45 - 4:45 PM   The Basics of Program Evaluation   Emmalou
4:45 - 5:00 PM   Wrap-up   Emmalou
Wednesday, January 17, 8:30 AM - 5:00 PM
8:30 - 10:15 AM   Logic Modeling   Emmalou
10:15 - 10:30 AM   Break
10:30 - 11:45 AM   The Pilot Process   Emmalou w/Connie, Dale, Art, Rick
11:45 AM - 1:00 PM   Lunch
1:00 - 2:30 PM   IT   Ernie, Stephanie, Yvonne
2:30 - 2:45 PM   Break
2:45 - 5:00 PM   The Pilot Process   Emmalou w/Connie, Dale, Art, Rick
7:00 PM   Dinner
-------
Thursday, January 18, 8:30 AM - 1:00 PM
8:30 - 9:30 AM   The Pilot Process   Emmalou, Rick, Consultants
9:30 AM - 12:00 Noon   Team Meetings w/Facilitation   Emmalou, Rick, Consultants
12:00 - 1:00 PM   Wrap-up   Rick, Emmalou
1:00 PM   Adjourn
-------
The Basics of Program Evaluation
Emmalou Norland
What do you plan to gain...
• professionally from this training?
• professionally from participating in the pilot?
What We'll Do This Afternoon
• Share a little about yourselves and
your expectations
• Define program evaluation more fully
• Identify the links between program
development and evaluation
• Learn about the 'profession' of program
evaluation
• Distinguish program evaluation from
other similar processes
What do you have to contribute?
• Team skills
• Program knowledge
• Evaluation knowledge
• Special interests
• Special prior experiences
• Other important
contributions
-------
What is Program Evaluation?
• What is your definition?
• What is a definition from a reference book?
• So... What does it involve?
• and... Why do we do it?
• and... For whom is it done?
• and... What is the target of program evaluation?
And... Why do we evaluate?
• To improve the program - FORMATIVE EVALUATION is done to help form or reform a program
• To prove the program - SUMMATIVE EVALUATION is done to sum up the program's accomplishments
• BOTH are done for decision-making: Formative - Change? Summative - Keep or Kill?
So... What DOES program evaluation involve?
• collaborating
• questioning
• planning
• information-gathering
• information-analysis
• communicating
• interpreting*
• judging*
• decision-making*
And... Why Else?
• Postponement
• Ducking Responsibility
• Window Dressing
• Public Relations
• Requirement
-------
Formative Evaluations
• Needs Assessments
• Design Evaluations and Evaluability Assessments
• Implementation Evaluations (Performance Audits)
• Some Outcome Evaluations
And... For whom is program evaluation done?
• Stakeholders of the program who have information gaps about the activities, characteristics, and outcomes of it (they need information to make decisions)
• For EPA OIG: Congress, the Agency, the Public, the Regulated Community, other Agencies, States,...
Summative Evaluations
• Most Outcome Evaluations
• Impact Evaluations
• Cost-benefit and Cost-effectiveness Analyses
• Meta-analysis
And... What is the target of program evaluation?
• a program (as opposed to policies, personnel, or other 'objects')
• "an organized set of activities that are managed toward a particular set of goals for which the program can be held separately accountable" (Kirchner, 2000)
• "a general effort that marshals staff and projects toward some (often poorly) defined and funded goals" (Scriven, 1991)
-------
What are the levels of 'programs' we can evaluate in EPA?
• Objective
• Sub-objective
• Project
• Other:
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Should we have a program? - Before developed: Needs Assessment
• Is the program designed to work? - Before conducted: Design Evaluation
• Is the program implemented as designed? - During: Implementation Evaluation
• What outcomes are being achieved? - During: Outcome Evaluation
What actually comprises a program?
• Inputs - resources used to accomplish certain activities
• Activities - the activities that produce products or services for customers
• Outputs - products or services for customers
• Outcomes - customer changes, environmental changes, environment/human health changes
• Externalities - contextual influences
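The five components above form a simple data structure. As a purely illustrative sketch (not part of the original training materials; the class name, field names, and example values are all assumptions), a program logic model could be recorded like this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogicModel:
    """Sketch of the program components named on the slide above."""
    inputs: List[str] = field(default_factory=list)         # resources used to accomplish activities
    activities: List[str] = field(default_factory=list)     # work that produces products or services
    outputs: List[str] = field(default_factory=list)        # products or services for customers
    outcomes: List[str] = field(default_factory=list)       # customer/environmental/health changes
    externalities: List[str] = field(default_factory=list)  # contextual influences on the program

# Hypothetical example: a facility inspection program.
inspections = LogicModel(
    inputs=["inspector hours", "travel budget"],
    activities=["conduct facility inspections"],
    outputs=["inspection reports"],
    outcomes=["improved compliance", "reduced emissions"],
    externalities=["state enforcement climate"],
)
print(inspections.outcomes)
```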
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Are outcomes caused by the program? - During and after the program: Impact Evaluation
• Were the benefits worth the cost? - After: Cost-benefit Analysis
-------
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Could program have been conducted more cost-effectively? - After: Cost-effectiveness Analysis
• What are the big picture findings and implications? - After a series of evaluations: Meta-analysis
Standards of Practice
• Joint Committee Program Evaluation Standards: Utility, Feasibility, Propriety, Accuracy
• Yellow Book - GAO Auditing
• Turquoise Book (PCIE)
• AEA Guiding Principles
Evaluation is a Profession
• persons with specialized
knowledge and skills
• unique body of content
• preparation programs
• stable career opportunities
• working on certification
• professional associations...
• ...which influence preparation programs
• standards
Evaluation and Similar Processes
• Evaluation is the broad category of processes which gather and share information with stakeholders to use in decision-making (focus on users and use)
• Would also include auditing, investigation, monitoring, assessment
• Research is different from evaluation in that research questions are not necessarily targeted by specific information users for use
-------
Are Auditing and Evaluation Different?
• Activities - No / Yes
• Scope - No / Yes
• Types of Questions - No / Yes
• Resources Needed - No / Yes
• Purpose - No / Yes
• Stakeholders - No / Yes
• Utilization - No / Yes
• Results - No / Yes
Can't Wait Until Tomorrow!
(Wednesday, January 17)
• Starting Time: 8:30 AM
• Location: Here!
• Topics: Basic Steps in Program Evaluation; Logic Modeling; Pilot Process; Assistance for the Teams
• Announcements:
What We Did This Afternoon
• Shared a little about ourselves and
expectations
• Defined program evaluation more fully
• Identified the links between program
development and evaluation
• Learned about the 'profession' of
program evaluation
• Distinguished program evaluation from
other similar processes
-------
EVALUATION IS...
1. My definition of evaluation is...
2. A definition of evaluation from a reference book is...
3. Here are some common definitions:
the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness, and/or inform decisions about future programming (Patton, 1997)
determining the extent to which a program has achieved its goals. (What about implementation, program processes, unanticipated consequences, long-term impacts?)
determining the worth, merit, or value of something - program, product... (Using what evidence? For what purposes?)
the systematic assessment of the operation and/or the outcomes of a program or policy, compared to a set of explicit or implicit standards, as a means of contributing to the improvement of the program or policy (Weiss, 1998)
4. What is OPE's definition?
-------
Steps in Program Evaluation
Fits Most Evaluation Needs
1. Identify Key Stakeholders - Gather
their Questions
• Congress
• The Agency
• Other Agencies
• The Regulated Community
• The States
• The Public
• Results of Design Evaluation
Steps
1. Identify Key Stakeholders and Questions
2. Assemble an Evaluation Team
3. Identify Information Needs and Plan Data Collection
4. Collect Data
5. Analyze Data
6. Develop Findings
7. Draw Conclusions and Make Recommendations
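Because the seven steps above run in a fixed order, they can be sketched as an ordered checklist. The following minimal Python sketch is an illustration only; the step names come from the slide, while the helper function is an assumption:

```python
from typing import List

# The seven steps from the slide, in order.
EVALUATION_STEPS: List[str] = [
    "Identify key stakeholders and questions",
    "Assemble an evaluation team",
    "Identify information needs and plan data collection",
    "Collect data",
    "Analyze data",
    "Develop findings",
    "Draw conclusions and make recommendations",
]

def next_step(steps_completed: int) -> str:
    """Return the next step, given how many steps are already done."""
    if steps_completed >= len(EVALUATION_STEPS):
        return "Evaluation complete"
    return EVALUATION_STEPS[steps_completed]

print(next_step(0))  # -> Identify key stakeholders and questions
```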
(If there are questions for an evaluation)... 2. Assemble a Team to Guide and Conduct the Evaluation
• Evaluation
• Subject Matter
• Agency
• IG Process
• Facilitator
-------
3. Identify Information Needs and
Plan the Data Collection
• What are the information needs?
• Are the data available (good, accessible...)?
• Do additional data have to be gathered?
• Using what instruments and processes?
• Using what resources?
• Are there alternatives?
4. Gather Data
• if available... gather
• if not, design and test instruments, gather data
• OR contract out
5. Analyze
• Inductive process for Qualitative Data (interviews, focus groups, some observational processes)
• Deductive process for Quantitative Data (questionnaires, tests, environmental measures and monitoring data, demographics, program descriptives)
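On the deductive, quantitative side of step 5, a typical first pass is descriptive statistics. A minimal sketch, with made-up questionnaire scores standing in for real survey data:

```python
import statistics

# Made-up questionnaire responses on a 1-5 scale (illustrative data only).
scores = [3, 4, 4, 5, 2, 4, 3, 5, 4, 4]

print("n    =", len(scores))               # sample size
print("mean =", statistics.mean(scores))   # central tendency
print("sd   =", statistics.stdev(scores))  # spread (sample standard deviation)
print("mode =", statistics.mode(scores))   # most frequent response
```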
6. Develop Findings
• if needs assessment... "The program fits in these ways"
• if design evaluation... "The program appears like it can be implemented and can successfully reach these outcomes"
• if implementation evaluation... "The program is being implemented in this way"
-------
6. Develop Findings (continued)
• if outcome evaluation... "There are these outcomes, anticipated and unanticipated"
• if impact evaluation... "The program is causing these outcomes"
• if combination...
7. Draw Conclusions and Make Recommendations
• Conclusions are based on the findings and some pre-identified criteria: "The program is being implemented in this way, thus we conclude that..."
7. Draw Conclusions and Make Recommendations (continued)
• Recommendations are based on findings and conclusions of this evaluation, as well as other evaluations and studies: "Based on these findings and conclusions, as well as other audit results, we recommend..."
-------
TYPES OF EVALUATION
When, in the life of a program, does evaluation take place (before, during, after), and how are the results of that evaluation used to make decisions about the program (mid-course correction; continuing, expanding, or institutionalizing the program; cutting, ending, or abandoning the program; testing a new program idea; choosing the best of several alternatives; deciding continued funding)?
1 "Formative evaluations strengthen or improve the object being evaluated-they help form
it by examining the delivery of the program or technology, the quality of its
implementation, and the assessment of the organizational context, personnel, procedures,
inputs, and so on " Trochim, 2000
needs assessment
design evaluations.
evaluabihty assessment'
implementation evaluation.
2 "Summative evaluations, in contrast, examine the effects or outcomes of some
ob|ect-they summarize it by describing what happens subsequent to delivery of the program
or technology; assessing whether the object can be said to have caused the outcome,
determining the overall impact of the causal factor beyond only the immediate target
outcomes, and, estimating the relative costs associated with the object" Trochim, 200O
outcome evaluations
impact evaluation-
cost-benefit analysis and cost-effectiveness analysis
meta-analysis:
3. Other reasons for evaluation: postponement, ducking responsibility, window dressing, public relations, fulfilling someone's requirements.
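The formative/summative grouping in items 1 and 2 above is a straightforward classification. As a minimal sketch (the groupings follow the handout text; the dictionary and lookup function are assumptions):

```python
# Evaluation types grouped by purpose, following the handout text.
EVALUATION_TYPES = {
    "formative": [
        "needs assessment",
        "design evaluation / evaluability assessment",
        "implementation evaluation",
    ],
    "summative": [
        "outcome evaluation",
        "impact evaluation",
        "cost-benefit / cost-effectiveness analysis",
        "meta-analysis",
    ],
}

def purpose_of(evaluation_type: str) -> str:
    """Return 'formative' or 'summative' for a known evaluation type."""
    for purpose, types in EVALUATION_TYPES.items():
        if evaluation_type in types:
            return purpose
    raise KeyError(evaluation_type)

print(purpose_of("impact evaluation"))  # -> summative
```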
-------
WHAT ARE WE EVALUATING?
1. What is a program?
"The term 'program' refers to an organized set of activities that are managed toward a particular set of goals for which the program can be held separately accountable." (Kirchner, 2000)
"The general effort that marshals staff and projects toward some (often poorly) defined and funded goals." (Scriven, 1991)
"...I will call a national program, like Head Start or Superfund environmental cleanup, a program. The local operations of the program are each projects. Thus, the Head Start that is operated [locally] is a project. An element of the [local] Head Start project, like involving parents through weekly meetings, is a component. Evaluations can be directed at any of these levels. We can evaluate national programs, local projects, or sub-project components." (Weiss, 1998)
2. What are the various levels of 'program' we might evaluate?
3. What comprises a program?
-------
EVALUATION AS A PROFESSION
Refer to articles in: Altschuld, James W., and Molly Engle (eds.). (1994). New Directions for Program Evaluation, No. 62. San Francisco, CA: Jossey-Bass.
Worthen, Blaine. "Is Evaluation a Mature Profession That Warrants the Preparation of Evaluation Professionals?"
Mertens, Donna M. "Training Evaluators: Unique Skills and Knowledge"
Kingsbury, Nancy, and Terry E. Hedrick. "Evaluator Training in a Government Setting"
What is a Profession? Is Evaluation There Yet?
□ 1. It has persons with specialized knowledge and skills.
□ 2. There is a developed body of content unique to its area of specialization.
□ 3. There are preparation programs designed to produce practitioners who are well qualified in the unique knowledge and skills.
□ 4. Stable career opportunities have emerged for such well-qualified practitioners.
□ 5. The specialization has developed procedures for the certification or licensure of those judged qualified to practice it.
□ 6. There are associations devoted to furthering the professional development of its practitioners.
□ 7. There are criteria for determining membership in such associations.
□ 8. The relevant professional associations influence the preparation programs.
□ 9. The specialization has developed standards to guide those who practice it.
Who conducted evaluation before evaluators?
Accountants and auditors, management consultants, planning and systems analysts,
economists, research/product development, test marketing specialists, academics in social and
behavioral science.
Knowledge and Skills Associated with Evaluation:
Research Methodology, Project Management, Strategic Planning, Auditing, Program
Development, Communication Skills, People Skills, Negotiation Skills, Personal Skills
(credible, good judgment...), Cross-cultural Skills, Policy Analysis, Valuing, Economics,
Specific to discipline (education, psychology, health, business, government, environment)
-------
...the American Evaluation Association; the AEA's past, current, and incoming presidents (David Cordray, David Fetterman, and Karen Kirkhart, respectively); Arnold Love, president of the Canadian Evaluation Society (CES); the regional CES presidents; Kathy Jones of the CES; Gary Cox of the University of Washington; the National Center for Science Teaching and Learning at The Ohio State University; HealthEast of St. Paul, Minnesota; and the University of Alabama School of Medicine for assistance with preparation of the directory.
The editors of this volume express their deep gratitude to those who participated in the study. Without them, it would not have been possible.
James W. Altschuld
Molly Engle
Editors

JAMES W. ALTSCHULD is associate professor of educational research and evaluation and evaluation coordinator for the National Center for Science Teaching and Learning at The Ohio State University. His research interests include evaluation models and methodology, needs assessment, and the development of evaluation training programs.

MOLLY ENGLE is an assistant professor in the Behavioral Medicine Unit, Division of Preventive Medicine, Department of Medicine at the University of Alabama at Birmingham School of Medicine. She designs, implements, and conducts research and evaluations in behavioral medicine and community-based health services.
Criteria for judging the maturity of any profession are applied to evaluation. Special attention is paid to the question of whether programs for the preparation of evaluation specialists are warranted.
Is Evaluation a Mature Profession That
Warrants the Preparation of Evaluation
Professionals?
Blaine R. Worthen
There is wide agreement that evaluation is an important professional specialization, but there is less certainty as to whether it has yet attained the status of a distinct profession. To answer this question, I propose that a fully developed profession has at least nine characteristics, and I will discuss these characteristics in the context of the need for preparation. Worthen and Sanders (1991) advanced six of these criteria in their discussion of trends in educational evaluation, and portions of this chapter draw on that earlier work.

First, a fully developed profession needs persons with specialized knowledge and skills. Second, it has developed a body of content (knowledge and skills) unique to its area of specialization. Third, the profession has developed preparation programs designed to produce practitioners who are well qualified in the unique knowledge and skills. Fourth, stable career opportunities have emerged for such well-qualified practitioners. Fifth, the specialization has developed procedures for the certification or licensure of those judged qualified to practice it. Sixth, the specialization has developed associations devoted to furthering the professional development of its practitioners. Seventh, the specialization has developed criteria for determining membership in such associations. Eighth, the relevant professional associations influence the preparation programs. Ninth, the specialization has developed standards to guide those who practice it.

A simple status check on each of the nine criteria proposed would be one way of judging how far evaluation has moved toward attaining the characteristics of a full-fledged profession. However, the maturation of evaluation
toward the status of a profession can better be understood by considering the forces that have shaped it across the past thirty years. Although space will not permit me to say much about the historical emergence and evolution of evaluation, I will sketch some portions of the historical backdrop when it helps me to clarify the current status of evaluation on the nine criteria proposed.
Need for Evaluation Specialists
Although there were a few embryonic efforts to evaluate public programs prior to 1960 (Shadish, Cook, and Leviton, 1991; Worthen and Sanders, 1987), most commentators believe that contemporary evaluation of educational and social programs first emerged during the 1960s. Early in that decade, the U.S. Congress passed federal legislation that, in authorizing antipoverty, juvenile delinquency prevention, and manpower development and training programs, both required program evaluation and allocated funds for it (Wholey, 1986; Weiss, 1987). Yet the emphasis on evaluation built into the Elementary and Secondary Education Act (ESEA) of 1965 dwarfed previous efforts to mandate the use of evaluation. Broad in scope, the ESEA provided large-scale funding for education that allowed tens of thousands of federal grants to be awarded to local schools, state and regional education agencies, and universities. Due largely to the efforts of Robert F. Kennedy, the ESEA required the recipients of grants dealing either with compensatory education for disadvantaged youth or with innovative educational projects (the great majority of grants) to file an evaluation report showing what had resulted from the expenditure of public funds.

Overnight, thousands of educators were required to evaluate their own efforts. Few were up to the task. Classroom teachers and building principals were among those pressed into technical activities for which they had little training. The results were abysmal. And when well-trained educational, psychological, or sociological researchers were called in to help, the results were, surprisingly, not much better. Despite their technical prowess, these researchers were not prepared for the complex tasks of identifying the influences that could be attributed to each of several components of a program or even of separating the effects of the program from other activities going on in the school. Clearly, new evaluation approaches, methods, and strategies were needed.

Meanwhile, areas outside education were experiencing increased demands for evaluation, although it was often called by other names. By the late 1960s, Congress had authorized monies for evaluation of social programs in areas as diverse as the Job Corps, vocational rehabilitation, child health, and community action. Managers of the projects and programs funded under such social legislation searched to find the individuals best equipped to fill the newly created evaluation roles. Faced with an absence of persons trained directly in evaluation, they employed people trained for roles that contained some evaluative
functions: professional accountants and auditors, management consultants, planning and systems analysts, economists, research, product development, and test marketing specialists from the private sector, and academics in areas relevant to the collection and analysis of evaluative information (Shadish, Cook, and Leviton, 1991).

The evaluations conducted by these persons were little better than those conducted by the classroom teachers and educational psychologists who had been pressed into service as evaluators on federally funded education projects. While most of those drafted or recruited into evaluation roles were very skillful in some of the tasks required of evaluators, few were even aware of the broad range of tasks that were essential for a complete and adequate evaluation. Fewer still possessed the skills that one must have in order to complete those tasks. The need for persons with a new constellation of specialized skills was evident to any insightful observer.

Today the need for evaluation specialists is generally accepted, although many policy makers and program managers who are naive about the knowledge and skills that evaluators should possess still attribute evaluation expertise to self-appointed or self-anointed "evaluators" who lack essential evaluation skills and knowledge. Despite the frequent lapses when evaluators are selected, that there is a need for evaluation specialists seems to be well established.
Development of Unique Content
When demands for evaluation increased dramatically in the 1960s, the resulting evaluation studies revealed the conceptual and methodological impoverishment of evaluation as it then existed. Theoretical and methodological work related directly to evaluation did not exist, and evaluators were left to gather what they could from theories in cognate disciplines and to borrow what they could from the methodologies developed in such fields as experimental design, psychometrics, survey research, and ethnography. The results were disappointing and underscored the need for the development of new conceptualizations and methods tailored to fit the needs of evaluators more precisely. Scholars responded to this need, and by 1970 important seminal writings had provided conceptual foundations and scaffolding for the young field of evaluation (Cronbach, 1963; Scriven, 1967; Stake, 1967; Stufflebeam, 1968). Books of readings on evaluation were published (Caro, 1971; Worthen and Sanders, 1973). Articles about evaluation appeared with increasing frequency in professional journals. Together, these publications resulted in a proliferation of new evaluation models that collectively provided new ways of thinking about evaluation. This emerging body of literature showed evaluation to be a multidimensional technical and political enterprise that required both new conceptualizations and new insights into the ways in which methodologies borrowed from other fields could be used appropriately.

In recognizing the need for unique theories for evaluation, Shadish, Cook,
and Leviton (1991, p. 31) noted that, "as evaluation matured, its theory took on its own special character that resulted from the interplay among problems uncovered by practitioners, the solutions they tried, and traditions of the academic discipline of each evaluator, winnowed by twenty years of experience."

Publications focusing exclusively on evaluation appeared in the 1970s. They included such journals and series as Evaluation, Evaluation and Program Planning, Evaluation Practice, Educational Evaluation and Policy Analysis, New Directions for Program Evaluation, and the Evaluation Studies Review Annual. The number of books published expanded markedly in the second half of the 1970s and throughout the 1980s. Textbooks, reference books, and even compendia and encyclopedias of evaluation all appeared. Clearly, the necessary conceptual underpinnings of a profession are accumulating in a body of evaluation literature that is arguably unique. Thus, evaluation seems to qualify as a profession on the second criterion. There is a body of knowledge that outlines the content of the field and its unique (or adapted) theories, strategies, and methods.
Programs for the Preparation of Evaluators
Foreseeing that education had few persons trained in educational inquiry skills, the U.S. Congress funded graduate training programs in educational research and evaluation in 1965. These programs included fellowship stipends for graduate study in these new specializations. Several universities launched full-fledged, federally funded graduate programs aimed at training educational evaluators. When federal funds disappeared, so did many of the graduate programs that they had supported. In 1971, graduate programs for training evaluators existed at more than a hundred American universities (Worthen and Byers, 1971). Fifteen years later, only forty-four U.S. universities had such programs (May, Fleischer, Scheirer, and Cox, 1986). And many programs that had had many courses in evaluation scaled back to a single elective course.

The evaluation preparation programs that continued generally offered training tailored to fit the reconceptualized views of evaluation that were emerging. Notions of how evaluators should be trained gradually expanded beyond traditional training. Courses in research design, statistics, and measurement were often supplemented by a wide variety of applied methods and techniques courses in such areas as naturalistic observation, interviewing techniques, content analysis, performance assessment, and communication and writing skills. Evaluation internships, assistantships, and practica became more central in preparation programs as evaluation mentors realized that, in evaluation as elsewhere, the best training is often apprenticeship training.

In recent years, the training of evaluators has increasingly been relocated to nonacademic settings. In-service evaluation training for practitioners is often offered in schools, state agencies, and businesses. On occasion, large
corporations have established corporate training centers (such as Xerox Document University). These centers, which resemble miniuniversities, provide training in evaluation along with other techniques. Some of these centers award certificates attesting that the recipient is qualified in the specialization in which he or she has been trained. And, of course, many persons follow serendipitous career paths into evaluation roles where their preparation consists primarily of on-the-job bootstrapping. On balance, there are a sufficient number of evaluation training programs in universities, government agencies, corporations, and other settings to produce an ongoing supply of professional evaluators. It seems clear that evaluation meets this third criterion for having reached the status of a profession.
Stable Career Opportunities for Evaluators
One sign that a specialization is a profession is a continuing need for the services of personnel trained in that specialty. No field that is only a fad that flourishes briefly and then fades would qualify as a profession. In judging evaluation on this dimension, we must consider whether, despite uncertain social and economic trends, evaluation provides the stable employment opportunities that are typical of mature professions.

At first, evaluation seemed to be just another boom-and-bust specialty. When the need for evaluators grew quickly between 1965 and 1975, it seemed that evaluation training could provide stable career opportunities for anyone who developed a reasonable degree of expertise in evaluation. That view grew doubtful in the late 1970s, when a dip in the level of federal funding for evaluation appeared to signal a declining U.S. job market for evaluators. In the early 1980s, Ronald Reagan cast a darker shadow over the evaluation scene. Federal evaluation mandates were quietly shelved as the so-called new federalism reduced federal funding for education and other social programs and cut federal control over the ways in which states and local agencies spent the federal funds that they received. Much categorical funding that had required evaluation was replaced by block grants to states, which were largely exempt from evaluation requirements. Most analysts during the early 1980s were convinced that state and local agencies, hard pressed for operational funds, would use categorical funding to buy supplies, repair equipment, or add staff. Evaluation was predicted to be one of the major casualties of the Reagan administration. Since federal mandates had spawned evaluation in state and local agencies, it seemed reasonable to expect that evaluation would decline or even cease when federal evaluation requirements were relaxed or abolished.

By 1982, these pessimistic prophecies seemed to have proved accurate. Governmental monitoring of categorical funding programs was drastically reduced. Individual evaluators and evaluation agencies that depended on contracts with
federal programs found this source of income drying up. For example, Shadish, Cook, and Leviton (1991) note that the number of evaluation studies conducted by the U.S. Office of Planning, Budget, and Evaluation dropped from 114 in 1980 to 11 in 1984. The declines in other evaluation activities that depended on federal funds were comparable.

Gloom soon spread over the evaluation landscape, and evaluators' conferences in the early and mid-1980s focused on such themes as the decline of evaluation. Evaluation trainers at many universities began to ask whether it was ethical to train neophytes in roles for which demand was thought to be diminishing. For a time, it seemed that the evaluation bubble had burst. Evaluation seemed destined for the graveyard of promising endeavors that had failed to fulfill their potential.

But the situation began to change. For reasons that at first could not be explained, some evaluation agencies seemed to be bucking the declining trends. Indeed, they found that the 1980s had brought a stronger surge of evaluation business than ever before, and soon they were eagerly seeking to add well-qualified evaluators to their staffs. Gradually, it became apparent that only the evaluation agencies that depended primarily on federal funds had been hard hit, while the agencies that served state and local agencies, corporations, professional associations, and the like were finding that evaluation was still a bustling, thriving enterprise.

Somehow (perhaps only the most sagacious historical analyst could determine all the causes), decision makers in state and local government, business, and industry had begun to use evaluation for their own purposes: to provide information that they needed to guide policy and program implementation. Gradually, increasing numbers of agencies began to commission evaluation studies not because they had been forced to but because they believed that the resulting data would be helpful. House (1990) noted this trend in the emerging tendency for large bureaucracies to develop their own evaluation offices, and Worthen and Seeley (1990) described how evaluation had been institutionalized by a variety of enterprises across broad sectors of contemporary society.

For those who recognized that this widespread institutionalization of evaluation would lend stability to the evaluation job market, the pessimism that had prevailed at the beginning of the decade soon passed. Openings for evaluators were appearing in a wide variety of settings, which included public and private school districts, state and regional education agencies, social service agencies, universities and colleges, state systems of higher education, test and text publishers, and the military, business, and industry. Evaluation academics were at first amused and then amazed as they saw their students recruited to evaluate personnel training programs run by large, national accounting firms, insurance and brokerage houses, and fast-food chains.

The shortage of trained evaluators is obvious today, as the traditional employers of evaluators are now forced to compete with such firms as Aetna,
Xerox, and Price Waterhouse. Every year, the number of evaluation vacancies outside academic settings surpasses the number of qualified candidates. And the surge in federal program evaluation over the past few years has accentuated the need for evaluators. Ginsburg, McLaughlin, and Takai (1992, p. 24) note that "spending on program evaluation by the U.S. Department of Education exceeds $40 million per year, a tripling of the budget over the last five years." So a career in evaluation seems again to be a very good possibility. If the probability of continued employment in a specialization is an important criterion for considering it to be a profession, evaluation may well be considered a viable profession.
Procedures for the Certification or Licensure of Evaluators
Since an extensive discussion of this question is beyond the scope of this chapter, I will only touch on it lightly. For this chapter, the central question is whether there are mechanisms for the certification or licensing of evaluators similar to those that mark teachers, psychologists, and certified public accountants as professionals. The answer, of course, is no.

Despite pleas that the American Educational Research Association establish mechanisms to provide certification for qualified evaluators (Gagne, 1975; Worthen, 1972), neither it nor any other association or agency has stepped forward to assume responsibility for the licensing or certification of evaluation practitioners. As a result, there is currently no way of preventing incompetent or unscrupulous operators from proclaiming themselves to be evaluators. Without some type of credentialing process, it is difficult for those who need evaluation services to determine in advance that those whom they select are indeed competent. "Let the buyer beware" is still the watchword for those who must retain the services of an evaluation specialist. In the absence of certification or licensure, unprincipled hucksters can do much mischief and in the process badly tarnish the image of evaluation.

Perhaps that cannot be helped. I am much less sanguine today that we can set up credentialing systems than I was two decades ago. However desirable it may be to have some way of ensuring that the unqualified cannot masquerade as evaluators, the development of such a mechanism does not seem feasible for two reasons. First, rooted as evaluation is in so many disciplines, and with today's evaluators trained in as many diverse specializations and through such diverse means as they are, it is hard to imagine how any broad agreement about the essential elements of evaluation competencies could be forged. Put bluntly, since there is so little agreement about the methods and techniques that evaluators should use, it seems almost certain that a majority of practicing evaluators would reject an effort to construct and use a template of any sort to judge the qualifications of all evaluators. Second, it seems unlikely that any professional association or government agency will soon be equipped to grapple with the thorny and often litigious business of licensing evaluators,
especially in a field where those affected by the effort are more accustomed to evaluating than to being evaluated.

Nevertheless, until and unless we establish some feasible mechanism for ensuring that those who practice evaluation are competent to do so, evaluation cannot be considered a fully mature profession.
Development of Professional Associations for Evaluators
Several professional associations in North America have emerged to provide homes for evaluators. (Similar trends are seen in other countries.) One of the first North American efforts was not a full-blown association as such but rather Division H of the American Educational Research Association, which provided a home for school evaluators. However, two professional associations for practicing evaluators were founded in 1976. The Evaluation Network (EN) consisted largely of educational evaluators, while most members of the Evaluation Research Society (ERS) served in other professional fields.

In 1985, the EN and the ERS merged to form the American Evaluation Association (AEA), which, with about 2,200 members, is the largest professional association that exists solely to serve the needs of practicing evaluators. The Canadian Evaluation Society (CES) was launched to serve the needs of Canadian evaluation practitioners who worked in settings ranging from provincial ministries to private consulting groups. Given the scope and stature of these associations, it is clear that evaluators have viable professional organizations. On this criterion, evaluation fares as well as any profession.
Criteria for Determining Membership in Evaluation
Associations
Most professions have established criteria for denying membership in professional associations to those who are patently unqualified in the business of the profession. This cannot be said of evaluation. The criteria for membership in all the professional evaluation associations just mentioned are lenient, and no organization would effectively exclude those who were not qualified as evaluators from membership. On this criterion, as on the criterion of certification, it appears that evaluation has not reached full maturity as a profession.
Influence of Evaluation Associations on Preparation
Programs for Evaluators
In many professions, the major professional associations play a powerful role in shaping university preparation programs through accreditation or similar mechanisms. Evaluation associations exert no such influence. None of the professional associations for evaluators mentioned earlier exercise any direct
control or influence over any preservice program that purports to train evaluators. The evaluation associations do not accredit preservice training programs or control decisions about required course content, essential internship experiences, or faculty qualifications. On this criterion, too, evaluation is not fully a profession.
Development of Standards for Evaluation Practice
Most professions contain technical standards, ethical standards, or both that are intended to ensure that professional practice is of high quality. Evaluation was without such standards during its early years. Then in 1981, evaluation took a giant step forward toward qualifying as a profession when several years of work by the Joint Committee on Standards for Educational Evaluation, a coalition of professional associations concerned with evaluation in education and psychology, resulted in the publication of Standards for Evaluations of Educational Programs, Projects, and Materials (Joint Committee on Standards for Educational Evaluation, 1981). These comprehensive standards were intended to guide both those who conducted evaluations and those who made use of evaluation reports.

In 1982, the ERS published another set of standards for evaluation practice (Rossi, 1982). Six years later, the Joint Committee on Standards for Educational Evaluation (1988) published the Personnel Evaluation Standards. Currently, the same organization is nearing the end of the process of revising the standards first published in 1981. If a set of standards to guide professional practice is a hallmark of a profession, then evaluation certainly qualifies, for its standards are much better developed than those now used to guide practice in several more venerable professions.
Profession, Professional Specialization, or Field of
Professional Practice?
Up to this point, we have considered nine touchstones that seem useful in ascertaining whether a field of endeavor has attained the status of a distinct profession. Let us now consider these criteria together. What do they tell us about the progress of evaluation toward becoming a profession? Is evaluation separate and distinct from the other professions and disciplines with which it has been intertwined for decades? In short, is evaluation a profession?

The answer depends on the rigor with which we apply the nine criteria just examined. If an area of specialization must meet all nine criteria before it can be thought of as a profession, then evaluation is not a profession. Figure 1.1, which summarizes the preceding discussion of evaluation and the characteristics that most fully developed professions possess, shows that evaluation falls short on three.
Figure 1.1. Criteria for Judging Whether Evaluation Has Become a Profession

Does Evaluation Meet the Criterion?
1. A need for evaluation specialists? Yes
2. Content (knowledge and skills) unique to evaluation? Yes
3. Preparation programs for evaluators? Yes
4. Stable career opportunities in evaluation? Yes
5. Certification or licensure of evaluators? No
6. Appropriate professional associations for evaluators? Yes
7. Exclusion of unqualified persons from those associations? No
8. Influence of evaluators' associations on preservice preparation programs for evaluators? No
9. Standards for the practice of evaluation? Yes
For evaluation to be considered a full-fledged profession, these three areas will need to be dealt with. Nevertheless, some conditions may be difficult ever to meet. For example, we may never resolve the challenge of certifying evaluators. Does this mean that evaluation will never qualify as a profession? Or can evaluation be considered a profession if it meets most of the criteria?

Those who have commented on the status of evaluation as a profession are not of one voice. A decade ago, most writers seemed to hold the view that evaluation had not yet attained the status of a distinct profession. For example, Rossi and Freeman (1993, p. 432) concluded that "evaluation is not a 'profession,' at least in terms of the formal criteria that sociologists generally use to characterize such groups. Rather, it can best be described as a 'near-group,' a large aggregate of persons who are not formally organized, whose membership changes rapidly, and who have little in common in terms of the range of tasks undertaken, competencies, work sites, and shared outlooks." Merwin and Werner (1985) have also concluded that evaluators cannot yet claim full professional status.

Several recent authors have reached a somewhat more liberal conclusion. For example, Patton (1990) states unequivocally that evaluation has become a profession and that it is a demanding and challenging one at that. Shadish, Cook, and Leviton (1991, p. 25) are slightly more cautious: "Evaluation is a profession in the sense that it shares certain attributes with other professions and differs from purely academic specialties, such as psychology or sociology. Although they may have academic roots and members, professions are
economically and socially structured to be devoted primarily to practical application of knowledge in a circumscribed domain with socially legitimated funding. Professionals tend to develop standards of practice, codes of ethics, and other professional trappings. Program evaluation is not fully professionalized, like medicine or the law; it has no licensure laws, for example, but it tends toward professionalization more than most disciplines."

To summarize, some now view evaluation as a profession because it possesses most of the touchstones that collectively define a profession. Others believe that evaluation is not now a full-blown profession and that it may never become one because it lacks licensure laws and some other characteristics of such professions as law and medicine. Perhaps evaluation will forever be a near-group that tends toward professionalization. Perhaps we may best describe it as a near-profession: an area of professional practice and specialization that has its own literature, its own preparation programs, its own standards of practice, and its own professional associations. Or perhaps evaluation is best viewed as a hybrid of profession and discipline that possesses many characteristics of both and lacks some essentials of each (Scriven, 1991; Worthen and Van Dusen, in press). Or perhaps the label that we give to the practice of evaluation is of less consequence than the ways in which we structure programs aimed at preparing competent evaluation practitioners.
Are Preparation Programs for Evaluation Practitioners
Warranted?
It would matter little whether we considered evaluation to be a profession if it were not that our conceptions, and even our semantics, influence the ways in which we prepare personnel for evaluation roles. If we think of evaluation as a discipline, then we will expect preservice programs for evaluators to be patterned after those used to train academics in other disciplines. If we think of it as a profession, the course work and internships in our evaluator preparation programs will tend to resemble the methods courses and practica used to prepare practitioners for other professions. If we think of evaluation as a hybrid of discipline and profession, then our evaluation programs will combine elements of programs aimed at training practitioners with elements of programs used to prepare academics.

However we think of evaluation, this much is clear: Evaluation has matured rapidly during the past quarter century, and there is every indication that it will continue to develop and grow in the decades ahead. With its own journals, standards, and professional reference groups, evaluation has developed many of the important characteristics of a profession. And whether or not it can be considered a profession, it has emerged as an important area of specialization that demands uniquely prepared personnel if it is to reach its full potential. It has become institutionalized in many public and private sectors
of our society, and, if evaluators are prepared to meet the challenge, evaluation can become one of the most useful and far-reaching areas of human endeavor. Against this backdrop, the present and potential importance of evaluation fully warrants a careful consideration of the issues and strategies involved in preparing evaluation specialists.
References
Caro, F. G. (ed.). Readings in Evaluation Research. New York: Sage, 1971.
Cronbach, L. J. "Course Improvement Through Evaluation." Teachers College Record, 1963, 64, 672-683.
Gagne, R. M. "Qualifications of Professionals in Educational R&D." Educational Researcher, 1975.
Ginsburg, A., McLaughlin, M., and Takai, R. "Reinvigorating Program Evaluation at the U.S. Department of Education." Educational Researcher, 1992, 21(3), 24-27.
House, E. R. "Trends in Evaluation." Educational Researcher, 1990, 19(3), 24-28.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Personnel Evaluation Standards. Newbury Park, Calif.: Sage, 1988.
May, R. M., Fleischer, M., Scheirer, C. J., and Cox, G. B. "Directory of Evaluation Training Programs." In B. G. Davis (ed.), Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Merwin, J. C., and Werner, P. H. "Evaluation: A Profession?" Educational Evaluation and Policy Analysis, 1985, 7(3), 253-259.
Patton, M. Q. "The Challenge of Being a Profession." Evaluation Practice, 1990, 11(1), 45-51.
Rossi, P. H. (ed.). Standards for Evaluation Practice. San Francisco: Jossey-Bass, 1982.
Rossi, P. H., and Freeman, H. E. Evaluation: A Systematic Approach. (5th ed.) Newbury Park, Calif.: Sage, 1993.
Scriven, M. "The Methodology of Evaluation." In R. E. Stake (ed.), Curriculum Evaluation. American Educational Research Association Monograph Series on Evaluation, no. 1. Chicago: Rand McNally, 1967.
Scriven, M. "Introduction: The Nature of Evaluation." Evaluation Thesaurus. (4th ed.) Newbury Park, Calif.: Sage, 1991.
Shadish, W. R., Jr., Cook, T. D., and Leviton, L. C. Foundations of Program Evaluation. Newbury Park, Calif.: Sage, 1991.
Stake, R. E. "The Countenance of Educational Evaluation." Teachers College Record, 1967, 68, 523-540.
Stufflebeam, D. L. Evaluation as Enlightenment for Decision Making. Columbus: Ohio State University Evaluation Center, 1968.
Weiss, C. H. "Evaluating Social Programs: What Have We Learned?" Society, 1987, 25(1), 40-45.
Wholey, J. S. "Using Evaluation to Improve Government Performance." Evaluation Practice, 1986, 7, 5-13.
Worthen, B. R. "Certification for Educational Evaluators: Problems and Potential." Paper presented at the annual meeting of the American Educational Research Association, Chicago, Apr. 15, 1972.
Worthen, B. R., and Byers, M. L. "An Exploratory Study of Selected Variables Related to the Training and Careers of Educational Research and Research-Related Personnel." Washington, D.C.: American Educational Research Association, 1971.
Worthen, B. R., and Sanders, J. R. Educational Evaluation: Theory and Practice. Belmont, Calif.: Wadsworth, 1973.
Worthen, B. R., and Sanders, J. R. Educational Evaluation: Alternative Approaches and Practical Guidelines. New York: Longman, 1987.
Worthen, B. R., and Sanders, J. R. "The Changing Face of Educational Evaluation." Theory into Practice, 1991, 30(1), 3-12.
Worthen, B. R., and Seeley, C. "Problems and Potential in Institutionalizing Evaluation in State and Local Agencies." Paper presented at the annual meeting of the American Evaluation Association, Washington, D.C., Oct. 19, 1990.
Worthen, B. R., and Van Dusen, L. M. "The Nature of Evaluation." In H. Walberg (ed.), International Encyclopedia of Education. (2nd ed.) Oxford, England: Pergamon Press, in press.
BLAINE R. WORTHEN is professor and chair of the Research and Evaluation Methodology Program in the Department of Psychology at Utah State University and director of the Western Institute for Research and Evaluation in Logan, Utah.
-------
The skills and knowledge that evaluators need include those borrowed from other disciplines as well as those unique to the field of evaluation. Inclusion of multiple perspectives in evaluator training can help to develop the field and improve the practice of evaluation.
Training Evaluators: Unique Skills
and Knowledge
Donna M. Mertens
Evaluators work in complex environments, such as enrichment programs for deaf, gifted adolescents; drug and alcohol abuse programs for the homeless; and management programs for high-level radioactive waste. The field of evaluation itself is evolving as it develops through the reflective practice of the professionals involved. Evaluators have an ethical responsibility to continue their education and keep up-to-date on developments in the field (Eastmond, 1991). In consequence of this assertion, I have written this chapter for students of evaluation, by whom I mean not only those who are enrolled in formal training programs but also all practicing evaluators and teachers of evaluation.
Perspectives and Assumptions
My answer to the question, What are the unique skills and knowledge that should be considered for the preparation of evaluators? is based on four assumptions. First, we live in a multicultural society, and I assume that evaluators must bring a sensitivity to multicultural issues and perspectives to their work. Beaudry (1992, p. 82) notes that program evaluators must seek to include the multiple perspectives of ethnicity, race, gender, social class, and persons with disabilities: "Program evaluation must take notice of the changes
I thank the following people for their comments on my draft framework: Jennifer Greene, Jody Fitzpatrick, Hallie Preskill, Nick Eastmond, Jack McKillip, Dianna Newman, and Terry Hedrick.
in our society and begin to respond to social issues represented by multicultural education. Hate crimes and ethnic strife are reported on the front pages of newspapers and in the courts and the schools as well as all around the world. In education, much of what we know about negative racial prejudice, biases in testing, culturally biased instructional materials, and teacher effects remains part of the hidden curriculum. Multicultural awareness and education have equal relevance for health care, business, and industry as these sectors of society cope with the shifting patterns of a culturally diverse work force."
Stanfield (1993, pp. 6-7) addresses the same point: "The dramatically changing world in which we live demands that we cease to allow well-worn dogma to keep us from designing research [read evaluation] projects that will provide the data necessary for the formulation of adequate explanations for the racial and ethnic dimensions of human life." Although the author just cited speaks from the context of ethnicity and social science research, his comments can be more broadly interpreted as suggesting that evaluators must rethink traditional methods in order to be responsive to such alternative perspectives as those of minorities, women, the poor, and persons with disabilities.
Evaluation literature is only beginning to address the feminist (Farley and Mertens, 1993; Mertens, 1992; Shapiro, 1987) and minority perspectives (Madison, 1992). However, students of evaluation can borrow from the research-based literature and create the applications and implications that are necessary. The perspective of persons with disabilities has not been addressed as fully in the research literature (Mertens and McLaughlin, in press) as the perspectives of other groups. I include them here because a growing literature suggests that they view themselves as an oppressed cultural group (Wilcox, 1989).
Second, I view evaluation as a unique discipline that has borrowed many skills and much knowledge from social science research. Evaluation is an emerging profession with an expanding body of skills and knowledge that require continual review. Skills are things that evaluators need to be able to do. Knowledge is things that evaluators need to know. The model that I propose combines skills and knowledge, because evaluators need to be able to apply what they know in order to conduct evaluations competently.
Third, I assume that a core set of skills and knowledge exists across disciplines for evaluators. The particular emphasis of the skills and knowledge required in specific contexts depends on the discipline in which the training occurs, the level of the training, the nature of the training (for example, degree or nonacademic program, single course or program, new training or continuing education), the area of application (for example, education, economics, psychology, criminal justice, public administration, business, health, sociology, social work), the nature of the organization that employs the evaluator, and the level of the position that he or she holds. Having asserted that there is a core set of skills and knowledge, I also want to recognize the dispute between proponents of the view that evaluation has content-specific knowledge and advocates of the generalist view. Eisner (1991) describes a connoisseur as an
individual who is highly perceptive in one domain and able to make fine discriminations among complex and subtle qualities. I believe that evaluators need either to be connoisseurs in the area of application (for example, drug abuse, deafness) or to include a subject matter expert (connoisseur) in the planning, conduct, and interpretation of an evaluation.
Last, I assume also that evaluators must be capable of being responsive to the needs of the client. They must be capable of recommending the most appropriate approach to an evaluation problem. Some problems can best be studied with quantitative data, while others call for qualitative data. While individual evaluators may not be expert in all quantitative and qualitative research methods, they do need to be able to recommend the most appropriate approach. If necessary, they can work in a team with evaluators who have greater expertise in other methods. Sechrest (1992) argues that evaluators need increased sophistication in quantitative methods. I agree that there is room for experts in either quantitative or qualitative methods, but there is also a need for those who function comfortably in both domains. Lincoln and Guba (1992) argue that a mixture of quantitative and qualitative methods can be appropriate to any paradigm.
Methodology
I used a number of different techniques to identify the skills and knowledge unique to evaluation. I reviewed existing literature, such as textbooks on evaluation (Brinkerhoff, Brethower, Hluchyj, and Nowakowski, 1983; Popham, 1988; Shadish, Cook, and Leviton, 1991; Rossi and Freeman, 1993; Posavac and Carey, 1992; Worthen and Sanders, 1987), presentations on training at the annual meetings of the American Evaluation Association (Altschuld, 1992; Barrington, 1989; Covert, 1992; Eastmond, 1992; Mertens, 1992), literature identified through the use of ERIC and other databases, training-related articles in the journal Evaluation Practice, the U.S. General Accounting Office (1991) performance appraisal system for evaluators, and Davis's (1986) volume on evaluation training. I also consulted other evaluators and reflected on my own experience as an evaluator trainer for twenty-plus years. I conducted a content analysis of the skills and knowledge that I found in these various sources and organized them into a conceptual framework. I shared this conceptual framework with evaluators and trainers in a variety of disciplines, including education, psychology, business, administration, government, and interdisciplinary programs.
Skills and Knowledge Needed
I have divided the skills and knowledge into four categories: those unique to evaluation, topics associated with typical training in the methodology of research and inquiry, topics in such related areas as political science or anthropology, and discipline-specific topics. I chose this organizational framework
-------
because it lends itself to the overall design of an evaluation training program. A student who enrolls in an evaluation training program typically also receives course work in research methodology, including research design, statistics, and measurement. This organizational framework suggests areas that need to be included in evaluation courses because they are not often taught elsewhere (the components of these other topic areas arising in evaluation are unique). It also suggests other disciplines that can provide a more complete training experience. Exhibit 2.1 displays the topics associated with research, related areas, and specific disciplines. Discussion of these topics is beyond the scope of this chapter.
I focus here on the unique skills and knowledge associated with evaluation. Standard evaluation textbooks typically cover some of these topics, and I will therefore not elaborate on them. I will discuss certain controversial and emerging topics in evaluation for three reasons. First, the standard textbooks typically do not discuss them at length. Second, students of evaluation should be aware of the controversies in the field. Third, I hope to push trainers to think of including emerging topics—that is, topics that are still developing and on which consensus has not yet developed—as valid for inclusion in the evaluation curriculum.
The skills and knowledge listed in Exhibit 2.1 can all be taught with special insights and examples from evaluation. However, certain topics are not generally covered in a course in research, related areas, or other disciplines. These topics are discussed in the sections that follow.
Introductory Information About Evaluation. Evaluation textbooks typically include information about the definition of evaluation, the reasons why evaluations are conducted, the various types of evaluations (for example, implementation, process, outcome, impact, formative, summative), trends affecting evaluation, the roles that evaluators can play (for example, external, internal), and the history of evaluation.
Philosophical Assumptions. Evaluation classes should teach the philosophical assumptions underlying the positivist and postpositivist paradigmatic orientations. Although these assumptions should be taught in the research methodology classes, the teacher of evaluation cannot safely assume that they will be (Lopez and Mertens, 1993). Lather (1992) proposed an organizing framework for paradigms that is relevant here: positivists who seek to predict; postpositivists who seek to understand (this group includes those whom the evaluation literature has labeled interpretive, naturalistic, and constructivist); and postpositivists who seek to emancipate (this group includes feminists and race-specific inquirers). For the reasons outlined in the section on perspectives and assumptions, I would add persons with disabilities to the emancipatory category. Guba and Lincoln (1989) have explained the assumptions underlying positivists and postpositivists (read constructivists) in detail, and the trainer of evaluation could follow up on emancipatory paradigms through such sources as Lather (1991), Farley and Mertens (1993), Harding (1993), Shapiro (1987),
Exhibit 2.1. Knowledge and Skills Associated with Evaluation
I Knowledge and skills associated with research methodology
A Philosophical assumptions of alternative paradigms and perspectives, for example, positivists and postpositivists (for example, constructivists, feminists, minorities, and persons with disabilities) (Lather, 1992)
B Methodological implications of alternative assumptions
C Planning and conducting research
1 Literature review strategies
2 Theoretical frameworks
3 Hypothesis/questions formulation
4 Research design (quantitative designs—for example, observational research, surveys, experimental, quasi-experimental, correlational, causal comparative, and single-subject—and internal and external validity; qualitative designs—for example, case studies and ethnography—and trustworthiness; and mixed designs)
5 Data collection strategies: sample selection; quantitative data collection (for example, test construction, reliability, validity, application of tests, norm- and criterion-referenced tests, selecting measurement instruments, assessing measurement instruments, measurement error and bias, interpreting test results, instrument construction); qualitative data collection (for example, observation, interviewing, focus groups, document review, unobtrusive measures)
6 Data analysis and interpretation: data preparation, construction of databases, handling missing data, computer usage for data analysis, statistical analysis, qualitative data analysis strategies, display of data, presentation of well-supported findings, conclusions, and recommendations, communicating results and follow-up
II Knowledge and skills needed for evaluation but borrowed from other areas
A Administration/business
1 Project management: making effective use of resources, organizing and conducting of meetings, developing and administering budgets, managing personnel, delegating work, supervising staff, reviewing work products, supervising and evaluating staff, promoting teamwork, observing equal opportunity principles
2 Strategic planning
3 Auditing and evaluation
4 Evaluation and program development
B Communication/psychology
1 Oral communication: communicating with staff, external agencies, general public, and the press; obtaining needed information skillfully; avoiding misunderstanding; projecting a positive image; using media appropriately in presentations; leading discussions; conducting productive meetings; handling hostility and controversy; seeking and respecting others' viewpoints
2 Written communication: writing status reports, one-page factual summaries, executive summaries, proposals, reports, briefing papers, memos, case studies, interview notes, testimony, data collection instruments, performance appraisals, speeches, and professional articles; using computer software to
-------
Exhibit 2.1. (continued)
produce appropriate text and graphics, establishing feedback loops to avoid surprises and allow people to respond to drafts, providing constructive feedback on written products
3 People skills: getting along with people, logically explaining expectations, using sound judgment as to what should be said/written, counseling employees in need of remediation, resolving sensitive personnel problems, rewarding good performance, providing timely feedback
4 Negotiation: negotiating contracts and evaluation questions, separating people from the problem, dealing with issues and values, focusing on the many interests that are represented, inventing options for mutual gain (Barrington, 1989)
5 Personal qualities: credible, good judgment, flexible, sense of humor, continually learning, self-reflexive, curious about how things work, ability to show respect for the efforts of others
C Philosophy
1 Ethics
2 Valuing: determining the value of an object, applying criteria to information about an object to arrive at a defensible value statement
D Political science
1 Policy analysis
2 Legislation and evaluation: the place of evaluation in current legislation
E Anthropology: cross-cultural skills
F Economics (T. Hedrick, personal communication, July 9, 1993)
1 Cost-benefit and cost-effectiveness analysis, supply/demand theory, discounting, wage rate analysis
2 Controlling for economic factors, for example, changes in the unemployment rate
III Knowledge and skills unique to specific disciplines
A Education: educational objectives, instructional design, instructional product evaluation, teacher evaluation, populations with special needs, accreditation, alternative assessment strategies
B Psychology: human development, social service programs, clinical models, goal attainment scaling, outcome evaluation of psychotherapeutic interventions, psychological measurement, work environment (motivation, job satisfaction, productivity) (Cordray, Boruch, Howard, and Bootzin, 1986)
C Health: epidemiological studies
D Business: task analysis, job analysis, management, organizational change, market research, organizational design and development, information systems, conflict resolution (Perloff and Rich, 1986)
E Government: policies, procedures, regulations, and legislation that apply to the work area (U.S. General Accounting Office, 1991)
F Public administration: distinctions between the public and private sectors (for example, many "bosses" in the public sector: legislative, judicial, executive, public, special interest groups), no clear bottom line as with profit in the private sector, accountability to the public (J. Fitzpatrick, personal communication, July 8, 1993)
and Nielsen (1990) for feminists; Madison (1992), Marin and Marin (1991), and Stanfield and Dennis (1993) for minorities; and Mertens and McLaughlin (in press) for persons with disabilities. Evaluation courses should include these diverse perspectives, which should be integrated into the process of planning and implementing an evaluation on the understanding that an inquirer's philosophical assumptions and theoretical orientation influence every stage of the design process.
Theories and Models of Evaluation. Numerous methods for the organization of the many theories and models of evaluation have emerged. Shadish, Cook, and Leviton (1991) explore the knowledge base that has emerged regarding evaluation theories. Theories encompass the choice of evaluation method, philosophy of science, public policy, and value orientation. These authors have identified three stages of evaluation theories: theories that use a rigorous, scientific method and emphasize the search for "truth"; theories that emphasize the need for detailed knowledge about how organizations in the public sector work to increase the political and social usefulness of results; and theories that integrate alternatives generated in the first two stages. Guba and Lincoln (1989) provide a contrasting framework for theories and models of evaluation that includes four "generations": measurement (testing), description (objectives), judgment, and the responsive, constructivist theory of evaluation. As I mentioned in the preceding section, emerging theories associated with the emancipatory paradigm provide fertile ground for an exploration of the meaning of alternative perspectives and their methodological implications.
Planning and Conducting an Evaluation. Although the process of planning and conducting an evaluation varies with the theoretical framework, the student of evaluation should be knowledgeable about and able to apply the following steps (a brief checklist sketch follows the list):
1 Focusing the evaluation. This step includes identifying the object of the evaluation, its purpose, its audiences, and the constraints and opportunities. Identification and involvement of stakeholders have been tied to the increasing utilization of evaluation results, and they have also been a source of controversy in the evaluation field. Harding (1993) and Madison (1992) assert that the stakeholders involved should be those with the least power and that team-building and collaboration strategies should be devised to include clients in a meaningful way. T. Hedrick (personal communication, July 9, 1993) believes that team building is not appropriate in such settings as federal oversight evaluations. J. Greene (personal communication, July 7, 1993) believes that what is distinctive about evaluation is the way in which politics intertwines with public program and policy decisions and the distinctive, contested audiences of an evaluation. Students should explore who should be involved in an evaluation, whose purposes an evaluation should serve, and how best they can be appropriately involved.
-------
2 Designing the evaluation and formulating questions. The choice of a theoretical framework and evaluation model discussed previously guides the evaluator here.
3 Planning data collection. The evaluator needs to identify the information needs, sources of information, instruments (including ways to describe the program treatment and implementation), and ways to identify the theory of the program being evaluated (that is, the context and presuppositions of the organizations and groups involved).
4 Analyzing and interpreting data. The evaluator needs to identify appropriate analytical approaches for the type of data collected. Identifying them will provide a mechanism for accurate and meaningful interpretation.
5 Planning, reporting, and utilization. The evaluator needs to facilitate effective communication and integrate utilization strategies throughout the evaluation process.
6 Planning management. The evaluator needs to determine the resources required and the time line of activities.
7 Planning meta-evaluation. The evaluator needs to know how to evaluate the quality of the evaluation plan, process, and product.
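These steps lend themselves to a simple completeness check. The sketch below is my illustration, not part of the original chapter: it encodes the seven steps as a checklist in Python so that a student's draft plan can be tested for missing pieces. All field names and the sample plan are hypothetical.

```python
# Hypothetical checklist for the seven evaluation-planning steps above.
# An illustrative sketch, not a tool described in this chapter.

PLANNING_STEPS = [
    "focusing",         # 1: object, purpose, audiences, constraints
    "design",           # 2: theoretical framework and evaluation questions
    "data_collection",  # 3: information needs, sources, instruments
    "analysis",         # 4: analytical approaches matched to the data
    "reporting",        # 5: communication and utilization strategies
    "management",       # 6: resources and time line
    "meta_evaluation",  # 7: how the evaluation itself will be judged
]

def missing_steps(plan):
    """Return the planning steps that a draft plan has not yet addressed."""
    return [step for step in PLANNING_STEPS if not plan.get(step)]

draft = {
    "focusing": "Job-training program; formative purpose; county stakeholders",
    "design": "Outcome questions within a utilization-focused model",
}
print("Still to plan:", ", ".join(missing_steps(draft)))
```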
Students should be given the opportunity to implement their evaluation plans through small evaluation projects completed as a class project or as a part of an internship. Several authors have provided helpful hints concerning the inclusion of practical experiences in training programs (Eastmond, Saunders, and Merrell, 1989; Morris, 1989; Preskill, 1992).
Socialization into the Profession. Students of evaluation should be given the opportunity to become socialized into the profession by means of involvement with professional organizations, networking with evaluators, and interacting with the evaluation literature.
Special Topics in Evaluation. The following topics are very important in evaluation and should be included in the preparation of evaluators:
1 Ethics: professional behavior, use of information, confidentiality, sensitivity to effect on others, pressure from client to distort facts, proper response to discovering information that is morally or legally volatile, and so on (Morris and Cohn, 1992)
2 Standards for evaluation of programs and personnel (Joint Committee on Standards for Educational Evaluation, 1981, 1988)
3 Politics of evaluation: knowing the players, the policy environment, the power of communication, how to get people to come to an agreement, and how people are likely to use information (Barrington, 1989); knowing how organizations work, how to understand an organization's goals and internal and external forces (that is, how to analyze its political context) (H. Preskill, personal communication, June 23, 1993)
4 Specific methods and contexts, such as needs assessment (McKillip, 1987), evaluability assessment, futuring (the field of future studies) (Patton, 1990), and international evaluations
5 Evaluator as trainer: training evaluation clients and users
The section in Exhibit 2.1 on skills and knowledge borrowed from related areas includes other topics, such as policy analysis, communication skills, and cost analysis, that are essential to the preparation of evaluators.
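Of the borrowed topics just mentioned, cost analysis is the one that reduces most directly to arithmetic, so a small worked example may help. The sketch below is my illustration rather than the chapter's: it shows the two ideas named in Exhibit 2.1, II.F—discounting a stream of program benefits to present value, and forming a cost-effectiveness ratio—with all figures invented.

```python
# Illustrative cost-analysis arithmetic: discounting and cost-effectiveness.
# All numbers are hypothetical; this is a sketch, not a prescribed method.

def present_value(cash_flows, rate):
    """Discount yearly amounts (received at the end of years 1, 2, ...)."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(cash_flows, start=1))

def cost_per_outcome(total_cost, outcomes):
    """Cost-effectiveness ratio, e.g., dollars per participant placed in a job."""
    return total_cost / outcomes

benefits = [40_000, 40_000, 40_000]          # program benefits over three years
pv_benefits = present_value(benefits, 0.05)  # discounted at 5 percent
cost = 100_000                               # up-front program cost
print(f"Net present value: {pv_benefits - cost:,.0f}")                # ~8,930
print(f"Cost per job placement: {cost_per_outcome(cost, 250):,.0f}")  # 400
```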
Summary
The training of evaluators should reflect the evolving, dynamic nature of the field of evaluation. Many core topics identified here are reflected in evaluation textbooks. Evaluation also borrows skills and knowledge from other disciplines, but a training program for evaluators should examine them specifically through an evaluation lens. The inclusion of emerging topics in an evaluation training program can sensitize students of evaluation to these issues and make them better able to serve the people whom their evaluations affect. The training of evaluators should prepare them to reflect on and engage in dialogue about the best ways of responding to society's diverse demands. The field of evaluation needs to think in terms of multiple, not singular, perspectives when it trains evaluators.
References
Altschuld, J. W. "Structuring Programs to Prepare Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Barrington, G. V. "Evaluator Skills Nobody Taught Me, or What's a Nice Girl Like You Doing in a Place Like This?" Paper presented at the annual meeting of the American Evaluation Association, San Francisco, 1989.
Beaudry, J. S. "Synthesizing Research in Multicultural Teacher Education: Findings and Issues for Evaluation of Cultural Diversity." In A. Madison (ed.), Minority Issues in Program Evaluation. New Directions for Program Evaluation, no. 53. San Francisco: Jossey-Bass, 1992.
Brinkerhoff, R. O., Brethower, D. M., Hluchyj, T., and Nowakowski, J. R. Program Evaluation. Boston: Kluwer Academic Press, 1983.
Cordray, D., Boruch, R., Howard, K., and Bootzin, R. "Teaching of Evaluation in Psychology: Northwestern University." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Covert, R. W. "Successful Competencies in Preparing Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Davis, B. G. "Overview of the Teaching of Evaluation Across the Disciplines." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Eastmond, J. N., Jr. "Addressing Ethical Issues When Teaching Evaluation." Paper presented at the annual meeting of the American Evaluation Association, Chicago, 1991.
Eastmond, J. N., Jr. "Structuring a Program to Prepare Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
-------
Eastmond, J. N., Jr., Saunders, W., and Merrell, D. "Teaching Evaluation Through Paid Contractual Arrangements." Evaluation Practice, 1989, 10 (2), 58-62.
Eisner, E. W. The Enlightened Eye. New York: Macmillan, 1991.
Farley, J., and Mertens, D. M. "The Feminist Voice in Evaluation Methodology." Paper presented at the annual meeting of the American Evaluation Association, Dallas, 1993.
Guba, E., and Lincoln, Y. S. Fourth-Generation Evaluation. Newbury Park, Calif.: Sage, 1989.
Harding, S. "Rethinking Standpoint Epistemology: 'What Is Strong Objectivity?'" In L. Alcoff and E. Potter (eds.), Feminist Epistemologies. New York: Routledge, 1993.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Personnel Evaluation Standards. Newbury Park, Calif.: Sage, 1988.
Lather, P. Getting Smart: Feminist Research and Pedagogy with/in the Postmodern. New York: Routledge, 1991.
Lather, P. "Critical Frames in Educational Research: Feminist and Poststructural Perspectives." Theory into Practice, 1992, 31 (2), 1-12.
Lincoln, Y. S., and Guba, E. G. "In Response to Lee Sechrest's 1991 AEA Presidential Address, 'Roots: Back to Our First Generations,' Feb. 1991, 1-7." Evaluation Practice, 1992, 13 (3), 165-170.
Lopez, S. D., and Mertens, D. M. "Current Practices: Integrating the Feminist Perspective in Educational Research Classes." Presentation at the annual meeting of the American Educational Research Association, Atlanta, Ga., 1993.
McKillip, J. Need Analysis. Newbury Park, Calif.: Sage, 1987.
Madison, A. M. (ed.). Minority Issues in Program Evaluation. New Directions for Program Evaluation, no. 53. San Francisco: Jossey-Bass, 1992.
Marin, G., and Marin, B. V. Research with Hispanic Populations. Newbury Park, Calif.: Sage, 1991.
Mertens, D. M. "Structuring a Program to Prepare Professional Evaluators: What Aren't We Talking About (That We Should Be)?" Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Mertens, D. M., and McLaughlin, J. Research Methods in Special Education. Newbury Park, Calif.: Sage, in press.
Morris, M. "Field Experiences in Evaluation Courses." In D. M. Mertens (ed.), Creative Ideas for Teaching Evaluation. Norwell, Mass.: Kluwer, 1989.
Morris, M., and Cohn, R. "Program Evaluators and Ethical Challenges: A National Survey." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Nielsen, J. M. (ed.). Feminist Research Methods. Boulder, Colo.: Westview Press, 1990.
Patton, M. Q. "The Challenge of Being a Profession." Evaluation Practice, 1990, 11 (1), 45-51.
Perloff, R., and Rich, R. F. "The Teaching of Evaluation in Schools of Management." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Popham, W. J. Educational Evaluation. Englewood Cliffs, N.J.: Prentice Hall, 1988.
Posavac, E. J., and Carey, R. J. Program Evaluation: Methods and Case Studies. (4th ed.) Englewood Cliffs, N.J.: Prentice Hall, 1992.
Preskill, H. "Students, Client, and Teacher: Observations from a Practicum in Evaluation." Evaluation Practice, 1992, 13 (1), 39-46.
Rossi, P., and Freeman, H. E. Evaluation: A Systematic Approach. (5th ed.) Newbury Park, Calif.: Sage, 1993.
Sechrest, L. "Roots: Back to Our First Generations." Evaluation Practice, 1992, 13 (1), 1-8.
Shadish, W. R., Jr., Cook, T. D., and Leviton, L. C. Foundations of Program Evaluation: Theories of Practice. Newbury Park, Calif.: Sage, 1991.
Shapiro, J. P. "Collaborative Evaluation: Toward a Transformation of Evaluation for Feminist Programs and Projects." Paper presented at the annual meeting of the American Educational Research Association, Washington, D.C., 1987.
Stanfield, J. H. "Methodological Reflections." In J. H. Stanfield II and R. M. Dennis (eds.), Race and Ethnicity in Research Methods. Newbury Park, Calif.: Sage, 1993.
Stanfield, J. H., II, and Dennis, R. M. (eds.). Race and Ethnicity in Research Methods. Newbury Park, Calif.: Sage, 1993.
U.S. General Accounting Office. Performance Appraisal System of Band I, II, and III Employees. Washington, D.C.: U.S. General Accounting Office, 1991.
Wilcox, S. "STUCK in School: Meaning and Culture in a Deaf Education Classroom." In S. Wilcox (ed.), American Deaf Culture. Burtonsville, Md.: Linstock Press, 1989.
Worthen, B. R., and Sanders, J. R. Educational Evaluation. New York: Longman, 1987.
DONNA M. MERTENS is professor in the Department of Educational Foundations and Research at Gallaudet University in Washington, D.C.
-------
In 1988, the U.S. General Accounting Office started an ongoing, comprehensive evaluation training program for its staff. This chapter sketches the program and describes the major substantive areas of its curriculum.
Evaluator Training in a Government
Setting
Nancy Kingsbury, Terry E. Hedrick
The U.S. General Accounting Office (GAO) is a nonpartisan agency in the legislative branch of the federal government. Its statutory mission, established by the Budget and Accounting Act of 1921, is (among other things) to "investigate, at the seat of government or elsewhere, all matters relating to the receipt, disbursement, and application of public funds" (Budget and Accounting Act, 1921). Over the years, this responsibility has evolved from detailed audits of individual agency purchases into economy and efficiency reviews of government programs and more recently to a wide-ranging array of program evaluations and policy analyses. With very few exceptions, any program or activity funded with federal tax dollars can be the subject of a GAO review. And as Congress grapples with the difficult decisions of the 1990s, the issues that the GAO is asked to evaluate mirror the breadth and complexity of the day's headlines. A recent sample of study requests includes these questions: What strategies have been most effective in reaching hard-to-serve recipients of welfare programs? What factors drive health care costs? How feasible are various approaches to the development of geothermal energy? Are federal plans for the elimination of tuberculosis from the United States achievable? What are the causes and effects of the European currency crisis? What is the best strategy for response by the federal government to natural disasters? Can nonnuclear designs for aircraft carriers and submarines meet the Navy's need for future missions? What interventions are necessary to end the underrepresentation of women and minorities in federal agencies?
In September 1993, the GAO had about 4,900 staff. Three-quarters were engaged in evaluation and auditing work. Evaluators are organized into thirty-six issue areas that correspond roughly to government program areas,
-------
such as health policy, employment and training, environmental issues, tax policy and administration, administration of justice, management of defense and space programs, international trade, and federal management issues.
Because a substantial part of the GAO's evaluation work is carried out on site where government programs operate, evaluation staff are located in Washington, D.C., in fourteen regional offices around the continental United States, and in two overseas offices (one in Hawaii, the other in Germany). Like many federal agencies, the GAO expects to reduce its size over the next few years, although there is little likelihood that the work that it will be asked to do will decrease. Accordingly, the GAO is investing significantly in technology improvements (computer networks, videoconferencing) that can improve its productivity and in improving its work processes through total quality management (TQM).
Evaluation and Evaluators at the GAO
The GAO defines evaluation broadly. The term is used to describe a range of activities. Depending on the context, it can be synonymous with audit, review, or policy analysis. Methodologies and techniques from a variety of disciplines are often brought to bear on an assignment, and, because of the agency's history in the accounting tradition, the results of the work must meet traditional government auditing and accounting standards as well as the standards of other professional disciplines.
Most staff responsible for carrying out reviews of government programs are called evaluators, and they are expected to demonstrate strong generic skills in project planning, data gathering and analysis, written and oral communication, and interpersonal communications and management areas. However, they also need specialized expertise.
The GAO recruits staff from an array of the professional disciplines found in colleges and universities throughout the country. In the 1950s and 1960s, the GAO recruited almost exclusively from the accounting profession. In recent years, it has greatly diversified its hiring practices, hiring master's degree and doctoral-level graduates in economics, the social sciences, public policy, public administration, and business administration. It continues to recruit accountants but generally at the bachelor's level. Increasingly, GAO staff are maintaining their professional identities (for example, as an economist) after entering the agency. Nevertheless, the requirements of the work necessitate that members of each discipline become familiar with the terminology and methods of other disciplines.
Although disciplinary diversity is clearly an important part of the GAO's institutional capability, it is equally important that there be a common set of core values and working procedures that overlay the variety of disciplinary perspectives. It is essential to have a common understanding of the GAO's mission, of what is meant by the expression "quality work," of the GAO's expectations for the way in which work will be carried out, of the way in which the executive and legislative branches operate and intersect, and of the way in which the GAO will ultimately meet the information needs of Congress and provide value to the taxpayer. Meeting all these needs while developing specific technical and computer skills poses a large challenge for the GAO's training programs. Training serves as a vehicle for establishing values, teaching agency procedures, and understanding the broad context of evaluation work.
Training at the GAO
In 1988, the GAO established us own Training Institute Training and educa-
tion responsibilities were consolidated and separated from career counseling.
personnel, and orgamzauonal developmeni suppon The intent was to high-
light the importance that the agency places on training and professional devel-
opment Investing in staff development was deemed critical to meeting ihe
mformalion needs of Congress.
In comparison with other federal agencies, the GAO makes a relatively large investment in training opportunities for its staff. The physical plant includes two major training centers in Washington, D.C., that have a total of seventeen classrooms. Each regional office also has space and equipment for training activities. Completion of a nationwide videoconferencing capability this year will permit training to be provided simultaneously in headquarters and regional offices.
GAO evaluators averaged seventy-four hours of continuing education in 1992. Two-thirds of that training was delivered by the Training Institute. The institute has a roster of 210 active courses and offers more than a thousand classes a year (Training Institute, 1991). Evaluators have also been given significant resources, both centrally and within organizational units, to participate in professional development activities outside the GAO. The agency's origin in accounting contributes directly to this emphasis on training by imposing a continuing education requirement on all evaluators: To continue to be deemed qualified to do the GAO's audit and evaluation work, every evaluator must obtain a minimum of eighty hours of training every two years.
Teaching Evaluation in a Work Setting
The environment of the GAO makes demands and imposes constraints on the design of training programs that are quite different from the forces that operate in academic settings. Our students are adults ranging from recent graduates of graduate-level programs to experienced evaluators who have broad evaluation and management experience and mature auditors nearing retirement. Almost all these students work full-time and expect training to be directly relevant to what they will do on the job the very next week.
Work schedules and geographic dispersion require that training be delivered in intensive segments. A typical Training Institute course consists of two
-------
to four successive eight-hour days of training. This pattern permits a short period of full-time training (about all that is manageable given the press of ongoing work), and it gives regional participants reasonable travel time. This concentrated, full-time training schedule and the nature and expectations of the Training Institute's students—evaluator staff—heavily influence training methods. Most courses are a mixture of lecture, case studies, and opportunities for practical application of skills through role playing or demonstration, and most courses make extensive use of materials taken directly from GAO work. When possible, we give training participants opportunities to use material from their current assignments in class exercises (for example, by using real data from an ongoing assignment when they practice writing testimony).
In part as a means of focusing training on skills and activities directly related to the work, instructors are heavily drawn from line staff and managers. Many of our senior executives regularly act as course instructors. This pattern of training delivery requires development of course frameworks and instructional materials that are easy for multiple instructors to use. We also train our instructors to teach effectively. We are more likely to use external than internal instructors for courses in such things as statistics, writing, computer software, and generic management topics.
Major Areas of Emphasis
Overall, the GAO's formal training program for evaluators has six areas of emphasis: agency mission and policies, assignment planning and execution, communication skills and strategies, computers and information technology, workplace relations and management, and issue area expertise. With the exception of issue area training, the courses in each area have been designated as required, core, or elective and determined to be appropriate for staff, senior staff, management, and/or executive levels.
All evaluators must take the required courses, which contain information that the agency believes is necessary for all persons regardless of prior education or work experience. Core courses contain material with which all evaluators should be familiar, but the agency recognizes that individuals may excuse themselves from specific courses if they have previously mastered the material. Elective courses can be selected to fill specific needs, depending on the type of work in which the individual is currently engaged. Courses at the staff and senior staff levels are concentrated in technical areas. Courses at the management and executive levels emphasize management. The training provided for the upper levels often relies on external opportunities for continuing education, such as professional conferences. Evaluators move through a structured set of courses as they progress in their careers.
The GAO's curriculum structure was developed in collaboration with an advisory committee of managers drawn from the agency's divisions and offices.
Specific courses have been developed over the past four years, and only in the past year can the GAO be said to have fully implemented its evaluator curriculum. The six sections that follow describe each of the major substantive areas.
Mission, Policies, and Individual Responsibilities. All new staff members are required to attend an initial orientation course that describes the GAO's history, its mission, its role in supporting congressional decision making, ethics guidelines, and the policies and procedures for the conduct of evaluation assignments. Subsequent required courses that evaluators take in the next few years elaborate on standards for work, internal control issues concerning the quality of acceptable evidence, and processes for ensuring accuracy in the agency's reports.
As staff move up the career ladder after each promotion, they are invited to attend so-called promotion programs that lay out the organization's expectations for their new roles. The discussions in these programs are structured both around people issues—interpersonal communication, supervision, performance feedback—and around planning and reporting issues—for example, what it means to have responsibility for directing an evaluation or managing the work of multiple teams of evaluators. At mid levels, these programs can include special topics, such as information on a manager's equal employment opportunity responsibilities.
Assignment Planning and Execution. Each audit or evaluation at the GAO is referred to as an assignment, and the skills necessary to design and manage an assignment are a crucial part of the GAO's internal training program for newly hired evaluators. The goal here is twofold: to create an awareness of other professional disciplines and to build specific skills.
The overall intent of this part of the curriculum is to foster an awareness of the wide range of work that the agency does and of the need to apply appropriate methodologies when the work is done. All entering staff are required to attend a workshop on the selection of an approach and methodology. The workshop provides guidance on how to take an area of congressional concern and develop focused questions that can be answered within the constraints imposed by resources and time. Workshop participants then analyze these questions to determine how they can most appropriately be answered. Staff then take core methods courses on compliance auditing, economy and efficiency reviews, program evaluation, and policy analysis. Follow-on courses on such topics as procurement and contract processes, financial management, budgeting processes, fraud awareness, and special issues in economics are available. The goals are for all individuals—staff and managers—to become comfortable with a variety of types of work and to be able to work effectively in a multidisciplinary environment.
To meet the skill-building goal, the institute offers courses on such topics as sampling, questionnaire design and structured interviewing, applied statistics (basic classes and elective classes on advanced topics, such as log-linear
-------
modeling and time series analysis), and qualitative methods. At entry, staff are provided with self-paced training materials on organizing their documentation (work papers) for the evidence and analysis that support an audit or evaluation.
Communication. Although a GAO evaluation team may conduct an excellent study, the value of the study will be weakened significantly if it is not communicated effectively in published reports and oral briefings. For this reason, the GAO's curriculum gives evaluators a series of courses reflecting the latest research on written and oral communication and on cognitive psychology. For example, instructors may discuss readability principles and factors that increase the retention of information read.
Writing courses at the entry level clarify the GAO's basic communications policy and differentiate between academic writing and workplace writing. The focus is on producing an institutional—not an individual—product. In class, evaluators work on skills that they need in order to write GAO documents—for example, analyzing the writing situation, writing collaboratively, recognizing the difference between writer-based and reader-based documents, assessing the readability of their own documents, and using review comments to improve documents.
Class exercises for senior staff show how writing and thinking are inextricably linked and how the structure of a written report can affect its interpretation. Training participants also practice constructing a succinct message out of masses of data. Using data from a case study, evaluators develop report issues, prepare for a message conference (a meeting in which all evaluation team members, advisers, and managers discuss the evaluation results and agree on their interpretation), and conduct a simulated message conference. Message conferences are stressed because they improve the quality and timeliness of the documents produced and reduce unnecessary rework. A course for managers called Managing Writing reviews the writing principles embodied in the curriculum, suggests strategies for managing the writing process, and presents ideas about the role of oral and written communication in public policy processes.
The writing curriculum also includes specialized courses that help staff to write specific kinds of documents, such as an executive summary or written testimony for oral presentation. These two kinds of writing are emphasized because both are highly visible statements of the GAO's work. Both kinds of written summaries receive close scrutiny, and many readers may never read the full evaluation report. During the testimony course, evaluators develop testimony by following guidelines for effective congressional presentations. They practice testimony-writing skills and receive constructive feedback. The course also includes discussions with executives who excel in the delivery of written testimony.
Oral communication is equally important, because much of the agency's work is conveyed through briefings and testimony. Training on oral presentation skills begins during the first three months after a new evaluator starts work, and it seeks to improve his or her interviewing and briefing skills. The follow-on course is dedicated to honing presentation skills; it makes use of videotaping and feedback. Electives are available at a more advanced level to improve presentation skills and learn how to conduct meetings effectively. At the most advanced levels, managers and executives can take hands-on courses involving practice in communicating effectively with the media and delivering oral testimony to Congress. Figure 6.1 shows how the content of the communications courses varies by position.
Computer Use. As a large organization, the GAO uses several software packages that must be supported with technical assistance and training. Several years of experience have taught us that users prefer that course material delivered in the classroom be very brief. Material is often delivered in modules—for example, WordPerfect sort features, WordPerfect text columns, Lotus 1-2-3 data tables. Users can enroll in the course most suited to their immediate needs. Forty courses are available (Training Institute, 1991). They relate to word processing applications, spreadsheets, and database management systems. Additional training is available on microcomputers; data analysis packages, such as SAS and SPSS; computer communications; and support for local area networks.
As the agency is moving to design and implement software applications for organizing, sharing, and accessing working papers and databases, the institute is designing training to support their use. Steps are also being taken to revise existing training in ways that recognize how these applications and the use of local area networks can change the ways in which work gets done.
Figure 6.1. Communications Courses in the GAO's Evaluator Curriculum
[Figure: a chart of communications courses arrayed by position level; the legible labels include executive-level offerings such as an executive summary workshop and courses on writing and delivering testimony.]
Note: Introductory Evaluator Training, the two-week orientation, includes modules on writing and oral briefing skills.
-------
Workplace Relations and Management. Training in the area of workplace relations and management contains material that traditionally has been classified as both soft and hard skills. As one might expect, the institute offers classes on time management, relations with Congress, and management of one's issue area, that is, a body of work in an area like transportation or energy. These kinds of courses focus on planning and coordination processes. Supervision and performance management seminars are also available.
Courses on interpersonal relations in the workplace, mediation, diversity in the workplace, and advanced communications and negotiations are newer additions to the GAO's curriculum. As the GAO has become increasingly involved with total quality management (TQM), the emphasis on interpersonal communication, teamwork, and management skills has increased. Recent hires have an opportunity to enroll in such courses as Workplace Relations and Communications. Mid-level managers take Managing Quality Improvement, and top-level executives learn about the role and responsibilities of quality councils. Five-day courses prepare leaders of problem-solving teams to use appropriate tools and techniques, teach fellow team members, and be aware of how to foster positive group dynamics. Additional training is expected to be added in this area as the GAO advances in its implementation of TQM.
This area also contains courses to build skills and heighten awareness and knowledge about key supervisory and management responsibilities. All supervisory staff recently participated in workshops on ways of preventing sexual harassment in the work environment, and a similar course is now available to nonsupervisory staff. All supervisors and managers have already received training on their equal employment opportunity (EEO) responsibilities. To maintain this awareness, the GAO automatically enrolls newly promoted staff in the EEO workshop. And in recognition of the increasing diversity of its work force and of the need to have a work environment that makes all staff feel welcome and valued, the institute has started to provide workshops on the valuing of diversity.
Issue Area Training. As noted earlier, the GAO has thirty-six issue areas covering work in areas as wide-ranging as national security policy and national resource management. Generally, the Training Institute has neither the expertise nor the resources needed to develop issue area subject matter training, so in most cases issue area groups pursue their own strategies to develop and maintain staff proficiency. These strategies can include inviting subject matter experts to give informal talks, using consultants on specific projects, and holding planning conferences with invited participants from government agencies and academic and other relevant groups, including businesses, professional associations, and think tanks.
However, major training initiatives have been supported internally in two key issue areas. In the financial management area, the institute worked closely with the GAO's accounting and financial management experts to develop a financial auditing curriculum. And in the information management technology area, the institute has offered courses on such topics as computer security,
telecommunications, and systems development. Several master's-level courses leading to a certificate in information systems from the George Washington University have been offered at the GAO's training center on a regular basis at the end of the workday.
Self-Paced Training
Besides classroom courses, the institute provides a variety of self-paced courses in interactive multimedia, audio, video, and print formats. Many of these course offerings are related to computer software packages, but others cover such topics as management and supervision, human resource management, writing, and administrative support activities. In most cases, courses can be mailed to the work site and used on the individual's own computer. When special equipment is needed or licensing restrictions make widespread distribution impractical, individuals can sign up to take such courses in the institute's learning center. Acceptance of this type of course delivery has grown: Self-paced hours increased by two-thirds in the past year. More than 1,300 evaluators enrolled in self-paced courses in fiscal year 1993, and there have been about 800 course completions to date.
Lessons Learned
Developing and delivering the GAO's evaluator curriculum has been a continuous learning process for everyone involved. Several valuable lessons have been learned that may be useful to others who have responsibility for training functions in similar contexts.
First, job relevance is critical. Training developers and instructors need to be able to assess training needs accurately, design effective learning experiences, and demonstrate to training participants that the training material is directly relevant to their work. Training is not an end in itself; it must be designed to support the effectiveness of the individual and the organization. One technique for enhancing the relevance of training is to design courses that use real case studies or have the students bring ongoing work to class and apply the material to it. This means that instructors have to be adaptable and quite proficient in their area of expertise.
Second, the involvement of line managers and staff increases the credibility and quality of training. The GAO's curriculum was developed under the guidance of a management advisory committee, and all decisions regarding mandated, core, and elective courses were made by this committee. Housing decision-making authority in the "line" makes ownership of the curriculum greater than if the training department were to make all the decisions. Executives, managers, and senior staff also often serve as instructors or presenters on panels or contribute course material, thereby endorsing the value of training and building commitment.
-------
Third, training needs to deliver consistent messages at all levels. The GAO's curriculum structure was intended to be an integrated one, with similar concepts and skills in courses at staff, senior staff, and management levels. We are still completing this process, and we hope by next year to have parallel courses running at all levels, with the upper-level courses incorporating a managerial perspective. This consistency will ensure that managers and staff are familiar with the same terminology, methodologies, and guidance. We also plan to increase the amount of training that we deliver to intact work groups and reduce the number of open enrollment classes. We believe that training the members of work groups together makes it more likely that the concepts taught in class will be reinforced and used on the job. Work unit-based training is also expected to have a direct effect on improving teamwork and intraunit communication—important goals in an organization focused on quality. We plan to test these assumptions by conducting follow-up evaluations for selected courses.
Finally, in the spirit of quality management, we strive to assess the effectiveness of our training efforts continuously, and we revisit our delivery strategies. The challenges that the GAO faces change constantly, the needs of our work force change, and technological advances create new training demands.
References
Budget and Accounting Act of 1921. (P.L. 67-13, 42 Stat. 20, codified as amended at 31 U.S.C. 712.)
Training Institute, U.S. General Accounting Office. Training and Education 1992-1993 Catalog. Washington, D.C.: Training Institute, U.S. General Accounting Office, 1991.
NANCY KINGSBURY is director for federal human resource management issues in the General Government Division of the U.S. General Accounting Office.
TERRY E. HEDRICK is director of the Training Institute at the U.S. General Accounting Office.
The procedures used to collect information for the Directory of Evaluation Training Programs are described, and the results of the survey are discussed. Tables list the programs identified in the United States, Canada, and Australia.
The 1994 Directory of Evaluation Training Programs
James W. Altschuld, Molly Engle, Carol Cullen, Inyoung Kim, Barbara Rae Macce
The American Evaluation Association (AEA) has periodically published a directory of evaluation training programs in the United States and Canada (May, Fleischer, Schreier, and Cox, 1986; Conner, Clay, and Hill, 1980; Gephart and Potter, 1976). May, Fleischer, Schreier, and Cox (1986) listed forty-six programs in six different types of settings. Programs located in education and psychology predominated. In late 1992, the board of the American Evaluation Association commissioned a study to provide a current listing and description of evaluation training programs. This chapter describes the study methodology, reports some of the study's findings, and tabulates the information about evaluation training programs.
Methodology
Sample. Since 1986, numerous changes have occurred in the field of evaluation that could affect the nature of training programs. Among them are changes in the methods that evaluators use and improved understandings of the relationship between evaluation and policy development. As a result, new programs have emerged, others have changed or altered their content and structure, and still others have ceased to exist.
The initial and perhaps most challenging task for the study team was to develop a comprehensive sampling frame of potential candidates. We used the following process to develop the sampling frame. First, we placed an announcement in the call for papers for the AEA's 1992 annual conference. The announcement asked AEA members to nominate candidates for the study.
-------
American Evaluation Association Annual
Meetings
AEA holds its annual conference during the first week of November each year.
Topics of current interest are discussed in sessions proposed by members, as well
as in sessions presented by invited speakers. In addition, a computer-assisted Job
Bank is provided at the annual conference. Professional awards for outstanding
contributions to the field of evaluation are presented in the areas of: service to the
profession, evaluation theory, evaluation practice, and service to AEA. Preceding
the conference, over 25 training sessions are offered. Past training session topics
include: increasing evaluation use, cost-benefit analysis, reporting and debriefing,
applying professional standards, statistical analysis software, focus groups,
qualitative evaluation, and secondary analysis of available data.
Future AEA Annual Conferences
Evaluation 2001
Dates: November 7-10 in St. Louis, MO
Hotel: Millennium
Call for Proposals in Mail: By January 7, 2001
Proposals Due: March 16, 2001
Notifications of Proposal Status: By July 1, 2001
Registration Materials Available: By July 1, 2001
Evaluation 2002
Dates: November 6-9 in Washington, DC
Hotel: Hyatt-Crystal City
Evaluation 2003
Dates: November 5-8 in Reno, NV
Hotel: Nugget
Other Evaluation-Related Meetings or
Conferences
• List of Events
Archives from Past AEA Conferences
-------
Links of Interest to Evaluators
Here are some sites that you may find useful. If you know of other sites that might
be of interest to our members and others involved in evaluation, please send your
suggestions to AEA manager Susan Kistler at aea@kistcon.com.
AEA Topical Interest Groups
TIG on Alcohol, Drug Abuse, and Mental Health
TIG on Assessment in Higher Education
TIG on Business & Industry
TIG on Collaborative, Participatory & Empowerment Evaluation
TIG on Extension Evaluation Education
TIG for Graduate Student Association
TIG on International and Cross Cultural Evaluation
TIG on Minority Interests in Evaluation
TIG on Program Theory and Theory-Driven Evaluation
TIG on Research Technology and Development Evaluation
TIG on Teaching of Evaluation
AEA Local Affiliates
Arizona Evaluation Network (azENet)
Eastern Evaluation Research Society
Ohio Evaluator's Group
Oregon Program Evaluators Network
Southeast Evaluation Association
Washington Evaluators
Western Pennsylvania Evaluator's Network (WPEN)
Other National Evaluation Associations
-------
Australasian Evaluation Society
Canadian Evaluation Society
European Evaluation Society
French Evaluation Society
German Evaluation Society (German language)
Ghana Evaluators Association
Italian Evaluation Society
Malaysian Evaluation Society
Monitoring & Evaluation in Latin America and the Caribbean
Nigerian Network of Monitoring & Evaluation
Sri Lankan Evaluation
Swiss Evaluation Society
UK Evaluation Society
Walloon Evaluation Society
Other Associations of Potential Interest to Evaluators
• American Society for Public Administration
General Evaluation Sites
The American Educational Research Association
Applied Survey Research
Centre for Program Evaluation, The University of Melbourne
CRESST Home Page (Center for Research on Evaluation, Standards, and
Student Testing)
Educational Research Methods (follow link then click on textbook)
ERIC Clearinghouse on Assessment and Evaluation
Evaluation Associates Ltd.
The Evaluation Clearinghouse
The Evaluators' Institute
Harvard Family Research Project
The Joint Committee on Standards for Educational Evaluation
Lesley College Program Evaluation and Research Group
Literature on Programs
Monitoring and Evaluation News. A news service focusing on developments in monitoring and
evaluation methods relevant to development projects with social development objectives
(UK).
On-line Evaluation Resource Library. A resource of project evaluation tools (plans and
instruments) and reports used by the National Science Foundation's Directorate for Education
and Human Resources; topic areas focus on curriculum development, teacher education, and
faculty development, including minority group representation (US). URL: http://oerl.sri.com/
Performance Assessment Links in Science. An online resource of performance assessments
for students studying science in grades K-12; provides information on standards, tasks, and
rubrics for evaluative purposes (US)
Program for Public Sector Evaluation, Royal Melbourne Institute of Technology. This
interdisciplinary group focuses on public sector (e.g., program) evaluation; information is
provided on recent articles, coursework, and projects (AU)
-------
• …s on the Net. A site with many resources; information is organized for
general audiences, students, professionals, and researchers, with a "room" for chat and feedback
(CA)
• Gene Shackman's List of Free Evaluation Resources on the Web. Resources on methods in
evaluation and social research (US)
• Student Evaluation Case Competition Open during the CES annual meeting to students of all
levels and disciplines, this is an opportunity for small teams to compete in the analysis of an
evaluation case file available in English and French. The site includes archives with past
competition scenarios and winning entries. (CA)
• Bill Trochim's Center for Social Research Methods. Links for applied social research and
evaluation; look for the Knowledge Base (online textbook), statistical test selector, and the
simulation book (US)
• UK Evaluation Society Home page for a professional organization that promotes the use of
evaluation as a contribution to public knowledge (UK)
• UMASS Foundation Relations Responsible for managing the University of Massachusetts
relationships with private foundations, this office provides a wealth of information about
grants, philanthropy, and foundations (US)
• UNICEF Research and Evaluation A timely update of policy analyses, evaluations, and
research; links, statistical data, and newsletter archives can also be accessed (US)
• University of Wisconsin Program Development and Evaluation Full-text publications in PDF
format available for download; targeted to evaluators assessing extension programs, but
resources have general evaluation appeal, as well (US)
• Vanderbilt Center for Mental Health Policy. Research focuses on child, adolescent, and family
mental health; follow the links to current projects, including the homeless families study (US)
• Virtual Library Evaluation
• Western Michigan Evaluation Site, the Evaluation Center/Evaluation Support Services.
Information on evaluation checklists and instruments, terminology; resources also include a
directory of evaluators and related links (US)
• The World Bank Institute The evaluation unit analyzes the learning activities for World Bank
Institute clients and staff; be sure to check out the newsletter, too (US)
• The World Bank. Operations Evaluation Department. This independent division evaluates the
lending operations of the World Bank; online publications in different languages are made
available as a result (US)
Federal Sites and Databases
CIESIN's US Demography
CYFERNet (Children, Youth and Family Education and Research Network) Gopher
List of WWW Servers (USA - Federal Government)
National Institutes of Health Home Page
Substance Abuse and Mental Health Services Administration
US Census Bureau Home Page
Statistics
• American Statistical Association
• One-Stop Federal Statistics Site
• Statistical country profiles and global maps of indicators of interest to UNICEF
• Statistics Canada
-------
STANDARDS OF PRACTICE IN EVALUATION
Refer to:
Articles in Fitzpatrick, Jody L., and Michael Morris, eds. (1999). New Directions for Program
Evaluation, No. 82. San Francisco, CA: Jossey-Bass Publishers.
Fitzpatrick, Jody L. "Ethics in Disciplines and Professions Related to Evaluation"
Datta, Lois-ellin. "The Ethics of Evaluation Neutrality and Advocacy"
"The Program Evaluation Standards" - Joint Committee on Evaluation Standards
"Guiding Principles for Evaluators" - American Evaluation Association
Your copy of "Government Auditing Standards" (the Yellow Book) - GAO
Your copy of "Quality Standards for Inspections" (the Turquoise Book) - PCIE
1. Think about the differences between: ethics, standards, rules, principles, philosophy
Ethics:
Standards:
Rules:
Principles:
Philosophy:
-------
EDITORS' NOTES

JODY L. FITZPATRICK is associate professor of public administration at the University
of Colorado. She maintains an active practice in evaluation and is interested in the
ethical nuances of evaluator-client relations. She serves on the Board of the American
Evaluation Association and is working on a book of case studies for the association.

MICHAEL MORRIS is professor of psychology and director of graduate field training in
community psychology at the University of New Haven. He edits the column "Ethical
Challenges" in the American Journal of Evaluation.

… given to the study of ethical codes in evaluation-related consulting professions. She
examines the ethical codes within evaluation and related disciplines
and professions and discusses implications for content, dissemination,
and compliance.

Ethics in Disciplines and Professions Related to Evaluation

Jody L. Fitzpatrick
Donald Campbell (1969) bemoaned the traditional isolation, or ethnocentrism,
of different disciplines. Using the analogy of a fish, he presented a fish-scale
model of omniscience in depicting the social sciences and related disciplines.
Each discipline believes it knows the truth (the whole fish) when, in fact, we tend
to know only our own discipline (single scale). This chapter is designed to help
us avoid such ethnocentrism in ethical matters and instead, as Campbell argued
we should, learn from related fields.

Ethical Codes in Program Evaluation

To provide a foundation for this learning, I first briefly review the history of
ethical codes in evaluation. In the early 1980s two documents were developed
to guide evaluators in their ethical considerations: the Standards for Evaluations
of Educational Programs, Projects, and Materials (…
-------
… maintained the same four major groups of standards—utility, feasibility, propriety, and accuracy—but within these major categories some of the original thirty standards were combined and others were revised. Further, these newer standards, the Program Evaluation Standards, were intended to address evaluations beyond the educational arena, though much of the focus remains on education and training. Similarly, in 1995, the American Evaluation Association (AEA) published its new Guiding Principles for Evaluators. These guiding principles offered a set of values—for example, honesty, integrity, and responsibility for public welfare—as guides for evaluation practice. (In 1986, the Evaluation Research Society had merged with Evaluation Network to form the American Evaluation Association, a professional association to represent the entire profession in the United States. AEA chose not to adopt the old ERS standards, but rather to develop its own guiding principles.) Today, evaluators in the United States have these two documents to advise their ethical practice. Other countries (Canada), organizations (Government Accounting Office), and groups of countries (Australasian Evaluation Society) have developed their own standards. (See Worthen, Sanders, and Fitzpatrick [1997] for a discussion of these.)

What can we learn from these two documents and from the history of ethical codes for evaluation? Comparing the documents published in the early 1980s with those published in the mid-1990s reveals a major change: a move to include a greater focus on non-methodological issues. This change is most obvious in comparing the ERS standards and the AEA guiding principles. Table 1.1 lists the major headings for both. The ERS standards generally mirror the stages of an evaluation. In contrast, the AEA guiding principles are more concerned with qualities or principles that permeate the evaluation process. As I discuss further on, the nature of the guiding principles is more congruent with the ethical codes of other professional associations. That is, the articulation of values, as opposed to stages of tasks, is a more common strategy in other ethical codes. Perhaps more important to the history of evaluation, this change mirrors the move in the education and training of evaluators from a very strong focus on methodological issues (which certainly remains the sine qua non of evaluation) to a greater examination of the many political factors and personal judgments entailed in conducting evaluations.
Table 1.1. ERS Standards versus AEA Principles: Major Headings

ERS Standards:
• Formulation and negotiation
• Structure and design
• Data collection and preparation
• Data analysis and interpretation
• Communication and disclosure

AEA Guiding Principles:
• Systematic inquiry
• Competence
• Integrity/honesty
• Respect for people
• Responsibilities for general and public welfare
This change has been positively noted in several of the commentaries on the AEA Guiding Principles, with special reference to the principle concerning responsibilities for the general and public welfare (E) (Covert, 1995; House, 1995).

Though the headings in Table 1.1 reflect changes in the tone and emphasis in evaluation ethics from the 1980s to the 1990s, the difference should not be overstated. As one might expect, the overlap between the entire body of ERS Standards and the AEA Guiding Principles is great; most of the topics and content covered in the first are reflected in the second. The converse is also true, even in controversial areas. The ERS Standards thus recommended identifying various groups of stakeholders and their "information needs and expectations" (ERS Standards Committee, 1982, p. 12). They even argued that "evaluators should also help identify areas of public interest in the program" (ERS Standards Committee, 1982, p. 12). But the tone is different. The AEA Guiding Principles emphasize the diverse groups we serve, or might serve, and our obligation to be inclusive in ensuring those groups are represented. Finally, the language used in major headings is important. One goal of professional codes is to inspire ethical behavior among its members. Lofty language can help in that regard. As such, the major categories for the AEA Guiding Principles, as with the Joint Committee Standards, are more inspirational than the step-by-step emphasis of the earlier ERS Standards.
The Social Sciences and Evaluation Codes

The ethical codes discussed above have been strongly influenced by ethics concerning social science research in specific disciplines. This influence can be seen in the initial impetus for the codes and in their process of development. The original Standards (1981) were a spinoff from the revision of the Standards for Educational and Psychological Tests and Manuals by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education. The twelve sponsoring organizations for the 1994 version continue to represent these areas, but the validation panel for the newer Joint Committee Standards also included broader audiences to represent adult training in many areas. Nevertheless, the focus remained on education and training. The development of the AEA Guiding Principles was initiated by reviewing the ethical codes in psychology (APA), education (AERA), and anthropology (American Anthropology Association, AAA). The committee also reviewed other codes dealing with research, including the federal regulations on Protection of Human Subjects and the Belmont Report on biomedical and behavioral research (Shadish, Newman, Scheirer, and Wye, 1995).

These documents, which focus primarily on research, are certainly pertinent to the ethical principles of program evaluators. They provide important guidelines concerning the design of research and the ethical concerns entailed when one collects data from people. However, the almost exclusive focus on the social sciences fails to inform us of the ethical conflicts faced by professions, such as evaluation, that work directly with clients. As I have
-------
argued elsewhere, because the graduate training of most evaluators is in the social sciences, we tend to use these disciplines as exemplars and neglect professions that are similar to our own (Fitzpatrick, 1994). As the AEA Guiding Principles introduce ethical issues concerning relationships with clients and the public, balancing of stakeholder needs, and the values involved in these interactions, the ethical codes of other professions that struggle with conflicts among clients, other stakeholders, the public welfare, and the values of their discipline become important learning tools. In fact, the Joint Committee has found some procedures from the accounting profession to be useful in developing their standards for evaluation (Sanders, 1999).

The AEA Guiding Principles were initiated to stimulate dialogue among program evaluators on how we deal with ethical dilemmas. Yet, that dialogue has not progressed as much as many might have desired. Some might argue that the absence of dialogue is due to the generality of the principles (Rossi, 1995). House writes that the "endorsement of general principles sometimes seems platitudinous or irrelevant" (1995, p. 27). However, he goes on to encourage the dialogue, observing that, "Ethical concerns become interesting only in conflicted cases, and it is often the balance of principles that is crucial rather than the principles themselves" (House, 1995, p. 27). Examining the codes, cases, and procedures of professions confronting similar conflicts can be fruitful in further stimulating this dialogue.
Consulting versus Scholarly Professions

Bayles (1981), in writing about professional ethics as a broad subject, makes an important distinction between consulting professionals and scholarly professionals. Admitting the terms represent a continuum and the middle can become murky, Bayles writes that consulting professionals differ from scholarly professionals in two important ways: they establish personal, working relationships with their clients, and their method of reimbursement is typically fee-for-service. In contrast, scholarly professionals generally deal with clients at a distance (students in a class, readers of a journal) and are salaried. Consulting professionals work as "entrepreneurs" and, as such, "depend on attracting individual clients" (Bayles, 1981, p. 9). Consulting professionals include lawyers, physicians, architects, consulting engineers, accountants, and psychologists. Scholarly professionals include teachers, professors, and scientific researchers. As program evaluators who use research methods, we may fall in that murky middle, but I would argue that we are more akin to consulting engineers or accountants in our relationships with clients than we are to our social science brethren.

The different economic and personal relationships with clients, Bayles argues, "are crucial in defining the kinds of ethical problems each confronts" (1981, p. 9). The personal relationships that consulting professionals develop with their clients and the expectations engendered by clients' direct hiring and …
Because the relationship of the consulting professional is closer to the client than to other stakeholders, the professional must guard against bias toward, or overidentification with, the clients' views or needs. Further, because the professionals' ongoing livelihood depends on attracting and retaining clients, it can be against the professionals' self-interest (at least in the short term) to pursue ethical norms that conflict with the clients' self-perceived needs. The scholarly professional is not so buffeted by the pressures of individual clients' expectations or the exigencies of maintaining a practice. This distinction can be useful for evaluators in considering ethical codes. Certainly, for methodological issues, our ethical codes should build on those from the scholarly professions. But as evaluation ethics moves toward a focus on the values entailed in dealing with diverse stakeholders and balancing the public interest, we can also look to the consulting professions for guidance.
The Content of Various Professional Codes

Codes of professional groups vary considerably in their comprehensiveness, explicitness, and means of enforcement. Table 1.2 presents the principles of several professional associations, both scholarly and consulting. (These principles are referred to as standards, canons, and principles by the different groups, but they all represent the first level of values articulated in the code.) One first notices the commonality across principles in spite of the variation in the fields represented. Several (accounting, engineering, public administration) begin with a principle concerning the public service or public welfare.
Table 1.2. A Sample of Characteristics of Selected Professional Codes

Profession: Principles*

Accounting (AICPA): Responsibilities as professionals; serving the public interest; integrity; objectivity and independence; exercise due care; apply principles to scope and nature of services.

Internal auditors (IIA): Honesty, objectivity, diligence; loyalty; conflicts of interest; fees or gifts; confidentiality; due care to obtain sufficient factual evidence to support the expression of an opinion.

Professional engineers (NSPE): Safety, health, and welfare of the public; competent; objective and truthful; faithful agent of employer or client; avoid deceptive acts; conduct oneself honorably and responsibly to honor the profession.

Psychology (APA): Competence; integrity; professional and scientific responsibility; respect for people's rights and dignity; concern for others' welfare; social responsibility.

Public administration (ASPA): Serve public interest; respect constitution and law; integrity; promote ethical organizations; strive for professional excellence.

* The principles are listed in their order of presentation in the codes. In many cases this order may reflect the priorities of the disciplines.
-------
Psychology ends with a principle concerning social responsibilities. The prominence of attention to the public good in many of these codes may reflect the consulting professions' desire to emphasize explicitly and prominently the importance of audiences other than the direct client. Although the professional engineer's code includes a principle concerning serving as a faithful agent to a client or employer, their code, too, emphasizes, first and foremost, the obligation to the safety, health, and welfare of the public. The complete discussion of their canons, rules, and professional obligations stresses this priority. The American Institute of Certified Public Accountants (AICPA) expressly states, "In resolving those conflicts [between different audiences or stakeholders], members should act with integrity, guided by the precept that when members fulfill their responsibility to the public, clients' and employers' interests are best served" (Albrecht, 1992, p. 175). They define the public interest as "the collective well-being of the community of people and institutions the profession serves" (Albrecht, 1992, p. 175).

Of course, the role of accountants generally differs from that of program evaluators. But for many public accountants, their work in assessing a program and defining public interest might be quite similar to those of a program evaluator. (See Wisler [1996] for a discussion of the similarities and differences in the roles of auditors and evaluators.)
In contrast to these codes, which emphasize the professional's obligation to the general public, the AEA Guiding Principles stress the diversity of participants in the evaluation process and the need to recognize these differences, consider the interests of all groups, and provide results in such a way that they are accessible to all. The emphasis is on the heterogeneity of stakeholders, not the homogeneity. But the principles close with an exhortation to "encompass the public interest and good," which, they acknowledge, "are rarely the same as the interests of any particular group." This latter admonition more closely mirrors the codes of other associations. The committee, however, noted struggling with this issue, and as they acknowledge, further discussion and interpolation are needed to apply this principle effectively (Shadish and others, 1995). These other codes might provide some effective guidance.

As a frame for analysis, Bayles (1981) has identified six standards of a good or trustworthy professional: honesty, candor, competence, diligence, loyalty, and discretion. Most of these standards can be seen in the professional codes listed in Table 1.2 and in the AEA Guiding Principles shown in Table 1.1. Honesty is addressed in varying ways. It heads the list for internal auditors. Accountants and engineers stress "objectivity"; accountants add "independence" and evaluators add "integrity." Bayles sees "candor" as going beyond honesty to include full disclosure. The code of professional engineers addresses candor, for example, by articulating a professional's obligations to acknowledge errors to clients and to advise clients when a project will not be successful.

Similarly, competence is addressed in each of the professional codes, but under different words. Only the code of professional engineers, like the AEA Guiding Principles, directly uses the word "competence." Accounting and internal auditing emphasize "due care," which incorporates both diligence and
competence. Psychology and public administration articulate principles concerning professional responsibilities and professional excellence.

Loyalty is concerned largely with the conflict between obligations to clients and responsibilities to others, including the profession. The conflicts faced by professionals in this area are partly addressed by the principles concerning public and professional responsibilities. Many professional codes deal extensively with loyalty. Several of these codes address conflicts of interest and independence of judgment. These issues are subsumed under loyalty because the client has an expectation that the professional they hire has revealed any potential conflicts of interest that would hinder their completing the work fairly and will, in fact, be able to provide an independent judgment on the issue of concern. Guiding Principle E.4 states the need to "maintain a balance between client needs and other needs." The evaluator is urged to "meet legitimate client needs whenever it is feasible and appropriate to do so," but the principle notes that when client interests conflict with other principles, "evaluators should explicitly identify and discuss the conflicts with the client and relevant stakeholders" (American Evaluation Association, 1995, p. 25).

Discretion is addressed less directly by most professional codes. The AEA Guiding Principles, as with the code of ethics for psychology, address the issue of confidentiality under "respect for people." But the implication is that these people are participants in the evaluation, not the agency or client. What obligation does the evaluator have to the client in regard to confidentiality and discretion? Guiding Principle E.3 advocates broad dissemination of findings. Under what circumstances is such dissemination unethical? An accounting ethics case asks readers whether an accounting professor should use materials from an outside project in the classroom (Albrecht, 1992). If so, should the identity of the firm be disclosed? Although laws on public records may cast an evaluation report on a public program in a different light, what ethical obligations for discretion does an evaluator have? What loyalty does the evaluator have to the client? These are issues that should be discussed, further building on the similarities and the distinctions between our profession and those in related fields.
Enforcement of Ethical Codes

Historically, most consulting professions have been self-regulating. As professions often come under some criticism for their failure to regulate, professional associations have established mechanisms for enforcement of the codes that the evaluation profession currently lacks. The American Psychological Association, many of whose members are practicing psychologists, the American Bar Association (ABA), the American Medical Association (AMA), the American Institute of Certified Public Accountants (AICPA), and the National Association of Social Workers (NASW) all have enforcement bodies. These committees answer questions, hear complaints, and issue disciplinary decisions and sanctions as appropriate. Their hearings and decisions build case law for the interpretation of the ethical codes.
-------
In contrast, enforcement mechanisms are typically absent from the professional associations of scholarly professions. For example, the American Educational Research Association (AERA), the American Anthropology Association (AAA), and even the American Society for Public Administration (ASPA) develop and disseminate their ethical codes but have not developed official mechanisms for enforcement. Plant (1998) discusses the reasons for the absence of external enforcement mechanisms for the ASPA code, drawing on extensive writings in public administration concerning ways to create ethical behavior. Some argue against even the codification of professional ethics, maintaining that practitioners should be their own moral reasoners (Rohr, 1978). ASPA, however, argues that codes are necessary to socialize and educate the practitioner about common standards, but that enforcement at the individual level is more appropriate than central enforcement. Organizations such as ASPA appear to believe that the development of "inner controls" will be more successful at engendering ethical behavior among members than the "external management of conduct" (Plant, 1998, p. 165).

The AEA Guiding Principles may not include enforcement mechanisms because of a belief in the success of inner controls; however, the more likely reason for their absence may be the need to reach consensus on the meaning and application of the principles, the continuing tensions among the diverse paradigms used in evaluation, and the relative newness of the evaluation profession. All of these factors create difficulties in developing and implementing enforcement mechanisms. As evaluation matures as a field and greater consensus is achieved on the appropriate methods and actions of evaluators, development of enforcement mechanisms may be reconsidered. They seem more appropriate to the self-regulating role of the consulting professions.

If AEA continues to use internal mechanisms to motivate ethical behavior among members, however, revisions of the code may consider the use of language to better achieve that goal. The style of ASPA's Code of Ethics is consistent with the purpose of instilling internal controls. The code is short; it could fit on one legal-size page. It articulates five broad principles with four to eight brief points explicating each. The words and language used in this code are designed to inspire and are less legalistic than the codes of the professional agencies that include enforcement. The AEA Guiding Principles make use of this format (principles with brief points), but the tone of the language, as noted by the authors, is more legalistic than the ASPA codes.
In the absence of formal committees delegated with enforcement powers, other means of educating members and enforcing codes of ethics do exist and must be used to encourage understanding and compliance. The National Society of Professional Engineers, which does not have an official enforcement body, uses its Board of Ethical Review to interpret ethical dilemmas submitted by engineers, public officials, and members of the public. They publish these cases on-line and in print with an index, sponsor an annual ethics contest in which members respond to a case, and disseminate a series of videotapes for … describing the code and examining five case studies. Congressional criticism of accountants associated with the savings and loan fiasco in the 1980s stimulated these actions; the profession was aroused to attend to ethics and its public image (Mintz, 1992).
The Future for Our Ethical Codes

Compared to the disciplines and professions reviewed here, program evaluation is quite new. The accounting profession in the United States celebrated its centennial a few years ago (Mintz, 1992). Physicians, engineers, and lawyers have been defining their professions and tinkering with their ethical codes for even longer. It is therefore not surprising that the ethical codes for program evaluators are less well formed; their state reflects the state of our field. However, we can learn from the codes reviewed here. A most immediate issue, which does not require consensus but instead action, is to continue to expand the dialogue that the AEA Guiding Principles were intended to create. We need more discussion of cases through our publications and conferences to argue and interpret the meaning of various principles. The publication of a new series in The American Journal of Evaluation on ethical challenges is a first step in that direction (Morris, 1998). EvalTalk and focus groups at the annual conference can be used to further articulate the meaning of various principles. I am currently working on a casebook for the American Evaluation Association, for which I will draw upon styles used in other professions. Current discussions of certification or licensure are pertinent to our ethical codes. Certification and licensure, as with accreditation, provide a mechanism for ensuring that professionals are informed of and concerned with our ethical codes. Finally, consideration might be given to linking membership in the American Evaluation Association with the AEA Guiding Principles. The Joint Committee Standards for Program Evaluation, AEA's Guiding Principles for Evaluators, and codes from other disciplines have provided us with food for thought. Now we must continue to discuss and articulate what it means to be an ethical evaluator.
References
Albrecht, W. S. Ethical Issues in the Practice of Accounting. Cincinnati: South-Western, 1992.
American Evaluation Association. "Guiding Principles for Evaluators." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Bayles, M. D. Professional Ethics. Belmont, Calif.: Wadsworth, 1981.
Campbell, D. "Ethnocentrism of Disciplines and the Fish Scale Model of Omniscience." In M. Sherif and C. Sherif (eds.), Interdisciplinary Relationships in the Social Sciences. Chicago: Aldine, 1969.
-------
Covert, R. W. "A Twenty-Year Veteran's Reflections on the Guiding Principles for Evaluators." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Evaluation Research Society (ERS) Standards Committee. "Evaluation Research Society Standards for Program Evaluation." In P. Rossi (ed.), Standards for Evaluation Practice. New Directions for Program Evaluation, no. 15. San Francisco: Jossey-Bass, 1982.
Fitzpatrick, J. L. "Alternative Models for the Structuring of Professional Preparation Programs." In J. W. Altschuld and M. Engle (eds.), The Preparation of Professional Evaluators: Issues, Perspectives, and Programs. New Directions for Program Evaluation, no. 62. San Francisco: Jossey-Bass, 1994.
House, E. R. "Principled Evaluation: A Critique of the AEA Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Program Evaluation Standards. Thousand Oaks, Calif.: Sage, 1994.
… Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Sanders, J. Personal communication, January 1999.
Shadish, W. R., Newman, D. L., Scheirer, M. A., and Wye, C. "Developing the Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Stufflebeam, D. L. "A Next Step: Discussion to Consider Unifying the ERS and Joint Committee Standards." In P. H. Rossi (ed.), Standards for Evaluation Practice. New Directions for Program Evaluation, no. 15. San Francisco: Jossey-Bass, 1982.
Although empirical research on evaluation ethics is not plentiful,
several important findings have emerged. These include a
lack of consensus within the field concerning what constitutes an
ethical issue, the frequent occurrence of ethical problems during the
later stages of evaluation projects, and the perceived ethical
significance of the tendency for evaluators to be more responsive to
some stakeholders than others. The author discusses the need to
incorporate research questions on ethics into ongoing evaluation
projects, and to invest in systematically evaluating … the
Guiding Principles.

Research on Evaluation Ethics: What Have We Learned and Why Is It Important?

Michael Morris
Nearly twenty years ago Sheinfeld and Lord (1981) noted that "empirical studies of the ethical concerns of evaluation researchers are few" (p. 380). What was true then is only slightly less true today. Indeed, at a recent session devoted to "What Should We Be Researching in Evaluation Ethics?" (Morris, 1997) at the American Evaluation Association's (AEA's) annual meeting, the panelists outnumbered the audience! Whatever else ethical issues may be, they do not appear to have attracted the attention of a large segment of the research community in evaluation.

This is not to say, of course, that there has been virtual silence on the subject beyond the Guiding Principles for Evaluators and the Joint Committee's Program Evaluation Standards. Analyses of ethical concerns, frequently based on the personal experiences of the authors, are relatively easy to find (see English, 1997; Gensheimer, Ayers, and Roosa, 1991; Schwandt, 1997; Stake and Mabry, 1998). Far fewer, however, are cases in which the authors have gathered primary data in a systematic fashion for the explicit purpose of shedding light on ethical issues in evaluation. In a field that prides itself on being committed to decision making informed by such data, this state of affairs is cause for concern. Accordingly, this chapter will focus on primary studies and their value for enhancing our understanding of evaluation ethics.
-------
DIANNA L. NEWMAN is associate professor and director of the Evaluation Consortium at the State University of New York at Albany. She is coauthor of Applied Ethics in Program Evaluation and "Guiding Principles for Evaluators" and has presented and published numerous papers and articles on ethics in evaluation.
… on a developmental model of training evaluators that … issues seem to arouse the passion generated by discussions of whe…

The Ethics of Evaluation Neutrality and Advocacy

Lois-ellin Datta
… increased. The federal government has increased the penalties against the perpetrators and also has taken other measures. An evaluation team gathers to assess the effectiveness of these measures. One member of the team, like many others in this nation, is strongly "pro-life," believing that under the Nuremberg and other rulings, any action is justified in preventing what this person regards as the slaughter of innocents. This team member sees the government as protecting "murderers," believing the general welfare and public good demand closure of the clinics. Another team member, like many others in this nation, is strongly "pro-choice," believing that law and ethics give a woman control over her own body and regarding violence against the clinics and medical personnel as criminal, not heroic. To this team member, ensuring the general welfare and public good requires the protection of clinics offering abortions. Both members call themselves evaluators. Should either of these evaluators participate in or lead the evaluation?
For this chaptei. I have been asked to examine aigumenis loi and ag.nn.si ev.il
nation neuliahty and advocacy To exploic these issues I havr looked pimi.mly al
articles by [xist ptesidentsol the Ameiuan Lv.tlu.u ion Association and mini piunn
nent figures in out field 'I heie is an abundance ol pnoi wouls on this lopu . sonu
of which will be summaiized in the next sections Soiling ilnough tlitin, wh.u
struck me was not then dissimilanty but—with some exceptions—iln.it agKeineni
after one had woiked thiough die definitions given ol advoc.u y and neimaliiy
Nonetheless, some of the discourse on the ethics of advocacy in cvalnaiion si-ems
to take place as though the moial high giound had loom loi only om. luimu
Why the passion, given the common giouiuP (3nc U.IMMI may he die
|X)ieniial for common gionnd in theoiy to gel "balkamzal" in piauut A sicond
-------
reason may be whether the evaluator primarily has in mind a national study or one close to client service delivery. The explanatory power of this distinction, granted, does not always hold. Without deprecating the many ethical and moral dilemmas confronting evaluators, perhaps we could advance a bit further by examining (1) specific evaluations carried out at comparable levels in light of (2) the principles and theories put forth under different banners.
Guiding Principles for Evaluators

Our starting point is the "Guiding Principles for Evaluators" adopted by the American Evaluation Association (AEA) as "a set of principles that should guide the professional practice of evaluators, and that should inform evaluation clients and the general public about the principles they can expect to be upheld by professional evaluators" (American Evaluation Association, 1995, p. 21). There are five broad principles: systematic inquiry, competence, integrity/honesty, respect for people, and responsibilities for general and public welfare. These have been presented earlier in this volume.
All of the AEA Guiding Principles are relevant, to some degree, to the ethics of advocacy. One cannot, however, sum the principles for general guidance regarding advocacy and know exactly what actions to take. First, as intended, the principles are not standards. They do not indicate, for example, matters of practice, such as what would constitute incompetent performance or what types of education, abilities, skills, and experience would be inappropriate for different types of evaluation tasks for different evaluations. Second, it is possible for persons taking different positions on the ethics of advocacy or neutrality in evaluation to cite one or another principle as consistent with their views.

In the next sections, these apparently dissimilar positions are presented together with the AEA Guiding Principles that seem to support them, and then the positions are reexamined to identify what may be common ground that redefines the Principles. First, four definitions (Webster's, 1994):
Advocate: One who defends, vindicates, or espouses a cause by argument; upholder, defender; one who pleads for or in behalf of another.

Adversary: A person or group who opposes another; opponent, foe; any enemy who fights determinedly, relentlessly, continuously.

Partisan: An adherent or supporter of a person, party, or cause; biased, partial.

Nonpartisan: Objective; not supporting any of the established or regular parties.
To Evaluate Requires Credibility: No, Evaluators Should Not Be Advocates

There is no lack of words and deeds concerning what evaluation and the evaluator's role are about. Some could be read as indicating that the evaluator's role is about nonpartisan evaluations regardless of how partisan or non…
For example, Chelimsky (1997) observes:

To be listened to by various stakeholders in even an ordinary political debate requires a great deal of effort by evaluators not only to be competent and objective but to appear so. … There are a great many things we can do … not just technically, in the steps we take to make and explain our evaluative decisions, but also intellectually, in the effort we put forth to look at all sides and stakeholders in an evaluation. … A second implication for evaluators of a political environment is the need for courage. … Speaking out in situations that may include numerous political adversaries, all with different viewpoints and axes to grind, and also insisting on the right to independence in speaking out, takes a strong stomach. … It takes courage to refuse sponsors the answers they want to hear, to resist becoming a "member of the team," to fight inappropriate intrusion into the evaluation process … but when courage is lost, everything is lost [pp. 57-60; see also Cook, 1997].
This is Scriven (1997):

Distancing can be thought of as a scale on which a number of points are of particular interest. At one end of the scale is complete distancing, as when a program (person, policy, or whatever) is evaluated on the basis of extant data alone. At the other end is ownership or authorship of the program, usually conceded to be a poor basis for objective evaluation of it. … Although it is better in principle to use extant data, it is often the case that one needs more, and the risks attendant on personal involvement [bias] must be undertaken. … So-called participatory design, part of the empowerment movement, is about as sloppy as one can get, short of participatory authoring of the final report (unless that report is mainly done for educational or therapeutic purposes). … It is sometimes suggested that the push for distance is itself an attempt to be superior, external, an attempt to play God the Judge. On the contrary, it is part of the simple and sensible human effort to get things right, to uncover and report the truth. … Deciding when and to what extent to withhold those findings from those who paid for them is the "doing what's good for you, not what you asked me to do" step over the border between expertise and censorship/parenting [pp. …].
To Evaluate Is to Advocate: Yes, Evaluators Should Be Advocates

Other work could be read as saying we are about creating partisan evaluations in an irretrievably partisan world. We should be advocates, weighing in on the side of the underdog, the oppressed, the marginalized in the fight for social justice.

Lincoln (1990) writes, "[To the positivists], only if research results were free of human values, and, therefore, free from bias, prejudice, and individual stakes could social action be taken that was neutral with respect to political partisanship. … The constructivist paradigm … has as its central focus the
-------
presentation of multiple, holistic, competing, and often conflictual realities of multiple stakeholders and research participants. … the written report should demonstrate the passion, the commitment, and the involvement of the inquirer with his or her coparticipants in the inquiry" (pp. 70-71). She further comments, "We should abandon the role of dispassionate observer in favor of the role of passionate participant" (p. 86).

Greene (1995), expanding on this thought, urges in her classic, widely cited article, "Evaluation inherently involves advocacy, so the important question becomes advocacy for whom. The most defensible answer to this question is that evaluation should advocate for the interests of the participants" (p. 1). In a related statement, Fetterman (1997) offers a nuanced argument, considering both advocacy and data credibility, that evaluation is best seen as a form of empowerment. He observes, "Empowerment evaluation has an unambiguous value orientation—it is designed to help people help themselves and improve their programs, using a form of self-evaluation and reflection. … Advocacy, in this context, becomes a natural by-product of the self-evaluation process—if the data merit it" (pp. 382-384).

And Mertens (1995):
This principle (D.5) concerning diversity and inclusion has implications not only at the level of identifying and respecting the viewpoints of marginalized groups, but also for the technical adequacy of what evaluators do. … Evaluators need to reflect on how to address validity and reliability honestly in a cultural context, so as not to violate the human rights of the culturally oppressed. … [The emancipatory framework] is more appropriate to stop oppression and bring about social justice. … Three characteristics [of this framework are]: (1) recognition of silenced voices, ensuring that groups traditionally marginalized in society are equally "heard" during the evaluation process and formation of evaluation findings and recommendations; (2) analysis of power inequities in terms of social relationships involved in the planning, implementation, and reporting of evaluations; (3) linking evaluation results to political … [pp. 91-92].
In the context of evaluation as advocacy, stakeholder involvement seems to mean the evaluator should take up the cause of the marginalized. The evaluator should make or support procedural, technical, and methodological decisions favoring the side of the persons directly receiving services.

Some Relevant Principles and Their Implications for Anti- and Pro-Advocacy Stances

The AEA Guiding Principles do not rule out either the anti-advocacy or pro-advocacy stances, and various ones can be cited to support either position.

Against Advocacy. Several of the AEA Guiding Principles can be cited to emphasize the incompatibility of evaluation with an advocacy position as indicated in the quotes given.
These are found primarily under Principle C: Integrity/Honesty. In its subparts, this principle emphasizes that evaluators should assure the honesty and integrity of the entire evaluation process through practices such as being explicit about their own (and others') interests concerning the conduct and outcome of evaluations, disclosing any roles or relationships that might pose a significant conflict of interest.

As these words are generally understood, they are inconsistent with an advocacy position. According to Webster's (1994), honest means "honorable in principles, intentions, actions; fair; genuine or unadulterated; truthful or creditable; unadorned; just; incorruptible; trustworthy; truthful; straightforward; candid." In common understanding, as an evaluator one cannot be fair to all stakeholders and at the same time take a position of advocacy (or adversary) for or against one stakeholder group or the other. The principles tell us to be scrupulous about identifying biases, values, preconceptions favoring one outcome or another that may be held so strongly the evaluator could find it difficult to be fair, incorruptible, just, trustworthy. These threats to fairness specifically and explicitly included political stances. That is, the principles assume that evaluators have biases, prejudices, values, opinions. They require us, however, to be ever mindful of how our values may alter our conduct of the evaluation—and to disqualify ourselves from a particular study if we cannot be balanced, fair, just, incorruptible.
Different organizations use slightly different terms for the same idea. The U.S. General Accounting Office (1997) speaks of "impairments" in one's ability to be fair and just. These impairments can come not only from financial and career interests, but also from values, attitudes, and political views. However phrased, and with appreciation for the nuances of phrasing, the evaluator cannot, this principle makes clear, take sides. This is quite different from reporting findings that may favor the interests of one party or another. Rather, it means conducting the evaluation so that the findings are not slanted beginning, middle, and end by the evaluator's own passions. By this principle, the evaluator must forego balancing perceived inequities with a thumb giving greater weight to the scales of the oppressed.

Considering this reading of Principle C, neither the pro-life nor the pro-choice evaluator should be on the evaluation team. Their political positions seem so deeply held as to be considered an impairment to a fair, just, trustworthy evaluation.
For Advocacy. Another principle, however, could be read as permitting and perhaps encouraging advocacy in evaluation. Principle E considers responsibility for the general and public welfare. It explicitly states: "evaluators have obligations that encompass the public interest and general good … clear threats to the public good should never be ignored in any evaluation. … Because the public interest and good are rarely the same as the interests of any particular group … evaluators will usually have to go beyond an analysis of particular stakeholder interests when considering the welfare of society as a whole" (American Evaluation Association, 1995, pp. 25-26).
-------
A common language reading of this principle requires evaluators to be ever conscious of the public good and general welfare. But the guidelines do not indicate which view of the general welfare and public good is considered. What is stated in law? By currently elected officials? By majority opinion? By the views of whatever group seems most disenfranchised by whatever indicators? By the evaluator's own perception of social justice? As Rossi (1995), discussing this principle, points out, "… what is the public good is the bone of contention among political parties, political ideologies, and even world religions" (p. 57). It seems as though evaluators can select any definition of the public good they choose.

What are the implications of this position for the hypothetical abortion clinics' evaluation? Considering this reading of Principle E, depending on your point of view, either the pro-life or the pro-choice evaluator should serve on the team, but not both. Moreover, any evaluators who have not thought through what the common good and general welfare mean on this issue (that is, on abortion) should reach a position as part of their responsibility.

It seems noteworthy that the basis for Principle E is not a belief that evaluators are irremediably unable to be objective, but rather that we serve a higher social good beyond serving those in charge and the proximal and intermediate stakeholders, such as staff and participants of a particular program. To do only the bidding of those paying for the evaluation is seen as making evaluation little more than market research. Although responsible to our clients, whether internal or external, we are equally responsible, in light of this principle, for considering the general good and public welfare.

Exactly what evaluators have to do beyond "considering" is left unstated. Presumably it includes infusing all aspects of the evaluation with the representation of the ultimate stakeholder—the public good as understood by the evaluator—in the same way one would a more proximal stakeholder.
Common Ground

This brief analysis illustrates what many other evaluators have already noted (see, for example, Rossi, 1995). The principles apparently can be cited in support of neutrality or advocacy in evaluation. It is therefore not to the AEA Guiding Principles as stated that evaluators might look for standards of conduct in specific cases or for a reconciliation, if this is possible, between apparently irreconcilable views.

The ambiguity of the AEA Guiding Principles is consistent with the intentional difference between the general guidance of principles and the operational guidance of standards. Rossi (1995) commented: "The membership of AEA is divided on a number of critical and substantive technical issues. A strongly worded set of standards might easily sunder the weak bonds that bind us together and nullify the compromises that make AEA possible" (p. 56). The principles developed between 1992 and 1994 were intended as part of continuing discussions on ethical issues. They served us well then in offering an ecumenical framework for robust discourse on ethics. Seven years later, however, the principles may need refreshing in order to reflect new approaches, such as emergent realism (Henry, Julnes, and Mark, 1998), and to guide practice. Indeed, some common ground may be present in the values shared by various perspectives on the ethics of evaluation advocacy and neutrality. That is, by examining possible common denominators in recent commentaries on these issues, we may get back to a sense of how to balance apparently competing principles.
Iwo sinking common denominatois aie the value placed on launess and
faithfulness to all stakeholders and on lespecimg deeply ihe dignity of all stake-
holders and their right to be heard A series of recent anicles by leaders in oui
field, such as Lincoln, House, and Greene, gives a window on contempoiaiy
definitions of advocacy in evaluation
This is Lincoln (Ryan, Greene, Lincoln, Mathison, and Mertens, 1998):

We operate from profound social commitments which honor all stakeholder groups' views and perspectives, whether or not we happen to agree with those views. We speak of "advocacy" as if it meant we go into an evaluation determined to take sides, and that would mean, typically, "against" the program managers, administrators, funders, or other critical individuals. When I talk about advocacy, I don't mean taking sides in that more specific sense. What I mean rather refers to becoming an advocate for pluralism, for many voices to feed into the evaluation. What I am advocating for is less a particular individual or group than a position which insists that all stakeholders be identified and solicited for their constructions of the strengths and weaknesses of the program [pp. 102, 108].
A similar idea was expressed vividly in her discussion (1990) of the need to "express multiple, socially constructed, and often conflicting realities. The latter we termed fairness, and judgments were made on the achievement of this criterion in much the same way that labor negotiators and mediators determine fairness in bargaining sessions" (p. 72).
This is House and Howe (1998):

We think the framework [of a Chelimsky study] must be something like this: Include conflicting values and stakeholder groups in the study. Make sure all major views are sufficiently included and represented. Bring conflicting views together so there can be deliberation and dialogue about them among the relevant parties. Make sure there is sufficient room for dialogue to resolve conflicting claims, but also to help the policy makers and media resolve these claims by sorting through the good and bad information. Is this advocacy on the part of the evaluators? We would say no, even though their work is heavily value-laden and incorporates judgment. It is not advocacy, such as taking a side (one or the other) at the beginning of the study and championing one side or another. We suggest three criteria for evaluations to be properly balanced: First, the study should be inclusive so as to represent all relevant views, interests, values, and stakeholders. Second, there should be sufficient dialogue with the relevant groups so that the views are properly and authentically represented. Third, there should be sufficient deliberation to arrive at proper findings [pp. . . .].

This is Greene (Ryan, et al., 1998):
Except in unusual circumstances, I do not believe that evaluators should advocate for the program being evaluated. Such advocacy compromises the perceived credibility and thus the persuasiveness of the evaluative claims. . . . What evaluators should advocate for is their own value commitment. . . . In participatory evaluation, this value commitment is to democratic pluralism, to broadening the policy conversation to include all legitimate perspectives and voices, and to full and fair stakeholder participation in policy and program decision making. . . . The participatory evaluator needs to get in close to the program. But this closeness should not be misconstrued as program partisanship. That is, participatory evaluators do advocate, not for a particular program, but rather for an open, inclusive, engaged conversation among stakeholders about the merit and worth of that program . . . of fairly and fully representing all legitimate interests and concerns in an evaluation [pp. 109, 111].
Reframing the Discussion

Neutral: A person or group not taking part in a controversy; unaligned with one side or another in a controversy; of no particular kind or characteristic; indefinite.

Impartial: Fair, just.
With these definitions (Webster's, 1994), a shared theme among the evaluators cited here is impartiality—the sense of fairness and justice. Neutrality, which might initially seem similar, is too passive, connoting a sort of withdrawal from the storms and complexities of the world. Passivity does not seem to me either characteristic of, or common ground for, our field. This review, then, suggests:
• Diverse evaluators agree that the evaluator should not be an advocate (or presumably, an adversary) of a specific program in the sense of taking sides, of a preconceived position of support (or destruction).
• There is agreement that the evaluator should be an advocate for making sure that as many relevant cards as possible get laid on the table, face up, with the values (worth, merit) showing.
• There is agreement that the evaluator must be aware of how less powerful voices or unpopular views, positions, information can get silenced and make special efforts to ensure that these voices (data, viewpoints) get heard.
Therefore, it may be helpful to reframe the discussions in terms of impartiality or fairness. No evaluation approach of which I know would countenance (1) deliberately ignoring program theories leading to different expectations about what should be studied or measured, or what results to look for; (2) deliberately selecting measures or questions to favor one side over another; (3) deliberately misquoting what an interviewee said; (4) deliberately creating data out of the whole cloth to prove a point; (5) deliberately going from the reams of raw data to conclusions by a sneaky path supporting one side over another; (6) deliberately failing to listen to the views of all parties with conflicting perspectives; (7) deliberately suppressing information that did not support the evaluator's own values or the results the evaluator wished to obtain; (8) deliberately using words that cumulatively skew the report to one side or the other; or (9) deliberately presenting complex, nuanced findings in a simplistic way to favor one position over another (Datta, 1997). Perhaps these points are a start toward expressing standards in this area with which many evaluators could agree.
This is not to say that we may not inadvertently in practice—through methodological limitations, ignorance of how our own views and language create subtle biases, or failure to use strategies for achieving fair and faithful evaluations—do all of the above or more. Rather, it is to say that as I read recent efforts to articulate what we mean, I find that we seek balance and want fairness, like a mighty river, to pour down.
QED? No, Dilemmas Will Remain in Application to Practice
Principles are not standards on how to be fair and just in practice. What principles mean in practice is likely to require continued reexamination and reinterpretation as experience grows, evaluation theories develop, and new technologies arise. For example, to some evaluators, such as Greene and Mertens, closeness and inclusion are essential. The evaluator models, by how the evaluation itself is done, ideals of empowerment, demarginalization of the disenfranchised and oppressed, and in so doing reaps many evaluation benefits such as greater authenticity, better balance, greater fairness, "natural" evaluation utilization. Since truth lies in the eye of the beholder, one logically gets as many beholders as possible.
To others, such as Scriven (1997), opinions and self-interest lie in the eye of the involved stakeholders, albeit experienced by them as truth. Closeness is to be avoided, risking as it does co-option and bias. Also to be avoided is being impartial on an issue during working hours but an adversary or advocate on the same issue when the meter isn't running. Inclusion of relevant, but unpopular or silenced views, to such evaluators is as crucial to evaluation as it is to those encouraging closeness. The techniques for achieving such inclusion are not seen as requiring sitting around a table, as it were, with the evaluator as moderator when decisions are made about design, measures, analyses, and reports. Rather, the techniques include using extant data and relying on performance data rather than staff interviews, and where such interviews are essential, being sure they involve prestructuring based on other data and are conducted by well-trained, well-supervised interviewers. Other approaches include goal-free methods, heavy interviewing with consumers and other stakeholders, and in all of this, applying quality control procedures such as audiotape backups. Parenthetically, a fine example of an evaluation using such approaches in a responsive evaluation framework is now available (Stake, Davis, and Guynn, 1997).
Chelimsky (1997) is pragmatic about methods for achieving inclusiveness. Though considerably less daunted than Scriven about being captured, subdued, or co-opted by sitting down with stakeholders, she would be highly on her guard against efforts to coerce evaluators or otherwise place them in an advocacy or adversary position. (For example, being set up as a Congressional pit-bull chomping on a possibly effective but out-of-favor program such as WIC would be as threatening to the GAO's independence and credibility as being cast as a shill for a possibly ineffective but popular program such as chemical warfare.)
Are Greene and Mertens talking about different types of programs than Scriven and Chelimsky, and thus the apparent disagreements are a case of "It depends"? Evaluators vary in the ease of public access to evaluations they have completed, or in how closely anchored their discussions of advocacy and neutrality are in specific work. It seems likely that positions recommending closeness and inclusiveness are more feasible with fairly small-scale evaluations, perhaps on local or state levels. For example, one could bring out all stakeholders' voices fairly in a small program, such as a local hospice center, an individual school, or even a county-wide recycling program.
In contrast, although it is easy to envision inclusiveness in a national evaluation, it is more difficult to imagine one-on-one closeness. As an example, the first issue of the new Head Start journal aimed at promoting researcher-evaluator and practitioner dialogue focuses on stakeholder collaboration and participation, but includes examples only from small studies, not the many national evaluations. However, some federal agencies now are writing Requests for Proposals (RFPs) consistent with empowerment and participatory views (such as the Bureau of Indian Affairs), so in the future, we may be able to see more empirically the transportability of the inclusive, close, participatory approach.
Personally, I would like to read, in full, an evaluation someone has completed (several, if possible) as a way of seeing what difference the principles make in practice and where, if any place, "it depends." We might be somewhat farther along if such evaluations were easily available as companion pieces to the more theoretical articles. House (1995) wisely wrote, "It is difficult to write intelligently about ethics and values. One reason is that ethical problems are manifested in particular concrete cases and endorsement of general principles sometimes seems platitudinous or irrelevant. Ethical concerns become . . ."

References

. . . New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Bell, S. "Crafting a Non-Partisan Evaluation in a Partisan World: The Urban Institute New Federalism Evaluation." Paper presented at the American Evaluation Association Conference, San Diego, November 1997.
Ilicknun, 1. •Implications ol die Ion Biagg l.v.ilii.mon ' / I',I(II,KI,>II 1'i.nn., IWo //
( lieliinsky. l: "Ilie Pnliik.il I nviioniiieni of I v.ilu.iiion .mil Wli.n h MI.IM:,!,,! id, |i,i-,|
opineni ol ihe Field " In I. c.helimbky and W Sh.idi»li (eds I / v.iln.iiinii /,» id, JIM (. „
luiy A llanJIwIt lluuisand Oaks. C .ill) baj-e. I'W7
Cook, T. D. "Lessons Learned in Evaluation Over the Past 25 Years." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Datta, L. "Crafting Non-Partisan Evaluations in a Partisan World." Paper presented at the American Evaluation Association Conference, San Diego, November 1997.
Fetterman, D. "Empowerment Evaluation and Accreditation in Higher Education." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Greene, J. C. "Evaluation as Advocacy." Evaluation Practice, 1995, 18, 25-36.
Henry, G. T., Julnes, G., and Mark, M. M. (eds.). Realist Evaluation: An Emerging Theory in Support of Practice. New Directions for Evaluation, no. 78. San Francisco: Jossey-Bass, 1998.
House, E. R., and Howe, K. R. "The Issue of Advocacy in Evaluations." American Journal of Evaluation, 1998, 19, 233-236.
House, E. R. "Principled Evaluation: A Critique of the AEA Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Lincoln, Y. "The Making of a Constructivist: A Remembrance of Transformations Past." In E. Guba (ed.), The Paradigm Dialog. Thousand Oaks, Calif.: Sage, 1990.
Mertens, D. "Identifying and Respecting Differences Among Participants in Evaluation Studies." In W. R. Shadish, D. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Morris, M. (Chair). "What's an Evaluator to Do? Confronting Ethical Dilemmas in Practice." Session presented at the American Evaluation Association Conference, San Diego, November 1997.
Nelkin, V. S. (Chair). "Ethical Dilemmas in Evaluation." Session presented at the American Evaluation Association Conference, Vancouver, B.C., November 1995.
Rossi, P. H. "Doing Good and Getting It Right." In W. R. Shadish, D. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Ryan, K., Greene, J., Lincoln, Y., Mathison, S., and Mertens, D. M. "Advantages and Disadvantages of Using Inclusive Evaluation Approaches in Evaluation Practice." American Journal of Evaluation, 1998, 19, 101-122.
Scriven, M. "Truth and Objectivity in Evaluation." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Stake, R. E., Davis, R., and Guynn, S. Evaluation of Reader Focused Writing for the Veterans Benefits Administration. Champaign, Ill.: CIRCE at the University of Illinois, 1997.
United States General Accounting Office. Auditing Standards. Washington, D.C.: Author, 1997.
Webster's Encyclopedic Unabridged Dictionary of the English Language. New York: Gramercy Books, 1994.
LOIS-ELLIN DATTA, president of Datta Analysis, is a past president of the Evaluation Research Society and recipient of the Alva and Gunnar Myrdal Award for Evaluation in Government Service and of the Robert Ingle Award for Extraordinary Service to the American Evaluation Association. She has been director of research and evaluation for Project Head Start and the Children's Bureau, director for teaching, learning, and dissemination at the National Institute of Education, and Director for Program . . .
There has been little discussion in evaluation literature in the United States of ethical issues in conducting evaluation in international settings. Although many of the same ethical issues arise wherever the evaluation is conducted, two sets of ethical issues that are particularly important in developing countries concern how stakeholders should be involved and to what extent the evaluator should respect local customs and values.

Ethical Issues in Conducting Evaluation in International Settings

Michael Bamberger
This chapter reviews some of the ethical issues identified in evaluations in the United States and considers the similarities and differences in application of these issues in developing countries. It also identifies a number of ethical issues arising in international evaluations that are less common in U.S. domestic evaluations. These concern the role of international agencies in financing, promoting, and conducting evaluation in developing countries and how the political, economic, and cultural characteristics of developing countries affect evaluation practice. We refer frequently to the Joint Committee's Program Evaluation Standards and the American Evaluation Association's (AEA) "Guiding Principles for Evaluators" as illustrations of how U.S. evaluators have approached issues relating to professional evaluation standards and to show how these approaches have been viewed from the perspective of developing countries.
Ethical Issues in International Evaluation
This chapter is concerned primarily with evaluations in developing countries that are conducted by, or sponsored by, multilateral development agencies (World Bank, UNICEF, Inter-American Development Bank, for example), bilateral development agencies (USAID, CIDA), international nongovernmental organizations (NGOs) (OXFAM, CARE, World Vision), and North American or European-based research institutes (universities).
-------
Program Evaluation Standards    http://www.eval.org/EvaluationDocuments/progeval.html
The Program Evaluation Standards
Summary of the Standards
Utility Standards
The utility standards are intended to ensure that an evaluation will serve the
information needs of intended users.
U1 Stakeholder Identification—Persons involved in or affected by the evaluation
should be identified, so that their needs can be addressed.
U2 Evaluator Credibility—The persons conducting the evaluation should be both
trustworthy and competent to perform the evaluation, so that the evaluation
findings achieve maximum credibility and acceptance.
U3 Information Scope and Selection—Information collected should be broadly
selected to address pertinent questions about the program and be responsive to
the needs and interests of clients and other specified stakeholders.
U4 Values Identification—The perspectives, procedures, and rationale used to
interpret the findings should be carefully described, so that the bases for value
judgments are clear.
U5 Report Clarity—Evaluation reports should clearly describe the program being
evaluated, including its context, and the purposes, procedures, and findings of the
evaluation, so that essential information is provided and easily understood.
U6 Report Timeliness and Dissemination—Significant interim findings and
evaluation reports should be disseminated to intended users, so that they can be
used in a timely fashion.
U7 Evaluation Impact—Evaluations should be planned, conducted, and reported in
ways that encourage follow-through by stakeholders, so that the likelihood that
the evaluation will be used is increased.
Feasibility Standards
The feasibility standards are intended to ensure that an evaluation will be realistic,
prudent, diplomatic, and frugal.
F1 Practical Procedures—The evaluation procedures should be practical, to keep
disruption to a minimum while needed information is obtained.
F2 Political Viability—The evaluation should be planned and conducted with
anticipation of the different positions of various interest groups, so that their
cooperation may be obtained, and so that possible attempts by any of these
groups to curtail evaluation operations or to bias or misapply the results can be
averted or counteracted.
F3 Cost Effectiveness—The evaluation should be efficient and produce information
of sufficient value, so that the resources expended can be justified.
Propriety Standards
The propriety standards are intended to ensure that an evaluation will be
conducted legally, ethically, and with due regard for the welfare of those involved
in the evaluation, as well as those affected by its results.
P1 Service Orientation—Evaluations should be designed to assist organizations to
address and effectively serve the needs of the full range of targeted participants.
P2 Formal Agreements—Obligations of the formal parties to an evaluation (what is
to be done, how, by whom, when) should be agreed to in writing, so that these
parties are obligated to adhere to all conditions of the agreement or formally to
renegotiate it.
P3 Rights of Human Subjects—Evaluations should be designed and conducted to
respect and protect the rights and welfare of human subjects.
P4 Human Interactions—Evaluators should respect human dignity and worth in
their interactions with other persons associated with an evaluation, so that
participants are not threatened or harmed.
P5 Complete and Fair Assessment—The evaluation should be complete and fair in
its examination and recording of strengths and weaknesses of the program being
evaluated, so that strengths can be built upon and problem areas addressed.
P6 Disclosure of Findings—The formal parties to an evaluation should ensure that
the full set of evaluation findings along with pertinent limitations are made
accessible to the persons affected by the evaluation, and any others with
expressed legal rights to receive the results.
P7 Conflict of Interest—Conflict of interest should be dealt with openly and
honestly, so that it does not compromise the evaluation processes and results.
P8 Fiscal Responsibility—The evaluator's allocation and expenditure of resources
should reflect sound accountability procedures and otherwise be prudent and
ethically responsible, so that expenditures are accounted for and appropriate.
Accuracy Standards
The accuracy standards are intended to ensure that an evaluation will reveal and
convey technically adequate information about the features that determine worth
or merit of the program being evaluated.
A1 Program Documentation—The program being evaluated should be described
and documented clearly and accurately, so that the program is clearly identified.
A2 Context Analysis—The context in which the program exists should be examined
in enough detail, so that its likely influences on the program can be identified.
A3 Described Purposes and Procedures—The purposes and procedures of the
evaluation should be monitored and described in enough detail, so that they can
be identified and assessed.
A4 Defensible Information Sources—The sources of information used in a program
evaluation should be described in enough detail, so that the adequacy of the
information can be assessed.
A5 Valid Information—The information gathering procedures should be chosen or
developed and then implemented so that they will assure that the interpretation
arrived at is valid for the intended use.
A6 Reliable Information—The information gathering procedures should be chosen
or developed and then implemented so that they will assure that the information
obtained is sufficiently reliable for the intended use.
A7 Systematic Information—The information collected, processed, and reported in
an evaluation should be systematically reviewed and any errors found should be
corrected.
A8 Analysis of Quantitative Information—Quantitative information in an evaluation
should be appropriately and systematically analyzed so that evaluation questions
are effectively answered.
A9 Analysis of Qualitative Information—Qualitative information in an evaluation
should be appropriately and systematically analyzed so that evaluation questions
are effectively answered.
A10 Justified Conclusions—The conclusions reached in an evaluation should be
explicitly justified, so that stakeholders can assess them.
All Impartial Reporting—Reporting procedures should guard against distortion
caused by personal feelings and biases of any party to the evaluation, so that
evaluation reports fairly reflect the evaluation findings.
A12 Metaevaluation—The evaluation itself should be formatively and summatively
evaluated agajnst these and other pertinent standards, so that its conduct is
appropriately guided and, on completion, stakeholders can closely examine its
strengths and weaknesses.
Prepared by:
Mary E. Ramlow
The Evaluation Center
401B Ellsworth Hall
Western Michigan University
Kalamazoo, MI 49008-5178
Phone: 616-387-5895
Fax: 616-387-5923
Email: Mary.Ramlow@wmich.edu
-------
Guiding Principles for Evaluators    http://www.eval.org/EvaluationDocuments/aeaprin6.html
Guiding Principles for Evaluators
A Report from the AEA Task Force on
Guiding Principles for Evaluators
Members of the Task Force
Dianna Newman, University of Albany/SUNY
Mary Ann Scheirer, Private Practice
William Shadish, Memphis State University (Chair),
w.shadish@mail.psyc.memphis.edu
Chris Wye, National Academy of Public Administration
I. Introduction
A. Background: In 1986, the Evaluation Network (ENet) and the
Evaluation Research Society (ERS) merged to create the American
Evaluation Association. ERS had previously adopted a set of standards
for program evaluation (published in New Directions for Program
Evaluation in 1982); and both organizations had lent support to work of
other organizations about evaluation guidelines. However, none of these
standards or guidelines were officially adopted by AEA, nor were any
other ethics, standards, or guiding principles put into place. Over the
ensuing years, the need for such guiding principles has been discussed
by both the AEA Board and the AEA membership. Under the presidency
of David Cordray in 1992, the AEA Board appointed a temporary
committee chaired by Peter Rossi to examine whether AEA should
address this matter in more detail. That committee issued a report to the
AEA Board on November 4, 1992, recommending that AEA should pursue
this matter further. The Board followed that recommendation, and on
that date created a Task Force to develop a draft of guiding principles for
evaluators. The AEA Board specifically instructed the Task Force to
develop general guiding principles rather than specific standards of
practice. This report summarizes the Task Force's response to the
charge.
B. Process: Task Force members reviewed relevant documents from
other professional societies, and then independently prepared and
circulated drafts of material for use in this report. Initial and subsequent
drafts (compiled by the Task Force chair) were discussed during
conference calls, with revisions occurring after each call. Progress
reports were presented at every AEA board meeting during 1993. In
addition, a draft of the guidelines was mailed to all AEA members in
September 1993 requesting feedback; and three symposia at the 1993
AEA annual conference were used to discuss and obtain further
feedback. The Task Force considered all this feedback in a December
1993 conference call, and prepared a final draft in January 1994. This
draft was presented and approved for membership vote at the January
1994 AEA board meeting.
C. Resulting Principles: Given the diversity of interests and employment
settings represented on the Task Force, it is noteworthy that Task Force
members reached substantial agreement about the following five
principles. The order of these principles does not imply priority among
them; priority will vary by situation and evaluator role.
1. Systematic Inquiry: Evaluators conduct systematic,
data-based inquiries about whatever is being evaluated.
2. Competence: Evaluators provide competent performance to
stakeholders.
3. Integrity/Honesty: Evaluators ensure the honesty and
integrity of the entire evaluation process.
4. Respect for People: Evaluators respect the security, dignity
and self-worth of the respondents, program participants,
clients, and other stakeholders with whom they interact.
5. Responsibilities for General and Public Welfare: Evaluators
articulate and take into account the diversity of interests and
values that may be related to the general and public welfare.
These five principles are elaborated in Section III of this
document.
D. Recommendation for Continued Work: The Task Force also
recommends that the AEA Board establish and support a mechanism for
the continued development and dissemination of these Guiding
Principles.
II. Preface: Assumptions Concerning Development of Principles
A. Evaluation is a profession composed of persons with varying interests,
potentially encompassing but not limited to the evaluation of programs,
products, personnel, policy, performance, proposals, technology,
research, theory, and even of evaluation itself. These principles are
broadly intended to cover all kinds of evaluation.
B. Based on differences in training, experience, and work settings, the
profession of evaluation encompasses diverse perceptions about the
primary purpose of evaluation. These include but are not limited to the
following: bettering products, personnel, programs, organizations,
governments, consumers and the public interest; contributing to
informed decision making and more enlightened change; precipitating
needed change; empowering all stakeholders by collecting data from
them and engaging them in the evaluation process; and experiencing the
excitement of new insights. Despite that diversity, the common ground is
that evaluators aspire to construct and provide the best possible
information that might bear on the value of whatever is being evaluated.
The principles are intended to foster that primary aim.
C. The intention of the Task Force was to articulate a set of principles
that should guide the professional practice of evaluators, and that should
inform evaluation clients and the general public about the principles they
can expect to be upheld by professional evaluators. Of course, no
statement of principles can anticipate all situations that arise in the
practice of evaluation. However, principles are not just guidelines for
reaction when something goes wrong or when a dilemma is found.
Rather, principles should proactively guide the behaviors of professionals
in everyday practice.
D. The purpose of documenting guiding principles is to foster continuing
development of the profession of evaluation, and the socialization of its
members. The principles are meant to stimulate discussion and to
provide a language for dialogue about the proper practice and application
of evaluation among members of the profession, sponsors of evaluation,
and others interested in evaluation.
E. The five principles proposed in this document are not independent,
but overlap in many ways. Conversely, sometimes these principles will
conflict, so that evaluators will have to choose among them. At such
times evaluators must use their own values and knowledge of the setting
to determine the appropriate response. Whenever a course of action is
unclear, evaluators should solicit the advice of fellow evaluators about
how to resolve the problem before deciding how to proceed.
F. These principles are intended to replace any previous work on
standards, principles, or ethics adopted by ERS or ENet, the two
predecessor organizations to AEA. These principles are the official
position of AEA on these matters.
G. Each principle is illustrated by a number of statements to amplify the
meaning of the overarching principle, and to provide guidance for its
application. These statements are illustrations. They are not meant to
include all possible applications of that principle, nor to be viewed as
rules that provide the basis for sanctioning violators.
H. These principles are not intended to be or to replace standards
supported by evaluators or by the other disciplines in which evaluators
participate. Specifically, AEA supports the effort to develop standards for
educational evaluation by the Joint Committee on Standards for
Educational Evaluation, of which AEA is a cosponsor.
I. These principles were developed in the context of Western cultures,
particularly the United States, and so may reflect the experiences of that
context. The relevance of these principles may vary across other
cultures, and across subcultures within the United States.
J. These principles are part of an evolving process of self-examination by
the profession, and should be revisited on a regular basis. Mechanisms
might include officially-sponsored reviews of principles at annual
meetings, and other forums for harvesting experience with the principles
and their application. On a regular basis, but at least every five years
from the date they initially take effect, these principles ought to be
examined for possible review and revision. In order to maintain
association-wide awareness and relevance, all AEA members are
encouraged to participate in this process.
III. The Principles
A. Systematic Inquiry: Evaluators conduct systematic, data-based
inquiries about whatever is being evaluated.
1. Evaluators should adhere to the highest appropriate
technical standards in conducting their work, whether that
work is quantitative or qualitative in nature, so as to increase
the accuracy and credibility of the evaluative information they
produce.
2. Evaluators should explore with the client the shortcomings
and strengths both of the various evaluation questions it might
be productive to ask, and the various approaches that might
be used for answering those questions.
3. When presenting their work, evaluators should communicate
their methods and approaches accurately and in sufficient
detail to allow others to understand, interpret and critique their
work. They should make clear the limitations of an evaluation
and its results. Evaluators should discuss in a contextually
appropriate way those values, assumptions, theories, methods,
results, and analyses that significantly affect the interpretation
of the evaluative findings. These statements apply to all
aspects of the evaluation, from its initial conceptualization to
the eventual use of findings.
B. Competence: Evaluators provide competent performance to
stakeholders.
-------
interests).
4. Evaluators should disclose any roles or relationships they
have concerning whatever is being evaluated that might pose a
significant conflict of interest with their role as an evaluator.
Any such conflict should be mentioned in reports of the
evaluation results.
5. Evaluators should not misrepresent their procedures, data
or findings. Within reasonable limits, they should attempt to
prevent or correct any substantial misuses of their work by
others.
6. If evaluators determine that certain procedures or activities
seem likely to produce misleading evaluative information or
conclusions, they have the responsibility to communicate their
concerns, and the reasons for them, to the client (the one who
funds or requests the evaluation). If discussions with the client
do not resolve these concerns, so that a misleading evaluation
is then implemented, the evaluator may legitimately decline to
conduct the evaluation if that is feasible and appropriate. If
not, the evaluator should consult colleagues or relevant
stakeholders about other proper ways to proceed (options
might include, but are not limited to, discussions at a higher
level, a dissenting cover letter or appendix, or refusal to sign
the final document).
7. Barring compelling reason to the contrary, evaluators should
disclose all sources of financial support for an evaluation, and
the source of the request for the evaluation.
D. Respect for People: Evaluators respect the security, dignity and
self-worth of the respondents, program participants, clients, and other
stakeholders with whom they interact.
1. Where applicable, evaluators must abide by current
professional ethics and standards regarding risks, harms, and
burdens that might be engendered to those participating in the
evaluation; regarding informed consent for participation in
evaluation; and regarding informing participants about the
scope and limits of confidentiality. Examples of such standards
include federal regulations about protection of human subjects,
or the ethical principles of such associations as the American
Anthropological Association, the American Educational
Research Association, or the American Psychological
Association. Although this principle is not intended to extend
the applicability of such ethics and standards beyond their
current scope, evaluators should abide by them where it is
feasible and desirable to do so.
2. Because justified negative or critical conclusions from an
evaluation must be explicitly stated, evaluations sometimes
produce results that harm client or stakeholder interests.
Under this circumstance, evaluators should seek to maximize
the benefits and reduce any unnecessary harms that might
occur, provided this will not compromise the integrity of the
evaluation findings. Evaluators should carefully judge when the
benefits from doing the evaluation or in performing certain
evaluation procedures should be foregone because of the risks
or harms. Where possible, these issues should be anticipated
during the negotiation of the evaluation.
3. Knowing that evaluations often will negatively affect the
interests of some stakeholders, evaluators should conduct the
evaluation and communicate its results in a way that clearly
respects the stakeholders' dignity and self-worth.
4. Where feasible, evaluators should attempt to foster the
social equity of the evaluation, so that those who give to the
evaluation can receive some benefits in return. For example,
evaluators should seek to ensure that those who bear the
burdens of contributing data and incurring any risks are doing
so willingly, and that they have full knowledge of, and
maximum feasible opportunity to obtain any benefits that may
be produced from the evaluation. When it would not endanger
the integrity of the evaluation, respondents or program
participants should be informed if and how they can receive
services to which they are otherwise entitled without
participating in the evaluation.
5. Evaluators have the responsibility to identify and respect
differences among participants, such as differences in their
culture, religion, gender, disability, age, sexual orientation and
ethnicity, and to be mindful of potential implications of these
differences when planning, conducting, analyzing, and
reporting their evaluations.
E. Responsibilities for General and Public Welfare: Evaluators articulate
and take into account the diversity of interests and values that may be
related to the general and public welfare.
1. When planning and reporting evaluations, evaluators should
consider including important perspectives and interests of the
full range of stakeholders in the object being evaluated.
Evaluators should carefully consider the justification when
omitting important value perspectives or the views of
important groups.
2. Evaluators should consider not only the immediate
operations and outcomes of whatever is being evaluated, but
also the broad assumptions, implications and potential side
effects of it.
3. Freedom of information is essential in a democracy. Hence,
barring compelling reason to the contrary, evaluators should
allow all relevant stakeholders to have access to evaluative
information, and should actively disseminate that information
to stakeholders if resources allow. If different evaluation
results are communicated in forms that are tailored to the
interests of different stakeholders, those communications
should ensure that each stakeholder group is aware of the
existence of the other communications. Communications that
are tailored to a given stakeholder should always include all
important results that may bear on interests of that
stakeholder. In all cases, evaluators should strive to present
results as clearly and simply as accuracy allows so that clients
and other stakeholders can easily understand the evaluation
process and results.
4. Evaluators should maintain a balance between client needs
and other needs. Evaluators necessarily have a special
relationship with the client who funds or requests the
evaluation. By virtue of that relationship, evaluators must
strive to meet legitimate client needs whenever it is feasible
and appropriate to do so. However, that relationship can also
place evaluators in difficult dilemmas when client interests
conflict with other interests, or when client interests conflict
with the obligation of evaluators for systematic inquiry,
competence, integrity, and respect for people. In these cases,
evaluators should explicitly identify and discuss the conflicts
with the client and relevant stakeholders, resolve them when
possible, determine whether continued work on the evaluation
is advisable if the conflicts cannot be resolved, and make clear
any significant limitations on the evaluation that might result if
the conflict is not resolved.
5. Evaluators have obligations that encompass the public
interest and good. These obligations are especially important
when evaluators are supported by publicly-generated funds;
but clear threats to the public good should never be ignored in
any evaluation. Because the public interest and good are rarely
the same as the interests of any particular group (including
those of the client or funding agency), evaluators will usually
have to go beyond an analysis of particular stakeholder
interests when considering the welfare of society as a whole.
-------
WHAT COMES TO MIND WHEN YOU HEAR THE WORD . . .

For each term, note its process, major activities, purpose, and words to describe it:

Evaluation
Auditing
Investigation
Research
Monitoring
Assessment
-------
EVALUATION AND AUDITING

Refer to articles in: Carl Wisler, ed. (1996). New Directions for Program Evaluation, no. 71. San Francisco, CA: Jossey-Bass Publishers.

Editor's Notes, Carl Wisler
Divorski, Stan. "Differences in the Approaches of Auditors and Evaluators to the Examination of Government Policies and Programs"
Chelimsky, Eleanor. "Auditing and Evaluation: Whither the Relationship?"

"Performance auditing is an objective and systematic examination of evidence for the purpose of providing an independent assessment of the performance of a government organization, program, activity, or function in order to provide information to improve public accountability and facilitate decision-making by parties with responsibility to oversee or initiate corrective action." Comptroller General of the U.S., 1994
Evaluation:
• New (1960s)
• Offshoot of social sciences concerned with theory and explanation
• Less precise, more intellectually stimulating
• Multiple audiences
• Criteria selection flexible
• Cooperative, interactive relationships with evaluees
• Examines a question of "why" (what will produce desired/undesired effects?)
• Other:

Auditing:
• Older profession (stemming from accounting, bookkeeping)
• Verification of authoritative documents
• Single client
• Fixed criteria (comparing "what is" to "what should be")
• Examines a question of "what" (does what was done conform to standards?)
• End result is an opinion: Is observed performance consistent with accepted norms?
• Other:
-------
EDITOR'S NOTES
During the last several decades, the disciplines of evaluation and auditing
have each gone through substantial change. The purpose of this volume is not
to explore all the reasons and consequences associated with such change, but
to focus on the extent to which audit and evaluation have converged on simi-
lar procedures and organizational structures and on the extent to which they
have remained different An effort has been made to understand and describe
the issues from both evaluation and auditing perspectives.
Both evaluation and auditing claim to help decision makers by providing
them with systematic and credible information that can be useful in the cre-
ation, management, oversight, change, and occasionally abolishment of pro-
grams. Yet despite considerable overlap in objectives, subject matter, and
clients, auditing and evaluation have until recently functioned largely in isola-
tion from one another. The literature of each discipline scarcely recognizes the
existence of the other. Academic preparation of auditors and evaluators could
hardly be more different. Organizationally, the two activities have traditionally
been separate. The practitioners have difficulty communicating with one
another not only because of differences in vocabulary but also because of some
important differences in mind-set. Will auditing and evaluation persist as distinctive services to decision makers or is there a possibility of a merger or blend
of the two activities?
Differences between auditing and evaluation are rooted in the older pro-
fessions from which they emerged. Auditing evolved from financial account-
ing and so makes much use of concepts like verification, internal controls, and
good management practice. Evaluation emerged from the sciences, especially
social science, and so has tended to carry with it the trappings of measurement,
probability sampling, and experimentation. The authors represented in this
volume make frequent reference to the conceptual underpinnings of the two
disciplines when they point out how auditing and evaluation are different.
One way to get a quick sense of some differences between auditing and
evaluation is to consider the kinds of questions for which the two fields have
tried to provide answers. Three categories of questions are especially useful for comparing auditing and evaluation: descriptive, normative, and cause-and-effect (Wisler, 1984a, 1984b). These categories are explicitly or implicitly
referred to any number of times in the following chapters
Program evaluation has always given much attention to cause-and-effect
questions, especially ones about the overall impact of a program. The answer to an impact question is usually formulated as the difference between an outcome observed after a program has been in operation and the outcome that would have been observed in the absence of the program. Evaluation clients
also seek the answers to descriptive questions—ones that do not compare two
conditions but simply describe a state of the world. Common examples include questions about societal needs, the setting in which a program operates, or the way a program was implemented.
Auditing has traditionally focused on normative questions for which the answer compares "what is" with "what should be." Seldom do auditors seek answers to descriptive questions, nor do they often consider cause-and-effect questions, at least in the sense that evaluators understand that term.
One of the most interesting differences in disciplinary perspective is the contrast between program audit, which usually starts with a normative question, and impact evaluation, which focuses on cause and effect. A brief overview of these two forms of inquiry is illuminating because practitioners from each discipline claim that their respective approaches address the issue of program effectiveness. However, the similarity seems to end there. Methodologically, they are different because in fact they address two distinct questions about program effectiveness. Although the two methodological approaches can be distinguished easily, it is probable that most clients, and at least some practitioners as well, do not fully appreciate the important differences.
As defined by the Comptroller General of the United States (1994, p. 14), performance audit is "an objective and systematic examination of evidence of the performance of a government organization, program, activity, or function in order to provide information to improve public accountability and facilitate decision-making." A program audit is a subcategory of a performance audit for which one objective (of three) is to determine "the extent to which the desired results or benefits established by the legislature or other authorizing body are being achieved." This application of program audit provides an interesting contrast with impact evaluation. (The scope of auditing is broad, of course, and the practitioners may employ objectives and methodologies different from those reviewed here. The notion of value-for-money audits, used in many countries, is similar to the concept of performance auditing.)
Schandl (1978, p. 4) says "auditing is a human evaluation process to establish the adherence to certain norms, resulting in an opinion (or judgment)." Herbert (1979) describes developments at the U.S. General Accounting Office in the 1960s that led to a prescription for management and program audits; those prescriptions are comparable to the general concept set forth by Schandl regarding how to undertake management and program audits. The central idea in a program audit is to compare "what is" with "what should be" (Comptroller General of the United States, 1979 [1974]). This notion seems to flow from earlier forms of auditing in which generally accepted accounting practices or, more broadly, generally accepted management practices played key roles. The methodology was to establish a standard (a generally accepted management practice, for example) and to compare that standard with actual program practice. Any discovery of a serious discrepancy would be regarded as a deficiency and lead to a negative audit report.

In turning to broader issues of program effectiveness, auditors look along the normative question. To conduct a program audit, it was therefore necessary to identify one or more specific program objectives to play the role of "what should be." Such objectives might come from legislation, regulations, declarations of intent by program managers, and so on. Actual program performance, determined empirically, provided the "what is" component. And program effect was defined as the difference between the program objective and actual performance. (In auditing, an effect is sometimes defined differently to mean the consequences of a discrepancy. For example, if a program to improve water quality showed a shortfall in the achievement of water purity standards, the effects might be greater incidence of disease, lost time at work due to illness, and so on.)
The audit approach to program effectiveness is in the spirit of the strategy advocated in the problem-solving literature as exemplified by Kepner and Tregoe (1976). It is also very much in the mode of objectives-oriented evaluation approaches (see Chapter 5 of Worthen and Sanders, 1987) sometimes used with educational programs—and especially of the version called discrepancy evaluation (Provus, 1971). It does not correspond, however, to the conventional notion of impact evaluation.
Program impact evaluation stems from the experimental design used in a number of sciences wherein comparisons are made between outcomes associated with randomly assigned treatment and control groups. In evaluation, the methodology is generally understood not to require random assignment but to extend to other approaches that permit comparisons between what happened in the presence of the program and what would have happened in the absence of the program. Such quasi-experimental designs are prominent in evaluation as ways to answer impact questions (Cook and Campbell, 1979). (Auditors and others new to the evaluation literature have to contend with the variety of terms that may be used interchangeably with impact evaluation, most notably impact assessment, impact analysis, and program effectiveness evaluation.)
The two questions posed by auditors and evaluators about program effectiveness can, and generally will, lead to quite different conclusions about the performance of a program. Both conclusions may be correct—they are just answers to different questions. Unfortunately, because of the language used, the unwary may perceive the questions to be the same.
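To see how the two definitions can point in opposite directions, here is a small worked illustration (the program and every number are invented for this sketch, not drawn from any actual audit or evaluation). Suppose a legislature sets a 60 percent job-placement objective for a training program, the observed placement rate is 50 percent, and a comparison-group analysis estimates that placement would have been 35 percent without the program:

\[
\text{audit effect} = \text{actual performance} - \text{objective} = 50\% - 60\% = -10\ \text{points}
\]
\[
\text{program impact} = \text{observed outcome} - \text{counterfactual outcome} = 50\% - 35\% = +15\ \text{points}
\]

On these hypothetical numbers, a program audit would report a ten-point shortfall against "what should be," while an impact evaluation would report a fifteen-point gain over "what would have been"; both are correct answers, but to different questions.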
The foregoing comparison of program audit and impact evaluation illustrates the purpose of this volume: not to give a full-featured account of the heartlands of evaluation and auditing, but more to focus on the territory where they come together. The authors of the five chapters of this volume have all been in positions to survey the terrain that is roamed both by bands of evaluators and of auditors, and they offer their views on the jointly occupied territory along with occasional references to the heartlands. Earlier versions of the chapters were presented at a session of the International Evaluation Conference held in Vancouver, British Columbia, November 1-5, 1995.
Stan Divorski, from his vantage point with the Office of the Auditor General in Canada, sets forth five key dimensions that he finds distinguish the mind-set of auditors from that of evaluators. Roger Brooks brings his perspective in the
Minnesota Office of Legislative Auditor to consider the extent to which the
auditing and evaluation cultures have been blended, or at least have that
potential if blending is desired. Christopher Pollitt and Hilkka Summa bring their experiences with audit institutions in the United Kingdom and Finland, respectively, to bear in comparing the ways in which auditors and evaluators approach similar tasks. Frans L. Leeuw from the Netherlands Court of Audit
focuses on the contributions of evaluators and auditors to the improvement of
public sector performance and, in so doing, draws attention to a slice of the
international literature comparing the two fields. Finally, Eleanor Chelimsky, formerly of the U.S. General Accounting Office, highlights the conclusions of
the other authors, based on their conference papers, and offers her own views
on the pros and cons of integrating auditing and evaluation or keeping them
as separate services to decision makers.
Collectively, the authors point out many similarities and differences between auditing and evaluation. Indeed, there is such variety in the observations that it is difficult to categorize them and gauge their importance. If, as seems to be the case, evaluation and auditing are moving closer together, what are the differences that seem most likely to hold them apart? Readers may wish to consider the following three themes and conjectures as they read this volume.

The inclination of auditors toward normative questions and of evaluators toward descriptive and impact questions. The difference is rooted in the history of the disciplines and in the educational preparation of the practitioners. Although examples of crossover between the disciplines can be cited, a broad-scale mix of the approaches seems likely to take a long time, if it ever occurs.
Independence versus collaboration with the subjects of audit and evaluation. Auditors have attached great importance to their independence from both client and auditee, while evaluators have tended to work more closely with their clients and to move toward yet greater collaboration with evaluees. Even if the tenets of fourth-generation evaluation are not adopted by most evaluators, mainstream evaluators seem unlikely to return to the extremes of scientific detachment that once prevailed. The two disciplines therefore seem destined to be some distance apart on the scale of independence.
Differences in the degree to which auditing and evaluation have become routinized government operations. The role of auditing in government activities has been generally acknowledged with legislative mandates, clearly identified clients, and organizational permanence. The shorter history of public program evaluation reveals less widespread acceptance, more variability and multiplicity of clients, and considerable fluctuation in fortune over the short term. The repercussions of these differences on achievements of the two disciplines (in terms of program improvements, new legislation, and so on) may be hard to sort out, but in the case of evaluation, one wonders how long the shakedown cruise will last and how, or if, a greater stability will be achieved. A continued convergence of evaluation and auditing might be a course toward a higher-quality, more steadfast service to decision makers.
References
Comptroller General of the United States. "Report Manual." As adapted in L. Herbert, Auditing the Performance of Management. Belmont, Calif.: Lifetime Learning, 1979. (Originally published 1974.)
Comptroller General of the United States. Government Auditing Standards, 1994 Revision. Washington, D.C.: U.S. General Accounting Office, 1994.
Cook, T. D., and Campbell, D. T. Quasi-Experimentation. Skokie, Ill.: Rand McNally, 1979.
Herbert, L. Auditing the Performance of Management. Belmont, Calif.: Lifetime Learning, 1979.
Kepner, C. H., and Tregoe, B. B. The Rational Manager. (2nd ed.) Princeton, N.J.: Kepner-Tregoe, 1976.
Provus, M. M. Discrepancy Evaluation. Berkeley, Calif.: McCutchan, 1971.
Schandl, C. W. Theory of Auditing. Houston, Tex.: Scholars, 1978.
Wisler, C. E. "Topics in Evaluation." GAO Review, 1984a, 19-1.
Wisler, C. E. "Topics in Evaluation." GAO Review, 1984b, 19-3.
Worthen, B. R., and Sanders, J. R. Educational Evaluation. New York: Longman, 1987.
CARL WISLER is an evaluation consultant in Mitchellville, Maryland.
-------
Although audits and evaluations may have similar characteristics, the perspectives of auditors and evaluators can be quite different. These differing perspectives are reflected in their respective treatments of program impacts.
Differences in the Approaches
of Auditors and Evaluators
to the Examination of Government
Policies and Programs
Stan Divorski
On any given dimension, the extent to which audit and evaluation differ is largely a matter of degree. Any characteristic that can apply to audits will also apply to some evaluations, somewhere, sometime. Any distinguishing characteristic of evaluations probably can be found to apply to some audit. Overall, however, audits and evaluations can be very different.
This chapter attempts to describe the perspective, the mind-set, that auditors bring to the examination of programs, so as to illustrate how different the perspectives of auditors and evaluators can be. Although this perspective has its origins in the requirements of financial auditing, the focus here is on value-for-money auditing, which includes the examination of program activities, as well as of management systems and procedures for controlling these activities.
Five key dimensions distinguish this mind-set from that of evaluators. In general, auditors:
• Make a judgment as to how adequate or inadequate are the matters examined
• Make this judgment against a pre-established set of criteria
• Focus on management systems and procedures for controlling program activities rather than on the program activities themselves
The views expressed in this chapter are those of the author and should not be construed as representing those of the Auditor General of Canada.
-------
• May avoid commenting on substantive government policies
• View information on program results (when considered at all) as evidence of other matters, rather than as an end in itself
Judging Against Pre-Established Criteria
The mandate for an auditor's work requires a judgment about the adequacy of the matters examined. Such judgment is based on a comparison of the results of the examination against a set of criteria, or expectations, that the auditor may draw from a variety of sources, including previous audits and professional standards or guidelines.
For example, in reporting an audit of The Control and Clean-up of Freshwater Pollution, the Auditor General of Canada stated that it "expected to find that the various federal components of action plans were coordinated and that the means were in place to handle interdepartmental conflicts over policy, planning and funding as action plans are implemented" (1993b, p. 370). In addition to general criteria of this nature, an auditor may establish more specific subcriteria.
The way in which a judgment is reached and expressed depends upon the assignment the auditor receives. The auditor may reach a judgment after examining directly the matters at hand or after examining the reliability of management's assertions. At times the two approaches may be combined. The choice of approach may be imposed by the auditor, the client, or, in the case of a legislated mandate, by law.
An interesting example of the audit of periodic financial statements is that recommended by the Canadian Comprehensive Auditing Foundation (CCAF) for reporting on effectiveness (1987). The CCAF recommends that management make representations (that is, that they provide information about effectiveness in their organizations), and that auditors provide opinions on the fairness of those representations. Rather than define effectiveness, the CCAF sets out twelve attributes of effectiveness against which managers are expected to report:
1. Management direction (including clarity of objectives)
2. Continued relevance of a program
3. Appropriateness of program design
4. Achievement of intended results
5. Satisfaction of clients or stakeholders
6. Secondary impacts
7. Costs and productivity
8. Responsiveness to changed circumstances
9. Financial results
10. The extent to which the organization provides an appropriate work environment for its employees
11. Safeguarding of assets
12. Monitoring and reporting of performance
The conditions on an audit engagement may also specify how the auditor is to report the judgment. In exception reporting, the auditor is required to report any deficiencies, that is, situations that did not meet these criteria. Canada's Auditor General Act, Section 7(2), gives such a mandate:
Each report of the Auditor General under subsection (1) shall call attention to anything that he [sic] considers to be of significance and of a nature that should be brought to the attention of the House of Commons, including any cases in which he has observed that
(a) accounts have not been faithfully and properly maintained or public money has not been fully accounted for or paid, where so required by law, into the consolidated revenue fund;
(b) essential records have not been maintained or the rules and procedures applied have been insufficient to safeguard and control public property, to secure an effective check on the assessment, collection and proper allocation of the revenue and to ensure that expenditures have been made only as authorized;
(c) money has been expended other than for purposes for which it was appropriated by Parliament;
(d) money has been expended without due regard to economy or efficiency;
(e) satisfactory procedures have not been established to measure and report the effectiveness of programs, where such procedures could appropriately and reasonably be implemented; or
(f) money has been expended without due regard to the environmental effects of those expenditures in the context of sustainable development.
For example, in the audit of The Control and Clean-up of Freshwater Pollution described earlier, the auditor reported instances of poor coordination and unresolved conflicts: "This difference in departmental objectives and program funding led to coordination problems. Although the two departments agreed to provide for a management structure to coordinate their respective programs, [the structure] proved to be ineffective" (1993b, p. 374).
Alternatively, an auditor may be required to report on the level of comfort or assurance that a third party can have regarding the management of the program. The CCAF model for effectiveness auditing is one example of an assurance engagement. In this instance, the auditor provides assurance regarding management representations as to the matters at hand. Another example of an assurance mandate is provided in Part X of Canada's Financial Administration Act (1991), bearing on the responsibilities of Crown Corporations. Subsection 138(1) of the act requires corporations to engage an auditor to undertake a special examination, in which the auditor is required to "determine if the systems and practices referred to in paragraph 131(1)(b) were, in the period under examination, maintained in a manner that provided reasonable assurance that they met the requirements of paragraphs 131(2)(a) and (c)."
-------
Subsection 131(2) clarifies that "The books, records, systems and practices referred to in subsection (1) shall be kept and maintained in such manner as will provide reasonable assurance that ... (c) the financial, human and physical resources of the corporation and each subsidiary are managed economically and efficiently and the operations of the corporation and each subsidiary are carried out effectively." In this instance, the inability to find an exception from the positive expectation would be considered a judgment in confirmation of it.
In assurance engagements, a positive judgment is reached and reported with great care because (1) available methods may be incorrectly applied or not foolproof, and (2) there may be an undetected case that constitutes an important variation from the expectations established.
Focus on Management
The core business of an audit is the examination of management controls over expenditure, including whether or not management has in place the systems and procedures necessary to ensure that expenditures are made with due regard to economy and efficiency and in compliance with existing regulations or policies. Evaluation rarely intrudes into these areas. The distinction between audit and evaluation has become blurred, however, through increasing attention by audit to management control over results, including program effects, as an indication of good financial management and control.
This change has been reinforced by trends in government to decrease the emphasis on formal controls and increase the delegation of authority, accompanied by a greater need for accountability for results on the part of managers.
The mandate for audits of Crown Corporations under Canada's Financial Administration Act (1991) reflects this focus on management, as it requires attention to management systems and practices. Of particular significance to a comparison of auditors and evaluators, the act specifically requires the auditor to consider management control over the effectiveness of program operations.
The approach recommended by the Canadian Comprehensive Auditing Foundation (CCAF) for auditing effectiveness also reflects the auditor's focus on improving management. The CCAF notes that "The decision to emphasize management representations reflects the reporting obligations of managers, the needs of governing bodies, and their mutual desire for better management" (1991, p. 9).
The focus on management issues rather than outcomes per se is reflected in the CCAF's twelve attributes of effectiveness, which include such matters as managers' responsibility for costs and productivity, responsiveness to changed circumstances, provision of an appropriate working environment, and so on.
Restrictions on the Scope of Audit
Under the CCAF approach, the auditor is potentially limited in the scope of the investigation by the information that the manager chooses to report.
There may be other, more formal restrictions on the scope of audit work. This is especially likely to apply to the examination of major government policies. For example, with regard to the special examination of Crown Corporations, Section 145 of Canada's Financial Administration Act specifies: "Nothing in this Part or the regulations shall be construed as authorizing the examiner of a Crown corporation to express any opinion on the merits of matters of policy, including the merits of ... (c) any business or policy decision of the corporation or of the Government of Canada."
Other examples are provided by the Swedish National Audit Office (1995, p. 8) and the Australian National Audit Office (1995, p. 13), whose mandates specifically exclude comment on government policy.
When Auditors Examine Results
To this point, I have argued that auditors are more likely than evaluators to focus on management systems and procedures and to face restrictions on their freedom to comment on policy. I have also pointed out that the mandate of auditors frequently includes effectiveness. It is in the area of examining effectiveness that the distinction between audit and evaluation is least clear, although it reduces to two basic issues. The first is the focus of effectiveness work for each; the second is the reasons why auditors and evaluators tackle issues of program effectiveness.
The model for results-based audit depicted in Table 1.1 illustrates the possible foci for effectiveness work.
The model locates results measurement of government programs along two dimensions: the level of results measured and the level of program analyzed. With regard to the program level, a distinction is made between the results of management systems and procedures and the results of program activities. Systems and procedures include such matters as planning, management information systems, and procedures for detecting, recording, and collecting overpayments to program clientele. Program evaluation itself is viewed as a management control over program effectiveness. The levels of results identified include economy and efficiency, the achievement of intermediate program objectives, and the achievement of overall program objectives.
Thus a results-based audit could potentially examine the efficiency of program activities, the extent to which these activities further intermediate program objectives, or the extent to which they further the achievement of overall program objectives. Audits may also examine the effect of management controls on efficiency or the achievement of intermediate or ultimate program objectives.
A few examples will help illustrate the model. In 1993, the Office of the Auditor General examined management controls over pension benefit payments. The audit concluded that the systems and procedures in place for the recording, control, and collection of overpayments fell far short of what was required.
-------
[Table 1.1. A model for results-based audit, classifying examples along two dimensions: the level of results measured (economy and efficiency; intermediate program objectives; overall program objectives) and the level of program analyzed (management systems and procedures; program activities). Cited examples include bilateral economic and social development projects whose anticipated ends could not be assured (1993a) and the other cases discussed in the text.]
0.5 percent of total program payments, increasing the administrative cost of program delivery by more than 50 percent. The observation does not bear on the potential contribution that management controls may make to program effectiveness, but rather on their effects on the costs of program delivery.
By comparison, an audit of Search and Rescue examined the relative efficiency of program activities, specifically the types of rescue vessels employed. The audit concluded that the largest class of vessels were the most costly and had not been critical to saving any lives during the period audited (1992b, p. 225).
Management systems and controls may also have an impact on the achievement of program objectives. In 1992, the Auditor General of Canada reported on a program of payments to government employees scheduled to be laid off. The payments were expected to permit employees who so desired to quit immediately if there was no work to be performed and to save costs of employee benefits, retraining, and finding a job (1992a, p. 189). The audit observed that the number of payments, in all years except one, had consistently exceeded the reduction in person years (p. 194). Moreover, the auditors concluded that the situation was one indicator of problems in the administration of the policy, pointing to such matters as inadequate planning, no appropriate management framework, and the failure of senior management to provide leadership direction and support (p. 196).
Results as Evidence of Management Deficiencies
As noted earlier, the core business of audit is the examination of management controls over expenditure management. It is therefore the bottom row of Table 1.1 where audits will commonly be found. Economy and efficiency of program operations, the left-hand column of Table 1.1, is also a common matter for attention by audit. However, control over results as an indication of good financial management and control has also become an audit concern, leading to increased attention to the contribution of program operations to intermediate and overall program objectives. The result is illustrated by the cases cited in Table 1.1. No area of effectiveness is exempt from attention by audit.
It is in attention to the contribution of program activities to intermediate and ultimate program objectives where the functions of audit and evaluation become most difficult to distinguish. In principle, it is in the examination of activities in relationship to intermediate program objectives where the two may be most likely to overlap, but the central focus is substantially different. For an auditor, problems with the effectiveness of program activities are of interest as an indication of the importance of deficiencies in program management. The auditor will do sufficient work to assess whether program effectiveness is at risk, before turning attention to the management factors that should enable managers to gain control over program results. This may involve relying on evaluations conducted by program management, synthesizing the findings of evaluations of similar programs in other jurisdictions, and, as a last resort, conducting measurement and analyses of program impacts.
-------
For an auditor, the attribution of impacts to program activities is less important than in the classic social scientific model of evaluation. In the social science-based model, the rigorous pursuit of the causal link between the program and its outcomes is the main focus. For an auditor, it is sufficient to determine that results may be inadequate, whatever the cause. If results appear to be positive, there is no deficiency to report and no need to elaborate the causal chain.
If there are problems with results, the search is on for deficiencies in practices that may have impeded management from detecting and solving the problem or for related areas of program weakness. However, this search does not necessarily involve exploration of the causal chain, for the auditor's job is not to solve management's problems but to identify that there is a problem to be solved and the areas that may be involved. In fact, it may be sufficient for an auditor to point out that management has done little to measure program effectiveness.
This lack of attention to a rigorous exploration of the causal chain may puzzle evaluators. The answer to the puzzle is that for an auditor, information on results provides evidence of other matters and is not an end in itself.
References
Auditor General of Canada. "Payments to Employees Under the Work Force Adjustment Policy." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1992a.
Auditor General of Canada. "Search and Rescue." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1992b.
Auditor General of Canada. "CIDA—Bilateral Economic and Social Development Programs." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993a.
Auditor General of Canada. "Department of the Environment—The Control and Clean-Up of Freshwater Pollution." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993b.
Auditor General of Canada. "Department of Fisheries and Oceans—Northern Cod Adjustment and Recovery Program." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993c.
Auditor General of Canada. "Department of National Health and Welfare—Programs for Seniors." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993d.
Australian National Audit Office. Performance Auditing. Canberra: Australian National Audit Office, 1995.
Canada. Financial Administration Act. R.S. 1985, c. F-11, s. 1. Ottawa: Government of Canada, 1991.
Canadian Comprehensive Auditing Foundation. Effectiveness Reporting and Auditing in the Public Sector: Summary Report. Ottawa: Canadian Comprehensive Auditing Foundation, 1987.
Swedish National Audit Office. Performance Auditing. Stockholm: Swedish National Audit Office, 1995.
-------
References
Chelimsky, E. "Comparing and Contrasting Auditing and Evaluation: Some Notes on Their Relationship." Evaluation Review, 1985, 9 (4), 483–503.
Chelimsky, E. "Expanding GAO's Capabilities in Program Evaluation." The GAO Journal, Winter/Spring 1990, 8, p. 51.
Chen, H.-T. Theory-Driven Evaluations. Thousand Oaks, Calif.: Sage, 1990.
Davis, D. F. "Do You Want a Performance Audit or a Program Evaluation?" Public Administration Review, 1990, 50, 35–41.
Day, P., and Klein, R. Accountabilities: Five Public Services. New York: Tavistock, 1987.
Frey, B., and Serna, A. "Eine politisch-ökonomische Betrachtung des Rechnungshofs." Finanzarchiv, 1990, 48, 244–270.
Guba, E., and Lincoln, Y. Fourth Generation Evaluation. Thousand Oaks, Calif.: Sage, 1989.
Leeuw, F. L. "Performance Auditing and Policy Evaluation: Discussing Similarities and Dissimilarities." Canadian Journal of Program Evaluation, 1992, 7, 53–68.
Leeuw, F. L. "Performance Auditing, New Public Management, and Performance Improvement: Questions and Challenges." Accounting, Auditing and Accountability, forthcoming.
Leeuw, F. L., Rist, R. C., and Sonnichsen, R. C. (eds.). Can Governments Learn? New Brunswick, N.J.: Transaction, 1994.
Le Grand, J., and Bartlett, W. (eds.). Quasi-Markets and Social Policy. Old Tappan, N.J.: Macmillan, 1993.
Mason, R., and Mitroff, I. Challenging Strategic Planning Assumptions. New York: Wiley, 1981.
Meyer, M., and O'Shaughnessy, K. "Organizational Design and the Performance Paradox." In R. Swedberg (ed.), Explorations in Economic Sociology. New York: Russell Sage Foundation, 1993.
Moukheibir, C., and Barzelay, M. "Performance Auditing: Concept and Controversies." Paper presented at the Public Management Group/Organization for Economic Coordination and Development Audit Symposium, Paris, June 6–7, 1995.
Osborne, D., and Gaebler, T. Reinventing Government: How the Entrepreneurial Spirit Is Transforming the Public Sector. Reading, Mass.: Addison-Wesley, 1992.
Pawson, R., and Tilley, N. "Whither (European) Evaluation Methodology?" The International Journal of Knowledge Transfer and Utilization, 1995, 8 (3), 20–34.
Public Management Group/Organization for Economic Coordination and Development. "Background Paper to the OECD Conference on Auditing." Public Management Group/Organization for Economic Coordination and Development Audit Symposium, Paris, June 6–7, 1995.
Rist, R. C. "Management Accountability: The Signals Sent by Auditing and Evaluation." Journal of Public Policy, 1989, 9 (3), 355–369.
Smith, P. "On the Unintended Consequences of Publishing Performance Data in the Public Sector." International Journal of Public Administration, 1995, 18, 277–310.
Udry, J. (ed.). The Media and Family Planning. Chapel Hill: University of North Carolina Press, 1974.
Walker, W. E. "The Impact of General Accounting Office Program Evaluations on Government." Evaluation and Program Planning, 1985, pp. 359–366.
Wilson, J. Q. Bureaucracy. New York: Free Press, 1989.
While there is wide consensus that evaluation and auditing are moving closer together, there is disagreement on the width of the remaining gap. Further integration has both advantages and disadvantages.
Auditing and Evaluation:
Whither the Relationship?
Eleanor Chelimsky
In a paper written more than ten years ago, I examined some of the similarities and differences I perceived in the ways that auditors and evaluators, respectively, assess program performance. I linked these to the histories, mind-sets, training, functions, and methodological approaches of the two professions and spoke to the important and promising relationships I was beginning to glimpse between performance (then called program results) audits and program evaluations (Chelimsky, 1985). Since that time, and based now not only on Elmer Staats's pioneering introduction of program evaluation into the U.S. General Accounting Office (GAO) in 1980, but also on a good many other experiences of collaboration between auditors and evaluators worldwide, it appears that major two-way influences are, at the very least, changing the nature of both professions, even if they have not as yet produced an actual "blending of the two cultures," in Roger Brooks's phrase (1995).
Indeed, there is quite wide consensus today (and this is reflected by the articles in this volume) that audit and evaluation have moved, and are continuing to move, toward increasing closeness with regard to understanding and to methodological approach. Disagreement exists, however, about the degree of closeness that has actually been achieved, along with the reasons for the change.
Evidence from Five Observers
At one end of the spectrum, Leeuw (1995, pp. 15–16) saw the two professions as still quite different and spoke to the need "to improve the training of auditors and open their minds to the social and behavioral mechanisms operating
-------
in the public sector and in decision-making." But he also indicated that current efforts to combine "the strong points of evaluation (like the methodology applied, the attention paid to theory, and theory-driven evaluations) with the strong points of auditing (the orientation toward management, the focus on 'follow-the-money', and the attention paid to documentary evidence) may well lead to a new interdiscipline in the 21st century."
At the other end of the spectrum, Pollitt and Summa (1995, pp. 24–26) noted that "the methods and approaches of auditors and evaluators are coming closer to each other," and found only a marginal difference in the tool kits potentially available to performance auditors and evaluators. They believe this increasing closeness has occurred "as performance auditing becomes more common," and they infer that many differences in methodological approach may be more apparent than real, given that evaluators—who are typically deprived of the statutory authority of auditors—have a vested interest in laying claim to "superior methodology and expertise."
Divorski, who spoke uniquely about the differences between auditing and evaluation approaches, did not, given this topic, discuss the degree to which auditors and evaluators are moving toward each other despite these differences. Like Leeuw—that is, at the opposite pole from Pollitt and Summa—he viewed evaluation and auditing as enterprises that are still very different, especially in terms of focus and mind-set. With respect to focus, Divorski invoked auditing's need to judge whether management performance is adequate and auditors' consequent targeting of management systems and controls, as opposed to evaluation's targeting of management activities in the search to determine program results. Differences in mind-set referred to the same issue; that is, Divorski saw auditors as interested in results only insofar as they reflected management performance, whereas evaluators are interested in results for their own sake. These differences also give rise to other differences, for example, the assumption by auditors that causes for problems found can be explained via "auditors' judgment," whereas evaluators would see the cause-and-effect question as a matter for empirical inquiry. As Divorski put it: "If there are problems with results, the search is on for deficiencies in practices that may have impeded management from detecting and solving the problem." But evaluators would say that deficiencies in management practices may not be the cause of all program problems, and that improving managers' awareness may not do much to solve many of them.
Brooks (1995), somewhere in the middle of the spectrum, did not argue that the methodological approaches of auditors and evaluators are now virtually the same, but rather, from his experience at the state level in the United States, that auditors and evaluators today are "drawing upon the approach of traditional auditing as well as the approach of social science-based evaluation." Like Leeuw, Brooks recognized that "the 'two cultures' of auditing and evaluation still exist," and that "when comparing approaches across states, one observes important differences." But even though he saw these differences as real, he agreed with Pollitt and Summa that they are declining. However, he believes that the decline is "mostly because of a liberalization of traditional auditing," rather than simply the increasing presence of the performance audit. Also, he views the "emergence of a blended approach to auditing and evaluation" as a fait accompli, at least in some places.
It is, of course, difficult to generalize persuasively across the widely different institutional arrangements within which evaluators and auditors work together, and without empirical data, some of the points cited here may be impressionistic. Should we really assume that because similar tool kits are potentially available to evaluators and auditors alike, this means that the tools are equally used? Given the existence of the "two cultures," even if the methodological tool kits were identical, there would still likely be major differences made by auditors and evaluators in the application, use, and pervasiveness of the particular methods selected. (For example, I would expect to see auditors use many more surveys and case studies than, say, quasi-experimental designs, and I would also expect to see the issues of reliability—in survey questions—and of generalizability—in case study findings—handled very differently by evaluators and auditors.)
Again, if there is more "togetherness" observed between auditors and evaluators, is this due to a liberalization in traditional auditing standards and procedures? To the increasing prevalence of performance auditing? To the changing nature of policy makers' questions that force evaluators and auditors to borrow from each other? To the increased prestige of evaluation among auditors today? To the belated recognition by evaluators that auditors are right to be interested in costs? We don't really know.
Thus, although there may be little consensus as yet about how or why it has happened that evaluators and auditors, despite some real differences, are coming closer together, the important point is that many observers believe they are. And the question I would like to raise here is whether any of the mingling and blending being discussed in this volume would have happened without the physical juxtaposition of auditors and evaluators within an organization. It may well be this kind of proximity that has allowed productive comparisons of work methods and of ways to examine policy issues, to resolve technical problems, to establish credibility, such as those we see in the chapters in this volume.
It is much less frequent today than it was fifteen years ago, say, to hear "the auditor's judgment" being used as the sole basis for a finding, or to see evaluators unable to explain what savings (or expenditures) are likely to result from service changes they propose. But if physical closeness is important, then it is also important to examine what we know about how to manage auditors and evaluators together in an organization. That is, given the differences and the similarities of the two fields, and given the benefits for public policy likely to accrue from their increased collaboration, what is the organizational device that will allow us to reap the greatest rewards? Should we integrate the two functions, that is, house and supervise auditors and evaluators together? Or should we keep them separate?
-------
Separation or Integration: Some Advantages and Disadvantages
At first glance, keeping the auditing and evaluation functions separate has a number of obvious disadvantages. Increased costs are involved in maintaining a separate group or department, and important potential diffusion-of-information benefits to the organization may be foregone unless an enlightened management takes steps to break down walls (or prevent them from rising in the first place). In addition, differences in findings among auditing and evaluation units working on the same issues but using different methods can become a problem for the organization. The normal competition between groups for organizational hegemony can move from what is stimulating and healthy to something profoundly unhealthy unless that competition is quite carefully managed.
On the other hand, separation does allow evaluators the independence they need for credibility (that very same independence for which auditors have fought so hard over their long history). Separation also gives evaluators the freedom to develop a critical mass of skills, to establish the legitimacy of their work with policy makers who may not be familiar with evaluation, and to respond to policy questions with strong studies that can demonstrate the worth of evaluation not only to policy makers but to auditors as well. Finally, one of the most important advantages of separation is its feasibility: start-up does not require behavioral change in an organization, only funds, operational know-how, and leadership.
What about integration? Most experience suggests that this can be quite difficult to achieve, at least immediately. A first problem arises because auditors and evaluators have been trained so differently, and tend to have dissimilar mind-sets with regard to a study. Auditors are taught, for example, that they are wasting the taxpayers' money if an investigation does not uncover a major problem. "If nothing is wrong," they say, "then why should we be doing an audit?" But evaluators are trained instead to ask whether some policy or program has made a difference, any difference, good or bad. That is, to evaluators, positive findings are as important as negative ones in improving public policy and need no special justification. To find out what works is at least as useful, evaluators think, as to find out what didn't work.
These mind-sets have some ramifications for the work process. Auditors often do their best to determine whether there is (or is not) a significant problem based, say, on a one-month informal investigation, and may then abandon the project if no significant problem has turned up. If evaluators expect that a study may generate weak or strong findings in any direction, they will often spend three months designing it (more if the policy question asked is complex or controversial), and they have difficulty in answering auditors' questions about what their findings are likely to be before they have collected their data. One result of this process is that evaluations are often afflicted by findings that are anything but conclusive, and this means that under an integrated organizational arrangement, auditors may greet the evaluators' findings with a bored "So what?" and a large yawn, whereas evaluators will always be skeptical about those exciting one-month findings that are established before data on both sides of the question could possibly have been collected.
Emphasis (or the lack of it) on measurement is another training issue that causes tensions between auditors and evaluators. The measurement questions that preoccupy evaluators (like the reliability of items in a questionnaire, or threats to the internal validity of study findings, or the real comparability of before-after or cross-sectional study data) tend to take a bit of time to resolve, are typically low on the auditors' priority list, and are usually not well understood by them. Because of this, absent training on both sides, and especially training of managers, auditors may reproach evaluators for their slowness and evaluators may reproach auditors about the validity of their findings. Unhappily, these perceptions tend to linger within an organization: they engender us-and-them mentalities, they cause morale problems, and they militate against the continuing recruitment and retention of methodologically strong evaluators whose presence is critical to the success of cross-fertilization efforts.
The point here is that when these tensions of mind-set and measurement bubble up, many of the advantages that integration was counted on to supply may not materialize. For example, in a tense atmosphere, knowledge diffusion and organizational learning may be even worse off than under separation. Differences in findings based on methodological approach are still likely to surface under integration, although at a lower level and thus more manageably—that is, easier to suppress than confront—from an organizational perspective. And even if evaluators may feel more protected institutionally than they do as part of a separate unit, the trade-off between that protection and technical quality may not seem worth it to many evaluators.
Organization for Production or for Cross-Fertilization?
Perhaps a useful way to think of separation versus integration is as a function of the institutional goals to be pursued. If the main aim is to increase the capabilities of the organization to answer complex policy questions and to begin doing that as soon as possible, then separate evaluation and audit units such as those at GAO and at the Minnesota Program Evaluation Division have much to recommend them. But if the aim is cross-fertilization in an organization, then separation, especially if it is poorly managed, may bring two important long-run costs: communication with the larger institutional entity may not be good enough, and evaluation staff may begin to feel excluded from important policy decisions.
A good example of this problem comes from the experience of the Office of Management and Budget (once known as the Bureau of the Budget, or BOB). In its early days, BOB decided to bring in people with strong technical skills to complement the work of their budget analysts, and they separated these technical staff from the budget people by creating technical centers in
-------
which the new skills would be deployed. In the words of a former BOB Assistant Director, William D. Carey, what happened was this: separation "built a kind of concentrated quality in the technical centers and it successfully accumulated a critical mass of top-flight specialists." But it also led to an organization in which technical staff became alienated. In Carey's words:
The budget people sat at the table during the Director's reviews, but the technical people had only backbench chairs and very limited possibility to participate in the discussions. When they did speak, their comments were considered intrusive. Promotions and supergrades went to line, not staff, personnel. BOB Directors had little time or interest in the technical work, and technical staff had little or no access to them. On their side, technical people tended to look down on budget examiners as journeymen of very average capabilities [U.S. General Accounting Office, 1990, p. 23].
To remedy this developing rift within the organization, BOB decided to dissolve the technical centers and scatter their personnel across the budget divisions. In this way it was hoped that organizational cohesion and communication could be improved and that the work of the budget divisions would be enriched by the closer proximity of the technical staff's expertise. Carey believes this move to have been a mistake and its results unfortunate, for three reasons:
First, the same problems reappeared, but at the lower, divisional level. The scattered technical staff continued to feel they were second-class citizens and now the situation was worse in that they had no organizational voice. Their sense was that they had to keep proving their worth (as technical people in a budget division) and that they were no better off in terms of having direct inputs into organizational decisions and products. Second, the professional quality of the technical staff weakened over time because the technical centers which had attracted some of the brightest people in their respective fields were no longer there. And finally, the dispersed technical personnel did not appear to have any visible effect on the work of the budget divisions [U.S. General Accounting Office, 1990, p. 24].
At the GAO and in Minnesota, where audit and evaluation units have been separate, at least some of these evils have been avoided, and certainly, with regard to visible effect, the influence of the evaluation work has been recognized and highly regarded.
In short, if integration has not been easy—and it has not—we may need to learn more about why it has been so difficult before dismissing it as an option. Is it, for example, because evaluation and auditing have different sources of credibility and legitimacy? Is it because mind-sets and cultures become entrenched under the stresses of organizational competition? Is it because we simply haven't yet developed both the management techniques and the managers needed to do the job?
My own view is that—given effective managers who know and value both auditing and evaluation, and given also some harmonization of training for auditors and evaluators—it should be possible to integrate the evaluation function successfully in audit organizations. But until that training has taken place (and especially for audit offices beginning now to incorporate evaluation into their work programs), keeping the functions separate, building multiple bridges between them, and watching their interactions and managing them carefully may be not only the most prudent but also the best organizational course of action.
A lot may be riding on this selection of the right organizational model. To be viable, both evaluation and audit functions need independence, skilled personnel, credibility, sponsors who understand the benefits to be drawn from both audits and evaluations, and the capability to respond appropriately to the policy questions of today's political environment. Such capability requires the use of both auditing and evaluation methods. When we can bring these two together and target them properly to policy makers' information needs, and when findings from both types of studies can make their way unimpeded into the policy process, then both evaluations and audits will have achieved their real public purpose: to help make government services more effective, more meaningful, more responsive, more accountable, and—last but not least—better managed.
References
Brooks, R. "Blending Two Cultures: State Legislative Auditing and Evaluation." Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
Chelimsky, E. "Comparing and Contrasting Auditing and Evaluation: Some Notes on Their Relationship." Evaluation Review, 1985, 9 (4), 483–503.
Leeuw, F. L. "Auditing and Evaluation: Bridging a Gap, Worlds to Meet?" Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
Pollitt, C., and Summa, H. "Performance Auditing: Travellers' Tales." Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
U.S. General Accounting Office. Diversifying and Expanding Technical Skills at GAO. GAO/PEMD-90-18S, Vol. 2. Washington, D.C.: U.S. General Accounting Office, Apr. 1990.
ELEANOR CHELIMSKY is an international consultant in evaluation policy and methods and past president of the American Evaluation Association.
-------
AUDITING
[Rating worksheet: each dimension scored on a scale of 1 to 9]
Art            1 2 3 4 5 6 7 8 9   Science
Comprehensive  1 2 3 4 5 6 7 8 9   Specific
Objective      1 2 3 4 5 6 7 8 9   Value-laden
Prove          1 2 3 4 5 6 7 8 9   Improve
Quality        1 2 3 4 5 6 7 8 9   Quantity
Present        1 2 3 4 5 6 7 8 9   Future
Internal       1 2 3 4 5 6 7 8 9   External
Inputs         1 2 3 4 5 6 7 8 9   Outcomes
Separate       1 2 3 4 5 6 7 8 9   Collaborative
Simple         1 2 3 4 5 6 7 8 9   Complex
Inductive      1 2 3 4 5 6 7 8 9   Deductive
Individual     1 2 3 4 5 6 7 8 9   Team
What           1 2 3 4 5 6 7 8 9   Why
Generalizable  1 2 3 4 5 6 7 8 9   Unique
-------
EVALUATION
[Rating worksheet: each dimension scored on a scale of 1 to 9]
Art            1 2 3 4 5 6 7 8 9   Science
Comprehensive  1 2 3 4 5 6 7 8 9   Specific
Objective      1 2 3 4 5 6 7 8 9   Value-laden
Prove          1 2 3 4 5 6 7 8 9   Improve
Quality        1 2 3 4 5 6 7 8 9   Quantity
Present        1 2 3 4 5 6 7 8 9   Future
Internal       1 2 3 4 5 6 7 8 9   External
Inputs         1 2 3 4 5 6 7 8 9   Outcomes
Separate       1 2 3 4 5 6 7 8 9   Collaborative
Simple         1 2 3 4 5 6 7 8 9   Complex
Inductive      1 2 3 4 5 6 7 8 9   Deductive
Individual     1 2 3 4 5 6 7 8 9   Team
What           1 2 3 4 5 6 7 8 9   Why
Generalizable  1 2 3 4 5 6 7 8 9   Unique
-------
Logic Modeling
Refer to:
McLaughlin, John A., and Gretchen B. Jordan (1999). "Logic Models: A Tool for Telling Your Program's Performance Story." Evaluation and Program Planning 22, Elsevier Science Ltd., pp. 65–72.
Articles in New Directions for Evaluation 87, Fall 2000, San Francisco, CA: Jossey-Bass Publishers:
Rogers, Patricia J., Anthony Petrosino, Tracy A. Huebner, and Timothy A. Hacsi. Editor's Notes and "Program Theory Evaluation: Practice, Promise, and Problems," pp. 1–13.
Weiss, Carol Hirschon. "Which Links in Which Theories Shall We Evaluate?" pp. 35–45.
Rogers, Patricia J. "Causal Models in Program Theory Evaluation," pp. 47–55.
1. What is logic modeling?
2. Who uses it? Why?
3. What are the steps?
-------
Drawings of Logic Models
-------
Is Logic Modeling the 'Answer'?
Before you proceed with your task, would it be helpful to have...
...a visual diagram of the design of the program?
...a common description of a program to share with persons internal and external to the program?
...a diagram to facilitate the development of hypotheses about how program inputs are related to activities, activities are related to outputs, and outputs are related to outcomes?
...a diagram to help identify what measures would be helpful in testing your previous hypotheses?
...a diagram to help determine whether the program is designed to succeed?
If the answer is 'yes' to any of these, then let's model!
-------
The World of Logic Modeling
Emmalou Norland
A LOGIC MODEL
• Is a picture (and corresponding text description) of how a program is designed to work.
• Shows that a program has inputs, outputs, and outcomes.
• Indicates how program components are supposed to be related to one another.
• Suggests cause and effect linkages.
(A small sketch of this structure in code follows.)
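The component-and-linkage idea above lends itself to a simple data structure. What follows is a minimal sketch in Python; the names (Element, LogicModel, and the sample entries) are purely illustrative assumptions, not part of any EPA tool. It shows one way a team might record a model's components and its hypothesized cause-and-effect links:

    from dataclasses import dataclass, field

    @dataclass
    class Element:
        name: str   # e.g. "Permits"
        kind: str   # "resource", "activity", "output", or "outcome"

    @dataclass
    class LogicModel:
        elements: list = field(default_factory=list)   # Element objects
        links: list = field(default_factory=list)      # (cause, effect) name pairs

        def add_link(self, cause, effect):
            # Record one hypothesized cause-and-effect linkage.
            self.links.append((cause, effect))

    model = LogicModel()
    model.elements += [
        Element("Budget and FTEs", "resource"),
        Element("Permitting", "activity"),
        Element("Permits", "output"),
        Element("Behavior change", "outcome"),
    ]
    model.add_link("Budget and FTEs", "Permitting")
    model.add_link("Permitting", "Permits")
    model.add_link("Permits", "Behavior change")

Keeping the linkages as explicit (cause, effect) pairs is what later lets a draft model be drawn as a diagram, reviewed with stakeholders, or checked for gaps.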
A Tool to Use When...
Just what IS a logic model?
Generic Logic Model (with Environmental Program Outcomes)
[Diagram; the full version appears later in this packet]
-------
Who Uses Logic Models?
• A logic model serves as a common communication tool among program staff.
• Logic models help explain program logic to funders and other stakeholders.
Evaluators ALSO Use Logic Models
• To understand how the program works, how it links to other programs, and how it contributes to agency long-term goals.
• To identify gaps in the logic of a program's design.
• To determine the critical links in the program's logic so that appropriate measures can be identified for use in evaluations.
A Program Logic Model Explains the ARROW
How a Program Uses Resources to Improve Environmental and Human Health
[Diagram: Congress Provides Money → Improved Environmental and Human Health]
A Program's Logic
Resources are used by the agency to create a set of activities (the program). The activities are designed to produce outputs (products or services) for a set of customers/clients. Those clients react to the outputs (and gain knowledge, skills, and attitudes) such that they can and will change their behavior in a desirable way.
-------
Logic Continued...
When client behavior changes, there are environmental consequences (reduced pollutants, changed ambient conditions...) resulting in improved environmental and human health.
Activities
• Researching
• Permitting
• Regulating
• Developing
• Monitoring
• Training
• Communicating
• Enforcing
Resources
Can be budget, FTEs, equipment, facilities, supplies, products from other programs, internal and external partners, key information...
Outputs
• Scientific Findings
• Permits
• Regulations
• New Technologies
• Databases
• New Methods
• Cutting-Edge Information
• Enforcement
-------
Customers/Clients of EPA Programs
• Congress
• The Public
• States
• Regulated Community
• Other Agencies
Environmental Outcomes
• Stressor Reduced
• Ambient Condition Improves
• Environment Better
Behavioral Outcomes
• Affective Reactions (satisfaction, agreement, support...)
• Knowledge
• Skills
• Attitudes (beliefs, perceptions, values)
• Behavior Change
Environmental and Human Health IMPROVED!
-------
Developing a Logic Model
• Identify the program boundaries.
• Gather written information about the program.
• Put a draft model together.
• Review the model with program staff and other stakeholders.
• Using that information, change the model to be the best representation of the program as designed.
(A small sketch of one way to check a draft for gaps follows this list.)
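As a minimal sketch of the review step, assuming the same (cause, effect) link-list idea as the earlier sketch (all names here are illustrative, not an actual EPA procedure), a draft model can be checked mechanically for elements that nothing feeds or that lead nowhere:

    def find_gaps(elements, links):
        # elements: list of element names; links: list of (cause, effect) pairs.
        causes = {c for c, _ in links}
        effects = {e for _, e in links}
        return {
            "leads_nowhere": [n for n in elements if n not in causes],
            "unsupported": [n for n in elements if n not in effects],
        }

    elements = ["Budget", "Training", "Guidance documents", "Behavior change"]
    links = [("Budget", "Training"), ("Training", "Guidance documents")]
    print(find_gaps(elements, links))
    # "Behavior change" shows up as unsupported: the draft never says what
    # produces it. (Resources such as "Budget" naturally appear unsupported,
    # and final outcomes naturally lead nowhere; reviewers can ignore those.)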
What Makes Up the PROGRAM?
[Diagram: Resources (FTEs, budget, facilities, information) → Activities (research, permitting, enforcement) → Outputs (permits, ...)]
Basic Components of an Environmental Program Logic Model
[Diagram: Behavioral Outcomes → Environmental Outcomes → Environmental and Human Health, with Externalities acting throughout]
If the resources support the 'right' activities, which produce the 'right' outputs, then the client has BEHAVIORAL OUTCOMES.
[Diagram: client starts, stops, increases, or decreases behavior]
-------
If clients move through the behavioral chain to ultimately change behavior, then ENVIRONMENTAL OUTCOMES are achieved.
[Diagram: Stressor Is Reduced → Ambient Condition Improves → Risks Are Reduced → Improvement]
What Are Externalities?
People, events, and other entities which could have an effect upon what happens in the boxes and/or on relationships between the boxes:
Weather, Political Climate, Competing Programs, Other Agencies' Agendas, Lack of Information...
Through many programs achieving BEHAVIORAL AND ENVIRONMENTAL outcomes over a long period of time, the ultimate goal can be reached: ENVIRONMENTAL AND HUMAN HEALTH.
Expanded Environmental Logic Model
-------
We can also use this logic in reverse, when planning a program!
If the ULTIMATE goal is Environmental and Human Health, then what does science tell us about how the environmental conditions need to change to reach that goal?
What kinds of outputs would best address the desired behaviors?
[Diagram box: Information based upon sound science]
If we want those environmental conditions to change, WHO/WHAT are the pollutant sources? (WHO, that is, which customers/clients, need to change WHAT behaviors?)
[Diagram boxes: Stop Polluting; Change Household Practices; Enact Appropriate Policy]
What programs and activities can EPA undertake to produce those outputs?
[Diagram box: Information based on sound science]
-------
AND, FINALLY, what resources are needed to plan and conduct these programs (which, in turn, produce these outputs)?
[Diagram boxes: Resources → Regulations; Information based on sound science]
(A sketch of this reverse walk in code follows.)
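Here is a minimal sketch of the reverse reading these slides describe, again with hypothetical names: start from the ultimate goal and repeatedly ask "what produces this?" until the required resources are reached.

    def plan_backward(goal, produced_by):
        # produced_by maps each element to the elements that produce it.
        order, stack = [], [goal]
        while stack:
            node = stack.pop()
            order.append(node)
            stack.extend(produced_by.get(node, []))
        return order

    produced_by = {
        "Environmental and human health": ["Environmental conditions change"],
        "Environmental conditions change": ["Sources change behavior"],
        "Sources change behavior": ["Information based on sound science"],
        "Information based on sound science": ["Research and communication activities"],
        "Research and communication activities": ["Resources"],
    }
    print(plan_backward("Environmental and human health", produced_by))
    # Prints the goal first and resources last: the planning order the slides describe.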
-------
Generic Logic Model (with Environmental Program Outcomes)
[Diagram: Problem → Program (Resources → Activities → Outputs → Customers) → Short-Term Outcome → Intermediate Outcome → Long-Term Outcome, with Externalities influencing throughout; the outcome chain spans Behavioral Outcomes and Environmental Outcomes]
-------
Environmental Logic Model
[Diagram: Problem → Program (Resources → Activities → Outputs → Customers) → Behavioral Outcomes (Knowledge, Skills, Attitudes, Behavior) → Environmental Outcomes (Stressor/Ambient Condition) → Human Health, with Externalities influencing throughout]
-------
EVALUATION AND PROGRAM PLANNING
PERGAMON
Evaluation and Program Planning 22 (1999) 65–72
Logic models: a tool for telling your program's performance story
John A. McLaughlin (a), Gretchen B. Jordan (b)*
(a) Independent Consultant, 423 Hempstead Road, Williamsburg, VA, USA
(b) Sandia National Laboratories, 950 L'Enfant Plaza SW, Washington, DC, USA
Accepted 1 August 1998
Abstract
Program managers across private and public sectors are being asked to describe and evaluate their programs in new ways. People want managers to present a logical argument for how and why the program is addressing a specific customer need and how measurement and evaluation will assess and improve program effectiveness. Managers do not have clear and logically consistent methods to help them with this task. This paper describes a Logic Model process, a tool used by program evaluators, in enough detail that managers can use it to develop and tell the performance story for their program. The Logic Model describes the logical linkages among program resources, activities, outputs, customers reached, and short, intermediate and longer term outcomes. Once this model of expected performance is produced, critical measurement areas can be identified. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Program theory; Program modeling; Performance measurement; Monitoring and evaluation; Government Performance and Results Act
1. The problem
"At its simplest, the Government Performance and Results Act (GPRA) can be reduced to a single question: What are we getting for the money we are spending? To make GPRA more directly relevant for the thousands of Federal officials who manage programs and activities across the government, GPRA expands this one question into three: What is your program or organization trying to achieve? How will its effectiveness be determined? How is it actually doing? One measure of GPRA's success will be when any Federal manager anywhere can respond knowledgeably to all three questions."
John A. Koskinen, 1997
Office of Management and Budget
Federal managers were being challenged by Mr. Koskinen (1997), Deputy Director of the OMB, to tell their program's story in a way that communicates not only the program's outcome goals, but also that these outcomes are achievable. For many public programs there is also an implicit question: 'Are the results proposed by the program the correct results?' That is, do the results address problems appropriate for the program and deemed by stakeholders to be important to the organizational mission and national needs?
* Corresponding author. E-mail: gbjordan@sandia.gov
The emphasis on accountability and 'managing for results' is found in state and local governments as well as in public service organizations such as the United Way of America and the American Red Cross. It represents a change in the way managers have to describe their programs and document program successes. Program managers are not as familiar with describing and measuring outcomes as they are with documenting inputs and processes. Program design is not necessarily explicit, in part because this allows flexibility should stakeholder priorities change.
There is also an increasing interest among program managers in continuous improvement and managing for 'quality'. Choosing what to measure and collecting and analyzing the data necessary for improvement measurement is new to many managers.
The problem is that clear and logically consistent methods have not been readily available to help program managers make implicit understandings explicit. While tools such as flow charts, risk analysis, and systems analysis are used to plan and describe programs, there is a method developed by program evaluators that more comprehensively addresses the increasing requirements for both outcome measurement and improvement measurement.
Our purpose here is to describe a tool used by many in the program evaluation community, the Logic Model process, to help program managers better meet new
-------
requirements. Documentation of the process by which a manager or group would develop a Logic Model is not readily available even within the evaluation community; thus the paper may also help evaluators serve their customers better.
2. The Program Logic Model
Evaluators have found the Logic Model process useful for at least twenty years. A Logic Model presents a plausible and sensible model of how the program will work under certain conditions to solve identified problems (Bickman, 1987). Thus the Logic Model is the basis for a convincing story of the program's expected performance. The elements of the Logic Model are resources, activities, outputs, customers reached, short, intermediate and longer term outcomes, and the relevant external influences (Wholey, 1983, 1987).
Descriptions and examples of the use of Logic Models can be found in Wholey (1983), Rush & Ogborne (1991), Corbeil (1986), Jordan & Mortensen (1997), and Jordan, Reed, & Mortensen (1997). Variations of the Logic Model are called by different names: 'Chains of Reasoning' (Torvatn, 1998), 'Theory of Action' (Patton, 1997), and 'Performance Framework' (Montague, 1997; McDonald & Teather, 1997). The Logic Model and these variations are all related to what evaluators call program theory. According to Chen (1990), program theory should be both prescriptive and descriptive. That is, a manager has to both explain the elements of the program and present the logic of how the program works. Patton (1997) refers to a program description such as this as an 'espoused theory of action', that is, stakeholder perceptions of how the program will work.
The benefits of using the Logic Model tool include:
• It builds a common understanding of the program and expectations for resources, customers reached, and results, and thus is good for sharing ideas, identifying assumptions, team building, and communication;
• It is helpful for program design or improvement, identifying projects that are critical to goal attainment, redundant, or have inconsistent or implausible linkages among program elements;
• It communicates the place of a program in the organization or problem hierarchy, particularly if there are shared logic charts at various management levels; and
• It points to a balanced set of key performance measurement points and evaluation issues, thus improving data collection and usefulness and meeting the requirements of GPRA.
A simple Logic Model is illustrated in Fig. 1. Resources include human and financial resources as well as other inputs required to support the program, such as partnerships. Information on customer needs is an essential resource to the program. Activities include all those action steps necessary to produce program outputs. Outputs are the products, goods, and services provided to the program's direct customers. For example, conducting research is an activity, and the reports generated for other researchers and technology developers could be thought of as outputs of the activity.
Customers had been dealt with implicitly in Logic Models until Montague added the concept of Reach to the performance framework. He speaks of the 3Rs of performance: resources, people reached, and results (Montague, 1994, 1997). The relationship between resources and results cannot happen without people—the customers served and the partners who work with the program to enable actions to lead to results. Placing customers, the users of a product or service, explicitly in the middle of the chain of logic helps program staff and stakeholders better think through and explain what leads to what and what population groups the program intends to serve.
Outcomes are characterized as changes or benefits resulting from activities and outputs. Programs typically have multiple, sequential outcomes across the full program performance story. First, there are short term outcomes, those changes or benefits that are most closely associated with or 'caused' by the program's outputs. Second, there are intermediate outcomes, those changes that result from an application of the short term outcomes. Long term outcomes, or program impacts, follow from the benefits accrued through the intermediate outcomes. For example, results from a laboratory prototype for an energy saving technology may be a short-term outcome; the commercial scale prototype an intermediate outcome; and a cleaner environment once the technology is in use one of the desired longer term benefits or outcomes.
A critical feature of the performance story is the identification and description of key contextual factors external to the program and not under its control that could influence its success either positively or negatively. It is important to examine the external conditions under which a program is implemented and how those conditions affect outcomes. This explanation helps clarify the program 'niche' and the assumptions on which performance expectations are set. Doing this provides an important contribution to program improvement (Weiss, 1997). Explaining the relationship of the problem addressed through the program, the factors that cause the problem, and external factors enables the manager to argue that the program is addressing an important problem in a sensible way.
3. Building the Logic Model
Fig. 1. Elements of the Logic Model: Resources (inputs) -> Activities -> Outputs (for customers reached) -> Short-term Outcomes -> Intermediate Outcomes (through customers) -> Longer-term Outcomes and Problem Solution, with External Influences and Related Programs running beneath all elements.

As we provide detailed guidance on how to develop a Logic Model and use it to determine key measurement
and evaluation points, it will become clearer how the Logic Model process helps program managers answer the questions Mr. Koskinen and others are asking of them. An example of a federal energy research and technology development program will be used throughout. Program managers in the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy have been using the Logic Model process since 1993 to help communicate the progress and value of their programs to Congress, partners, customers, and other stakeholders.
The Logic Model is constructed in five stages, discussed below. Stage 1 is collecting the relevant information; Stage 2 is describing the problem the program will solve and its context; Stage 3 is defining the elements of the Logic Model in a table; Stage 4 is constructing the Logic Model; and Stage 5 is verifying the Model.
3.1. Stage 1: collecting the relevant information
Whether designing a new program or describing an existing program, it is essential that the manager or a work group collect information relevant to the program from multiple sources. The information will come in the form of program documentation as well as interviews with key stakeholders both internal and external to the program. While Strategic Plans, Annual Performance Plans, previous program evaluations, pertinent legislation and regulations, and the results of targeted interviews should be available to the manager before the Logic Model is constructed, as with any project, this will be an iterative process requiring the ongoing collection of information. Conducting a literature review to gain insights into what others have done to solve similar problems, and key contextual factors to consider in designing and implementing the program, can present powerful evidence that the program approach selected is correct.
Building the Logic Model for a program should be a team effort in most cases. If the manager does it alone, there is a great risk that parts viewed as essential by some will be left out or incorrectly represented. In the following steps to building the Logic Model we refer to the manager as the key player. However, we recommend that persons knowledgeable of the program's planned performance, including partners and customers, be involved in a work group to develop the Model. As the building process begins, it will become evident that there are multiple realities, or views, of program performance. Developing a shared vision of how the program is supposed to work will be a product of persistent discovery and negotiation between and among stakeholders.
In cases where a program is complex or poorly defined, or communication and consensus are lacking, we recommend that a small subgroup or perhaps an independent facilitator be asked to perform the initial analysis and synthesis through document reviews and individual and focus group interviews. The product of this effort can then be presented to a larger work group as a catalyst for the Logic Model process.
3.2. Stage 2: clearly defining the problem and its context
Clearly defining the need for the program is the basis for all that follows in the development of the Logic Model. The program should be grounded in an understanding of the problem that drives the need for the program. This understanding includes understanding the problems customers face and what factors 'cause' the problems. It is these factors that the program will address to achieve the longer term goal—working through customers to solve the problem. For example:
There are economic and environmental challenges related to the production, distribution, and end use of energy. U.S. taxpayers face problems such as dependence on foreign oil, air pollution, and the threat of global warming from burning of fossil fuels. Factors that might be addressed to increase the efficiency of end use of energy include the limited knowledge, risk aversion, and budget constraints of consumers; the lack of competitively priced clean and efficient energy technologies; the externalities associated with public goods; and restructuring of U.S. electricity markets. To solve the problem of economic and environmental challenges related to the use of energy, the program chooses to focus on factors related to developing clean and efficient energy technologies and changing customer values and knowledge. In this way, the program will influence customer use of technologies that will lead to decreased use of energy, particularly of fossil fuels.
One of the greatest challenges faced by work groups
developing Logic Models is describing where their program ends and others start. For the process of building a specific program's Logic Model, the program's performance ends with the problem it is designed to solve with the resources it has acquired, including the external forces that could influence its success in solving that problem. Generally, the manager's concern is determining the reasonable point of accountability for the program. At the point where the actions of customers, partners, or other programs are as influential on the outcomes as actions of the program, there is a shared responsibility for the outcomes, and the program's accountability for the outcomes should be reduced. For example, the adoption of energy efficient technologies is also influenced by financiers and manufacturers of those technologies.
3.3. Stage 3: defining the elements of the Logic Model
3.3.1. Starting with a table
Building a Logic Model usually begins with categorizing the information collected into 'bins', or columns in a table. Using the categories discussed above, the manager goes through the information and tags it as a resource, activity, output, short term outcome, intermediate outcome, long term outcome, or external factor. Since we are building a model of how the program works, not every program detail has to be identified and cataloged, just those that are key to enhancing program staff and stakeholder understanding of how the program works.
Figure 2 is a table with some of the elements of the
Logic Model for a technology program.
3.3.2. Checking the logic
As the elements of the Logic Model are being gathered, the manager and a work group should continually check the accuracy and completeness of the information contained in the table. The checking process is best done by involving representatives of key stakeholder groups to determine if they can understand the logical flow of the program from resources to solving the longer term problem. So the checking process goes beyond determining if all the key elements have been identified, to confirming that, reading from left to right, there is an obvious sequence or bridge from one column to the next.
One way to conduct the check is to start in any column in the table and ask the question, 'How did we get here?' For example, if we select a particular short term outcome, is there an output statement that leads to this outcome? Or, for the same outcome, we could ask, 'Why are we aiming for that outcome?' The answer lies in a subsequent outcome statement in the intermediate or long term outcome columns. If the work group cannot answer either the how or why question, then an element needs to be added or clarified by adding more detail to the elements in question.
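This how-and-why check can be sketched mechanically. The short Python fragment below is a minimal illustration only, not part of the original paper: the columns, element names, and links are all hypothetical. It represents the table as elements connected by left-to-right links and flags any element for which the work group could not answer 'How did we get here?' or 'Why are we aiming for that outcome?'

    # Columns of the Logic Model table, ordered left to right.
    COLUMNS = ["resource", "activity", "output", "customer",
               "short_term", "intermediate", "long_term"]

    # Hypothetical (from, to) links; each element is a (column, name) pair.
    links = [
        (("resource", "budget"), ("activity", "fund grants")),
        (("activity", "fund grants"), ("output", "progress reports")),
        (("output", "progress reports"), ("customer", "industry researchers")),
        (("customer", "industry researchers"), ("short_term", "R&D advances made")),
        (("short_term", "R&D advances made"), ("intermediate", "prototype completed")),
        # Note: no onward link yet from the intermediate outcome.
    ]

    elements = {e for pair in links for e in pair}
    for col, name in sorted(elements, key=lambda e: COLUMNS.index(e[0])):
        has_how = any(dst == (col, name) for _, dst in links)  # something leads here
        has_why = any(src == (col, name) for src, _ in links)  # this leads somewhere
        if col != COLUMNS[0] and not has_how:
            print(f"{name} ({col}): cannot answer 'How did we get here?'")
        if col != COLUMNS[-1] and not has_why:
            print(f"{name} ({col}): cannot answer 'Why are we aiming for this?'")

Run as written, the sketch flags only the intermediate outcome, which has no onward link: the same cue the work group would take to add or clarify an element.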
3.4. Stage 4: drawing the Logic Model
The Logic Model captures the logical flow and linkages that exist in any performance story. Using the program elements in the table, the Logic Model organizes the information, enabling the audience to understand and evaluate the hypothesized linkages. Where the resources, activities, and outcomes are listed within their respective columns in the story, they are specifically linked in the Model, so that the audience can see exactly which activities lead to what intermediate outcomes and which intermediate outcomes lead to what longer term outcomes or impacts.
Fig. 2. A table with elements of the Logic Model for an energy technology program:
  Resources: budget $XXX; #/capacities of staff; #/capabilities of partners; cost shares; technology roadmap; # of years experience.
  Activities: fund grants (solicit, review, etc.); research properties of materials; provide technical assistance; set policy for procurement.
  Outputs: #/type of awards; R&D progress reports; lab and commercial prototypes; advice provided; $ procurements affected.
  Customers Reached: federal and private researchers; industrial firms impacted; manufacturers; existing/future consumers of related products.
  Short Term Outcomes: rejectees seek venture capital; R&D advances made; lab prototype started, results documented; tech roadmap revised; advice considered.
  Intermediate Outcomes: some get venture capital; lab prototype completed; commercial prototype designed; more efficient processes adopted; technology purchased.
  Long Term Outcomes: reduction in energy use; costs lower and economy more competitive; emissions from energy use are less, thus environment cleaner.
  External Influences: price of oil and other energy supply and distribution factors; economic growth; perception of risk of global climate change; market assumptions; technology assumptions.

Although there are several ways to present the Logic
Model (Rush & Ogborne, 1991; Corbeil, 1986), the Logic Model is usually set forth as a diagram with columns and rows, with the abbreviated text put in a box and linkages shown with connecting one-way arrows. We place inputs or resources to the program in the first column at the left of the Model and the longer term outcomes and problem to be solved in the far right column. In the second column, the major program activities are boxed. In the columns following activities, the intended outputs and outcomes from each activity are shown, listing the intended customer for each output or outcome. An example of a Logic Model for an energy efficiency research and development program is depicted in Fig. 3.
The rows are created according to activities or activity groupings. If there is a rough sequential order to the activities, as there often is, the rows will reflect that order reading from top to bottom of the diagram. This is the case if the accomplishments of the program come in stages, as demonstrated in our example of the if-then statements. When the outcomes from one activity serve as a resource for another activity chain, an arrow is drawn from that outcome to the next activity chain. The last in the sequence of activity chains could describe the efforts of external partners, as in the example in Fig. 3. Rather than a sequence, there could be a multi-faceted approach with several concurrent strategies that tackle a problem. For example, a program might do research in some areas and technology development and deployment in others, all working toward one goal such as reducing energy use and emissions.
Although the example shows one-to-one relationships among program elements, this is not always the case. It may be that one output leads to one or more different outcomes, all of which are of interest to stakeholders and are part of describing the value of the program.
Activities can be described at many levels of detail. Since models are simplifications, activities that lead to the same outcome(s) may be grouped to capture the level of detail necessary for a particular audience. A rule of thumb is that a Logic Model would have no more than five activity groupings. Most programs are complex enough that Logic Models at more than one level of detail are helpful. A Logic Model more elaborate than the simple one shown in Fig. 1 can be used to portray more detail for all or any one of its elements. For example, research activities may include literature reviews, conducting experiments, collecting information from multiple sources, analyzing data, and writing reports. These can be grouped and labeled research. However, it may be necessary to formulate a more detailed and elaborate description of research sub-activities for those staff responsible and if this area is of specific interest to a stakeholder group. For example, funding agencies might want to understand the particular approach to research that will be employed to answer key research questions.
The final product may be viewed as a network dis-
playing the interconnections between the major elements
of the program's expected performance, from resources
to solving an important problem. External factors are
entered into the Model at the bottom, unless the program
has sufficient information to predict the point at which
they might occur.
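As a toy illustration of how the rows of such a diagram follow from the element table (the activity chains below are hypothetical and only loosely echo Fig. 3; none of this is from the paper), each activity grouping can be printed as one left-to-right chain, with an outcome of one chain feeding the next:

    # Each activity grouping becomes a row of the diagram, read left to right.
    rows = {
        "Perform research": ["ideas for technology change", "industry researchers",
                             "applications in energy technologies"],
        "Develop technology": ["lab prototype report", "users and manufacturers",
                               "commercial prototypes"],
    }

    for activity, chain in rows.items():
        # One-way arrows stand in for the connecting lines in the diagram.
        print(activity, "->", " -> ".join(chain))

    # An outcome of one chain can serve as a resource for another chain.
    print("applications in energy technologies feeds the 'Develop technology' row")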
Fig. 3. Logic chart for a research and technology development and deployment program. The chart's rows trace four activity chains (perform research; develop technology; deploy technology; produce technology and educate market), each drawing on program resources ($, staff, labs, management; commercial $ and staff for the last chain). Outputs (ideas for technology change; lab prototype report; policies, incentives, information; manufacture in market) reach industry researchers and users and manufacturers, leading to short-term outcomes (applications in energy technologies; commercial prototypes; less risk; technology accepted and purchased), intermediate outcomes (potential for technology change documented; technology available for commercial market; early adopters buy; lower energy bills and emissions from use), and the longer-term outcome of problem solution (a shared responsibility): a more competitive economy and a cleaner environment. External influences: price of oil and electricity, economic growth in industry and in general, perception of risk of global climate change, technology assumptions.
3.5. Stage 5: verifying the Logic Model with stakeholders
As the Logic Model process unfolds, the work group responsible for producing the Model should continuously evaluate the Model with respect to its goal of representing the program logic—how the program works under what conditions to achieve its short, intermediate, and long term aims. The verification process followed with the table of program logic elements is continued with appropriate stakeholders engaged in the review process. The work group will use the Logic Model diagram(s) and the supporting table and text. During this time, the work group also can address what critical information they need about performance, setting the stage for a measurement plan.
In addition to the how-why and if-then questions, we recommend four evaluation questions be addressed in the final verification process:
(1) Is the level of detail sufficient to create understandings of the elements and their interrelationships?
(2) Is the program logic complete? That is, are all the key elements accounted for?
(3) Is the program logic theoretically sound? Do all the elements fit together logically? Are there other plausible pathways to achieving the program outcomes?
(4) Have all the relevant external contextual factors been identified and their potential influences described?
A good way to check the Logic Model is to describe the program logic as hypotheses, a series of if-then statements (United Way of America, 1996). Observations of key contextual factors provide the conditions under which the hypotheses will be successful. The hypothesis or proposition the work group is stating is: 'If assumptions about contextual factors remain correct and the program uses these resources with these activities, then it will produce these short-term outcomes for identified customers who will use them, leading to longer term outcomes.'
This series of if-then statements is implicit in Fig. 1. If resources, then program activities. If program activities, then outputs for targeted customer groups. If outputs change behavior, first short and then intermediate outcomes occur. If intermediate outcomes lead to the longer term outcomes, this will lead to the problem being solved.
For example, given the problem of limited energy resources, the hypothesis might go something like this:
Under the conditions that the price of oil and electricity increase as expected, if the program performs applied research, then it will produce ideas for technology change. If industry researchers take this information and apply it to energy technologies, then the potential for technology changes will be tested and identified. If this promising new knowledge is used by technology developers, then prototypes of energy efficient technologies can be developed. If manufacturers use the prototypes and perceive value and low risk, then commercially available energy saving technologies will result. If there is sufficient market education and incentives and if the price is right, then consumers will purchase the new technologies. If the targeted consumers use the newly purchased technologies, then there should be a net reduction in energy use, energy costs, and emissions, thus making the economy more competitive and the environment cleaner.
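The role these chained hypotheses play in verification, and later in measurement, can be sketched in a few lines. This is an illustration only; the stage labels and the 'observed' evidence below are hypothetical, not findings from the paper. The sketch walks the chain in order and reports the first link the evidence fails to support:

    # Hypothetical if-then chain, in order: (condition, consequence) pairs.
    chain = [
        ("program performs applied research", "ideas for technology change"),
        ("industry researchers apply the ideas", "potential changes tested"),
        ("developers use the new knowledge", "energy efficient prototypes"),
        ("manufacturers see value and low risk", "commercial technologies"),
        ("market education, incentives, right price", "consumers purchase"),
        ("consumers use the technologies", "lower energy use and emissions"),
    ]

    # Hypothetical monitoring evidence: True = observed, False = not observed.
    observed = {
        "ideas for technology change": True,
        "potential changes tested": True,
        "energy efficient prototypes": False,
    }

    for condition, consequence in chain:
        status = observed.get(consequence)
        if status is None:
            print(f"no data yet: if {condition}, then {consequence}")
            break
        if not status:
            print(f"story breaks here: if {condition}, then {consequence}")
            break
        print(f"supported: if {condition}, then {consequence}")

Walking the chain this way makes explicit which links the evidence supports and where the performance story first breaks down.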
4. Measuring performance
Measurement activities take their lead from the Logic Model produced by the work group. There are essentially two purposes for measuring program performance: accountability, or communicating the value of the program to others, and program improvement. When most managers are faced with accountability requirements, they focus on collecting information or evidence of their program's accomplishments—the value added for their customers and the degree to which targeted problems have been solved. Another way to be accountable is to be a good manager. Good managers collect the kind of information that enables them to understand how well their program is working. In order to acquire such an understanding, we believe that, in addition to collecting outcome information, the program manager has to collect information that provides a balanced picture of the health of the program. When managers adopt the program improvement orientation to measurement, they will be able to provide accountability information to stakeholders, as well as make decisions regarding needed improvements to improve the quality of the program.
Measurement strategies should involve ongoing monitoring of what happened in the essential features of the program performance story and evaluation to assess their presumed causal linkages or relationships, including the hypothesized influences of external factors. Weiss (1997), citing her earlier work, noted the importance of not only capturing the program process but also collecting information on the hypothesized linkages. According to Weiss, the measurement should 'track the steps of the program'. In the Logic Model, the boxes are the steps that can often be simply counted or monitored, and the lines connecting the boxes are the hypothesized linkages or causal relationships that require in-depth study to determine and explain what happened.
It is the measurement of the linkages, the arrows in the logic chart, which allows the manager to determine if the program is working. Monitoring the degree to which elements are in place, even the intended and unintended outcomes, will not explain the measurements or tell the manager if the program is working. What is essential is the testing of the program hypotheses. Even if the manager
observes that intended outcomes were achieved, the following question must be asked: 'What feature(s), if any, of the program contributed to the achievement of intended and unintended outcomes?'
Thus adopting the program improvement orientation to performance measurement requires going beyond keeping score. Earlier we referred to Patton's (1997) espoused theory of action. The first step in improvement measurement is determining whether what has been planned in the Logic Model actually occurred. Patton would refer to this as determining theories-in-use. Scheirer (1994) provides an excellent review of process evaluation, including not only methods for conducting the evaluation of how the program works, but also criteria to apply in the evaluation.
The Logic Model provides the hypothesis of how the program is supposed to work to achieve intended results. If it is not implemented according to design, then there may be problems reaching program goals. Furthermore, information from the process evaluation serves as explanatory information when the manager defends accountability claims and attributes the outcomes to the program.
Yin (1989) discusses the importance of pattern matching as a tool to study the delivery and impact of a program. The use of the Logic Model process results in a pattern that can be used in this way. As such it becomes a tool to assess program implementation and program impacts. An iterative procedure may be applied that first determines the theory-in-use, followed by either revisions in the espoused theory or tightening of the implementation of the espoused theory. Next, the resulting tested pattern can be used to address program impacts.
We should note that the verification and checking activities described earlier with respect to Stages 4 and 5 actually represent the first stages of performance measurement. That is, this process ensures that the program design is logically constructed, that it is complete, and that it captures what program staff and stakeholders believe to be an accurate picture of the program.
Solving the measurement challenge often requires that stakeholder representatives be involved in the planning. Stakeholders and the program should agree on the definition of program success and how it will be measured. And often the program has to rely on stakeholders to generate measurement data. Stakeholders have their own needs for measurement data as well as constraints in terms of resources and confidentiality of data.
The measurement plan can be based on the logic chart(s) developed for the program. The manager or work team should use Logic Models with a level of detail that matches the detail needed in the measurement. Stakeholders have different measurement needs. For example, program staff have to think and measure at a more detailed level than upper management.
The following are the performance measurement questions across the performance story which the manager and work team will use to determine the performance measurement plan:
(1) Is (was) each element proposed in the Logic Model in place, at the level expected for the time period? Are outputs and outcomes observed at expected performance levels? Are activities implemented as designed? Are all resources, including partners, available and used at projected levels?
(2) Did the causal relationships proposed in the Logic Model occur as planned? Is reasonable progress being made along the logical path to outcomes? Were there unintended benefits or costs?
(3) Are there any plausible rival hypotheses that could explain the outcome/result?
(4) Did the program reach the expected customers, and are the customers reached satisfied with the program services and products?
A measurement plan will include a small set of critical measures, balanced across the performance story, that are indicators of performance. There may be strategic measures at a high level and tactical measures for implementers of the program. The plan will also include the important performance measurement questions that must be addressed and suggest appropriate timing for outcomes or impact evaluation. This approach to measurement will enable the program manager and stakeholders to assess how well the program is working to achieve its short term, intermediate, and long term aims and to assess those features of the program and external factors that may be influencing program success.
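As a rough sketch of what such a plan might look like as a working artifact (the measures, model locations, and timings below are hypothetical, not drawn from the paper), each critical measure can be keyed to the model element or linkage it monitors and to the measurement question it informs:

    # Hypothetical measurement plan: measure, model location, question, timing.
    measurement_plan = [
        ("number and type of awards made", "output: awards",
         "Q1: elements in place", "quarterly"),
        ("lab prototype results documented", "short-term outcome",
         "Q1: expected performance levels", "annually"),
        ("share of adopters citing program advice", "link: advice -> adoption",
         "Q2: causal links; Q3: rival explanations", "impact study, year 3"),
        ("customer satisfaction with assistance", "customers reached",
         "Q4: reach and satisfaction", "annual survey"),
    ]

    for measure, location, question, timing in measurement_plan:
        print(f"{measure} | {location} | {question} | {timing}")

Keying the plan to elements (the boxes) and linkages (the arrows) preserves the balance called for above: some measures simply count what is in place, while others schedule the in-depth study of hypothesized causal links.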
5. Conclusion
This paper has set forth for program managers and those who support them the Logic Model tool for telling the program's performance story. Telling the story involves answering the questions 'What are you trying to achieve and why is it important?', 'How will you measure effectiveness?', and 'How are you actually doing?' The final product of the Logic Model process will be a Logic Model diagram(s) that reveals the essence of the program, text that describes the Logic Model diagram, and a measurement plan. Armed with this information, the manager will be able to meet accountability requirements and present a logical argument, or story, for the program. Armed with this information, the manager will be able to undertake both outcomes measurement and improvement measurement. Because the story and the measurement plan have been developed with the program stakeholders, the story should be a shared vision with clear and shared expectations of success.
The authors will continue to search for ways to facilitate the use of the Logic Model process and convince
managers and stakeholders of the benefits of its use. We welcome feedback from managers, stakeholders, and facilitators who have tried this or similar tools to develop and communicate a program's performance story.
Acknowledgments
In addition to the authors cited in the references, the authors thank Joe Wholey, Jane Reismann, and other reviewers for sharing their understanding of Logic Models. The authors acknowledge the funding and support of Darrell Beschen and the program managers of the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy, performed under contract DE-AC04-94AL85000 with Sandia National Laboratories. The opinions expressed and the examples used are those of the authors, not the Department of Energy.
References
Bickman, L. (1987). The functions of program theory. In L. Bickman (Ed.), Using program theory in evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass.
Chen, H. T. (1990). Theory-driven evaluations. Newbury Park, CA: Sage.
Corbeil, R. (1986). Logic on logic models. Evaluation Newsletter. Ottawa: Office of the Comptroller General of Canada, September.
Jordan, G. B., & Mortensen, J. (1997). Measuring the performance of research and technology programs: a balanced scorecard approach. Journal of Technology Transfer, 22(2), Summer.
Jordan, G. B., Reed, J. H., & Mortensen, J. C. (1997). Measuring and managing the performance of energy programs: an in-depth case. Presented at the Eighth Annual National Energy Services Conference, Washington, DC, June.
Koskinen, J. A. (1997). Office of Management and Budget testimony before the House Committee on Government Reform and Oversight hearing, February 12.
McDonald, R., & Teather, G. (1997). Science and technology policy evaluation practices in the Government of Canada. In Policy evaluation in innovation and technology: towards best practices. Proceedings of the Organization for Economic Co-operation and Development.
Montague, S. (1994). The three R's of performance-based management. Focus, December-January.
Montague, S. (1997). The three Rs of performance. Ottawa, Canada: Performance Management Network, Inc., September.
Patton, M. Q. (1997). Utilization-focused evaluation: the new century text. Thousand Oaks: Sage, pp. 221-223.
Rush, B., & Ogborne, A. (1991). Program logic models: expanding their role and structure for program planning and evaluation. Canadian Journal of Program Evaluation, 6(2).
Scheirer, M. A. (1994). Designing and using process evaluation. In Wholey et al. (Eds.), Handbook of practical program evaluation (pp. 40-66).
Torvatn, H. (1999). Using program theory models in evaluation of industrial modernization programs: three case studies. Evaluation and Program Planning, 22(1), 73-82.
United Way of America (1996). Measuring program outcomes: a practical approach. Arlington, VA: United Way of America.
Weiss, C. (1997). Theory-based evaluation: past, present, and future. In D. Rog & D. Fournier (Eds.), Progress and future directions in evaluation: perspectives on theory, practice, and methods. New Directions for Program Evaluation, no. 76. San Francisco: Jossey-Bass.
Wholey, J. S. (1983). Evaluation and effective public management. Boston: Little, Brown.
Wholey, J. S. (1987). Evaluability assessment: developing program theory. In L. Bickman (Ed.), Using program theory in evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass.
Yin, R. K. (1989). Case study research: design and methods. Newbury Park: Sage, pp. 109-113.
-------
7. Using Program Theory to Replicate Successful Programs  71
Timothy A. Hacsi
Replicating successful programs is extremely difficult; program theory can help provide information for better replication and adaptation of programs.

8. Theory-Based Evaluation: Gaining a Shared Understanding Between School Staff and Evaluators  79
Tracy A. Huebner
Theory-based evaluation can enhance staff support for and understanding of evaluation; it can also encourage reflective practice.

9. Developing and Using a Program Theory Matrix for Program Evaluation and Performance Monitoring  91
Sue C. Funnell
An approach to program theory evaluation is described that encompasses performance monitoring and the broader contexts that affect programs.

10. Summing Up Program Theory  103
Leonard Bickman
The issues raised in the volume are summarized, analyzed, and placed in the context of other developments in program theory and evaluation.

INDEX  113
EDITORS' NOTES
It has been more than thirty years since evaluators first understood the advantages in making explicit and testing program theory—that is, the underlying assumptions about how a program will work to achieve intended outcomes. And it has been ten years since the last of two issues of New Directions for Program Evaluation (the former name of this series) was devoted to providing evaluators with some of the tools needed to carry out program theory evaluations in practice and helped refine the many conceptual and practical issues involved. Since those volumes were published, the frontiers of evaluation have expanded to meet new challenges, such as performance measurement, organizational learning, collaborative and participatory research, and meta-analysis.
In Program Theory in Evaluation: Challenges and Opportunities, we examine the real or potential role for program theory in these newer areas. But some thorny issues remain for evaluators implementing such evaluations, particularly in regard to causal inference. Following the Introduction, Part One includes four chapters that address some of these challenges. Despite some persistent questions, there are opportunities for program theory to help evaluators in areas such as performance measurement and meta-analysis. Part Two includes four chapters that discuss this potential. The volume's summary chapter belongs to Leonard Bickman.
In Chapter One, the editors assess the current state of program theory, drawing on their own review of the available literature. They use the review as a backdrop to discuss program theory's history, its complexity of definitions and variation in lexicon, its diversity in application, and its strengths and limitations in practice.
Some of the challenges in implementing program theory in evaluation practice are addressed in Part One. A major issue for evaluators is causal inference in program theory evaluations. In Chapter Two, Jane Davidson argues that randomization is but one of several ways that scientists acknowledge can establish causal inference and that evaluation clients often accept lesser standards of proof about causality to get the information they need. She offers that a program theory evaluation could meet the lower standards of proof required in settings that evaluators often find themselves in. In contrast, Thomas Cook argues in Chapter Three that program theory is not sufficient to establish causal inference. Rather than 'falsely choosing' between randomization and program theory, the evaluator can make the 'optimal' choice and combine both.
In addition to the complexities associated with causal inference, there are challenges in deciding what types of program theories or models to test with an evaluation. There are often a number of plausible theories about how a program works and an abundance of links to consider. In an ideal
world, the evaluator would be able to study them all. But as Carol Weiss writes in Chapter Four, the evaluator has to make simplifying choices. She goes on to provide guidance for evaluators who are pondering which links in which theory they should test. In Chapter Five, Patricia Rogers shows how simple models—usually depicting causation like a chain of dominoes—may not accurately reflect how programs work. She provides examples of how evaluators have developed more realistic causal models and makes the point that complexity is not the goal, but rather useful 'maps' that inform subsequent decisions.
Part Two explores the potential for program theory to make contributions in evaluation's new frontiers. The advent of meta-analysis has meant that more evaluators not only are asked to make sense of multiple evaluations but also are pressured to improve their own studies for subsequent reviews. In Chapter Six, Anthony Petrosino demonstrates how even simple program theory evaluations could be used in meta-analysis to accumulate knowledge. Generalizing from a program theory evaluation in one setting to an evaluation in the next setting is Timothy Hacsi's concern in Chapter Seven. After reviewing some basic problems in program diffusion, he shows how the subtle information generated by program theory evaluation offers an alternative to current models for replicating innovations.
Another challenge faced by evaluators is gaining a shared understanding with program staff. In Chapter Eight, Tracy Huebner provides several illustrative case studies to show the value-added of program theory in educational evaluation. Huebner finds that the approach helped evaluators coordinate their goals with those of school staff and reduced the normal resistance that teachers and administrators often have to "yet another study," particularly one that requires them to collect the data.
Evaluators are often being asked to develop systems for monitoring performance. In Chapter Nine, Sue Funnell outlines the problems encountered in developing such systems and draws on her experience to show how program theory can help evaluators develop monitoring tools that make sense. Her matrix can be used to draw out underlying theories about why agencies or programs should succeed, to identify indicators for which measurement is needed, and to ensure that important external factors outside the boundaries of the organization are also monitored.
Finally, in Chapter Ten, Leonard Bickman summarizes these contributions and outlines a future agenda for program theory evaluation. As editor of two seminal New Directions for Program Evaluation volumes on program theory, Bickman has the perfect perch from which to critique the volume and offer his observations and predictions.
We believe this volume offers hope and pragmatism. Program theory can help evaluators meet some of the new challenges they face. But implementing program theory evaluation certainly offers its own challenges and quandaries. Our hope is that this New Directions issue not only will stimulate thinking about program theory evaluation but also—and even more important—will result in an increase in real-world tests and applications.
Anthony Petrosino
Patricia J. Rogers
Tracy A. Huebner
Timothy A. Hacsi
Editors
ANTHONY PETROSINO is research fellow at the Center for Evaluation, Initiatives for Children Program, American Academy of Arts and Sciences, and research associate at the Harvard Graduate School of Education.

PATRICIA J. ROGERS is director of the Program for Public Sector Evaluation in the Faculty of Applied Science, Royal Melbourne Institute of Technology, Australia.

TRACY A. HUEBNER is coordinator for comprehensive school reform at WestEd.

TIMOTHY A. HACSI is research fellow at the Harvard Children's Initiative and teaches history at the Harvard Extension School.
-------
Our review found many variations of PTE in practice and much to recommend it. And elements of PTE, whether the evaluators use the terminology or not, are being used in a wide range of areas of concern to evaluators. Based on this review, in this chapter we discuss the practice, promise, and problems of PTE.
What Is Program Theory Evaluation?
Because this volume is intended to demonstrate the diversity of practice, we have used a broad definition of program theory evaluation. We consider it to have two essential components, one conceptual and one empirical: PTE consists of an explicit theory or model of how the program causes the intended or observed outcomes and an evaluation that is at least partly guided by this model. This definition, though deliberately broad, does exclude some versions of evaluation that have the word theory attached to them. It does not cover all six types of theory-driven evaluation defined by Chen (1990) but only the type he refers to as intervening mechanism evaluation. It also excludes evaluations that articulate a theory of the
program but that do not use the theory to guide the evaluation. Nor does it include evaluations in which the program theory is a list of activities, like a "to do" list, rather than a model showing a series of intermediate outcomes, or mechanisms, by which the program activities are understood to lead to the desired ends.
The idea of basing program evaluation on a causal model of the program is not a new one. At least as far back as the 1960s, Suchman suggested that program evaluation might address the achievement of a "chain of objectives" (1967, p. 55) and argued for the benefit of doing this: "The evaluation study tests some hypothesis that activity A will attain objective B because it is able to influence process C, which affects the occurrence of this objective. An understanding of all three factors—program, objective, and intervening process—is essential to the conduct of evaluative research" (1967, p. 177).
Weiss (1972) went on to explain how an evaluation could identify several possible causal models of a teacher home-visiting program and could determine which was the best as supported by evidence. In the three decades since, many different terms have been used for this type of evaluation, including outcomes hierarchies (Bennett, 1975) and theory-of-action (Schon, 1997). More commonly, the terms program theory (Bickman, 1987, 1990), theory-based evaluation (Weiss, 1995, 1997), and program logic (Lenne and Cleland, 1987) have been used. Rossi, Freeman, and Lipsey's widely used text Evaluation has now, in its sixth edition, added a chapter on this approach (Rossi, Freeman, and Lipsey, 1999). Similarly, Evaluation Models: Viewpoints on Educational and Social Programs (Madaus, Stufflebeam, and Scriven, 1983) has added a chapter on program theory evaluation in its second edition (Rogers, forthcoming).
Practice: Diverse Choices to Meet Diverse Needs
Program theory is known by many different names, created in many different ways, and used for any number of purposes. Here we provide a brief road map to the variety of ways people think about and employ program theory.
Locating Examples. To try to understand the variety of ways in which program theory evaluation is now being used, we began in early 1998 to comb through available bibliographical databases, citation indexes, and evaluation reports. We also reviewed conference proceedings, dissertations, and articles from a variety of disciplines. In addition, we received many helpful examples in response to an inquiry to the American Evaluation Association's Internet discussion list, EVALTALK. Our efforts turned up examples dating from 1997 to 2000 from the United States, Canada, Australia, New Zealand, and the United Kingdom. We have not included every example that we located in this volume but instead have used examples to identify and illustrate critical challenges in using program theory or ways of addressing them. Our review showed amazing diversity in theory and practice across two main areas—how program theories are developed and how they are used to guide evaluations.
Developing the Program Theory—Who, When, and What. In some evaluations, the program theory has been developed largely by the evaluator, based on a review of research literature on similar programs or relevant causal mechanisms, through discussions with key informants, through a review of program documentation, or through observation of the program itself (Lipsey and Pollard, 1989). In other evaluations, the program theory has been developed primarily by those associated with the program, often through a group process. Many practitioners advise using a combination of these approaches (Pawson and Tilley, 1995; Patton, 1996; see also Funnell, Chapter Nine).
The program theory can be developed before the program is implemented or after the program is under way. At times, it is used to change program practice as the evaluation is beginning. Most program theories are summarized in a diagram showing a causal chain. Among the many variations we will highlight just three for now; Rogers discusses other variations in Chapter Five.
At its simplest, a program theory shows a single intermediate outcome by which the program achieves its ultimate outcome. For example, in a program designed to reduce substance abuse, we might test whether or not the program succeeds in changing knowledge about possible dangers and then whether or not this seems important in achieving the desired behavior change. As Petrosino (Chapter Six) points out, for some program areas, articulating this mediating variable and measuring it would be a significant advance on current practice.
More complex program theories show a series of intermediate outcomes, sometimes in multiple strands that combine to cause the ultimate outcomes. So for a substance abuse prevention program, we might theorize that an effective program will generate a positive reaction among participants, change both attitudes and knowledge, and develop participants' skills in resisting peer pressure. Although these more complex program theories may more adequately represent the complexity of programs, it is impossible to design an evaluation that adequately covers all the factors they identify. Weiss (Chapter Four) proposes some ways to select the particular causal links that any one evaluation might study.
The third type of program theory is represented by a series of boxes labeled inputs, processes, outputs, and outcomes, with arrows connecting them. It is not specified which processes lead to which outputs. Instead the different components of a program theory are simply listed in each box. Although this type of program theory does not show the relationships among different components, these relationships are sometimes explored in the empirical component of the evaluation.
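To make the three shapes concrete, the sketch below (illustrative only; the labels are hypothetical and loosely echo the substance abuse example) represents each as data. The binned model lists components but specifies no element-level links, which is exactly why its relationships must be explored empirically if at all:

    # Shape 1: a single mediator between program and ultimate outcome.
    single_mediator = [("program", "knowledge of dangers"),
                       ("knowledge of dangers", "behavior change")]

    # Shape 2: multiple strands combining to cause the ultimate outcome.
    multi_strand = [("program", "positive reaction"),
                    ("program", "attitudes and knowledge"),
                    ("program", "peer-resistance skills"),
                    ("attitudes and knowledge", "behavior change"),
                    ("peer-resistance skills", "behavior change")]

    # Shape 3: binned boxes with no element-level links specified.
    binned = {"inputs": ["staff", "budget"],
              "processes": ["classroom sessions"],
              "outputs": ["students reached"],
              "outcomes": ["behavior change"]}

    print("single mediator: links to test =", len(single_mediator))
    print("multi-strand: links to test =", len(multi_strand))
    print("binned model: components =", sum(len(v) for v in binned.values()),
          "; links to test = 0")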
Using the Program Theory to Guide the Evaluation. Program theory has been used in quite different ways to guide evaluation. Examples show diversity in the purpose and audience of the evaluation, the type of research design, and the type of data collected. Within this diversity, it is possible to identify two broad clusters of practice.
In some PTEs, the main purpose of the evaluation is to test the program theory, to identify what it is about the program that causes the outcomes. This sort of PTE is most commonly used in large, well-resourced evaluations focused on such summative questions as, Does this program work? and Should this pilot be extended? These theory-testing PTEs wrestle with the issue of causal attribution—sometimes using experimental or quasi-experimental designs in conjunction with program theory and sometimes using program theory as an alternative to these designs. Such evaluations can be particularly helpful in distinguishing between theory failure and implementation failure (Lipsey, 1993; Weiss, 1997). By identifying and measuring the intermediate steps of program implementation and the initial impacts, we can begin to answer these questions. These intermediate outcomes also provide some interim measure of program success for programs with long-term intended outcomes.
An example of this type of program theory evaluation can be found in the Family Empowerment Project evaluation, in which Bickman and colleagues (1998) conducted an experimental test of the effects of a program that trained parents to be stronger advocates for children in the mental health system. They articulated a model of how the program was assumed to work: First, parent training would increase the parent's knowledge, self-efficacy, and advocacy skills. Second, parents would then become more involved in their child's mental health care. Finally, this collaboration would lead to the child's improved mental health outcomes.
But they did not stop with the articulation of a program theory. They also constructed measures, collected data, and analyzed them to test these underlying assumptions. The program was able to achieve statistically significant effects on parental knowledge and self-efficacy, but no useful measures for testing advocacy skills could be found. Unfortunately, the intervention had no apparent effect on caregiver involvement in treatment or service use and ultimately had no impact on the eventual mental health status of the children.
Evaluations such as these seem to be at least implicitly based on Weiss's definition of program theory: "[It] refers to the mechanisms that mediate between the delivery (and receipt) of the program and the emergence of the outcomes of interest" (1998, p. 57).
The other type of program theory evaluation is often seen in small evaluations done at the project level by or on behalf of project managers and staff. In these cases, program theory is more likely to be used for formative evaluation, to guide their daily actions and decisions, than for summative evaluation. Such PTEs are often not concerned with causal attribution. Although some of these evaluations pay attention to the influence of external factors, there is rarely systematic ruling out of rival explanations for the outcomes. Many of these evaluations have been developed in response to the increasing demands for programs and agencies to report performance information and to demonstrate their use of evaluation to improve their services. In these circumstances, PTE has often been highly regarded because of the benefits it provides to program managers and staff in terms of improved planning and management, in addition to its use as an evaluation tool.
Stewart, Cotton, Duckett, and McLeady (1990) provide an example of this type of PTE in their evaluation of a project that recruited and trained volunteers to provide emotional support for people with AIDS, their lovers, families, and friends. The paper did not provide a diagram of the program theory model nor present any data. Instead Stewart and colleagues reported on the process of developing the model, the types of data that were gathered, and how the data were used: "Performance indicators developed by Ankali [the project] were both for the organisation's own purposes and to meet the requirements of the funding body. Both qualitative and quantitative indicators were selected. Ankali now uses the outcomes hierarchy during orientation of volunteers and report[s] that the process has assisted with improved targeting of volunteers and referral agencies, modification to the training program, supervision of clients and volunteers, and development of proposals for expansion and enhancement of the service" (1990, p. 317).
This type of program theory evaluation appears to be closer to that described by Wholey: "[It] identifies program resources, program activities, and intended program outcomes, and specifies a chain of causal assumptions linking program resources, activities, intermediate outcomes, and ultimate program goals" (1987, p. 78).
Despite the apparent popularity of program theory evaluation, we found that the formal evaluation literature still has comparatively few examples. For instance, when we searched the abstracts in six bibliographical databases for the time frame 1995-1999, we found program theory explicitly mentioned in evaluations of children's programs only twice. In addition, many of the evaluations that we found used theory in very limited and specific ways, for example, to help plan an evaluation, but very few used theory as extensively as the most prominent proponents of this approach suggest. But PTEs conducted in small projects or local sites are rarely published in refereed journals or distributed widely, being more likely to be presented as conference papers by practitioners or presented in performance measurement forums. And many of them fail to include what some would consider an essential component of a program theory evaluation—systematic testing of the causal model.
In this volume, we include examples of both types of PTE. Weiss (Chapter Four), Hacsi (Chapter Seven), and Petrosino (Chapter Six) discuss issues associated with theory-testing PTEs. Huebner (Chapter Eight) discusses four examples of action-guiding PTEs, and Funnell (Chapter Nine) discusses a technique for assisting with this sort of PTE.
Promises and Problems
Program theory has been seen as an answer to many different problems in evaluation. Here we briefly discuss several areas where program theory has been seen as promising.
Understanding Why Programs Do or Do Not Work. Among the promises made for PTE, the most tantalizing is that it provides some clues to answer the question of why programs work or fail to work. Consider the usual practice of trying to understand why a program succeeded or failed. Following reporting of results, evaluators usually work in a post hoc manner to suggest reasons for observed results (Petrosino, forthcoming, 2000). But without data, such post hoc theories are never tested, and given the poor state of replication in the social sciences, they are likely never to be.
In contrast, by creating a model of the microsteps or linkages in the causal path from program to ultimate outcome—and empirically testing it—PTE provides something more about why the program failed or succeeded in reaching the distal goals it had hoped to achieve, as in Bickman and colleagues' evaluation of the family empowerment program (1998). Perhaps the intervention was not able to improve advocacy skills—remember, those could not be measured. Or maybe there was a critical mechanism missing from the model, which the program was not activating or engaging. We learn something more than the program's apparent lack of impact on children's mental health.
If these issues cannot be adequately addressed in the original evaluation, PTE can provide an agenda for the next program and evaluation.
For example, a critical link in the Bickman and colleagues study was not tested (advocacy skills acquisition), given the paucity of measurement development in this area. Pointing out this deficiency suggests an agenda to develop an instrument to measure this variable in the next similar study.
Attributing Outcomes to the Program. Another promise sometimes made for PTE is better evidence for causal attribution—to answer the question of whether the program caused the observed outcomes. Program theory has been used by evaluators to develop better evidence for attributing outcomes to a program in circumstances where random assignment is not possible (for example, Homel, 1990, in an evaluation of random breath testing of automobile drivers). In the absence of a counterfactual, support for causal attribution can come from evidence of achievement of intermediate outcomes, investigation of alternative explanations for outcomes, and pattern matching. Support for causal attribution can also come from program stakeholder assessments (for example, Funnell and Mograby, 1995, in their evaluation of the impact of program evaluations in a road and traffic authority) or from data about a range of indicators, including data on external factors likely to influence the theorized causal pathway (for example, Ward, Maine, McCarthy, and Kamara, 1994, in their evaluation of activities to reduce maternal mortality in developing countries). It may be possible to develop testable hypotheses on the basis of the causal model (Pawson and Tilley, 1995), especially if the model includes contingencies or differentiation—expected differences in outcomes depending on differences in context. Causal attribution is also sometimes addressed by combining traditional experimental or quasi-experimental designs with PTE.
Many P'l l:s do not address attribution at all, simply leporimg implc -
mentation ol activities and achievement of intended outconus I his
apptoac h is parliculaily common where program ihcoiy is used in develop
ongoing monitoring and peifoimancc information systems ( ausal ainihii-
tion in P11 s is disc ussed in moie detail in the c hapleis in tins voluim by
( ook (C haptei Ihiee), Davidson (Chapter Iwo), and Mac si (C liaptei
Seven)
Improving the Program. Many of the claims for the benefits of PTE refer to its capacity to improve programs directly and indirectly. Articulating a program theory can expose faulty thinking about why the program should work, which can be corrected before things are up and running at full speed (Weiss, 1995). The process of developing a program theory can itself be a rewarding experience, as staff develop common understanding of their work and identify the most important components. Many accounts of PTE (such as Milne, 1993, and Huebner, Chapter Eight) report that this has been the most positive benefit from conducting PTE. In this way, PTE is very similar to the earlier technique of evaluability assessment.

But PTE is supposed to then use the program theory to guide the evaluation, and it is here that some evaluations falter. Whereas collaboratively building a program theory can be an energizing team activity, exposing this
-------
to harsh empirical tests can be less attractive. Practical difficulties abound as well. When PTE is implemented at a small project, staff may not have the time or skills to collect and analyze data in ways that either test the program theory or provide useful information to guide decisions and action. If program theory is used to develop accountability systems, there is a real risk of goal displacement, wherein staff seek to achieve targets and stated objectives at the cost of achieving the ultimate goal or sustainability of the program (Winston, 1991).
Conclusion
In this chapter, we have outlined the range of activity that can be considered program theory evaluation and have identified major issues in its theory and practice. These are discussed in more detail by the other chapters in this volume.
References
Bennett, C. "Up the Hierarchy." Journal of Extension, 1975, 13(2), 7-12.
Bickman, L. (ed.). Using Program Theory in Evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass, 1987.
Homel, R. "Random Breath Testing in New South Wales: The Evaluation of a Successful Social Experiment." National Evaluation Conference 1990: Proceedings, vol. 1. Australasian Evaluation Society, 1990.
Lenne, B., and Cleland, H. "Describing Program Logic." Program Evaluation Bulletin, no. 2. Public Service Board of New South Wales, 1987.
Lipsey, M. W. "Theory as Method: Small Theories of Treatments." In L. Sechrest and A. Scott (eds.), Understanding Causes and Generalizing About Them. New Directions for Program Evaluation, no. 57. San Francisco: Jossey-Bass, 1993.
Lipsey, M. W., and Pollard, J. "Driving Toward Theory in Program Evaluation: More Models to Choose From." Evaluation and Program Planning, 1989, 12, 317-328.
Madaus, G., Stufflebeam, D., and Scriven, M. Evaluation Models: Viewpoints on Educational and Human Services Evaluation. Norwell, Mass.: Kluwer, 1983.
Milne, C. "Outcomes Hierarchies and Program Logic as Conceptual Tools: Five Case Studies." Paper presented at the international conference of the Australasian Evaluation Society, Brisbane, 1993.
Patton, M. Q. Utilization-Focused Evaluation. (3rd ed.) Thousand Oaks, Calif.: Sage, 1997.
Petrosino, A. "Answering the 'Why' Question in Evaluation: The Causal-Model Approach." Canadian Journal of Program Evaluation, 2000, 15(1), 1-24.
Rogers, P. J. "Program Theory Evaluation: Not Whether Programs Work but How." In G. Madaus, D. Stufflebeam, and T. Kellaghan (eds.), Evaluation Models: Viewpoints on Educational and Human Services Evaluation. Norwell, Mass.: Kluwer, forthcoming.
Rossi, P. H., Freeman, H., and Lipsey, M. W. Evaluation: A Systematic Approach. Thousand Oaks, Calif.: Sage, 1999.
Schön, D. A. "Theory-of-Action Evaluation." Paper presented to the Harvard Evaluation Task Force, Apr. 1997.
Stewart, K., Cotton, K., Duckett, M., and McLeady, K. "The New South Wales Program Logic Model: The Experience of the AIDS Bureau, New South Wales Department of Health." Proceedings of the Annual Conference of the Australasian Evaluation Society, 1990, 315-322.
Suchman, E. A. Evaluative Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation, 1967.
United Way of America. Measuring Program Outcomes: A Practical Approach. Alexandria, Va.: United Way of America, 1996.
Ward, V. M., Maine, D., McCarthy, J., and Kamara, A. "A Strategy for the Evaluation of Activities to Reduce Maternal Mortality in Developing Countries." Evaluation Review, 1994, 18.
Weiss, C. H. Evaluation Research: Methods of Assessing Program Effectiveness. Englewood Cliffs, N.J.: Prentice Hall, 1972.
Weiss, C. H. "Nothing As Practical As Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families." In J. P. Connell, A. C. Kubisch, L. B. Schorr, and C. H. Weiss (eds.), New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts. Washington, D.C.: Aspen Institute, 1995.
Weiss, C. H. "How Can Theory-Based Evaluation Make Greater Headway?" Evaluation Review, 1997, 21, 501-524.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
Wholey, J. S. "Evaluability Assessment: Developing Program Theory." In L. Bickman (ed.), Using Program Theory in Evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass, 1987.

PATRICIA J. ROGERS is lecturer in public sector evaluation in the Faculty of Applied Science, Royal Melbourne Institute of Technology, Australia.

ANTHONY PETROSINO is research fellow at the Center for Evaluation, Initiatives for Children Program, American Academy of Arts and Sciences.
-------
need to conduct traditional causal modeling analyses of the pattern of influence from the intervention to the various mediating variables and then from these mediators to a distal outcome.

Few evaluators will argue against the more frequent and sophisticated use of substantive theory to detail intervening processes. Probably the sole exceptions are those who believe that the act of measuring process creates conditions different from those that would apply in the actual policy world. Few evaluators argue that it is not possible to collect measures of intervening processes. So it should be possible to construct and justify a theory-based form of evaluation that complements experiments and is in no way an alternative to them. It would prompt experimenters to be more thoughtful about how they conceptualize, measure, and analyze intervening process. It would also remind them of the need to first probe whether an intervention leads to change in each of the theoretically specified intervening processes and then explore whether these processes could plausibly have caused changes in the more distal outcomes of policy interest. I want to see theory-based methods used within an experimental framework and not as an alternative to it.
-------
Table 4.1. Theory of a Job-Training Program

Program publicizes a job-training program
Youth hear about the program
Youth are interested and motivated to apply
Program enrolls eligible youth
Youth sign up
Program provides occupational training in an accessible location
Youth attend regularly
Training matches labor market needs
Training is carried out well
Youth learn skills
Training teaches good work habits
Youth internalize values of regular employment and appropriate behavior on the job
Program refers youth to suitable jobs
Youth apply for jobs
Youth behave well in job interviews
Employers offer jobs
Youth accept jobs
Youth show up for work regularly
Program assists youth in making transition to work and helps with problems
Youth accept authority on the job
Youth do their work well
Youth behave well with coworkers
Youth stay on the job

Source: Adapted from Weiss, 1998, p. 54.
This is an attempt to see how far the program succeeds in accomplishing all the intervening phases between enrollment in the program and long-term job holding. If trainees do well all along the route from participation in the training program to staying on a job, there is at least plausible reason to believe that the program was responsible for the trainees' work success. (See Chapter Two by Jane Davidson for further discussion of establishing causality.)
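One way to make this concrete: the chain in Table 4.1 can be held as an ordered list of steps, each paired with whatever the evaluation found at that step. A minimal Python sketch (the condensed step names and the findings are hypothetical, not results from any actual evaluation):

    # The implementation theory as an ordered causal chain. Each step is
    # paired with the evaluation's finding: True (achieved), False (not
    # achieved), or None (no data collected).
    IMPLEMENTATION_THEORY = [
        ("Youth hear about the program", True),
        ("Youth enroll and attend regularly", True),
        ("Youth learn occupational skills", True),
        ("Youth internalize work values", None),   # no instrument available
        ("Employers offer jobs", False),
        ("Youth stay on the job", False),
    ]

    def first_break(chain):
        """Walk the chain and report where it first breaks down."""
        for step, achieved in chain:
            if achieved is False:
                return "Chain breaks at: " + step
            if achieved is None:
                return "Chain untestable from: " + step + " (no data)"
        return "All steps supported; the distal outcome is plausibly attributable."

    print(first_break(IMPLEMENTATION_THEORY))

Locating the first untested or failed link is exactly the kind of information a post hoc explanation cannot supply.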
But let us take a step back. Table 4.1 shows the expected steps in the implementation of the program. It is what might be called the implementation theory. ...

... Those associated with the program also expect them to help the youth deal with difficult life circumstances, such as an abusive parent or involvement in gang activities. Some program people also expect them to intercede for troubled youth with social workers, police, or probation officers or to help the teens secure services from health clinics or other service agencies.
Several theories of action might be operating. Some people, maybe the program administrators, think that the counselors are role models for the teens. Because of common ethnic backgrounds and life circumstances, the teens can identify with them, will take their words of advice seriously, and will follow a more positive social path. Another theory might be that the counselors understand the perils and pressures that the teens face and will give advice that is better suited to the real world of the inner city than would a middle-class teacher or counselor; they will know how to advise on family problems because of the commonality of their family backgrounds. Another theory might be that the counselors, understanding the local culture, can use threats and penalties effectively, something that white middle-class counselors would be loath to do. Yet another theory is that the counselors will be well acquainted with all the available services in the community and therefore can refer the youth to an appropriate source of help. All of these assumptions grow from the match of counselors to the ethnic and socioeconomic status of the teenagers.

A different set of assumptions would refer to the specific steps and actions that the counselors use in their relations with the teens, perhaps growing from the particular training that they received in the community college. They may have received training in the use of rewards for small steps that a youth takes in a positive direction, such as offering a movie pass for attending school five days in a row. Or they may have been trained to help with the development of peer support groups, where a group of
-------
youngsters help one another maintain good school attendance and proper completion of school work. One might also imagine that a counselor could be effective by tutoring young people in the subjects that give them the most trouble in school and help them overcome cognitive deficits. There are a plethora of theoretical bases on which one might expect the program to be successful in encouraging young people to remain in school and do good work.

If the evaluator is embarking on a theory-based evaluation, which theory does she hook the study to? Does she follow the counselor's encouragement of school attendance? His intervention into family disputes? His referrals to service agencies? His establishment of support groups? His coaching in math? Or what? One study can rarely collect data on all possible activities and their cascading consequences. It would be burdensome to follow each chain of possible events, and the evaluation would become complex and ponderous. Choices have to be made. The evaluator has to decide which of the several theories to track through the series of subsequent steps.
Overall, there are two major sources of theory: the social science literature and the beliefs of program stakeholders. The advantage of social science theories is that they are likely to be based on a body of evidence that has been systematically collected and analyzed. The main disadvantage is that available social science theory may not match the program under review, and even when it does, it may be at such a high degree of abstraction that it is difficult to operationalize in the immediate context. Nevertheless, when social science provides theory and concepts that ground and support local formulations, it can be of great evaluative value (Chen and Rossi, 1987). The evaluator should bring her knowledge of the social science literature to bear on the evaluation at hand.
A way to begin the task of choosing a theory to follow is to ask the program designers, administrators, and practitioners how they believe the program will work. They may have clear-cut ideas about the chain of actions and reactions that they believe will lead to better school achievement of the youth. But it is not unusual to find that different people in the program hold different assumptions about the steps by which inputs will translate into desired outcomes. What can the evaluator do?

First, she can convene a meeting of the stakeholders in the program, perhaps including the youth who are the program's clients, and ask them to discuss their assumptions about how the program will reach the desired results. They should discuss the ministeps of counselor action and youth response that will lead to success. Through such discussion, their originally hazy ideas may become clear, and they may reach consensus about what the program truly aims to do and how it aims to do it.

Program staff will often find a discussion of this type revealing and eminently practical. They will learn what their colleagues assume should be done (and what they are doing). Staff may all be performing the same functions but doing them with different assumptions about why they will be
successful. Or they may actually be doing different things. In discussion, they can find out whether they are working at cross-purposes or are on the same wavelength. If they are working in different directions, the program is apt to be fragmented and ineffective. Staff will often find the effort to reach consensus a stimulating and useful exercise. It may help the program attain coherence and direction.
Including Several Theories
In some instances, program staffs cannot reach consensus. They have markedly different theories about where they should put their time and what kind of actions they should take in order to engage problem youth in school. In such cases, it may be necessary to include several different theories in the evaluation design. The evaluation can follow the chains of assumption of several theories to see which of them is best supported by the data.
When a number of different assumptions are jostling for priority, a TBE is wise to include multiple theories. If only one theory is tracked, and that theory is wrong or incomplete, the evaluator may miss important chains of action. The final result may show that positive outcomes were achieved but not through the series of steps posited by the theory. The evaluator will be unable to explain how success was attained (see Brug, Steenhuis, van Assema, and De Vries, 1996; Puska, Nissinen, and Tuomilehto, 1985). Or if the program has disappointing results, and only one theory was tracked, the evaluator may face readers who say, "But that's not how we thought good results would come about anyway." When programs rest on fuzzy assumptions, it is often useful for TBE to represent a range of theoretical expectations.

But the more theories that are tracked, the more complex and expensive the evaluation. It is worthwhile to try to winnow down the number of possible theories to a manageable number. Three or four would seem to be the maximum that an evaluator could explore in a single study. How can the evaluator decide which of the several theories is worth including in the evaluation?
Criteria for Selecting Theories

The first criterion is the beliefs of the people associated with the program, primarily the designers and developers who planned the program, the administrators who manage it, and the practitioners who carry it out on a daily basis. Also important may be the beliefs of the sponsors whose money funds the program and the clients who receive the services of the program. What do these people assume are the pathways to good outcomes? What are the ministeps that have to be taken if the clients are to receive the benefits that the program promises? What the people who are deeply involved in the program believe is critical because their behavior largely determines how
-------
the program runs. When they hold divergent assumptions about the route to success, the several theories that they proffer become candidates for inclusion.
A second criterion is plausibility. Can the program actually do the things that a theory assumes, and will the clients be likely to respond in the expected fashion? The evaluator needs to see what is really going on. One way is to follow the money. Where is the budget being spent? Where is the program really putting its chips? Which resources are they providing for what kinds of assistance? If the program makes available to each counselor a list of accessible service agencies, their eligibility criteria, and hours of operation, then it is a reasonable bet that they think the referral route is important. If nobody gives the counselors any information about available resources, then this theory is probably not an active candidate for study. If program designers and administrators talk a good deal about ethnic match between counselor and client but end up hiring primarily white middle-class counselors, ethnic match is not an operative theory in this program. Similarly, if the counselors do not know enough about plane geometry or nineteenth-century American history to tutor youth, then assumptions about success through tutoring are not apt to be the route to follow (unless the counselors find other people to do the tutoring). The evaluator needs to take a hard look at the program in action, not just in its planning documents, in order to see which theories are at least plausible in this location.
A third criterion is lack of knowledge in the program field. For example, many programs seem to assume that providing information to program participants will lead to a change in their knowledge, and increased knowledge will lead to a positive change in behavior. This theory is the basis for a wide range of programs, including those that aim to reduce the use of drugs, prevent unwanted pregnancy, improve patients' adherence to medical regimens, and so forth. Program people assume that if you tell participants about the evil effects of illegal drugs, the difficult long-term consequences of unwed pregnancies, and the benefits of complying with physician orders, they will become more conscious of consequences, think more carefully before embarking on dangerous courses of action, and eventually behave in more socially acceptable ways.

The theory seems commonsensical, but social scientists (and many program people) know that it is too simplistic. Much research and evaluation has cast doubt on its universal applicability. Although some programs that convey knowledge in an effort to change behavior have had good results, many have been notoriously unsuccessful. In an effort to add to the stock of knowledge in the program arena, an evaluator may find it worthwhile to pursue this theory in the context of the particular program with which she is working. She may want to carefully track the conditions of the program in order to gather more information about when and where such a theory is supported or disconfirmed by the evidence (and what elements of context, internal organization, and reinforcement make a difference).
So much effort is expended in providing information in an attempt to change behavior (through public service campaigns, material posted to Web sites, distribution of printed materials, lectures and speeches, courses and discussion groups, promotional messages disseminated through multiple media) that careful investigation of this theory is warranted. Furthermore, so much uncertainty exists about the efficacy of providing information of different kinds to different audiences that program developers need a better sense of the prospects. The evaluator who pursues this theory in a TBE may look to social science theory for a sophisticated understanding of when and where information is likely to have effects and under what circumstances. She can build this knowledge into the evaluation. When the results of the evaluation are ready, she can offer program developers and staff a greater understanding of the extent to which information creates change within the immediate program context. Many studies have shown that information can lead to change in knowledge and attitudes but not often to change in behavior. The current evaluation can examine whether and where the sequence of steps in the theory breaks down and what forces undermine, or reinforce, the power of information.
A final criterion for choosing which theories to examine in a theory-based evaluation is the centrality of the theory to the program. Some theories are so essential to the operation of a program that no matter what else happens, the program's success hinges on the viability of this particular theory. Let us take the example of a comprehensive community program. The program involves the provision of funds (by government or a foundation) to a group of community residents, who then decide which enhancements the neighborhood needs in order to improve the lot of its inhabitants. The residents can choose to use the funds to add more services (mental health, education, and so on), clean up the streets and parks, rehabilitate buildings, hire private police, attract new business to the neighborhood in order to create jobs for local people, begin a car service for elderly residents, or whatever other services they decide are most likely to improve the local quality of life.
An evaluation can study the services chosen and find out the consequences of adding police or rehabilitating buildings or whatever other new services have been added. But a fundamental premise of this community-based approach is that local residents are knowledgeable, committed, hardworking, and altruistic enough to find out what is most needed and to go about getting those services into the community. Further, they are assumed to represent the needs and wants of a wide swath of the community. So an underlying theory has to do with the role of citizen groups in developing and directing a comprehensive community initiative. The effectiveness of a group of residents in representing the interests of their neighborhood and securing priority services is key to the success of the program. This assumption becomes a prime candidate for the evaluation.
-------
Which Links in a Theory to Study

Many theories, if drawn out in detail, consist of a long series of interlinked assumptions about how a program will achieve its effects. Let us go back to the job-training program in Table 4.1. If the evaluation does not have the resources or the time to study all of the steps laid out in the theory, which of them should the evaluation explore? Much of the answer to this question will depend on the practicalities of the situation. At what point is the evaluator brought to the scene? Is it after the first several steps have already been taken? How much money does the evaluation have to collect data? How difficult is it to get some kinds of data? For example, what kind of data will the evaluator need in order to know whether the training is carried out well? How will she find out whether the trainees adopt and internalize the values of regular employment? If some kinds of data are difficult or expensive to collect, that will set practical limits.
Second, program staff may have particular concerns about some segments of the implementation theory. They may want to know, for example, whether trainers are giving proper emphasis to good work habits and other "soft skills" or whether the youth in fact learn the occupational skills that the trainers seek to convey. They may want to know whether staff refer them to relevant jobs and whether the youth comport themselves appropriately in job interviews, so that it is clear why they do or do not get jobs.
It may be even more important to examine some links in the program theory about the psychosocial processes that underlie the program. Here is where much of the uncertainty in social programming lies. What impels developing countries to seek to attract more girls into the school system? What gets faculty members in urban universities to teach in interdisciplinary courses in order to retain students in school? In our example, what are the reasons that trainees persist in the training course and learn both job skills and work readiness skills? Is it the capacity of the trainers to develop supportive communities among the youth? Is it the strength of external rewards and punishments?

An evaluation can concentrate on understanding these kinds of mechanisms and the extent to which they operate within the program milieu. The evaluator can collect data on whether peer groups develop during the course of training and the messages and supports that these groups provide to their members. Do youth affiliate in subgroups? Do members of the various groups support the aims of the training program? (Or do they denigrate the effort to learn skills that will yield "chump change"?) Do the trainers actively encourage the formation of subgroups and provide leadership? What messages circulate in the different subgroups about the value of work and the willingness to accept authority on the job? Regarding the theory about external threats, how important to participants in the training program is the reduction in safety net supports?
Because evaluations to date have told their readers relatively little about the why of program success and failure, such inquiries may have great resonance. Studies that explore the psychosocial processes of program theory
will have much to tell program designers, lessons (however tentative) that may be suggestive for a whole range of programs.
Criteria for Selecting the Links to Study
The criteria for choosing which links to study are similar to the criteria for choosing which theories to study. Two are probably most important. The first criterion is the link or links that are most critical to the success of the program. It seems wise to invest resources in studying the particular assumption on which the program most basically rests. If the program is predicated on the assumption that what keeps youth enrolled in the full training program is the support of their peers, then that assumption warrants investigation.
The second criterion is the degree of uncertainty about the linkage. If nobody knows whether the assumption is likely to be supported empirically, or if prior studies have produced conflicting findings on the subject, that link may be worthy of systematic study. Some linkages are unsettled in the social science and the evaluation literatures. Some linkages seem to be supported in the social science literature (or in common sense), but evaluations of earlier programs show that they do not work in practice. An example would be the premise of case management within a multiservice program. A large number of multiservice programs have employed case managers who analyze the services that a family needs, locate and coordinate a range of services, and help the family members obtain appropriate services from relevant agencies. The idea of a family coordinator, an advocate and consultant to the family, sounds so utterly sensible that it is unsettling to find that evaluations have usually not found such programs successful (for example, Bickman and others, 1995; St. Pierre, Layzer, and Goodson, 1997). What are the assumptions that underlie case management? What is the case manager assumed to do, with what immediate consequences, leading to what next steps, with what later consequences? Including some of these kinds of links in the evaluation would yield important information.
Conclusion
In selecting the theory or theories to use as scaffolding for a TBE, the evaluator should consider these criteria:

• The assumptions of the people associated with the program. What are their constructions of the interlinked steps by which program inputs are transmuted into program outcomes?
• The plausibility of the assumptions, given the manner in which the program is allocating its time and resources.
• Uncertainty about the applicability of current assumptions, given the often meager knowledge in the program field.
-------
• The centrality of the assumptions to the program. If the program is based directly on a particular theory, it would be sensible to make this theory the centerpiece of the TBE.

Once the evaluator decides which theory or theories to use for structuring the evaluation, she ought to spell out all the links in the theory chain: what the program will do, how participants will respond, what the program does next, and so on. Many evaluations will not realistically be able to follow all the links in each chain, and the evaluator needs to choose the links on which to focus. Considerations for making that choice include the practicalities of access, resources, and methodological capability for studying given links and the particular knowledge needs of program staff, who want to know which elements of the program they need to modify or shore up.

In making both choices, which theories to select and which links to study, the evaluator needs to consider the underlying mechanisms on which the program rests, what I have called the program theory, in contradistinction to the implementation theory.

References

Brug, J., Steenhuis, I., van Assema, P., and De Vries, H. "The Impact of a Computer-Tailored Nutrition Intervention." Preventive Medicine, 1996, 25, 236-242.
Chen, H. T., and Rossi, P. H. "The Theory-Driven Approach to Validity." Evaluation and Program Planning, 1987, 10, 95-103.
Puska, P., Nissinen, A., and Tuomilehto, J. "The Community-Based Strategy to Prevent Coronary Heart Disease: Conclusions from the Ten Years of the North Karelia Project." Annual Review of Public Health, 1985, 6, 147-193.
St. Pierre, R. G., Layzer, J. I., Goodson, B. D., and Bernstein, L. S. National Impact Evaluation of the Comprehensive Child Development Program. Cambridge, Mass.: Abt Associates, 1997.
Weiss, C. H. "How Can Theory-Based Evaluation Make Greater Headway?" Evaluation Review, 1997, 21, 501-524.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
CAROL H. WEISS is professor of education at the Harvard Graduate School of Education.
-------
Evaluations can be based on normative models (how the program is supposed to work) or descriptive models (how the program actually works). The issues raised in this chapter can be applied to either normative or descriptive models.
Note: This chapter has benefited substantially from the helpful comments, questions, and suggestions from members of the Harvard Evaluation Task Force, participants at the 1998 American Evaluation Association meeting, the editors and series editors of this volume, and as always my graduate evaluation students. Initial work on this chapter was supported by a fellowship from the Spencer Foundation.
-------
What Do the Boxes and Arrows Represent?
Program theory usually involves a diagram of boxes linked by arrows representing cause-and-effect relationships. It is perhaps tempting to consider these causal models to be like wiring diagrams, in which, if we flick a switch at the first box in the diagram, it will cause the lights in the other boxes to illuminate. And indeed, sometimes the descriptions of these models, using a series of if-then statements, suggest this imagery (Owen, with Rogers, 1999; Plantz, Greenway, and Hendricks, 1997).

Evaluators who are familiar with social science principles will not be surprised that few program theory models are based on simple causal relationships like this, even if diagrams do not explicitly show it. However, some program theory models do explicitly attempt to show the processes that are "necessary and sufficient" to produce the desired results; for example, Cooley's causal model (1997) of a program designed to increase girls' participation in high school in developing countries.
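In data terms, such a boxes-and-arrows model is simply a directed graph. A minimal Python sketch (the node names are invented for illustration and only loosely echo the girls' schooling example; they are not Cooley's actual model):

    # A boxes-and-arrows program theory as a directed graph: each key is a
    # box, and its value lists the boxes its arrows point to.
    MODEL = {
        "recruit girls": ["girls enroll"],
        "train teachers": ["teaching improves"],
        "girls enroll": ["girls attend"],
        "teaching improves": ["girls attend"],
        "girls attend": ["girls complete high school"],
    }

    def paths(model, start, goal, trail=None):
        """Enumerate every causal path from start to goal."""
        trail = (trail or []) + [start]
        if start == goal:
            return [trail]
        found = []
        for nxt in model.get(start, []):
            found.extend(paths(model, nxt, goal, trail))
        return found

    for activity in ("recruit girls", "train teachers"):
        for p in paths(MODEL, activity, "girls complete high school"):
            print(" -> ".join(p))

Each enumerated path is one testable sequence of if-then claims; the "necessary and sufficient" question is whether all paths, or only some, must hold for the ultimate outcome.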
More commonly, program theory models are based on a recognition that other factors may influence the achievement of intermediate and ultimate outcomes. For example, the United Way's generic causal model does not explicitly include other factors, apart from a list of constraints on the program. However, in the instructions provided with this model, it is made clear that the further away from actual program outputs one moves, the weaker the program's influence becomes, and the likelihood of outside forces having an influence increases (Plantz, Greenway, and Hendricks, 1997). They go on to give an example of a program providing prenatal counseling for pregnant teens, pointing out that the program can influence what pregnant teens know about appropriate prenatal practices but cannot influence what the teens' overall health was when they became pregnant. Nor can the programs affect whether teens were using drugs when they became pregnant. The authors recognize that each of these issues, general health and involvement with drugs, can have as much long-term influence on the later health of babies as the program itself.
It is interesting to note that this analysis focuses only on fixed characteristics or events that happen before the client begins in the program, sometimes referred to as moderators. Outcomes can also be influenced by factors that occur at the same time as the program and either help or hinder its work. How can we represent these other factors in our causal models? Funnell's program theory matrix (Chapter Nine) includes other factors explicitly in text associated with each outcome. It is also possible to show them on the program theory diagram, as Halpern (1998, 1999) has done, as seen in Figure 5.1.

We might even make a dramatic move completely away from our program-centric causal model and show the web of client relationships that influence client outcomes, including the influence of family, friends, schools, shops, economy, neighborhood, media, legal system, work, and political system (Mullen, 1995).
Figure 5.1. Representing Other Factors in a Logic Model for Reducing Alcohol-Related Motor Vehicle Accident Injuries and Deaths

[Figure: health promotion and prevention services decrease drinking and driving in the general population; addiction treatment services eliminate drinking and driving in alcohol-dependent clients; emergency medical services minimize morbidity and mortality from MVAs. These service outcomes feed the regional health outcome of reducing alcohol-related MVA fatalities and injuries and a provincial goal of reducing premature death. Other factors, such as graduated licensing for novice drivers and alternative social and cultural incentives for drinking and driving, are shown alongside.]
-------
Multiple Strands in Causal Models

Many program theory models portray the program as a single chain of intermediate and ultimate outcomes, where A leads to B and then to C. But it may be helpful to show multiple strands, where A and C both lead to B, either in combination or as alternatives. Ideally, we would be able to distinguish between complementary causal paths and alternative causal paths in a diagram, perhaps by using line arrows for the complementary paths and block arrows for the alternative paths.

If a combination of two causal paths is necessary to achieve the intended results, it is important to make this explicit in order to avoid maximizing only one of them. In many programs, staff must balance competing imperatives like this. When I worked with maternal and child health nurses to develop a causal model of their program to guide the development of performance indicators, they were particularly pleased that they could make visible the balancing they needed to maintain between providing information to parents and supporting parents' confidence in their own abilities. Part of their program model, which used an adaptation of Bennett's hierarchy (Bennett and Rockwell, 1999) to describe their work on infant feeding, showed this clearly, as seen in Figure 5.2.
It was important for the staff to make visible to program managers the competing demands on them and to make sure that performance measures reflected both of these in order to ensure that there were not structural pressures to maximize either information giving or parental support at the expense of the other. When programs are managed by managers without detailed knowledge of program processes or are managed through contractual arrangements, it becomes more important to make explicit competing imperatives such as these. If performance measures only include one of the competing imperatives, then a program may seem to be performing well in terms of its intermediate outcomes because one of these is being maximized at the expense of the other.

Figure 5.2. A Partial Program Model Showing Competing Demands

[Figure: a partial hierarchy running from KNOWLEDGE and BELIEFS through BEHAVIOR CHANGE to the ULTIMATE outcome of healthy nutrition and growth of babies.]
Multiple strands in a causal model may instead represent alternative causal paths. For example, Weiss (1998) outlines four possible mechanisms by which higher teacher pay may be linked to increased student achievement. When these are seen as competing explanations for observed outcomes, then a program theory evaluation might focus on testing which of these best explains the evidence (as Weiss discusses in Chapter Four).

It is also possible to see these alternative causal paths as operating only in certain conditions. Drawing an analogy with gunpowder, which will only fire in favorable conditions, Pawson and Tilley (1997) have suggested that program causal mechanisms only fire within favorable contexts. An evaluation based on this type of causal model will try to understand the circumstances under which particular mechanisms operate. In their re-analysis of a crime prevention program in public housing estates, Pawson and Tilley (1997) demonstrated the importance of understanding the context of different sites, including interactions among various mechanisms (such as improved housing and increased tenant involvement in estate management) and among other coexisting processes.
It is difficult to represent these more complex relationships in a two-dimensional diagram. Pawson and Tilley (1997) instead represent their causal model in a matrix of context-mechanism-outcome configurations, which describes in text the causal mechanism that produces the outcome and the context in which the mechanism is operative.

The characteristics of program clients (their motivations, attitudes, previous knowledge, and skills) are an important part of the context within which causal mechanisms work or fail to work. An iterative series of data collection and analysis activities can be used to identify important ways in which clients vary and the implications of these for program effectiveness (McDonald and Rogers, 1999).
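Held as data, a context-mechanism-outcome matrix is simply a list of configurations. A minimal Python sketch (the two configurations are invented stand-ins loosely based on the housing-estate example, not Pawson and Tilley's actual findings):

    # Context-mechanism-outcome (CMO) configurations as plain records.
    CMO_CONFIGURATIONS = [
        {"context": "stable estate with strong tenant networks",
         "mechanism": "tenant involvement increases informal surveillance",
         "outcome": "burglary falls"},
        {"context": "high-turnover estate with weak networks",
         "mechanism": "tenant involvement fails to take hold",
         "outcome": "little change in burglary"},
    ]

    def expected(context):
        """Return the mechanism-outcome pairs the theory predicts here."""
        return [(c["mechanism"], c["outcome"])
                for c in CMO_CONFIGURATIONS if c["context"] == context]

    print(expected("stable estate with strong tenant networks"))

The evaluation's job is then to check, site by site, whether the observed outcome matches the configuration whose context applies there.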
To fully understand the contexts within which causal mechanisms operate, we may need to develop program models that do more than include program clients simply as passive recipients of treatments that change their lives. If the treatment involves swallowing a pill, we might expect certain physiological effects, regardless of the active involvement of the patient, but even in this example, we know that the patient's expectations about the treatment can influence its reported impacts. It is even less realistic to describe program clients as passive recipients when the program is endeavoring to bring about permanent change in, for example, students' school behavior or the communi-
-------
cation strategies of the hearing impaired, changes that require program clients to learn, apply, and maintain new ways of operating.

Pawson and Tilley (1997) have argued that we need "to shake off those conceptual habits which allow us to speak of a program 'producing outcomes' and to replace them with an imagery which sees the program offering chances which may (or may not) be triggered into action via the subject's capacity to make choices. Potential subjects will consider a program (or not), volunteer for it (or not), cooperate closely (or not), stay the course (or not), learn lessons (or not), retain the lessons (or not), apply the lessons (or not)" (p. 38).

This issue does not need to be associated with a philosophical commitment to serving program clients and having their needs and perspectives at the forefront of program planning and evaluation nor with a belief that the personal dignity of clients and staff requires treating them as program partners rather than as passive objects. In fact, the same distinction holds true for programs such as burglary prevention, which intend to change the behavior of potential burglars. Programs can be understood as changing the options available to participants and their capacities to choose and enact these choices. Usually programs seek to increase options and capacities; some, such as burglary prevention, seek to reduce them.
Causal Models from Systems Theory
Systems theory suggests other types of causal models. In this section, I discuss three of these that appear to be potentially useful for program evaluation: virtuous or vicious circles, symptomatic solutions, and feedback delays.

Virtuous or Vicious Circles. Systems thinking suggests that cause and effect might not be connected in a linear way but in a circular way, in the form of virtuous circles (where an initial effect leads to its own reinforcement and magnification) or vicious circles. Such circular causation helps explain why the effects of programs are likely to decay over time or to become stronger.
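The arithmetic of a circular connection is easy to see. A minimal sketch (the gain values are arbitrary): each period's effect feeds back into the next, so the same loop structure yields either magnification or decay.

    # A circular causal connection: each period's effect feeds back on
    # itself. A gain above 1 magnifies the effect; below 1, it decays.
    def run_loop(initial_effect, gain, periods):
        effect, history = initial_effect, []
        for _ in range(periods):
            history.append(round(effect, 2))
            effect *= gain  # the feedback step
        return history

    print("reinforcing:", run_loop(1.0, 1.10, 6))
    print("decaying:   ", run_loop(1.0, 0.80, 6))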
Symptomatic Solutions. Symptomatic solutions are solutions that relieve the symptoms but that actually make it harder to solve the problem. It would be like having the flu and taking tablets to relieve the symptoms and then continuing to work excessively, rather than convalescing, thereby making it harder to actually recover.

This problem has implications both for evaluation and for monitoring. For an evaluation, where we are trying to understand how effective a particular program has been in solving a problem, we should design our evaluation so that it can distinguish between temporary reduction of symptoms and sustainable solving of the problem. For monitoring, where we are seeking to simultaneously understand and influence program implementation, we should set up systems that do not encourage people to develop dysfunctional symptomatic solutions.
Owen and Lambert (1995) addressed this issue in their evaluation of a program in which all grade-five students used their individual notebook computers in all subjects. One of the unintended consequences of this program was an increase in teacher stress, as teachers struggled to develop their computer skills and simultaneously adapt their teaching material and processes. Initially, teachers responded to this increased stress by "getting on with the job," avoiding spending time in coordination meetings or liaising with other teachers, and the administration sought to support teachers by leaving them alone and not making additional calls on their time. If the evaluation had measured teacher stress at this point only, it would have found that in the short term teacher stress was reduced through this coping mechanism. But over time, this symptomatic solution led to hoarding of equipment, rivalry among groups, and poor attendance at information sessions, consequences that made it harder to implement the fundamental solution, which involved better sharing and coordination of resources and increased support and training for teachers.
Feedback Delays. We have probably all experienced the effects of feedback delay when trying to adjust the water temperature in a shower. If there is a delay in response, we tend to overcorrect: first too hot, then too cold, until eventually reaching the desired equilibrium. The Massachusetts Institute of Technology "beer game" simulation (Senge, 1990) has demonstrated the effects of feedback delay on a simple program: a system for producing and distributing a single brand of beer. Once there is a delay built into the system, so that the decision makers do not immediately see the impact of the changes they are making, the orders become more and more excessive and unbalanced.

The reason for using performance measures and indicators as part of ongoing evaluation in the public sector is that they can be used by managers to take corrective action in programs, much like someone monitoring and
-------
adjusting the shower temperature. Unfortunately, few if any of these systems address the problem of feedback delay. In fact, I have been unable to find a single example that does.
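The overcorrection that delay produces takes only a few lines to simulate. A minimal sketch of the shower example (the target, gain, and delay values are arbitrary): the adjuster reacts to a reading that is several periods old, so the temperature overshoots, first too hot, then too cold.

    from collections import deque

    # Adjusting toward a target using delayed feedback: each correction is
    # based on the reading from DELAY periods ago, so it overshoots.
    TARGET, DELAY, GAIN, PERIODS = 38.0, 3, 0.8, 12

    temp = 20.0
    pipeline = deque([temp] * DELAY)   # readings still "in the pipe"
    for period in range(PERIODS):
        observed = pipeline.popleft()  # stale reading, DELAY periods old
        temp += GAIN * (TARGET - observed)
        pipeline.append(temp)
        print("period", period, "actual temperature", round(temp, 1))

With the delay removed, the same correction rule settles quickly on the target; with the delay in place, the corrections grow, which is the beer-game result in miniature.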
How Complex Should Program Theory Models Be?
Although this chapter has focused on more complex models and causal relationships, it is worth remembering that simple models can often be helpful, particularly in situations in which there have previously been few explicit conceptual and empirical connections made between program activities and outcomes. Building a plausible model of how the program is meant to work helps managers identify the most important processes or intermediate outcomes and focus their measurement and attention there. Given that many program evaluations still collect little data about program implementation or intermediate outcomes, there is often considerable value (as Lipsey, 1993, and Petrosino, Chapter Six, point out) in using even a simple two-step program model that simply identifies and measures one mediating variable that is understood to be necessary for the achievement of the ultimate outcomes. And having a common model of how the program is meant to work can help program staff work together and focus on those activities that are most important for program success.
In fact, as Weick (1995) argues, a model might provide a useful heuristic for purposeful action without necessarily being correct. He recounts the story of the reconnaissance unit, lost in the snow in the Swiss Alps for three days in a blizzard, who eventually managed to find their way safely back to camp with the help of a map, a map, they later discovered, of the Pyrenees, not of the Alps. "This incident raises the intriguing possibility that when you are lost, any old map will do. ... Once people begin to act, they generate tangible outcomes in some context, and this helps them discover what is occurring, what needs to be explained, and what should be done next" (pp. 54-55). Weick goes on to quote Sutcliffe: "Having an accurate environmental map may be less important than having some map that brings order to the world and prompts action" (pp. 56-57).
' V Tins analysis may well explain the positive lesponscs that program stall
/ohm have to progiam theory evaluation (see. for example, Huebner, Uiap-
/ ter fight) even when th.s is based on ve.y simple causal models Bui simple
causal models can be dysfunctional in high-stakes evaluations that are linked
tc, n h nn
()tiKniiirs'~^iiliivi sn/ rli\ui /Avwu
in.ipl nun .ui/A I linn)
References

Chen, H. T. Theory-Driven Evaluations. Thousand Oaks, Calif.: Sage, 1990.
Cooley, L. Presentation at the Washington Evaluators' Conference, 1997.
Funnell, S. "Program Logic: An Adaptable Tool." Evaluation News and Comment, 1997, 6(1), 5-17.
Halpern, G. "Evaluating Innovative Programs in Public Institutions." The Innovation Journal: The Public Sector Innovation Journal, Nov. 1998, revised Nov. 1999. (Available at http://www.innovation.cc)
Lipsey, M. W. "Theory as Method: Small Theories of Treatments." In L. Sechrest and A. Scott (eds.), Understanding Causes and Generalizing About Them. New Directions for Program Evaluation, no. 57. San Francisco: Jossey-Bass, 1993.
Lipsey, M. W., and Pollard, J. "Driving Toward Theory in Program Evaluation: More Models to Choose From." Evaluation and Program Planning, 1989, 12, 317-328.
McClintock, C. "Administrators as Applied Theorists." In L. Bickman (ed.), Advances in Program Theory. New Directions for Program Evaluation, no. 47. San Francisco: Jossey-Bass, 1990.
McDonald, B., and Rogers, P. "Market Segmentation as an Analogy for Influencing Program Theory: An Example from the Dairy Industry." Paper presented at the annual meeting of the American Evaluation Association, Orlando, Fla., 1999.
Owen, J. M., with Rogers, P. J. Program Evaluation: Forms and Approaches. Thousand Oaks, Calif.: Sage, 1999.
Owen, J. M., and Lambert, F. C. "Roles for Evaluation in Learning Organizations." Evaluation, 1995, 1(2), 237-250.
Pawson, R., and Tilley, N. Realistic Evaluation. Thousand Oaks, Calif.: Sage, 1997.
Plantz, M. C., Greenway, M. T., and Hendricks, M. "Outcome Measurement: Showing Results in the Nonprofit Sector." In K. E. Newcomer (ed.), Using Performance Measurement to Improve Public and Nonprofit Programs. New Directions for Evaluation, no. 75. San Francisco: Jossey-Bass, 1997.
Senge, P. M. The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday, 1990.
Weick, K. E. Sensemaking in Organizations. Thousand Oaks, Calif.: Sage, 1995.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
-------
The Pilot Process Step by Step Outline
1. Step One: Create the Team and 'Team'
2. Step Two: Begin with the END IN MIND and PLAN YOUR WORK
3. Products
4. Product Contents
5. Tasks
6. For Each Task
7. Step Three: Plan and Conduct Initial Meeting with Agency Top Management
8. Step Four: Gather Documents
9. Step Five: Analyze Documents; Synthesize Information
10. Step Six: Develop Map of all Previous Work
11. Step Seven: Develop Draft Logic Models
12. Step Eight: Share Models with Agency
13. Step Nine: Change Models; Write Narrative
14. Step Ten: Assess Logic Models
15. Step Eleven: Identify Key Evaluation Questions
16. Step Twelve: Create Potential List of Jobs
17. Step Thirteen: Prioritize the List and Recommend Future Work
18. Step Fourteen: Assemble Final Products
19. Develop Pieces and Parts Along the Way
-------
The Pilot Process
Step by Step
Step Two: Begin with the END IN MIND and PLAN YOUR WORK
• Identify the Products
• List the Tasks
• Create the Time Line
• Assign Responsibilities
• Identify Needed Resources
• Specify a Regular Process for Communicating (phone, e-mail, meetings)

Step One: Create the Team
• Philosophy of Working in a Team
• Mechanics of a Team
• Unique Aspects of the Virtual Team
• Communication
• Roles and Responsibilities

PRODUCTS
• Written Report
• Supporting Documentation
• Oral Briefing & Materials
• AND, CONTAINED IN THESE PRODUCTS...
-------
PRODUCT CONTENTS
• Logic Models
• Corresponding Narratives
• Observations
• Key Evaluation Questions
• Prioritization of Potential Jobs
• 'Lessons learned' about the Process
As a part of developing the logic models, begin to 'overlay', on the models, the following:
• GPRA annual performance measures
• Any metrics the agency uses to measure any part of the program (inputs through outcomes)
TASKS
• Plan & conduct initial meeting with
Agency Top Management
• Gather and analyze documents
• Synthesize information; develop tables
• Develop draft 'Logic Models' (theory of program)
• Share draft models with Agency
TASKS CONTINUED...
• Change Logic Models per additional
information from Agency and other
sources
• Write corresponding narrative for models
• Assess the Logic Models
• Identify Evaluation Questions
• Create and prioritize list of Potential Jobs
• Complete Report, Briefing, and 'Lessons
Learned'
-------
For Each Task...
• determine if it will be an effort that involves individual team members, the entire team, the Agency, the other pilot team, the consultant, others?
• identify (gather/request) resources
• set a deadline
• assign to a team member for follow-up
• keep notes, documents, references, a work log
Step Four: Gather Documents
• GPRA documents & other information
• Info on metrics (gathered?)
• Published reports - internal
• Published reports - external
• Popular reports; articles...
• Past evaluations
Step Three: Plan and Conduct Initial
Meeting with Agency Top
Management
Past Evaluations, Audits, Investigations
• When synthesizing the information collected, these documents will be very valuable as 'pieces of the puzzle' which may indicate that there is little need for the same info again!
• Or can indicate a different kind of info is needed
-------
Step Five: Analyze Documents; Synthesize Information
• Read
• Sort
• Begin to develop tables for each & every 'component' of the 'program'
Step Seven: Develop Draft Logic Models
• From tables, create models which indicate sequencing, relationships, dependency
• Use 'Z' maps to indicate layered and program-to-program relationships
• Models include: overall program, clusters, individual programs
(illustrative sketch follows)
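A hypothetical Python sketch of this step: each row of the analysis tables names one program element and the element it is meant to lead to, and the draft model is just the collection of those links (the component names are invented placeholders, not from an actual pilot):

    # Turning analysis tables into a draft logic model: each table row
    # links a program element to the element it is meant to produce.
    table_rows = [
        ("inputs: inspection staff", "activities: facility inspections"),
        ("activities: facility inspections", "outputs: violations identified"),
        ("outputs: violations identified", "outcomes: improved compliance"),
    ]

    draft_model = {}
    for source, result in table_rows:
        draft_model.setdefault(source, []).append(result)

    for source, results in draft_model.items():
        for result in results:
            print(source, "-->", result)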
Step Six: At the same time, begin to develop a large 'map'...
• ...of the complete program
• Include previous audits, investigations, evaluations internal and external to the program
Step Eight: Share Models with Agency
• Group meetings
• Individual meetings
• Ask: "Is this representative?" "What are we missing?" "Where can we find that information?" "What other documents or persons would you suggest we consult?"
• Purpose is to gather more information for your models, not to get agency approval
-------
Step Nine: Change Models; Write Narrative
• Models are revised or sometimes scrapped and begun again :(
• Narrative and footnotes are important; models do not 'stand alone'
• Finished products are 'slick'

Step Ten: Assess the Logic Models (You will have been doing this step all along!)
• GPRA (Do the programs appear to logically lead to the goals? Flow of goals, objectives, subobjectives...)
• Metrics (Are there metrics? Are they 'good'? Can they be aggregated?)

Assess the Logic Models... (continued)
• Logic (In THEORY: Do the resources support the activities? Can the activities produce the outputs? Do the customers receive the outputs? etc. Does everything make sense?)
• Information Gaps (Where are the gaps in information? Why?)

Assess the Logic Models... (continued)
• Completeness (Is the program logic complete?)
• Externalities (Are there unaccounted-for externalities?)
• Mission (Does the program fit within the mission? goals? objectives? subobjectives?)
(illustrative sketch follows)
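Taken together, the assessment questions above amount to a checklist that can be run model by model. A hypothetical Python sketch (the criteria paraphrase the slides; the findings shown are invented):

    # Logic model assessment as a checklist. A False or None answer flags
    # a candidate evaluation question for Step Eleven.
    ASSESSMENT = {
        "GPRA: programs logically lead to goals": True,
        "Metrics: exist, are good, can be aggregated": False,
        "Logic: resources support activities; activities produce outputs": True,
        "Information gaps: located and explained": None,  # still unknown
        "Completeness: program logic is complete": True,
        "Externalities: accounted for": False,
        "Mission: fits mission, goals, objectives, subobjectives": True,
    }

    print("Candidate evaluation questions arise from:")
    for criterion, finding in ASSESSMENT.items():
        if finding is not True:
            print(" -", criterion)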
-------
Step Eleven: Using Logic Model Assessment, Identify Evaluation Questions
• Is there a need for the program? (calls for a NEEDS ASSESSMENT)
• Is the program logically designed? (Are all parts of the logic model present?) (calls for further DESIGN EVALUATION)
• Can the program be implemented as designed? (calls for IMPLEMENTATION EVALUATION)

Using Logic Model Assessment, Identify Evaluation Questions (continued)
• If implemented as designed, can/will the outcomes be attained? What about unintended outcomes? (calls for OUTCOME EVALUATION)
• Just how much is this program contributing to the attainment of desired outcomes? (calls for an IMPACT EVALUATION)

Using Logic Model Assessment, Identify Evaluation Questions (continued)
• Is this program worth the money? (calls for a COST-BENEFIT ANALYSIS)
• Other Questions?

Step Twelve: Create a Potential List of Jobs
• Group questions by type
• Translate evaluation questions into jobs
• Estimate amount of time, FTEs, and resources for each 'job'
(illustrative sketch follows)
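A hypothetical Python sketch of translating grouped questions into candidate jobs (the question types follow Step Eleven; every estimate is a placeholder, not a real workload figure):

    # Translate evaluation questions into candidate jobs with rough
    # estimates of time and FTEs.
    QUESTIONS = [
        ("NEEDS ASSESSMENT", "Is there a need for the program?", 4, 1),
        ("IMPLEMENTATION EVALUATION",
         "Can the program be implemented as designed?", 6, 2),
        ("OUTCOME EVALUATION", "Will the outcomes be attained?", 9, 2),
    ]

    jobs = [{"job": qtype + ": " + question, "months": months, "ftes": ftes}
            for qtype, question, months, ftes in QUESTIONS]

    for job in jobs:
        print(f"{job['job']} ({job['months']} months, {job['ftes']} FTEs)")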
-------
Step Thirteen: Prioritize the List and Recommend Future Work
• Environmental risk and risk reduction
• Level of EPA investment
• Importance of knowledge gaps
• Level of stakeholder interest
• Importance for restoring or preserving EPA's credibility
(illustrative sketch follows)
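One hypothetical way to apply these criteria is a weighted score for each candidate job. A Python sketch (the weights and the 1-to-5 ratings are invented for illustration; the real prioritization would be a team judgment):

    # Weighted scoring against the Step Thirteen criteria (ratings 1-5).
    WEIGHTS = {
        "risk reduction": 0.30, "EPA investment": 0.20,
        "knowledge gaps": 0.20, "stakeholder interest": 0.15,
        "EPA credibility": 0.15,
    }

    CANDIDATES = {
        "outcome evaluation of permit program":
            {"risk reduction": 5, "EPA investment": 4, "knowledge gaps": 3,
             "stakeholder interest": 4, "EPA credibility": 3},
        "needs assessment for outreach effort":
            {"risk reduction": 2, "EPA investment": 2, "knowledge gaps": 5,
             "stakeholder interest": 3, "EPA credibility": 4},
    }

    def score(ratings):
        return sum(WEIGHTS[c] * r for c, r in ratings.items())

    for job in sorted(CANDIDATES, key=lambda j: score(CANDIDATES[j]),
                      reverse=True):
        print(f"{score(CANDIDATES[job]):.2f}  {job}")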
...DEVELOP 'PIECES AND PARTS' OF PRODUCTS ALONG THE WAY
• Logic Models
• Corresponding Narratives
• Observations
• Key Evaluation Questions
• Prioritization of Potential Jobs
• 'Lessons learned' about the Process
Step Fourteen: Assemble Final Products
• Written Report
• Supporting Documentation
• Oral Briefing & Materials
• BUT, TRY TO...
From this...
To this...
------- |