EPA OIG PROGRAM EVALUATION PILOT TEAM TRAINING
January 16-18, 2001
808 17th St NW
Suite 400
National Center for Environmental Assessment
Washington, DC 20006
AGENDA
Tuesday, January 16, 1:00 PM - 5:00 PM
1:00 - 1:30 PM   Welcome, Introductions, Agenda   Rick
1:30 - 2:30 PM   "What's Up With This?"   Rick
2:30 - 2:45 PM   Break
2:45 - 4:45 PM   The Basics of Program Evaluation   Emmalou
4:45 - 5:00 PM   Wrap-up   Emmalou
Wednesday, January 17, 8:30 AM - 5:00 PM
8:30 - 10:15 AM   Logic Modeling   Emmalou
10:15 - 10:30 AM   Break
10:30 - 11:45 AM   The Pilot Process   Emmalou w/Connie, Dale, Art, Rick
11:45 AM - 1:00 PM   Lunch
1:00 - 2:30 PM   IT   Ernie, Stephanie, Yvonne
2:30 - 2:45 PM   Break
2:45 - 5:00 PM   The Pilot Process   Emmalou w/Connie, Dale, Art, Rick
7:00 PM   Dinner
-------
Thursday, January 18, 8:30 AM - 1:00 PM
8:30 - 9:30 AM   The Pilot Process   Emmalou, Rick, Consultants
9:30 AM - 12:00 Noon   Team Meetings w/Facilitation   Emmalou, Rick, Consultants
12:00 - 1:00 PM   Wrap-up   Rick, Emmalou
1:00 PM   Adjourn
-------
The Basics of Program Evaluation
Emmalou Norland
What do you plan to gain...
• professionally from this training?
• professionally from participating in the pilot?
What We'll Do This Afternoon
• Share a little about yourselves and
your expectations
• Define program evaluation more fully
• Identify the links between program
development and evaluation
• Learn about the 'profession' of program
evaluation
• Distinguish program evaluation from
other similar processes
What do you have to contribute?
• Team skills
• Program knowledge
• Evaluation knowledge
• Special interests
• Special prior experiences
• Other important
contributions
-------
What is Program Evaluation?
• What is your definition?
• What is a definition from a reference book?
• So... What does it involve?
• and... Why do we do it?
• and... For whom is it done?
• and... What is the target of program evaluation?
And... Why do we evaluate?
• To improve the program - FORMATIVE EVALUATION is done to help form or reform a program
• To prove the program - SUMMATIVE EVALUATION is done to sum up the program's accomplishments
• BOTH are done for decision-making: Formative - Change? Summative - Keep or Kill?
So... What DOES program evaluation involve?
• collaborating
• questioning
• planning
• information-gathering
• information-analysis
• communicating
• interpreting*
• judging*
• decision-making*
And... Why Else?
• Postponement
• Ducking Responsibility
• Window Dressing
• Public Relations
• Requirement
-------
Formative Evaluations
• Needs Assessments
• Design Evaluations and Evaluability Assessments
• Implementation Evaluations (Performance Audits)
• Some Outcome Evaluations
And... For whom is program evaluation done?
• Stakeholders of the program who have information gaps about the activities, characteristics, and outcomes of it (they need information to make decisions)
• For EPA OIG: Congress, the Agency, the Public, the Regulated Community, other Agencies, States,...
Summative Evaluations
• Most Outcome Evaluations
• Impact Evaluations
• Cost-benefit and Cost-effectiveness Analyses
• Meta-analysis
And... What is the target of program evaluation?
• a program (as opposed to policies, personnel, or other 'objects')
• "an organized set of activities that are managed toward a particular set of goals for which the program can be held separately accountable" (Kirchner, 2000)
• "a general effort that marshals staff and projects toward some (often poorly) defined and funded goals" (Scriven, 1991)
-------
What are the levels of 'programs' we can evaluate in EPA?
• Objective
• Sub-objective
• Project
• Other:
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Should we have a program? - Before developed: Needs Assessment
• Is the program designed to work? - Before conducted: Design Evaluation
• Is the program implemented as designed? - During: Implementation Evaluation
• What outcomes are being achieved? - During: Outcome Evaluation
What actually comprises a program?
• Inputs - resources used to accomplish certain activities
• Activities - the activities that produce products or services for customers
• Outputs - products or services for customers
• Outcomes - customer changes, environmental changes, environment/human health changes
• Externalities - contextual influences
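The five components above form a simple data structure. As a purely illustrative sketch (not part of the original training materials; the class name, field names, and example values are all assumptions), a program logic model could be recorded like this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LogicModel:
    """Sketch of the program components named on the slide above."""
    inputs: List[str] = field(default_factory=list)         # resources used to accomplish activities
    activities: List[str] = field(default_factory=list)     # work that produces products or services
    outputs: List[str] = field(default_factory=list)        # products or services for customers
    outcomes: List[str] = field(default_factory=list)       # customer/environmental/health changes
    externalities: List[str] = field(default_factory=list)  # contextual influences on the program

# Hypothetical example: a facility inspection program.
inspections = LogicModel(
    inputs=["inspector hours", "travel budget"],
    activities=["conduct facility inspections"],
    outputs=["inspection reports"],
    outcomes=["improved compliance", "reduced emissions"],
    externalities=["state enforcement climate"],
)
print(inspections.outcomes)
```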
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Are outcomes caused by the program? - During and after the program: Impact Evaluation
• Were the benefits worth the cost? - After: Cost-benefit Analysis
-------
Evaluation Questions, Phases of a Program, and Corresponding Evaluation Types
• Could program have been conducted more cost-effectively? - After: Cost-effectiveness Analysis
• What are the big picture findings and implications? - After a series of evaluations: Meta-analysis
Standards of Practice
• Joint Committee Program Evaluation Standards: Utility, Feasibility, Propriety, Accuracy
• Yellow Book - GAO Auditing
• Turquoise Book (PCIE)
• AEA Guiding Principles
Evaluation is a Profession
• persons with specialized
knowledge and skills
• unique body of content
• preparation programs
• stable career opportunities
• working on certification
• professional associations...
• ...which influence preparation programs
• standards
Evaluation and Similar Processes
• Evaluation is the broad category of processes which gather and share information with stakeholders to use in decision-making (focus on users and use)
• Would also include auditing, investigation, monitoring, assessment
• Research is different from evaluation in that research questions are not necessarily targeted by specific information users for use
-------
Are Auditing and Evaluation Different?
• Activities - No / Yes
• Scope - No / Yes
• Types of Questions - No / Yes
• Resources Needed - No / Yes
• Purpose - No / Yes
• Stakeholders - No / Yes
• Utilization - No / Yes
• Results - No / Yes
Can't Wait Until Tomorrow!
(Wednesday, January 17)
• Starting Time: 8:30 AM
• Location: Here!
• Topics: Basic Steps in Program Evaluation; Logic Modeling; Pilot Process; Assistance for the Teams
• Announcements:
What We Did This Afternoon
• Shared a little about ourselves and
expectations
• Defined program evaluation more fully
• Identified the links between program
development and evaluation
• Learned about the 'profession' of
program evaluation
• Distinguished program evaluation from
other similar processes
-------
EVALUATION IS...
1. My definition of evaluation is...
2. A definition of evaluation from a reference book is...
3. Here are some common definitions:
the systematic collection of information about the activities, characteristics, and outcomes of programs to make judgments about the program, improve program effectiveness, and/or inform decisions about future programming (Patton, 1997)
determining the extent to which a program has achieved its goals. (What about implementation, program processes, unanticipated consequences, long-term impacts?)
determining the worth, merit, or value of something - program, product... (Using what evidence? For what purposes?)
the systematic assessment of the operation and/or the outcomes of a program or policy, compared to a set of explicit or implicit standards, as a means of contributing to the improvement of the program or policy (Weiss, 1998)
4. What is OPE's definition?
-------
Steps in Program Evaluation
Fits Most Evaluation Needs
1. Identify Key Stakeholders - Gather
their Questions
• Congress
• The Agency
• Other Agencies
• The Regulated Community
• The States
• The Public
• Results of Design Evaluation
Steps
1. Identify Key Stakeholders and Questions
2. Assemble an Evaluation Team
3. Identify Information Needs and Plan Data Collection
4. Collect Data
5. Analyze Data
6. Develop Findings
7. Draw Conclusions and Make Recommendations
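Because the seven steps above run in a fixed order, they can be sketched as an ordered checklist. The following minimal Python sketch is an illustration only; the step names come from the slide, while the helper function is an assumption:

```python
from typing import List

# The seven steps from the slide, in order.
EVALUATION_STEPS: List[str] = [
    "Identify key stakeholders and questions",
    "Assemble an evaluation team",
    "Identify information needs and plan data collection",
    "Collect data",
    "Analyze data",
    "Develop findings",
    "Draw conclusions and make recommendations",
]

def next_step(steps_completed: int) -> str:
    """Return the next step, given how many steps are already done."""
    if steps_completed >= len(EVALUATION_STEPS):
        return "Evaluation complete"
    return EVALUATION_STEPS[steps_completed]

print(next_step(0))  # -> Identify key stakeholders and questions
```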
(If there are questions for an evaluation)... 2. Assemble a Team to Guide and Conduct the Evaluation
• Evaluation
• Subject Matter
• Agency
• IG Process
• Facilitator
-------
3. Identify Information Needs and
Plan the Data Collection
• What are the information needs?
• Are the data available (good, accessible...)?
• Do additional data have to be gathered?
• Using what instruments and processes?
• Using what resources?
• Are there alternatives?
4. Gather Data
• if available... gather
• if not, design and test instruments, gather data
• OR contract out
5. Analyze
• Inductive process for Qualitative Data (interviews, focus groups, some observational processes)
• Deductive process for Quantitative Data (questionnaires, tests, environmental measures and monitoring data, demographics, program descriptives)
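On the deductive, quantitative side of step 5, a typical first pass is descriptive statistics. A minimal sketch, with made-up questionnaire scores standing in for real survey data:

```python
import statistics

# Made-up questionnaire responses on a 1-5 scale (illustrative data only).
scores = [3, 4, 4, 5, 2, 4, 3, 5, 4, 4]

print("n    =", len(scores))               # sample size
print("mean =", statistics.mean(scores))   # central tendency
print("sd   =", statistics.stdev(scores))  # spread (sample standard deviation)
print("mode =", statistics.mode(scores))   # most frequent response
```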
6. Develop Findings
• if needs assessment... "The program fits in these ways"
• if design evaluation... "The program appears like it can be implemented and can successfully reach these outcomes"
• if implementation evaluation... "The program is being implemented in this way"
-------
6. Develop Findings (continued)
• if outcome evaluation... "There are these outcomes, anticipated and unanticipated"
• if impact evaluation... "The program is causing these outcomes"
• if combination...
7. Draw Conclusions and Make Recommendations
• Conclusions are based on the findings and some pre-identified criteria: "The program is being implemented in this way, thus we conclude that..."
7. Draw Conclusions and Make Recommendations (continued)
• Recommendations are based on findings and conclusions of this evaluation, as well as other evaluations and studies: "Based on these findings and conclusions, as well as other audit results, we recommend..."
-------
TYPES OF EVALUATION
When, in the life of a program, does evaluation take place (before, during, after), and how are the results of that evaluation used to make decisions about the program (mid-course correction; continuing, expanding, or institutionalizing the program; cutting, ending, or abandoning the program; testing a new program idea; choosing the best of several alternatives; deciding continued funding)?
1 "Formative evaluations strengthen or improve the object being evaluated-they help form
it by examining the delivery of the program or technology, the quality of its
implementation, and the assessment of the organizational context, personnel, procedures,
inputs, and so on " Trochim, 2000
needs assessment
design evaluations.
evaluabihty assessment'
implementation evaluation.
2 "Summative evaluations, in contrast, examine the effects or outcomes of some
ob|ect-they summarize it by describing what happens subsequent to delivery of the program
or technology; assessing whether the object can be said to have caused the outcome,
determining the overall impact of the causal factor beyond only the immediate target
outcomes, and, estimating the relative costs associated with the object" Trochim, 200O
outcome evaluations
impact evaluation-
cost-benefit analysis and cost-effectiveness analysis
meta-analysis:
3. Other reasons for evaluation: postponement, ducking responsibility, window dressing, public relations, fulfilling someone's requirements.
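The formative/summative grouping in items 1 and 2 above is a straightforward classification. As a minimal sketch (the groupings follow the handout text; the dictionary and lookup function are assumptions):

```python
# Evaluation types grouped by purpose, following the handout text.
EVALUATION_TYPES = {
    "formative": [
        "needs assessment",
        "design evaluation / evaluability assessment",
        "implementation evaluation",
    ],
    "summative": [
        "outcome evaluation",
        "impact evaluation",
        "cost-benefit / cost-effectiveness analysis",
        "meta-analysis",
    ],
}

def purpose_of(evaluation_type: str) -> str:
    """Return 'formative' or 'summative' for a known evaluation type."""
    for purpose, types in EVALUATION_TYPES.items():
        if evaluation_type in types:
            return purpose
    raise KeyError(evaluation_type)

print(purpose_of("impact evaluation"))  # -> summative
```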
-------
WHAT ARE WE EVALUATING?
1. What is a program?
"The term 'program' refers to an organized set of activities that are managed toward a particular set of goals for which the program can be held separately accountable." (Kirchner, 2000)
"The general effort that marshals staff and projects toward some (often poorly) defined and funded goals." (Scriven, 1991)
"...I will call a national program, like Head Start or Superfund environmental cleanup, a program. The local operations of the program are each projects. Thus, the Head Start that is operated [locally] is a project. An element of the [local] Head Start project, like involving parents through weekly meetings, is a component. Evaluations can be directed at any of these levels. We can evaluate national programs, local projects, or sub-project components." (Weiss, 1998)
2. What are the various levels of 'program' we might evaluate?
3. What comprises a program?
-------
EVALUATION AS A PROFESSION
Refer to articles in: Altschuld, James W., and Molly Engle (eds.). (1994). New Directions for Program Evaluation, No. 62. San Francisco, CA: Jossey-Bass.
Worthen, Blaine. "Is Evaluation a Mature Profession That Warrants the Preparation of Evaluation Professionals?"
Mertens, Donna M. "Training Evaluators: Unique Skills and Knowledge"
Kingsbury, Nancy, and Terry E. Hedrick. "Evaluator Training in a Government Setting"
What is a Profession? Is Evaluation There Yet?
□ 1. It has persons with specialized knowledge and skills.
□ 2. There is a developed body of content unique to its area of specialization.
□ 3. There are preparation programs designed to produce practitioners who are well qualified in the unique knowledge and skills.
□ 4. Stable career opportunities have emerged for such well-qualified practitioners.
□ 5. The specialization has developed procedures for the certification or licensure of those judged qualified to practice it.
□ 6. There are associations devoted to furthering the professional development of its practitioners.
□ 7. There are criteria for determining membership in such associations.
□ 8. The relevant professional associations influence the preparation programs.
□ 9. The specialization has developed standards to guide those who practice it.
Who conducted evaluation before evaluators?
Accountants and auditors, management consultants, planning and systems analysts,
economists, research/product development, test marketing specialists, academics in social and
behavioral science.
Knowledge and Skills Associated with Evaluation:
Research Methodology, Project Management, Strategic Planning, Auditing, Program
Development, Communication Skills, People Skills, Negotiation Skills, Personal Skills
(credible, good judgment...), Cross-cultural Skills, Policy Analysis, Valuing, Economics,
Specific to discipline (education, psychology, health, business, government, environment)
-------
...the American Evaluation Association; the AEA's past, current, and incoming presidents (David Cordray, David Fetterman, and Karen Kirkhart, respectively); Arnold Love, president of the Canadian Evaluation Society (CES); the regional CES presidents; Kathy Jones of the CES; Gary Cox of the University of Washington; the National Center for Science Teaching and Learning at The Ohio State University; HealthEast of St. Paul, Minnesota; and the University of Alabama School of Medicine for assistance with preparation of the directory.
The editors of this volume express their deep gratitude to those who participated in the study. Without them, it would not have been possible.
James W. Altschuld
Molly Engle
Editors

JAMES W. ALTSCHULD is associate professor of educational research and evaluation and evaluation coordinator for the National Center for Science Teaching and Learning at The Ohio State University. His research interests include evaluation models and methodology, needs assessment, and the development of evaluation training programs.

MOLLY ENGLE is an assistant professor in the Behavioral Medicine Unit, Division of Preventive Medicine, Department of Medicine at the University of Alabama at Birmingham School of Medicine. She designs, implements, and conducts research and evaluations in behavioral medicine and community-based health services.
Criteria for judging the maturity of any profession are applied to evaluation. Special attention is paid to the question of whether programs for the preparation of evaluation specialists are warranted.
Is Evaluation a Mature Profession That
Warrants the Preparation of Evaluation
Professionals?
Blaine R. Worthen
There is wide agreement that evaluation is an important professional specialization, but there is less certainty as to whether it has yet attained the status of a distinct profession. To answer this question, I propose that a fully developed profession has at least nine characteristics, and I will discuss these characteristics in the context of the need for preparation. Worthen and Sanders (1991) advanced six of these criteria in their discussion of trends in educational evaluation, and portions of this chapter draw on that earlier work.

First, a fully developed profession needs persons with specialized knowledge and skills. Second, it has developed a body of content (knowledge and skills) unique to its area of specialization. Third, the profession has developed preparation programs designed to produce practitioners who are well qualified in the unique knowledge and skills. Fourth, stable career opportunities have emerged for such well-qualified practitioners. Fifth, the specialization has developed procedures for the certification or licensure of those judged qualified to practice it. Sixth, the specialization has developed associations devoted to furthering the professional development of its practitioners. Seventh, the specialization has developed criteria for determining membership in such associations. Eighth, the relevant professional associations influence the preparation programs. Ninth, the specialization has developed standards to guide those who practice it.

A simple status check on each of the nine criteria proposed would be one way of judging how far evaluation has moved toward attaining the characteristics of a full-fledged profession. However, the maturation of evaluation
toward the status of a profession can better be understood by considering the forces that have shaped it across the past thirty years. Although space will not permit me to say much about the historical emergence and evolution of evaluation, I will sketch some portions of the historical backdrop when it helps me to clarify the current status of evaluation on the nine criteria proposed.
Need for Evaluation Specialists
Although there were a few embryonic efforts to evaluate public programs prior to 1960 (Shadish, Cook, and Leviton, 1991; Worthen and Sanders, 1987), most commentators believe that contemporary evaluation of educational and social programs first emerged during the 1960s. Early in that decade, the U.S. Congress passed federal legislation that, in authorizing antipoverty, juvenile delinquency prevention, and manpower development and training programs, both required program evaluation and allocated funds for it (Wholey, 1986; Weiss, 1987). Yet the emphasis on evaluation built into the Elementary and Secondary Education Act (ESEA) of 1965 dwarfed previous efforts to mandate the use of evaluation. Broad in scope, the ESEA provided large-scale funding for education that allowed tens of thousands of federal grants to be awarded to local schools, state and regional education agencies, and universities. Due largely to the efforts of Robert F. Kennedy, the ESEA required the recipients of grants dealing either with compensatory education for disadvantaged youth or with innovative educational projects (the great majority of grants) to file an evaluation report showing what had resulted from the expenditure of public funds.

Overnight, thousands of educators were required to evaluate their own efforts. Few were up to the task. Classroom teachers and building principals were among those pressed into technical activities for which they had little training. The results were abysmal. And when well-trained educational, psychological, or sociological researchers were called in to help, the results were, surprisingly, not much better. Despite their technical prowess, these researchers were not prepared for the complex tasks of identifying the influences that could be attributed to each of several components of a program or even of separating the effects of the program from other activities going on in the school. Clearly, new evaluation approaches, methods, and strategies were needed.

Meanwhile, areas outside education were experiencing increased demands for evaluation, although it was often called by other names. By the late 1960s, Congress had authorized monies for evaluation of social programs in areas as diverse as the Job Corps, vocational rehabilitation, child health, and community action. Managers of the projects and programs funded under such social legislation searched to find the individuals best equipped to fill the newly created evaluation roles. Faced with an absence of persons trained directly in evaluation, they employed people trained for roles that contained some evaluative
functions: professional accountants and auditors, management consultants, planning and systems analysts, economists, research, product development, and test marketing specialists from the private sector, and academics in areas relevant to the collection and analysis of evaluative information (Shadish, Cook, and Leviton, 1991).

The evaluations conducted by these persons were little better than those conducted by the classroom teachers and educational psychologists who had been pressed into service as evaluators on federally funded education projects. While most of those drafted or recruited into evaluation roles were very skillful in some of the tasks required of evaluators, few were even aware of the broad range of tasks that were essential for a complete and adequate evaluation. Fewer still possessed the skills that one must have in order to complete those tasks. The need for persons with a new constellation of specialized skills was evident to any insightful observer.

Today the need for evaluation specialists is generally accepted, although many policy makers and program managers who are naive about the knowledge and skills that evaluators should possess still attribute evaluation expertise to self-appointed or self-anointed "evaluators" who lack essential evaluation skills and knowledge. Despite the frequent lapses when evaluators are selected, that there is a need for evaluation specialists seems to be well established.
Development of Unique Content
When demands for evaluation increased dramatically in the 1960s, the resulting evaluation studies revealed the conceptual and methodological impoverishment of evaluation as it then existed. Theoretical and methodological work related directly to evaluation did not exist, and evaluators were left to gather what they could from theories in cognate disciplines and to borrow what they could from the methodologies developed in such fields as experimental design, psychometrics, survey research, and ethnography. The results were disappointing and underscored the need for the development of new conceptualizations and methods tailored to fit the needs of evaluators more precisely. Scholars responded to this need, and by 1970 important seminal writings had provided conceptual foundations and scaffolding for the young field of evaluation (Cronbach, 1963; Scriven, 1967; Stake, 1967; Stufflebeam, 1968). Books of readings on evaluation were published (Caro, 1971; Worthen and Sanders, 1973). Articles about evaluation appeared with increasing frequency in professional journals. Together, these publications resulted in a proliferation of new evaluation models that collectively provided new ways of thinking about evaluation. This emerging body of literature showed evaluation to be a multidimensional technical and political enterprise that required both new conceptualizations and new insights into the ways in which methodologies borrowed from other fields could be used appropriately.

In recognizing the need for unique theories for evaluation, Shadish, Cook,
and Leviton (1991, p. 31) noted that, "as evaluation matured, its theory took on its own special character that resulted from the interplay among problems uncovered by practitioners, the solutions they tried, and traditions of the academic discipline of each evaluator, winnowed by twenty years of experience."

Publications focusing exclusively on evaluation appeared in the 1970s. They included such journals and series as Evaluation, Evaluation and Program Planning, Evaluation Practice, Educational Evaluation and Policy Analysis, New Directions for Program Evaluation, and the Evaluation Studies Review Annual. The number of books published expanded markedly in the second half of the 1970s and throughout the 1980s. Textbooks, reference books, and even compendia and encyclopedias of evaluation all appeared. Clearly, the necessary conceptual underpinnings of a profession are accumulating in a body of evaluation literature that is arguably unique. Thus, evaluation seems to qualify as a profession on the second criterion. There is a body of knowledge that outlines the content of the field and its unique (or adapted) theories, strategies, and methods.
Programs for the Preparation of Evaluators
Foreseeing that education had few persons trained in educational inquiry skills, the U.S. Congress funded graduate training programs in educational research and evaluation in 1965. These programs included fellowship stipends for graduate study in these new specializations. Several universities launched full-fledged, federally funded graduate programs aimed at training educational evaluators. When federal funds disappeared, so did many of the graduate programs that they had supported. In 1971, graduate programs for training evaluators existed at more than a hundred American universities (Worthen and Byers, 1971). Fifteen years later, only forty-four U.S. universities had such programs (May, Fleischer, Scheirer, and Cox, 1986). And many programs that had had many courses in evaluation scaled back to a single elective course.

The evaluation preparation programs that continued generally offered training tailored to fit the reconceptualized views of evaluation that were emerging. Notions of how evaluators should be trained gradually expanded beyond traditional training. Courses in research design, statistics, and measurement were often supplemented by a wide variety of applied methods and techniques courses in such areas as naturalistic observation, interviewing techniques, content analysis, performance assessment, and communication and writing skills. Evaluation internships, assistantships, and practica became more central in preparation programs as evaluation mentors realized that, in evaluation as elsewhere, the best training is often apprenticeship training.

In recent years, the training of evaluators has increasingly been relocated to nonacademic settings. In-service evaluation training for practitioners is often offered in schools, state agencies, and businesses. On occasion, large
corporations have established corporate training centers (such as Xerox Document University). These centers, which resemble miniuniversities, provide training in evaluation along with other techniques. Some of these centers award certificates attesting that the recipient is qualified in the specialization in which he or she has been trained. And, of course, many persons follow serendipitous career paths into evaluation roles where their preparation consists primarily of on-the-job bootstrapping. On balance, there are a sufficient number of evaluation training programs in universities, government agencies, corporations, and other settings to produce an ongoing supply of professional evaluators. It seems clear that evaluation meets this third criterion for having reached the status of a profession.
Stable Career Opportunities for Evaluators
One sign that a specialization is a profession is a continuing need for the services of personnel trained in that specialty. No field that is only a fad that flourishes briefly and then fades would qualify as a profession. In judging evaluation on this dimension, we must consider whether, despite uncertain social and economic trends, evaluation provides the stable employment opportunities that are typical of mature professions.

At first, evaluation seemed to be just another boom-and-bust specialty. When the need for evaluators grew quickly between 1965 and 1975, it seemed that evaluation training could provide stable career opportunities for anyone who developed a reasonable degree of expertise in evaluation. That view grew doubtful in the late 1970s, when a dip in the level of federal funding for evaluation appeared to signal a declining U.S. job market for evaluators. In the early 1980s, Ronald Reagan cast a darker shadow over the evaluation scene. Federal evaluation mandates were quietly shelved as the so-called new federalism reduced federal funding for education and other social programs and cut federal control over the ways in which states and local agencies spent the federal funds that they received. Much categorical funding that had required evaluation was replaced by block grants to states, which were largely exempt from evaluation requirements. Most analysts during the early 1980s were convinced that state and local agencies, hard pressed for operational funds, would use categorical funding to buy supplies, repair equipment, or add staff. Evaluation was predicted to be one of the major casualties of the Reagan administration. Since federal mandates had spawned evaluation in state and local agencies, it seemed reasonable to expect that evaluation would decline or even cease when federal evaluation requirements were relaxed or abolished.

By 1982, these pessimistic prophecies seemed to have proved accurate. Governmental monitoring of categorical funding programs was drastically reduced. Individual evaluators and evaluation agencies that depended on contracts with
federal programs found this source of income drying up. For example, Shadish, Cook, and Leviton (1991) note that the number of evaluation studies conducted by the U.S. Office of Planning, Budget, and Evaluation dropped from 114 in 1980 to 11 in 1984. The declines in other evaluation activities that depended on federal funds were comparable.

Gloom soon spread over the evaluation landscape, and evaluators' conferences in the early and mid-1980s focused on such themes as the decline of evaluation. Evaluation trainers at many universities began to ask whether it was ethical to train neophytes in roles for which demand was thought to be diminishing. For a time, it seemed that the evaluation bubble had burst. Evaluation seemed destined for the graveyard of promising endeavors that had failed to fulfill their potential.

But the situation began to change. For reasons that at first could not be explained, some evaluation agencies seemed to be bucking the declining trends. Indeed, they found that the 1980s had brought a stronger surge of evaluation business than ever before, and soon they were eagerly seeking to add well-qualified evaluators to their staffs. Gradually, it became apparent that only the evaluation agencies that depended primarily on federal funds had been hard hit, while the agencies that served state and local agencies, corporations, professional associations, and the like were finding that evaluation was still a bustling, thriving enterprise.

Somehow (perhaps only the most sagacious historical analyst could determine all the causes), decision makers in state and local government, business, and industry had begun to use evaluation for their own purposes: to provide information that they needed to guide policy and program implementation. Gradually, increasing numbers of agencies began to commission evaluation studies not because they had been forced to but because they believed that the resulting data would be helpful. House (1990) noted this trend in the emerging tendency for large bureaucracies to develop their own evaluation offices, and Worthen and Seeley (1990) described how evaluation had been institutionalized by a variety of enterprises across broad sectors of contemporary society.

For those who recognized that this widespread institutionalization of evaluation would lend stability to the evaluation job market, the pessimism that had prevailed at the beginning of the decade soon passed. Openings for evaluators were appearing in a wide variety of settings, which included public and private school districts, state and regional education agencies, social service agencies, universities and colleges, state systems of higher education, test and text publishers, and the military, business, and industry. Evaluation academics were at first amused and then amazed as they saw their students recruited to evaluate personnel training programs run by large, national accounting firms, insurance and brokerage houses, and fast-food chains.

The shortage of trained evaluators is obvious today, as the traditional employers of evaluators are now forced to compete with such firms as Aetna,
Xerox, and Price Waterhouse. Every year, the number of evaluation vacancies outside academic settings surpasses the number of qualified candidates. And the surge in federal program evaluation over the past few years has accentuated the need for evaluators. Ginsburg, McLaughlin, and Takai (1992, p. 24) note that "spending on program evaluation by the U.S. Department of Education exceeds $40 million per year, a tripling of the budget over the last five years." So a career in evaluation seems again to be a very good possibility. If the probability of continued employment in a specialization is an important criterion for considering it to be a profession, evaluation may well be considered a viable profession.
Procedures for the Certification or Licensure of Evaluators
Since an extensive discussion of this question is beyond the scope of this chapter, I will only touch on it lightly. For this chapter, the central question is whether there are mechanisms for the certification or licensing of evaluators similar to those that mark teachers, psychologists, and certified public accountants as professionals. The answer, of course, is no.

Despite pleas that the American Educational Research Association establish mechanisms to provide certification for qualified evaluators (Gagne, 1975; Worthen, 1972), neither it nor any other association or agency has stepped forward to assume responsibility for the licensing or certification of evaluation practitioners. As a result, there is currently no way of preventing incompetent or unscrupulous operators from proclaiming themselves to be evaluators. Without some type of credentialing process, it is difficult for those who need evaluation services to determine in advance that those whom they select are indeed competent. "Let the buyer beware" is still the watchword for those who must retain the services of an evaluation specialist. In the absence of certification or licensure, unprincipled hucksters can do much mischief and in the process badly tarnish the image of evaluation.

Perhaps that cannot be helped. I am much less sanguine today that we can set up credentialing systems than I was two decades ago. However desirable it may be to have some way of ensuring that the unqualified cannot masquerade as evaluators, the development of such a mechanism does not seem feasible for two reasons. First, rooted as evaluation is in so many disciplines, and with today's evaluators trained in as many diverse specializations and through such diverse means as they are, it is hard to imagine how any broad agreement about the essential elements of evaluation competencies could be forged. Put bluntly, since there is so little agreement about the methods and techniques that evaluators should use, it seems almost certain that a majority of practicing evaluators would reject an effort to construct and use a template of any sort to judge the qualifications of all evaluators. Second, it seems unlikely that any professional association or government agency will soon be equipped to grapple with the thorny and often litigious business of licensing evaluators,
especially in a field where those affected by the effort are more accustomed to evaluating than to being evaluated.

Nevertheless, until and unless we establish some feasible mechanism for ensuring that those who practice evaluation are competent to do so, evaluation cannot be considered a fully mature profession.
Development of Professional Associations for Evaluators
Several professional associations in North America have emerged to provide homes for evaluators. (Similar trends are seen in other countries.) One of the first North American efforts was not a full-blown association as such but rather Division H of the American Educational Research Association, which provided a home for school evaluators. However, two professional associations for practicing evaluators were founded in 1976. The Evaluation Network (EN) consisted largely of educational evaluators, while most members of the Evaluation Research Society (ERS) served in other professional fields.

In 1985, the EN and the ERS merged to form the American Evaluation Association (AEA), which, with about 2,200 members, is the largest professional association that exists solely to serve the needs of practicing evaluators. The Canadian Evaluation Society (CES) was launched to serve the needs of Canadian evaluation practitioners who worked in settings ranging from provincial ministries to private consulting groups. Given the scope and stature of these associations, it is clear that evaluators have viable professional organizations. On this criterion, evaluation fares as well as any profession.
Criteria for Determining Membership in Evaluation
Associations
Most professions have established criteria for denying membership in professional associations to those who are patently unqualified in the business of the profession. This cannot be said of evaluation. The criteria for membership in all the professional evaluation associations just mentioned are lenient, and no organization would effectively exclude those who were not qualified as evaluators from membership. On this criterion, as on the criterion of certification, it appears that evaluation has not reached full maturity as a profession.
Influence of Evaluation Associations on Preparation
Programs for Evaluators
In many professions, the major professional associations play a powerful role in shaping university preparation programs through accreditation or similar mechanisms. Evaluation associations exert no such influence. None of the professional associations for evaluators mentioned earlier exercise any direct
control or influence over any preservice program that purports to train evaluators. The evaluation associations do not accredit preservice training programs or control decisions about required course content, essential internship experiences, or faculty qualifications. On this criterion, too, evaluation is not fully a profession.
Development of Standards for Evaluation Practice
Most professions contain technical standards, ethical standards, or both that are intended to ensure that professional practice is of high quality. Evaluation was without such standards during its early years. Then in 1981, evaluation took a giant step forward toward qualifying as a profession when several years of work by the Joint Committee on Standards for Educational Evaluation, a coalition of professional associations concerned with evaluation in education and psychology, resulted in the publication of Standards for Evaluations of Educational Programs, Projects, and Materials (Joint Committee on Standards for Educational Evaluation, 1981). These comprehensive standards were intended to guide both those who conducted evaluations and those who made use of evaluation reports.

In 1982, the ERS published another set of standards for evaluation practice (Rossi, 1982). Six years later, the Joint Committee on Standards for Educational Evaluation (1988) published the Personnel Evaluation Standards. Currently, the same organization is nearing the end of the process of revising the standards first published in 1981. If a set of standards to guide professional practice is a hallmark of a profession, then evaluation certainly qualifies, for its standards are much better developed than those now used to guide practice in several more venerable professions.
Profession, Professional Specialization, or Field of
Professional Practice?
Up to this point, we have considered nine touchstones that seem useful in ascertaining whether a field of endeavor has attained the status of a distinct profession. Let us now consider these criteria together. What do they tell us about the progress of evaluation toward becoming a profession? Is evaluation separate and distinct from the other professions and disciplines with which it has been intertwined for decades? In short, is evaluation a profession?

The answer depends on the rigor with which we apply the nine criteria just examined. If an area of specialization must meet all nine criteria before it can be thought of as a profession, then evaluation is not a profession. Figure 1.1, which summarizes the preceding discussion of evaluation and the characteristics that most fully developed professions possess, shows that evaluation falls short on three.
Figure 1.1. Criteria for Judging Whether Evaluation Has Become a Profession

Does Evaluation Meet the Criterion?
1. A need for evaluation specialists? Yes
2. Content (knowledge and skills) unique to evaluation? Yes
3. Preparation programs for evaluators? Yes
4. Stable career opportunities in evaluation? Yes
5. Certification or licensure of evaluators? No
6. Appropriate professional associations for evaluators? Yes
7. Exclusion of unqualified persons from those associations? No
8. Influence of evaluators' associations on preservice preparation programs for evaluators? No
9. Standards for the practice of evaluation? Yes
For evaluation to be considered a full-fledged profession, these three areas will need to be dealt with. Nevertheless, some conditions may be difficult ever to meet. For example, we may never resolve the challenge of certifying evaluators. Does this mean that evaluation will never qualify as a profession? Or can evaluation be considered a profession if it meets most of the criteria?

Those who have commented on the status of evaluation as a profession are not of one voice. A decade ago, most writers seemed to hold the view that evaluation had not yet attained the status of a distinct profession. For example, Rossi and Freeman (1993, p. 432) concluded that "evaluation is not a 'profession,' at least in terms of the formal criteria that sociologists generally use to characterize such groups. Rather, it can best be described as a 'near-group,' a large aggregate of persons who are not formally organized, whose membership changes rapidly, and who have little in common in terms of the range of tasks undertaken, competencies, work sites, and shared outlooks." Merwin and Werner (1985) have also concluded that evaluators cannot yet claim full professional status.

Several recent authors have reached a somewhat more liberal conclusion. For example, Patton (1990) states unequivocally that evaluation has become a profession and that it is a demanding and challenging one at that. Shadish, Cook, and Leviton (1991, p. 25) are slightly more cautious: "Evaluation is a profession in the sense that it shares certain attributes with other professions and differs from purely academic specialties, such as psychology or sociology. Although they may have academic roots and members, professions are
economically and socially structured to be devoted primarily to practical application of knowledge in a circumscribed domain with socially legitimated funding. Professionals tend to develop standards of practice, codes of ethics, and other professional trappings. Program evaluation is not fully professionalized, like medicine or the law; it has no licensure laws, for example, but it tends toward professionalization more than most disciplines."

To summarize, some now view evaluation as a profession because it possesses most of the touchstones that collectively define a profession. Others believe that evaluation is not now a full-blown profession and that it may never become one because it lacks licensure laws and some other characteristics of such professions as law and medicine. Perhaps evaluation will forever be a near-group that tends toward professionalization. Perhaps we may best describe it as a near-profession: an area of professional practice and specialization that has its own literature, its own preparation programs, its own standards of practice, and its own professional associations. Or perhaps evaluation is best viewed as a hybrid of profession and discipline that possesses many characteristics of both and lacks some essentials of each (Scriven, 1991; Worthen and Van Dusen, in press). Or perhaps the label that we give to the practice of evaluation is of less consequence than the ways in which we structure programs aimed at preparing competent evaluation practitioners.
Are Preparation Programs for Evaluation Practitioners
Warranted?
It would matter little whether we considered evaluation to be a profession if it were not that our conceptions, and even our semantics, influence the ways in which we prepare personnel for evaluation roles. If we think of evaluation as a discipline, then we will expect preservice programs for evaluators to be patterned after those used to train academics in other disciplines. If we think of it as a profession, the course work and internships in our evaluator preparation programs will tend to resemble the methods courses and practica used to prepare practitioners for other professions. If we think of evaluation as a hybrid of discipline and profession, then our evaluation programs will combine elements of programs aimed at training practitioners with elements of programs used to prepare academics.

However we think of evaluation, this much is clear: Evaluation has matured rapidly during the past quarter century, and there is every indication that it will continue to develop and grow in the decades ahead. With its own journals, standards, and professional reference groups, evaluation has developed many of the important characteristics of a profession. And whether or not it can be considered a profession, it has emerged as an important area of specialization that demands uniquely prepared personnel if it is to reach its full potential. It has become institutionalized in many public and private sectors
of our society, and, if evaluators are prepared to meet the challenge, evaluation can become one of the most useful and far-reaching areas of human endeavor. Against this backdrop, the present and potential importance of evaluation fully warrants a careful consideration of the issues and strategies involved in preparing evaluation specialists.
References
Caro, F. G. (ed.). Readings in Evaluation Research. New York: Sage, 1971.
Cronbach, L. J. "Course Improvement Through Evaluation." Teachers College Record, 1963, 64, 672-683.
Gagne, R. M. "Qualifications of Professionals in Educational R&D." Educational Researcher, 1975.
Ginsburg, A., McLaughlin, M., and Takai, R. "Reinvigorating Program Evaluation at the U.S. Department of Education." Educational Researcher, 1992, 21(3), 24-27.
House, E. R. "Trends in Evaluation." Educational Researcher, 1990, 19(3), 24-28.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Personnel Evaluation Standards. Newbury Park, Calif.: Sage, 1988.
May, R. M., Fleischer, M., Scheirer, C. J., and Cox, G. B. "Directory of Evaluation Training Programs." In B. G. Davis (ed.), Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Merwin, J. C., and Werner, P. H. "Evaluation: A Profession?" Educational Evaluation and Policy Analysis, 1985, 7(3), 253-259.
Patton, M. Q. "The Challenge of Being a Profession." Evaluation Practice, 1990, 11(1), 45-51.
Rossi, P. H. (ed.). Standards for Evaluation Practice. San Francisco: Jossey-Bass, 1982.
Rossi, P. H., and Freeman, H. E. Evaluation: A Systematic Approach. (5th ed.) Newbury Park, Calif.: Sage, 1993.
Scriven, M. "The Methodology of Evaluation." In R. E. Stake (ed.), Curriculum Evaluation. American Educational Research Association Monograph Series on Evaluation, no. 1. Chicago: Rand McNally, 1967.
Scriven, M. "Introduction: The Nature of Evaluation." Evaluation Thesaurus. (4th ed.) Newbury Park, Calif.: Sage, 1991.
Shadish, W. R., Jr., Cook, T. D., and Leviton, L. C. Foundations of Program Evaluation. Newbury Park, Calif.: Sage, 1991.
Stake, R. E. "The Countenance of Educational Evaluation." Teachers College Record, 1967, 68, 523-540.
Stufflebeam, D. L. Evaluation as Enlightenment for Decision Making. Columbus: Ohio State University Evaluation Center, 1968.
Weiss, C. H. "Evaluating Social Programs: What Have We Learned?" Society, 1987, 25(1), 40-45.
Wholey, J. S. "Using Evaluation to Improve Government Performance." Evaluation Practice, 1986, 7, 5-13.
Worthen, B. R. "Certification for Educational Evaluators: Problems and Potential." Paper presented at the annual meeting of the American Educational Research Association, Chicago, Apr. 15, 1972.
Worthen, B. R., and Byers, M. L. "An Exploratory Study of Selected Variables Related to the Training and Careers of Educational Research and Research-Related Personnel." Washington, D.C.: American Educational Research Association, 1971.
Worthen, B. R., and Sanders, J. R. Educational Evaluation: Theory and Practice. Belmont, Calif.: Wadsworth, 1973.
Worthen, B. R., and Sanders, J. R. Educational Evaluation: Alternative Approaches and Practical Guidelines. New York: Longman, 1987.
Worthen, B. R., and Sanders, J. R. "The Changing Face of Educational Evaluation." Theory into Practice, 1991, 30(1), 3-12.
Worthen, B. R., and Seeley, C. "Problems and Potential in Institutionalizing Evaluation in State and Local Agencies." Paper presented at the annual meeting of the American Evaluation Association, Washington, D.C., Oct. 19, 1990.
Worthen, B. R., and Van Dusen, L. M. "The Nature of Evaluation." In H. Walberg (ed.), International Encyclopedia of Education. (2nd ed.) Oxford, England: Pergamon Press, in press.
BLAINE R. WORTHEN is professor and chair of the Research and Evaluation Methodology Program in the Department of Psychology at Utah State University and director of the Western Institute for Research and Evaluation in Logan, Utah.
-------
The skills and knowledge that evaluators need include those borrowed from other disciplines as well as those unique to the field of evaluation. Inclusion of multiple perspectives in evaluator training can help to develop the field and improve the practice of evaluation.
Training Evaluators: Unique Skills
and Knowledge
Donna M. Mertens
Evaluators work in complex environments, such as enrichment programs for deaf, gifted adolescents; drug and alcohol abuse programs for the homeless; and management programs for high-level radioactive waste. The field of evaluation itself is evolving as it develops through the reflective practice of the professionals involved. Evaluators have an ethical responsibility to continue their education and keep up-to-date on developments in the field (Eastmond, 1991). In consequence of this assertion, I have written this chapter for students of evaluation, by whom I mean not only those who are enrolled in formal training programs but also all practicing evaluators and teachers of evaluation.
Perspectives and Assumptions
My answer to the question, What are the unique skills and knowledge that should be considered for the preparation of evaluators? is based on four assumptions. First, we live in a multicultural society, and I assume that evaluators must bring a sensitivity to multicultural issues and perspectives to their work. Beaudry (1992, p. 82) notes that program evaluators must seek to include the multiple perspectives of ethnicity, race, gender, social class, and persons with disabilities: "Program evaluation must take notice of the changes
I thank the following people for their comments on my draft framework: Jennifer Greene, Jody Fitzpatrick, Hallie Preskill, Nick Eastmond, Jack McKillip, Dianna Newman, and Terry Hedrick.
in our society and begin to respond to social issues represented by multicultural education. Hate crimes and ethnic strife are reported on the front pages of newspapers and in the courts and the schools as well as all around the world. In education, much of what we know about negative racial prejudice, biases in testing, culturally biased instructional materials, and teacher effects remains part of the hidden curriculum. Multicultural awareness and education have equal relevance for health care, business, and industry as these sectors of society cope with the shifting patterns of a culturally diverse work force."
Stanfield (1993, pp. 6-7) addresses the same point: "The dramatically changing world in which we live demands that we cease to allow well-worn dogma to keep us from designing research [read evaluation] projects that will provide the data necessary for the formulation of adequate explanations for the racial and ethnic dimensions of human life." Although the author just cited speaks from the context of ethnicity and social science research, his comments can be more broadly interpreted as suggesting that evaluators must rethink traditional methods in order to be responsive to such alternative perspectives as those of minorities, women, the poor, and persons with disabilities.
Evaluation literature is only beginning to address the feminist (Farley and Mertens, 1993; Mertens, 1992; Shapiro, 1987) and minority perspectives (Madison, 1992). However, students of evaluation can borrow from the research-based literature and create the applications and implications that are necessary. The perspective of persons with disabilities has not been addressed as fully in the research literature (Mertens and McLaughlin, in press) as the perspectives of other groups. I include them here because a growing literature suggests that they view themselves as an oppressed cultural group (Wilcox, 1989).
Second, I view evaluation as a unique discipline that has borrowed many skills and much knowledge from social science research. Evaluation is an emerging profession with an expanding body of skills and knowledge that require continual review. Skills are things that evaluators need to be able to do. Knowledge is things that evaluators need to know. The model that I propose combines skills and knowledge, because evaluators need to be able to apply what they know in order to conduct evaluations competently.
Third, I assume that a core set of skills and knowledge exists across disciplines for evaluators. The particular emphasis of the skills and knowledge required in specific contexts depends on the discipline in which the training occurs, the level of the training, the nature of the training (for example, degree or nonacademic program, single course or program, new training or continuing education), the area of application (for example, education, economics, psychology, criminal justice, public administration, business, health, sociology, social work), the nature of the organization that employs the evaluator, and the level of the position that he or she holds. Having asserted that there is a core set of skills and knowledge, I also want to recognize the dispute between proponents of the view that evaluation has content-specific knowledge and advocates of the generalist view. Eisner (1991) describes a connoisseur as an
individual who is highly perceptive in one domain and able to make fine discriminations among complex and subtle qualities. I believe that evaluators need either to be connoisseurs in the area of application (for example, drug abuse, deafness) or to include a subject matter expert (connoisseur) in the planning, conduct, and interpretation of an evaluation.
Last, I assume also that evaluators must be capable of being responsive to the needs of the client. They must be capable of recommending the most appropriate approach to an evaluation problem. Some problems can best be studied with quantitative data, while others call for qualitative data. While individual evaluators may not be expert in all quantitative and qualitative research methods, they do need to be able to recommend the most appropriate approach. If necessary, they can work in a team with evaluators who have greater expertise in other methods. Sechrest (1992) argues that evaluators need increased sophistication in quantitative methods. I agree that there is room for experts in either quantitative or qualitative methods, but there is also a need for those who function comfortably in both domains. Lincoln and Guba (1992) argue that a mixture of quantitative and qualitative methods can be appropriate to any paradigm.
Methodology
I used a number of different techniques to identify the skills and knowledge unique to evaluation. I reviewed existing literature, such as textbooks on evaluation (Brinkerhoff, Brethower, Hluchyj, and Nowakowski, 1983; Popham, 1988; Shadish, Cook, and Leviton, 1991; Rossi and Freeman, 1993; Posavac and Carey, 1992; Worthen and Sanders, 1987), presentations on training at the annual meetings of the American Evaluation Association (Altschuld, 1992; Barrington, 1989; Covert, 1992; Eastmond, 1992; Mertens, 1992), literature identified through the use of ERIC and other databases, training-related articles in the journal Evaluation Practice, the U.S. General Accounting Office (1991) performance appraisal system for evaluators, and Davis's (1986) volume on evaluation training. I also consulted other evaluators and reflected on my own experience as an evaluator trainer for twenty-plus years. I conducted a content analysis of the skills and knowledge that I found in these various sources and organized them into a conceptual framework. I shared this conceptual framework with evaluators and trainers in a variety of disciplines, including education, psychology, business, administration, government, and interdisciplinary programs.
Skills and Knowledge Needed
I have divided the skills and knowledge into four categories: those unique to evaluation, topics associated with typical training in the methodology of research and inquiry, topics in such related areas as political science or anthropology, and discipline-specific topics. I chose this organizational framework
-------
because it lends itself to the overall design of an evaluation training program. A student who enrolls in an evaluation training program typically also receives course work in research methodology, including research design, statistics, and measurement. This organizational framework suggests areas that need to be included in evaluation courses because they are not often taught elsewhere (the components of these other topic areas arising in evaluation are unique). It also suggests other disciplines that can provide a more complete training experience. Exhibit 2.1 displays the topics associated with research, related areas, and specific disciplines. Discussion of these topics is beyond the scope of this chapter.
I focus here on the unique skills and knowledge associated with evaluation. Standard evaluation textbooks typically cover some of these topics, and I will therefore not elaborate on them. I will discuss certain controversial and emerging topics in evaluation for three reasons. First, the standard textbooks typically do not discuss them at length. Second, students of evaluation should be aware of the controversies in the field. Third, I hope to push trainers to think of including emerging topics—that is, topics that are still developing and on which consensus has not yet developed—as valid for inclusion in the evaluation curriculum.
The skills and knowledge listed in Exhibit 2.1 can all be taught with special insights and examples from evaluation. However, certain topics are not generally covered in a course in research, related areas, or other disciplines. These topics are discussed in the sections that follow.
Introductory Information About Evaluation. Evaluation textbooks typically include information about the definition of evaluation, the reasons why evaluations are conducted, the various types of evaluations (for example, implementation, process, outcome, impact, formative, summative), trends affecting evaluation, the roles that evaluators can play (for example, external, internal), and the history of evaluation.
Philosophical Assumptions. Evaluation classes should teach the philosophical assumptions underlying the positivist and postpositivist paradigmatic orientations. Although these assumptions should be taught in the research methodology classes, the teacher of evaluation cannot safely assume that they will be (Lopez and Mertens, 1993). Lather (1992) proposed an organizing framework for paradigms that is relevant here: positivists who seek to predict; postpositivists who seek to understand (this group includes those whom the evaluation literature has labeled interpretive, naturalistic, and constructivist); and postpositivists who seek to emancipate (this group includes feminists and race-specific inquirers). For the reasons outlined in the section on perspectives and assumptions, I would add persons with disabilities to the emancipatory category. Guba and Lincoln (1989) have explained the assumptions underlying positivists and postpositivists (read constructivists) in detail, and the trainer of evaluation could follow up on emancipatory paradigms through such sources as Lather (1991), Farley and Mertens (1993), Harding (1993), Shapiro (1987),
Exhibit 2.1. Knowledge and Skills Associated with Evaluation
I Knowledge and skills associated with research methodology
A Philosophical assumptions of alternative paradigms and perspectives, for example, positivists and postpositivists (for example, constructivists, feminists, minorities, and persons with disabilities) (Lather, 1992)
B Methodological implications of alternative assumptions
C Planning and conducting research
1 Literature review strategies
2 Theoretical frameworks
3 Hypothesis/questions formulation
4 Research design (quantitative designs—for example, observational research, surveys, experimental, quasi-experimental, correlational, causal comparative, and single-subject—and internal and external validity; qualitative designs—for example, case studies and ethnography—and trustworthiness; and mixed designs)
5 Data collection strategies: sample selection; quantitative data collection (for example, test construction, reliability, validity, application of tests, norm- and criterion-referenced tests, selecting measurement instruments, assessing measurement instruments, measurement error and bias, interpreting test results, instrument construction); qualitative data collection (for example, observation, interviewing, focus groups, document review, unobtrusive measures)
6 Data analysis and interpretation: data preparation, construction of databases, handling missing data, computer usage for data analysis, statistical analysis, qualitative data analysis strategies, display of data, presentation of well-supported findings, conclusions, and recommendations, communicating results and follow-up
II Knowledge and skills needed for evaluation but borrowed from other areas
A Administration/business
1 Project management: making effective use of resources, organizing and conducting of meetings, developing and administering budgets, managing personnel, delegating work, supervising staff, reviewing work products, supervising and evaluating staff, promoting teamwork, observing equal opportunity principles
2 Strategic planning
3 Auditing and evaluation
4 Evaluation and program development
B Communication/psychology
1 Oral communication: communicating with staff, external agencies, general public, and the press; obtaining needed information skillfully; avoiding misunderstanding; projecting a positive image; using media appropriately in presentations; leading discussions; conducting productive meetings; handling hostility and controversy; seeking and respecting others' viewpoints
2 Written communication: writing status reports, one-page factual summaries, executive summaries, proposals, reports, briefing papers, memos, case studies, interview notes, testimony, data collection instruments, performance appraisals, speeches, and professional articles; using computer software to
-------
Exhibit 2.1. (continued)
produce appropriate text and graphics, establishing feedback loops to avoid surprises and allow people to respond to drafts, providing constructive feedback on written products
3 People skills: getting along with people, logically explaining expectations, using sound judgment as to what should be said/written, counseling employees in need of remediation, resolving sensitive personnel problems, rewarding good performance, providing timely feedback
4 Negotiation: negotiating contracts and evaluation questions, separating people from the problem, dealing with issues and values, focusing on the many interests that are represented, inventing options for mutual gain (Barrington, 1989)
5 Personal qualities: credible, good judgment, flexible, sense of humor, continually learning, self-reflexive, curious about how things work, ability to show respect for the efforts of others
C Philosophy
1 Ethics
2 Valuing: determining the value of an object, applying criteria to information about an object to arrive at a defensible value statement
D Political science
1 Policy analysis
2 Legislation and evaluation: the place of evaluation in current legislation
E Anthropology: cross-cultural skills
F Economics (T. Hedrick, personal communication, July 9, 1993)
1 Cost-benefit and cost-effectiveness analysis, supply/demand theory, discounting, wage rate analysis
2 Controlling for economic factors, for example, changes in the unemployment rate
III Knowledge and skills unique to specific disciplines
A Education: educational objectives, instructional design, instructional product evaluation, teacher evaluation, populations with special needs, accreditation, alternative assessment strategies
B Psychology: human development, social service programs, clinical models, goal attainment scaling, outcome evaluation of psychotherapeutic interventions, psychological measurement, work environment (motivation, job satisfaction, productivity) (Cordray, Boruch, Howard, and Bootzin, 1986)
C Health: epidemiological studies
D Business: task analysis, job analysis, management, organizational change, market research, organizational design and development, information systems, conflict resolution (Perloff and Rich, 1986)
E Government: policies, procedures, regulations, and legislation that apply to the work area (U.S. General Accounting Office, 1991)
F Public administration: distinctions between the public and private sectors (for example, many "bosses" in the public sector: legislative, judicial, executive, public, special interest groups), no clear bottom line as with profit in the private sector, accountability to the public (J. Fitzpatrick, personal communication, July 8, 1993)
and Nielsen (1990) for feminists; Madison (1992), Marin and Marin (1991), and Stanfield and Dennis (1993) for minorities; and Mertens and McLaughlin (in press) for persons with disabilities. Evaluation courses should include these diverse perspectives, which should be integrated into the process of planning and implementing an evaluation on the understanding that an inquirer's philosophical assumptions and theoretical orientation influence every stage of the design process.
Theories and Models of Evaluation. Numerous methods for the organization of the many theories and models of evaluation have emerged. Shadish, Cook, and Leviton (1991) explore the knowledge base that has emerged regarding evaluation theories. Theories encompass the choice of evaluation method, philosophy of science, public policy, and value orientation. These authors have identified three stages of evaluation theories: theories that use a rigorous, scientific method and emphasize the search for "truth"; theories that emphasize the need for detailed knowledge about how organizations in the public sector work to increase the political and social usefulness of results; and theories that integrate alternatives generated in the first two stages. Guba and Lincoln (1989) provide a contrasting framework for theories and models of evaluation that includes four "generations": measurement (testing), description (objectives), judgment, and the responsive, constructivist theory of evaluation. As I mentioned in the preceding section, emerging theories associated with the emancipatory paradigm provide fertile ground for an exploration of the meaning of alternative perspectives and their methodological implications.
Planning and Conducting an Evaluation. Although the process of planning and conducting an evaluation varies with the theoretical framework, the student of evaluation should be knowledgeable about and able to apply the following steps (a brief checklist sketch follows the list):
1 Focusing the evaluation. This step includes identifying the object of the evaluation, its purpose, its audiences, and the constraints and opportunities. Identification and involvement of stakeholders have been tied to the increasing utilization of evaluation results, and they have also been a source of controversy in the evaluation field. Harding (1993) and Madison (1992) assert that the stakeholders involved should be those with the least power and that team-building and collaboration strategies should be devised to include clients in a meaningful way. T. Hedrick (personal communication, July 9, 1993) believes that team building is not appropriate in such settings as federal oversight evaluations. J. Greene (personal communication, July 7, 1993) believes that what is distinctive about evaluation is the way in which politics intertwines with public program and policy decisions and the distinctive, contested audiences of an evaluation. Students should explore who should be involved in an evaluation, whose purposes an evaluation should serve, and how best they can be appropriately involved.
-------
2 Designing the evaluation and formulating questions. The choice of a theoretical framework and evaluation model discussed previously guides the evaluator here.
3 Planning data collection. The evaluator needs to identify the information needs, sources of information, instruments (including ways to describe the program treatment and implementation), and ways to identify the theory of the program being evaluated (that is, the context and presuppositions of the organizations and groups involved).
4 Analyzing and interpreting data. The evaluator needs to identify appropriate analytical approaches for the type of data collected. Identifying them will provide a mechanism for accurate and meaningful interpretation.
5 Planning, reporting, and utilization. The evaluator needs to facilitate effective communication and integrate utilization strategies throughout the evaluation process.
6 Planning management. The evaluator needs to determine the resources required and the time line of activities.
7 Planning meta-evaluation. The evaluator needs to know how to evaluate the quality of the evaluation plan, process, and product.
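These steps lend themselves to a simple completeness check. The sketch below is my illustration, not part of the original chapter: it encodes the seven steps as a checklist in Python so that a student's draft plan can be tested for missing pieces. All field names and the sample plan are hypothetical.

```python
# Hypothetical checklist for the seven evaluation-planning steps above.
# An illustrative sketch, not a tool described in this chapter.

PLANNING_STEPS = [
    "focusing",         # 1: object, purpose, audiences, constraints
    "design",           # 2: theoretical framework and evaluation questions
    "data_collection",  # 3: information needs, sources, instruments
    "analysis",         # 4: analytical approaches matched to the data
    "reporting",        # 5: communication and utilization strategies
    "management",       # 6: resources and time line
    "meta_evaluation",  # 7: how the evaluation itself will be judged
]

def missing_steps(plan):
    """Return the planning steps that a draft plan has not yet addressed."""
    return [step for step in PLANNING_STEPS if not plan.get(step)]

draft = {
    "focusing": "Job-training program; formative purpose; county stakeholders",
    "design": "Outcome questions within a utilization-focused model",
}
print("Still to plan:", ", ".join(missing_steps(draft)))
```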
Students should be given the opportunity to implement their evaluation plans through small evaluation projects completed as a class project or as a part of an internship. Several authors have provided helpful hints concerning the inclusion of practical experiences in training programs (Eastmond, Saunders, and Merrell, 1989; Morris, 1989; Preskill, 1992).
Socialization into the Profession. Students of evaluation should be given the opportunity to become socialized into the profession by means of involvement with professional organizations, networking with evaluators, and interacting with the evaluation literature.
Special Topics in Evaluation. The following topics are very important in evaluation and should be included in the preparation of evaluators:
1 Ethics: professional behavior, use of information, confidentiality, sensitivity to effect on others, pressure from client to distort facts, proper response to discovering information that is morally or legally volatile, and so on (Morris and Cohn, 1992)
2 Standards for evaluation of programs and personnel (Joint Committee on Standards for Educational Evaluation, 1981, 1988)
3 Politics of evaluation: knowing the players, the policy environment, the power of communication, how to get people to come to an agreement, and how people are likely to use information (Barrington, 1989); knowing how organizations work, how to understand an organization's goals and internal and external forces (that is, how to analyze its political context) (H. Preskill, personal communication, June 23, 1993)
4 Specific methods and contexts, such as needs assessment (McKillip, 1987), evaluability assessment, futuring (the field of future studies) (Patton, 1990), and international evaluations
5 Evaluator as trainer: training evaluation clients and users
The section in Exhibit 2.1 on skills and knowledge borrowed from related areas includes other topics, such as policy analysis, communication skills, and cost analysis, that are essential to the preparation of evaluators.
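Of the borrowed topics just mentioned, cost analysis is the one that reduces most directly to arithmetic, so a small worked example may help. The sketch below is my illustration rather than the chapter's: it shows the two ideas named in Exhibit 2.1, II.F—discounting a stream of program benefits to present value, and forming a cost-effectiveness ratio—with all figures invented.

```python
# Illustrative cost-analysis arithmetic: discounting and cost-effectiveness.
# All numbers are hypothetical; this is a sketch, not a prescribed method.

def present_value(cash_flows, rate):
    """Discount yearly amounts (received at the end of years 1, 2, ...)."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(cash_flows, start=1))

def cost_per_outcome(total_cost, outcomes):
    """Cost-effectiveness ratio, e.g., dollars per participant placed in a job."""
    return total_cost / outcomes

benefits = [40_000, 40_000, 40_000]          # program benefits over three years
pv_benefits = present_value(benefits, 0.05)  # discounted at 5 percent
cost = 100_000                               # up-front program cost
print(f"Net present value: {pv_benefits - cost:,.0f}")                # ~8,930
print(f"Cost per job placement: {cost_per_outcome(cost, 250):,.0f}")  # 400
```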
Summary
The training of evaluators should reflect the evolving, dynamic nature of the field of evaluation. Many core topics identified here are reflected in evaluation textbooks. Evaluation also borrows skills and knowledge from other disciplines, but a training program for evaluators should examine them specifically through an evaluation lens. The inclusion of emerging topics in an evaluation training program can sensitize students of evaluation to these issues and make them better able to serve the people whom their evaluations affect. The training of evaluators should prepare them to reflect on and engage in dialogue about the best ways of responding to society's diverse demands. The field of evaluation needs to think in terms of multiple, not singular, perspectives when it trains evaluators.
References
Altschuld, J. W. "Structuring Programs to Prepare Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Barrington, G. V. "Evaluator Skills Nobody Taught Me, or What's a Nice Girl Like You Doing in a Place Like This?" Paper presented at the annual meeting of the American Evaluation Association, San Francisco, 1989.
Beaudry, J. S. "Synthesizing Research in Multicultural Teacher Education: Findings and Issues for Evaluation of Cultural Diversity." In A. Madison (ed.), Minority Issues in Program Evaluation. New Directions for Program Evaluation, no. 53. San Francisco: Jossey-Bass, 1992.
Brinkerhoff, R. O., Brethower, D. M., Hluchyj, T., and Nowakowski, J. R. Program Evaluation. Boston: Kluwer Academic Press, 1983.
Cordray, D., Boruch, R., Howard, K., and Bootzin, R. "Teaching of Evaluation in Psychology: Northwestern University." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Covert, R. W. "Successful Competencies in Preparing Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Davis, B. G. "Overview of the Teaching of Evaluation Across the Disciplines." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Eastmond, J. N., Jr. "Addressing Ethical Issues When Teaching Evaluation." Paper presented at the annual meeting of the American Evaluation Association, Chicago, 1991.
Eastmond, J. N., Jr. "Structuring a Program to Prepare Professional Evaluators." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
-------
Eastmond, J. N., Jr., Saunders, W., and Merrell, D. "Teaching Evaluation Through Paid Contractual Arrangements." Evaluation Practice, 1989, 10 (2), 58-62.
Eisner, E. W. The Enlightened Eye. New York: Macmillan, 1991.
Farley, J., and Mertens, D. M. "The Feminist Voice in Evaluation Methodology." Paper presented at the annual meeting of the American Evaluation Association, Dallas, 1993.
Guba, E., and Lincoln, Y. S. Fourth-Generation Evaluation. Newbury Park, Calif.: Sage, 1989.
Harding, S. "Rethinking Standpoint Epistemology: 'What Is Strong Objectivity?'" In L. Alcoff and E. Potter (eds.), Feminist Epistemologies. New York: Routledge, 1993.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Personnel Evaluation Standards. Newbury Park, Calif.: Sage, 1988.
Lather, P. Getting Smart: Feminist Research and Pedagogy with/in the Postmodern. New York: Routledge, 1991.
Lather, P. "Critical Frames in Educational Research: Feminist and Poststructural Perspectives." Theory into Practice, 1992, 31 (2), 1-12.
Lincoln, Y. S., and Guba, E. G. "In Response to Lee Sechrest's 1991 AEA Presidential Address, 'Roots: Back to Our First Generations,' Feb. 1991, 1-7." Evaluation Practice, 1992, 13 (3), 165-170.
Lopez, S. D., and Mertens, D. M. "Current Practices: Integrating the Feminist Perspective in Educational Research Classes." Presentation at the annual meeting of the American Educational Research Association, Atlanta, Ga., 1993.
McKillip, J. Need Analysis. Newbury Park, Calif.: Sage, 1987.
Madison, A. M. (ed.). Minority Issues in Program Evaluation. New Directions for Program Evaluation, no. 53. San Francisco: Jossey-Bass, 1992.
Marin, G., and Marin, B. V. Research with Hispanic Populations. Newbury Park, Calif.: Sage, 1991.
Mertens, D. M. "Structuring a Program to Prepare Professional Evaluators: What Aren't We Talking About (That We Should Be)?" Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Mertens, D. M., and McLaughlin, J. Research Methods in Special Education. Newbury Park, Calif.: Sage, in press.
Morris, M. "Field Experiences in Evaluation Courses." In D. M. Mertens (ed.), Creative Ideas for Teaching Evaluation. Norwell, Mass.: Kluwer, 1989.
Morris, M., and Cohn, R. "Program Evaluators and Ethical Challenges: A National Survey." Paper presented at the annual meeting of the American Evaluation Association, Seattle, Wash., 1992.
Nielsen, J. M. (ed.). Feminist Research Methods. Boulder, Colo.: Westview Press, 1990.
Patton, M. Q. "The Challenge of Being a Profession." Evaluation Practice, 1990, 11 (1), 45-51.
Perloff, R., and Rich, R. F. "The Teaching of Evaluation in Schools of Management." In B. G. Davis (ed.), The Teaching of Evaluation Across the Disciplines. New Directions for Program Evaluation, no. 29. San Francisco: Jossey-Bass, 1986.
Popham, W. J. Educational Evaluation. Englewood Cliffs, N.J.: Prentice Hall, 1988.
Posavac, E. J., and Carey, R. J. Program Evaluation: Methods and Case Studies. (4th ed.) Englewood Cliffs, N.J.: Prentice Hall, 1992.
Preskill, H. "Students, Client, and Teacher: Observations from a Practicum in Evaluation." Evaluation Practice, 1992, 13 (1), 39-46.
Rossi, P., and Freeman, H. E. Evaluation: A Systematic Approach. (5th ed.) Newbury Park, Calif.: Sage, 1993.
Sechrest, L. "Roots: Back to Our First Generations." Evaluation Practice, 1992, 13 (1), 1-8.
Shadish, W. R., Jr., Cook, T. D., and Leviton, L. C. Foundations of Program Evaluation: Theories of Practice. Newbury Park, Calif.: Sage, 1991.
Shapiro, J. P. "Collaborative Evaluation: Toward a Transformation of Evaluation for Feminist Programs and Projects." Paper presented at the annual meeting of the American Educational Research Association, Washington, D.C., 1987.
Stanfield, J. H. "Methodological Reflections." In J. H. Stanfield II and R. M. Dennis (eds.), Race and Ethnicity in Research Methods. Newbury Park, Calif.: Sage, 1993.
Stanfield, J. H., II, and Dennis, R. M. (eds.). Race and Ethnicity in Research Methods. Newbury Park, Calif.: Sage, 1993.
U.S. General Accounting Office. Performance Appraisal System of Band I, II, and III Employees. Washington, D.C.: U.S. General Accounting Office, 1991.
Wilcox, S. "STUCK in School: Meaning and Culture in a Deaf Education Classroom." In S. Wilcox (ed.), American Deaf Culture. Burtonsville, Md.: Linstock Press, 1989.
Worthen, B. R., and Sanders, J. R. Educational Evaluation. New York: Longman, 1987.
DONNA M. MERTENS is professor in the Department of Educational Foundations and Research at Gallaudet University in Washington, D.C.
-------
In 1988, the U.S. General Accounting Office started an ongoing, comprehensive evaluation training program for its staff. This chapter sketches the program and describes the major substantive areas of its curriculum.
Evaluator Training in a Government
Setting
Nancy Kingsbury, Terry E. Hedrick
The U.S. General Accounting Office (GAO) is a nonpartisan agency in the legislative branch of the federal government. Its statutory mission, established by the Budget and Accounting Act of 1921, is (among other things) to "investigate, at the seat of government or elsewhere, all matters relating to the receipt, disbursement, and application of public funds" (Budget and Accounting Act, 1921). Over the years, this responsibility has evolved from detailed audits of individual agency purchases into economy and efficiency reviews of government programs and more recently to a wide-ranging array of program evaluations and policy analyses. With very few exceptions, any program or activity funded with federal tax dollars can be the subject of a GAO review. And as Congress grapples with the difficult decisions of the 1990s, the issues that the GAO is asked to evaluate mirror the breadth and complexity of the day's headlines. A recent sample of study requests includes these questions: What strategies have been most effective in reaching hard-to-serve recipients of welfare programs? What factors drive health care costs? How feasible are various approaches to the development of geothermal energy? Are federal plans for the elimination of tuberculosis from the United States achievable? What are the causes and effects of the European currency crisis? What is the best strategy for response by the federal government to natural disasters? Can nonnuclear designs for aircraft carriers and submarines meet the Navy's need for future missions? What interventions are necessary to end the underrepresentation of women and minorities in federal agencies?
In September 1993, the GAO had about 4,900 staff. Three-quarters were engaged in evaluation and auditing work. Evaluators are organized into thirty-six issue areas that correspond roughly to government program areas,
-------
such as health policy, employment and training, environmental issues, tax policy and administration, administration of justice, management of defense and space programs, international trade, and federal management issues.
Because a substantial part of the GAO's evaluation work is carried out on site where government programs operate, evaluation staff are located in Washington, D.C., in fourteen regional offices around the continental United States, and in two overseas offices (one in Hawaii, the other in Germany). Like many federal agencies, the GAO expects to reduce its size over the next few years, although there is little likelihood that the work that it will be asked to do will decrease. Accordingly, the GAO is investing significantly in technology improvements (computer networks, videoconferencing) that can improve its productivity and in improving its work processes through total quality management (TQM).
Evaluation and Evaluators at the GAO
The GAO defines evaluation broadly. The term is used to describe a range of activities. Depending on the context, it can be synonymous with audit, review, or policy analysis. Methodologies and techniques from a variety of disciplines are often brought to bear on an assignment, and, because of the agency's history in the accounting tradition, the results of the work must meet traditional government auditing and accounting standards as well as the standards of other professional disciplines.
Most staff responsible for carrying out reviews of government programs are called evaluators, and they are expected to demonstrate strong generic skills in project planning, data gathering and analysis, written and oral communication, and interpersonal communications and management areas. However, they also need specialized expertise.
The GAO recruits staff from an array of the professional disciplines found in colleges and universities throughout the country. In the 1950s and 1960s, the GAO recruited almost exclusively from the accounting profession. In recent years, it has greatly diversified its hiring practices, hiring master's degree and doctoral-level graduates in economics, the social sciences, public policy, public administration, and business administration. It continues to recruit accountants but generally at the bachelor's level. Increasingly, GAO staff are maintaining their professional identities (for example, as an economist) after entering the agency. Nevertheless, the requirements of the work necessitate that members of each discipline become familiar with the terminology and methods of other disciplines.
Although disciplinary diversity is clearly an important part of the GAO's institutional capability, it is equally important that there be a common set of core values and working procedures that overlay the variety of disciplinary perspectives. It is essential to have a common understanding of the GAO's mission, of what is meant by the expression "quality work," of the GAO's expectations for the way in which work will be carried out, of the way in which the executive and legislative branches operate and intersect, and of the way in which the GAO will ultimately meet the information needs of Congress and provide value to the taxpayer. Meeting all these needs while developing specific technical and computer skills poses a large challenge for the GAO's training programs. Training serves as a vehicle for establishing values, teaching agency procedures, and understanding the broad context of evaluation work.
Training at the GAO
In 1988, the GAO established us own Training Institute Training and educa-
tion responsibilities were consolidated and separated from career counseling.
personnel, and orgamzauonal developmeni suppon The intent was to high-
light the importance that the agency places on training and professional devel-
opment Investing in staff development was deemed critical to meeting ihe
mformalion needs of Congress.
In comparison with other federal agencies, the GAO makes a relatively large investment in training opportunities for its staff. The physical plant includes two major training centers in Washington, D.C., that have a total of seventeen classrooms. Each regional office also has space and equipment for training activities. Completion of a nationwide videoconferencing capability this year will permit training to be provided simultaneously in headquarters and regional offices.
GAO evaluators averaged seventy-four hours of continuing education in 1992. Two-thirds of that training was delivered by the Training Institute. The institute has a roster of 210 active courses and offers more than a thousand classes a year (Training Institute, 1991). Evaluators have also been given significant resources, both centrally and within organizational units, to participate in professional development activities outside the GAO. The agency's origin in accounting contributes directly to this emphasis on training by imposing a continuing education requirement on all evaluators: To continue to be deemed qualified to do the GAO's audit and evaluation work, every evaluator must obtain a minimum of eighty hours of training every two years.
Teaching Evaluation in a Work Setting
The environment of the GAO makes demands and imposes constraints on the design of training programs that are quite different from the forces that operate in academic settings. Our students are adults ranging from recent graduates of graduate-level programs to experienced evaluators who have broad evaluation and management experience and mature auditors nearing retirement. Almost all these students work full-time and expect training to be directly relevant to what they will do on the job the very next week.
Work schedules and geographic dispersion require that training be delivered in intensive segments. A typical Training Institute course consists of two
-------
to four successive eight-hour days of training. This pattern permits a short period of full-time training (about all that is manageable given the press of ongoing work), and it gives regional participants reasonable travel time. This concentrated, full-time training schedule and the nature and expectations of the Training Institute's students—evaluator staff—heavily influence training methods. Most courses are a mixture of lecture, case studies, and opportunities for practical application of skills through role playing or demonstration, and most courses make extensive use of materials taken directly from GAO work. When possible, we give training participants opportunities to use material from their current assignments in class exercises (for example, by using real data from an ongoing assignment when they practice writing testimony).
In part as a means of focusing training on skills and activities directly related to the work, instructors are heavily drawn from line staff and managers. Many of our senior executives regularly act as course instructors. This pattern of training delivery requires development of course frameworks and instructional materials that are easy for multiple instructors to use. We also train our instructors to teach effectively. We are more likely to use external than internal instructors for courses in such things as statistics, writing, computer software, and generic management topics.
Major Areas of Emphasis
Overall, the GAO's formal training program for evaluators has six areas of emphasis: agency mission and policies, assignment planning and execution, communication skills and strategies, computers and information technology, workplace relations and management, and issue area expertise. With the exception of issue area training, the courses in each area have been designated as required, core, or elective and determined to be appropriate for staff, senior staff, management, and/or executive levels.
All evaluators must take the required courses, which contain information that the agency believes is necessary for all persons regardless of prior education or work experience. Core courses contain material with which all evaluators should be familiar, but the agency recognizes that individuals may excuse themselves from specific courses if they have previously mastered the material. Elective courses can be selected to fill specific needs, depending on the type of work in which the individual is currently engaged. Courses at the staff and senior staff levels are concentrated in technical areas. Courses at the management and executive levels emphasize management. The training provided for the upper levels often relies on external opportunities for continuing education, such as professional conferences. Evaluators move through a structured set of courses as they progress in their careers.
The GAO's curriculum structure was developed in collaboration with an advisory committee of managers drawn from the agency's divisions and offices.
Specific courses have been developed over the past four years, and only in the past year can the GAO be said to have fully implemented its evaluator curriculum. The six sections that follow describe each of the major substantive areas.
Mission, Policies, and Individual Responsibilities. All new staff members are required to attend an initial orientation course that describes the GAO's history, its mission, its role in supporting congressional decision making, ethics guidelines, and the policies and procedures for the conduct of evaluation assignments. Subsequent required courses that evaluators take in the next few years elaborate on standards for work, internal control issues concerning the quality of acceptable evidence, and processes for ensuring accuracy in the agency's reports.
As staff move up the career ladder after each promotion, they are invited to attend so-called promotion programs that lay out the organization's expectations for their new roles. The discussions in these programs are structured both around people issues—interpersonal communication, supervision, performance feedback—and around planning and reporting issues—for example, what it means to have responsibility for directing an evaluation or managing the work of multiple teams of evaluators. At mid levels, these programs can include special topics, such as information on a manager's equal employment opportunity responsibilities.
Assignment Planning and Execution. Each audit or evaluation at the GAO is referred to as an assignment, and the skills necessary to design and manage an assignment are a crucial part of the GAO's internal training program for newly hired evaluators. The goal here is twofold: to create an awareness of other professional disciplines and to build specific skills.
The overall intent of this part of the curriculum is to foster an awareness of the wide range of work that the agency does and of the need to apply appropriate methodologies when the work is done. All entering staff are required to attend a workshop on the selection of an approach and methodology. The workshop provides guidance on how to take an area of congressional concern and develop focused questions that can be answered within the constraints imposed by resources and time. Workshop participants then analyze these questions to determine how they can most appropriately be answered. Staff then take core methods courses on compliance auditing, economy and efficiency reviews, program evaluation, and policy analysis. Follow-on courses on such topics as procurement and contract processes, financial management, budgeting processes, fraud awareness, and special issues in economics are available. The goals are for all individuals—staff and managers—to become comfortable with a variety of types of work and to be able to work effectively in a multidisciplinary environment.
To meet the skill-building goal, the institute offers courses on such topics as sampling, questionnaire design and structured interviewing, applied statistics (basic classes and elective classes on advanced topics, such as log-linear
-------
modeling and time series analysis), and qualitative methods. At entry, staff are provided with self-paced training materials on organizing their documentation (work papers) for the evidence and analysis that support an audit or evaluation.
Communication. Although a GAO evaluation team may conduct an excellent study, the value of the study will be weakened significantly if it is not communicated effectively in published reports and oral briefings. For this reason, the GAO's curriculum gives evaluators a series of courses reflecting the latest research on written and oral communication and on cognitive psychology. For example, instructors may discuss readability principles and factors that increase the retention of information read.
Writing courses at the entry level clarify the GAO's basic communications policy and differentiate between academic writing and workplace writing. The focus is on producing an institutional—not an individual—product. In class, evaluators work on skills that they need in order to write GAO documents—for example, analyzing the writing situation, writing collaboratively, recognizing the difference between writer-based and reader-based documents, assessing the readability of their own documents, and using review comments to improve documents.
Class exercises for senior staff show how writing and thinking are inextricably linked and how the structure of a written report can affect its interpretation. Training participants also practice constructing a succinct message out of masses of data. Using data from a case study, evaluators develop report issues, prepare for a message conference (a meeting in which all evaluation team members, advisers, and managers discuss the evaluation results and agree on their interpretation), and conduct a simulated message conference. Message conferences are stressed because they improve the quality and timeliness of the documents produced and reduce unnecessary rework. A course for managers called Managing Writing reviews the writing principles embodied in the curriculum, suggests strategies for managing the writing process, and presents ideas about the role of oral and written communication in public policy processes.
The writing curriculum also includes specialized courses that help staff to write specific kinds of documents, such as an executive summary or written testimony for oral presentation. These two kinds of writing are emphasized because both are highly visible statements of the GAO's work. Both kinds of written summaries receive close scrutiny, and many readers may never read the full evaluation report. During the testimony course, evaluators develop testimony by following guidelines for effective congressional presentations. They practice testimony-writing skills and receive constructive feedback. The course also includes discussions with executives who excel in the delivery of written testimony.
Oral communication is equally important, because much of the agency's work is conveyed through briefings and testimony. Training on oral presentation skills begins during the first three months after a new evaluator starts work, and it seeks to improve his or her interviewing and briefing skills. The follow-on course is dedicated to honing presentation skills; it makes use of videotaping and feedback. Electives are available at a more advanced level to improve presentation skills and learn how to conduct meetings effectively. At the most advanced levels, managers and executives can take hands-on courses involving practice in communicating effectively with the media and delivering oral testimony to Congress. Figure 6.1 shows how the content of the communications courses varies by position.
Computer Use. As a large organization, the GAO uses several software packages that must be supported with technical assistance and training. Several years of experience have taught us that users prefer that course material delivered in the classroom be very brief. Material is often delivered in modules—for example, WordPerfect sort features, WordPerfect text columns, Lotus 1-2-3 data tables. Users can enroll in the course most suited to their immediate needs. Forty courses are available (Training Institute, 1991). They relate to word processing applications, spreadsheets, and database management systems. Additional training is available on microcomputers; data analysis packages, such as SAS and SPSS; computer communications; and support for local area networks.
As the agency is moving to design and implement software applications for organizing, sharing, and accessing working papers and databases, the institute is designing training to support their use. Steps are also being taken to revise existing training in ways that recognize how these applications and the use of local area networks can change the ways in which work gets done.
Figure 6.1. Communications Courses in the GAO's Evaluator Curriculum
[Figure: a chart of communications courses arrayed by position level; the legible labels include executive-level offerings such as an executive summary workshop and courses on writing and delivering testimony.]
Note: Introductory Evaluator Training, the two-week orientation, includes modules on writing and oral briefing skills.
-------
Workplace Relations and Management. Training in the area of workplace relations and management contains material that traditionally has been classified as both soft and hard skills. As one might expect, the institute offers classes on time management, relations with Congress, and management of one's issue area, that is, a body of work in an area like transportation or energy. These kinds of courses focus on planning and coordination processes. Supervision and performance management seminars are also available.
Courses on interpersonal relations in the workplace, mediation, diversity in the workplace, and advanced communications and negotiations are newer additions to the GAO's curriculum. As the GAO has become increasingly involved with total quality management (TQM), the emphasis on interpersonal communication, teamwork, and management skills has increased. Recent hires have an opportunity to enroll in such courses as Workplace Relations and Communications. Mid-level managers take Managing Quality Improvement, and top-level executives learn about the role and responsibilities of quality councils. Five-day courses prepare leaders of problem-solving teams to use appropriate tools and techniques, teach fellow team members, and be aware of how to foster positive group dynamics. Additional training is expected to be added in this area as the GAO advances in its implementation of TQM.
This area also contains courses to build skills and heighten awareness and knowledge about key supervisory and management responsibilities. All supervisory staff recently participated in workshops on ways of preventing sexual harassment in the work environment, and a similar course is now available to nonsupervisory staff. All supervisors and managers have already received training on their equal employment opportunity (EEO) responsibilities. To maintain this awareness, the GAO automatically enrolls newly promoted staff in the EEO workshop. And in recognition of the increasing diversity of its work force and of the need to have a work environment that makes all staff feel welcome and valued, the institute has started to provide workshops on the valuing of diversity.
Issue Area Training. As noted earlier, the GAO has thirty-six issue areas covering work in areas as wide-ranging as national security policy and national resource management. Generally, the Training Institute has neither the expertise nor the resources needed to develop issue area subject matter training, so in most cases issue area groups pursue their own strategies to develop and maintain staff proficiency. These strategies can include inviting subject matter experts to give informal talks, using consultants on specific projects, and holding planning conferences with invited participants from government agencies and academic and other relevant groups, including businesses, professional associations, and think tanks.
However, major training initiatives have been supported internally in two key issue areas. In the financial management area, the institute worked closely with the GAO's accounting and financial management experts to develop a financial auditing curriculum. And in the information management technology area, the institute has offered courses on such topics as computer security,
telecommunications, and systems development. Several master's-level courses leading to a certificate in information systems from the George Washington University have been offered at the GAO's training center on a regular basis at the end of the workday.
Self-Paced Training
Besides classroom courses, the institute provides a variety of self-paced courses in interactive multimedia, audio, video, and print formats. Many of these course offerings are related to computer software packages, but others cover such topics as management and supervision, human resource management, writing, and administrative support activities. In most cases, courses can be mailed to the work site and used on the individual's own computer. When special equipment is needed or licensing restrictions make widespread distribution impractical, individuals can sign up to take such courses in the institute's learning center. Acceptance of this type of course delivery has grown: Self-paced hours increased by two-thirds in the past year. More than 1,300 evaluators enrolled in self-paced courses in fiscal year 1993, and there have been about 800 course completions to date.
Lessons Learned
Developing and delivering the GAO's evaluator curriculum has been a continuous learning process for everyone involved. Several valuable lessons have been learned that may be useful to others who have responsibility for training functions in similar contexts.
First, job relevance is critical. Training developers and instructors need to be able to assess training needs accurately, design effective learning experiences, and demonstrate to training participants that the training material is directly relevant to their work. Training is not an end in itself; it must be designed to support the effectiveness of the individual and the organization. One technique for enhancing the relevance of training is to design courses that use real case studies or have the students bring ongoing work to class and apply the material to it. This means that instructors have to be adaptable and quite proficient in their area of expertise.
Second, the involvement of line managers and staff increases the credibility and quality of training. The GAO's curriculum was developed under the guidance of a management advisory committee, and all decisions regarding mandated, core, and elective courses were made by this committee. Housing decision-making authority in the "line" makes ownership of the curriculum greater than if the training department were to make all the decisions. Executives, managers, and senior staff also often serve as instructors or presenters on panels or contribute course material, thereby endorsing the value of training and building commitment.
-------
Third, training needs to deliver consistent messages at all levels. The GAO's curriculum structure was intended to be an integrated one, with similar concepts and skills in courses at staff, senior staff, and management levels. We are still completing this process, and we hope by next year to have parallel courses running at all levels, with the upper-level courses incorporating a managerial perspective. This consistency will ensure that managers and staff are familiar with the same terminology, methodologies, and guidance. We also plan to increase the amount of training that we deliver to intact work groups and reduce the number of open enrollment classes. We believe that training the members of work groups together makes it more likely that the concepts taught in class will be reinforced and used on the job. Work unit-based training is also expected to have a direct effect on improving teamwork and intraunit communication—important goals in an organization focused on quality. We plan to test these assumptions by conducting follow-up evaluations for selected courses.
Finally, in the spirit of quality management, we strive to assess the effectiveness of our training efforts continuously, and we revisit our delivery strategies. The challenges that the GAO faces change constantly, the needs of our work force change, and technological advances create new training demands.
References
Budget and Accounting Act of 1921. (P.L. 67-13, 42 Stat. 20, codified as amended at 31 U.S.C. 712.)
Training Institute, U.S. General Accounting Office. Training and Education 1992-1993 Catalog. Washington, D.C.: Training Institute, U.S. General Accounting Office, 1991.
NANCY KINGSBURY is director for federal human resource management issues in the General Government Division of the U.S. General Accounting Office.
TERRY E. HEDRICK is director of the Training Institute at the U.S. General Accounting Office.
The procedures used to collect information for the Directory of Evaluation Training Programs are described, and the results of the survey are discussed. Tables list the programs identified in the United States, Canada, and Australia.
The 1994 Directory of Evaluation Training Programs
James W. Altschuld, Molly Engle, Carol Cullen, Inyoung Kim, Barbara Rae Macce
The American Evaluation Association (AEA) has periodically published a directory of evaluation training programs in the United States and Canada (May, Fleischer, Schreier, and Cox, 1986; Conner, Clay, and Hill, 1980; Gephart and Potter, 1976). May, Fleischer, Schreier, and Cox (1986) listed forty-six programs in six different types of settings. Programs located in education and psychology predominated. In late 1992, the board of the American Evaluation Association commissioned a study to provide a current listing and description of evaluation training programs. This chapter describes the study methodology, reports some of the study's findings, and tabulates the information about evaluation training programs.
Methodology
Sample. Since 1986, numerous changes have occurred in the field of evaluation that could affect the nature of training programs. Among them are changes in the methods that evaluators use and improved understandings of the relationship between evaluation and policy development. As a result, new programs have emerged, others have changed or altered their content and structure, and still others have ceased to exist.
The initial and perhaps most challenging task for the study team was to develop a comprehensive sampling frame of potential candidates. We used the following process to develop the sampling frame. First, we placed an announcement in the call for papers for the AEA's 1992 annual conference. The announcement asked AEA members to nominate candidates for the study.
-------
American Evaluation Association Annual
Meetings
AEA holds its annual conference during the first week of November each year.
Topics of current interest are discussed in sessions proposed by members, as well
as in sessions presented by invited speakers. In addition, a computer-assisted Job
Bank is provided at the annual conference. Professional awards for outstanding
contributions to the field of evaluation are presented in the areas of: service to the
profession, evaluation theory, evaluation practice, and service to AEA. Preceding
the conference, over 25 training sessions are offered. Past training session topics
include: increasing evaluation use, cost-benefit analysis, reporting and debriefing,
applying professional standards, statistical analysis software, focus groups,
qualitative evaluation, and secondary analysis of available data.
Future AEA Annual Conferences
Evaluation 2001
Dates: November 7-10 in St. Louis, MO
Hotel: Millennium
Call for Proposals in Mail: By January 7, 2001
Proposals Due: March 16, 2001
Notifications of Proposal Status: By July 1, 2001
Registration Materials Available: By July 1, 2001
Evaluation 2002
Dates: November 6-9 in Washington, DC
Hotel: Hyatt-Crystal City
Evaluation 2003
Dates: November 5-8 in Reno, NV
Hotel: Nugget
Other Evaluation-Related Meetings or
Conferences
• List of Events
Archives from Past AEA Conferences
-------
Links of Interest to Evaluators
Here are some sites that you may find useful. If you know of other sites that might
be of interest to our members and others involved in evaluation, please send your
suggestions to AEA manager Susan Kistler at aea@kistcon.com.
AEA Topical Interest Groups
TIG on Alcohol, Drug Abuse, and Mental Health
TIG on Assessment in Higher Education
TIG on Business & Industry
TIG on Collaborative, Participatory & Empowerment Evaluation
TIG on Extension Evaluation Education
TIG for Graduate Student Association
TIG on International and Cross Cultural Evaluation
TIG on Minority Interests in Evaluation
TIG on Program Theory and Theory-Driven Evaluation
TIG on Research Technology and Development Evaluation
TIG on Teaching of Evaluation
AEA Local Affiliates
Arizona Evaluation Network (azENet)
Eastern Evaluation Research Society
Ohio Evaluator's Group
Oregon Program Evaluators Network
Southeast Evaluation Association
Washington Evaluators
Western Pennsylvania Evaluator's Network (WPEN)
Other National Evaluation Associations
-------
Australasian Evaluation Society
Canadian Evaluation Society
European Evaluation Society
French Evaluation Society
German Evaluation Society (German language)
Ghana Evaluators Association
Italian Evaluation Society
Malaysian Evaluation Society
Monitoring & Evaluation in Latin America and the Caribbean
Nigerian Network of Monitoring & Evaluation
Sri Lankan Evaluation
Swiss Evaluation Society
UK Evaluation Society
Walloon Evaluation Society
Other Associations of Potential Interest to Evaluators
• American Society for Public Administration
General Evaluation Sites
The American Educational Research Association
Applied Survey Research
Centre for Program Evaluation, The University of Melbourne
CRESST Home Page (Center for Research on Evaluation, Standards, and
Student Testing)
Educational Research Methods (follow link then click on textbook)
ERIC Clearinghouse on Assessment and Evaluation
Evaluation Associates Ltd.
The Evaluation Clearinghouse
The Evaluators' Institute
Harvard Family Research Project
The Joint Committee on Standards for Educational Evaluation
Lesley College Program Evaluation and Research Group
Literature on Programs
Monitoring and Evaluation News. A news service focusing on developments in monitoring and
evaluation methods relevant to development projects with social development objectives
(UK).
On-line Evaluation Resource Library. A resource of project evaluation tools (plans and
instruments) and reports used by the National Science Foundation's Directorate for Education
and Human Resources; topic areas focus on curriculum development, teacher education, and
faculty development, including minority group representation (US). URL: http://oerl.sri.com/
Performance Assessment Links in Science. An online resource of performance assessments
for students studying science in grades K-12; provides information on standards, tasks, and
rubrics for evaluative purposes (US)
Program for Public Sector Evaluation, Royal Melbourne Institute of Technology. This
interdisciplinary group focuses on public sector (e.g., program) evaluation; information is
provided on recent articles, coursework, and projects (AU)
-------
• …s on the Net. A site with many resources; information is organized for
general audiences, students, professionals, and researchers, with a "room" for chat and feedback
(CA)
• Gene Shackman's List of Free Evaluation Resources on the Web. Resources on methods in
evaluation and social research (US)
• Student Evaluation Case Competition Open during the CES annual meeting to students of all
levels and disciplines, this is an opportunity for small teams to compete in the analysis of an
evaluation case file available in English and French. The site includes archives with past
competition scenarios and winning entries. (CA)
• Bill Trochim's Center for Social Research Methods. Links for applied social research and
evaluation; look for the Knowledge Base (online textbook), statistical test selector, and the
simulation book (US)
• UK Evaluation Society Home page for a professional organization that promotes the use of
evaluation as a contribution to public knowledge (UK)
• UMASS Foundation Relations Responsible for managing the University of Massachusetts
relationships with private foundations, this office provides a wealth of information about
grants, philanthropy, and foundations (US)
• UNICEF Research and Evaluation A timely update of policy analyses, evaluations, and
research; links, statistical data, and newsletter archives can also be accessed (US)
• University of Wisconsin Program Development and Evaluation Full-text publications in PDF
format available for download; targeted to evaluators assessing extension programs, but
resources have general evaluation appeal, as well (US)
• Vanderbilt Center for Mental Health Policy. Research focuses on child, adolescent, and family
mental health; follow the links to current projects, including the homeless families study (US)
• Virtual Library Evaluation
• Western Michigan Evaluation Site, the Evaluation Center/Evaluation Support Services.
Information on evaluation checklists and instruments, terminology; resources also include a
directory of evaluators and related links (US)
• The World Bank Institute The evaluation unit analyzes the learning activities for World Bank
Institute clients and staff; be sure to check out the newsletter, too (US)
• The World Bank. Operations Evaluation Department. This independent division evaluates the
lending operations of the World Bank; online publications in different languages are made
available as a result (US)
Federal Sites and Databases
CIESIN's US Demography
CYFERNet (Children, Youth and Family Education and Research Network) Gopher
List of WWW Servers (USA - Federal Government)
National Institutes of Health Home Page
Substance Abuse and Mental Health Services Administration
US Census Bureau Home Page
Statistics
• American Statistical Association
• One-Stop Federal Statistics Site
• Statistical country profiles and global maps of indicators of interest to UNICEF
• Statistics Canada
-------
STANDARDS OF PRACTICE IN EVALUATION
Refer to:
Articles in Fitzpatrick, Jody L., and Michael Morris, eds. (1999). New Directions for Program
Evaluation, No. 82. San Francisco, CA: Jossey-Bass Publishers.
Fitzpatrick, Jody L. "Ethics in Disciplines and Professions Related to Evaluation"
Datta, Lois-ellin. "The Ethics of Evaluation Neutrality and Advocacy"
"The Program Evaluation Standards" - Joint Committee on Evaluation Standards
"Guiding Principles for Evaluators" - American Evaluation Association
Your copy of "Government Auditing Standards" (the Yellow Book) - GAO
Your copy of "Quality Standards for Inspections" (the Turquoise Book) - PCIE
1. Think about the differences between: ethics, standards, rules, principles, philosophy
Ethics:
Standards:
Rules:
Principles:
Philosophy:
-------
EDITORS' NOTES

JODY L. FITZPATRICK is associate professor of public administration at the University
of Colorado. She maintains an active practice in evaluation and is interested in the
ethical nuances of evaluator-client relations. She serves on the Board of the American
Evaluation Association and is working on a book of case studies for the association.

MICHAEL MORRIS is professor of psychology and director of graduate field training in
community psychology at the University of New Haven. He edits the column "Ethical
Challenges" in the American Journal of Evaluation.

… given to the study of ethical codes in evaluation-related consulting professions. She
examines the ethical codes within evaluation and related disciplines
and professions and discusses implications for content, dissemination,
and compliance.

Ethics in Disciplines and Professions Related to Evaluation

Jody L. Fitzpatrick
Donald Campbell (1969) bemoaned the traditional isolation, or ethnocentrism,
of different disciplines. Using the analogy of a fish, he presented a fish-scale
model of omniscience in depicting the social sciences and related disciplines.
Each discipline believes it knows the truth (the whole fish) when, in fact, we tend
to know only our own discipline (single scale). This chapter is designed to help
us avoid such ethnocentrism in ethical matters and instead, as Campbell argued
we should, learn from related fields.

Ethical Codes in Program Evaluation

To provide a foundation for this learning, I first briefly review the history of
ethical codes in evaluation. In the early 1980s two documents were developed
to guide evaluators in their ethical considerations: the Standards for Evaluations
of Educational Programs, Projects, and Materials (…
-------
… maintained the same four major groups of standards—utility, feasibility, propriety, and accuracy—but within these major categories some of the original thirty standards were combined and others were revised. Further, these newer standards, the Program Evaluation Standards, were intended to address evaluations beyond the educational arena, though much of the focus remains on education and training. Similarly, in 1995, the American Evaluation Association (AEA) published its new Guiding Principles for Evaluators. These guiding principles offered a set of values—for example, honesty, integrity, and responsibility for public welfare—as guides for evaluation practice. (In 1986, the Evaluation Research Society had merged with Evaluation Network to form the American Evaluation Association, a professional association to represent the entire profession in the United States. AEA chose not to adopt the old ERS standards, but rather to develop its own guiding principles.) Today, evaluators in the United States have these two documents to advise their ethical practice. Other countries (Canada), organizations (Government Accounting Office), and groups of countries (Australasian Evaluation Society) have developed their own standards. (See Worthen, Sanders, and Fitzpatrick [1997] for a discussion of these.)

What can we learn from these two documents and from the history of ethical codes for evaluation? Comparing the documents published in the early 1980s with those published in the mid-1990s reveals a major change: a move to include a greater focus on non-methodological issues. This change is most obvious in comparing the ERS standards and the AEA guiding principles. Table 1.1 lists the major headings for both. The ERS standards generally mirror the stages of an evaluation. In contrast, the AEA guiding principles are more concerned with qualities or principles that permeate the evaluation process. As I discuss further on, the nature of the guiding principles is more congruent with the ethical codes of other professional associations. That is, the articulation of values, as opposed to stages of tasks, is a more common strategy in other ethical codes. Perhaps more important to the history of evaluation, this change mirrors the move in the education and training of evaluators from a very strong focus on methodological issues (which certainly remains the sine qua non of evaluation) to a greater examination of the many political factors and personal judgments entailed in conducting evaluations.
Table 1.1. ERS Standards versus AEA Principles: Major Headings

ERS Standards:
• Formulation and negotiation
• Structure and design
• Data collection and preparation
• Data analysis and interpretation
• Communication and disclosure

AEA Guiding Principles:
• Systematic inquiry
• Competence
• Integrity/honesty
• Respect for people
• Responsibilities for general and public welfare
This change has been positively noted in several of the commentaries on the AEA Guiding Principles, with special reference to the principle concerning responsibilities for the general and public welfare (E) (Covert, 1995; House, 1995).

Though the headings in Table 1.1 reflect changes in the tone and emphasis in evaluation ethics from the 1980s to the 1990s, the difference should not be overstated. As one might expect, the overlap between the entire body of ERS Standards and the AEA Guiding Principles is great; most of the topics and content covered in the first are reflected in the second. The converse is also true, even in controversial areas. The ERS Standards thus recommended identifying various groups of stakeholders and their "information needs and expectations" (ERS Standards Committee, 1982, p. 12). They even argued that "evaluators should also help identify areas of public interest in the program" (ERS Standards Committee, 1982, p. 12). But the tone is different. The AEA Guiding Principles emphasize the diverse groups we serve, or might serve, and our obligation to be inclusive in ensuring those groups are represented. Finally, the language used in major headings is important. One goal of professional codes is to inspire ethical behavior among its members. Lofty language can help in that regard. As such, the major categories for the AEA Guiding Principles, as with the Joint Committee Standards, are more inspirational than the step-by-step emphasis of the earlier ERS Standards.
The Social Sciences and Evaluation Codes

The ethical codes discussed above have been strongly influenced by ethics concerning social science research in specific disciplines. This influence can be seen in the initial impetus for the codes and in their process of development. The original Standards (1981) were a spinoff from the revision of the Standards for Educational and Psychological Tests and Manuals by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education. The twelve sponsoring organizations for the 1994 version continue to represent these areas, but the validation panel for the newer Joint Committee Standards also included broader audiences to represent adult training in many areas. Nevertheless, the focus remained on education and training. The development of the AEA Guiding Principles was initiated by reviewing the ethical codes in psychology (APA), education (AERA), and anthropology (American Anthropology Association, AAA). The committee also reviewed other codes dealing with research, including the federal regulations on Protection of Human Subjects and the Belmont Report on biomedical and behavioral research (Shadish, Newman, Scheirer, and Wye, 1995).

These documents, which focus primarily on research, are certainly pertinent to the ethical principles of program evaluators. They provide important guidelines concerning the design of research and the ethical concerns entailed when one collects data from people. However, the almost exclusive focus on the social sciences fails to inform us of the ethical conflicts faced by professions, such as evaluation, that work directly with clients. As I have
-------
argued elsewhere, because the graduate training of most evaluators is in the social sciences, we tend to use these disciplines as exemplars and neglect professions that are similar to our own (Fitzpatrick, 1994). As the AEA Guiding Principles introduce ethical issues concerning relationships with clients and the public, balancing of stakeholder needs, and the values involved in these interactions, the ethical codes of other professions that struggle with conflicts among clients, other stakeholders, the public welfare, and the values of their discipline become important learning tools. In fact, the Joint Committee has found some procedures from the accounting profession to be useful in developing their standards for evaluation (Sanders, 1999).

The AEA Guiding Principles were initiated to stimulate dialogue among program evaluators on how we deal with ethical dilemmas. Yet, that dialogue has not progressed as much as many might have desired. Some might argue that the absence of dialogue is due to the generality of the principles (Rossi, 1995). House writes that the "endorsement of general principles sometimes seems platitudinous or irrelevant" (1995, p. 27). However, he goes on to encourage the dialogue, observing that, "Ethical concerns become interesting only in conflicted cases, and it is often the balance of principles that is crucial rather than the principles themselves" (House, 1995, p. 27). Examining the codes, cases, and procedures of professions confronting similar conflicts can be fruitful in further stimulating this dialogue.
Consulting versus Scholarly Professions

Bayles (1981), in writing about professional ethics as a broad subject, makes an important distinction between consulting professionals and scholarly professionals. Admitting the terms represent a continuum and the middle can become murky, Bayles writes that consulting professionals differ from scholarly professionals in two important ways: they establish personal, working relationships with their clients, and their method of reimbursement is typically fee-for-service. In contrast, scholarly professionals generally deal with clients at a distance (students in a class, readers of a journal) and are salaried. Consulting professionals work as "entrepreneurs" and, as such, "depend on attracting individual clients" (Bayles, 1981, p. 9). Consulting professionals include lawyers, physicians, architects, consulting engineers, accountants, and psychologists. Scholarly professionals include teachers, professors, and scientific researchers. As program evaluators who use research methods, we may fall in that murky middle, but I would argue that we are more akin to consulting engineers or accountants in our relationships with clients than we are to our social science brethren.

The different economic and personal relationships with clients, Bayles argues, "are crucial in defining the kinds of ethical problems each confronts" (1981, p. 9). The personal relationships that consulting professionals develop with their clients and the expectations engendered by clients' direct hiring and …
Because the relationship of the consulting professional is closer to the client than to other stakeholders, the professional must guard against bias toward, or overidentification with, the clients' views or needs. Further, because the professionals' ongoing livelihood depends on attracting and retaining clients, it can be against the professionals' self-interest (at least in the short term) to pursue ethical norms that conflict with the clients' self-perceived needs. The scholarly professional is not so buffeted by the pressures of individual clients' expectations or the exigencies of maintaining a practice. This distinction can be useful for evaluators in considering ethical codes. Certainly, for methodological issues, our ethical codes should build on those from the scholarly professions. But as evaluation ethics moves toward a focus on the values entailed in dealing with diverse stakeholders and balancing the public interest, we can also look to the consulting professions for guidance.
The Content of Various Professional Codes

Codes of professional groups vary considerably in their comprehensiveness, explicitness, and means of enforcement. Table 1.2 presents the principles of several professional associations, both scholarly and consulting. (These principles are referred to as standards, canons, and principles by the different groups, but they all represent the first level of values articulated in the code.) One first notices the commonality across principles in spite of the variation in the fields represented. Several (accounting, engineering, public administration) begin with a principle concerning the public service or public welfare.
Table 1.2. A Sample of Characteristics of Selected Professional Codes

Profession: Principles*

Accounting (AICPA): Responsibilities as professionals; serving the public interest; integrity; objectivity and independence; exercise due care; apply principles to scope and nature of services.

Internal auditors (IIA): Honesty, objectivity, diligence; loyalty; conflicts of interest; fees or gifts; confidentiality; due care to obtain sufficient factual evidence to support the expression of an opinion.

Professional engineers (NSPE): Safety, health, and welfare of the public; competent; objective and truthful; faithful agent of employer or client; avoid deceptive acts; conduct oneself honorably and responsibly to honor the profession.

Psychology (APA): Competence; integrity; professional and scientific responsibility; respect for people's rights and dignity; concern for others' welfare; social responsibility.

Public administration (ASPA): Serve public interest; respect constitution and law; integrity; promote ethical organizations; strive for professional excellence.

* The principles are listed in their order of presentation in the codes. In many cases this order may reflect the priorities of the disciplines.
-------
Psychology ends with a principle concerning social responsibilities. The prominence of attention to the public good in many of these codes may reflect the consulting professions' desire to emphasize explicitly and prominently the importance of audiences other than the direct client. Although the professional engineer's code includes a principle concerning serving as a faithful agent to a client or employer, their code, too, emphasizes, first and foremost, the obligation to the safety, health, and welfare of the public. The complete discussion of their canons, rules, and professional obligations stresses this priority. The American Institute of Certified Public Accountants (AICPA) expressly states, "In resolving those conflicts [between different audiences or stakeholders], members should act with integrity, guided by the precept that when members fulfill their responsibility to the public, clients' and employers' interests are best served" (Albrecht, 1992, p. 175). They define the public interest as "the collective well-being of the community of people and institutions the profession serves" (Albrecht, 1992, p. 175).

Of course, the role of accountants generally differs from that of program evaluators. But for many public accountants, their work in assessing a program and defining public interest might be quite similar to those of a program evaluator. (See Wisler [1996] for a discussion of the similarities and differences in the roles of auditors and evaluators.)
In contrast to these codes, which emphasize the professional's obligation to the general public, the AEA Guiding Principles stress the diversity of participants in the evaluation process and the need to recognize these differences, consider the interests of all groups, and provide results in such a way that they are accessible to all. The emphasis is on the heterogeneity of stakeholders, not the homogeneity. But the principles close with an exhortation to "encompass the public interest and good," which, they acknowledge, "are rarely the same as the interests of any particular group." This latter admonition more closely mirrors the codes of other associations. The committee, however, noted struggling with this issue, and as they acknowledge, further discussion and interpolation are needed to apply this principle effectively (Shadish and others, 1995). These other codes might provide some effective guidance.

As a frame for analysis, Bayles (1981) has identified six standards of a good or trustworthy professional: honesty, candor, competence, diligence, loyalty, and discretion. Most of these standards can be seen in the professional codes listed in Table 1.2 and in the AEA Guiding Principles shown in Table 1.1. Honesty is addressed in varying ways. It heads the list for internal auditors. Accountants and engineers stress "objectivity"; accountants add "independence" and evaluators add "integrity." Bayles sees "candor" as going beyond honesty to include full disclosure. The code of professional engineers addresses candor, for example, by articulating a professional's obligations to acknowledge errors to clients and to advise clients when a project will not be successful.

Similarly, competence is addressed in each of the professional codes, but under different words. Only the code of professional engineers, like the AEA Guiding Principles, directly uses the word "competence." Accounting and internal auditing emphasize "due care," which incorporates both diligence and
competence. Psychology and public administration articulate principles concerning professional responsibilities and professional excellence.

Loyalty is concerned largely with the conflict between obligations to clients and responsibilities to others, including the profession. The conflicts faced by professionals in this area are partly addressed by the principles concerning public and professional responsibilities. Many professional codes deal extensively with loyalty. Several of these codes address conflicts of interest and independence of judgment. These issues are subsumed under loyalty because the client has an expectation that the professional they hire has revealed any potential conflicts of interest that would hinder their completing the work fairly and will, in fact, be able to provide an independent judgment on the issue of concern. Guiding Principle E.4 states the need to "maintain a balance between client needs and other needs." The evaluator is urged to "meet legitimate client needs whenever it is feasible and appropriate to do so," but the principle notes that when client interests conflict with other principles, "evaluators should explicitly identify and discuss the conflicts with the client and relevant stakeholders" (American Evaluation Association, 1995, p. 25).

Discretion is addressed less directly by most professional codes. The AEA Guiding Principles, as with the code of ethics for psychology, address the issue of confidentiality under "respect for people." But the implication is that these people are participants in the evaluation, not the agency or client. What obligation does the evaluator have to the client in regard to confidentiality and discretion? Guiding Principle E.3 advocates broad dissemination of findings. Under what circumstances is such dissemination unethical? An accounting ethics case asks readers whether an accounting professor should use materials from an outside project in the classroom (Albrecht, 1992). If so, should the identity of the firm be disclosed? Although laws on public records may cast an evaluation report on a public program in a different light, what ethical obligations for discretion does an evaluator have? What loyalty does the evaluator have to the client? These are issues that should be discussed, further building on the similarities and the distinctions between our profession and those in related fields.
Enforcement of Ethical Codes

Historically, most consulting professions have been self-regulating. As professions often come under some criticism for their failure to regulate, professional associations have established mechanisms for enforcement of the codes that the evaluation profession currently lacks. The American Psychological Association, many of whose members are practicing psychologists, the American Bar Association (ABA), the American Medical Association (AMA), the American Institute of Certified Public Accountants (AICPA), and the National Association of Social Workers (NASW) all have enforcement bodies. These committees answer questions, hear complaints, and issue disciplinary decisions and sanctions as appropriate. Their hearings and decisions build case law for the interpretation of the ethical codes.
-------
In contrast, enforcement mechanisms are typically absent from the professional associations of scholarly professions. For example, the American Educational Research Association (AERA), the American Anthropology Association (AAA), and even the American Society for Public Administration (ASPA) develop and disseminate their ethical codes but have not developed official mechanisms for enforcement. Plant (1998) discusses the reasons for the absence of external enforcement mechanisms for the ASPA code, drawing on extensive writings in public administration concerning ways to create ethical behavior. Some argue against even the codification of professional ethics, maintaining that practitioners should be their own moral reasoners (Rohr, 1978). ASPA, however, argues that codes are necessary to socialize and educate the practitioner about common standards, but that enforcement at the individual level is more appropriate than central enforcement. Organizations such as ASPA appear to believe that the development of "inner controls" will be more successful at engendering ethical behavior among members than the "external management of conduct" (Plant, 1998, p. 165).

The AEA Guiding Principles may not include enforcement mechanisms because of a belief in the success of inner controls; however, the more likely reason for their absence may be the need to reach consensus on the meaning and application of the principles, the continuing tensions among the diverse paradigms used in evaluation, and the relative newness of the evaluation profession. All of these factors create difficulties in developing and implementing enforcement mechanisms. As evaluation matures as a field and greater consensus is achieved on the appropriate methods and actions of evaluators, development of enforcement mechanisms may be reconsidered. They seem more appropriate to the self-regulating role of the consulting professions.

If AEA continues to use internal mechanisms to motivate ethical behavior among members, however, revisions of the code may consider the use of language to better achieve that goal. The style of ASPA's Code of Ethics is consistent with the purpose of instilling internal controls. The code is short; it could fit on one legal-size page. It articulates five broad principles with four to eight brief points explicating each. The words and language used in this code are designed to inspire and are less legalistic than the codes of the professional agencies that include enforcement. The AEA Guiding Principles make use of this format (principles with brief points), but the tone of the language, as noted by the authors, is more legalistic than the ASPA codes.
In the absence of formal committees delegated with enforcement powers, other means of educating members and enforcing codes of ethics do exist and must be used to encourage understanding and compliance. The National Society of Professional Engineers, which does not have an official enforcement body, uses its Board of Ethical Review to interpret ethical dilemmas submitted by engineers, public officials, and members of the public. They publish these cases on-line and in print with an index, sponsor an annual ethics contest in which members respond to a case, and disseminate a series of videotapes for … describing the code and examining five case studies. Congressional criticism of accountants associated with the savings and loan fiasco in the 1980s stimulated these actions; the profession was aroused to attend to ethics and its public image (Mintz, 1992).
The Future for Our Ethical Codes

Compared to the disciplines and professions reviewed here, program evaluation is quite new. The accounting profession in the United States celebrated its centennial a few years ago (Mintz, 1992). Physicians, engineers, and lawyers have been defining their professions and tinkering with their ethical codes for even longer. It is therefore not surprising that the ethical codes for program evaluators are less well formed; their state reflects the state of our field. However, we can learn from the codes reviewed here. A most immediate issue, which does not require consensus but instead action, is to continue to expand the dialogue that the AEA Guiding Principles were intended to create. We need more discussion of cases through our publications and conferences to argue and interpret the meaning of various principles. The publication of a new series in The American Journal of Evaluation on ethical challenges is a first step in that direction (Morris, 1998). EvalTalk and focus groups at the annual conference can be used to further articulate the meaning of various principles. I am currently working on a casebook for the American Evaluation Association, for which I will draw upon styles used in other professions. Current discussions of certification or licensure are pertinent to our ethical codes. Certification and licensure, as with accreditation, provide a mechanism for ensuring that professionals are informed of and concerned with our ethical codes. Finally, consideration might be given to linking membership in the American Evaluation Association with the AEA Guiding Principles. The Joint Committee Standards for Program Evaluation, AEA's Guiding Principles for Evaluators, and codes from other disciplines have provided us with food for thought. Now we must continue to discuss and articulate what it means to be an ethical evaluator.
References
Albrecht, W. S. Ethical Issues in the Practice of Accounting. Cincinnati: South-Western, 1992.
American Evaluation Association. "Guiding Principles for Evaluators." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Bayles, M. D. Professional Ethics. Belmont, Calif.: Wadsworth, 1981.
Campbell, D. "Ethnocentrism of Disciplines and the Fish Scale Model of Omniscience." In M. Sherif and C. Sherif (eds.), Interdisciplinary Relationships in the Social Sciences. Chicago: Aldine, 1969.
-------
Covert, R. W. "A Twenty-Year Veteran's Reflections on the Guiding Principles for Evaluators." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Evaluation Research Society (ERS) Standards Committee. "Evaluation Research Society Standards for Program Evaluation." In P. Rossi (ed.), Standards for Evaluation Practice. New Directions for Program Evaluation, no. 15. San Francisco: Jossey-Bass, 1982.
Fitzpatrick, J. L. "Alternative Models for the Structuring of Professional Preparation Programs." In J. W. Altschuld and M. Engle (eds.), The Preparation of Professional Evaluators: Issues, Perspectives, and Programs. New Directions for Program Evaluation, no. 62. San Francisco: Jossey-Bass, 1994.
House, E. R. "Principled Evaluation: A Critique of the AEA Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Joint Committee on Standards for Educational Evaluation. Standards for Evaluations of Educational Programs, Projects, and Materials. New York: McGraw-Hill, 1981.
Joint Committee on Standards for Educational Evaluation. The Program Evaluation Standards. Thousand Oaks, Calif.: Sage, 1994.
… Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Sanders, J. Personal communication, January 1999.
Shadish, W. R., Newman, D. L., Scheirer, M. A., and Wye, C. "Developing the Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Stufflebeam, D. L. "A Next Step: Discussion to Consider Unifying the ERS and Joint Committee Standards." In P. H. Rossi (ed.), Standards for Evaluation Practice. New Directions for Program Evaluation, no. 15. San Francisco: Jossey-Bass, 1982.
Although empirical research on evaluation ethics is not plentiful,
several important findings have emerged. These include a
lack of consensus within the field concerning what constitutes an
ethical issue, the frequent occurrence of ethical problems during the
later stages of evaluation projects, and the perceived ethical
significance of the tendency for evaluators to be more responsive to
some stakeholders than others. The author discusses the need to
incorporate research questions on ethics into ongoing evaluation
projects, and to invest in systematically evaluating … the
Guiding Principles.

Research on Evaluation Ethics: What Have We Learned and Why Is It Important?

Michael Morris
Nearly twenty years ago Sheinfeld and Lord (1981) noted that "empirical studies of the ethical concerns of evaluation researchers are few" (p. 380). What was true then is only slightly less true today. Indeed, at a recent session devoted to "What Should We Be Researching in Evaluation Ethics?" (Morris, 1997) at the American Evaluation Association's (AEA's) annual meeting, the panelists outnumbered the audience! Whatever else ethical issues may be, they do not appear to have attracted the attention of a large segment of the research community in evaluation.

This is not to say, of course, that there has been virtual silence on the subject beyond the Guiding Principles for Evaluators and the Joint Committee's Program Evaluation Standards. Analyses of ethical concerns, frequently based on the personal experiences of the authors, are relatively easy to find (see English, 1997; Gensheimer, Ayers, and Roosa, 1991; Schwandt, 1997; Stake and Mabry, 1998). Far fewer, however, are cases in which the authors have gathered primary data in a systematic fashion for the explicit purpose of shedding light on ethical issues in evaluation. In a field that prides itself on being committed to decision making informed by such data, this state of affairs is cause for concern. Accordingly, this chapter will focus on primary studies and their value for enhancing our understanding of evaluation ethics.
-------
DIANNA L. NEWMAN is associate professor and director of the Evaluation Consortium at the State University of New York at Albany. She is coauthor of Applied Ethics in Program Evaluation and "Guiding Principles for Evaluators" and has presented and published numerous papers and articles on ethics in evaluation.
… on a developmental model of training evaluators that … issues seem to arouse the passion generated by discussions of whe…

The Ethics of Evaluation Neutrality and Advocacy

Lois-ellin Datta
… increased. The federal government has increased the penalties against the perpetrators and also has taken other measures. An evaluation team gathers to assess the effectiveness of these measures. One member of the team, like many others in this nation, is strongly "pro-life," believing that under the Nuremberg and other rulings, any action is justified in preventing what this person regards as the slaughter of innocents. This team member sees the government as protecting "murderers," believing the general welfare and public good demand closure of the clinics. Another team member, like many others in this nation, is strongly "pro-choice," believing that law and ethics give a woman control over her own body and regarding violence against the clinics and medical personnel as criminal, not heroic. To this team member, ensuring the general welfare and public good requires the protection of clinics offering abortions. Both members call themselves evaluators. Should either of these evaluators participate in or lead the evaluation?
For this chaptei. I have been asked to examine aigumenis loi and ag.nn.si ev.il
nation neuliahty and advocacy To exploic these issues I havr looked pimi.mly al
articles by [xist ptesidentsol the Ameiuan Lv.tlu.u ion Association and mini piunn
nent figures in out field 'I heie is an abundance ol pnoi wouls on this lopu . sonu
of which will be summaiized in the next sections Soiling ilnough tlitin, wh.u
struck me was not then dissimilanty but—with some exceptions—iln.it agKeineni
after one had woiked thiough die definitions given ol advoc.u y and neimaliiy
Nonetheless, some of the discourse on the ethics of advocacy in cvalnaiion si-ems
to take place as though the moial high giound had loom loi only om. luimu
Why the passion, given the common giouiuP (3nc U.IMMI may he die
|X)ieniial for common gionnd in theoiy to gel "balkamzal" in piauut A sicond
-------
reason may be whether the evaluator primarily has in mind a national study or one close to client service delivery. The explanatory power of this distinction, granted, does not always hold. Without deprecating the many ethical and moral dilemmas confronting evaluators, perhaps we could advance a bit further by examining (1) specific evaluations carried out at comparable levels in light of (2) the principles and theories put forth under different banners.
Guiding Principles for Evaluators

Our starting point is the "Guiding Principles for Evaluators" adopted by the American Evaluation Association (AEA) as "a set of principles that should guide the professional practice of evaluators, and that should inform evaluation clients and the general public about the principles they can expect to be upheld by professional evaluators" (American Evaluation Association, 1995, p. 21). There are five broad principles: systematic inquiry, competence, integrity/honesty, respect for people, and responsibilities for general and public welfare. These have been presented earlier in this volume.
All of the AEA Guiding Principles are relevant, to some degree, to the ethics of advocacy. One cannot, however, sum the principles for general guidance regarding advocacy and know exactly what actions to take. First, as intended, the principles are not standards. They do not indicate, for example, matters of practice, such as what would constitute incompetent performance or what types of education, abilities, skills, and experience would be inappropriate for different types of evaluation tasks for different evaluations. Second, it is possible for persons taking different positions on the ethics of advocacy or neutrality in evaluation to cite one or another principle as consistent with their views.

In the next sections, these apparently dissimilar positions are presented together with the AEA Guiding Principles that seem to support them, and then the positions are reexamined to identify what may be common ground that redefines the Principles. First, four definitions (Webster's, 1994):
Advocate: One who defends, vindicates, or espouses a cause by argument; upholder, defender; one who pleads for or in behalf of another.

Adversary: A person or group who opposes another; opponent, foe; any enemy who fights determinedly, relentlessly, continuously.

Partisan: An adherent or supporter of a person, party, or cause; biased, partial.

Nonpartisan: Objective; not supporting any of the established or regular parties.
To Evaluate Requires Credibility: No, Evaluators Should Not Be Advocates

There is no lack of words and deeds concerning what evaluation and the evaluator's role are about. Some could be read as indicating that the evaluator's role is about nonpartisan evaluations regardless of how partisan or non…
For example, Chelimsky (1997) observes:

To be listened to by various stakeholders in even an ordinary political debate requires a great deal of effort by evaluators not only to be competent and objective but to appear so. … There are a great many things we can do … not just technically, in the steps we take to make and explain our evaluative decisions, but also intellectually, in the effort we put forth to look at all sides and stakeholders in an evaluation. … A second implication for evaluators of a political environment is the need for courage. … Speaking out in situations that may include numerous political adversaries, all with different viewpoints and axes to grind, and also insisting on the right to independence in speaking out, takes a strong stomach. … It takes courage to refuse sponsors the answers they want to hear, to resist becoming a "member of the team," to fight inappropriate intrusion into the evaluation process … but when courage is lost, everything is lost [pp. 57-60; see also Cook, 1997].
This is Scriven (1997):

Distancing can be thought of as a scale on which a number of points are of particular interest. At one end of the scale is complete distancing, as when a program (person, policy, or whatever) is evaluated on the basis of extant data alone. At the other end is ownership or authorship of the program, usually conceded to be a poor basis for objective evaluation of it. … Although it is better in principle to use extant data, it is often the case that one needs more, and the risks attendant on personal involvement [bias] must be undertaken. … So-called participatory design, part of the empowerment movement, is about as sloppy as one can get, short of participatory authoring of the final report (unless that report is mainly done for educational or therapeutic purposes). … It is sometimes suggested that the push for distance is itself an attempt to be superior, external, an attempt to play God the Judge. On the contrary, it is part of the simple and sensible human effort to get things right, to uncover and report the truth. … Deciding when and to what extent to withhold those findings from those who paid for them is the "doing what's good for you, not what you asked me to do" step over the border between expertise and censorship/parenting [pp. …].
To Evaluate Is to Advocate: Yes, Evaluators Should Be Advocates

Other work could be read as saying we are about creating partisan evaluations in an irretrievably partisan world. We should be advocates, weighing in on the side of the underdog, the oppressed, the marginalized in the fight for social justice.

Lincoln (1990) writes, "[To the positivists], only if research results were free of human values, and, therefore, free from bias, prejudice, and individual stakes could social action be taken that was neutral with respect to political partisanship. … The constructivist paradigm … has as its central focus the
-------
presentation of multiple, holistic, competing, and often conflictual realities of multiple stakeholders and research participants. … the written report should demonstrate the passion, the commitment, and the involvement of the inquirer with his or her coparticipants in the inquiry" (pp. 70-71). She further comments, "We should abandon the role of dispassionate observer in favor of the role of passionate participant" (p. 86).

Greene (1995), expanding on this thought, urges in her classic, widely cited article, "Evaluation inherently involves advocacy, so the important question becomes advocacy for whom. The most defensible answer to this question is that evaluation should advocate for the interests of the participants" (p. 1). In a related statement, Fetterman (1997) offers a nuanced argument, considering both advocacy and data credibility, that evaluation is best seen as a form of empowerment. He observes, "Empowerment evaluation has an unambiguous value orientation—it is designed to help people help themselves and improve their programs, using a form of self-evaluation and reflection. … Advocacy, in this context, becomes a natural by-product of the self-evaluation process—if the data merit it" (pp. 382-384).

And Mertens (1995):
This principle (D.5) concerning diversity and inclusion has implications not only at the level of identifying and respecting the viewpoints of marginalized groups, but also for the technical adequacy of what evaluators do. … Evaluators need to reflect on how to address validity and reliability honestly in a cultural context, so as not to violate the human rights of the culturally oppressed. … [The emancipatory framework] is more appropriate to stop oppression and bring about social justice. … Three characteristics [of this framework are]: (1) recognition of silenced voices, ensuring that groups traditionally marginalized in society are equally "heard" during the evaluation process and formation of evaluation findings and recommendations; (2) analysis of power inequities in terms of social relationships involved in the planning, implementation, and reporting of evaluations; (3) linking evaluation results to political … [pp. 91-92].
In the context of evaluation as advocacy, stakeholder involvement seems to mean the evaluator should take up the cause of the marginalized. The evaluator should make or support procedural, technical, and methodological decisions favoring the side of the persons directly receiving services.

Some Relevant Principles and Their Implications for Anti- and Pro-Advocacy Stances

The AEA Guiding Principles do not rule out either the anti-advocacy or pro-advocacy stances, and various ones can be cited to support either position.

Against Advocacy. Several of the AEA Guiding Principles can be cited to emphasize the incompatibility of evaluation with an advocacy position as indicated in the quotes given.
These are found primarily under Principle C: Integrity/Honesty. In its subparts, this principle emphasizes that evaluators should assure the honesty and integrity of the entire evaluation process through practices such as being explicit about their own (and others') interests concerning the conduct and outcome of evaluations, disclosing any roles or relationships that might pose a significant conflict of interest.

As these words are generally understood, they are inconsistent with an advocacy position. According to Webster's (1994), honest means "honorable in principles, intentions, actions; fair; genuine or unadulterated; truthful or creditable; unadorned; just; incorruptible; trustworthy; truthful; straightforward; candid." In common understanding, as an evaluator one cannot be fair to all stakeholders and at the same time take a position of advocacy (or adversary) for or against one stakeholder group or the other. The principles tell us to be scrupulous about identifying biases, values, preconceptions favoring one outcome or another that may be held so strongly the evaluator could find it difficult to be fair, incorruptible, just, trustworthy. These threats to fairness specifically and explicitly included political stances. That is, the principles assume that evaluators have biases, prejudices, values, opinions. They require us, however, to be ever mindful of how our values may alter our conduct of the evaluation—and to disqualify ourselves from a particular study if we cannot be balanced, fair, just, incorruptible.
Different organizations use slightly different terms for the same idea. The U.S. General Accounting Office (1997) speaks of "impairments" in one's ability to be fair and just. These impairments can come not only from financial and career interests, but also from values, attitudes, and political views. However phrased, and with appreciation for the nuances of phrasing, the evaluator cannot, this principle makes clear, take sides. This is quite different from reporting findings that may favor the interests of one party or another. Rather, it means conducting the evaluation so that the findings are not slanted beginning, middle, and end by the evaluator's own passions. By this principle, the evaluator must forego balancing perceived inequities with a thumb giving greater weight to the scales of the oppressed.

Considering this reading of Principle C, neither the pro-life nor the pro-choice evaluator should be on the evaluation team. Their political positions seem so deeply held as to be considered an impairment to a fair, just, trustworthy evaluation.
For Advocacy. Another principle, however, could be read as permitting and perhaps encouraging advocacy in evaluation. Principle E considers responsibility for the general and public welfare. It explicitly states: "evaluators have obligations that encompass the public interest and general good … clear threats to the public good should never be ignored in any evaluation. … Because the public interest and good are rarely the same as the interests of any particular group … evaluators will usually have to go beyond an analysis of particular stakeholder interests when considering the welfare of society as a whole" (American Evaluation Association, 1995, pp. 25-26).
-------
A common language reading of this principle requires evaluators to be ever conscious of the public good and general welfare. But the guidelines do not indicate which view of the general welfare and public good is considered. What is stated in law? By currently elected officials? By majority opinion? By the views of whatever group seems most disenfranchised by whatever indicators? By the evaluator's own perception of social justice? As Rossi (1995), discussing this principle, points out, "… what is the public good is the bone of contention among political parties, political ideologies, and even world religions" (p. 57). It seems as though evaluators can select any definition of the public good they choose.

What are the implications of this position for the hypothetical abortion clinics' evaluation? Considering this reading of Principle E, depending on your point of view, either the pro-life or the pro-choice evaluator should serve on the team, but not both. Moreover, any evaluators who have not thought through what the common good and general welfare mean on this issue (that is, on abortion) should reach a position as part of their responsibility.

It seems noteworthy that the basis for Principle E is not a belief that evaluators are irremediably unable to be objective, but rather that we serve a higher social good beyond serving those in charge and the proximal and intermediate stakeholders, such as staff and participants of a particular program. To do only the bidding of those paying for the evaluation is seen as making evaluation little more than market research. Although responsible to our clients, whether internal or external, we are equally responsible, in light of this principle, for considering the general good and public welfare.

Exactly what evaluators have to do beyond "considering" is left unstated. Presumably it includes infusing all aspects of the evaluation with the representation of the ultimate stakeholder—the public good as understood by the evaluator—in the same way one would a more proximal stakeholder.
Common Ground

This brief analysis illustrates what many other evaluators have already noted (see, for example, Rossi, 1995). The principles apparently can be cited in support of neutrality or advocacy in evaluation. It is therefore not to the AEA Guiding Principles as stated that evaluators might look for standards of conduct in specific cases or for a reconciliation, if this is possible, between apparently irreconcilable views.

The ambiguity of the AEA Guiding Principles is consistent with the intentional difference between the general guidance of principles and the operational guidance of standards. Rossi (1995) commented: "The membership of AEA is divided on a number of critical and substantive technical issues. A strongly worded set of standards might easily sunder the weak bonds that bind us together and nullify the compromises that make AEA possible" (p. 56). The principles developed between 1992 and 1994 were intended as part of continuing discussions on ethical issues. They served us well then in offering an ecumenical framework for robust discourse on ethics. Seven years later, however, the principles may need refreshing in order to reflect new approaches, such as emergent realism (Henry, Julnes, and Mark, 1998), and to guide practice. Indeed, some common ground may be present in the values shared by various perspectives on the ethics of evaluation advocacy and neutrality. That is, by examining possible common denominators in recent commentaries on these issues, we may get back to a sense of how to balance apparently competing principles.
Iwo sinking common denominatois aie the value placed on launess and
faithfulness to all stakeholders and on lespecimg deeply ihe dignity of all stake-
holders and their right to be heard A series of recent anicles by leaders in oui
field, such as Lincoln, House, and Greene, gives a window on contempoiaiy
definitions of advocacy in evaluation
This is Lincoln (Ryan, Greene, Lincoln, Mathison, and Mertens, 1998):

We operate from profound social commitments which honor all stakeholder groups' views and perspectives, whether or not we happen to agree with those views. We speak of "advocacy" as if it meant we go into an evaluation determined to take sides, and that would mean, typically, "against" the program managers, administrators, funders, or other critical individuals. When I talk about advocacy, I don't mean taking sides in that more specific sense. What I mean rather refers to becoming an advocate for pluralism, for many voices to feed into the evaluation. What I am advocating for is less a particular individual or group than a position which insists that all stakeholders be identified and solicited for their constructions of the strengths and weaknesses of the program [pp. 102, 108].
A similar idea was expressed vividly in her discussion (1990) of the need to "express multiple, socially constructed, and often conflicting realities. The latter we termed fairness, and judgments were made on the achievement of this criterion in much the same way that labor negotiators and mediators determine fairness in bargaining sessions" (p. 72).
This is House and Howe (1998):

We think the framework [of a Chelimsky study] must be something like this: Include conflicting values and stakeholder groups in the study. Make sure all major views are sufficiently included and represented. Bring conflicting views together so there can be deliberation and dialogue about them among the relevant parties. Make sure there is sufficient room for dialogue to resolve conflicting claims, but also to help the policy makers and media resolve these claims by sorting through the good and bad information. Is this advocacy on the part of the evaluators? We would say no, even though their work is heavily value-laden and incorporates judgment. It is not advocacy, such as taking a side (one or the other) at the beginning of the study and championing one side or another. We suggest three criteria for evaluations to be properly balanced: First, the study should be inclusive so as to represent all relevant views, interests, values, and stakeholders. Second, there should be sufficient dialogue with the relevant groups so that the views are properly and authentically represented. Third, there should be sufficient deliberation to arrive at proper findings [pp. . . .].

This is Greene (Ryan, et al., 1998):
Except in unusual circumstances, I do not believe that evaluators should advocate for the program being evaluated. Such advocacy compromises the perceived credibility and thus the persuasiveness of the evaluative claims. . . . What evaluators should advocate for is their own value commitment. . . . In participatory evaluation, this value commitment is to democratic pluralism, to broadening the policy conversation to include all legitimate perspectives and voices, and to full and fair stakeholder participation in policy and program decision making. . . . The participatory evaluator needs to get in close to the program. But this closeness should not be misconstrued as program partisanship. That is, participatory evaluators do advocate, not for a particular program, but rather for an open, inclusive, engaged conversation among stakeholders about the merit and worth of that program . . . of fairly and fully representing all legitimate interests and concerns in an evaluation [pp. 109, 111].
Reframing the Discussion

Neutral: A person or group not taking part in a controversy; unaligned with one side or another in a controversy; of no particular kind or characteristic; indefinite.

Impartial: Fair, just.
With these definitions (Webster's, 1994), a shared theme among the evaluators cited here is impartiality—the sense of fairness and justice. Neutrality, which might initially seem similar, is too passive, connoting a sort of withdrawal from the storms and complexities of the world. Passivity does not seem to me either characteristic of, or common ground for, our field. This review, then, suggests:
• Diverse evaluators agree that the evaluator should not be an advocate (or presumably, an adversary) of a specific program in the sense of taking sides, of a preconceived position of support (or destruction).
• There is agreement that the evaluator should be an advocate for making sure that as many relevant cards as possible get laid on the table, face up, with the values (worth, merit) showing.
• There is agreement that the evaluator must be aware of how less powerful voices or unpopular views, positions, information can get silenced and make special efforts to ensure that these voices (data, viewpoints) get heard.
Therefore, it may be helpful to reframe the discussions in terms of impartiality or fairness. No evaluation approach of which I know would countenance (1) deliberately ignoring program theories leading to different expectations about what should be studied or measured, or what results to look for; (2) deliberately selecting measures or questions to favor one side over another; (3) deliberately misquoting what an interviewee said; (4) deliberately creating data out of the whole cloth to prove a point; (5) deliberately going from the reams of raw data to conclusions by a sneaky path supporting one side over another; (6) deliberately failing to listen to the views of all parties with conflicting perspectives; (7) deliberately suppressing information that did not support the evaluator's own values or the results the evaluator wished to obtain; (8) deliberately using words that cumulatively skew the report to one side or the other; or (9) deliberately presenting complex, nuanced findings in a simplistic way to favor one position over another (Datta, 1997). Perhaps these points are a start toward expressing standards in this area with which many evaluators could agree.
This is not to say that we may not inadvertently in practice—through methodological limitations, ignorance of how our own views and language create subtle biases, or failure to use strategies for achieving fair and faithful evaluations—do all of the above or more. Rather, it is to say that as I read recent efforts to articulate what we mean, I find that we seek balance and want fairness, like a mighty river, to pour down.
QED? No, Dilemmas Will Remain in Application to Practice
Principles are not standards on how to be fair and just in practice. What principles mean in practice is likely to require continued reexamination and reinterpretation as experience grows, evaluation theories develop, and new technologies arise. For example, to some evaluators, such as Greene and Mertens, closeness and inclusion are essential. The evaluator models, by how the evaluation itself is done, ideals of empowerment, demarginalization of the disenfranchised and oppressed, and in so doing reaps many evaluation benefits such as greater authenticity, better balance, greater fairness, "natural" evaluation utilization. Since truth lies in the eye of the beholder, one logically gets as many beholders as possible.
To others, such as Scriven (1997), opinions and self-interest lie in the eye of the involved stakeholders, albeit experienced by them as truth. Closeness is to be avoided, risking as it does co-option and bias. Also to be avoided is being impartial on an issue during working hours but an adversary or advocate on the same issue when the meter isn't running. Inclusion of relevant, but unpopular or silenced views, to such evaluators is as crucial to evaluation as it is to those encouraging closeness. The techniques for achieving such inclusion are not seen as requiring sitting around a table, as it were, with the evaluator as moderator when decisions are made about design, measures, analyses, and reports. Rather, the techniques include using extant data and relying on performance data rather than staff interviews, and where such interviews are essential, being sure they involve prestructuring based on other data and are conducted by well-trained, well-supervised interviewers. Other approaches include goal-free methods, heavy interviewing with consumers and other stakeholders, and in all of this, applying quality control procedures such as audiotape backups. Parenthetically, a fine example of an evaluation using such approaches in a responsive evaluation framework is now available (Stake, Davis, and Guynn, 1997).
Chelimsky (1997) is pragmatic about methods for achieving inclusiveness. Though considerably less daunted than Scriven about being captured, subdued, or co-opted by sitting down with stakeholders, she would be highly on her guard against efforts to coerce evaluators or otherwise place them in an advocacy or adversary position. (For example, being set up as a Congressional pit-bull chomping on a possibly effective but out-of-favor program such as WIC would be as threatening to the GAO's independence and credibility as being cast as a shill for a possibly ineffective but popular program such as chemical warfare.)
Are Greene and Mertens talking about different types of programs than Scriven and Chelimsky, and thus the apparent disagreements are a case of "It depends"? Evaluators vary in the ease of public access to evaluations they have completed, or in how closely anchored their discussions of advocacy and neutrality are in specific work. It seems likely that positions recommending closeness and inclusiveness are more feasible with fairly small-scale evaluations, perhaps on local or state levels. For example, one could bring out all stakeholders' voices fairly in a small program, such as a local hospice center, an individual school, or even a county-wide recycling program.
In contrast, although it is easy to envision inclusiveness in a national evaluation, it is more difficult to imagine one-on-one closeness. As an example, the first issue of the new Head Start journal aimed at promoting researcher-evaluator and practitioner dialogue focuses on stakeholder collaboration and participation, but includes examples only from small studies, not the many national evaluations. However, some federal agencies now are writing Requests for Proposals (RFPs) consistent with empowerment and participatory views (such as the Bureau of Indian Affairs), so in the future, we may be able to see more empirically the transportability of the inclusive, close, participatory approach.
Personally, I would like to read, in full, an evaluation someone has completed (several, if possible) as a way of seeing what difference the principles make in practice and where, if any place, "it depends." We might be somewhat farther along if such evaluations were easily available as companion pieces to the more theoretical articles. House (1995) wisely wrote, "It is difficult to write intelligently about ethics and values. One reason is that ethical problems are manifested in particular concrete cases and endorsement of general principles sometimes seems platitudinous or irrelevant. Ethical concerns become . . ."

References

. . . New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Bell, S. "Crafting a Non-Partisan Evaluation in a Partisan World: The Urban Institute New Federalism Evaluation." Paper presented at the American Evaluation Association Conference, San Diego, November 1997.
Ilicknun, 1. •Implications ol die Ion Biagg l.v.ilii.mon ' / I',I(II,KI,>II 1'i.nn., IWo //
( lieliinsky. l: "Ilie Pnliik.il I nviioniiieni of I v.ilu.iiion .mil Wli.n h MI.IM:,!,,! id, |i,i-,|
opineni ol ihe Field " In I. c.helimbky and W Sh.idi»li (eds I / v.iln.iiinii /,» id, JIM (. „
luiy A llanJIwIt lluuisand Oaks. C .ill) baj-e. I'W7
Cook, T. D. "Lessons Learned in Evaluation Over the Past 25 Years." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Datta, L. "Crafting Non-Partisan Evaluations in a Partisan World." Paper presented at the American Evaluation Association Conference, San Diego, November 1997.
Fetterman, D. "Empowerment Evaluation and Accreditation in Higher Education." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Greene, J. C. "Evaluation as Advocacy." Evaluation Practice, 1995, 18, 25-36.
Henry, G. T., Julnes, G., and Mark, M. M. (eds.). Realist Evaluation: An Emerging Theory in Support of Practice. New Directions for Evaluation, no. 78. San Francisco: Jossey-Bass, 1998.
House, E. R., and Howe, K. R. "The Issue of Advocacy in Evaluations." American Journal of Evaluation, 1998, 19, 233-236.
House, E. R. "Principled Evaluation: A Critique of the AEA Guiding Principles." In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Lincoln, Y. "The Making of a Constructivist: A Remembrance of Transformations Past." In E. Guba (ed.), The Paradigm Dialog. Thousand Oaks, Calif.: Sage, 1990.
Mertens, D. "Identifying and Respecting Differences Among Participants in Evaluation Studies." In W. R. Shadish, D. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Morris, M. (Chair). "What's an Evaluator to Do? Confronting Ethical Dilemmas in Practice." Session presented at the American Evaluation Association Conference, San Diego, November 1997.
Nelkin, V. S. (Chair). "Ethical Dilemmas in Evaluation." Session presented at the American Evaluation Association Conference, Vancouver, B.C., November 1995.
Rossi, P. H. "Doing Good and Getting It Right." In W. R. Shadish, D. Newman, M. A. Scheirer, and C. Wye (eds.), Guiding Principles for Evaluators. New Directions for Program Evaluation, no. 66. San Francisco: Jossey-Bass, 1995.
Ryan, K., Greene, J., Lincoln, Y., Mathison, S., and Mertens, D. M. "Advantages and Disadvantages of Using Inclusive Evaluation Approaches in Evaluation Practice." American Journal of Evaluation, 1998, 19, 101-122.
Scriven, M. "Truth and Objectivity in Evaluation." In E. Chelimsky and W. R. Shadish (eds.), Evaluation for the 21st Century: A Handbook. Thousand Oaks, Calif.: Sage, 1997.
Stake, R. E., Davis, R., and Guynn, S. Evaluation of Reader Focused Writing for the Veterans Benefits Administration. Champaign, Ill.: CIRCE at the University of Illinois, 1997.
United States General Accounting Office. Auditing Standards. Washington, D.C.: Author, 1997.
Webster's Encyclopedic Unabridged Dictionary of the English Language. New York: Gramercy Books, 1994.
LOIS-ELLIN DATTA, president of Datta Analysis, is a past president of the Evaluation Research Society and recipient of the Alva and Gunnar Myrdal Award for Evaluation in Government Service and of the Robert Ingle Award for Extraordinary Service to the American Evaluation Association. She has been director of research and evaluation for Project Head Start and the Children's Bureau, director for teaching, learning, and dissemination at the National Institute of Education, and Director for Program . . .
There has been little discussion in evaluation literature in the United States of ethical issues in conducting evaluation in international settings. Although many of the same ethical issues arise wherever the evaluation is conducted, two sets of ethical issues that are particularly important in developing countries concern how stakeholders should be involved and to what extent the evaluator should respect local customs and values.

Ethical Issues in Conducting Evaluation in International Settings

Michael Bamberger
This chapter reviews some of the ethical issues identified in evaluations in the United States and considers the similarities and differences in application of these issues in developing countries. It also identifies a number of ethical issues arising in international evaluations that are less common in U.S. domestic evaluations. These concern the role of international agencies in financing, promoting, and conducting evaluation in developing countries and how the political, economic, and cultural characteristics of developing countries affect evaluation practice. We refer frequently to the Joint Committee's Program Evaluation Standards and the American Evaluation Association's (AEA) "Guiding Principles for Evaluators" as illustrations of how U.S. evaluators have approached issues relating to professional evaluation standards and to show how these approaches have been viewed from the perspective of developing countries.
Ethical Issues in International Evaluation
This chapter is concerned primarily with evaluations in developing countries that are conducted by, or sponsored by, multilateral development agencies (World Bank, UNICEF, Inter-American Development Bank, for example), bilateral development agencies (USAID, CIDA), international nongovernmental organizations (NGOs) (OXFAM, CARE, World Vision), and North American or European-based research institutes (universities).
-------
Program Evaluation Standards    http://www.eval.org/EvaluationDocuments/progeval.html
The Program Evaluation Standards
Summary of the Standards
Utility Standards
The utility standards are intended to ensure that an evaluation will serve the
information needs of intended users.
U1 Stakeholder Identification—Persons involved in or affected by the evaluation
should be identified, so that their needs can be addressed.
U2 Evaluator Credibility—The persons conducting the evaluation should be both
trustworthy and competent to perform the evaluation, so that the evaluation
findings achieve maximum credibility and acceptance.
U3 Information Scope and Selection—Information collected should be broadly
selected to address pertinent questions about the program and be responsive to
the needs and interests of clients and other specified stakeholders.
U4 Values Identification—The perspectives, procedures, and rationale used to
interpret the findings should be carefully described, so that the bases for value
judgments are clear.
U5 Report Clarity—Evaluation reports should clearly describe the program being
evaluated, including its context, and the purposes, procedures, and findings of the
evaluation, so that essential information is provided and easily understood.
U6 Report Timeliness and Dissemination—Significant interim findings and
evaluation reports should be disseminated to intended users, so that they can be
used in a timely fashion.
U7 Evaluation Impact—Evaluations should be planned, conducted, and reported in
ways that encourage follow-through by stakeholders, so that the likelihood that
the evaluation will be used is increased.
Feasibility Standards
The feasibility standards are intended to ensure that an evaluation will be realistic,
prudent, diplomatic, and frugal.
F1 Practical Procedures—The evaluation procedures should be practical, to keep
disruption to a minimum while needed information is obtained.
F2 Political Viability—The evaluation should be planned and conducted with
anticipation of the different positions of various interest groups, so that their
cooperation may be obtained, and so that possible attempts by any of these
groups to curtail evaluation operations or to bias or misapply the results can be
averted or counteracted.
F3 Cost Effectiveness—The evaluation should be efficient and produce information
of sufficient value, so that the resources expended can be justified.
Propriety Standards
The propriety standards are intended to ensure that an evaluation will be
conducted legally, ethically, and with due regard for the welfare of those involved
in the evaluation, as well as those affected by its results.
P1 Service Orientation—Evaluations should be designed to assist organizations to
address and effectively serve the needs of the full range of targeted participants.
P2 Formal Agreements—Obligations of the formal parties to an evaluation (what is
to be done, how, by whom, when) should be agreed to in writing, so that these
parties are obligated to adhere to all conditions of the agreement or formally to
renegotiate it.
P3 Rights of Human Subjects—Evaluations should be designed and conducted to
respect and protect the rights and welfare of human subjects.
P4 Human Interactions—Evaluators should respect human dignity and worth in
their interactions with other persons associated with an evaluation, so that
participants are not threatened or harmed.
P5 Complete and Fair Assessment—The evaluation should be complete and fair in
its examination and recording of strengths and weaknesses of the program being
evaluated, so that strengths can be built upon and problem areas addressed.
P6 Disclosure of Findings—The formal parties to an evaluation should ensure that
the full set of evaluation findings along with pertinent limitations are made
accessible to the persons affected by the evaluation, and any others with
expressed legal rights to receive the results.
P7 Conflict of Interest—Conflict of interest should be dealt with openly and
honestly, so that it does not compromise the evaluation processes and results.
P8 Fiscal Responsibility—The evaluator's allocation and expenditure of resources
should reflect sound accountability procedures and otherwise be prudent and
ethically responsible, so that expenditures are accounted for and appropriate.
Accuracy Standards
The accuracy standards are intended to ensure that an evaluation will reveal and
convey technically adequate information about the features that determine worth
or merit of the program being evaluated.
A1 Program Documentation—The program being evaluated should be described
and documented clearly and accurately, so that the program is clearly identified.
A2 Context Analysis—The context in which the program exists should be examined
in enough detail, so that its likely influences on the program can be identified.
A3 Described Purposes and Procedures—The purposes and procedures of the
evaluation should be monitored and described in enough detail, so that they can
be identified and assessed.
A4 Defensible Information Sources—The sources of information used in a program
evaluation should be described in enough detail, so that the adequacy of the
information can be assessed.
A5 Valid Information—The information gathering procedures should be chosen or
developed and then implemented so that they will assure that the interpretation
arrived at is valid for the intended use.
A6 Reliable Information—The information gathering procedures should be chosen
or developed and then implemented so that they will assure that the information
obtained is sufficiently reliable for the intended use.
A7 Systematic Information—The information collected, processed, and reported in
an evaluation should be systematically reviewed and any errors found should be
corrected.
A8 Analysis of Quantitative Information—Quantitative information in an evaluation
should be appropriately and systematically analyzed so that evaluation questions
are effectively answered.
A9 Analysis of Qualitative Information—Qualitative information in an evaluation
should be appropriately and systematically analyzed so that evaluation questions
are effectively answered.
A10 Justified Conclusions—The conclusions reached in an evaluation should be
explicitly justified, so that stakeholders can assess them.
All Impartial Reporting—Reporting procedures should guard against distortion
caused by personal feelings and biases of any party to the evaluation, so that
evaluation reports fairly reflect the evaluation findings.
A12 Metaevaluation—The evaluation itself should be formatively and summatively
evaluated agajnst these and other pertinent standards, so that its conduct is
appropriately guided and, on completion, stakeholders can closely examine its
strengths and weaknesses.
Prepared by:
Mary E. Ramlow
The Evaluation Center
401B Ellsworth Hall
Western Michigan University
Kalamazoo, MI 49008-5178
Phone: 616-387-5895
Fax: 616-387-5923
Email: Mary.Ramlow@wmich.edu
-------
Guiding Principles for Evaluators    http://www.eval.org/EvaluationDocuments/aeaprin6.html
Guiding Principles for Evaluators
A Report from the AEA Task Force on
Guiding Principles for Evaluators
Members of the Task Force
Dianna Newman, University of Albany/SUNY
Mary Ann Scheirer, Private Practice
William Shadish, Memphis State University (Chair),
w.shadish@mail.psyc.memphis.edu
Chris Wye, National Academy of Public Administration
I. Introduction
A. Background: In 1986, the Evaluation Network (ENet) and the
Evaluation Research Society (ERS) merged to create the American
Evaluation Association. ERS had previously adopted a set of standards
for program evaluation (published in New Directions for Program
Evaluation in 1982); and both organizations had lent support to work of
other organizations about evaluation guidelines. However, none of these
standards or guidelines were officially adopted by AEA, nor were any
other ethics, standards, or guiding principles put into place. Over the
ensuing years, the need for such guiding principles has been discussed
by both the AEA Board and the AEA membership. Under the presidency
of David Cordray in 1992, the AEA Board appointed a temporary
committee chaired by Peter Rossi to examine whether AEA should
address this matter in more detail. That committee issued a report to the
AEA Board on November 4, 1992, recommending that AEA should pursue
this matter further. The Board followed that recommendation, and on
that date created a Task Force to develop a draft of guiding principles for
evaluators. The AEA Board specifically instructed the Task Force to
develop general guiding principles rather than specific standards of
practice. This report summarizes the Task Force's response to the
charge.
B. Process: Task Force members reviewed relevant documents from
other professional societies, and then independently prepared and
circulated drafts of material for use in this report. Initial and subsequent
drafts (compiled by the Task Force chair) were discussed during
conference calls, with revisions occurring after each call. Progress
reports were presented at every AEA board meeting during 1993. In
addition, a draft of the guidelines was mailed to all AEA members in
September 1993 requesting feedback; and three symposia at the 1993
AEA annual conference were used to discuss and obtain further
feedback. The Task Force considered all this feedback in a December
1993 conference call, and prepared a final draft in January 1994. This
draft was presented and approved for membership vote at the January
1994 AEA board meeting.
C. Resulting Principles: Given the diversity of interests and employment
settings represented on the Task Force, it is noteworthy that Task Force
members reached substantial agreement about the following five
principles. The order of these principles does not imply priority among
them; priority will vary by situation and evaluator role.
1. Systematic Inquiry: Evaluators conduct systematic,
data-based inquiries about whatever is being evaluated.
2. Competence: Evaluators provide competent performance to
stakeholders.
3. Integrity/Honesty: Evaluators ensure the honesty and
integrity of the entire evaluation process.
4. Respect for People: Evaluators respect the security, dignity
and self-worth of the respondents, program participants,
clients, and other stakeholders with whom they interact.
5. Responsibilities for General and Public Welfare: Evaluators
articulate and take into account the diversity of interests and
values that may be related to the general and public welfare.
These five principles are elaborated in Section III of this
document.
D. Recommendation for Continued Work: The Task Force also
recommends that the AEA Board establish and support a mechanism for
the continued development and dissemination of these Guiding
Principles.
II. Preface: Assumptions Concerning Development of Principles
A. Evaluation is a profession composed of persons with varying interests,
potentially encompassing but not limited to the evaluation of programs,
products, personnel, policy, performance, proposals, technology,
research, theory, and even of evaluation itself. These principles are
broadly intended to cover all kinds of evaluation.
B. Based on differences in training, experience, and work settings, the
profession of evaluation encompasses diverse perceptions about the
primary purpose of evaluation. These include but are not limited to the
following: bettering products, personnel, programs, organizations,
governments, consumers and the public interest; contributing to
informed decision making and more enlightened change; precipitating
needed change; empowering all stakeholders by collecting data from
them and engaging them in the evaluation process; and experiencing the
excitement of new insights. Despite that diversity, the common ground is
that evaluators aspire to construct and provide the best possible
information that might bear on the value of whatever is being evaluated.
The principles are intended to foster that primary aim.
C. The intention of the Task Force was to articulate a set of principles
that should guide the professional practice of evaluators, and that should
inform evaluation clients and the general public about the principles they
can expect to be upheld by professional evaluators. Of course, no
statement of principles can anticipate all situations that arise in the
practice of evaluation. However, principles are not just guidelines for
reaction when something goes wrong or when a dilemma is found.
Rather, principles should proactively guide the behaviors of professionals
in everyday practice.
D. The purpose of documenting guiding principles is to foster continuing
development of the profession of evaluation, and the socialization of its
members. The principles are meant to stimulate discussion and to
provide a language for dialogue about the proper practice and application
of evaluation among members of the profession, sponsors of evaluation,
and others interested in evaluation.
E. The five principles proposed in this document are not independent,
but overlap in many ways. Conversely, sometimes these principles will
conflict, so that evaluators will have to choose among them. At such
times evaluators must use their own values and knowledge of the setting
to determine the appropriate response. Whenever a course of action is
unclear, evaluators should solicit the advice of fellow evaluators about
how to resolve the problem before deciding how to proceed.
F. These principles are intended to replace any previous work on
standards, principles, or ethics adopted by ERS or ENet, the two
predecessor organizations to AEA. These principles are the official
position of AEA on these matters.
G. Each principle is illustrated by a number of statements to amplify the
meaning of the overarching principle, and to provide guidance for its
application. These statements are illustrations. They are not meant to
include all possible applications of that principle, nor to be viewed as
rules that provide the basis for sanctioning violators.
H. These principles are not intended to be or to replace standards
supported by evaluators or by the other disciplines in which evaluators
participate. Specifically, AEA supports the effort to develop standards for
educational evaluation by the Joint Committee on Standards for
Educational Evaluation, of which AEA is a cosponsor.
I. These principles were developed in the context of Western cultures,
particularly the United States, and so may reflect the experiences of that
context. The relevance of these principles may vary across other
cultures, and across subcultures within the United States.
J. These principles are part of an evolving process of self-examination by
the profession, and should be revisited on a regular basis. Mechanisms
might include officially-sponsored reviews of principles at annual
meetings, and other forums for harvesting experience with the principles
and their application. On a regular basis, but at least every five years
from the date they initially take effect, these principles ought to be
examined for possible review and revision. In order to maintain
association-wide awareness and relevance, all AEA members are
encouraged to participate in this process.
III. The Principles
A. Systematic Inquiry: Evaluators conduct systematic, data-based
inquiries about whatever is being evaluated.
1. Evaluators should adhere to the highest appropriate
technical standards in conducting their work, whether that
work is quantitative or qualitative in nature, so as to increase
the accuracy and credibility of the evaluative information they
produce.
2. Evaluators should explore with the client the shortcomings
and strengths both of the various evaluation questions it might
be productive to ask, and the various approaches that might
be used for answering those questions.
3. When presenting their work, evaluators should communicate
their methods and approaches accurately and in sufficient
detail to allow others to understand, interpret and critique their
work. They should make clear the limitations of an evaluation
and its results. Evaluators should discuss in a contextually
appropriate way those values, assumptions, theories, methods,
results, and analyses that significantly affect the interpretation
of the evaluative findings. These statements apply to all
aspects of the evaluation, from its initial conceptualization to
the eventual use of findings.
B. Competence: Evaluators provide competent performance to
stakeholders.
-------
interests).
4. Evaluators should disclose any roles or relationships they
have concerning whatever is being evaluated that might pose a
significant conflict of interest with their role as an evaluator.
Any such conflict should be mentioned in reports of the
evaluation results.
5. Evaluators should not misrepresent their procedures, data
or findings. Within reasonable limits, they should attempt to
prevent or correct any substantial misuses of their work by
others.
6. If evaluators determine that certain procedures or activities
seem likely to produce misleading evaluative information or
conclusions, they have the responsibility to communicate their
concerns, and the reasons for them, to the client (the one who
funds or requests the evaluation). If discussions with the client
do not resolve these concerns, so that a misleading evaluation
is then implemented, the evaluator may legitimately decline to
conduct the evaluation if that is feasible and appropriate. If
not, the evaluator should consult colleagues or relevant
stakeholders about other proper ways to proceed (options
might include, but are not limited to, discussions at a higher
level, a dissenting cover letter or appendix, or refusal to sign
the final document).
7. Barring compelling reason to the contrary, evaluators should
disclose all sources of financial support for an evaluation, and
the source of the request for the evaluation.
D. Respect for People: Evaluators respect the security, dignity and
self-worth of the respondents, program participants, clients, and other
stakeholders with whom they interact.
1. Where applicable, evaluators must abide by current
professional ethics and standards regarding risks, harms, and
burdens that might be engendered to those participating in the
evaluation; regarding informed consent for participation in
evaluation; and regarding informing participants about the
scope and limits of confidentiality. Examples of such standards
include federal regulations about protection of human subjects,
or the ethical principles of such associations as the American
Anthropological Association, the American Educational
Research Association, or the American Psychological
Association. Although this principle is not intended to extend
the applicability of such ethics and standards beyond their
current scope, evaluators should abide by them where it is
feasible and desirable to do so.
2. Because justified negative or critical conclusions from an
evaluation must be explicitly stated, evaluations sometimes
produce results that harm client or stakeholder interests.
Under this circumstance, evaluators should seek to maximize
the benefits and reduce any unnecessary harms that might
occur, provided this will not compromise the integrity of the
evaluation findings. Evaluators should carefully judge when the
benefits from doing the evaluation or in performing certain
evaluation procedures should be foregone because of the risks
or harms. Where possible, these issues should be anticipated
during the negotiation of the evaluation.
3. Knowing that evaluations often will negatively affect the
interests of some stakeholders, evaluators should conduct the
evaluation and communicate its results in a way that clearly
respects the stakeholders' dignity and self-worth.
4. Where feasible, evaluators should attempt to foster the
social equity of the evaluation, so that those who give to the
evaluation can receive some benefits in return. For example,
evaluators should seek to ensure that those who bear the
burdens of contributing data and incurring any risks are doing
so willingly, and that they have full knowledge of, and
maximum feasible opportunity to obtain any benefits that may
be produced from the evaluation. When it would not endanger
the integrity of the evaluation, respondents or program
participants should be informed if and how they can receive
services to which they are otherwise entitled without
participating in the evaluation.
5. Evaluators have the responsibility to identify and respect
differences among participants, such as differences in their
culture, religion, gender, disability, age, sexual orientation and
ethnicity, and to be mindful of potential implications of these
differences when planning, conducting, analyzing, and
reporting their evaluations.
E. Responsibilities for General and Public Welfare: Evaluators articulate
and take into account the diversity of interests and values that may be
related to the general and public welfare.
1. When planning and reporting evaluations, evaluators should
consider including important perspectives and interests of the
full range of stakeholders in the object being evaluated.
Evaluators should carefully consider the justification when
omitting important value perspectives or the views of
important groups.
2. Evaluators should consider not only the immediate
operations and outcomes of whatever is being evaluated, but
also the broad assumptions, implications and potential side
effects of it.
3. Freedom of information is essential in a democracy. Hence,
barring compelling reason to the contrary, evaluators should
allow all relevant stakeholders to have access to evaluative
information, and should actively disseminate that information
to stakeholders if resources allow. If different evaluation
results are communicated in forms that are tailored to the
interests of different stakeholders, those communications
should ensure that each stakeholder group is aware of the
existence of the other communications. Communications that
are tailored to a given stakeholder should always include all
important results that may bear on interests of that
stakeholder. In all cases, evaluators should strive to present
results as clearly and simply as accuracy allows so that clients
and other stakeholders can easily understand the evaluation
process and results.
4. Evaluators should maintain a balance between client needs
and other needs. Evaluators necessarily have a special
relationship with the client who funds or requests the
evaluation. By virtue of that relationship, evaluators must
strive to meet legitimate client needs whenever it is feasible
and appropriate to do so. However, that relationship can also
place evaluators in difficult dilemmas when client interests
conflict with other interests, or when client interests conflict
with the obligation of evaluators for systematic inquiry,
competence, integrity, and respect for people. In these cases,
evaluators should explicitly identify and discuss the conflicts
with the client and relevant stakeholders, resolve them when
possible, determine whether continued work on the evaluation
is advisable if the conflicts cannot be resolved, and make clear
any significant limitations on the evaluation that might result if
the conflict is not resolved.
5. Evaluators have obligations that encompass the public
interest and good. These obligations are especially important
when evaluators are supported by publicly-generated funds;
but clear threats to the public good should never be ignored in
any evaluation. Because the public interest and good are rarely
the same as the interests of any particular group (including
those of the client or funding agency), evaluators will usually
have to go beyond an analysis of particular stakeholder
interests when considering the welfare of society as a whole.
-------
WHAT COMES TO MIND WHEN YOU HEAR THE WORD . . .

For each term, note its process, major activities, purpose, and words to describe it:

Evaluation
Auditing
Investigation
Research
Monitoring
Assessment
-------
EVALUATION AND AUDITING

Refer to articles in: Carl Wisler, ed. (1996). New Directions for Program Evaluation, no. 71. San Francisco, CA: Jossey-Bass Publishers.

Editor's Notes, Carl Wisler
Divorski, Stan. "Differences in the Approaches of Auditors and Evaluators to the Examination of Government Policies and Programs"
Chelimsky, Eleanor. "Auditing and Evaluation: Whither the Relationship?"

"Performance auditing is an objective and systematic examination of evidence for the purpose of providing an independent assessment of the performance of a government organization, program, activity, or function in order to provide information to improve public accountability and facilitate decision-making by parties with responsibility to oversee or initiate corrective action." Comptroller General of the U.S., 1994
Evaluation:
• New (1960s)
• Offshoot of social sciences concerned with theory and explanation
• Less precise, more intellectually stimulating
• Multiple audiences
• Criteria selection flexible
• Cooperative, interactive relationships with evaluees
• Examines a question of "why" (what will produce desired/undesired effects?)
• Other:

Auditing:
• Older profession (stemming from accounting, bookkeeping)
• Verification of authoritative documents
• Single client
• Fixed criteria (comparing "what is" to "what should be")
• Examines a question of "what" (does what was done conform to standards?)
• End result is an opinion: Is observed performance consistent with accepted norms?
• Other:
-------
EDITOR'S NOTES
During the last several decades, the disciplines of evaluation and auditing
have each gone through substantial change. The purpose of this volume is not
to explore all the reasons and consequences associated with such change, but
to focus on the extent to which audit and evaluation have converged on simi-
lar procedures and organizational structures and on the extent to which they
have remained different An effort has been made to understand and describe
the issues from both evaluation and auditing perspectives.
Both evaluation and auditing claim to help decision makers by providing
them with systematic and credible information that can be useful in the cre-
ation, management, oversight, change, and occasionally abolishment of pro-
grams. Yet despite considerable overlap in objectives, subject matter, and
clients, auditing and evaluation have until recently functioned largely in isola-
tion from one another. The literature of each discipline scarcely recognizes the
existence of the other. Academic preparation of auditors and evaluators could
hardly be more different. Organizationally, the two activities have traditionally
been separate. The practitioners have difficulty communicating with one
another not only because of differences in vocabulary but also because of some
important differences in mind-set. Will auditing and evaluation persist as distinctive services to decision makers or is there a possibility of a merger or blend
of the two activities?
Differences between auditing and evaluation are rooted in the older pro-
fessions from which they emerged. Auditing evolved from financial account-
ing and so makes much use of concepts like verification, internal controls, and
good management practice. Evaluation emerged from the sciences, especially
social science, and so has tended to carry with it the trappings of measurement,
probability sampling, and experimentation. The authors represented in this
volume make frequent reference to the conceptual underpinnings of the two
disciplines when they point out how auditing and evaluation are different.
One way to get a quick sense of some differences between auditing and
evaluation is to consider the kinds of questions for which the two fields have
tried to provide answers. Three categories of questions are especially useful for comparing auditing and evaluation: descriptive, normative, and cause-and-effect (Wisler, 1984a, 1984b). These categories are explicitly or implicitly
referred to any number of times in the following chapters
Program evaluation has always given much attention to cause-and-effect
questions, especially ones about the overall impact of a program. The answer to an impact question is usually formulated as the difference between an outcome observed after a program has been in operation and the outcome that would have been observed in the absence of the program. Evaluation clients
also seek the answers to descriptive questions—ones that do not compare two
conditions but simply describe a state of the world. Common examples include questions about societal needs, the setting in which a program operates, or the way a program was implemented.
Auditing has traditionally focused on normative questions for which the answer compares "what is" with "what should be." Seldom do auditors seek answers to descriptive questions, nor do they often consider cause-and-effect questions, at least in the sense that evaluators understand that term.
One of the most interesting differences in disciplinary perspective is the contrast between program audit, which usually starts with a normative question, and impact evaluation, which focuses on cause and effect. A brief overview of these two forms of inquiry is illuminating because practitioners from each discipline claim that their respective approaches address the issue of program effectiveness. However, the similarity seems to end there. Methodologically, they are different because in fact they address two distinct questions about program effectiveness. Although the two methodological approaches can be distinguished easily, it is probable that most clients, and at least some practitioners as well, do not fully appreciate the important differences.
As defined by the Comptroller General of the United States (1994, p. 14), performance audit is "an objective and systematic examination of evidence of the performance of a government organization, program, activity, or function in order to provide information to improve public accountability and facilitate decision-making." A program audit is a subcategory of a performance audit for which one objective (of three) is to determine "the extent to which the desired results or benefits established by the legislature or other authorizing body are being achieved." This application of program audit provides an interesting contrast with impact evaluation. (The scope of auditing is broad, of course, and the practitioners may employ objectives and methodologies different from those reviewed here. The notion of value-for-money audits, used in many countries, is similar to the concept of performance auditing.)
Schandl (1978, p. 4) says "auditing is a human evaluation process to establish the adherence to certain norms, resulting in an opinion (or judgment)." Herbert (1979) describes developments at the U.S. General Accounting Office in the 1960s that led to a prescription for management and program audits; those prescriptions are comparable to the general concept set forth by Schandl regarding how to undertake management and program audits. The central idea in a program audit is to compare "what is" with "what should be" (Comptroller General of the United States, 1979 [1974]). This notion seems to flow from earlier forms of auditing in which generally accepted accounting practices or, more broadly, generally accepted management practices played key roles. The methodology was to establish a standard (a generally accepted management practice, for example) and to compare that standard with actual program practice. Any discovery of a serious discrepancy would be regarded as a deficiency and lead to a negative audit report.

In turning to broader issues of program effectiveness, auditors look along the normative question. To conduct a program audit, it was therefore necessary to identify one or more specific program objectives to play the role of "what should be." Such objectives might come from legislation, regulations, declarations of intent by program managers, and so on. Actual program performance, determined empirically, provided the "what is" component. And program effect was defined as the difference between the program objective and actual performance. (In auditing, an effect is sometimes defined differently to mean the consequences of a discrepancy. For example, if a program to improve water quality showed a shortfall in the achievement of water purity standards, the effects might be greater incidence of disease, lost time at work due to illness, and so on.)
The audit approach to program effectiveness is in the spirit of the strategy advocated in the problem-solving literature as exemplified by Kepner and Tregoe (1976). It is also very much in the mode of objectives-oriented evaluation approaches (see Chapter 5 of Worthen and Sanders, 1987) sometimes used with educational programs—and especially of the version called discrepancy evaluation (Provus, 1971). It does not correspond, however, to the conventional notion of impact evaluation.
Program impact evaluation stems from the experimental design used in a number of sciences wherein comparisons are made between outcomes associated with randomly assigned treatment and control groups. In evaluation, the methodology is generally understood not to require random assignment but to extend to other approaches that permit comparisons between what happened in the presence of the program and what would have happened in the absence of the program. Such quasi-experimental designs are prominent in evaluation as ways to answer impact questions (Cook and Campbell, 1979). (Auditors and others new to the evaluation literature have to contend with the variety of terms that may be used interchangeably with impact evaluation, most notably impact assessment, impact analysis, and program effectiveness evaluation.)
The two questions posed by auditors and evaluators about program effectiveness can, and generally will, lead to quite different conclusions about the performance of a program. Both conclusions may be correct—they are just answers to different questions. Unfortunately, because of the language used, the unwary may perceive the questions to be the same.
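To see how the two definitions can point in opposite directions, here is a small worked illustration (the program and every number are invented for this sketch, not drawn from any actual audit or evaluation). Suppose a legislature sets a 60 percent job-placement objective for a training program, the observed placement rate is 50 percent, and a comparison-group analysis estimates that placement would have been 35 percent without the program:

\[
\text{audit effect} = \text{actual performance} - \text{objective} = 50\% - 60\% = -10\ \text{points}
\]
\[
\text{program impact} = \text{observed outcome} - \text{counterfactual outcome} = 50\% - 35\% = +15\ \text{points}
\]

On these hypothetical numbers, a program audit would report a ten-point shortfall against "what should be," while an impact evaluation would report a fifteen-point gain over "what would have been"; both are correct answers, but to different questions.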
The foregoing comparison of program audit and impact evaluation illustrates the purpose of this volume: not to give a full-featured account of the heartlands of evaluation and auditing, but more to focus on the territory where they come together. The authors of the five chapters of this volume have all been in positions to survey the terrain that is roamed both by bands of evaluators and of auditors, and they offer their views on the jointly occupied territory along with occasional references to the heartlands. Earlier versions of the chapters were presented at a session of the International Evaluation Conference held in Vancouver, British Columbia, November 1-5, 1995.
Stan Divorski, from his vantage point with the Office of the Auditor General in Canada, sets forth five key dimensions that he finds distinguish the mind-set of auditors from that of evaluators. Roger Brooks brings his perspective in the
Minnesota Office of Legislative Auditor to consider the extent to which the
auditing and evaluation cultures have been blended, or at least have that
potential if blending is desired. Christopher Pollitt and Hilkka Summa bring their experiences with audit institutions in the United Kingdom and Finland, respectively, to bear in comparing the ways in which auditors and evaluators approach similar tasks. Frans L. Leeuw from the Netherlands Court of Audit
focuses on the contributions of evaluators and auditors to the improvement of
public sector performance and, in so doing, draws attention to a slice of the
international literature comparing the two fields. Finally, Eleanor Chelimsky, formerly of the U.S. General Accounting Office, highlights the conclusions of
the other authors, based on their conference papers, and offers her own views
on the pros and cons of integrating auditing and evaluation or keeping them
as separate services to decision makers.
Collectively, the authors point out many similarities and differences between auditing and evaluation. Indeed, there is such variety in the observations that it is difficult to categorize them and gauge their importance. If, as seems to be the case, evaluation and auditing are moving closer together, what are the differences that seem most likely to hold them apart? Readers may wish to consider the following three themes and conjectures as they read this volume.

The inclination of auditors toward normative questions and of evaluators toward descriptive and impact questions. The difference is rooted in the history of the disciplines and in the educational preparation of the practitioners. Although examples of crossover between the disciplines can be cited, a broad-scale mix of the approaches seems likely to take a long time, if it ever occurs.
Independence versus collaboration with the subjects of audit and evaluation. Auditors have attached great importance to their independence from both client and auditee, while evaluators have tended to work more closely with their clients and to move toward yet greater collaboration with evaluees. Even if the tenets of fourth-generation evaluation are not adopted by most evaluators, mainstream evaluators seem unlikely to return to the extremes of scientific detachment that once prevailed. The two disciplines therefore seem destined to be some distance apart on the scale of independence.
Differences in the degree to which auditing and evaluation have become routinized government operations. The role of auditing in government activities has been generally acknowledged with legislative mandates, clearly identified clients, and organizational permanence. The shorter history of public program evaluation reveals less widespread acceptance, more variability and multiplicity of clients, and considerable fluctuation in fortune over the short term. The repercussions of these differences on achievements of the two disciplines (in terms of program improvements, new legislation, and so on) may be hard to sort out, but in the case of evaluation, one wonders how long the shakedown cruise will last and how, or if, a greater stability will be achieved. A continued convergence of evaluation and auditing might be a course toward a higher-quality, more steadfast service to decision makers.
References
Comptroller General of the United States. "Report Manual." As adapted in L. Herbert, Auditing the Performance of Management. Belmont, Calif.: Lifetime Learning, 1979. (Originally published 1974.)
Comptroller General of the United States. Government Auditing Standards, 1994 Revision. Washington, D.C.: U.S. General Accounting Office, 1994.
Cook, T. D., and Campbell, D. T. Quasi-Experimentation. Skokie, Ill.: Rand McNally, 1979.
Herbert, L. Auditing the Performance of Management. Belmont, Calif.: Lifetime Learning, 1979.
Kepner, C. H., and Tregoe, B. B. The Rational Manager. (2nd ed.) Princeton, N.J.: Kepner-Tregoe, 1976.
Provus, M. M. Discrepancy Evaluation. Berkeley, Calif.: McCutchan, 1971.
Schandl, C. W. Theory of Auditing. Houston, Tex.: Scholars, 1978.
Wisler, C. E. "Topics in Evaluation." GAO Review, 1984a, 19-1.
Wisler, C. E. "Topics in Evaluation." GAO Review, 1984b, 19-3.
Worthen, B. R., and Sanders, J. R. Educational Evaluation. New York: Longman, 1987.
CARL WISLER is an evaluation consultant in Mitchellville, Maryland.
-------
Although audits and evaluations may have similar characteristics, the perspectives of auditors and evaluators can be quite different. These differing perspectives are reflected in their respective treatments of program impacts.
Differences in the Approaches
of Auditors and Evaluators
to the Examination of Government
Policies and Programs
Stan Divorski
On any given dimension, the extent to which audit and evaluation differ is largely a matter of degree. Any characteristic that can apply to audits will also apply to some evaluations, somewhere, sometime. Any distinguishing characteristic of evaluations probably can be found to apply to some audit. Overall, however, audits and evaluations can be very different.
This chapter attempts to describe the perspective, the mind-set, that auditors bring to the examination of programs, so as to illustrate how different the perspectives of auditors and evaluators can be. Although this perspective has its origins in the requirements of financial auditing, the focus here is on value-for-money auditing, which includes the examination of program activities, as well as of management systems and procedures for controlling these activities.
Five key dimensions distinguish this mind-set from that of evaluators. In general, auditors:
• Make a judgment as to how adequate or inadequate are the matters examined
• Make this judgment against a pre-established set of criteria
• Focus on management systems and procedures for controlling program activities rather than on the program activities themselves
The views expressed in this chapter are those of the author and should not be construed as representing those of the Auditor General of Canada.
-------
• May avoid commenting on substantive government policies
• View information on program results (when considered at all) as evidence of other matters, rather than as an end in itself
Judging Against Pre-Established Criteria
The mandate for an auditor's work requires a judgment about the adequacy of the matters examined. Such judgment is based on a comparison of the results of the examination against a set of criteria, or expectations, that the auditor may draw from a variety of sources, including previous audits and professional standards or guidelines.
For example, in reporting an audit of The Control and Clean-up of Freshwater Pollution, the Auditor General of Canada stated that it "expected to find that the various federal components of action plans were coordinated and that the means were in place to handle interdepartmental conflicts over policy, planning and funding as action plans are implemented" (1993b, p. 370). In addition to general criteria of this nature, an auditor may establish more specific subcriteria.
The way in which a judgment is reached and expressed depends upon the assignment the auditor receives. The auditor may reach a judgment after examining directly the matters at hand or after examining the reliability of management's assertions. At times the two approaches may be combined. The choice of approach may be imposed by the auditor, the client, or, in the case of a legislated mandate, by law.
An interesting example of the audit of periodic financial statements is that recommended by the Canadian Comprehensive Auditing Foundation (CCAF) for reporting on effectiveness (1987). The CCAF recommends that management make representations (that is, that they provide information about effectiveness in their organizations), and that auditors provide opinions on the fairness of those representations. Rather than define effectiveness, the CCAF sets out twelve attributes of effectiveness against which managers are expected to report:
1. Management direction (including clarity of objectives)
2. Continued relevance of a program
3. Appropriateness of program design
4. Achievement of intended results
5. Satisfaction of clients or stakeholders
6. Secondary impacts
7. Costs and productivity
8. Responsiveness to changed circumstances
9. Financial results
10. The extent to which the organization provides an appropriate work environment for its employees
11. Safeguarding of assets
12. Monitoring and reporting of performance
The conditions on an audit engagement may also specify how the auditor is to report the judgment. In exception reporting, the auditor is required to report any deficiencies, that is, situations that did not meet these criteria. Canada's Auditor General Act, Section 7(2), gives such a mandate:
Each report of the Auditor General under subsection (1) shall call attention to anything that he [sic] considers to be of significance and of a nature that should be brought to the attention of the House of Commons, including any cases in which he has observed that
(a) accounts have not been faithfully and properly maintained or public money has not been fully accounted for or paid, where so required by law, into the consolidated revenue fund;
(b) essential records have not been maintained or the rules and procedures applied have been insufficient to safeguard and control public property, to secure an effective check on the assessment, collection and proper allocation of the revenue and to ensure that expenditures have been made only as authorized;
(c) money has been expended other than for purposes for which it was appropriated by Parliament;
(d) money has been expended without due regard to economy or efficiency;
(e) satisfactory procedures have not been established to measure and report the effectiveness of programs, where such procedures could appropriately and reasonably be implemented; or
(f) money has been expended without due regard to the environmental effects of those expenditures in the context of sustainable development.
For example, in the audit of The Control and Clean-up of Freshwater Pollution described earlier, the auditor reported instances of poor coordination and unresolved conflicts: "This difference in departmental objectives and program funding led to coordination problems. Although the two departments agreed to provide for a management structure to coordinate their respective programs, [the structure] proved to be ineffective" (1993b, p. 374).
Alternatively, an auditor may be required to report on the level of comfort or assurance that a third party can have regarding the management of the program. The CCAF model for effectiveness auditing is one example of an assurance engagement. In this instance, the auditor provides assurance regarding management representations as to the matters at hand. Another example of an assurance mandate is provided in Part X of Canada's Financial Administration Act (1991), bearing on the responsibilities of Crown Corporations. Subsection 138(1) of the act requires corporations to engage an auditor to undertake a special examination, in which the auditor is required to "determine if the systems and practices referred to in paragraph 131(1)(b) were, in the period under examination, maintained in a manner that provided reasonable assurance that they met the requirements of paragraphs 131(2)(a) and (c)."
-------
Subsection 131(2) clarifies that "The books, records, systems and practices referred to in subsection (1) shall be kept and maintained in such manner as will provide reasonable assurance that ... (c) the financial, human and physical resources of the corporation and each subsidiary are managed economically and efficiently and the operations of the corporation and each subsidiary are carried out effectively." In this instance, the inability to find an exception from the positive expectation would be considered a judgment in confirmation of it.
In assurance engagements, a positive judgment is reached and reported with great care because (1) available methods may be incorrectly applied or not foolproof, and (2) there may be an undetected case that constitutes an important variation from the expectations established.
Focus on Management
The core business of an audit is the examination of management controls over expenditure, including whether or not management has in place the systems and procedures necessary to ensure that expenditures are made with due regard to economy and efficiency and in compliance with existing regulations or policies. Evaluation rarely intrudes into these areas. The distinction between audit and evaluation has become blurred, however, through increasing attention by audit to management control over results, including program effects, as an indication of good financial management and control.
This change has been reinforced by trends in government to decrease the emphasis on formal controls and increase the delegation of authority, accompanied by a greater need for accountability for results on the part of managers.
The mandate for audits of Crown Corporations under Canada's Financial Administration Act (1991) reflects this focus on management, as it requires attention to management systems and practices. Of particular significance to a comparison of auditors and evaluators, the act specifically requires the auditor to consider management control over the effectiveness of program operations.
The approach recommended by the Canadian Comprehensive Auditing Foundation (CCAF) for auditing effectiveness also reflects the auditor's focus on improving management. The CCAF notes that "The decision to emphasize management representations reflects the reporting obligations of managers, the needs of governing bodies, and their mutual desire for better management" (1991, p. 9).
The focus on management issues rather than outcomes per se is reflected in the CCAF's twelve attributes of effectiveness, which include such matters as managers' responsibility for costs and productivity, responsiveness to changed circumstances, provision of an appropriate working environment, and so on.
Restrictions on the Scope of Audit
Under the CCAF approach, the auditor is potentially limited in the scope of the investigation by the information that the manager chooses to report.
There may be other, more formal restrictions on the scope of audit work. This is especially likely to apply to the examination of major government policies. For example, with regard to the special examination of Crown Corporations, Section 145 of Canada's Financial Administration Act specifies: "Nothing in this Part or the regulations shall be construed as authorizing the examiner of a Crown corporation to express any opinion on the merits of matters of policy, including the merits of ... (c) any business or policy decision of the corporation or of the Government of Canada."
Other examples are provided by the Swedish National Audit Office (1995, p. 8) and the Australian National Audit Office (1995, p. 13), whose mandates specifically exclude comment on government policy.
When Auditors Examine Results
To this point, I have argued that auditors are more likely than evaluators to focus on management systems and procedures and to face restrictions on their freedom to comment on policy. I have also pointed out that the mandate of auditors frequently includes effectiveness. It is in the area of examining effectiveness that the distinction between audit and evaluation is least clear, although it reduces to two basic issues. The first is the focus of effectiveness work for each; the second is the reasons why auditors and evaluators tackle issues of program effectiveness.
The model for results-based audit depicted in Table 1.1 illustrates the possible foci for effectiveness work.
The model locates results measurement of government programs along two dimensions: the level of results measured and the level of program analyzed. With regard to the program level, a distinction is made between the results of management systems and procedures and the results of program activities. Systems and procedures include such matters as planning, management information systems, and procedures for detecting, recording, and collecting overpayments to program clientele. Program evaluation itself is viewed as a management control over program effectiveness. The levels of results identified include economy and efficiency, the achievement of intermediate program objectives, and the achievement of overall program objectives.
Thus a results-based audit could potentially examine the efficiency of program activities, the extent to which these activities further intermediate program objectives, or the extent to which they further the achievement of overall program objectives. Audits may also examine the effect of management controls on efficiency or the achievement of intermediate or ultimate program objectives.
A few examples will help illustrate the model. In 1993, the Office of the Auditor General examined management controls over pension benefit payments. The audit concluded that the systems and procedures in place for the recording, control, and collection of overpayments fell far short of what was required.
-------
[Table 1.1. A model for results-based audit, classifying examples along two dimensions: the level of results measured (economy and efficiency; intermediate program objectives; overall program objectives) and the level of program analyzed (management systems and procedures; program activities). Cited examples include bilateral economic and social development projects whose anticipated ends could not be assured (1993a) and the other cases discussed in the text.]
0.5 percent of total program payments, increasing the administrative cost of program delivery by more than 50 percent. The observation does not bear on the potential contribution that management controls may make to program effectiveness, but rather on their effects on the costs of program delivery.
By comparison, an audit of Search and Rescue examined the relative efficiency of program activities, specifically the types of rescue vessels employed. The audit concluded that the largest class of vessels were the most costly and had not been critical to saving any lives during the period audited (1992b, p. 225).
Management systems and controls may also have an impact on the achievement of program objectives. In 1992, the Auditor General of Canada reported on a program of payments to government employees scheduled to be laid off. The payments were expected to permit employees who so desired to quit immediately if there was no work to be performed and to save costs of employee benefits, retraining, and finding a job (1992a, p. 189). The audit observed that the number of payments, in all years except one, had consistently exceeded the reduction in person years (p. 194). Moreover, the auditors concluded that the situation was one indicator of problems in the administration of the policy, pointing to such matters as inadequate planning, no appropriate management framework, and the failure of senior management to provide leadership direction and support (p. 196).
Results as Evidence of Management Deficiencies
As noted earlier, the core business of audit is the examination of management controls over expenditure management. It is therefore the bottom row of Table 1.1 where audits will commonly be found. Economy and efficiency of program operations, the left-hand column of Table 1.1, is also a common matter for attention by audit. However, control over results as an indication of good financial management and control has also become an audit concern, leading to increased attention to the contribution of program operations to intermediate and overall program objectives. The result is illustrated by the cases cited in Table 1.1. No area of effectiveness is exempt from attention by audit.
It is in attention to the contribution of program activities to intermediate and ultimate program objectives where the functions of audit and evaluation become most difficult to distinguish. In principle, it is in the examination of activities in relationship to intermediate program objectives where the two may be most likely to overlap, but the central focus is substantially different. For an auditor, problems with the effectiveness of program activities are of interest as an indication of the importance of deficiencies in program management. The auditor will do sufficient work to assess whether program effectiveness is at risk, before turning attention to the management factors that should enable managers to gain control over program results. This may involve relying on evaluations conducted by program management, synthesizing the findings of evaluations of similar programs in other jurisdictions, and, as a last resort, conducting measurement and analyses of program impacts.
-------
For an auditor, the attribution of impacts to program activities is less important than in the classic social scientific model of evaluation. In the social science-based model, the rigorous pursuit of the causal link between the program and its outcomes is the main focus. For an auditor, it is sufficient to determine that results may be inadequate, whatever the cause. If results appear to be positive, there is no deficiency to report and no need to elaborate the causal chain.
If there are problems with results, the search is on for deficiencies in practices that may have impeded management from detecting and solving the problem or for related areas of program weakness. However, this search does not necessarily involve exploration of the causal chain, for the auditor's job is not to solve management's problems but to identify that there is a problem to be solved and the areas that may be involved. In fact, it may be sufficient for an auditor to point out that management has done little to measure program effectiveness.
This lack of attention to a rigorous exploration of the causal chain may puzzle evaluators. The answer to the puzzle is that for an auditor, information on results provides evidence of other matters and is not an end in itself.
References
Auditor General of Canada. "Payments to Employees Under the Work Force Adjustment Policy." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1992a.
Auditor General of Canada. "Search and Rescue." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1992b.
Auditor General of Canada. "CIDA—Bilateral Economic and Social Development Programs." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993a.
Auditor General of Canada. "Department of the Environment—The Control and Clean-Up of Freshwater Pollution." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993b.
Auditor General of Canada. "Department of Fisheries and Oceans—Northern Cod Adjustment and Recovery Program." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993c.
Auditor General of Canada. "Department of National Health and Welfare—Programs for Seniors." In Report of the Auditor General of Canada. Ottawa: Minister of Public Works and Government Services Canada, 1993d.
Australian National Audit Office. Performance Auditing. Canberra: Australian National Audit Office, 1995.
Canada. Financial Administration Act. R.S. 1985, c. F-11, s. 1. Ottawa: Government of Canada, 1991.
Canadian Comprehensive Auditing Foundation. Effectiveness Reporting and Auditing in the Public Sector: Summary Report. Ottawa: Canadian Comprehensive Auditing Foundation, 1987.
Swedish National Audit Office. Performance Auditing. Stockholm: Swedish National Audit Office, 1995.
-------
References
Chelimsky, E. "Comparing and Contrasting Auditing and Evaluation: Some Notes on Their Relationship." Evaluation Review, 1985, 9 (4), 483–503.
Chelimsky, E. "Expanding GAO's Capabilities in Program Evaluation." The GAO Journal, Winter/Spring 1990, 8, p. 51.
Chen, H.-T. Theory-Driven Evaluations. Thousand Oaks, Calif.: Sage, 1990.
Davis, D. F. "Do You Want a Performance Audit or a Program Evaluation?" Public Administration Review, 1990, 50, 35–41.
Day, P., and Klein, R. Accountabilities: Five Public Services. New York: Tavistock, 1987.
Frey, B., and Serna, A. "Eine politisch-ökonomische Betrachtung des Rechnungshofs." Finanzarchiv, 1990, 48, 244–270.
Guba, E., and Lincoln, Y. Fourth Generation Evaluation. Thousand Oaks, Calif.: Sage, 1989.
Leeuw, F. L. "Performance Auditing and Policy Evaluation: Discussing Similarities and Dissimilarities." Canadian Journal of Program Evaluation, 1992, 7, 53–68.
Leeuw, F. L. "Performance Auditing, New Public Management, and Performance Improvement: Questions and Challenges." Accounting, Auditing and Accountability, forthcoming.
Leeuw, F. L., Rist, R. C., and Sonnichsen, R. C. (eds.). Can Governments Learn? New Brunswick, N.J.: Transaction, 1994.
Le Grand, J., and Bartlett, W. (eds.). Quasi-Markets and Social Policy. Old Tappan, N.J.: Macmillan, 1993.
Mason, R., and Mitroff, I. Challenging Strategic Planning Assumptions. New York: Wiley, 1981.
Meyer, M., and O'Shaughnessy, K. "Organizational Design and the Performance Paradox." In R. Swedberg (ed.), Explorations in Economic Sociology. New York: Russell Sage Foundation, 1993.
Moukheibir, C., and Barzelay, M. "Performance Auditing: Concept and Controversies." Paper presented at the Public Management Group/Organization for Economic Coordination and Development Audit Symposium, Paris, June 6–7, 1995.
Osborne, D., and Gaebler, T. Reinventing Government: How the Entrepreneurial Spirit Is Transforming the Public Sector. Reading, Mass.: Addison-Wesley, 1992.
Pawson, R., and Tilley, N. "Whither (European) Evaluation Methodology?" The International Journal of Knowledge Transfer and Utilization, 1995, 8 (3), 20–34.
Public Management Group/Organization for Economic Coordination and Development. "Background Paper to the OECD Conference on Auditing." Public Management Group/Organization for Economic Coordination and Development Audit Symposium, Paris, June 6–7, 1995.
Rist, R. C. "Management Accountability: The Signals Sent by Auditing and Evaluation." Journal of Public Policy, 1989, 9 (3), 355–369.
Smith, P. "On the Unintended Consequences of Publishing Performance Data in the Public Sector." International Journal of Public Administration, 1995, 18, 277–310.
Udry, J. (ed.). The Media and Family Planning. Chapel Hill: University of North Carolina Press, 1974.
Walker, W. E. "The Impact of General Accounting Office Program Evaluations on Government." Evaluation and Program Planning, 1985, pp. 359–366.
Wilson, J. Q. Bureaucracy. New York: Free Press, 1989.
While there is wide consensus that evaluation and auditing are moving closer together, there is disagreement on the width of the remaining gap. Further integration has both advantages and disadvantages.
Auditing and Evaluation:
Whither the Relationship?
Eleanor Chelimsky
In a paper written more than ten years ago, I examined some of the similarities and differences I perceived in the ways that auditors and evaluators, respectively, assess program performance. I linked these to the histories, mind-sets, training, functions, and methodological approaches of the two professions and spoke to the important and promising relationships I was beginning to glimpse between performance (then called program results) audits and program evaluations (Chelimsky, 1985). Since that time, and based now not only on Elmer Staats's pioneering introduction of program evaluation into the U.S. General Accounting Office (GAO) in 1980, but also on a good many other experiences of collaboration between auditors and evaluators worldwide, it appears that major two-way influences are, at the very least, changing the nature of both professions, even if they have not as yet produced an actual "blending of the two cultures," in Roger Brooks's phrase (1995).
Indeed, there is quite wide consensus today (and this is reflected by the articles in this volume) that audit and evaluation have moved, and are continuing to move, toward increasing closeness with regard to understanding and to methodological approach. Disagreement exists, however, about the degree of closeness that has actually been achieved, along with the reasons for the change.
Evidence from Five Observers
At one end of the spectrum, Leeuw (1995, pp. 15–16) saw the two professions as still quite different and spoke to the need "to improve the training of auditors and open their minds to the social and behavioral mechanisms operating
-------
in the public sector and in decision-making." But he also indicated that current efforts to combine "the strong points of evaluation (like the methodology applied, the attention paid to theory, and theory-driven evaluations) with the strong points of auditing (the orientation toward management, the focus on 'follow-the-money', and the attention paid to documentary evidence) may well lead to a new interdiscipline in the 21st century."
At the other end of the spectrum, Pollitt and Summa (1995, pp. 24–26) noted that "the methods and approaches of auditors and evaluators are coming closer to each other," and found only a marginal difference in the tool kits potentially available to performance auditors and evaluators. They believe this increasing closeness has occurred "as performance auditing becomes more common," and they infer that many differences in methodological approach may be more apparent than real, given that evaluators—who are typically deprived of the statutory authority of auditors—have a vested interest in laying claim to "superior methodology and expertise."
Divorski, who spoke uniquely about the differences between auditing and evaluation approaches, did not, given this topic, discuss the degree to which auditors and evaluators are moving toward each other despite these differences. Like Leeuw—that is, at the opposite pole from Pollitt and Summa—he viewed evaluation and auditing as enterprises that are still very different, especially in terms of focus and mind-set. With respect to focus, Divorski invoked auditing's need to judge whether management performance is adequate and auditors' consequent targeting of management systems and controls, as opposed to evaluation's targeting of management activities in the search to determine program results. Differences in mind-set referred to the same issue; that is, Divorski saw auditors as interested in results only insofar as they reflected management performance, whereas evaluators are interested in results for their own sake. These differences also give rise to other differences, for example, the assumption by auditors that causes for problems found can be explained via "auditors' judgment," whereas evaluators would see the cause-and-effect question as a matter for empirical inquiry. As Divorski put it: "If there are problems with results, the search is on for deficiencies in practices that may have impeded management from detecting and solving the problem." But evaluators would say that deficiencies in management practices may not be the cause of all program problems, and that improving managers' awareness may not do much to solve many of them.
Brooks (1995), somewhere in the middle of the spectrum, did not argue that the methodological approaches of auditors and evaluators are now virtually the same, but rather, from his experience at the state level in the United States, that auditors and evaluators today are "drawing upon the approach of traditional auditing as well as the approach of social science-based evaluation." Like Leeuw, Brooks recognized that "the 'two cultures' of auditing and evaluation still exist," and that "when comparing approaches across states, one observes important differences." But even though he saw these differences as real, he agreed with Pollitt and Summa that they are declining. However, he believes that the decline is "mostly because of a liberalization of traditional auditing," rather than simply the increasing presence of the performance audit. Also, he views the "emergence of a blended approach to auditing and evaluation" as a fait accompli, at least in some places.
It is, of course, difficult to generalize persuasively across the widely different institutional arrangements within which evaluators and auditors work together, and without empirical data, some of the points cited here may be impressionistic. Should we really assume that because similar tool kits are potentially available to evaluators and auditors alike, this means that the tools are equally used? Given the existence of the "two cultures," even if the methodological tool kits were identical, there would still likely be major differences made by auditors and evaluators in the application, use, and pervasiveness of the particular methods selected. (For example, I would expect to see auditors use many more surveys and case studies than, say, quasi-experimental designs, and I would also expect to see the issues of reliability—in survey questions—and of generalizability—in case study findings—handled very differently by evaluators and auditors.)
Again, if there is more "togetherness" observed between auditors and evaluators, is this due to a liberalization in traditional auditing standards and procedures? To the increasing prevalence of performance auditing? To the changing nature of policy makers' questions that force evaluators and auditors to borrow from each other? To the increased prestige of evaluation among auditors today? To the belated recognition by evaluators that auditors are right to be interested in costs? We don't really know.
Thus, although there may be little consensus as yet about how or why it has happened that evaluators and auditors, despite some real differences, are coming closer together, the important point is that many observers believe they are. And the question I would like to raise here is whether any of the mingling and blending being discussed in this volume would have happened without the physical juxtaposition of auditors and evaluators within an organization. It may well be this kind of proximity that has allowed productive comparisons of work methods and of ways to examine policy issues, to resolve technical problems, to establish credibility, such as those we see in the chapters in this volume.
It is much less frequent today than it was fifteen years ago, say, to hear "the auditor's judgment" being used as the sole basis for a finding, or to see evaluators unable to explain what savings (or expenditures) are likely to result from service changes they propose. But if physical closeness is important, then it is also important to examine what we know about how to manage auditors and evaluators together in an organization. That is, given the differences and the similarities of the two fields, and given the benefits for public policy likely to accrue from their increased collaboration, what is the organizational device that will allow us to reap the greatest rewards? Should we integrate the two functions, that is, house and supervise auditors and evaluators together? Or should we keep them separate?
-------
Separation or Integration: Some Advantages and Disadvantages
At first glance, keeping the auditing and evaluation functions separate has a number of obvious disadvantages. Increased costs are involved in maintaining a separate group or department, and important potential diffusion-of-information benefits to the organization may be foregone unless an enlightened management takes steps to break down walls (or prevent them from rising in the first place). In addition, differences in findings among auditing and evaluation units working on the same issues but using different methods can become a problem for the organization. The normal competition between groups for organizational hegemony can move from what is stimulating and healthy to something profoundly unhealthy unless that competition is quite carefully managed.
On the other hand, separation does allow evaluators the independence they need for credibility (that very same independence for which auditors have fought so hard over their long history). Separation also gives evaluators the freedom to develop a critical mass of skills, to establish the legitimacy of their work with policy makers who may not be familiar with evaluation, and to respond to policy questions with strong studies that can demonstrate the worth of evaluation not only to policy makers but to auditors as well. Finally, one of the most important advantages of separation is its feasibility: start-up does not require behavioral change in an organization, only funds, operational know-how, and leadership.
What about integration? Most experience suggests that this can be quite difficult to achieve, at least immediately. A first problem arises because auditors and evaluators have been trained so differently, and tend to have dissimilar mind-sets with regard to a study. Auditors are taught, for example, that they are wasting the taxpayers' money if an investigation does not uncover a major problem. "If nothing is wrong," they say, "then why should we be doing an audit?" But evaluators are trained instead to ask whether some policy or program has made a difference, any difference, good or bad. That is, to evaluators, positive findings are as important as negative ones in improving public policy and need no special justification. To find out what works is at least as useful, evaluators think, as to find out what didn't work.
These mind-sets have some ramifications for the work process. Auditors often do their best to determine whether there is (or is not) a significant problem based, say, on a one-month informal investigation, and may then abandon the project if no significant problem has turned up. If evaluators expect that a study may generate weak or strong findings in any direction, they will often spend three months designing it (more if the policy question asked is complex or controversial), and they have difficulty in answering auditors' questions about what their findings are likely to be before they have collected their data. One result of this process is that evaluations are often afflicted by findings that are anything but conclusive, and this means that under an integrated organizational arrangement, auditors may greet the evaluators' findings with a bored "So what?" and a large yawn, whereas evaluators will always be skeptical about those exciting one-month findings that are established before data on both sides of the question could possibly have been collected.
Emphasis (or the lack of it) on measurement is another training issue that causes tensions between auditors and evaluators. The measurement questions that preoccupy evaluators (like the reliability of items in a questionnaire, or threats to the internal validity of study findings, or the real comparability of before-after or cross-sectional study data) tend to take a bit of time to resolve, are typically low on the auditors' priority list, and are usually not well understood by them. Because of this, absent training on both sides, and especially training of managers, auditors may reproach evaluators for their slowness and evaluators may reproach auditors about the validity of their findings. Unhappily, these perceptions tend to linger within an organization: they engender us-and-them mentalities, they cause morale problems, and they militate against the continuing recruitment and retention of methodologically strong evaluators whose presence is critical to the success of cross-fertilization efforts.
The point here is that when these tensions of mind-set and measurement bubble up, many of the advantages that integration was counted on to supply may not materialize. For example, in a tense atmosphere, knowledge diffusion and organizational learning may be even worse off than under separation. Differences in findings based on methodological approach are still likely to surface under integration, although at a lower level and thus more manageably—that is, easier to suppress than confront—from an organizational perspective. And even if evaluators may feel more protected institutionally than they do as part of a separate unit, the trade-off between that protection and technical quality may not seem worth it to many evaluators.
Organization for Production or for Cross-Fertilization?
Perhaps a useful way to think of separation versus integration is as a function of the institutional goals to be pursued. If the main aim is to increase the capabilities of the organization to answer complex policy questions and to begin doing that as soon as possible, then separate evaluation and audit units such as those at GAO and at the Minnesota Program Evaluation Division have much to recommend them. But if the aim is cross-fertilization in an organization, then separation, especially if it is poorly managed, may bring two important long-run costs: communication with the larger institutional entity may not be good enough, and evaluation staff may begin to feel excluded from important policy decisions.
A good example of this problem comes from the experience of the Office of Management and Budget (once known as the Bureau of the Budget, or BOB). In its early days, BOB decided to bring in people with strong technical skills to complement the work of their budget analysts, and they separated these technical staff from the budget people by creating technical centers in
-------
which the new skills would be deployed. In the words of a former BOB Assistant Director, William D. Carey, what happened was this: separation "built a kind of concentrated quality in the technical centers and it successfully accumulated a critical mass of top-flight specialists." But it also led to an organization in which technical staff became alienated. In Carey's words:
The budget people sat at the table during the Director's reviews, but the technical people had only backbench chairs and very limited possibility to participate in the discussions. When they did speak, their comments were considered intrusive. Promotions and supergrades went to line, not staff, personnel. BOB Directors had little time or interest in the technical work, and technical staff had little or no access to them. On their side, technical people tended to look down on budget examiners as journeymen of very average capabilities [U.S. General Accounting Office, 1990, p. 23].
To remedy this developing rift within the organization, BOB decided to dissolve the technical centers and scatter their personnel across the budget divisions. In this way it was hoped that organizational cohesion and communication could be improved and that the work of the budget divisions would be enriched by the closer proximity of the technical staff's expertise. Carey believes this move to have been a mistake and its results unfortunate, for three reasons:
First, the same problems reappeared, but at the lower, divisional level. The scattered technical staff continued to feel they were second-class citizens and now the situation was worse in that they had no organizational voice. Their sense was that they had to keep proving their worth (as technical people in a budget division) and that they were no better off in terms of having direct inputs into organizational decisions and products. Second, the professional quality of the technical staff weakened over time because the technical centers which had attracted some of the brightest people in their respective fields were no longer there. And finally, the dispersed technical personnel did not appear to have any visible effect on the work of the budget divisions [U.S. General Accounting Office, 1990, p. 24].
At the GAO and in Minnesota, where audit and evaluation units have been separate, at least some of these evils have been avoided, and certainly, with regard to visible effect, the influence of the evaluation work has been recognized and highly regarded.
In short, if integration has not been easy—and it has not—we may need to learn more about why it has been so difficult before dismissing it as an option. Is it, for example, because evaluation and auditing have different sources of credibility and legitimacy? Is it because mind-sets and cultures become entrenched under the stresses of organizational competition? Is it because we simply haven't yet developed both the management techniques and the managers needed to do the job?
My own view is that—given effective managers who know and value both auditing and evaluation, and given also some harmonization of training for auditors and evaluators—it should be possible to integrate the evaluation function successfully in audit organizations. But until that training has taken place (and especially for audit offices beginning now to incorporate evaluation into their work programs), keeping the functions separate, building multiple bridges between them, and watching their interactions and managing them carefully may be not only the most prudent but also the best organizational course of action.
A lot may be riding on this selection of the right organizational model. To be viable, both evaluation and audit functions need independence, skilled personnel, credibility, sponsors who understand the benefits to be drawn from both audits and evaluations, and the capability to respond appropriately to the policy questions of today's political environment. Such capability requires the use of both auditing and evaluation methods. When we can bring these two together and target them properly to policy makers' information needs, and when findings from both types of studies can make their way unimpeded into the policy process, then both evaluations and audits will have achieved their real public purpose: to help make government services more effective, more meaningful, more responsive, more accountable, and—last but not least—better managed.
References
Brooks, R. "Blending Two Cultures: State Legislative Auditing and Evaluation." Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
Chelimsky, E. "Comparing and Contrasting Auditing and Evaluation: Some Notes on Their Relationship." Evaluation Review, 1985, 9 (4), 483–503.
Leeuw, F. L. "Auditing and Evaluation: Bridging a Gap, Worlds to Meet?" Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
Pollitt, C., and Summa, H. "Performance Auditing: Travellers' Tales." Paper presented at the International Evaluation Conference, Vancouver, B.C., Nov. 1995.
U.S. General Accounting Office. Diversifying and Expanding Technical Skills at GAO. GAO/PEMD-90-18S, Vol. 2. Washington, D.C.: U.S. General Accounting Office, Apr. 1990.
ELEANOR CHELIMSKY is an international consultant in evaluation policy and methods and past president of the American Evaluation Association.
-------
AUDITING
[Rating worksheet: each dimension scored on a scale of 1 to 9]
Art            1 2 3 4 5 6 7 8 9   Science
Comprehensive  1 2 3 4 5 6 7 8 9   Specific
Objective      1 2 3 4 5 6 7 8 9   Value-laden
Prove          1 2 3 4 5 6 7 8 9   Improve
Quality        1 2 3 4 5 6 7 8 9   Quantity
Present        1 2 3 4 5 6 7 8 9   Future
Internal       1 2 3 4 5 6 7 8 9   External
Inputs         1 2 3 4 5 6 7 8 9   Outcomes
Separate       1 2 3 4 5 6 7 8 9   Collaborative
Simple         1 2 3 4 5 6 7 8 9   Complex
Inductive      1 2 3 4 5 6 7 8 9   Deductive
Individual     1 2 3 4 5 6 7 8 9   Team
What           1 2 3 4 5 6 7 8 9   Why
Generalizable  1 2 3 4 5 6 7 8 9   Unique
-------
EVALUATION
[Rating worksheet: each dimension scored on a scale of 1 to 9]
Art            1 2 3 4 5 6 7 8 9   Science
Comprehensive  1 2 3 4 5 6 7 8 9   Specific
Objective      1 2 3 4 5 6 7 8 9   Value-laden
Prove          1 2 3 4 5 6 7 8 9   Improve
Quality        1 2 3 4 5 6 7 8 9   Quantity
Present        1 2 3 4 5 6 7 8 9   Future
Internal       1 2 3 4 5 6 7 8 9   External
Inputs         1 2 3 4 5 6 7 8 9   Outcomes
Separate       1 2 3 4 5 6 7 8 9   Collaborative
Simple         1 2 3 4 5 6 7 8 9   Complex
Inductive      1 2 3 4 5 6 7 8 9   Deductive
Individual     1 2 3 4 5 6 7 8 9   Team
What           1 2 3 4 5 6 7 8 9   Why
Generalizable  1 2 3 4 5 6 7 8 9   Unique
-------
Logic Modeling
Refer to:
McLaughlin, John A., and Gretchen B. Jordan (1999). "Logic Models: A Tool for Telling Your Program's Performance Story." Evaluation and Program Planning 22, Elsevier Science Ltd., pp. 65–72.
Articles in New Directions for Evaluation 87, Fall 2000, San Francisco, CA: Jossey-Bass Publishers:
Rogers, Patricia J., Anthony Petrosino, Tracy A. Huebner, and Timothy A. Hacsi. Editor's Notes and "Program Theory Evaluation: Practice, Promise, and Problems," pp. 1–13.
Weiss, Carol Hirschon. "Which Links in Which Theories Shall We Evaluate?" pp. 35–45.
Rogers, Patricia J. "Causal Models in Program Theory Evaluation," pp. 47–55.
1. What is logic modeling?
2. Who uses it? Why?
3. What are the steps?
-------
Drawings of Logic Models
-------
Is Logic Modeling the 'Answer'?
Before you proceed with your task, would it be helpful to have...
...a visual diagram of the design of the program?
...a common description of a program to share with persons internal and external to the program?
...a diagram to facilitate the development of hypotheses about how program inputs are related to activities, activities are related to outputs, and outputs are related to outcomes?
...a diagram to help identify what measures would be helpful in testing your previous hypotheses?
...a diagram to help determine whether the program is designed to succeed?
If the answer is 'yes' to any of these, then let's model!
-------
The World of Logic Modeling
Emmalou Norland
A LOGIC MODEL
• Is a picture (and corresponding text description) of how a program is designed to work.
• Shows that a program has inputs, outputs, and outcomes.
• Indicates how program components are supposed to be related to one another.
• Suggests cause and effect linkages.
(A small sketch of this structure in code follows.)
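The component-and-linkage idea above lends itself to a simple data structure. What follows is a minimal sketch in Python; the names (Element, LogicModel, and the sample entries) are purely illustrative assumptions, not part of any EPA tool. It shows one way a team might record a model's components and its hypothesized cause-and-effect links:

    from dataclasses import dataclass, field

    @dataclass
    class Element:
        name: str   # e.g. "Permits"
        kind: str   # "resource", "activity", "output", or "outcome"

    @dataclass
    class LogicModel:
        elements: list = field(default_factory=list)   # Element objects
        links: list = field(default_factory=list)      # (cause, effect) name pairs

        def add_link(self, cause, effect):
            # Record one hypothesized cause-and-effect linkage.
            self.links.append((cause, effect))

    model = LogicModel()
    model.elements += [
        Element("Budget and FTEs", "resource"),
        Element("Permitting", "activity"),
        Element("Permits", "output"),
        Element("Behavior change", "outcome"),
    ]
    model.add_link("Budget and FTEs", "Permitting")
    model.add_link("Permitting", "Permits")
    model.add_link("Permits", "Behavior change")

Keeping the linkages as explicit (cause, effect) pairs is what later lets a draft model be drawn as a diagram, reviewed with stakeholders, or checked for gaps.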
A Tool to Use When...
Just what IS a logic model?
Generic Logic Model (with Environmental Program Outcomes)
[Diagram; the full version appears later in this packet]
-------
Who Uses Logic Models?
• A logic model serves as a common communication tool among program staff.
• Logic models help explain program logic to funders and other stakeholders.
Evaluators ALSO Use Logic Models
• To understand how the program works, how it links to other programs, and how it contributes to agency long-term goals.
• To identify gaps in the logic of a program's design.
• To determine the critical links in the program's logic so that appropriate measures can be identified for use in evaluations.
A Program Logic Model Explains the ARROW
How a Program Uses Resources to Improve Environmental and Human Health
[Diagram: Congress Provides Money → Improved Environmental and Human Health]
A Program's Logic
Resources are used by the agency to create a set of activities (the program). The activities are designed to produce outputs (products or services) for a set of customers/clients. Those clients react to the outputs (and gain knowledge, skills, and attitudes) such that they can and will change their behavior in a desirable way.
-------
Logic Continued...
When client behavior changes, there are environmental consequences (reduced pollutants, changed ambient conditions...) resulting in improved environmental and human health.
Activities
• Researching
• Permitting
• Regulating
• Developing
• Monitoring
• Training
• Communicating
• Enforcing
Resources
Can be budget, FTEs, equipment, facilities, supplies, products from other programs, internal and external partners, key information...
Outputs
• Scientific Findings
• Permits
• Regulations
• New Technologies
• Databases
• New Methods
• Cutting-Edge Information
• Enforcement
-------
Customers/Clients of EPA Programs
• Congress
• The Public
• States
• Regulated Community
• Other Agencies
Environmental Outcomes
• Stressor Reduced
• Ambient Condition Improves
• Environment Better
Behavioral Outcomes
• Affective Reactions (satisfaction, agreement, support...)
• Knowledge
• Skills
• Attitudes (beliefs, perceptions, values)
• Behavior Change
Environmental and Human Health IMPROVED!
-------
Developing a Logic Model
• Identify the program boundaries.
• Gather written information about the program.
• Put a draft model together.
• Review the model with program staff and other stakeholders.
• Using that information, change the model to be the best representation of the program as designed.
(A small sketch of one way to check a draft for gaps follows this list.)
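As a minimal sketch of the review step, assuming the same (cause, effect) link-list idea as the earlier sketch (all names here are illustrative, not an actual EPA procedure), a draft model can be checked mechanically for elements that nothing feeds or that lead nowhere:

    def find_gaps(elements, links):
        # elements: list of element names; links: list of (cause, effect) pairs.
        causes = {c for c, _ in links}
        effects = {e for _, e in links}
        return {
            "leads_nowhere": [n for n in elements if n not in causes],
            "unsupported": [n for n in elements if n not in effects],
        }

    elements = ["Budget", "Training", "Guidance documents", "Behavior change"]
    links = [("Budget", "Training"), ("Training", "Guidance documents")]
    print(find_gaps(elements, links))
    # "Behavior change" shows up as unsupported: the draft never says what
    # produces it. (Resources such as "Budget" naturally appear unsupported,
    # and final outcomes naturally lead nowhere; reviewers can ignore those.)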
What Makes Up the PROGRAM?
[Diagram: Resources (FTEs, budget, facilities, information) → Activities (research, permitting, enforcement) → Outputs (permits, ...)]
Basic Components of an Environmental Program Logic Model
[Diagram: Behavioral Outcomes → Environmental Outcomes → Environmental and Human Health, with Externalities acting throughout]
If the resources support the 'right' activities, which produce the 'right' outputs, then the client has BEHAVIORAL OUTCOMES.
[Diagram: client starts, stops, increases, or decreases behavior]
-------
If clients move through the behavioral chain to ultimately change behavior, then ENVIRONMENTAL OUTCOMES are achieved.
[Diagram: Stressor Is Reduced → Ambient Condition Improves → Risks Are Reduced → Improvement]
What Are Externalities?
People, events, and other entities which could have an effect upon what happens in the boxes and/or on relationships between the boxes:
Weather, Political Climate, Competing Programs, Other Agencies' Agendas, Lack of Information...
Through many programs achieving BEHAVIORAL AND ENVIRONMENTAL outcomes over a long period of time, the ultimate goal can be reached: ENVIRONMENTAL AND HUMAN HEALTH.
Expanded Environmental Logic Model
-------
We can also use this logic in reverse, when planning a program!
If the ULTIMATE goal is Environmental and Human Health, then what does science tell us about how the environmental conditions need to change to reach that goal?
What kinds of outputs would best address the desired behaviors?
[Diagram box: Information based upon sound science]
If we want those environmental conditions to change, WHO/WHAT are the pollutant sources? (WHO, that is, which customers/clients, need to change WHAT behaviors?)
[Diagram boxes: Stop Polluting; Change Household Practices; Enact Appropriate Policy]
What programs and activities can EPA undertake to produce those outputs?
[Diagram box: Information based on sound science]
-------
AND, FINALLY, what resources are needed to plan and conduct these programs (which, in turn, produce these outputs)?
[Diagram boxes: Resources → Regulations; Information based on sound science]
(A sketch of this reverse walk in code follows.)
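Here is a minimal sketch of the reverse reading these slides describe, again with hypothetical names: start from the ultimate goal and repeatedly ask "what produces this?" until the required resources are reached.

    def plan_backward(goal, produced_by):
        # produced_by maps each element to the elements that produce it.
        order, stack = [], [goal]
        while stack:
            node = stack.pop()
            order.append(node)
            stack.extend(produced_by.get(node, []))
        return order

    produced_by = {
        "Environmental and human health": ["Environmental conditions change"],
        "Environmental conditions change": ["Sources change behavior"],
        "Sources change behavior": ["Information based on sound science"],
        "Information based on sound science": ["Research and communication activities"],
        "Research and communication activities": ["Resources"],
    }
    print(plan_backward("Environmental and human health", produced_by))
    # Prints the goal first and resources last: the planning order the slides describe.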
-------
Generic Logic Model (with Environmental Program Outcomes)
[Diagram: Problem → Program (Resources → Activities → Outputs → Customers) → Short-Term Outcome → Intermediate Outcome → Long-Term Outcome, with Externalities influencing throughout; the outcome chain spans Behavioral Outcomes and Environmental Outcomes]
-------
Environmental Logic Model
[Diagram: Problem → Program (Resources → Activities → Outputs → Customers) → Behavioral Outcomes (Knowledge, Skills, Attitudes, Behavior) → Environmental Outcomes (Stressor/Ambient Condition) → Human Health, with Externalities influencing throughout]
-------
EVALUATION AND PROGRAM PLANNING
PERGAMON
Evaluation and Program Planning 22 (1999) 65–72
Logic models: a tool for telling your program's performance story
John A. McLaughlin (a), Gretchen B. Jordan (b)*
(a) Independent Consultant, 423 Hempstead Road, Williamsburg, VA, USA
(b) Sandia National Laboratories, 950 L'Enfant Plaza SW, Washington, DC, USA
Accepted 1 August 1998
Abstract
Program managers across private and public sectors are being asked to describe and evaluate their programs in new ways. People want managers to present a logical argument for how and why the program is addressing a specific customer need and how measurement and evaluation will assess and improve program effectiveness. Managers do not have clear and logically consistent methods to help them with this task. This paper describes a Logic Model process, a tool used by program evaluators, in enough detail that managers can use it to develop and tell the performance story for their program. The Logic Model describes the logical linkages among program resources, activities, outputs, customers reached, and short, intermediate and longer term outcomes. Once this model of expected performance is produced, critical measurement areas can be identified. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Program theory; Program modeling; Performance measurement; Monitoring and evaluation; Government Performance and Results Act
1. The problem
"At its simplest, the Government Performance and Results Act (GPRA) can be reduced to a single question: What are we getting for the money we are spending? To make GPRA more directly relevant for the thousands of Federal officials who manage programs and activities across the government, GPRA expands this one question into three: What is your program or organization trying to achieve? How will its effectiveness be determined? How is it actually doing? One measure of GPRA's success will be when any Federal manager anywhere can respond knowledgeably to all three questions."
John A. Koskinen, 1997
Office of Management and Budget
Federal managers were being challenged by Mr. Koskinen (1997), Deputy Director of the OMB, to tell their program's story in a way that communicates not only the program's outcome goals, but also that these outcomes are achievable. For many public programs there is also an implicit question: 'Are the results proposed by the program the correct results?' That is, do the results address problems appropriate for the program and deemed by stakeholders to be important to the organizational mission and national needs?
* Corresponding author. E-mail: gbjordan@sandia.gov
The emphasis on accountability and 'managing for results' is found in state and local governments as well as in public service organizations such as the United Way of America and the American Red Cross. It represents a change in the way managers have to describe their programs and document program successes. Program managers are not as familiar with describing and measuring outcomes as they are with documenting inputs and processes. Program design is not necessarily explicit, in part because this allows flexibility should stakeholder priorities change.
There is also an increasing interest among program managers in continuous improvement and managing for 'quality'. Choosing what to measure and collecting and analyzing the data necessary for improvement measurement is new to many managers.
The problem is that clear and logically consistent methods have not been readily available to help program managers make implicit understandings explicit. While tools such as flow charts, risk analysis, and systems analysis are used to plan and describe programs, there is a method developed by program evaluators that more comprehensively addresses the increasing requirements for both outcome measurement and improvement measurement.
Our purpose here is to describe a tool used by many in the program evaluation community, the Logic Model process, to help program managers better meet new
-------
requirements. Documentation of the process by which a manager or group would develop a Logic Model is not readily available even within the evaluation community; thus the paper may also help evaluators serve their customers better.
2. The Program Logic Model
Evaluators have found the Logic Model process useful for at least twenty years. A Logic Model presents a plausible and sensible model of how the program will work under certain conditions to solve identified problems (Bickman, 1987). Thus the Logic Model is the basis for a convincing story of the program's expected performance. The elements of the Logic Model are resources, activities, outputs, customers reached, short, intermediate and longer term outcomes, and the relevant external influences (Wholey, 1983, 1987).
Descriptions and examples of the use of Logic Models can be found in Wholey (1983), Rush & Ogborne (1991), Corbeil (1986), Jordan & Mortensen (1997), and Jordan, Reed, & Mortensen (1997). Variations of the Logic Model are called by different names: 'Chains of Reasoning' (Torvatn, 1998), 'Theory of Action' (Patton, 1997), and 'Performance Framework' (Montague, 1997; McDonald & Teather, 1997). The Logic Model and these variations are all related to what evaluators call program theory. According to Chen (1990), program theory should be both prescriptive and descriptive. That is, a manager has to both explain the elements of the program and present the logic of how the program works. Patton (1997) refers to a program description such as this as an 'espoused theory of action', that is, stakeholder perceptions of how the program will work.
The benefits of using the Logic Model tool include:
• It builds a common understanding of the program and expectations for resources, customers reached, and results, and thus is good for sharing ideas, identifying assumptions, team building, and communication;
• It is helpful for program design or improvement, identifying projects that are critical to goal attainment, redundant, or have inconsistent or implausible linkages among program elements;
• It communicates the place of a program in the organization or problem hierarchy, particularly if there are shared logic charts at various management levels; and
• It points to a balanced set of key performance measurement points and evaluation issues, thus improving data collection and usefulness and meeting the requirements of GPRA.
A simple Logic Model is illustrated in Fig. 1. Resources include human and financial resources as well as other inputs required to support the program, such as partnerships. Information on customer needs is an essential resource to the program. Activities include all those action steps necessary to produce program outputs. Outputs are the products, goods, and services provided to the program's direct customers. For example, conducting research is an activity, and the reports generated for other researchers and technology developers could be thought of as outputs of the activity.
Customers had been dealt with implicitly in Logic Models until Montague added the concept of Reach to the performance framework. He speaks of the 3Rs of performance: resources, people reached, and results (Montague, 1994, 1997). The relationship between resources and results cannot happen without people—the customers served and the partners who work with the program to enable actions to lead to results. Placing customers, the users of a product or service, explicitly in the middle of the chain of logic helps program staff and stakeholders better think through and explain what leads to what and what population groups the program intends to serve.
Outcomes are characterized as changes or benefits resulting from activities and outputs. Programs typically have multiple, sequential outcomes across the full program performance story. First, there are short term outcomes, those changes or benefits that are most closely associated with or 'caused' by the program's outputs. Second, there are intermediate outcomes, those changes that result from an application of the short term outcomes. Long term outcomes, or program impacts, follow from the benefits accrued through the intermediate outcomes. For example, results from a laboratory prototype for an energy saving technology may be a short-term outcome; the commercial scale prototype an intermediate outcome; and a cleaner environment once the technology is in use one of the desired longer term benefits or outcomes.
A critical feature of the performance story is the identification and description of key contextual factors external to the program and not under its control that could influence its success either positively or negatively. It is important to examine the external conditions under which a program is implemented and how those conditions affect outcomes. This explanation helps clarify the program 'niche' and the assumptions on which performance expectations are set. Doing this provides an important contribution to program improvement (Weiss, 1997). Explaining the relationship of the problem addressed through the program, the factors that cause the problem, and external factors enables the manager to argue that the program is addressing an important problem in a sensible way.
3. Building the Logic Model
Fig. 1. Elements of the Logic Model: Resources (inputs) -> Activities -> Outputs (for customers reached) -> Short-term Outcomes -> Intermediate Outcomes (through customers) -> Longer-term Outcomes and Problem Solution, with External Influences and Related Programs running beneath all elements.

As we provide detailed guidance on how to develop a Logic Model and use it to determine key measurement
and evaluation points, it will become clearer how the Logic Model process helps program managers answer the questions Mr. Koskinen and others are asking of them. An example of a federal energy research and technology development program will be used throughout. Program managers in the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy have been using the Logic Model process since 1993 to help communicate the progress and value of their programs to Congress, partners, customers, and other stakeholders.
The Logic Model is constructed in five stages, discussed below. Stage 1 is collecting the relevant information; Stage 2 is describing the problem the program will solve and its context; Stage 3 is defining the elements of the Logic Model in a table; Stage 4 is constructing the Logic Model; and Stage 5 is verifying the Model.
3.1. Stage 1: collecting the relevant information
Whether designing a new program or describing an existing program, it is essential that the manager or a work group collect information relevant to the program from multiple sources. The information will come in the form of program documentation as well as interviews with key stakeholders both internal and external to the program. While Strategic Plans, Annual Performance Plans, previous program evaluations, pertinent legislation and regulations, and the results of targeted interviews should be available to the manager before the Logic Model is constructed, as with any project, this will be an iterative process requiring the ongoing collection of information. Conducting a literature review to gain insights into what others have done to solve similar problems, and key contextual factors to consider in designing and implementing the program, can present powerful evidence that the program approach selected is correct.
Building the Logic Model for a program should be a team effort in most cases. If the manager does it alone, there is a great risk that parts viewed as essential by some will be left out or incorrectly represented. In the following steps to building the Logic Model we refer to the manager as the key player. However, we recommend that persons knowledgeable of the program's planned performance, including partners and customers, be involved in a work group to develop the Model. As the building process begins, it will become evident that there are multiple realities, or views, of program performance. Developing a shared vision of how the program is supposed to work will be a product of persistent discovery and negotiation between and among stakeholders.
In cases where a program is complex or poorly defined, or communication and consensus are lacking, we recommend that a small subgroup or perhaps an independent facilitator be asked to perform the initial analysis and synthesis through document reviews and individual and focus group interviews. The product of this effort can then be presented to a larger work group as a catalyst for the Logic Model process.
3.2. Stage 2: clearly defining the problem and its context
Clearly defining the need for the program is the basis for all that follows in the development of the Logic Model. The program should be grounded in an understanding of the problem that drives the need for the program. This understanding includes understanding the problems customers face and what factors 'cause' the problems. It is these factors that the program will address to achieve the longer term goal—working through customers to solve the problem. For example:
There are economic and environmental challenges related to the production, distribution, and end use of energy. U.S. taxpayers face problems such as dependence on foreign oil, air pollution, and the threat of global warming from burning of fossil fuels. Factors that might be addressed to increase the efficiency of end use of energy include the limited knowledge, risk aversion, and budget constraints of consumers; the lack of competitively priced clean and efficient energy technologies; the externalities associated with public goods; and restructuring of U.S. electricity markets. To solve the problem of economic and environmental challenges related to the use of energy, the program chooses to focus on factors related to developing clean and efficient energy technologies and changing customer values and knowledge. In this way, the program will influence customer use of technologies that will lead to decreased use of energy, particularly of fossil fuels.
One of the greatest challenges faced by work groups
developing Logic Models is describing where their program ends and others start. For the process of building a specific program's Logic Model, the program's performance ends with the problem it is designed to solve with the resources it has acquired, including the external forces that could influence its success in solving that problem. Generally, the manager's concern is determining the reasonable point of accountability for the program. At the point where the actions of customers, partners, or other programs are as influential on the outcomes as actions of the program, there is a shared responsibility for the outcomes, and the program's accountability for the outcomes should be reduced. For example, the adoption of energy efficient technologies is also influenced by financiers and manufacturers of those technologies.
3.3. Stage 3: defining the elements of the Logic Model
3.3.1. Starting with a table
Building a Logic Model usually begins with categorizing the information collected into 'bins', or columns in a table. Using the categories discussed above, the manager goes through the information and tags it as a resource, activity, output, short term outcome, intermediate outcome, long term outcome, or external factor. Since we are building a model of how the program works, not every program detail has to be identified and cataloged, just those that are key to enhancing program staff and stakeholder understanding of how the program works.
Figure 2 is a table with some of the elements of the
Logic Model for a technology program.
3.3.2. Checking the logic
As the elements of the Logic Model are being gathered, the manager and a work group should continually check the accuracy and completeness of the information contained in the table. The checking process is best done by involving representatives of key stakeholder groups to determine if they can understand the logical flow of the program from resources to solving the longer term problem. So the checking process goes beyond determining if all the key elements have been identified, to confirming that, reading from left to right, there is an obvious sequence or bridge from one column to the next.
One way to conduct the check is to start in any column in the table and ask the question, 'How did we get here?' For example, if we select a particular short term outcome, is there an output statement that leads to this outcome? Or, for the same outcome, we could ask, 'Why are we aiming for that outcome?' The answer lies in a subsequent outcome statement in the intermediate or long term outcome columns. If the work group cannot answer either the how or why question, then an element needs to be added or clarified by adding more detail to the elements in question.
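This how-and-why check can be sketched mechanically. The short Python fragment below is a minimal illustration only, not part of the original paper: the columns, element names, and links are all hypothetical. It represents the table as elements connected by left-to-right links and flags any element for which the work group could not answer 'How did we get here?' or 'Why are we aiming for that outcome?'

    # Columns of the Logic Model table, ordered left to right.
    COLUMNS = ["resource", "activity", "output", "customer",
               "short_term", "intermediate", "long_term"]

    # Hypothetical (from, to) links; each element is a (column, name) pair.
    links = [
        (("resource", "budget"), ("activity", "fund grants")),
        (("activity", "fund grants"), ("output", "progress reports")),
        (("output", "progress reports"), ("customer", "industry researchers")),
        (("customer", "industry researchers"), ("short_term", "R&D advances made")),
        (("short_term", "R&D advances made"), ("intermediate", "prototype completed")),
        # Note: no onward link yet from the intermediate outcome.
    ]

    elements = {e for pair in links for e in pair}
    for col, name in sorted(elements, key=lambda e: COLUMNS.index(e[0])):
        has_how = any(dst == (col, name) for _, dst in links)  # something leads here
        has_why = any(src == (col, name) for src, _ in links)  # this leads somewhere
        if col != COLUMNS[0] and not has_how:
            print(f"{name} ({col}): cannot answer 'How did we get here?'")
        if col != COLUMNS[-1] and not has_why:
            print(f"{name} ({col}): cannot answer 'Why are we aiming for this?'")

Run as written, the sketch flags only the intermediate outcome, which has no onward link: the same cue the work group would take to add or clarify an element.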
3.4. Stage 4: drawing the Logic Model
The Logic Model captures the logical flow and linkages that exist in any performance story. Using the program elements in the table, the Logic Model organizes the information, enabling the audience to understand and evaluate the hypothesized linkages. Where the resources, activities, and outcomes are listed within their respective columns in the story, they are specifically linked in the Model, so that the audience can see exactly which activities lead to what intermediate outcomes and which intermediate outcomes lead to what longer term outcomes or impacts.
Fig. 2. A table with elements of the Logic Model for an energy technology program:
  Resources: budget $XXX; #/capacities of staff; #/capabilities of partners; cost shares; technology roadmap; # of years experience.
  Activities: fund grants (solicit, review, etc.); research properties of materials; provide technical assistance; set policy for procurement.
  Outputs: #/type of awards; R&D progress reports; lab and commercial prototypes; advice provided; $ procurements affected.
  Customers Reached: federal and private researchers; industrial firms impacted; manufacturers; existing/future consumers of related products.
  Short Term Outcomes: rejectees seek venture capital; R&D advances made; lab prototype started, results documented; tech roadmap revised; advice considered.
  Intermediate Outcomes: some get venture capital; lab prototype completed; commercial prototype designed; more efficient processes adopted; technology purchased.
  Long Term Outcomes: reduction in energy use; costs lower and economy more competitive; emissions from energy use are less, thus environment cleaner.
  External Influences: price of oil and other energy supply and distribution factors; economic growth; perception of risk of global climate change; market assumptions; technology assumptions.

Although there are several ways to present the Logic
Model (Rush & Ogborne, 1991; Corbeil, 1986), the Logic Model is usually set forth as a diagram with columns and rows, with the abbreviated text put in a box and linkages shown with connecting one-way arrows. We place inputs or resources to the program in the first column at the left of the Model and the longer term outcomes and problem to be solved in the far right column. In the second column, the major program activities are boxed. In the columns following activities, the intended outputs and outcomes from each activity are shown, listing the intended customer for each output or outcome. An example of a Logic Model for an energy efficiency research and development program is depicted in Fig. 3.
The rows are created according to activities or activity groupings. If there is a rough sequential order to the activities, as there often is, the rows will reflect that order reading from top to bottom of the diagram. This is the case if the accomplishments of the program come in stages, as demonstrated in our example of the if-then statements. When the outcomes from one activity serve as a resource for another activity chain, an arrow is drawn from that outcome to the next activity chain. The last in the sequence of activity chains could describe the efforts of external partners, as in the example in Fig. 3. Rather than a sequence, there could be a multi-faceted approach with several concurrent strategies that tackle a problem. For example, a program might do research in some areas and technology development and deployment in others, all working toward one goal such as reducing energy use and emissions.
Although the example shows one-to-one relationships among program elements, this is not always the case. It may be that one output leads to one or more different outcomes, all of which are of interest to stakeholders and are part of describing the value of the program.
Activities can be described at many levels of detail. Since models are simplifications, activities that lead to the same outcome(s) may be grouped to capture the level of detail necessary for a particular audience. A rule of thumb is that a Logic Model would have no more than five activity groupings. Most programs are complex enough that Logic Models at more than one level of detail are helpful. A Logic Model more elaborate than the simple one shown in Fig. 1 can be used to portray more detail for all or any one of its elements. For example, research activities may include literature reviews, conducting experiments, collecting information from multiple sources, analyzing data, and writing reports. These can be grouped and labeled research. However, it may be necessary to formulate a more detailed and elaborate description of research sub-activities for those staff responsible and if this area is of specific interest to a stakeholder group. For example, funding agencies might want to understand the particular approach to research that will be employed to answer key research questions.
The final product may be viewed as a network dis-
playing the interconnections between the major elements
of the program's expected performance, from resources
to solving an important problem. External factors are
entered into the Model at the bottom, unless the program
has sufficient information to predict the point at which
they might occur.
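As a toy illustration of how the rows of such a diagram follow from the element table (the activity chains below are hypothetical and only loosely echo Fig. 3; none of this is from the paper), each activity grouping can be printed as one left-to-right chain, with an outcome of one chain feeding the next:

    # Each activity grouping becomes a row of the diagram, read left to right.
    rows = {
        "Perform research": ["ideas for technology change", "industry researchers",
                             "applications in energy technologies"],
        "Develop technology": ["lab prototype report", "users and manufacturers",
                               "commercial prototypes"],
    }

    for activity, chain in rows.items():
        # One-way arrows stand in for the connecting lines in the diagram.
        print(activity, "->", " -> ".join(chain))

    # An outcome of one chain can serve as a resource for another chain.
    print("applications in energy technologies feeds the 'Develop technology' row")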
Fig. 3. Logic chart for a research and technology development and deployment program. The chart's rows trace four activity chains (perform research; develop technology; deploy technology; produce technology and educate market), each drawing on program resources ($, staff, labs, management; commercial $ and staff for the last chain). Outputs (ideas for technology change; lab prototype report; policies, incentives, information; manufacture in market) reach industry researchers and users and manufacturers, leading to short-term outcomes (applications in energy technologies; commercial prototypes; less risk; technology accepted and purchased), intermediate outcomes (potential for technology change documented; technology available for commercial market; early adopters buy; lower energy bills and emissions from use), and the longer-term outcome of problem solution (a shared responsibility): a more competitive economy and a cleaner environment. External influences: price of oil and electricity, economic growth in industry and in general, perception of risk of global climate change, technology assumptions.
3.5. Stage 5: verifying the Logic Model with stakeholders
As the Logic Model process unfolds, the work group responsible for producing the Model should continuously evaluate the Model with respect to its goal of representing the program logic—how the program works under what conditions to achieve its short, intermediate, and long term aims. The verification process followed with the table of program logic elements is continued with appropriate stakeholders engaged in the review process. The work group will use the Logic Model diagram(s) and the supporting table and text. During this time, the work group also can address what critical information they need about performance, setting the stage for a measurement plan.
In addition to the how-why and if-then questions, we recommend four evaluation questions be addressed in the final verification process:
(1) Is the level of detail sufficient to create understandings of the elements and their interrelationships?
(2) Is the program logic complete? That is, are all the key elements accounted for?
(3) Is the program logic theoretically sound? Do all the elements fit together logically? Are there other plausible pathways to achieving the program outcomes?
(4) Have all the relevant external contextual factors been identified and their potential influences described?
A good way to check the Logic Model is to describe the program logic as hypotheses, a series of if-then statements (United Way of America, 1996). Observations of key contextual factors provide the conditions under which the hypotheses will be successful. The hypothesis or proposition the work group is stating is: 'If assumptions about contextual factors remain correct and the program uses these resources with these activities, then it will produce these short-term outcomes for identified customers who will use them, leading to longer term outcomes.'
This series of if-then statements is implicit in Fig. 1. If resources, then program activities. If program activities, then outputs for targeted customer groups. If outputs change behavior, first short and then intermediate outcomes occur. If intermediate outcomes lead to the longer term outcomes, this will lead to the problem being solved.
For example, given the problem of limited energy resources, the hypothesis might go something like this:
Under the conditions that the price of oil and electricity increase as expected, if the program performs applied research, then it will produce ideas for technology change. If industry researchers take this information and apply it to energy technologies, then the potential for technology changes will be tested and identified. If this promising new knowledge is used by technology developers, then prototypes of energy efficient technologies can be developed. If manufacturers use the prototypes and perceive value and low risk, then commercially available energy saving technologies will result. If there is sufficient market education and incentives and if the price is right, then consumers will purchase the new technologies. If the targeted consumers use the newly purchased technologies, then there should be a net reduction in energy use, energy costs, and emissions, thus making the economy more competitive and the environment cleaner.
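The role these chained hypotheses play in verification, and later in measurement, can be sketched in a few lines. This is an illustration only; the stage labels and the 'observed' evidence below are hypothetical, not findings from the paper. The sketch walks the chain in order and reports the first link the evidence fails to support:

    # Hypothetical if-then chain, in order: (condition, consequence) pairs.
    chain = [
        ("program performs applied research", "ideas for technology change"),
        ("industry researchers apply the ideas", "potential changes tested"),
        ("developers use the new knowledge", "energy efficient prototypes"),
        ("manufacturers see value and low risk", "commercial technologies"),
        ("market education, incentives, right price", "consumers purchase"),
        ("consumers use the technologies", "lower energy use and emissions"),
    ]

    # Hypothetical monitoring evidence: True = observed, False = not observed.
    observed = {
        "ideas for technology change": True,
        "potential changes tested": True,
        "energy efficient prototypes": False,
    }

    for condition, consequence in chain:
        status = observed.get(consequence)
        if status is None:
            print(f"no data yet: if {condition}, then {consequence}")
            break
        if not status:
            print(f"story breaks here: if {condition}, then {consequence}")
            break
        print(f"supported: if {condition}, then {consequence}")

Walking the chain this way makes explicit which links the evidence supports and where the performance story first breaks down.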
4. Measuring performance
Measurement activities take their lead from the Logic Model produced by the work group. There are essentially two purposes for measuring program performance: accountability, or communicating the value of the program to others, and program improvement. When most managers are faced with accountability requirements, they focus on collecting information or evidence of their program's accomplishments—the value added for their customers and the degree to which targeted problems have been solved. Another way to be accountable is to be a good manager. Good managers collect the kind of information that enables them to understand how well their program is working. In order to acquire such an understanding, we believe that, in addition to collecting outcome information, the program manager has to collect information that provides a balanced picture of the health of the program. When managers adopt the program improvement orientation to measurement, they will be able to provide accountability information to stakeholders, as well as make decisions regarding needed improvements to improve the quality of the program.
Measurement strategies should involve ongoing monitoring of what happened in the essential features of the program performance story and evaluation to assess their presumed causal linkages or relationships, including the hypothesized influences of external factors. Weiss (1997), citing her earlier work, noted the importance of not only capturing the program process but also collecting information on the hypothesized linkages. According to Weiss, the measurement should 'track the steps of the program'. In the Logic Model, the boxes are the steps that can often be simply counted or monitored, and the lines connecting the boxes are the hypothesized linkages or causal relationships that require in-depth study to determine and explain what happened.
It is the measurement of the linkages, the arrows in the logic chart, which allows the manager to determine if the program is working. Monitoring the degree to which elements are in place, even the intended and unintended outcomes, will not explain the measurements or tell the manager if the program is working. What is essential is the testing of the program hypotheses. Even if the manager
observes that intended outcomes were achieved, the following question must be asked: 'What feature(s), if any, of the program contributed to the achievement of intended and unintended outcomes?'
Thus adopting the program improvement orientation to performance measurement requires going beyond keeping score. Earlier we referred to Patton's (1997) espoused theory of action. The first step in improvement measurement is determining whether what has been planned in the Logic Model actually occurred. Patton would refer to this as determining theories-in-use. Scheirer (1994) provides an excellent review of process evaluation, including not only methods for conducting the evaluation of how the program works, but also criteria to apply in the evaluation.
The Logic Model provides the hypothesis of how the program is supposed to work to achieve intended results. If it is not implemented according to design, then there may be problems reaching program goals. Furthermore, information from the process evaluation serves as explanatory information when the manager defends accountability claims and attributes the outcomes to the program.
Yin (1989) discusses the importance of pattern matching as a tool to study the delivery and impact of a program. The use of the Logic Model process results in a pattern that can be used in this way. As such it becomes a tool to assess program implementation and program impacts. An iterative procedure may be applied that first determines the theory-in-use, followed by either revisions in the espoused theory or tightening of the implementation of the espoused theory. Next, the resulting tested pattern can be used to address program impacts.
We should note that the verification and checking activities described earlier with respect to Stages 4 and 5 actually represent the first stages of performance measurement. That is, this process ensures that the program design is logically constructed, that it is complete, and that it captures what program staff and stakeholders believe to be an accurate picture of the program.
Solving the measurement challenge often requires that stakeholder representatives be involved in the planning. Stakeholders and the program should agree on the definition of program success and how it will be measured. And often the program has to rely on stakeholders to generate measurement data. Stakeholders have their own needs for measurement data as well as constraints in terms of resources and confidentiality of data.
The measurement plan can be based on the logic chart(s) developed for the program. The manager or work team should use Logic Models with a level of detail that matches the detail needed in the measurement. Stakeholders have different measurement needs. For example, program staff have to think and measure at a more detailed level than upper management.
The following are the performance measurement questions across the performance story which the manager and work team will use to determine the performance measurement plan:
(1) Is (was) each element proposed in the Logic Model in place, at the level expected for the time period? Are outputs and outcomes observed at expected performance levels? Are activities implemented as designed? Are all resources, including partners, available and used at projected levels?
(2) Did the causal relationships proposed in the Logic Model occur as planned? Is reasonable progress being made along the logical path to outcomes? Were there unintended benefits or costs?
(3) Are there any plausible rival hypotheses that could explain the outcome/result?
(4) Did the program reach the expected customers, and are the customers reached satisfied with the program services and products?
A measurement plan will include a small set of critical measures, balanced across the performance story, that are indicators of performance. There may be strategic measures at a high level and tactical measures for implementers of the program. The plan will also include the important performance measurement questions that must be addressed and suggest appropriate timing for outcomes or impact evaluation. This approach to measurement will enable the program manager and stakeholders to assess how well the program is working to achieve its short term, intermediate, and long term aims and to assess those features of the program and external factors that may be influencing program success.
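As a rough sketch of what such a plan might look like as a working artifact (the measures, model locations, and timings below are hypothetical, not drawn from the paper), each critical measure can be keyed to the model element or linkage it monitors and to the measurement question it informs:

    # Hypothetical measurement plan: measure, model location, question, timing.
    measurement_plan = [
        ("number and type of awards made", "output: awards",
         "Q1: elements in place", "quarterly"),
        ("lab prototype results documented", "short-term outcome",
         "Q1: expected performance levels", "annually"),
        ("share of adopters citing program advice", "link: advice -> adoption",
         "Q2: causal links; Q3: rival explanations", "impact study, year 3"),
        ("customer satisfaction with assistance", "customers reached",
         "Q4: reach and satisfaction", "annual survey"),
    ]

    for measure, location, question, timing in measurement_plan:
        print(f"{measure} | {location} | {question} | {timing}")

Keying the plan to elements (the boxes) and linkages (the arrows) preserves the balance called for above: some measures simply count what is in place, while others schedule the in-depth study of hypothesized causal links.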
5. Conclusion
This paper has set forth for program managers and those who support them the Logic Model tool for telling the program's performance story. Telling the story involves answering the questions 'What are you trying to achieve and why is it important?', 'How will you measure effectiveness?', and 'How are you actually doing?' The final product of the Logic Model process will be a Logic Model diagram(s) that reveals the essence of the program, text that describes the Logic Model diagram, and a measurement plan. Armed with this information, the manager will be able to meet accountability requirements and present a logical argument, or story, for the program. Armed with this information, the manager will be able to undertake both outcomes measurement and improvement measurement. Because the story and the measurement plan have been developed with the program stakeholders, the story should be a shared vision with clear and shared expectations of success.
The authors will continue to search for ways to facilitate the use of the Logic Model process and convince
managers and stakeholders of the benefits of its use. We welcome feedback from managers, stakeholders, and facilitators who have tried this or similar tools to develop and communicate a program's performance story.
Acknowledgments
In addition to the authors cited in the references, the authors thank Joe Wholey, Jane Reismann, and other reviewers for sharing their understanding of Logic Models. The authors acknowledge the funding and support of Darrell Beschen and the program managers of the U.S. Department of Energy Office of Energy Efficiency and Renewable Energy, performed under contract DE-AC04-94AL85000 with Sandia National Laboratories. The opinions expressed and the examples used are those of the authors, not the Department of Energy.
References
Bickman, L. (1987). The functions of program theory. In L. Bickman (Ed.), Using program theory in evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass.
Chen, H. T. (1990). Theory-driven evaluations. Newbury Park, CA: Sage.
Corbeil, R. (1986). Logic on logic models. Evaluation Newsletter. Ottawa: Office of the Comptroller General of Canada, September.
Jordan, G. B., & Mortensen, J. (1997). Measuring the performance of research and technology programs: a balanced scorecard approach. Journal of Technology Transfer, 22(2), Summer.
Jordan, G. B., Reed, J. H., & Mortensen, J. C. (1997). Measuring and managing the performance of energy programs: an in-depth case. Presented at the Eighth Annual National Energy Services Conference, Washington, DC, June.
Koskinen, J. A. (1997). Office of Management and Budget testimony before the House Committee on Government Reform and Oversight hearing, February 12.
McDonald, R., & Teather, G. (1997). Science and technology policy evaluation practices in the Government of Canada. In Policy evaluation in innovation and technology: towards best practices. Proceedings of the Organization for Economic Co-operation and Development.
Montague, S. (1994). The three R's of performance-based management. Focus, December-January.
Montague, S. (1997). The three Rs of performance. Ottawa, Canada: Performance Management Network, Inc., September.
Patton, M. Q. (1997). Utilization-focused evaluation: the new century text. Thousand Oaks: Sage, pp. 221-223.
Rush, B., & Ogborne, A. (1991). Program logic models: expanding their role and structure for program planning and evaluation. Canadian Journal of Program Evaluation, 6(2).
Scheirer, M. A. (1994). Designing and using process evaluation. In Wholey et al. (Eds.), Handbook of practical program evaluation (pp. 40-66).
Torvatn, H. (1999). Using program theory models in evaluation of industrial modernization programs: three case studies. Evaluation and Program Planning, 22(1), 73-82.
United Way of America (1996). Measuring program outcomes: a practical approach. Arlington, VA: United Way of America.
Weiss, C. (1997). Theory-based evaluation: past, present, and future. In D. Rog & D. Fournier (Eds.), Progress and future directions in evaluation: perspectives on theory, practice, and methods. New Directions for Program Evaluation, no. 76. San Francisco: Jossey-Bass.
Wholey, J. S. (1983). Evaluation and effective public management. Boston: Little, Brown.
Wholey, J. S. (1987). Evaluability assessment: developing program theory. In L. Bickman (Ed.), Using program theory in evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass.
Yin, R. K. (1989). Case study research: design and methods. Newbury Park: Sage, pp. 109-113.
-------
7. Using Program Theory to Replicate Successful Programs  71
Timothy A. Hacsi
Replicating successful programs is extremely difficult; program theory can help provide information for better replication and adaptation of programs.

8. Theory-Based Evaluation: Gaining a Shared Understanding Between School Staff and Evaluators  79
Tracy A. Huebner
Theory-based evaluation can enhance staff support for and understanding of evaluation; it can also encourage reflective practice.

9. Developing and Using a Program Theory Matrix for Program Evaluation and Performance Monitoring  91
Sue C. Funnell
An approach to program theory evaluation is described that encompasses performance monitoring and the broader contexts that affect programs.

10. Summing Up Program Theory  103
Leonard Bickman
The issues raised in the volume are summarized, analyzed, and placed in the context of other developments in program theory and evaluation.

INDEX  113
EDITORS' NOTES
It has been more than thirty years since evaluators first understood the advantages in making explicit and testing program theory—that is, the underlying assumptions about how a program will work to achieve intended outcomes. And it has been ten years since the last of two issues of New Directions for Program Evaluation (the former name of this series) was devoted to providing evaluators with some of the tools needed to carry out program theory evaluations in practice and helped refine the many conceptual and practical issues involved. Since those volumes were published, the frontiers of evaluation have expanded to meet new challenges, such as performance measurement, organizational learning, collaborative and participatory research, and meta-analysis.
In Program Theory in Evaluation: Challenges and Opportunities, we examine the real or potential role for program theory in these newer areas. But some thorny issues remain for evaluators implementing such evaluations, particularly in regard to causal inference. Following the Introduction, Part One includes four chapters that address some of these challenges. Despite some persistent questions, there are opportunities for program theory to help evaluators in areas such as performance measurement and meta-analysis. Part Two includes four chapters that discuss this potential. The volume's summary chapter belongs to Leonard Bickman.
In Chapter One, the editors assess the current state of program theory, drawing on their own review of the available literature. They use the review as a backdrop to discuss program theory's history, its complexity of definitions and variation in lexicon, its diversity in application, and its strengths and limitations in practice.
Some of the challenges in implementing program theory in evaluation practice are addressed in Part One. A major issue for evaluators is causal inference in program theory evaluations. In Chapter Two, Jane Davidson argues that randomization is but one of several ways that scientists acknowledge can establish causal inference and that evaluation clients often accept lesser standards of proof about causality to get the information they need. She offers that a program theory evaluation could meet the lower standards of proof required in settings that evaluators often find themselves in. In contrast, Thomas Cook argues in Chapter Three that program theory is not sufficient to establish causal inference. Rather than 'falsely choosing' between randomization and program theory, the evaluator can make the 'optimal' choice and combine both.
In addition to the complexities associated with causal inference, there are challenges in deciding what types of program theories or models to test with an evaluation. There are often a number of plausible theories about how a program works and an abundance of links to consider. In an ideal
world, the evaluator would be able to study them all. But as Carol Weiss writes in Chapter Four, the evaluator has to make simplifying choices. She goes on to provide guidance for evaluators who are pondering which links in which theory they should test. In Chapter Five, Patricia Rogers shows how simple models—usually depicting causation like a chain of dominoes—may not accurately reflect how programs work. She provides examples of how evaluators have developed more realistic causal models and makes the point that complexity is not the goal, but rather useful 'maps' that inform subsequent decisions.
Part Two explores the potential for program theory to make contributions in evaluation's new frontiers. The advent of meta-analysis has meant that more evaluators not only are asked to make sense of multiple evaluations but also are pressured to improve their own studies for subsequent reviews. In Chapter Six, Anthony Petrosino demonstrates how even simple program theory evaluations could be used in meta-analysis to accumulate knowledge. Generalizing from a program theory evaluation in one setting to an evaluation in the next setting is Timothy Hacsi's concern in Chapter Seven. After reviewing some basic problems in program diffusion, he shows how the subtle information generated by program theory evaluation offers an alternative to current models for replicating innovations.
Another challenge faced by evaluators is gaining a shared understanding with program staff. In Chapter Eight, Tracy Huebner provides several illustrative case studies to show the value-added of program theory in educational evaluation. Huebner finds that the approach helped evaluators coordinate their goals with those of school staff and reduced the normal resistance that teachers and administrators often have to "yet another study," particularly one that requires them to collect the data.
Evaluators are often being asked to develop systems for monitoring performance. In Chapter Nine, Sue Funnell outlines the problems encountered in developing such systems and draws on her experience to show how program theory can help evaluators develop monitoring tools that make sense. Her matrix can be used to draw out underlying theories about why agencies or programs should succeed, to identify indicators for which measurement is needed, and to ensure that important external factors outside the boundaries of the organization are also monitored.
Finally, in Chapter Ten, Leonard Bickman summarizes these contributions and outlines a future agenda for program theory evaluation. As editor of two seminal New Directions for Program Evaluation volumes on program theory, Bickman has the perfect perch from which to critique the volume and offer his observations and predictions.
We believe this volume offers hope and pragmatism. Program theory can help evaluators meet some of the new challenges they face. But implementing program theory evaluation certainly offers its own challenges and quandaries. Our hope is that this New Directions issue not only will stimulate thinking about program theory evaluation but also—and even more important—will result in an increase in real-world tests and applications.
Anthony Petrosino
Patricia J. Rogers
Tracy A. Huebner
Timothy A. Hacsi
Editors
ANTHONY PETROSINO is research fellow at the Center for Evaluation, Initiatives for Children Program, American Academy of Arts and Sciences, and research associate at the Harvard Graduate School of Education.

PATRICIA J. ROGERS is director of the Program for Public Sector Evaluation in the Faculty of Applied Science, Royal Melbourne Institute of Technology, Australia.

TRACY A. HUEBNER is coordinator for comprehensive school reform at WestEd.

TIMOTHY A. HACSI is research fellow at the Harvard Children's Initiative and teaches history at the Harvard Extension School.
-------
Our review found many variations of PTE in practice and much to recommend it. And elements of PTE, whether the evaluators use the terminology or not, are being used in a wide range of areas of concern to evaluators. Based on this review, in this chapter we discuss the practice, promise, and problems of PTE.
What Is Program Theory Evaluation?
Because this volume is intended to demonstrate the diversity of practice, we have used a broad definition of program theory evaluation. We consider it to have two essential components, one conceptual and one empirical: PTE consists of an explicit theory or model of how the program causes the intended or observed outcomes and an evaluation that is at least partly guided by this model. This definition, though deliberately broad, does exclude some versions of evaluation that have the word theory attached to them. It does not cover all six types of theory-driven evaluation defined by Chen (1990) but only the type he refers to as intervening mechanism evaluation. It also excludes evaluations that articulate a theory of the
program but that do not use the theory to guide the evaluation. Nor does it include evaluations in which the program theory is a list of activities, like a "to do" list, rather than a model showing a series of intermediate outcomes, or mechanisms, by which the program activities are understood to lead to the desired ends.
The idea of basing program evaluation on a causal model of the program is not a new one. At least as far back as the 1960s, Suchman suggested that program evaluation might address the achievement of a "chain of objectives" (1967, p. 55) and argued for the benefit of doing this: "The evaluation study tests some hypothesis that activity A will attain objective B because it is able to influence process C, which affects the occurrence of this objective. An understanding of all three factors—program, objective, and intervening process—is essential to the conduct of evaluative research" (1967, p. 177).
Weiss (1972) went on to explain how an evaluation could identify several possible causal models of a teacher home-visiting program and could determine which was the best as supported by evidence. In the three decades since, many different terms have been used for this type of evaluation, including outcomes hierarchies (Bennett, 1975) and theory-of-action (Schon, 1997). More commonly, the terms program theory (Bickman, 1987, 1990), theory-based evaluation (Weiss, 1995, 1997), and program logic (Lenne and Cleland, 1987) have been used. Rossi, Freeman, and Lipsey's widely used text Evaluation has now, in its sixth edition, added a chapter on this approach (Rossi, Freeman, and Lipsey, 1999). Similarly, Evaluation Models: Viewpoints on Educational and Social Programs (Madaus, Stufflebeam, and Scriven, 1983) has added a chapter on program theory evaluation in its second edition (Rogers, forthcoming).
Practice: Diverse Choices to Meet Diverse Needs
Program theory is known by many different names, created in many different ways, and used for any number of purposes. Here we provide a brief road map to the variety of ways people think about and employ program theory.
Locating Examples. To try to understand the variety of ways in which program theory evaluation is now being used, we began in early 1998 to comb through available bibliographical databases, citation indexes, and evaluation reports. We also reviewed conference proceedings, dissertations, and articles from a variety of disciplines. In addition, we received many helpful examples in response to an inquiry to the American Evaluation Association's Internet discussion list, EVALTALK. Our efforts turned up examples dating from 1997 to 2000 from the United States, Canada, Australia, New Zealand, and the United Kingdom. We have not included every example that we located in this volume but instead have used examples to identify and illustrate critical challenges in using program theory or ways of addressing them. Our review showed amazing diversity in theory and practice across two main areas—how program theories are developed and how they are used to guide evaluations.
Developing the Program Theory—Who, When, and What. In some evaluations, the program theory has been developed largely by the evaluator, based on a review of research literature on similar programs or relevant causal mechanisms, through discussions with key informants, through a review of program documentation, or through observation of the program itself (Lipsey and Pollard, 1989). In other evaluations, the program theory has been developed primarily by those associated with the program, often through a group process. Many practitioners advise using a combination of these approaches (Pawson and Tilley, 1995; Patton, 1996; see also Funnell, Chapter Nine).
The program theory can be developed before the program is implemented or after the program is under way. At times, it is used to change program practice as the evaluation is beginning. Most program theories are summarized in a diagram showing a causal chain. Among the many variations we will highlight just three for now; Rogers discusses other variations in Chapter Five.
At its simplest, a program theory shows a single intermediate outcome by which the program achieves its ultimate outcome. For example, in a program designed to reduce substance abuse, we might test whether or not the program succeeds in changing knowledge about possible dangers and then whether or not this seems important in achieving the desired behavior change. As Petrosino (Chapter Six) points out, for some program areas, articulating this mediating variable and measuring it would be a significant advance on current practice.
More complex program theories show a series of intermediate outcomes, sometimes in multiple strands that combine to cause the ultimate outcomes. So for a substance abuse prevention program, we might theorize that an effective program will generate a positive reaction among participants, change both attitudes and knowledge, and develop participants' skills in resisting peer pressure. Although these more complex program theories may more adequately represent the complexity of programs, it is impossible to design an evaluation that adequately covers all the factors they identify. Weiss (Chapter Four) proposes some ways to select the particular causal links that any one evaluation might study.
The third type of program theory is represented by a series of boxes labeled inputs, processes, outputs, and outcomes, with arrows connecting them. It is not specified which processes lead to which outputs. Instead the different components of a program theory are simply listed in each box. Although this type of program theory does not show the relationships among different components, these relationships are sometimes explored in the empirical component of the evaluation.
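To make the three shapes concrete, the sketch below (illustrative only; the labels are hypothetical and loosely echo the substance abuse example) represents each as data. The binned model lists components but specifies no element-level links, which is exactly why its relationships must be explored empirically if at all:

    # Shape 1: a single mediator between program and ultimate outcome.
    single_mediator = [("program", "knowledge of dangers"),
                       ("knowledge of dangers", "behavior change")]

    # Shape 2: multiple strands combining to cause the ultimate outcome.
    multi_strand = [("program", "positive reaction"),
                    ("program", "attitudes and knowledge"),
                    ("program", "peer-resistance skills"),
                    ("attitudes and knowledge", "behavior change"),
                    ("peer-resistance skills", "behavior change")]

    # Shape 3: binned boxes with no element-level links specified.
    binned = {"inputs": ["staff", "budget"],
              "processes": ["classroom sessions"],
              "outputs": ["students reached"],
              "outcomes": ["behavior change"]}

    print("single mediator: links to test =", len(single_mediator))
    print("multi-strand: links to test =", len(multi_strand))
    print("binned model: components =", sum(len(v) for v in binned.values()),
          "; links to test = 0")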
Using the Program Theory to Guide the Evaluation. Program theory has been used in quite different ways to guide evaluation. Examples show diversity in the purpose and audience of the evaluation, the type of research design, and the type of data collected. Within this diversity, it is possible to identify two broad clusters of practice.
In some PTEs, the main purpose of the evaluation is to test the program theory, to identify what it is about the program that causes the outcomes. This sort of PTE is most commonly used in large, well-resourced evaluations focused on such summative questions as, Does this program work? and Should this pilot be extended? These theory-testing PTEs wrestle with the issue of causal attribution—sometimes using experimental or quasi-experimental designs in conjunction with program theory and sometimes using program theory as an alternative to these designs. Such evaluations can be particularly helpful in distinguishing between theory failure and implementation failure (Lipsey, 1993; Weiss, 1997). By identifying and measuring the intermediate steps of program implementation and the initial impacts, we can begin to answer these questions. These intermediate outcomes also provide some interim measure of program success for programs with long-term intended outcomes.
An example of this type of program theory evaluation can be found in the Family Empowerment Project evaluation, in which Bickman and colleagues (1998) conducted an experimental test of the effects of a program that trained parents to be stronger advocates for children in the mental health system. They articulated a model of how the program was assumed to work: First, parent training would increase the parent's knowledge, self-efficacy, and advocacy skills. Second, parents would then become more involved in their child's mental health care. Finally, this collaboration would lead to the child's improved mental health outcomes.
But they did not stop with the articulation of a program theory. They also constructed measures, collected data, and analyzed them to test these underlying assumptions. The program was able to achieve statistically significant effects on parental knowledge and self-efficacy, but no useful measures for testing advocacy skills could be found. Unfortunately, the intervention had no apparent effect on caregiver involvement in treatment or service use and ultimately had no impact on the eventual mental health status of the children.
Evaluations such as these seem to be at least implicitly based on Weiss's definition of program theory: "[It] refers to the mechanisms that mediate between the delivery (and receipt) of the program and the emergence of the outcomes of interest" (1998, p. 57).
The other type of program theory evaluation is often seen in small evaluations done at the project level by or on behalf of project managers and staff. In these cases, program theory is more likely to be used for formative evaluation, to guide their daily actions and decisions, than for summative evaluation. Such PTEs are often not concerned with causal attribution. Although some of these evaluations pay attention to the influence of external factors, there is rarely systematic ruling out of rival explanations for the outcomes. Many of these evaluations have been developed in response to the increasing demands for programs and agencies to report performance information and to demonstrate their use of evaluation to improve their services. In these circumstances, PTE has often been highly regarded because of the benefits it provides to program managers and staff in terms of improved planning and management, in addition to its use as an evaluation tool.
Stewart, Cotton, Duckett, and McLeady (1990) provide an example of this type of PTE in their evaluation of a project that recruited and trained volunteers to provide emotional support for people with AIDS, their lovers, families, and friends. The paper did not provide a diagram of the program theory model nor present any data. Instead Stewart and colleagues reported on the process of developing the model, the types of data that were gathered, and how the data were used: "Performance indicators developed by Ankali [the project] were both for the organisation's own purposes and to meet the requirements of the funding body. Both qualitative and quantitative indicators were selected. Ankali now uses the outcomes hierarchy during orientation of volunteers and report[s] that the process has assisted with improved targeting of volunteers and referral agencies, modification to the training program, supervision of clients and volunteers, and development of proposals for expansion and enhancement of the service" (1990, p. 317).
This type of program theory evaluation appears to be closer to that described by Wholey: "[It] identifies program resources, program activities, and intended program outcomes, and specifies a chain of causal assumptions linking program resources, activities, intermediate outcomes, and ultimate program goals" (1987, p. 78).
Despite the apparent popularity of program theory evaluation, we found that the formal evaluation literature still has comparatively few examples. For instance, when we searched the abstracts in six bibliographical databases for the time frame 1995-1999, we found program theory explicitly mentioned in evaluations of children's programs only twice. In addition, many of the evaluations that we found used theory in very limited and specific ways, for example, to help plan an evaluation, but very few used theory as extensively as the most prominent proponents of this approach suggest. But PTEs conducted in small projects or local sites are rarely published in refereed journals or distributed widely, being more likely to be presented as conference papers by practitioners or presented in performance measurement forums. And many of them fail to include what some would consider an essential component of a program theory evaluation—systematic testing of the causal model.
In this volume, we include examples of both types of PTE. Weiss (Chapter Four), Hacsi (Chapter Seven), and Petrosino (Chapter Six) discuss issues associated with theory-testing PTEs. Huebner (Chapter Eight) discusses four examples of action-guiding PTEs, and Funnell (Chapter Nine) discusses a technique for assisting with this sort of PTE.
Promises and Problems
Program theory has been seen as an answer to many different problems in evaluation. Here we briefly discuss several areas where program theory has been seen as promising.
Understanding Why Programs Do or Do Not Work. Among the promises made for PTE, the most tantalizing is that it provides some clues to answer the question of why programs work or fail to work. Consider the usual practice of trying to understand why a program succeeded or failed. Following reporting of results, evaluators usually work in a post hoc manner to suggest reasons for observed results (Petrosino, forthcoming, 2000). But without data, such post hoc theories are never tested, and given the poor state of replication in the social sciences, they are likely never to be.
In contrast, by creating a model of the microsteps or linkages in the causal path from program to ultimate outcome—and empirically testing it—PTE provides something more about why the program failed or succeeded in reaching the distal goals it had hoped to achieve, as in Bickman and colleagues' evaluation of the family empowerment program (1998). Perhaps the intervention was not able to improve advocacy skills—remember, those could not be measured. Or maybe there was a critical mechanism missing from the model, which the program was not activating or engaging. We learn something more than the program's apparent lack of impact on children's mental health.
If these issues cannot be adequately addressed in the original evaluation, PTE can provide an agenda for the next program and evaluation.
For example, a critical link in the Bickman and colleagues study was not tested (advocacy skills acquisition), given the paucity of measurement development in this area. Pointing out this deficiency suggests an agenda to develop an instrument to measure this variable in the next similar study.
Attributing Outcomes to the Program. Another promise sometimes made for PTE is better evidence for causal attribution—to answer the question of whether the program caused the observed outcomes. Program theory has been used by evaluators to develop better evidence for attributing outcomes to a program in circumstances where random assignment is not possible (for example, Homel, 1990, in an evaluation of random breath testing of automobile drivers). In the absence of a counterfactual, support for causal attribution can come from evidence of achievement of intermediate outcomes, investigation of alternative explanations for outcomes, and pattern matching. Support for causal attribution can also come from program stakeholder assessments (for example, Funnell and Mograby, 1995, in their evaluation of the impact of program evaluations in a road and traffic authority) or from data about a range of indicators, including data on external factors likely to influence the theorized causal pathway (for example, Ward, Maine, McCarthy, and Kamara, 1994, in their evaluation of activities to reduce maternal mortality in developing countries). It may be possible to develop testable hypotheses on the basis of the causal model (Pawson and Tilley, 1995), especially if the model includes contingencies or differentiation—expected differences in outcomes depending on differences in context. Causal attribution is also sometimes addressed by combining traditional experimental or quasi-experimental designs with PTE.
Many P'l l:s do not address attribution at all, simply leporimg implc -
mentation ol activities and achievement of intended outconus I his
apptoac h is parliculaily common where program ihcoiy is used in develop
ongoing monitoring and peifoimancc information systems ( ausal ainihii-
tion in P11 s is disc ussed in moie detail in the c hapleis in tins voluim by
( ook (C haptei Ihiee), Davidson (Chapter Iwo), and Mac si (C liaptei
Seven)
Improving the Program. Many of the claims for the benefits of PTE refer to its capacity to improve programs directly and indirectly. Articulating a program theory can expose faulty thinking about why the program should work, which can be corrected before things are up and running at full speed (Weiss, 1995). The process of developing a program theory can itself be a rewarding experience, as staff develop common understanding of their work and identify the most important components. Many accounts of PTE (such as Milne, 1993, and Huebner, Chapter Eight) report that this has been the most positive benefit from conducting PTE. In this way, PTE is very similar to the earlier technique of evaluability assessment.

But PTE is supposed to then use the program theory to guide the evaluation, and it is here that some evaluations falter. Whereas collaboratively building a program theory can be an energizing team activity, exposing this
-------
to harsh empirical tests can be less attractive. Practical difficulties abound as well. When PTE is implemented at a small project, staff may not have the time or skills to collect and analyze data in ways that either test the program theory or provide useful information to guide decisions and action. If program theory is used to develop accountability systems, there is a real risk of goal displacement, wherein staff seek to achieve targets and stated objectives at the cost of achieving the ultimate goal or sustainability of the program (Winston, 1991).
Conclusion
In this chapter, we have outlined the range of activity that can be considered program theory evaluation and have identified major issues in its theory and practice. These are discussed in more detail by the other chapters in this volume.
References
Bennett, C. "Up the Hierarchy." Journal of Extension, 1975, 13(2), 7-12.
Bickman, L. (ed.). Using Program Theory in Evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass, 1987.
Homel, R. "Random Breath Testing in New South Wales: The Evaluation of a Successful Social Experiment." National Evaluation Conference 1990: Proceedings, vol. 1. Australasian Evaluation Society, 1990.
Lenne, B., and Cleland, H. "Describing Program Logic." Program Evaluation Bulletin, no. 2. Public Service Board of New South Wales, 1987.
Lipsey, M. W. "Theory as Method: Small Theories of Treatments." In L. Sechrest and A. Scott (eds.), Understanding Causes and Generalizing About Them. New Directions for Program Evaluation, no. 57. San Francisco: Jossey-Bass, 1993.
Lipsey, M. W., and Pollard, J. "Driving Toward Theory in Program Evaluation: More Models to Choose From." Evaluation and Program Planning, 1989, 12, 317-328.
Madaus, G., Stufflebeam, D., and Scriven, M. Evaluation Models: Viewpoints on Educational and Human Services Evaluation. Norwell, Mass.: Kluwer, 1983.
Milne, C. "Outcomes Hierarchies and Program Logic as Conceptual Tools: Five Case Studies." Paper presented at the international conference of the Australasian Evaluation Society, Brisbane, 1993.
Patton, M. Q. Utilization-Focused Evaluation. (3rd ed.) Thousand Oaks, Calif.: Sage, 1997.
Petrosino, A. "Answering the 'Why' Question in Evaluation: The Causal-Model Approach." Canadian Journal of Program Evaluation, 2000, 15(1), 1-24.
Rogers, P. J. "Program Theory Evaluation: Not Whether Programs Work but How." In G. Madaus, D. Stufflebeam, and T. Kellaghan (eds.), Evaluation Models: Viewpoints on Educational and Human Services Evaluation. Norwell, Mass.: Kluwer, forthcoming.
Rossi, P. H., Freeman, H., and Lipsey, M. W. Evaluation: A Systematic Approach. Thousand Oaks, Calif.: Sage, 1999.
Schön, D. A. "Theory-of-Action Evaluation." Paper presented to the Harvard Evaluation Task Force, Apr. 1997.
Stewart, K., Cotton, K., Duckett, M., and McLeady, K. "The New South Wales Program Logic Model: The Experience of the AIDS Bureau, New South Wales Department of Health." Proceedings of the Annual Conference of the Australasian Evaluation Society, 1990, 315-322.
Suchman, E. A. Evaluative Research: Principles and Practice in Public Service and Social Action Programs. New York: Russell Sage Foundation, 1967.
United Way of America. Measuring Program Outcomes: A Practical Approach. Alexandria, Va.: United Way of America, 1996.
Ward, V. M., Maine, D., McCarthy, J., and Kamara, A. "A Strategy for the Evaluation of Activities to Reduce Maternal Mortality in Developing Countries." Evaluation Review, 1994, 18.
Weiss, C. H. Evaluation Research: Methods of Assessing Program Effectiveness. Englewood Cliffs, N.J.: Prentice Hall, 1972.
Weiss, C. H. "Nothing As Practical As Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families." In J. P. Connell, A. C. Kubisch, L. B. Schorr, and C. H. Weiss (eds.), New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts. Washington, D.C.: Aspen Institute, 1995.
Weiss, C. H. "How Can Theory-Based Evaluation Make Greater Headway?" Evaluation Review, 1997, 21, 501-524.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
Wholey, J. S. "Evaluability Assessment: Developing Program Theory." In L. Bickman (ed.), Using Program Theory in Evaluation. New Directions for Program Evaluation, no. 33. San Francisco: Jossey-Bass, 1987.

PATRICIA J. ROGERS is lecturer in public sector evaluation in the Faculty of Applied Science, Royal Melbourne Institute of Technology, Australia.

ANTHONY PETROSINO is research fellow at the Center for Evaluation, Initiatives for Children Program, American Academy of Arts and Sciences.
-------
need to conduct traditional causal modeling analyses of the pattern of influence from the intervention to the various mediating variables and then from these mediators to a distal outcome.

Few evaluators will argue against the more frequent and sophisticated use of substantive theory to detail intervening processes. Probably the sole exceptions are those who believe that the act of measuring process creates conditions different from those that would apply in the actual policy world. Few evaluators argue that it is not possible to collect measures of intervening processes. So it should be possible to construct and justify a theory-based form of evaluation that complements experiments and is in no way an alternative to them. It would prompt experimenters to be more thoughtful about how they conceptualize, measure, and analyze intervening process. It would also remind them of the need to first probe whether an intervention leads to change in each of the theoretically specified intervening processes and then explore whether these processes could plausibly have caused changes in the more distal outcomes of policy interest. I want to see theory-based methods used within an experimental framework and not as an alternative to it.
-------
Table 4.1. Theory of a Job-Training Program

Program publicizes a job-training program
Youth hear about the program
Youth are interested and motivated to apply
Program enrolls eligible youth
Youth sign up
Program provides occupational training in an accessible location
Youth attend regularly
Training matches labor market needs
Training is carried out well
Youth learn skills
Training teaches good work habits
Youth internalize values of regular employment and appropriate behavior on the job
Program refers youth to suitable jobs
Youth apply for jobs
Youth behave well in job interviews
Employers offer jobs
Youth accept jobs
Youth show up for work regularly
Program assists youth in making transition to work and helps with problems
Youth accept authority on the job
Youth do their work well
Youth behave well with coworkers
Youth stay on the job

Source: Adapted from Weiss, 1998, p. 54.
This is an attempt to see how far the program succeeds in accomplishing all the intervening phases between enrollment in the program and long-term job holding. If trainees do well all along the route from participation in the training program to staying on a job, there is at least plausible reason to believe that the program was responsible for the trainees' work success. (See Chapter Two by Jane Davidson for further discussion of establishing causality.)
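One way to make this concrete: the chain in Table 4.1 can be held as an ordered list of steps, each paired with whatever the evaluation found at that step. A minimal Python sketch (the condensed step names and the findings are hypothetical, not results from any actual evaluation):

    # The implementation theory as an ordered causal chain. Each step is
    # paired with the evaluation's finding: True (achieved), False (not
    # achieved), or None (no data collected).
    IMPLEMENTATION_THEORY = [
        ("Youth hear about the program", True),
        ("Youth enroll and attend regularly", True),
        ("Youth learn occupational skills", True),
        ("Youth internalize work values", None),   # no instrument available
        ("Employers offer jobs", False),
        ("Youth stay on the job", False),
    ]

    def first_break(chain):
        """Walk the chain and report where it first breaks down."""
        for step, achieved in chain:
            if achieved is False:
                return "Chain breaks at: " + step
            if achieved is None:
                return "Chain untestable from: " + step + " (no data)"
        return "All steps supported; the distal outcome is plausibly attributable."

    print(first_break(IMPLEMENTATION_THEORY))

Locating the first untested or failed link is exactly the kind of information a post hoc explanation cannot supply.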
But let us take a step back. Table 4.1 shows the expected steps in the implementation of the program. It is what might be called the implementation theory. ...

... Those associated with the program also expect them to help the youth deal with difficult life circumstances, such as an abusive parent or involvement in gang activities. Some program people also expect them to intercede for troubled youth with social workers, police, or probation officers or to help the teens secure services from health clinics or other service agencies.
Several theories of action might be operating. Some people, maybe the program administrators, think that the counselors are role models for the teens. Because of common ethnic backgrounds and life circumstances, the teens can identify with them, will take their words of advice seriously, and will follow a more positive social path. Another theory might be that the counselors understand the perils and pressures that the teens face and will give advice that is better suited to the real world of the inner city than would a middle-class teacher or counselor; they will know how to advise on family problems because of the commonality of their family backgrounds. Another theory might be that the counselors, understanding the local culture, can use threats and penalties effectively, something that white middle-class counselors would be loath to do. Yet another theory is that the counselors will be well acquainted with all the available services in the community and therefore can refer the youth to an appropriate source of help. All of these assumptions grow from the match of counselors to the ethnic and socioeconomic status of the teenagers.

A different set of assumptions would refer to the specific steps and actions that the counselors use in their relations with the teens, perhaps growing from the particular training that they received in the community college. They may have received training in the use of rewards for small steps that a youth takes in a positive direction, such as offering a movie pass for attending school five days in a row. Or they may have been trained to help with the development of peer support groups, where a group of
-------
youngsters help one another maintain good school attendance and proper completion of school work. One might also imagine that a counselor could be effective by tutoring young people in the subjects that give them the most trouble in school and help them overcome cognitive deficits. There are a plethora of theoretical bases on which one might expect the program to be successful in encouraging young people to remain in school and do good work.

If the evaluator is embarking on a theory-based evaluation, which theory does she hook the study to? Does she follow the counselor's encouragement of school attendance? His intervention into family disputes? His referrals to service agencies? His establishment of support groups? His coaching in math? Or what? One study can rarely collect data on all possible activities and their cascading consequences. It would be burdensome to follow each chain of possible events, and the evaluation would become complex and ponderous. Choices have to be made. The evaluator has to decide which of the several theories to track through the series of subsequent steps.
Overall, there are two major sources of theory: the social science literature and the beliefs of program stakeholders. The advantage of social science theories is that they are likely to be based on a body of evidence that has been systematically collected and analyzed. The main disadvantage is that available social science theory may not match the program under review, and even when it does, it may be at such a high degree of abstraction that it is difficult to operationalize in the immediate context. Nevertheless, when social science provides theory and concepts that ground and support local formulations, it can be of great evaluative value (Chen and Rossi, 1987). The evaluator should bring her knowledge of the social science literature to bear on the evaluation at hand.
A way to begin the task of choosing a theory to follow is to ask the program designers, administrators, and practitioners how they believe the program will work. They may have clear-cut ideas about the chain of actions and reactions that they believe will lead to better school achievement of the youth. But it is not unusual to find that different people in the program hold different assumptions about the steps by which inputs will translate into desired outcomes. What can the evaluator do?

First, she can convene a meeting of the stakeholders in the program, perhaps including the youth who are the program's clients, and ask them to discuss their assumptions about how the program will reach the desired results. They should discuss the ministeps of counselor action and youth response that will lead to success. Through such discussion, their originally hazy ideas may become clear, and they may reach consensus about what the program truly aims to do and how it aims to do it.

Program staff will often find a discussion of this type revealing and eminently practical. They will learn what their colleagues assume should be done (and what they are doing). Staff may all be performing the same functions but doing them with different assumptions about why they will be
successful. Or they may actually be doing different things. In discussion, they can find out whether they are working at cross-purposes or are on the same wavelength. If they are working in different directions, the program is apt to be fragmented and ineffective. Staff will often find the effort to reach consensus a stimulating and useful exercise. It may help the program attain coherence and direction.
Including Several Theories
In some instances, program staffs cannot reach consensus. They have markedly different theories about where they should put their time and what kind of actions they should take in order to engage problem youth in school. In such cases, it may be necessary to include several different theories in the evaluation design. The evaluation can follow the chains of assumption of several theories to see which of them is best supported by the data.
When a number of different assumptions are jostling for priority, a TBE is wise to include multiple theories. If only one theory is tracked, and that theory is wrong or incomplete, the evaluator may miss important chains of action. The final result may show that positive outcomes were achieved but not through the series of steps posited by the theory. The evaluator will be unable to explain how success was attained (see Brug, Steenhuis, van Assema, and De Vries, 1996; Puska, Nissinen, and Tuomilehto, 1985). Or if the program has disappointing results, and only one theory was tracked, the evaluator may face readers who say, "But that's not how we thought good results would come about anyway." When programs rest on fuzzy assumptions, it is often useful for TBE to represent a range of theoretical expectations.

But the more theories that are tracked, the more complex and expensive the evaluation. It is worthwhile to try to winnow down the number of possible theories to a manageable number. Three or four would seem to be the maximum that an evaluator could explore in a single study. How can the evaluator decide which of the several theories is worth including in the evaluation?
Criteria for Selecting Theories

The first criterion is the beliefs of the people associated with the program, primarily the designers and developers who planned the program, the administrators who manage it, and the practitioners who carry it out on a daily basis. Also important may be the beliefs of the sponsors whose money funds the program and the clients who receive the services of the program. What do these people assume are the pathways to good outcomes? What are the ministeps that have to be taken if the clients are to receive the benefits that the program promises? What the people who are deeply involved in the program believe is critical because their behavior largely determines how
-------
the program runs. When they hold divergent assumptions about the route to success, the several theories that they proffer become candidates for inclusion.
A second criterion is plausibility. Can the program actually do the things that a theory assumes, and will the clients be likely to respond in the expected fashion? The evaluator needs to see what is really going on. One way is to follow the money. Where is the budget being spent? Where is the program really putting its chips? Which resources are they providing for what kinds of assistance? If the program makes available to each counselor a list of accessible service agencies, their eligibility criteria, and hours of operation, then it is a reasonable bet that they think the referral route is important. If nobody gives the counselors any information about available resources, then this theory is probably not an active candidate for study. If program designers and administrators talk a good deal about ethnic match between counselor and client but end up hiring primarily white middle-class counselors, ethnic match is not an operative theory in this program. Similarly, if the counselors do not know enough about plane geometry or nineteenth-century American history to tutor youth, then assumptions about success through tutoring are not apt to be the route to follow (unless the counselors find other people to do the tutoring). The evaluator needs to take a hard look at the program in action, not just in its planning documents, in order to see which theories are at least plausible in this location.
A third criterion is lack of knowledge in the program field. For example, many programs seem to assume that providing information to program participants will lead to a change in their knowledge, and increased knowledge will lead to a positive change in behavior. This theory is the basis for a wide range of programs, including those that aim to reduce the use of drugs, prevent unwanted pregnancy, improve patients' adherence to medical regimens, and so forth. Program people assume that if you tell participants about the evil effects of illegal drugs, the difficult long-term consequences of unwed pregnancies, and the benefits of complying with physician orders, they will become more conscious of consequences, think more carefully before embarking on dangerous courses of action, and eventually behave in more socially acceptable ways.

The theory seems commonsensical, but social scientists (and many program people) know that it is too simplistic. Much research and evaluation has cast doubt on its universal applicability. Although some programs that convey knowledge in an effort to change behavior have had good results, many have been notoriously unsuccessful. In an effort to add to the stock of knowledge in the program arena, an evaluator may find it worthwhile to pursue this theory in the context of the particular program with which she is working. She may want to carefully track the conditions of the program in order to gather more information about when and where such a theory is supported or disconfirmed by the evidence (and what elements of context, internal organization, and reinforcement make a difference).
So much effort is expended in providing information in an attempt to change behavior (through public service campaigns, material posted to Web sites, distribution of printed materials, lectures and speeches, courses and discussion groups, promotional messages disseminated through multiple media) that careful investigation of this theory is warranted. Furthermore, so much uncertainty exists about the efficacy of providing information of different kinds to different audiences that program developers need a better sense of the prospects. The evaluator who pursues this theory in a TBE may look to social science theory for a sophisticated understanding of when and where information is likely to have effects and under what circumstances. She can build this knowledge into the evaluation. When the results of the evaluation are ready, she can offer program developers and staff a greater understanding of the extent to which information creates change within the immediate program context. Many studies have shown that information can lead to change in knowledge and attitudes but not often to change in behavior. The current evaluation can examine whether and where the sequence of steps in the theory breaks down and what forces undermine, or reinforce, the power of information.
A final criterion for choosing which theories to examine in a theory-based evaluation is the centrality of the theory to the program. Some theories are so essential to the operation of a program that no matter what else happens, the program's success hinges on the viability of this particular theory. Let us take the example of a comprehensive community program. The program involves the provision of funds (by government or a foundation) to a group of community residents, who then decide which enhancements the neighborhood needs in order to improve the lot of its inhabitants. The residents can choose to use the funds to add more services (mental health, education, and so on), clean up the streets and parks, rehabilitate buildings, hire private police, attract new business to the neighborhood in order to create jobs for local people, begin a car service for elderly residents, or whatever other services they decide are most likely to improve the local quality of life.
An evaluation can study the services chosen and find out the consequences of adding police or rehabilitating buildings or whatever other new services have been added. But a fundamental premise of this community-based approach is that local residents are knowledgeable, committed, hardworking, and altruistic enough to find out what is most needed and to go about getting those services into the community. Further, they are assumed to represent the needs and wants of a wide swath of the community. So an underlying theory has to do with the role of citizen groups in developing and directing a comprehensive community initiative. The effectiveness of a group of residents in representing the interests of their neighborhood and securing priority services is key to the success of the program. This assumption becomes a prime candidate for the evaluation.
-------
Which Links in a Theory to Study

Many theories, if drawn out in detail, consist of a long series of interlinked assumptions about how a program will achieve its effects. Let us go back to the job-training program in Table 4.1. If the evaluation does not have the resources or the time to study all of the steps laid out in the theory, which of them should the evaluation explore? Much of the answer to this question will depend on the practicalities of the situation. At what point is the evaluator brought to the scene? Is it after the first several steps have already been taken? How much money does the evaluation have to collect data? How difficult is it to get some kinds of data? For example, what kind of data will the evaluator need in order to know whether the training is carried out well? How will she find out whether the trainees adopt and internalize the values of regular employment? If some kinds of data are difficult or expensive to collect, that will set practical limits.
Second, program staff may have particular concerns about some segments of the implementation theory. They may want to know, for example, whether trainers are giving proper emphasis to good work habits and other "soft skills" or whether the youth in fact learn the occupational skills that the trainers seek to convey. They may want to know whether staff refer them to relevant jobs and whether the youth comport themselves appropriately in job interviews, so that it is clear why they do or do not get jobs.
It may be even more important to examine some links in the program theory about the psychosocial processes that underlie the program. Here is where much of the uncertainty in social programming lies. What impels developing countries to seek to attract more girls into the school system? What gets faculty members in urban universities to teach in interdisciplinary courses in order to retain students in school? In our example, what are the reasons that trainees persist in the training course and learn both job skills and work readiness skills? Is it the capacity of the trainers to develop supportive communities among the youth? Is it the strength of external rewards and punishments?

An evaluation can concentrate on understanding these kinds of mechanisms and the extent to which they operate within the program milieu. The evaluator can collect data on whether peer groups develop during the course of training and the messages and supports that these groups provide to their members. Do youth affiliate in subgroups? Do members of the various groups support the aims of the training program? (Or do they denigrate the effort to learn skills that will yield "chump change"?) Do the trainers actively encourage the formation of subgroups and provide leadership? What messages circulate in the different subgroups about the value of work and the willingness to accept authority on the job? Regarding the theory about external threats, how important to participants in the training program is the reduction in safety net supports?
Because evaluations to date have told their readers relatively little about the why of program success and failure, such inquiries may have great resonance. Studies that explore the psychosocial processes of program theory
will have much to tell program designers, lessons (however tentative) that may be suggestive for a whole range of programs.
Criteria for Selecting the Links to Study
The criteria for choosing which links to study are similar to the criteria for choosing which theories to study. Two are probably most important. The first criterion is the link or links that are most critical to the success of the program. It seems wise to invest resources in studying the particular assumption on which the program most basically rests. If the program is predicated on the assumption that what keeps youth enrolled in the full training program is the support of their peers, then that assumption warrants investigation.
The second criterion is the degree of uncertainty about the linkage. If nobody knows whether the assumption is likely to be supported empirically, or if prior studies have produced conflicting findings on the subject, that link may be worthy of systematic study. Some linkages are unsettled in the social science and the evaluation literatures. Some linkages seem to be supported in the social science literature (or in common sense), but evaluations of earlier programs show that they do not work in practice. An example would be the premise of case management within a multiservice program. A large number of multiservice programs have employed case managers who analyze the services that a family needs, locate and coordinate a range of services, and help the family members obtain appropriate services from relevant agencies. The idea of a family coordinator, an advocate and consultant to the family, sounds so utterly sensible that it is unsettling to find that evaluations have usually not found such programs successful (for example, Bickman and others, 1995; St. Pierre, Layzer, and Goodson, 1997). What are the assumptions that underlie case management? What is the case manager assumed to do, with what immediate consequences, leading to what next steps, with what later consequences? Including some of these kinds of links in the evaluation would yield important information.
Conclusion
In selecting the theory or theories to use as scaffolding for a TBE, the evaluator should consider these criteria:

• The assumptions of the people associated with the program. What are their constructions of the interlinked steps by which program inputs are transmuted into program outcomes?
• The plausibility of the assumptions, given the manner in which the program is allocating its time and resources.
• Uncertainty about the applicability of current assumptions, given the often meager knowledge in the program field.
-------
• The centrality of the assumptions to the program. If the program is based directly on a particular theory, it would be sensible to make this theory the centerpiece of the TBE.

Once the evaluator decides which theory or theories to use for structuring the evaluation, she ought to spell out all the links in the theory chain: what the program will do, how participants will respond, what the program does next, and so on. Many evaluations will not realistically be able to follow all the links in each chain, and the evaluator needs to choose the links on which to focus. Considerations for making that choice include the practicalities of access, resources, and methodological capability for studying given links and the particular knowledge needs of program staff, who want to know which elements of the program they need to modify or shore up.

In making both choices, which theories to select and which links to study, the evaluator needs to consider the underlying mechanisms on which the program rests, what I have called the program theory, in contradistinction to the implementation theory.

References

Brug, J., Steenhuis, I., van Assema, P., and De Vries, H. "The Impact of a Computer-Tailored Nutrition Intervention." Preventive Medicine, 1996, 25, 236-242.
Chen, H. T., and Rossi, P. H. "The Theory-Driven Approach to Validity." Evaluation and Program Planning, 1987, 10, 95-103.
Puska, P., Nissinen, A., and Tuomilehto, J. "The Community-Based Strategy to Prevent Coronary Heart Disease: Conclusions from the Ten Years of the North Karelia Project." Annual Review of Public Health, 1985, 6, 147-193.
St. Pierre, R. G., Layzer, J. I., Goodson, B. D., and Bernstein, L. S. National Impact Evaluation of the Comprehensive Child Development Program. Cambridge, Mass.: Abt Associates, 1997.
Weiss, C. H. "How Can Theory-Based Evaluation Make Greater Headway?" Evaluation Review, 1997, 21, 501-524.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
CAROL H. WEISS is professor of education at the Harvard Graduate School of Education.
-------
Evaluations can be based on normative models (how the program is supposed to work) or descriptive models (how the program actually works). The issues raised in this chapter can be applied to either normative or descriptive models.
Note: This chapter has benefited substantially from the helpful comments, questions, and suggestions from members of the Harvard Evaluation Task Force, participants at the 1998 American Evaluation Association meeting, the editors and series editors of this volume, and as always my graduate evaluation students. Initial work on this chapter was supported by a fellowship from the Spencer Foundation.
-------
What Do the Boxes and Arrows Represent?
Program theory usually involves a diagram of boxes linked by arrows representing cause-and-effect relationships. It is perhaps tempting to consider these causal models to be like wiring diagrams, in which, if we flick a switch at the first box in the diagram, it will cause the lights in the other boxes to illuminate. And indeed, sometimes the descriptions of these models, using a series of if-then statements, suggest this imagery (Owen, with Rogers, 1999; Plantz, Greenway, and Hendricks, 1997).

Evaluators who are familiar with social science principles will not be surprised that few program theory models are based on simple causal relationships like this, even if diagrams do not explicitly show it. However, some program theory models do explicitly attempt to show the processes that are "necessary and sufficient" to produce the desired results; for example, Cooley's causal model (1997) of a program designed to increase girls' participation in high school in developing countries.
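In data terms, such a boxes-and-arrows model is simply a directed graph. A minimal Python sketch (the node names are invented for illustration and only loosely echo the girls' schooling example; they are not Cooley's actual model):

    # A boxes-and-arrows program theory as a directed graph: each key is a
    # box, and its value lists the boxes its arrows point to.
    MODEL = {
        "recruit girls": ["girls enroll"],
        "train teachers": ["teaching improves"],
        "girls enroll": ["girls attend"],
        "teaching improves": ["girls attend"],
        "girls attend": ["girls complete high school"],
    }

    def paths(model, start, goal, trail=None):
        """Enumerate every causal path from start to goal."""
        trail = (trail or []) + [start]
        if start == goal:
            return [trail]
        found = []
        for nxt in model.get(start, []):
            found.extend(paths(model, nxt, goal, trail))
        return found

    for activity in ("recruit girls", "train teachers"):
        for p in paths(MODEL, activity, "girls complete high school"):
            print(" -> ".join(p))

Each enumerated path is one testable sequence of if-then claims; the "necessary and sufficient" question is whether all paths, or only some, must hold for the ultimate outcome.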
More commonly, program theory models are based on a recognition that other factors may influence the achievement of intermediate and ultimate outcomes. For example, the United Way's generic causal model does not explicitly include other factors, apart from a list of constraints on the program. However, in the instructions provided with this model, it is made clear that the further away from actual program outputs one moves, the weaker the program's influence becomes, and the likelihood of outside forces having an influence increases (Plantz, Greenway, and Hendricks, 1997). They go on to give an example of a program providing prenatal counseling for pregnant teens, pointing out that the program can influence what pregnant teens know about appropriate prenatal practices but cannot influence what the teens' overall health was when they became pregnant. Nor can the programs affect whether teens were using drugs when they became pregnant. The authors recognize that each of these issues, general health and involvement with drugs, can have as much long-term influence on the later health of babies as the program itself.
It is interesting to note that this analysis focuses only on fixed characteristics or events that happen before the client begins in the program, sometimes referred to as moderators. Outcomes can also be influenced by factors that occur at the same time as the program and either help or hinder its work. How can we represent these other factors in our causal models? Funnell's program theory matrix (Chapter Nine) includes other factors explicitly in text associated with each outcome. It is also possible to show them on the program theory diagram, as Halpern (1998, 1999) has done, as seen in Figure 5.1.

We might even make a dramatic move completely away from our program-centric causal model and show the web of client relationships that influence client outcomes, including the influence of family, friends, schools, shops, economy, neighborhood, media, legal system, work, and political system (Mullen, 1995).
Figure 5.1. Representing Other Factors in a Logic Model for Reducing Alcohol-Related Motor Vehicle Accident Injuries and Deaths

[Figure: health promotion and prevention services decrease drinking and driving in the general population; addiction treatment services eliminate drinking and driving in alcohol-dependent clients; emergency medical services minimize morbidity and mortality from MVAs. These service outcomes feed the regional health outcome of reducing alcohol-related MVA fatalities and injuries and a provincial goal of reducing premature death. Other factors, such as graduated licensing for novice drivers and alternative social and cultural incentives for drinking and driving, are shown alongside.]
-------
Multiple Strands in Causal Models

Many program theory models portray the program as a single chain of intermediate and ultimate outcomes, where A leads to B and then to C. But it may be helpful to show multiple strands, where A and C both lead to B, either in combination or as alternatives. Ideally, we would be able to distinguish between complementary causal paths and alternative causal paths in a diagram, perhaps by using line arrows for the complementary paths and block arrows for the alternative paths.

If a combination of two causal paths is necessary to achieve the intended results, it is important to make this explicit in order to avoid maximizing only one of them. In many programs, staff must balance competing imperatives like this. When I worked with maternal and child health nurses to develop a causal model of their program to guide the development of performance indicators, they were particularly pleased that they could make visible the balancing they needed to maintain between providing information to parents and supporting parents' confidence in their own abilities. Part of their program model, which used an adaptation of Bennett's hierarchy (Bennett and Rockwell, 1999) to describe their work on infant feeding, showed this clearly, as seen in Figure 5.2.
It was important for the staff to make visible to program managers the competing demands on them and to make sure that performance measures reflected both of these in order to ensure that there were not structural pressures to maximize either information giving or parental support at the expense of the other. When programs are managed by managers without detailed knowledge of program processes or are managed through contractual arrangements, it becomes more important to make explicit competing imperatives such as these. If performance measures only include one of the competing imperatives, then a program may seem to be performing well in terms of its intermediate outcomes because one of these is being maximized at the expense of the other.

Figure 5.2. A Partial Program Model Showing Competing Demands

[Figure: a partial hierarchy running from KNOWLEDGE and BELIEFS through BEHAVIOR CHANGE to the ULTIMATE outcome of healthy nutrition and growth of babies.]
Multiple strands in a causal model may instead represent alternative causal paths. For example, Weiss (1998) outlines four possible mechanisms by which higher teacher pay may be linked to increased student achievement. When these are seen as competing explanations for observed outcomes, then a program theory evaluation might focus on testing which of these best explains the evidence (as Weiss discusses in Chapter Four).

It is also possible to see these alternative causal paths as operating only in certain conditions. Drawing an analogy with gunpowder, which will only fire in favorable conditions, Pawson and Tilley (1997) have suggested that program causal mechanisms only fire within favorable contexts. An evaluation based on this type of causal model will try to understand the circumstances under which particular mechanisms operate. In their re-analysis of a crime prevention program in public housing estates, Pawson and Tilley (1997) demonstrated the importance of understanding the context of different sites, including interactions among various mechanisms (such as improved housing and increased tenant involvement in estate management) and among other coexisting processes.
It is difficult to represent these more complex relationships in a two-dimensional diagram. Pawson and Tilley (1997) instead represent their causal model in a matrix of context-mechanism-outcome configurations, which describes in text the causal mechanism that produces the outcome and the context in which the mechanism is operative.

The characteristics of program clients (their motivations, attitudes, previous knowledge, and skills) are an important part of the context within which causal mechanisms work or fail to work. An iterative series of data collection and analysis activities can be used to identify important ways in which clients vary and the implications of these for program effectiveness (McDonald and Rogers, 1999).
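Held as data, a context-mechanism-outcome matrix is simply a list of configurations. A minimal Python sketch (the two configurations are invented stand-ins loosely based on the housing-estate example, not Pawson and Tilley's actual findings):

    # Context-mechanism-outcome (CMO) configurations as plain records.
    CMO_CONFIGURATIONS = [
        {"context": "stable estate with strong tenant networks",
         "mechanism": "tenant involvement increases informal surveillance",
         "outcome": "burglary falls"},
        {"context": "high-turnover estate with weak networks",
         "mechanism": "tenant involvement fails to take hold",
         "outcome": "little change in burglary"},
    ]

    def expected(context):
        """Return the mechanism-outcome pairs the theory predicts here."""
        return [(c["mechanism"], c["outcome"])
                for c in CMO_CONFIGURATIONS if c["context"] == context]

    print(expected("stable estate with strong tenant networks"))

The evaluation's job is then to check, site by site, whether the observed outcome matches the configuration whose context applies there.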
To fully understand the contexts within which causal mechanisms operate, we may need to develop program models that do more than include program clients simply as passive recipients of treatments that change their lives. If the treatment involves swallowing a pill, we might expect certain physiological effects, regardless of the active involvement of the patient, but even in this example, we know that the patient's expectations about the treatment can influence its reported impacts. It is even less realistic to describe program clients as passive recipients when the program is endeavoring to bring about permanent change in, for example, students' school behavior or the communi-
-------
cation strategies of the hearing impaired, changes that require program clients to learn, apply, and maintain new ways of operating.

Pawson and Tilley (1997) have argued that we need "to shake off those conceptual habits which allow us to speak of a program 'producing outcomes' and to replace them with an imagery which sees the program offering chances which may (or may not) be triggered into action via the subject's capacity to make choices. Potential subjects will consider a program (or not), volunteer for it (or not), cooperate closely (or not), stay the course (or not), learn lessons (or not), retain the lessons (or not), apply the lessons (or not)" (p. 38).

This issue does not need to be associated with a philosophical commitment to serving program clients and having their needs and perspectives at the forefront of program planning and evaluation nor with a belief that the personal dignity of clients and staff requires treating them as program partners rather than as passive objects. In fact, the same distinction holds true for programs such as burglary prevention, which intend to change the behavior of potential burglars. Programs can be understood as changing the options available to participants and their capacities to choose and enact these choices. Usually programs seek to increase options and capacities; some, such as burglary prevention, seek to reduce them.
Causal Models from Systems Theory
Systems theory suggests other types of causal models. In this section, I discuss three of these that appear to be potentially useful for program evaluation: virtuous or vicious circles, symptomatic solutions, and feedback delays.

Virtuous or Vicious Circles. Systems thinking suggests that cause and effect might not be connected in a linear way but in a circular way, in the form of virtuous circles (where an initial effect leads to its own reinforcement and magnification) or vicious circles. Such circular causation helps explain why the effects of programs are likely to decay over time or to become stronger.
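The arithmetic of a circular connection is easy to see. A minimal sketch (the gain values are arbitrary): each period's effect feeds back into the next, so the same loop structure yields either magnification or decay.

    # A circular causal connection: each period's effect feeds back on
    # itself. A gain above 1 magnifies the effect; below 1, it decays.
    def run_loop(initial_effect, gain, periods):
        effect, history = initial_effect, []
        for _ in range(periods):
            history.append(round(effect, 2))
            effect *= gain  # the feedback step
        return history

    print("reinforcing:", run_loop(1.0, 1.10, 6))
    print("decaying:   ", run_loop(1.0, 0.80, 6))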
Symptomatic Solutions. Symptomatic solutions are solutions that relieve the symptoms but that actually make it harder to solve the problem. It would be like having the flu and taking tablets to relieve the symptoms and then continuing to work excessively, rather than convalescing, thereby making it harder to actually recover.

This problem has implications both for evaluation and for monitoring. For an evaluation, where we are trying to understand how effective a particular program has been in solving a problem, we should design our evaluation so that it can distinguish between temporary reduction of symptoms and sustainable solving of the problem. For monitoring, where we are seeking to simultaneously understand and influence program implementation, we should set up systems that do not encourage people to develop dysfunctional symptomatic solutions.
Owen and Lambert (1995) addressed this issue in their evaluation of a program in which all grade-five students used their individual notebook computers in all subjects. One of the unintended consequences of this program was an increase in teacher stress, as teachers struggled to develop their computer skills and simultaneously adapt their teaching material and processes. Initially, teachers responded to this increased stress by "getting on with the job," avoiding spending time in coordination meetings or liaising with other teachers, and the administration sought to support teachers by leaving them alone and not making additional calls on their time. If the evaluation had measured teacher stress at this point only, it would have found that in the short term teacher stress was reduced through this coping mechanism. But over time, this symptomatic solution led to hoarding of equipment, rivalry among groups, and poor attendance at information sessions, consequences that made it harder to implement the fundamental solution, which involved better sharing and coordination of resources and increased support and training for teachers.
Feedback Delays. We have probably all experienced the effects of feedback delay when trying to adjust the water temperature in a shower. If there is a delay in response, we tend to overcorrect: first too hot, then too cold, until eventually reaching the desired equilibrium. The Massachusetts Institute of Technology "beer game" simulation (Senge, 1990) has demonstrated the effects of feedback delay on a simple program: a system for producing and distributing a single brand of beer. Once there is a delay built into the system, so that the decision makers do not immediately see the impact of the changes they are making, the orders become more and more excessive and unbalanced.

The reason for using performance measures and indicators as part of ongoing evaluation in the public sector is that they can be used by managers to take corrective action in programs, much like someone monitoring and
-------
adjusting the shower temperature. Unfortunately, few if any of these systems address the problem of feedback delay. In fact, I have been unable to find a single example that does.
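The overcorrection that delay produces takes only a few lines to simulate. A minimal sketch of the shower example (the target, gain, and delay values are arbitrary): the adjuster reacts to a reading that is several periods old, so the temperature overshoots, first too hot, then too cold.

    from collections import deque

    # Adjusting toward a target using delayed feedback: each correction is
    # based on the reading from DELAY periods ago, so it overshoots.
    TARGET, DELAY, GAIN, PERIODS = 38.0, 3, 0.8, 12

    temp = 20.0
    pipeline = deque([temp] * DELAY)   # readings still "in the pipe"
    for period in range(PERIODS):
        observed = pipeline.popleft()  # stale reading, DELAY periods old
        temp += GAIN * (TARGET - observed)
        pipeline.append(temp)
        print("period", period, "actual temperature", round(temp, 1))

With the delay removed, the same correction rule settles quickly on the target; with the delay in place, the corrections grow, which is the beer-game result in miniature.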
How Complex Should Program Theory Models Be?
Although this chapter has focused on more complex models and causal relationships, it is worth remembering that simple models can often be helpful, particularly in situations in which there have previously been few explicit conceptual and empirical connections made between program activities and outcomes. Building a plausible model of how the program is meant to work helps managers identify the most important processes or intermediate outcomes and focus their measurement and attention there. Given that many program evaluations still collect little data about program implementation or intermediate outcomes, there is often considerable value (as Lipsey, 1993, and Petrosino, Chapter Six, point out) in using even a simple two-step program model that simply identifies and measures one mediating variable that is understood to be necessary for the achievement of the ultimate outcomes. And having a common model of how the program is meant to work can help program staff work together and focus on those activities that are most important for program success.
In fact, as Weick (1995) argues, a model might provide a useful heuristic for purposeful action without necessarily being correct. He recounts the story of the reconnaissance unit, lost in the snow in the Swiss Alps for three days in a blizzard, who eventually managed to find their way safely back to camp with the help of a map, a map, they later discovered, of the Pyrenees, not of the Alps. "This incident raises the intriguing possibility that when you are lost, any old map will do. ... Once people begin to act, they generate tangible outcomes in some context, and this helps them discover what is occurring, what needs to be explained, and what should be done next" (pp. 54-55). Weick goes on to quote Sutcliffe: "Having an accurate environmental map may be less important than having some map that brings order to the world and prompts action" (pp. 56-57).
' V Tins analysis may well explain the positive lesponscs that program stall
/ohm have to progiam theory evaluation (see. for example, Huebner, Uiap-
/ ter fight) even when th.s is based on ve.y simple causal models Bui simple
causal models can be dysfunctional in high-stakes evaluations that are linked
tc, n h nn
()tiKniiirs'~^iiliivi sn/ rli\ui /Avwu
in.ipl nun .ui/A I linn)
References

Chen, H. T. Theory-Driven Evaluations. Thousand Oaks, Calif.: Sage, 1990.
Cooley, L. Presentation at the Washington Evaluators' Conference, 1997.
Funnell, S. "Program Logic: An Adaptable Tool." Evaluation News and Comment, 1997, 6(1), 5-17.
Halpern, G. "Evaluating Innovative Programs in Public Institutions." The Innovation Journal: The Public Sector Innovation Journal, Nov. 1998, revised Nov. 1999. (Available at http://www.innovation.cc)
Lipsey, M. W. "Theory as Method: Small Theories of Treatments." In L. Sechrest and A. Scott (eds.), Understanding Causes and Generalizing About Them. New Directions for Program Evaluation, no. 57. San Francisco: Jossey-Bass, 1993.
Lipsey, M. W., and Pollard, J. "Driving Toward Theory in Program Evaluation: More Models to Choose From." Evaluation and Program Planning, 1989, 12, 317-328.
McClintock, C. "Administrators as Applied Theorists." In L. Bickman (ed.), Advances in Program Theory. New Directions for Program Evaluation, no. 47. San Francisco: Jossey-Bass, 1990.
McDonald, B., and Rogers, P. "Market Segmentation as an Analogy for Influencing Program Theory: An Example from the Dairy Industry." Paper presented at the annual meeting of the American Evaluation Association, Orlando, Fla., 1999.
Owen, J. M., with Rogers, P. J. Program Evaluation: Forms and Approaches. Thousand Oaks, Calif.: Sage, 1999.
Owen, J. M., and Lambert, F. C. "Roles for Evaluation in Learning Organizations." Evaluation, 1995, 1(2), 237-250.
Pawson, R., and Tilley, N. Realistic Evaluation. Thousand Oaks, Calif.: Sage, 1997.
Plantz, M. C., Greenway, M. T., and Hendricks, M. "Outcome Measurement: Showing Results in the Nonprofit Sector." In K. E. Newcomer (ed.), Using Performance Measurement to Improve Public and Nonprofit Programs. New Directions for Evaluation, no. 75. San Francisco: Jossey-Bass, 1997.
Senge, P. M. The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday, 1990.
Weick, K. E. Sensemaking in Organizations. Thousand Oaks, Calif.: Sage, 1995.
Weiss, C. H. Evaluation: Methods for Studying Programs and Policies. (2nd ed.) Englewood Cliffs, N.J.: Prentice Hall, 1998.
-------
The Pilot Process Step by Step Outline
1. Step One: Create the Team and 'Team'
2. Step Two: Begin with the END IN MIND and PLAN YOUR WORK
3. Products
4. Product Contents
5. Tasks
6. For Each Task
7. Step Three: Plan and Conduct Initial Meeting with Agency Top Management
8. Step Four: Gather Documents
9. Step Five: Analyze Documents; Synthesize Information
10. Step Six: Develop Map of all Previous Work
11. Step Seven: Develop Draft Logic Models
12. Step Eight: Share Models with Agency
13. Step Nine: Change Models; Write Narrative
14. Step Ten: Assess Logic Models
15. Step Eleven: Identify Key Evaluation Questions
16. Step Twelve: Create Potential List of Jobs
17. Step Thirteen: Prioritize the List and Recommend Future Work
18. Step Fourteen: Assemble Final Products
19. Develop Pieces and Parts Along the Way
-------
The Pilot Process
Step by Step
Step Two: Begin with the END IN MIND and PLAN YOUR WORK
• Identify the Products
• List the Tasks
• Create the Time Line
• Assign Responsibilities
• Identify Needed Resources
• Specify a Regular Process for Communicating (phone, e-mail, meetings)

Step One: Create the Team
• Philosophy of Working in a Team
• Mechanics of a Team
• Unique Aspects of the Virtual Team
• Communication
• Roles and Responsibilities

PRODUCTS
• Written Report
• Supporting Documentation
• Oral Briefing & Materials
• AND, CONTAINED IN THESE PRODUCTS...
-------
PRODUCT CONTENTS
• Logic Models
• Corresponding Narratives
• Observations
• Key Evaluation Questions
• Prioritization of Potential Jobs
• 'Lessons learned' about the Process
As a part of developing the logic models, begin to 'overlay', on the models, the following:
• GPRA annual performance measures
• Any metrics the agency uses to measure any part of the program (inputs through outcomes)
TASKS
• Plan & conduct initial meeting with
Agency Top Management
• Gather and analyze documents
• Synthesize information; develop tables
• Develop draft 'Logic Models' (theory of program)
• Share draft models with Agency
TASKS CONTINUED...
• Change Logic Models per additional
information from Agency and other
sources
• Write corresponding narrative for models
• Assess the Logic Models
• Identify Evaluation Questions
• Create and prioritize list of Potential Jobs
• Complete Report, Briefing, and 'Lessons
Learned'
-------
For Each Task...
• determine if it will be an effort that involves individual team members, the entire team, the Agency, the other pilot team, the consultant, others?
• identify (gather/request) resources
• set a deadline
• assign to a team member for follow-up
• keep notes, documents, references, a work log
Step Four: Gather Documents
• GPRA documents & other information
• Info on metrics (gathered?)
• Published reports - internal
• Published reports - external
• Popular reports; articles...
• Past evaluations
Step Three: Plan and Conduct Initial
Meeting with Agency Top
Management
Past Evaluations, Audits, Investigations
• When synthesizing the information collected, these documents will be very valuable as 'pieces of the puzzle' which may indicate that there is little need for the same info again!
• Or can indicate a different kind of info is needed
-------
Step Five: Analyze Documents; Synthesize Information
• Read
• Sort
• Begin to develop tables for each & every 'component' of the 'program'
Step Seven: Develop Draft Logic Models
• From tables, create models which indicate sequencing, relationships, dependency
• Use 'Z' maps to indicate layered and program-to-program relationships
• Models include: overall program, clusters, individual programs
(illustrative sketch follows)
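A hypothetical Python sketch of this step: each row of the analysis tables names one program element and the element it is meant to lead to, and the draft model is just the collection of those links (the component names are invented placeholders, not from an actual pilot):

    # Turning analysis tables into a draft logic model: each table row
    # links a program element to the element it is meant to produce.
    table_rows = [
        ("inputs: inspection staff", "activities: facility inspections"),
        ("activities: facility inspections", "outputs: violations identified"),
        ("outputs: violations identified", "outcomes: improved compliance"),
    ]

    draft_model = {}
    for source, result in table_rows:
        draft_model.setdefault(source, []).append(result)

    for source, results in draft_model.items():
        for result in results:
            print(source, "-->", result)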
Step Six: At the same time, begin to develop a large 'map'...
• ...of the complete program
• Include previous audits, investigations, evaluations internal and external to the program
Step Eight: Share Models with Agency
• Group meetings
• Individual meetings
• Ask: "Is this representative?" "What are we missing?" "Where can we find that information?" "What other documents or persons would you suggest we consult?"
• Purpose is to gather more information for your models, not to get agency approval
-------
Step Nine: Change Models; Write Narrative
• Models are revised or sometimes scrapped and begun again :(
• Narrative and footnotes are important; models do not 'stand alone'
• Finished products are 'slick'

Step Ten: Assess the Logic Models (You will have been doing this step all along!)
• GPRA (Do the programs appear to logically lead to the goals? Flow of goals, objectives, subobjectives...)
• Metrics (Are there metrics? Are they 'good'? Can they be aggregated?)

Assess the Logic Models... (continued)
• Logic (In THEORY: Do the resources support the activities? Can the activities produce the outputs? Do the customers receive the outputs? etc. Does everything make sense?)
• Information Gaps (Where are the gaps in information? Why?)

Assess the Logic Models... (continued)
• Completeness (Is the program logic complete?)
• Externalities (Are there unaccounted-for externalities?)
• Mission (Does the program fit within the mission? goals? objectives? subobjectives?)
(illustrative sketch follows)
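Taken together, the assessment questions above amount to a checklist that can be run model by model. A hypothetical Python sketch (the criteria paraphrase the slides; the findings shown are invented):

    # Logic model assessment as a checklist. A False or None answer flags
    # a candidate evaluation question for Step Eleven.
    ASSESSMENT = {
        "GPRA: programs logically lead to goals": True,
        "Metrics: exist, are good, can be aggregated": False,
        "Logic: resources support activities; activities produce outputs": True,
        "Information gaps: located and explained": None,  # still unknown
        "Completeness: program logic is complete": True,
        "Externalities: accounted for": False,
        "Mission: fits mission, goals, objectives, subobjectives": True,
    }

    print("Candidate evaluation questions arise from:")
    for criterion, finding in ASSESSMENT.items():
        if finding is not True:
            print(" -", criterion)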
-------
Step Eleven: Using Logic Model Assessment, Identify Evaluation Questions
• Is there a need for the program? (calls for a NEEDS ASSESSMENT)
• Is the program logically designed? (Are all parts of the logic model present?) (calls for further DESIGN EVALUATION)
• Can the program be implemented as designed? (calls for IMPLEMENTATION EVALUATION)

Using Logic Model Assessment, Identify Evaluation Questions (continued)
• If implemented as designed, can/will the outcomes be attained? What about unintended outcomes? (calls for OUTCOME EVALUATION)
• Just how much is this program contributing to the attainment of desired outcomes? (calls for an IMPACT EVALUATION)

Using Logic Model Assessment, Identify Evaluation Questions (continued)
• Is this program worth the money? (calls for a COST-BENEFIT ANALYSIS)
• Other Questions?

Step Twelve: Create a Potential List of Jobs
• Group questions by type
• Translate evaluation questions into jobs
• Estimate amount of time, FTEs, and resources for each 'job'
(illustrative sketch follows)
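A hypothetical Python sketch of translating grouped questions into candidate jobs (the question types follow Step Eleven; every estimate is a placeholder, not a real workload figure):

    # Translate evaluation questions into candidate jobs with rough
    # estimates of time and FTEs.
    QUESTIONS = [
        ("NEEDS ASSESSMENT", "Is there a need for the program?", 4, 1),
        ("IMPLEMENTATION EVALUATION",
         "Can the program be implemented as designed?", 6, 2),
        ("OUTCOME EVALUATION", "Will the outcomes be attained?", 9, 2),
    ]

    jobs = [{"job": qtype + ": " + question, "months": months, "ftes": ftes}
            for qtype, question, months, ftes in QUESTIONS]

    for job in jobs:
        print(f"{job['job']} ({job['months']} months, {job['ftes']} FTEs)")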
-------
Step Thirteen: Prioritize the List and Recommend Future Work
• Environmental risk and risk reduction
• Level of EPA investment
• Importance of knowledge gaps
• Level of stakeholder interest
• Importance for restoring or preserving EPA's credibility
(illustrative sketch follows)
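One hypothetical way to apply these criteria is a weighted score for each candidate job. A Python sketch (the weights and the 1-to-5 ratings are invented for illustration; the real prioritization would be a team judgment):

    # Weighted scoring against the Step Thirteen criteria (ratings 1-5).
    WEIGHTS = {
        "risk reduction": 0.30, "EPA investment": 0.20,
        "knowledge gaps": 0.20, "stakeholder interest": 0.15,
        "EPA credibility": 0.15,
    }

    CANDIDATES = {
        "outcome evaluation of permit program":
            {"risk reduction": 5, "EPA investment": 4, "knowledge gaps": 3,
             "stakeholder interest": 4, "EPA credibility": 3},
        "needs assessment for outreach effort":
            {"risk reduction": 2, "EPA investment": 2, "knowledge gaps": 5,
             "stakeholder interest": 3, "EPA credibility": 4},
    }

    def score(ratings):
        return sum(WEIGHTS[c] * r for c, r in ratings.items())

    for job in sorted(CANDIDATES, key=lambda j: score(CANDIDATES[j]),
                      reverse=True):
        print(f"{score(CANDIDATES[job]):.2f}  {job}")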
...DEVELOP 'PIECES AND PARTS' OF PRODUCTS ALONG THE WAY
• Logic Models
• Corresponding Narratives
• Observations
• Key Evaluation Questions
• Prioritization of Potential Jobs
• 'Lessons learned' about the Process
Step Fourteen: Assemble Final Products
• Written Report
• Supporting Documentation
• Oral Briefing & Materials
• BUT, TRY TO...
From this...
To this...
------- |