Program Evaluation Studies

TK Logan and David Royse
A variety of programs have been developed to address social problems such as drug addiction, homelessness, child abuse, domestic violence, illiteracy, and poverty. The goals of these programs may include directly addressing the origin of the problem or moderating the effects of these problems on individuals, families, and communities. Sometimes programs are developed to prevent something from happening, such as drug use, sexual assault, or crime. These kinds of problems, and the programs that help people affected by them, are often what attracts many social workers to the profession; we want to be part of the mechanism through which society provides assistance to those most in need. Despite low wages, bureaucratic red tape, and routinely uncooperative clients, we tirelessly provide services that are invaluable but that may, at various times, be or become insufficient or inappropriate. Without conducting evaluation, we do not know whether our programs are helping or hurting, that is, whether they only postpone the hunt for real solutions or truly construct new futures for our clients. This chapter provides an overview of program evaluation in general and outlines the primary considerations in designing program evaluations.
Evaluation can be done informally or formally. As consumers, we are constantly informally evaluating products, services, and information. For example, we may choose not to return to a store or an agency if we did not evaluate the experience as pleasant. Similarly, we may mentally take note of unsolicited comments or anecdotes from clients and draw conclusions about a program. Anecdotal and informal approaches such as these generally are not regarded as carrying scientific credibility. One reason is that decision biases play a role in our "informal" evaluation. Specifically, vivid memories or strongly negative or positive anecdotes will be overrepresented in our summaries of how things are evaluated. This is why objective data are necessary to truly understand what is or is not working.
By contrast, formal evaluations systematically examine data from and about programs and their outcomes so that better decisions can be made about the interventions designed to address the related social problem. Thus, program evaluation involves the use of social research methodologies to appraise and improve the ways in which human services, policies, and programs are conducted. Formal evaluation, by its very nature, is applied research.
Formal program evaluations attempt to answer the following general question: Does the program work? Program evaluation may also address questions such as the following: Do our clients get better? How does our success rate compare to those of other programs or agencies? Can the same level of success be obtained through less expensive means?
What is the experience of the typical client? Should this program be terminated and its funds applied elsewhere?
Ideally, a thorough program evaluation would address more complex questions in three main areas: (1) Does the program produce the intended outcomes and avoid unintended negative outcomes? (2) For whom does the program work best and under what conditions? (3) How well was a program model developed in one setting adapted to another setting?
Evaluation has taken an especially prominent role in practice today because of the focus on evidence-based practice in social programs. Social work, as a profession, has been asked to use evidence-based practice as an ethical obligation (Kessler, Gira, & Poertner, 2005). Evidence-based practice is defined differently across sources, but most definitions include using program evaluation data to help determine best practices in whatever area of social programming is being considered. In other words, evidence-based practice includes using objective indicators of success in addition to practice or more subjective indicators of success.
Formal program evaluations can be found on just about every topic. For instance, Fraser, Nelson, and Rivard (1997) examined the effectiveness of family preservation services; Kirby, Korpi, Adivi, and Weissman (1997) evaluated an AIDS and pregnancy prevention middle school program. Morrow-Howell, Becker-Kemppainen, and Judy (1998) evaluated an intervention designed to reduce the risk of suicide in elderly adult clients of a crisis hotline. Richter, Snider, and Gorey (1997) used a quasi-experimental design to study the effects of a group work intervention on female survivors of childhood sexual abuse. Leukefeld and colleagues (1998) examined the effects of an HIV prevention intervention with injecting drug and crack users. Logan and colleagues (2004) examined the effects of a drug court intervention as well as the costs of drug court compared with the economic benefits of the drug court program.
Basic Evaluation Considerations
Before beginning a program evaluation, several issues must be considered. These issues involve decisions that are critical in determining the evaluation methodology and goals. Although you may not have complete answers to these questions when beginning to plan an evaluation, they help in developing the plan and must be answered before an evaluation can be carried out. We can sum up these considerations with the following questions: who, what, where, when, and why.
First, who will do the evaluation? This seems like a simple question at first glance. However, this particular consideration has major implications for the evaluation results. Program evaluators can be categorized as either internal or external. An internal evaluator is a program staff member or regular agency employee, whereas an external evaluator is a professional, on contract, hired for the specific purpose of evaluation. There are advantages and disadvantages to using either type of evaluator. For example, the internal evaluator probably will be very familiar with the staff and the program. This may save a lot of planning time. The disadvantage is that evaluations completed by an internal evaluator may be considered less valid by outside agencies, including the funding source. The external evaluator generally is thought to be less biased in terms of evaluation outcomes because he or she has no personal investment in the program. One disadvantage is that an external evaluator frequently is viewed as an "outsider" by the staff within an agency. This may affect the amount of time necessary to conduct the evaluation or cause problems in the overall evaluation if agency staff are reluctant to cooperate.
Second, what resources are available to conduct the evaluation? Hiring an outside evaluator can be expensive, while having a staff person conduct the evaluation may be less expensive. So, in a sense, you may be trading credibility for lower cost. In fact, each methodological decision will have a trade-off in credibility, level of information, and resources (including time and money). Also, the amount and level of information, as well as the research design, will be determined, to some extent, by what resources are available. A comprehensive and rigorous evaluation takes significant resources.
Third, where will the information come from? If an evaluation can be done using existing data, the cost will be lower than if data must be collected from numerous people, such as clients and/or staff across multiple sites. So having some sense of where the data will come from is important.
Fourth, when is the evaluation information needed? In other words, what is the timeframe for the evaluation? The timeframe will affect the costs and the design of research methods.
Fifth, why is the evaluation being conducted? Is the evaluation being conducted at the request of the funding source? Is it being conducted to improve services? Is it being conducted to document the cost-benefit trade-off of the program? If future program funding decisions will depend on the results of the evaluation, then a lot more importance will be attached to it than if a new manager simply wants to know whether clients were satisfied with services. The more that is riding on an evaluation, the more attention will be given to the methodology and the more threatened staff can be, especially if they think that the purpose of the evaluation is to downsize and trim excess employees. In other words, there are many reasons an evaluation may be considered, and these reasons have implications for the evaluation methodology and implementation.
Once the issues described above have been considered, more complex questions and trade-offs will need to be addressed in planning the evaluation. Specifically, six main issues guide and shape the design of any program evaluation effort and must be given thoughtful and deliberate consideration:
1. Defining the goal of the program evaluation
2. Understanding the level of information needed for the program evaluation
3. Determining the methods and analysis that need to be used for the program evaluation
4. Considering issues that might arise and strategies to keep the evaluation on course
5. Developing results into a useful format for the program stakeholders
6. Providing practical and useful feedback about the program strengths and weaknesses as well as providing information about next steps
Defining the Goal of the Program Evaluation
It is essential that the evaluator has a firm understanding of the short- and long-term objectives of the evaluation. Imagine being hired for a position but not being given a job description or informed about how the job fits into the overall organization. Without knowing why an evaluation is called for or needed, the evaluator might attempt to answer a different set of questions from those of interest to the agency director or advisory board. The management might want to know why the majority of clients do not return after one or two visits, whereas the evaluator might think that his or her task is to determine whether clients who received group therapy sessions were better off than clients who received individual counseling.
In defining the goals of the program evaluation, several steps should be taken. First, the program goals should be examined. These can be learned through examining official program documents as well as through talking to key program stakeholders. In clarifying the overall purpose of the evaluation, it is critical to talk with different program "stakeholders." Scriven (1991) defines a program stakeholder as "one who has a substantial ego, credibility, power, futures, or other capital invested in the program.... This includes program staff and many who are not actively involved in the day-to-day operations" (p. 334). Stakeholders include both supporters and opponents of the program as well as program clients or consumers, or even potential consumers or clients. It is essential that the evaluator obtain a variety of different views about the program. By listening to and considering stakeholder perspectives, the evaluator can ascertain the most important aspects of the program to target for the evaluation by looking for overlapping concerns, questions, and comments from the various stakeholders. However, it is important that the stakeholders have some agreement on what program success means. Otherwise, it may be difficult to conduct a satisfactory evaluation.
It is also important to consult the extant literature to understand what similar programs have used to evaluate their outcomes, as well as to understand the theoretical basis of the program, in defining the program evaluation goals. Furthermore, it is critical that the evaluator work closely with whoever initiated the evaluation to set priorities for the evaluation. This process should identify the intended outcomes of the program and which of those outcomes, if not all of them, will be evaluated. Taking the evaluation a step further, it may be important to include the examination of unintended negative outcomes that may result from the program. Stakeholders and the literature will also help to determine those kinds of outcomes.
Once the overall purpose and priorities of the evaluation are established, it is a good idea to develop a written agreement, especially if the evaluator is an external one. Misunderstandings can and will occur months later if things are not written in black and white.
Understanding the Level of Information Needed for the Program Evaluation
The success of the program evaluation revolves around the evaluator's ability to develop practical, researchable questions. A good rule to follow is to focus the evaluation on one or two key questions. Too many questions can lengthen the process and overwhelm the evaluator with too much data that, instead of facilitating a decision, might produce inconsistent findings. Sometimes, funding sources require only that some vague, undefined type of evaluation is conducted. The funding sources might neither expect nor desire dissertation-quality research; they simply might expect "good faith" efforts when beginning evaluation processes. Other agencies may be quite demanding in the types and forms of data to be provided. Obviously, the choice of methodology, data collection procedures, and reporting formats will be strongly affected by the purpose, objectives, and questions examined in the study.
It is important to note the difference between general research and evaluation. In research, the investigator often focuses on questions based on theoretical considerations or hypotheses generated to build on research in a specific area of study. Although program evaluations may focus on an intervention derived from a theory, the evaluation questions should, first and foremost, be driven by the program's objectives. The evaluator is less concerned with building on prior literature or contributing to the development of practice theory than with determining whether a program worked in a specific community or location.
There are actually two main types of evaluation questions. There are questions that focus on client outcomes, such as, "What impact did the program have?" These kinds of questions are addressed by using outcome evaluation methods. Then there are questions that ask, "Did the program achieve its goals?" "Did the program adhere to the specified procedures or standards?" or "What was learned in operating this program?" These kinds of questions are addressed by using process evaluation methods. We will examine both of these evaluation approaches in the following sections.
Process Evaluation
Process evaluations offer a "snapshot" of the program at any given time. Process evaluations typically describe the day-to-day program efforts; program modifications and changes; outside events that influenced the program; people and institutions involved; culture, customs, and traditions that evolved; and the sociodemographic makeup of the clientele (Scarpitti, Inciardi, & Pottieger, 1993). Process evaluation is concerned with identifying program strengths and weaknesses. This level of program evaluation can be useful in several ways, including providing a context within which to interpret program outcomes, so that other agencies or localities wishing to start similar programs can benefit without having to make the same mistakes.
As an example, Bentelspacher, DeSilva, Goh, and LaRowe (1996) conducted a process evaluation of the cultural compatibility of psychoeducational family group treatment with ethnic Asian clients. As another example, Logan, Williams, Leukefeld, and Minton (2000) conducted a detailed process evaluation of drug court programs before undertaking an outcome evaluation of the same programs. The Logan et al. study used multiple methods to conduct the process evaluation, including in-depth interviews with the program administrative personnel, interviews with each of five judges involved in the program, surveys and face-to-face interviews with 22 randomly selected current clients, and surveys of all program staff, 19 community treatment provider representatives, 6 randomly selected defense attorney representatives, 4 prosecuting attorney representatives, 1 representative from the probation and parole office, 1 representative from the local county jail, and 2 police department representatives. In all, 69 different individuals representing 10 different agency perspectives provided information about the drug court program. Also, all agency documents were examined and analyzed, observations of various aspects of the program process were conducted, and client intake data were analyzed as part of the process evaluation. The results were all integrated and compiled into one comprehensive report.
What makes a process evaluation so important is that researchers often have relied only on selected program outcome indicators, such as termination and graduation rates or number of rearrests, to determine effectiveness. However, to better understand how and why a program such as drug court is effective, an analysis of how the program was conceptualized, implemented, and revised is needed. Consider this example: say one outcome evaluation of a drug court program showed a graduation rate of 80% of those who began the program, while another outcome evaluation found that only 40% of those who began the program graduated. Then, the graduates of the second program were more likely to be free from substance use and criminal behaviors at the 12-month follow-up than the graduates from the first program. A process evaluation could help to explain the specific differences in factors such as selection (how clients get into the programs), treatment plans, monitoring, program length, and other program features that may influence how many people graduate and stay free from drugs and criminal behavior at follow-up. In other words, a process evaluation, in contrast to an examination of program outcome only, can provide a clearer and more comprehensive picture of how drug court affects those involved in the program. More specifically, a process evaluation can provide information about program aspects that need to be improved and those that work well (Scarpitti, Inciardi, & Pottieger, 1993). Finally, a process evaluation may help to facilitate replication of the drug court program in other areas. This often is referred to as technology transfer.
A different but related process evaluation goal might be a description of the failures and departures from the way in which the intervention originally was designed. How were the staff trained and hired? Did the intervention depart from the treatment manual recommendations? Influences that shape and affect the intervention that clients receive need to be identified because they affect the fidelity of the treatment program (e.g., delayed funding or staff hires, changes in policies or procedures). When program implementation deviates significantly from what was intended, this might be the logical explanation as to why a program is not working.
Outcome or Impact Evaluation
Outcome or impact evaluation focuses on the targeted objectives of the program, often looking at variables such as behavior change. For example, many drug treatment programs may measure outcomes or "success" by the number of clients who abstain from drug use. Questions always arise, though. For instance, an evaluation might reveal that 90% of those who graduate from the program abstain from drug use 30 days after the program was completed. However, only 50% report abstaining from drug use 12 months after the program was completed. Would the key stakeholders involved all consider that a success or a failure of the program? This example brings up three critical issues in outcome evaluations.
One of the critical issues in outcome evaluations is related to understanding for whom the program works best and under what conditions. In other words, a more interesting and important question, rather than just asking whether a program works, would be to ask, "Who are those 50% of people who remained abstinent from drug use 12 months after completing the program, and how do they differ from the 50% who relapsed?" It is not unusual for some evaluation questions to need a combination of both process and impact evaluation methodologies. For example, if it turned out that the results of a particular evaluation showed that the program was not effective (impact), then it might be useful to know why it was not effective (process). In such cases, it would be important to know how the program was implemented, what changes were made in the program during the implementation, what problems were experienced during the implementation, and what was done to overcome those problems.
Another important issue in outcome evaluation has to do with the timing of measuring the outcomes. Outcome effects are usually measured after treatment or postintervention. These effects may be either short term or long term. Immediate outcomes, or those generally measured at the end of the treatment or intervention, might or might not provide the same results as one would get later at a 6- or 12-month follow-up, as highlighted in the example above.
The third important issue in outcome evaluation has to do with what specific measures were used. Is abstinence, for example, the only measure of interest, or is reduction in use something that might be of interest? Refraining from criminal activity or holding a steady job may also be an important goal of a substance abuse program. If we only measure abstinence, we will never know about other kinds of outcomes the program may affect. These last two issues in outcome evaluations have to do with the evaluation methodology and analysis and are addressed in more detail below.
Determining the Methods and Analysis That Need to Be Used for the Program Evaluation
The next step in the evaluation process is to determine the evaluation design. There are several interrelated steps in this process, including determining the (a) sources of data, (b) research design, (c) measures, (d) analysis of change, and (e) cost-benefit assessment of the program.
Sources of Data
Several main sources of data can be used for evaluations, including qualitative information and quantitative information.
Qualitative Data Sources
Qualitative data sources are often used in process evaluations and might include observations, analysis of existing program documents such as policy and procedure manuals, in-depth interview data, or focus group data. There are, however, trade-offs when using qualitative data sources. On the positive side, qualitative evaluation data provide an "in-depth" snapshot of various topics, such as how the program functions, what staff think are the positive or negative aspects of the program, or what clients really think of the overall program experience. Reporting clients' experiences in their own words is a characteristic of qualitative evaluations.
Interviews are good for collecting qualitative or sensitive data such as values and attitudes. This method requires an interview protocol or questionnaire. These usually are structured so that respondents are asked questions in a specific order, but they can be semistructured so that there are fewer topics and the interviewer has the ability to change the order based on a "reading" of the client's responses. Surveys can request information of clients by mail, by telephone, or in person. They may or may not be self-administered. So, besides considering what data are desired, evaluators must be concerned with pragmatic considerations regarding the best way in which to collect the desired data.
Focus groups also offer insight into certain aspects of the program or program functioning; participants add their input, and input is interpreted and discussed by other group members. This discussion component may provide an opportunity to uncover information that might otherwise remain undiscovered, such as the meaning of certain things to different people. Focus groups typically are small informal groups of persons asked a series of questions that start out very general and then become more specific. Focus groups are increasingly being used to provide evaluative information about human services. They work particularly well in identifying the questions that might be important to ask in a survey, in testing planned procedures or the phrasing of items for the specific target population, and in exploring possible reactions to an intervention or a service.
On the other hand, qualitative studies tend to use small samples, and care must be used in analyzing and interpreting the information. Furthermore, although both qualitative and quantitative data are subject to method bias and threats to validity, qualitative data may be more sensitive to bias depending on how participants are selected to be interviewed, the number of observations or focus groups, and even subtleties in the questions asked. With qualitative approaches, the evaluator often has less ability to account for alternative explanations because the data are more limited. Making strong conclusions about representativeness, validity, and reliability is more difficult with qualitative data compared to something like an average rating of satisfaction across respondents (a quantitative measure). Yet, an average rating does not tell us much about why participants are satisfied with the program or why they may be dissatisfied with other aspects of the program. Thus, it is often imperative to use a mixture of qualitative and quantitative information to evaluate a program.
Quantitative Data Sources
Two main types of quantitative data sources can be used for program evaluations: secondary data and original data.
Secondary Data. One option for obtaining needed data is to use existing data. Collecting new data often is more expensive than using existing data. Examining the data on hand and already available is always a good first step. However, the evaluator might want to rearrange or reassemble the data, for example, dividing it by quarters or combining it into 12-month periods that help to reveal patterns and trends over time (see the sketch following the list below). Existing data can come from a variety of places, including the following:
Client records maintained by the program: These may include a host of demographic and service-related data items about the population served.

Program expense and financial data: These can help the evaluator to determine whether one intervention is much more expensive than another.

Agency annual reports: These can be used to identify trends in service delivery and program costs. The evaluator can compare annual reports from year to year and can develop graphs to easily identify trends with clientele and programs.

Databases maintained by the state health department and other state agencies: Public data such as births, deaths, and divorces are available from each state. Furthermore, most state agencies produce annual reports that may reveal the number of clients served by program, geographic region, and, on occasion, selected sociodemographic variables (e.g., race or age).

Local and regional agencies: Planning boards for mental health services, child protection, school boards, and so forth may be able to furnish statistics on outpatient and inpatient services, special school populations, or child abuse cases.

The federal government: The federal government collects and maintains a large amount of data on many different issues and topics. State and national data provide benchmarks for comparing local demographic or social indicators to national-level demographic or social indicators. For instance, if you were working as a cancer educator whose objective is to reduce the incidence of breast cancer, you might want to consult cancercontrolplanet.cancer.gov. That Web site will furnish national-, state-, and county-level data on the number of new cancer cases and deaths. By comparison, it will be possible to determine if the rate in one county is higher than the state or national average. Demographic information about communities can be found at www.census.gov.

Foundations: Certain well-established foundations provide a wealth of information about problems. For example, the Annie E. Casey Foundation provides an incredible Kids Count Data Book that provides an abundance of child welfare-related data at the state, national, and county levels. By using their data, you could determine if infant mortality rates were rising, teen births were increasing, or high school dropouts were decreasing. You can find the Web site at www.aecf.org.
If existing data cannot be used or cannot answer all of the evaluation questions, then original data must be collected.
Original Data Sources. There are many types of evaluation designs from which to choose, and no single one will be ideal for every project. The specific approach chosen for the evaluation will depend on the purpose of the evaluation, the research questions to be explored, the hoped-for or intended results, the quality and volume of data available or needed, and staff, time, and financial resources.
The evaluation design is a critical decision for a number of reasons. Without the appropriate evaluation design, confidence in the results of the evaluation might be lacking. A strong evaluation design minimizes alternative explanations and assists the evaluator in gauging the true effects attributable to the intervention. In other words, the evaluation design directly affects the interpretation that can be made regarding whether an intervention should be viewed as the reason for change in clients' behavior. However, there are trade-offs with each design in the credibility of information, the causality of any observed changes, and resources. These trade-offs must be carefully considered and discussed with the program staff.
Quantitative designs include surveys, pretest-posttest studies, quasi-experiments with nonequivalent control groups, longitudinal designs, and randomized experimental designs. Quantitative approaches transform answers to specific questions into numerical data. Outcome and impact evaluations nearly always are based on quantitative evaluation designs. Also, sampling strategies must be considered as an integral part of the research design. Below is a brief overview of the major types of quantitative evaluation designs. For an expanded discussion of these topics, refer to Royse, Thyer, Padgett, and Logan (2005).
Research Design
Cross-Sectional Surveys
A survey is limited to a description of a sample at one point in time and provides us with a "snapshot" of a group of respondents and what they were like or what knowledge or attitudes they held at a particular point in time. If the survey is to generate good, generalizable data, then the sampling procedures must be carefully planned and implemented. A cross-sectional survey requires rigorous random sampling procedures to ensure that the sample closely represents the population of interest. A repeated survey is similar to a cross-sectional study but collects information at two or more points in time from the same respondents. A repeated (longitudinal) survey is effective at measuring changes in facts, attitudes, or opinions over a course of time.
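As a minimal sketch of the random selection such sampling procedures rest on, the snippet below draws a simple random sample from a client roster; the roster, sample size, and seed are hypothetical.

```python
# Minimal sketch: draw a simple random sample of 100 clients from a roster
# so every client has an equal chance of selection. Roster is hypothetical.
import random

random.seed(42)  # fixed seed so the draw can be documented and reproduced
roster = [f"client_{i:04d}" for i in range(1, 2501)]  # assumed 2,500 clients
sample = random.sample(roster, k=100)

print(sample[:5])  # first few selected client IDs
```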
Pretest-Posttest Designs (Nonexperimental)
Perhaps the most common quantitative evaluation design used in social and human service agencies is the pretest-posttest. In this design, a group of clients with some specific problem or diagnosis (e.g., depression) is administered a pretest prior to the start of intervention. At some point toward the end of or after the intervention, the same instrument is administered to the group a second time (the posttest). The one-group pretest-posttest design can measure change, but the evaluator has no basis for attributing change solely to the program. Confidence about change increases and the design strengthens when control groups are added and when participants are randomly assigned to either a control or experimental condition.
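To make this concrete, change in a one-group pretest-posttest design is often tested with a paired comparison of each client's two scores. The following is a minimal sketch, not a prescription from the chapter; the depression scores are hypothetical (lower is better), and it uses the scipy library's paired t-test.

```python
# Minimal sketch: test pre-to-post change for one group of clients with a
# paired t-test. Scores are hypothetical depression scores (lower = better).
from scipy import stats

pretest  = [24, 31, 18, 27, 22, 29, 35, 20, 26, 30]
posttest = [19, 25, 17, 21, 20, 24, 28, 18, 22, 23]

result = stats.ttest_rel(pretest, posttest)  # pairs each client's two scores
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

# A significant drop indicates change occurred, but without a control group
# the change cannot be attributed solely to the program.
```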
Quasi-Experimental Designs
Also known as nonequivalent control group designs, quasi-experiments generally use comparison groups whereby two similar groups are selected and followed for a period of time. One group typically receives some program or benefit, whereas the other group (the control) does not. Both groups are measured and compared for any differences at the end of some time period. Participants used as controls may be clients who are on a waiting list, those who are enrolled in another treatment program, or those who live in a different city or county. The problem with this design is that the control or comparison group might not, in fact, be equivalent to the group receiving the intervention. Comparing Ocean View School to Inner City School might not be a fair comparison. Even two different schools within the same rural county might be more different than similar in terms of the learning milieu, the proportion of students receiving free lunches, the number of computers and books in the school library, the principal's hiring practices, and the like. With this design, there always is the possibility that whatever the results, they might have been obtained because the intervention group really was different from the control group. However, many of these issues can be addressed, either by collecting the relevant information and controlling for it in the statistical analysis or at least by considering it within the context of interpreting the results. Even so, this type of study does not provide proof of cause and effect, and the evaluator always must consider other factors (both known and measured and unknown or unmeasured) that could have affected the study's outcomes.
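One common way to "control for" such measured differences is to enter them as covariates in a regression model. The following is a hedged sketch using the statsmodels library; the file and variable names are hypothetical, with 'group' coded 1 for the intervention school and 0 for the comparison school.

```python
# Minimal sketch: compare two nonequivalent groups while statistically
# controlling for measured pre-existing differences. Names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Assumed columns: posttest, group (1 = intervention, 0 = comparison),
# pretest, free_lunch_pct (a measured school-level difference).
df = pd.read_csv("two_school_outcomes.csv")

model = smf.ols("posttest ~ group + pretest + free_lunch_pct", data=df).fit()
print(model.summary())

# The coefficient on 'group' estimates the adjusted difference between the
# groups, but unmeasured differences can still bias it; this adjustment is
# not proof of cause and effect.
```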
Longitudinal Designs
Longitudinal designs are a type of quasi-experimental design that involves tracking a particular group of individuals over a substantial period of time to discover potential changes due to the influence of a program. It is not uncommon for evaluators to want to know about the effects of a program after an extended period of time has passed. The question of interest is whether treatment effects last. These studies typically are complicated and expensive in time and resources. In addition, the longer a study runs, the higher the expected rate of attrition from clients who drop out or move away. High rates of attrition can bias the sample.
Randomized Experimental Designs
In a true experimental design, participants are randomly assigned to either the control or treatment group. This design provides a persuasive argument about the causal effects of a program on participants. The random assignment of respondents to treatment and control groups helps to ensure that both groups are equivalent across key variables such as age, race, area of residency, and treatment history. This design provides the best evidence that any observed differences between the two groups after the intervention can be attributed to the intervention, assuming the two groups were equal before the intervention. Even with random assignment, preintervention group differences could exist, and the evaluator should carefully look for them and use statistical controls when necessary.
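Mechanically, random assignment is straightforward; the sketch below, with a hypothetical participant list, shuffles consenting participants and splits them evenly between conditions.

```python
# Minimal sketch: randomly assign consenting participants to treatment or
# control. The participant list and group size are hypothetical.
import random

random.seed(7)  # fixed seed so the assignment can be audited and reproduced
participants = [f"participant_{i:03d}" for i in range(1, 41)]

random.shuffle(participants)        # put participants in a random order
half = len(participants) // 2
treatment = participants[:half]     # first half receives the intervention
control = participants[half:]       # second half serves as the control group

print(len(treatment), len(control))
```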
One word of warning about random assignment is that key program stakeholders often view random assignment as unethical, especially if they view the treatment program as beneficial. One outcome of this difficulty in accepting random assignment is that staff might have problems not giving the intervention they believe is effective to specific needy clients, or to all of their clients instead of just to those who were randomly assigned. If they do succumb to this temptation, then the evaluation effort can be unintentionally sabotaged. The evaluator must train and prepare all of those individuals involved in the evaluation to help them understand the purpose and importance of the random assignment. That, more than any other procedure, provides the evidence that the treatment really does benefit the clients.
Sampling Strategies and Considerations
When the client population of interest is too large to obtain information from each individual member, a sample is drawn. Sampling allows the evaluator to make predictions about a population based on study findings from a set of cases. Sampling strategies can be very complex. If the evaluator needs the type of precision afforded by a probability sample in which there is a known level of confidence and margin of error (e.g., 95% confidence, plus or minus 3 percentage points), then he or she might need to hire a sampling consultant. A consultant is particularly recommended when the decisions about the program or intervention are critical, such as in drug research or when treatments could have potentially harmful side effects. However, there is a need to recognize the trade-offs that are made when determining sampling strategy and sample size. Large samples can be more accurate than smaller ones, yet they usually are much more expensive. Small samples can be acceptable if a big change or effect is expected. As a rule, the more critical the decision, the larger (and more precise) the sample should be.
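The precision figures mentioned above map onto a standard sample size formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. A worked sketch, assuming the most conservative case of p = 0.5:

```python
# Minimal sketch: sample size needed to estimate a proportion within a given
# margin of error, using the standard formula n = z^2 * p * (1 - p) / e^2.
import math

z = 1.96   # z-score for 95% confidence
p = 0.5    # assumed proportion; 0.5 yields the largest (most cautious) n
e = 0.03   # desired margin of error: plus or minus 3 percentage points

n = math.ceil((z**2 * p * (1 - p)) / e**2)
print(n)   # about 1,068 respondents, before any adjustment for nonresponse
```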
There are two main categories of sampling strategies from which the evaluator can choose: probability sampling and nonprobability sampling. Probability sampling imposes statistical rules to ensure that unbiased samples are drawn. These samples normally are used for impact studies. Nonprobability or convenience sampling is less complicated to implement and is less expensive. This type of sampling often is used in process evaluations. With probability sampling, the primary idea is that every individual, object, or institution in the population under study has a chance of being selected into the sample, and the likelihood of the selection of any individual is known. Probability sampling provides a firm basis for generalizing from the sample to the population. Nonprobability samples severely reduce the evaluator's ability to generalize the results of the study to the larger population.
The evaluator must balance the need for scientific rigor against convenience and often limited resources when determining sample size. If a major decision is being based on the data collected, then precision and certainty are critical. Statistical precision increases as the sample size increases. When differences in the results are expected to be small, a larger sample guards against confounding variables that might distort the results of a treatment.
Measures
The next important method decision is to determine how best to measure the variables of interest needed to answer the evaluation questions. These will vary from evaluation to evaluation, depending on the questions being asked. In one project, the focus might be on the outcome variable of arrests (or rearrests) so as to determine whether the program reduced criminal justice involvement. In another project, the outcome variable might be number of hospitalizations or days of hospitalization.
Once there is agreement on the outcome variables, objective measures for those variables must be determined. Using the example of the drug court program above, the decisions might include the following: How will abstinence be measured? How will reduction in substance use be measured? How will criminal behavior be measured? How will employment be measured? This may seem simple at first glance, but there are two complicating factors. First, there are a variety of ways to measure something as simple as abstinence. One could measure it by self-report or by actually giving the client a drug test. When looking at reduction of use, the issue of measurement becomes a bit more complicated. This will likely need to be self-report and some kind of comparison (either the same measures must be used with the same clients before and after the program [this being the best way] or the same measure must be used with a control group of some kind [like program dropouts]).
The second complicating factor in measurement is determining what other constructs need to be included to better understand "who benefits from the program the most and under what circumstances" and how those constructs are measured. Again, using the drug court program as an example, perhaps those clients who are most depressed, have the most health problems, or have the most anxiety do worse in drug court programs because the program may not address co-occurring disorders. If this is the case, then it will be important to include measures of depression, anxiety, and health. However, there are many different measures for each of these constructs, and different measures use different timeframes as points of reference. In other words, some depression measures ask about 12-month periods, some ask about 2-week periods, and some ask about 30-day periods. Is one instrument or scale better than another for measuring depression? What are the trade-offs relative to shorter or longer instruments? (For example, the most valid instrument might be so long that clients will get fatigued and refuse to complete it.) Is it better to measure a reduction in symptoms associated with a standardized test or to employ a behavioral measure (e.g., counting the number of days that patients with chronic mental illness are compliant with taking their medications)? Is measuring attitudes about drug abuse better than measuring knowledge about the symptoms of drug addiction? Evaluators frequently have to struggle with decisions such as these and decide whether it is better to use instruments that are not "perfect" or to go to the trouble of developing and validating new ones.
When no suitable instrument or available data exist for the evaluation, the evaluator might have to create a new scale or at least modify an existing one. If an evaluator revises a previously developed measure, then he or she has the burden of demonstrating that the newly adapted instrument is reliable and valid. Then, there are issues such as the reliability of data obtained from clients. Will clients be honest in reporting actual drug and alcohol use? How accurate are their memories?
A note must be made here about a special case of program evaluation: evaluating prevention programs. Evaluation of prevention programs is especially challenging because the typical goal of a prevention program is to prevent a particular problem or behavior from developing. The question then becomes, "How do you measure something that never occurs?" In other words, if the prevention program is successful, the problem will not develop, but it is difficult to determine with any certainty that the problem would have developed in the first place absent the prevention program. Thus, measures become very important, as does the design (such as including a control group).
Evaluators use a multitude of methods and instruments to collect data for their studies. A good strategy is to include multiple measures and methods if possible, especially when random assignment is not possible. That way, one can look for convergence of conclusions across methods and measures.
Analysis of Change
After the data are collected, the evaluator is faced with the sometimes difficult question of how to determine whether change has occurred. And, of course, there are several considerations within this overall decision as well. One of the first issues to be decided is what the unit of analysis will be.

The unit of analysis refers to the persons or things being studied or measured in the evaluation of a program. Typically, the basic unit of analysis consists of individual clients but also may be groups, agencies, communities, schools, or even states. For example, an evaluator might examine the effectiveness of a drug prevention program by looking for a decrease in drug-related suspensions or disciplinary actions in high schools in which the program was implemented. In that instance, schools are the primary unit of analysis. Another evaluator might be concerned only with the attitudes toward drugs and alcohol of students in one middle school; in that situation, individuals would be the unit of analysis. The smallest unit of analysis from which data are gathered often is referred to as a case. The unit of analysis is critical for determining both the sampling strategy and the data analysis.
The analysis will also be determined by the research design, such as the number of groups to be analyzed, the type of dependent variable (categorical vs. continuous), the control variables that need to be included, and whether the design is longitudinal. The literature on similar program evaluations is also useful to examine so that analysis plans can consider what has been done in the past. The analysis phase of the evaluation is basically the end product of the evaluation activities. Therefore, a careful analysis is critical to the evaluation, the interpretation of the results, and the credibility of the results. Analysis should be conducted by somebody with adequate experience in statistical methods, and the statistical assumptions and limitations of the study should be carefully examined and explained to program stakeholders.
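As a concrete illustration of how the type of dependent variable drives the choice of analysis, the sketch below (hypothetical data, not from the chapter) applies an independent-groups t-test to a continuous outcome and a chi-square test to a categorical one.

```python
# Minimal sketch: the type of dependent variable drives the choice of test.
# All data below are hypothetical.
from scipy import stats

# Continuous outcome (e.g., symptom scores): compare two groups with a t-test.
program_scores = [14, 11, 9, 13, 10, 12, 8, 11]
comparison_scores = [16, 15, 13, 17, 14, 18, 15, 16]
t_res = stats.ttest_ind(program_scores, comparison_scores)
print(f"t = {t_res.statistic:.2f}, p = {t_res.pvalue:.4f}")

# Categorical outcome (e.g., rearrested yes/no): compare counts with chi-square.
#                rearrested  not rearrested
table = [[12, 38],   # program group
         [25, 25]]   # comparison group
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```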
Cost-Benefit Analysis
While assessing program outcomes is obviously necessary to gauge the effectiveness of a program, a more comprehensive understanding of program "success" can be attained by examining program costs and economic benefits. In general, economic costs and benefits associated with specific programs have received relatively limited attention. One of the major challenges in estimating the costs of some community-based social programs is that standard cost estimation procedures do not always reflect the true costs of the program. For example, a drug court program often combines both criminal justice supervision and substance abuse treatment in a community-based environment. And in order for drug court programs to work effectively, they often use many community and outside agency resources that are not necessarily directly paid for by the program. For example, although the drug court program may not directly pay for the jail time incurred as part of client sanctions, jail time is a central component in many drug court programs. Thus, jail costs must be considered a drug court program cost.
A comprehensive economic cost analysis would include estimates of the value of all resources used in providing the program. When resources are donated or subsidized, the out-of-pocket cost will differ from the opportunity cost of the resources for a given program. Opportunity costs take into account the forgone value of an alternative use for program resources. Other examples of opportunity costs for the drug court program may include the time and efforts of judges, police officers, probation officers, and prosecutors.
Including costs for which the program may not explicitly pay presents an interesting dilemma. The dilemma primarily stems from the trade-off between presenting only out-of-pocket expenditures for a program (thus the program will have a lower total cost) and accurately reflecting all of the costs associated with the program regardless of whether those costs are paid out of pocket (implying a higher total program cost). Furthermore, when agencies share resources (e.g., shared overhead costs), the correct proportion of these resources that is devoted specifically to a program must be properly specified. To date, there has been little discussion in the literature about estimating the opportunity cost of programs beyond the out-of-pocket costs. Knowing which costs to include and what value to place on certain services or items that are not directly charged to the program can be complicated.
A comprehensive analysis of economic benefits also presents challenges. The goal of an economic benefit analysis is to determine the monetary value of changes in a range of program outcomes, mainly derived from changes in client behavior as a result of participating in the program. When estimating the benefits of a program such as drug court, one of the most obvious and important outcomes is the reduction in criminal justice costs (e.g., reduced incarceration and supervision), and these are traditionally the only sources of benefits examined in many drug court evaluations. However, drug court programs often have a diverse set of goals in addition to reducing criminal justice costs. For example, drug court programs often focus on helping the participants become more productive in society. This includes helping participants take responsibility for their financial obligations, such as child support. In addition, employment is often an important program goal for drug court clients. If the client is working, he or she is paying taxes and is less likely to use social welfare programs. Thus, the drug court program potentially reduces several different categories of costs that might have accrued had program participants not received treatment. These "avoided" costs, or benefits, are important components of a full economic evaluation of drug court programs.
So, although the direct cost of the program usually is easily computed, the full costs and the benefits are more difficult to convert into dollars. For example, Logan et al. (2004) found that the average direct cost per drug court treatment episode for a graduate was $3,319, whereas the opportunity cost per episode was $5,132. These differences in costs due to agency collaboration highlight the importance of clearly defining the perspective of the cost analysis. As discussed earlier, the trade-off between presenting only out-of-pocket expenditures for a program and accurately reflecting all of the costs associated with the program regardless of whether those costs are paid out of pocket is an important distinction that should be considered at the outset of every economic evaluation. On the benefit side of the program, results suggest that the net economic benefit was $14,526 for each graduate of the program. In other words, this translates to a return of $3.83 in economic benefit for every dollar invested in the drug court programs for graduates. Obviously, those who dropped out of the program before completing did not generate as large of a return. However, results suggest that when both graduates and terminators were examined together, the net economic benefit of any drug court experience amounted to $5,446 per participant. This translates to a return of $2.71 in economic benefit for every dollar invested in the drug court programs.
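The arithmetic behind these figures is simple benefit-cost division. The sketch below reproduces it from the chapter's numbers for graduates; the total benefit is back-calculated from the reported net benefit and opportunity cost, which assumes the ratio was computed on the opportunity-cost basis, so treat it as illustrative.

```python
# Worked sketch of the cost-benefit arithmetic for drug court graduates,
# using the figures reported in the chapter. The total benefit is
# back-calculated (net benefit + opportunity cost), an assumption here.
opportunity_cost = 5132   # opportunity cost per treatment episode ($)
net_benefit = 14526       # reported net economic benefit per graduate ($)

total_benefit = net_benefit + opportunity_cost      # about $19,658
return_per_dollar = total_benefit / opportunity_cost

print(f"${return_per_dollar:.2f} returned per dollar invested")  # ~ $3.83
```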
When looking at the cost-benefit analysis of programs or comparing these costs and benefits across programs, it is important to keep in mind that cost-benefit analyses may be done very differently, and a careful assessment of the methods must be undertaken to ensure comparability across programs.