Evaluating the Outcomes of Domestic Violence Service Providers: Some Practical Considerations and Strategies

Cris Sullivan and Carole Alexy

A great deal of external pressure is being brought to bear on domestic violence service programs to demonstrate their effectiveness. More and more funders are demanding to see outcome evaluations as a requirement for receiving continued funding. At the same time, staff and volunteers are interested in knowing more about the success of their efforts. In response, the Pennsylvania Coalition Against Domestic Violence enlisted the services of domestic violence researcher and advocate Cris Sullivan, from Michigan State University, to help Pennsylvania programs design appropriate outcome evaluations. The result of a year-long collaboration was a guidebook entitled Outcome Evaluation Strategies for Domestic Violence Service Programs: A Practical Guide. The manual was designed not to discuss evaluation in the abstract but to provide practical assistance and examples in designing and carrying out effective evaluation strategies specifically for domestic violence service providers. This VAWnet document presents the highlights of that guidebook.

Why Evaluate Our Work?

Although the thought of "evaluation" can be daunting, there are some good reasons to evaluate the job we are doing. The most important reason, of course, is to understand the impact of our work on women's lives. We want to build upon those efforts that are helpful to women with abusive partners; at the same time, we don't want to continue putting time and resources into efforts that are not helpful or that have unintended negative consequences for women. Evaluation is also important because it provides us with "hard evidence" to present to funders, encouraging them to continue and increase their financial support of our work.

Most of us would agree that these are good reasons to examine the kind of job we're doing. Still, we are often hesitant to evaluate our programs, for a number of reasons. In the past, research has been used against women with abusive partners, data have been misinterpreted, and the staff of domestic violence programs have seen such data used to critique program impact in inappropriate ways. And if that weren't enough, staff often feel they lack the time, money, and/or expertise to evaluate their programs. Finally, unlike some service programs with obvious and tangible outcomes -- such as those designed to prevent teenage pregnancy or to teach parenting skills -- domestic violence service programs provide multiple services with difficult-to-measure outcomes. In some cases, services are extremely short-term (such as providing information over the phone) or are provided to anonymous individuals (as is often the case with crisis calls). It is also difficult to evaluate programs designed to prevent a negative event from occurring (in this case, battering), because the survivor is not responsible for preventing, and is indeed often unable to prevent, this event regardless of her actions.

Designing Our Evaluations

Effective evaluation begins with defining our overarching goals (sometimes also referred to as objectives). Goals or objectives (and we're using these terms interchangeably in this paper; not everyone does) are what we ultimately hope to accomplish through the work we do. Program goals, usually described in our mission statements, are long-term aims that are difficult to measure. We would say that the overall goal or objective of domestic violence victim service programs is to enhance safety and justice for battered women and their children.

While some of us might articulate our overall objectives differently, it is important that we each choose goals and objectives that make sense for our programs. Survivors should be involved in this process, as well as in all other stages of the evaluation. After the program's overall objective has been established, it is important to consider what measurable changes we would expect to see as a result of our program, changes that would tell us we are meeting our objective(s). These are program outcomes. The critical distinction between goals and outcomes is that outcomes are statements reflecting measurable change due to our programs' efforts: What occurred as a result of the program? In addition to being measurable, outcomes must be realistic and philosophically tied to program activities.

Depending on the individual program, program outcomes might include, for example, women receiving the information and support they were seeking, increased knowledge of safety options, or increased access to community resources.

There are two types of outcomes we can evaluate: short-term outcomes (such as those discussed above) and long-term outcomes. Long-term outcomes involve measuring what we would expect to ultimately occur, such as reduced violence against women and increased well-being for women and their children.

Measuring long-term outcomes is labor-intensive, time-consuming, and costly. Research dollars are generally needed to adequately examine these types of outcomes.

More realistically, service programs measure short-term outcomes, which measure proximal change. Proximal changes are those more immediate or incremental outcomes one would expect to see that will eventually lead to the desired long-term outcomes. For example, a hospital-based medical advocacy project for battered women might be expected to result in more women being correctly identified by the hospital, more women receiving support and information about their options, and increased sensitivity being displayed by hospital personnel in contact with abused women. These changes might then be expected to result in more women accessing whatever community resources they might need to maximize their safety (e.g., shelter, personal protection orders), which ultimately -- long-term -- would be expected to lead to reduced violence and increased well-being. Without research dollars it is unlikely that staff would have the resources to measure the long-term changes resulting from this project. Rather, programs should measure the short-term outcomes they expect the program to achieve. In this example, that might include: (1) the number of women correctly identified in the hospital as survivors of domestic abuse; (2) survivors' perceptions of the effectiveness of the intervention in meeting their needs (including receiving information and support they perceived to be helpful); and (3) hospital personnel's attitudes toward survivors of domestic violence.

Measuring proximal or short-term outcomes requires obtaining answers to questions such as: Did the woman receive the assistance she was looking for? Did she get the information and support she needed? It should be noted that "satisfaction with services" is typically considered to be part of process evaluation (i.e., how we do our work) as opposed to outcome evaluation. However, most if not all domestic violence service programs strive to provide services unique to each woman's situation and view each woman's "satisfaction with the degree to which the program met her needs" as a desired short-term outcome.

Choosing Appropriate Outcomes

Outcome evaluation must be designed to answer the question of whether or not women attained outcomes they identified as important to them. So for example, before asking women if they obtained an Order for Protection, we must first ask if they wanted an Order for Protection. Before asking if our support group decreased a woman's isolation, we would want to know if she felt isolated before attending our group. Not all women seek our services for the same reasons, and our services must be flexible to meet those diverse needs. Outcome evaluation can inform us about the different needs and experiences of women and children, and this information can be used to inform our programs as well as community efforts.
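To make the arithmetic behind this concrete: a program that tracks which outcomes each woman wanted can compute an outcome rate using only those women as the denominator. The short Python sketch below is purely illustrative; the records and field names are hypothetical, not from the guidebook.

    # A minimal sketch of computing an outcome rate only among the women
    # who identified that outcome as important to them. All field names
    # and records here are hypothetical, for illustration only.
    records = [
        {"wanted_order": True,  "obtained_order": True},
        {"wanted_order": True,  "obtained_order": False},
        {"wanted_order": False, "obtained_order": False},
    ]

    # Denominator: only women who said they wanted an Order for Protection.
    wanted = [r for r in records if r["wanted_order"]]
    obtained = [r for r in wanted if r["obtained_order"]]

    if wanted:  # guard against dividing by zero if no one wanted this outcome
        rate = 100 * len(obtained) / len(wanted)
        print(f"{rate:.0f}% of women who wanted an Order for Protection obtained one")

Note that the woman who did not want an Order for Protection is excluded entirely; she does not count against the program's success rate.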

"Problematic" Outcome Statements to Avoid

A common mistake made by many people designing project outcomes is developing statements that are either (1) not linked to the overall program's objectives or (2) unrealistic given what the program can reasonably accomplish. Three common problematic outcome statements follow, with explanations of why they should be avoided:

Problematic Outcome Statement #1: "75% of the women who use this service will leave their abusive partners."

The expectation that battered women should leave their abusive partners is problematic for a number of reasons, including that it wrongly assumes that leaving the relationship always ends the violence, and that it ignores and disrespects the woman's agency in making her own decision. This type of "outcome" should either be avoided altogether or modified to read, "xx% of the women using this service who want to leave their abusive partners will be effective in doing so."

Problematic Outcome Statement #2: "The women who use this program will remain free of abuse."

Victim-based direct service programs can provide support, information, assistance, and/or immediate safety for women, but they are generally not designed to decrease the perpetrator's abuse. Suggesting that victim-focused programs can decrease abuse implies the survivor is at least somewhat responsible for the violence perpetrated against her.

Problematic Outcome Statement #3: "The women who work with legal advocates will be more likely to press charges."

Survivors do not press charges; prosecutors press charges. It should also not be assumed that participating in pressing charges is always in the woman's best interest. Legal advocates should provide women with comprehensive information to help women make the best-informed decisions for themselves.

Deciding When to Evaluate Effectiveness

Timing is an important consideration when planning an evaluation. Especially if the evaluation involves interviewing women who are using or who have used our services, the time point at which we gather the information could distort our findings. If we want to evaluate whether women find our support group helpful, for example, would we ask them after their first meeting? Their third? After two months? There is no set answer to this question, but it's important to remember we are gathering different information depending on the timing, and we have to be specific about this when discussing findings. For example, if we decided to interview only women who had attended weekly support group meetings for two months or more, we would want to specify that this is our "sample" of respondents.

Consideration for the feelings of our clientele must also be part of the decision-making process. Programs that serve women who are in crisis, for example, would want to minimize the number and types of questions they ask. This is one reason programs find it difficult to imagine how they might evaluate their 24-hour crisis line. However, some evaluation questions can be asked about 24-hour crisis line programs, provided they are asked only when appropriate and in a conversational way. Examples of such questions might include: "Was this helpful for you?", "Did you get what you needed?", and "Is there any other information you need?"

We also need to consider programmatic realities when deciding when and for how long we will gather outcome data. Do we want to interview everyone who uses our service? Everyone across a 3 month period? Every fifth person? Again, each agency has to individually answer this question after taking into account staffing issues as well as ability to handle the data collected.
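For programs that keep a simple client list, the "every fifth person" option amounts to systematic sampling. The Python sketch below is a hypothetical illustration (the client list and the interval of five are invented stand-ins), not a recommendation of any particular approach:

    import random

    # A minimal sketch of systematic sampling: interview every fifth person
    # who uses the service. The client list below is a hypothetical stand-in
    # for a program's real records.
    clients = [f"client_{i}" for i in range(1, 101)]

    k = 5                        # sampling interval: every fifth client
    start = random.randrange(k)  # random starting point reduces selection bias
    sample = clients[start::k]   # take every k-th record from that start

    print(f"Selected {len(sample)} of {len(clients)} clients for interviews")

Starting from a randomly chosen position, rather than always the first name on the list, helps keep the sample from systematically favoring whoever happens to appear first.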

The Importance of Confidentiality and Safety of Survivors

The safety of the women with whom we work must always be our top priority. The need to collect information to help us evaluate our programs must always be considered in conjunction with the confidentiality and safety of the women and children receiving our services. It is not ethical to gather information just for the sake of gathering information; if we are going to ask women very personal questions about their lives, there should always be an important reason to do so, and their safety should not be compromised by their participation in our evaluation. The safety and confidentiality of women must be kept in mind at all times.

First and foremost, participating in the evaluation of our services must be completely voluntary. Participation in evaluation should never be a condition of receiving services. Second, women should always be told why we are asking the questions we're asking. And whenever possible, an advisory group of women who have used our services should help guide the development of evaluation questions. Finally, it is extremely important to ask ourselves: Who else might be interested in obtaining this information? Assailants' defense attorneys? Child Protective Services? Women should always know what might happen to the information they provide. If we have procedures to protect this information from others, women should know that. If we might share this information with others, women need to know that as well. Respect and honesty are key.

Collecting the Information (Data)

There are pros and cons to every method of data collection. Every program must ultimately decide for itself how to collect evaluation information, based on a number of factors, including staffing, cost, the safety and comfort of the women being asked, and the program's ability to handle the data collected.

Often when we are trying to evaluate what kind of impact our program is having, we are interested in answering fairly straightforward questions: Did the survivor receive the assistance she was looking for, and did the desired short-term outcome occur? We are generally interested in whether something occurred, or the degree to which it occurred, and we can generally use closed-ended questions to obtain this information. A closed-ended question is one that offers a set number of responses, for example: "Did you get the information you needed?" (yes/no). The answers to these types of questions are in the form of quantitative data. Quantitative data are those that can be directly or easily expressed in terms of numbers (i.e., quantified). There are many advantages to gathering quantitative information: it is generally quicker and easier to obtain, and easier to analyze and interpret, than qualitative data.

Qualitative data generally come from open-ended questions that do not have pre-determined response options, such as: "What was most helpful to you about this program?" While we often get richer, more detailed information from open-ended questions, it is more time-consuming and complicated to synthesize this information and to use it for program development. Some programs use both quantitative and qualitative data to evaluate their programs, relying on percentages and averages to synthesize the information, but also including quotes or case examples to provide the human voice behind the numbers.
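For programs that tabulate results in a spreadsheet or a short script, this distinction can be made concrete. The Python sketch below uses hypothetical question wording and responses (nothing here comes from the guidebook); it shows closed-ended answers reduced to a percentage while open-ended answers are kept verbatim as quotes:

    from collections import Counter

    # Hypothetical survey responses: one record per woman. The question
    # wording and the answers are invented for illustration only.
    responses = [
        {"got_needed_info": "yes", "most_helpful": "The advocate listened to me."},
        {"got_needed_info": "yes", "most_helpful": "Learning about protection orders."},
        {"got_needed_info": "no",  "most_helpful": "I still need help with housing."},
    ]

    # Quantitative: the closed-ended question reduces to counts and a percentage.
    counts = Counter(r["got_needed_info"] for r in responses)
    pct_yes = 100 * counts["yes"] / len(responses)
    print(f"{pct_yes:.0f}% said they got the information they needed")

    # Qualitative: open-ended answers stay as verbatim quotes, the human
    # voice behind the numbers.
    for r in responses:
        print("Quote:", r["most_helpful"])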

Information Gathering Methods

There are pros and cons to any strategy we choose for gathering outcome information. This section discusses the benefits and drawbacks of the most common methods: face-to-face interviews, telephone interviews, written surveys, focus groups, and staff records.

Face-to-face interviews
This is certainly one of the more common approaches to gathering information from women, and for good reason. It has a number of advantages, including the ability to: (1) fully explain the purpose of the questions to the respondents; (2) clarify anything that might be unclear in the interview; (3) gain additional information that might not have been covered in the interview but that arises during spontaneous conversation; and (4) maintain some control over when and how the interview is completed.

There are disadvantages to this approach as well, however, including: (1) lack of privacy for the respondent; (2) the potential for women responding more positively than they might actually feel because it can be difficult to complain to a person's face; (3) language barriers; (4) race, age, class and other differences between the interviewer and interviewee that might affect how women respond to questions; (5) the time it can take to complete interviews with talkative women; and (6) interviewer bias.

Although the first five disadvantages are self-explanatory, interviewer bias requires a brief explanation: It is likely that more than one staff member would be conducting these interviews over time, and responses might differ depending on who is actually asking the questions. One staff member might be well-liked and could encourage women to discuss their answers in detail, while another staff member might resent even having to gather the information, and her or his impatience could come through to the respondent and impact the interview process. Interviewers, intentionally or unintentionally, can affect the quality and quantity of the information being obtained.

Telephone interviews
Telephone interviews are sometimes the method of choice when staff want to interview a woman after services have already been received. After a woman has left the shelter, stopped coming to support groups, left counseling, ended her involvement with the legal advocacy program, or the like, we might still want to talk with her about her perceptions. Advantages to this approach include: (1) such interviews can be squeezed in during "down" times for staff; (2) women might feel cared about because staff took time to call, and this might enhance the likelihood of their willingness to answer some questions; (3) important information that would have otherwise been lost can be obtained; and (4) we might end up being helpful to the women we call.

The most serious disadvantage of this approach involves the possibility of putting women in danger by calling them when we don't know their current situation. Never call a woman unless you have discussed this possibility ahead of time and worked out certain codes through which she can tell you if it's unsafe to talk. It is never worth jeopardizing a woman's safety to gather evaluation information.

Another drawback of the telephone interview approach is that we will only talk with a select group of women (e.g., those with phones, those who have provided us with current contact information, those who haven't moved), who may not be representative of our clientele.

Written surveys
Asking clients to complete brief written surveys is probably the most common strategy used by nonprofit organizations to evaluate their programs. The greatest advantages of this method of data collection include: (1) surveys are easily administered (generally women can fill them out and return them at their convenience); (2) they tend to be more confidential (women can fill them out privately and return them to a locked box); and (3) they may be less threatening or embarrassing for the woman if very personal questions are involved.

Disadvantages include: (1) written questionnaires require respondents to be functionally literate; (2) questionnaires need to be written in all languages spoken by women using the program; (3) if a woman misunderstands a question or interprets it differently than staff intended, we can't catch this problem as it occurs; and (4) the method may seem less personal, so women may not feel it is important to answer the questions accurately and thoughtfully, if at all.

Focus groups
The focus group has gained popularity in recent years as an effective data collection method. Focus groups allow for informal and (hopefully) frank discussion among individuals who share something in common. For example, we may want to facilitate a focus group of women who recently used our services as a way of learning what is working well about our service and what needs to be improved. We might also want to facilitate a focus group of "underserved" women in the area -- perhaps women over 60, lesbians, women who live in a rural area, or Latinas -- depending on the specific geographic area, the specific services, and who in the area appears to be underserved or poorly served by traditional services.

Focus groups generally are comprised of no more than 8-10 people, last no more than 2-3 hours, and are guided by some open-ended but "focused" questions. Again, an open-ended question is one that requires more than a "yes" or "no" answer, and this is important to consider when constructing questions.

It is important to consider a number of issues before conducting a focus group: Will transportation be provided to and from the group? Childcare? Refreshments? A comfortable, nonthreatening atmosphere? How will confidentiality be ensured? Who do we want as group members, and why? In what language will the group be conducted, and what are the implications of this? Do we have a facilitator who can guide without "leading" the group? Will we tape record the group? If not, who will take notes and how will these notes be used?

Staff records and opinions
While obtaining information from staff is one of the easiest ways to gather data for evaluation purposes, it has a number of drawbacks. The greatest drawback, of course, is that the public (and probably even the program) may question the accuracy of the information obtained if it pertains to client satisfaction or program effectiveness. The staff of a program could certainly be viewed as being motivated to "prove" their program's effectiveness. It is also only human nature to want to view one's work as important; we would not be doing this if we did not think we were making a difference. It is best to use staff records in addition to, but not instead of, data from less biased sources.

Using Our Findings to Improve Our Program

After collecting the information needed to evaluate our programs' effectiveness, we obviously want to put our findings to good use. Setting aside specific times to review outcome information with staff and survivors sends a message that these outcomes are important, and gives everyone an opportunity to discuss, as a group, what is working and what needs improvement. As improvements are made in response to the data gathered, we can broadcast these changes through posters on walls, announcements, and word-of-mouth. As staff, volunteers, funders, and service recipients see that our agencies are responsive to feedback, they will feel more respected by our organizations.

Authors of this document:
Cris M. Sullivan, Ph.D.
Michigan State University

Carole Alexy
Pennsylvania Coalition Against Domestic Violence

Distribution Rights: This Applied Research paper and In Brief may be reprinted in its entirety or excerpted with proper acknowledgement to the author(s) and VAWnet (www.vawnet.org), but may not be altered or sold for profit.

Suggested Citation: Sullivan, C. and Alexy, C. (2001, February). Evaluating the Outcomes of Domestic Violence Service Programs: Some Practical Considerations and Strategies. Harrisburg, PA: VAWnet, a project of the National Resource Center on Domestic Violence/Pennsylvania Coalition Against Domestic Violence. Retrieved month/day/year, from: http://www.vawnet.org

To order a copy of Cris Sullivan's Outcome Evaluation Strategies for Domestic Violence Service Programs: A Practical Guide, contact:

PCADV
6400 Flank Dr. Suite 1300
Harrisburg, PA 17112-2778
1-800-932-4632
Attn: Cindy Leedom
Cost: $25 for Domestic Violence Service Programs; $30 others (postage & handling included)



Additional Resources to Assist with Outcome Evaluation

Burt, M. R., Harrell, A. V., Newmark, L. C., Aron, L. Y., Jacobs, L. K., and others. (1997). Evaluation guidebook: For projects funded by S.T.O.P. formula grants under the Violence Against Women Act. Washington, DC: The Urban Institute. This guidebook can be downloaded from www.urban.org/authors/burt.html

Domestic Abuse Project. (1997). Evaluating domestic violence programs. Minneapolis, MN: Domestic Abuse Project. Address and phone: 204 W. Franklin, Minneapolis, MN 55404; 612-874-7063 ext. 222. $39.95 + $4.00 shipping and handling.

Krueger, R. A. (1988). Focus groups: A practical guide for applied research. Newbury Park, CA: Sage.

United Way of America. (1996). Measuring program outcomes: A practical approach. United Way of America. Address and phone: 701 N. Fairfax St., Alexandria, VA 22314; 703-836-7100.


* The production and dissemination of this publication was supported by Cooperative Agreement Number U1V/CCU324010-02 from the Centers for Disease Control and Prevention. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the CDC, VAWnet, or the Pennsylvania Coalition Against Domestic Violence.
