Saturday, June 1, 2013

Evaluation theory vs practice


What Guides Evaluation? A study of how evaluation practice maps onto evaluation theory.

Christina Christie, 2003

This study came in response to repeated calls from theorists for more empirical knowledge of evaluation, which would in turn help explain the nature of evaluation practice. A study with similar aims was carried out in 1987 by Shadish and Epstein; however, their study used a survey instrument designed by the researchers themselves. The Christie study recognizes that ‘theoretical orientation often cannot be accurately assessed through direct questioning because evaluation practitioners usually are not proficient in theory (Smith 1993), and so are unable to identify their particular theoretical orientation’ (p. 9). A case study approach to observing the behaviour of evaluation practitioners is the usual alternative; it offers depth but cannot capture the breadth of how evaluators use theory to guide their work. This study is unique in that it uses eight distinguished evaluation theorists with a broad array of perspectives to construct the survey instrument: Boruch, Chen, Cousins, Eisner, Fetterman, House, Patton and Stufflebeam.

The conceptual framework is built on the work of Alkin and House (1992), using their taxonomy of three dimensions: methods, values and use. Each dimension is further defined by a continuum. For methods, the continuum runs from quantitative to qualitative; for values, from unitary to pluralist (the criteria used when making evaluative judgments); and for use, from enlightenment (academic) to instrumental (service-oriented). Each theorist was asked to submit one statement ‘demonstrating the practical application associated with his theoretical standpoint related to each of the three dimensions’ (p. 11). They were also invited to submit up to six additional statements; in total, the final instrument contained 38 items related to evaluation practice and was piloted with five practicing evaluators.

The participants in this study came from two groups: the eight theorists, who were asked to complete the survey instrument, and 138 evaluators working on Healthy Start, a statewide Californian educational program. Many in the second group were not evaluators by profession but school and program administrators, and so they represent a cross-section of how evaluations are conducted today. For reporting of results, this group was further subdivided into internal and external evaluators. The collection of demographic data produced some interesting findings: the majority of evaluators were female, white and over 40. In terms of education, 75% of the external evaluators were PhD-qualified, which aligns with their years of experience and their self-rating of their evaluation knowledge and skills.

And a very interesting finding: only a small proportion of evaluators in this study’s sample were using an explicit theoretical framework to guide their practice. More on this later.

The analytic procedure used multidimensional scaling (MDS), whereby observed similarities (or dissimilarities) are represented spatially as a geometric configuration of points on a map. More specifically, this study used classical multidimensional unfolding (CMDU), an individual-differences analysis that portrays differences in preference, perception, thinking or behaviour and can be used when studying differences in subjects in relationship to one another or to stimuli (p. 14). To interpret the CMDU results, ALSCAL (alternating least squares scaling) was used to produce two dimensions: scope of stakeholder involvement and method proclivity.
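To make the MDS step more concrete, here is a minimal sketch of projecting a dissimilarity matrix of survey responses into two dimensions. It uses scikit-learn’s SMACOF-based metric MDS as a stand-in for the ALSCAL/CMDU procedure the study actually used, and the response data are randomly generated placeholders, not Christie’s data.

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical 0/1 survey responses: rows = the 8 theorists,
# columns = the 38 practice items (placeholder values only).
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(8, 38))

# Dissimilarity between respondents: the proportion of items on
# which two respondents disagree (simple matching distance).
diss = (responses[:, None, :] != responses[None, :, :]).mean(axis=2)

# Metric MDS projects the dissimilarities into two dimensions,
# analogous to the two-dimensional map reported in the study.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(diss)
print(coords)  # one (x, y) point per theorist
```

The key idea is that respondents who answer the items similarly end up close together on the resulting map, so the two axes can then be inspected and named, as the study does below.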

The first dimension ranged from users being simply an audience for the evaluation findings to users being involved in the evaluation at all stages, from start to finish. The second dimension, method proclivity, is the extent to which the evaluation is guided by a prescribed technical approach. One end of this dimension is characterized by partiality to a particular methodology with predetermined research steps; Boruch anchors this end, as experimental research design is at the centre of his approach to evaluation. The other end represents partiality to framing evaluations by something other than a particular methodology with predetermined research steps; Patton, for example, falls at this end through his utilization-focused evaluation, which is flexible by nature and calls for creative adaptation in its problem-solving approach.

So in this way, plotting the theorists’ responses on the two-dimensional axes helped to name and clarify the dimensions. The next step was to map the evaluators’ practice onto the dimensions in order to compare how practitioners rated against theorists. Evaluators were divided into two groups, internal and external, owing to their distinct professional characteristics. Results indicated that, generally, stakeholders have a more limited role in evaluations conducted by internal evaluators than in those conducted by external evaluators. In addition, internal evaluators are more partial to methodologies with predetermined research steps than are external evaluators. This analysis provides only a broad depiction of their practice; examining placement within each quadrant of the map produces a more comprehensive understanding.
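As an illustration of this quadrant-based reading, here is a sketch of how both groups might be plotted in the shared two-dimensional space. All coordinates are invented placeholders standing in for the MDS output; only the axis labels come from the study.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder coordinates, not the study's actual results.
rng = np.random.default_rng(1)
theorist_xy = rng.normal(size=(8, 2))     # the 8 theorists
evaluator_xy = rng.normal(size=(138, 2))  # the 138 practitioners

fig, ax = plt.subplots()
ax.scatter(evaluator_xy[:, 0], evaluator_xy[:, 1], s=12, alpha=0.4, label="evaluators")
ax.scatter(theorist_xy[:, 0], theorist_xy[:, 1], marker="x", color="red", label="theorists")
ax.axhline(0, linewidth=0.5, color="grey")  # quadrant boundaries
ax.axvline(0, linewidth=0.5, color="grey")
ax.set_xlabel("scope of stakeholder involvement")
ax.set_ylabel("method proclivity")
ax.legend()
plt.show()
```

Reading the plot quadrant by quadrant is what allows practitioners to be compared with the theorist whose practice statements land nearest to them.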

In general, external evaluators’ practice was more like the theorists’. More specifically, they were most closely aligned with the theorist Cousins. Furthermore, although they are concerned with stakeholder involvement, they are partial to their methods and conduct evaluations accordingly. Internal evaluators were distributed evenly among the theorists, which reflects the diversity in their practices and implies that we cannot generalize their categorization into any one genre of theoretical approach.

Christie goes on to discuss the theorists’ results and concludes that, through this study, it has become evident that even though theories may share big-picture goals, they don’t necessarily share the same theoretical tenets or similar practical frameworks. In addition, ‘the prescribed practices of a theory are not necessarily best reflected in its name or description’ (p. 29). Her other major point was that ‘despite some theoretical concerns related to stakeholder involvement, all of the theorists in this study do involve stakeholders at least minimally’ (p. 30). However, many theorists have not chosen to incorporate changing notions of such involvement because of ‘a common perception that… it is understood to be a part of the evaluation process, no matter one’s theoretical approach’ (p. 31).

As for practicing evaluators, they are often intimately involved in the program and therefore assume they understand how other stakeholders think and feel about it, and hence don’t tend to involve them a great deal. Politics may weigh more heavily on internal evaluators and may influence not only their decisions on stakeholder involvement but also their emphasis on prescribed methods. In terms of evaluator bias, the study found that internal evaluators may be aware of the importance of objectivity and tend towards more quantitative methods to increase the credibility of their findings. External evaluators, on the other hand, may be influenced by their colleagues’ perception of the methods they use, with the thought that criticism could jeopardize their potential for future work. Therefore, both types of evaluators employ method-driven frameworks, influenced by the perception that the results will be more defensible.

And finally, this study shows that theory is not requisite to evaluation practice; in fact, evaluators adopt only select portions of a given theory. Even ‘those who did claim to use a particular theory did not correspond empirically with the practices of the identified theorist’ (p. 33). Christie therefore concludes that ‘the gap between the common evaluator and the notions of evaluation practice put forth by academic theorists has yet to be bridged’ (p. 34).