

Lois-ellin Datta of Datta Analysis points to the importance of studying control and comparison group experiences when conducting experimental studies.

Some programs deserve to die. There’s been enough time for adequate training, implementation, learning, revisions, and fair evaluations. Direct and opportunity costs are high, evidence of benefits low.

Murder by evaluation is something else. Once it was easy—start a complex program late and demand an outcome evaluation early. But now we know that a fair assessment should examine implementation, gather baselines, offer formative support, seek understanding of factors associated with possible outcomes, involve stakeholders, and wait until the program reaches a reasonable level of maturity.

Another key aspect of evaluation fairness is design. Is value added compared to what would have happened without the intervention? There are many ways to satisfy this yearning, including interrupted time series in which the array of relevant measures is tracked for over 30 observation periods, and logic models that lay out “if-then” linkages with sound empirical reasoning. Randomized experimental designs become particularly attractive when stakes are high, because of the strong inference they offer under appropriate circumstances.
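The interrupted time series idea can be sketched in a few lines: fit the pre-intervention trend, project it forward as the counterfactual, and compare the projection with what was actually observed. The sketch below uses hypothetical monthly data (the baseline trend, noise level, and the +5 level shift are all invented for illustration), not data from any evaluation discussed here.

```python
# Interrupted time series sketch: project the pre-intervention trend
# forward and compare it with what is actually observed afterward.
# All numbers are hypothetical illustration data.
import random

random.seed(1)

PRE, POST = 24, 12                     # over 30 observation periods in total
t_pre = list(range(PRE))
# Hypothetical baseline: upward drift plus noise.
y_pre = [50 + 0.5 * t + random.gauss(0, 1) for t in t_pre]
# Hypothetical post-intervention series: same drift plus a +5 level shift.
y_post = [50 + 0.5 * t + 5 + random.gauss(0, 1)
          for t in range(PRE, PRE + POST)]

# Ordinary least squares on the pre period (closed form, one predictor).
mt = sum(t_pre) / PRE
my = sum(y_pre) / PRE
slope = (sum((t - mt) * (y - my) for t, y in zip(t_pre, y_pre))
         / sum((t - mt) ** 2 for t in t_pre))
intercept = my - slope * mt

# Counterfactual: what the pre-intervention trend predicts post-intervention.
projected = [intercept + slope * t for t in range(PRE, PRE + POST)]
effect = sum(a - p for a, p in zip(y_post, projected)) / POST
print(round(effect, 1))   # estimated level shift, close to the simulated +5
```

The many pre-intervention observations are what make the projected counterfactual trustworthy, which is why the design calls for a long observation series.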

But there are challenges. In one randomized experimental study of new treatments for the homeless, the men in the control and comparison groups, though destitute and often drug abusers, alcoholics, or mentally ill, had their own ideas about preferred treatments and the street skills to get them. As a result, the groups were too contaminated, in a statistical sense, to support valid conclusions about treatment effectiveness, leading to a finding of “no evidence of benefits” for the new, costlier treatments.

Comer, She Wrote
The Abt Associates evaluation of the Comer program in the Detroit public schools¹ suggests how to deal with such challenges. Comer’s program involves a complex effort to change the school’s culture to reflect principles of child, family, and community development.² Program implementation in Detroit was well supported for five years. An evaluation design with qualitative and quantitative elements involving randomized control schools was in place for almost seven years.

The envelope, please—comparison of the Comer and non-Comer schools on a wide array of outcome measures showed no detectable differences. Had the evaluation stopped here, it would have resulted in a negative finding that did not support Comer’s approach. The evaluators, however, had examined with diligence over five years what made for a “Comer program” and obtained this information for both the comparison schools and the Comer schools. Their analysis showed:

  • The comparison schools looked a lot like the Comer schools. At year five, both had about the same variation in degree of implementation of “the Comer program.”
  • Children in high-implementing Comer schools had notable benefits compared to low-implementing schools, as did those in high-implementing comparison schools compared to low-implementing comparison schools.
  • Children in both high-implementing Comer schools and high-implementing comparison schools did well, but children in the Comer schools did better.
  • Length of time in a Comer school predicted outcomes.

These findings, together with rich qualitative data, suggested that:

  • The comparison schools were not chopped liver. They were part of the stream of “new initiatives” flowing through most of the Detroit schools.
  • Some of Comer’s ideas were part of the zesty educational mix being tried to varying extents in many Detroit schools.
  • The principles underlying the Comer program seemed associated with positive results, regardless of the banner under which the principles were implemented.

The Importance of Control/Comparison Experiences
This example suggests the importance of understanding the experiences of control/comparison groups as well as those of the treatment/intervention groups: some control groups are likely to be active, and “no treatment” or placebo conditions are not necessarily meaningful.

For example, without such attention, the requirement of randomized experimental designs in the Congressionally mandated national evaluation of Head Start carries more than a slight risk of murder by evaluation. Here the “control group” parents, required to get work as part of welfare reform, may be anything but inert. The alternate services the control groups find may range from custodial care to well-implemented programs similar to Head Start in that they embody Head Start’s child development principles and standards. This is likely to add variability when child development changes are compared and to produce a small effect size or a no-difference finding.
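The dilution argument can be made concrete with a small simulation. In the sketch below, the program gain (+8 points), the score distribution, and the 60% rate at which "control" families find similar services are all hypothetical numbers chosen for illustration; the point is only the direction of the result, not its size.

```python
# Sketch of how an "active" control group dilutes an observed effect.
# The gain, variance, and 60% service-finding rate are hypothetical.
import math
import random
import statistics

random.seed(7)
N = 500

def outcome(gain):
    """One child's score: noisy baseline plus any program gain received."""
    return random.gauss(100, 15) + gain

def cohens_d(a, b):
    """Standardized mean difference using a pooled within-group SD."""
    pooled_sd = math.sqrt((statistics.pvariance(a) + statistics.pvariance(b)) / 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

treatment = [outcome(8) for _ in range(N)]        # program gain of +8 points

inert_control = [outcome(0) for _ in range(N)]    # truly untreated controls
# "Active" controls: 60% of families find alternate services embodying
# similar child development principles, so they get much of the same gain.
active_control = [outcome(8 if random.random() < 0.6 else 0) for _ in range(N)]

d_inert = cohens_d(treatment, inert_control)
d_active = cohens_d(treatment, active_control)
print(round(d_inert, 2), round(d_active, 2))  # the second is markedly smaller
```

The same underlying program effect appears much smaller against active controls, both because the mean difference shrinks and because the mixed control group is more variable.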

If in-depth knowledge is obtained about the actual experiences of the control groups, then the Abt analytic design could be applied. Possible murder by evaluation could be replaced with a fairer test of underlying principles and with more valid conclusions on which to base social policy.
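The analytic move described above can be sketched as follows: classify schools in both arms by measured implementation fidelity, then compare outcomes by fidelity as well as by label. The school records below are hypothetical illustration data, not numbers from the Abt evaluation; the pattern mirrors the bullet findings listed earlier.

```python
# Sketch of an implementation-aware comparison: split schools in BOTH
# arms at a fidelity threshold, then compare outcomes within and across
# arms. All records are hypothetical, not the Abt evaluation's data.
from statistics import mean

# (arm, implementation fidelity 0-1, mean student outcome)
schools = [
    ("comer",      0.9, 78), ("comer",      0.8, 75),
    ("comer",      0.3, 64), ("comer",      0.2, 62),
    ("comparison", 0.8, 73), ("comparison", 0.7, 71),
    ("comparison", 0.2, 63), ("comparison", 0.1, 61),
]

def arm_mean(arm, high=None):
    """Mean outcome for one arm, optionally split at fidelity 0.5."""
    return mean(o for a, f, o in schools
                if a == arm and (high is None or (f >= 0.5) == high))

# Label-only comparison looks small...
print(round(arm_mean("comer") - arm_mean("comparison"), 2))              # 2.75
# ...but high implementers beat low implementers within each arm.
print(arm_mean("comer", True) - arm_mean("comer", False))                # 13.5
print(arm_mean("comparison", True) - arm_mean("comparison", False))      # 10.0
```

The label-only contrast is near zero even though implementation fidelity strongly predicts outcomes in both arms, which is exactly why measuring what the "controls" actually experienced rescues the analysis.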

¹ Millsap, M. A., Chase, A., Obeidallah, D., Perez-Smith, A., Brigham, N., & Johnston, K. (2000). Evaluation of Detroit’s Comer Schools and Families Initiative. Cambridge, MA: Abt Associates.

² Comer, J. P., Haynes, N. M., Joyner, E. T., & Ben-Avie, M. (Eds.). (1996). Rallying the whole village: The Comer process for reforming education. New York: Teachers College Press.

Lois-ellin Datta
President
Datta Analysis
P.O. Box 326
Captain Cook, HI 96704
Tel: 808-323-8168
Email: datta@kona.net



© 2017 Presidents and Fellows of Harvard College
Published by Harvard Family Research Project