The U.S. Department of Education and a national government-policy group have issued a practical guide aimed at taking some of the mystery out of the national push for “evidence based” education.
The 19-page guide attempts to define what constitutes “rigorous” evidence of effectiveness when it comes to evaluating the research track records of educational programs and practices. It also provides some questions to use in weighing the evidence.
“Practitioners and policymakers are really struggling with what we mean by evidence-based policy and strong evidence of effectiveness,” Grover J. “Russ” Whitehurst, the director of the Institute of Education Sciences, the department’s primary research agency, said in an interview. “This is one among a number of useful tools for people who are thinking about how to select and use evidence for the educational decisions they have to make.”
Terms such as “evidence-based education” and “scientifically based research” have become a pressing topic of conversation among education decisionmakers since Congress passed the No Child Left Behind Act in 2001.
In keeping with a push from the Bush administration to incorporate scientific evidence into policymaking decisions, the law requires states and districts to use only practices backed by scientifically rigorous studies. But the Education Department has never issued formal guidelines to define the new research requirements, and practitioners are still trying to figure out how to implement them.
To produce the informal guide, the institute hired the Coalition for Evidence-Based Policy, a nonpartisan, Washington-based group that advocates the use of randomized field trials in evaluating government programs.
The guide was released last month during a meeting the group held in Washington for chief state school officers and other state policymakers.
While the conference-goers praised the new publication, some national education research groups have given it a cooler reception.
“This will be an influential document because people are hungry for this kind of understanding of randomized field trials, as well as other tools of evaluation research,” said James W. Kohlmoos, the president of the Washington-based National Education Knowledge Industry Association. “But it’s important for the field to keep reminding ourselves that there are multiple ways to address different research questions, and this is one of them.”
Mr. Kohlmoos and others referred to the guide’s heavy focus on randomized field trials or randomized controlled trials as a “gold standard” for high-quality research. Because such studies randomly assign participants to either experimental or control groups, experts say they eliminate other factors that might cause the same outcomes in studies.
But critics complain that the Bush administration’s laserlike focus on randomized experiments leaves little room for other kinds of research that can also help build a knowledge base for the field.
‘Strong’ or ‘Possible’?
In keeping with that general thrust, the guide maintains that only well-designed randomized, controlled studies provide “strong” evidence of an intervention’s effectiveness.
Though rare in education, such studies have been used to test one-on-one tutoring programs for pupils deemed at risk of academic failure, life-skills training for junior high school students, small class sizes in grades K-3, and other interventions—all with positive results.
But the guide also says that other kinds of studies more common to the field, such as comparison studies, can provide evidence of “possible” effectiveness. The caveat, though, is that the groups being compared in those studies must be closely matched in prior test scores, demographics, and the time periods in which they are studied, among other factors.
Even then, comparison-group studies and randomized experiments sometimes reach opposite conclusions. A case in point, the publication notes, is the medical research testing the effectiveness of hormone-replacement therapy for reducing heart disease in women.
Despite 30 years of comparison studies suggesting the treatment was effective, a randomized experiment recently found that it increased the risk of heart disease, stroke, and breast cancer.
On the other hand, “pre-post” studies, used often by school districts, never provide meaningful evidence of effectiveness, according to the guide.
The problem with those types of studies—which compare the same group before and after an intervention—is that it’s hard to know whether the improvement would have occurred anyway.
The guide also notes that not all randomized studies are created equal. In school settings, findings from such studies lose strength, for instance, if parents insist on putting their children in the experimental group, if students volunteer to take part, or if too many students drop out of the study altogether and researchers lose track of them. Problems also arise, according to the guide, if sample sizes are too small.
“A rough rule of thumb,” it says, “is that a sample size of at least 300 students (150 in the intervention group and 150 in the control group) is needed to obtain a finding of statistical significance, for an intervention that is modestly effective.”
If schools or classrooms, rather than individual students, are randomized, it says, the minimum sample size should be 50 to 60 schools or classrooms.
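For readers curious where a figure like 300 students could come from, the rule of thumb can be approximated with a standard power calculation for comparing two group averages. The sketch below is illustrative only: the 5 percent significance level, the 80 percent power target, and the function names are assumptions made here, not details taken from the guide, and the calculation covers only the case in which individual students are randomized.

```python
import math
from statistics import NormalDist


def students_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate students needed in EACH group to detect a difference of
    `effect_size` standard deviations between two group means.

    Uses the normal approximation for a two-sided test. The alpha and power
    defaults are conventional choices assumed here, not values from the guide.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest effect, in standard deviations, that a study with
    `n_per_group` students in each group can detect under the same assumptions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    # 150 students per group, as in the guide's rule of thumb
    print(round(detectable_effect(150), 2))
    # students per group needed for a smaller effect of 0.2 standard deviations
    print(students_per_group(0.2))
```

Under those assumed settings, 150 students per group is enough to detect an effect of roughly 0.32 standard deviations, which is one plausible reading of “modestly effective.” Smaller effects call for considerably larger samples.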
“One of the benefits of this study is that when people start reading it, they see how complex this really is,” said Gerald R. Sroufe, the government-relations director for the American Educational Research Association, a Washington-based group. “Will this help people understand what all the shouting’s about? It will.”