Despite Teacher-Evaluation Changes, the 'Widget Effect' Is Alive and Well

Despite widespread efforts to make evaluation systems more truthful, most teachers continue to receive good teacher-evaluation ratings, including a handful who probably don't deserve them, according to a recently released working paper.

The findings largely mirror what Education Week reported in 2013, when the first results from systems retooled in the wake of the federal Race to the Top and No Child Left Behind waivers were released. States may have built a better mousetrap, but they haven't changed the cultural norms at work in schools that can affect how principals and other evaluators assign ratings.

For the study, Matthew Kraft of Brown University and Allison Gilmour of Vanderbilt University collected data from 19 states with revamped teacher-evaluation systems. For a large, unnamed school district, they also collected surveys from evaluators in 2012-13 and 2013-14, asking them to guess the percentage of teachers who would fall into each rating category and comparing those figures to how the teachers were actually rated. Finally, they interviewed some 24 principals.

Here are the top-line findings.

First, the percentage of teachers rated below proficient was generally quite low, ranging from below 1 percent to about 8 percent. New Mexico, with more than a quarter of teachers falling into that category, was a major outlier, and it has gotten a lot of pushback from its teachers for the tough grading. Interestingly, the range of performance at the top end was much more spread out. Very few teachers in Georgia or Massachusetts earned their state's highest rating, but more than half did in North Carolina, Rhode Island, Colorado, and Tennessee.

Second, the evaluators in the large school district were far more likely to perceive weaknesses in teachers than they were to actually give them a low score: In the 2012-13 school year, for instance, evaluators perceived that nearly 27 percent of teachers were below proficient, but only about 7 percent received that score.

In other words, the findings seem to indicate that "the Widget Effect" is alive and well. The name was coined in an influential 2009 paper by teacher-training group TNTP that suggested that teachers' evaluations are inflated and teachers themselves aren't given good feedback on how they're actually doing.

In interviews, principals said they hesitated in giving poor ratings for fear it would demoralize a teacher even further. In some cases, they noted, it seemed easier to "counsel out" a teacher, giving her a good rating in exchange for her agreement to leave, than to follow the state's lengthy, bureaucratic firing process or tussle with the teachers' union.

The Washington Post noted that some outside researchers, briefed on the findings, expressed concern that some teachers' ratings don't really match their performance. Not only does it potentially mean poor performance is going unaddressed, it's also an issue that's hard to fix through administrative means.

They also noted, though, that we shouldn't necessarily expect the same breakdown of ratings in each state. The incentives built into each state's system, such as whether the evaluations are tied to job security or pay, likely affect how principals implement the systems and how teachers respond to them.
