Principals continue to rate nearly all teachers as “effective,” despite states’ efforts in recent years to make evaluations tougher, two new studies show.
And there’s good evidence that those scores are inflated: When principals are asked their opinions of teachers in confidence and with no stakes attached, they’re much more likely to give harsh ratings, the researchers found.
That’s in part because principals want to maintain good relationships with their teachers, which can be tough to do when they have to confront them with bad reviews, the researchers say. For some principals, though, the hesitation to give low scores is a product of being strapped for time.
“It’s very, very time-consuming to document poor performance,” said Marilyn Boerke, a former principal who is the director of talent development for the Camas school district in Washington state. “At the end of the year, if you haven’t repeatedly gone into the classroom and given the teacher suggestions for improvements, it’s not really fair to give a poor evaluation.”
In 2009, TNTP (formerly the New Teacher Project) published a striking report, “The Widget Effect,” which found that less than 1 percent of teachers were being rated as unsatisfactory. Since then, many states have worked to put more-rigorous evaluation systems in place, including by incorporating student test scores.
But according to the pair of new studies, little has changed. On formal district evaluations, nearly all teachers continue to be deemed effective.
“We’ve invested a lot in making these systems rigorous, and yet they still seem to identify the vast majority of teachers as effective, especially when you look at the observation ratings from principals,” said Jason Grissom, an associate professor of public policy and education at Vanderbilt University, who co-authored one of the studies with Susanna Loeb, an education professor at Stanford University.
‘Somebody’s Job Is in Your Hands’
That study, published recently in the journal Education Finance and Policy, analyzed how 100 principals from Miami-Dade County public schools rated the same teachers in two different settings: a confidential one-on-one with the researchers and the formal district evaluation.
On district evaluations, which could have consequences for compensation and employment, nearly every teacher was rated as “effective” or “very effective” on all the standards measured. In the confidential setting, the scores were still positive overall, but principals were much more likely to give low ratings.
In fact, teachers who received scores of “very ineffective” on the low-stakes assessment were, on average, deemed “effective” on the high-stakes evaluation.
“The stakes here are really important,” said Grissom. “When they talk to the researchers, there are no stakes attached: we’re not going to do anything, it doesn’t count for anything.” It makes sense that a principal would, in that case, give “a true assessment,” he said.
The tendency to be more lenient on a district evaluation is understandable, said Jennifer E. Nauman, the principal at Shields Elementary School in Lewes, Del. “Somebody’s job is in your hands,” she said. “The rubric is very subjective.”
Another study, to be published soon in Educational Researcher, also found a disconnect between what principals said about their teachers privately and what they reported in a formal review.
The researchers, Matthew Kraft, an assistant professor of education and economics at Brown University, and Allison Gilmour, now an assistant professor of special education at Temple University, surveyed more than 200 principals in a large urban district in the Northeast. Again, evaluators identified far more teachers as weak in a confidential survey than they did on the formal district evaluations.
For instance, the 2014-15 data show that evaluators perceived 19 percent of teachers as below proficient—but they rated only about 6 percent of teachers that way on the district assessment.
Kraft and Gilmour’s study also looked broadly at teacher ratings in 24 states that had overhauled their evaluation systems.
Nearly all teachers in most of those states continued to get positive ratings. Hawaii was the least likely to designate teachers as ineffective or needing improvement.
But New Mexico was an outlier. There, about 1 in 4 teachers were rated as either minimally effective or ineffective, the state’s two lowest categories.
While nearly every other state had less than 1 percent of teachers in the ineffective category, New Mexico had 5 percent in that lowest designation.
But how long New Mexico retains its outlier status remains to be seen. Teachers there have fiercely pushed back on the stringent evaluation policies, which have been dubbed the toughest in the country. And the governor recently announced the state would be making major changes to the system.
A Matter of Time
So what’s behind these almost universally high ratings from principals? Some say it’s the need for positive relationships with their staffs.
With the district evaluations, “teachers know what the rating is,” Grissom said. “In many systems, that involves a postconference. If I gave you low ratings, that would be very uncomfortable for me to talk to you about. … We have to take seriously the fact that teacher evaluation is a relational enterprise.”
In interviews for the Kraft and Gilmour study, principals talked about personal discomfort as well. One veteran principal is quoted in the report as saying, “The most difficult part of the job is probably to deliver those difficult messages, and not everyone is capable of that.”
But other principals not involved in the studies push back on that notion.
“Those are challenging conversations, and you don’t want to hurt someone’s feelings,” said Boerke. “But the principals I know do not shy away from those conversations.”
Dwayne Young, who was an administrator in Fairfax County, Va., for 17 years before recently retiring, said giving honest feedback isn’t hard for administrators—but assessing the complex process of teaching can be.
“Principals do strive to have great relationships,” he said. “But I don’t think they would not evaluate someone according to what they believe to be really good instruction.”
Concerns about teacher turnover can also lead to high ratings, some say.
“It would be a rational response for a principal to think, if I give this person a low score, they might get angry and leave my school,” said Grissom, “or they might be dismissed, and then I have to replace this person, and I might be facing a hiring pool that doesn’t look appreciably better than the teacher who would leave.”
Among the largest factors, though, many say, is time.
“We’re spread so thin as administrators,” said Boerke of the Camas school district. “When all’s said and done and it’s June and you’re responsible for submitting 32 evaluations, you’d err on the side of effective if you don’t have the documentation to prove ineffective.”
Interestingly, a closer look at the scores given in the high-stakes evaluations showed that principals actually were differentiating between teachers. They were just doing so within the “effective” categories.
Even though nearly all teachers got 3s and 4s (on a 4-point scale), which labeled them “effective,” the 3s seemed to be going to the weaker teachers, Grissom and Loeb found. And teachers with the lower evaluation scores also had lower value-added measures—which aim to determine how well a teacher is doing using student test scores.
“There is a difference between a teacher rating of effective and highly effective,” said Grissom. “It’s just not the level of differentiation that when these systems rolled out people thought they would see.”