New research attempts to determine the best way to measure principal effectiveness using students' test scores—and finds that the task is trickier than anticipated.
Principal evaluation presents a number of questions: Should principals be evaluated based on the performance of teachers they didn't hire? Should they be measured for their immediate impact or for growth over time? How can we actually compare one principal to another working in an entirely different school or district context?
In the new research, Jason A. Grissom at Vanderbilt University and Demetra Kalogrides and Susanna Loeb at Stanford University propose and examine several possible methods of using test scores to evaluate principals, adjusting in each case for student background characteristics that might affect academic performance. They analyze three broad approaches to evaluation: tying principal performance directly to schools' performance ("school effectiveness"); comparing different principals' performance at the same school ("relative within-school effectiveness"); and examining growth in student achievement over a principal's tenure ("school improvement").
Tying principal performance directly to test scores ("school effectiveness") may seem too blunt. "You're capturing principal performance, but you're capturing a number of other things, too," Grissom said in an interview. And yet, in the end, the researchers found that this measure aligned most closely with what they called "nontest measures of performance," like surveys of parents. However, this may mean that those outside indicators were reflecting the general performance of the school more than the performance of the principal, Grissom said.
The other methods may be theoretically more appealing, as they attempt to home in on what the principal actually controls. But they present logistical and practical challenges, and the results, the researchers write, "inspire less confidence." Comparing different principals' performance at the same school is difficult because principals don't change jobs frequently enough to allow accurate comparisons. (Nor would we want them to!) Assessing growth in student achievement over a principal's tenure might be more effective, but it requires a large amount of data and, in this analysis, was not particularly reliable. Neither of these measures matched up well with the "nontest measures of performance." They also didn't match up with each other.
So, in the end, this research doesn't necessarily clarify matters for policymakers looking to come up with effective principal evaluation tools. The paper concludes with a caution: "The warning that comes from these analyses is that it is important to think carefully about what the measures are revealing about the specific contribution of the principal and to use the measures for what they are, which is not as a clear indicator of principals' specific contributions to student test score growth."
But policymakers are moving toward tying principal evaluation and effectiveness to student achievement. Several states, including Florida, where this research took place, have laws requiring principal evaluations to be tied to students' test scores.
Grissom said in an interview that this doesn't mean that districts should avoid looking at test scores at all. "It means that you have to be thoughtful about it, and cognizant of what you're measuring. And you want to use it in conjunction with other sources of data if you're going to use it for accountability," he said. That lines up with recommendations the two national principal associations put out earlier this year.
A look at the effectiveness of evaluation systems actually in place in large districts or in states that mandate this kind of connection could be revealing. This study's data come from the Miami-Dade public school system and cover 2003 to 2011. Grissom said that he and his colleagues will likely look at systems in practice in Miami and potentially elsewhere.
Meanwhile, principal effectiveness will be the topic of an upcoming forum and public discussion hosted by the ASCD, as we shared last week.