A more flexible test, given on the devices schools and students are already using, that quickly produces actionable information for educators and policymakers: That’s the vision going forward for the test known as the Nation’s Report Card.
The National Assessment of Educational Progress, or NAEP, is the only national, comparative gauge of K-12 student achievement. The pandemic—as it did with so many other fields—utterly upended things, resulting in the disappointing cancellation of its 2021 administration. Now its leaders say they’ve taken what they’ve learned to heart and are devising plans for a more resilient, purposeful exam.
In a lengthy blog post last week, Peggy Carr, a longtime civil servant who was named the commissioner of the National Center for Education Statistics (NCES) in August 2021, and Lesley Muldoon, the executive director of the National Assessment Governing Board (NAGB), outlined these priorities. (The two agencies, both within the U.S. Department of Education, share the responsibility for the exam. NAGB develops the test frameworks and policies, while NCES analyzes all the numbers and reports out the results.)
In interviews with Education Week, the leaders explained more about what these priorities will mean for NAEP.
Their blueprint is expected to be bolstered by —one of a series commissioned in 2018 to study the NCES as part of its 150th anniversary.
Here’s a rundown of what’s to come for NAEP.
1. Soon, NAEP will be given on different devices.
It wasn’t that long ago, in 2017, that the venerable exam began to be given on devices rather than paper and pencil fill-in-the-bubble forms. Testing agents went out to schools with special laptops to administer the exams, eliminating the need for all those pesky scoring sheets.
But that wasn’t enough to keep NAEP going during the pandemic. Schools were operating in widely different modes (some in-person, some hybrid, some virtual only), and test contractors couldn’t access all of them. It all threatened to skew the results so badly that the data wouldn’t have been usable. The agencies had no choice but to push back the test.
As in many other fields, the disruption caught the agencies unprepared. “When it hit we were kind of caught off guard, flat footed in a way. We could not reach these students,” said Carr. “And I think there was this awakening of the large-scale assessment community and stakeholders that we were not prepared to do what they needed us to do when the chips were down. Our infrastructure was not ready.”
That’s the impetus for plans to design a way for NAEP to be taken on different kinds of devices, like Chromebooks or school-issued laptops—whatever’s in use where students are taking the exam.
This won’t be completely up and running until 2026, but when it is, NAEP will be in much better shape to weather another massive disruption to schools. Fewer contractors will also be needed to make the testing happen. And over time, this could also produce more accurate results. Here’s why.
A growing number of students are enrolled in some kind of online learning program. There used to be no way to capture these students because NAEP was only administered at physical school buildings. After this switch, though, these students could possibly be included—and that would help maintain an accurate picture of achievement as more students enroll in virtual offerings.
Making NAEP “device agnostic” does pose some interesting technical challenges for the agencies. They’ll have to ensure kids don’t have an unfair advantage from using one kind of device instead of another. (In the early days of online testing, researchers found a “mode effect” that produced higher scores for students tested using paper-and-pencil vs. online tests; NAEP will need to be sure some devices don’t produce their own mode effect.) This will require slow, steady work and pilot studies to perfect.
2. NAEP will experiment with adaptive testing and other innovations.
When we think of a test, we think of every student getting the same set of questions. Computer-adaptive testing is different. This kind of test varies the questions students get as they answer: Miss the first few and a student is given easier questions; get them right and they’ll get more difficult ones. The benefit, in theory, is getting better information about either very high- or low-performing students. (On a traditional exam, most questions are in the middle range, not at the very easy or hard levels.)
The approach is used by the Smarter Balanced series of K-12 state exams, as well as the GRE, a popular graduate school entrance exam.
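To make the adaptive idea concrete, here is a minimal sketch in Python of the simplest possible version: step up one difficulty level after a correct answer, step down after a miss. It is an illustration only, not how NAEP, Smarter Balanced, or the GRE actually select questions; operational adaptive tests rely on statistical models of item difficulty. The five difficulty levels, the item names, and the functions `build_item_bank` and `run_adaptive_test` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A hypothetical test question with a difficulty level (1 = easiest)."""
    name: str
    difficulty: int


def build_item_bank(levels=5, items_per_level=5):
    """Create a made-up bank: levels 1..levels, each with a few items."""
    return {
        d: [Item(f"L{d}-Q{i}", d) for i in range(1, items_per_level + 1)]
        for d in range(1, levels + 1)
    }


def run_adaptive_test(item_bank, answers_correctly, num_questions=5, start_level=3):
    """Give questions one at a time, moving difficulty up or down by one level.

    item_bank: dict mapping consecutive integer levels to lists of Item
    answers_correctly: callable(Item) -> bool, standing in for the student
    Returns a list of (item, was_correct) pairs in the order administered.
    """
    lowest, highest = min(item_bank), max(item_bank)
    level = start_level
    used = set()
    history = []
    for _ in range(num_questions):
        # Pick the first question at this level the student has not seen.
        candidates = [it for it in item_bank[level] if it.name not in used]
        if not candidates:
            break  # this level is exhausted; stop early
        item = candidates[0]
        used.add(item.name)
        correct = answers_correctly(item)
        history.append((item, correct))
        # Correct answer: harder next time. Miss: easier next time.
        step = 1 if correct else -1
        level = min(max(level + step, lowest), highest)
    return history


if __name__ == "__main__":
    bank = build_item_bank()
    # A simulated student who can handle anything up to difficulty 3.
    result = run_adaptive_test(bank, lambda item: item.difficulty <= 3)
    for item, correct in result:
        print(item.name, "correct" if correct else "missed")
```

In this toy version, a student who keeps answering correctly climbs quickly to the hardest questions and stays there, while a struggling student settles among the easiest ones. That is the sense in which an adaptive test can gather more information at the extremes.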
Now, NAEP will investigate using computer-adaptive technology, too. This is a bit of a challenge because unlike state tests or the GRE, NAEP doesn’t measure any one individual student’s outcomes. The results we see are a composite score of lots of students who all took different segments of the exam.
Still, Carr said, it’s possible to use the technology within the discrete block of questions each student takes. And if it’s successful, it should help to generate more fine-grained information on what students who are scoring at NAEP’s lowest achievement levels are having the most difficulty with, and similarly what sets apart top performers. (That’s important because of a disturbing recent trend, both on NAEP and international tests, of these two groups’ performance moving in opposite directions.)
NAEP also wants to experiment with artificial intelligence to help it write new exam questions and to help score open-ended questions—both technically tricky ideas that could offer significant cost savings.
And it wants to support teachers, policymakers, and others to use the findings as they come out.
“How can we speed up the return of results and get them back in people’s hands faster? How can we help researchers dig into the raw data of NAEP more quickly so they can answer questions that, as federal agencies, are a bridge too far for us?” said Muldoon, ticking off some of the driving questions she, Carr, and their teams will consider. “How do we translate the results into language real people can understand? How do we modernize the infrastructure [to help] with things like speeding up results? We want to explore those kinds of ideas and utilities so NAEP is as relevant as it can be.”
3. An important measure of the pandemic’s impact on learning is on the runway.
In December 2021, NAGB moved to give the long-term trend exam to 13-year-olds in addition to 9-year-olds. That work is beginning now.
The long-term trend exams are the only continuous measure of student achievement, dating from the 1970s. (By contrast, the trends for the main NAEP, which produces state-by-state results, get reset each time NAGB updates the testing blueprint.)
This is a bit of balm after the disappointing delay of the main NAEP in late 2020. And it will offer a tight pre- and post-pandemic gauge of learning, because the long-term trend exam for those two age groups was also the final exam given before nearly every school shuttered in spring 2020. (EdWeek’s Sarah Sparks took a look at the results from the last long-term trends test.)
Results for 9-year-olds are just finishing up now, and they’ll be completed for 13-year-olds this fall—alongside the regular NAEP exams. When the results are released, they will be the only national measure of the pandemic’s impact on learning.
The NAEP folks do face a small interpretative challenge in releasing these results. The long-term trend exams haven’t changed significantly since they were created, and they tend to measure foundational knowledge and skills rather than higher-order ones. This means that expected pandemic-related declines on this measure might not show up as steeply as they do on other measures—especially if basic content is what teachers have prioritized the last few years.
4. The NAEP experts will spotlight equity.
Via better yardsticks for poverty and more context in its reporting, the agencies want to add clarity to the discussions of achievement patterns on NAEP.
For example, NAEP reports often talk about gaps in student performance. That’s important, but without context, such findings risk fueling a narrative that somehow students are to blame for these disparities—rather than their varied experiences and uneven access to well-funded schools and good teaching. (Some K-12 researchers and media organizations, including Education Week, now generally prefer to call them “opportunity” rather than achievement gaps.)
And analyses of how students perform often counter stereotypes. Carr pointed out, for example, significant progress in the proportion of Black high school students who took calculus, according to the NCES’ most recent report on high school transcripts; such a picture is one of resiliency and improvement, she noted.
Equity is important, if somewhat politically touchy, territory for the organizations. The term “equity” has become a lightning rod in discussions about race and schooling, and even NAEP has been no exception. NAGB faced some internecine drama last year over equity when it was finalizing a new reading framework, though most of it ultimately centered on disagreements about how best to assess lower-performing students fairly.
Muldoon of NAGB said the organization is also commissioning studies about how new test frameworks, like its upcoming science revision, can continue to embrace equity and give all students a shot to show what they know—while maintaining technical quality.
NAEP will also continue to work to get a better indicator for students’ socioeconomic status. The usual measure, eligibility for free and reduced-price lunch, is increasingly problematic because of policy shifts that permit more students to receive those services regardless of income level.
5. NAEP’s architecture will continue to support new research.
During the pandemic, the Biden administration issued an executive order requiring the Education Department to track the pandemic’s impact on schools, which led to surveys showing the proportion of schools using different modes of learning. To pull this off, NCES used the NAEP architecture to get the surveys out quickly. After all, NAEP testing relies on a nationally representative sample of schools, and that happens to be just what researchers need for surveys.
In fall 2021, the NCES extended that approach for its new “pulse” surveys, designed to provide additional quick-turnaround survey research. And the agency has set itself , while also slimming down how long the surveys take to fill out and number-crunch. (NCES’ other major collections, on principals, teachers, school finance, and scores of other indicators, typically take a few years to complete.)
“It was an example of how nimble and flexible NAEP can be,” Carr said. “We need to take advantage of this infrastructure to help us quickly go in, ask a few questions of schools—thousands of schools—and gather the information that’s needed.”