
Standards & Accountability

Transition to Online Testing Sparks Concerns

By Catherine Gewertz | October 29, 2013 | 9 min read

When tens of millions of schoolchildren sit down at computers to take new common assessments in spring 2015, many of their peers will be taking similar tests the old-fashioned way, with paper and pencil, raising questions about the comparability of results, as well as educational equity, on an unprecedented scale.

Both state consortia that are designing tests for the Common Core State Standards are building computer-based assessments, but they will offer paper-and-pencil versions as well, as states transition fully to online testing. The Smarter Balanced Assessment Consortium plans to run the two simultaneous "modes" of testing for three years. The Partnership for Assessment of Readiness for College and Careers, or PARCC, will do so for at least one year.

In order to rely on the results, however, the consortia must show that the paper and computer modes of the tests in English/language arts and mathematics measure the same things.

The prospect of establishing such comparability between two versions of a test isn't new. States have long used established statistical and psychometric practices to do so when they update their paper-and-pencil tests, for instance, or when they transition from paper-based tests to computer assessments. But the challenge before the two consortia ups the ante by hanging the validity of far more children's test scores on the "linking" or "equating" process conducted by each group.

"In the assessment profession, we need to be able to back up claims we make about students' and schools' performance. Any threat to validity is a threat to those interpretations," said Richard Patz, the chief measurement officer at ACT Inc., which is conducting comparability studies of its own as the Iowa City, Iowa-based company introduces a digital version of its college-entrance exam.

Thorny questions have arisen, too, about whether children who take the paper-and-pencil version of the consortia tests will be at a disadvantage, or perhaps have an edge, compared with their peers who take the computer version.

Could children in high-poverty areas, where technological readiness will likely be lower, lose something valuable by not interacting with the new tests' technologically enhanced items, such as drawing and drag-and-drop functions? Would they actually benefit by sticking with paper exams if they are more comfortable taking tests in that mode?

Mixed Landscape

Consortia leaders say they are confident that comparability and equity questions will be fully addressed by the time the tests make their debut in 2015.

"It's something we need to do carefully, and we intend to do it carefully," said the executive director of the 25-state Smarter Balanced group, Joe Willhoft, who oversaw such studies as the assessment director in Washington state.

Jeffrey Nellhaus, the testing director for PARCC, which includes 18 states and the District of Columbia, said the group's test designers are "very sensitive" to comparability questions and are planning studies to answer them.

Common-Core Exams Go Interactive

[Image: Both of the state testing consortia will include technology-enhanced questions on their computer-based exams, such as this interactive sample item from the Smarter Balanced group.]

SOURCE: Smarter Balanced Assessment Consortium

About 40 million students attend school in the states that belong to the two consortia. But much is still unknown about how many will take paper tests in 2015 and how many will use a computer. Even rough feedback, however, shows a strong likelihood that large swaths of students will be picking up their No. 2 pencils.

Survey data collected in July by Smarter Balanced (also more of an approximation than a full accounting) show a wide range of technological readiness.

Oregon, long a leader in online assessment, reported that all its districts were capable of giving tests online, while only 45 percent of California's districts did likewise. For PARCC, Mr. Nellhaus ventured a guess of a 50-50 split, but emphasized that data on districts' and schools' readiness are far from complete.

The consortia will not decide who takes the paper-and-pencil version of the test and who takes the computer version, officials said. That will be up to states, and in some cases, individual districts or schools.

Ideally, test results are "indifferent" to the mode in which the test is given, said Henry Braun, a longtime researcher with Princeton, N.J.-based test-maker Educational Testing Service and now an expert in educational evaluation and measurement at Boston College. If the mode of administration helps or hampers some students, the results are distorted, he said.

Differences in Format

Assessment experts say it's much easier to establish comparability when two tests are similar in format, such as a multiple-choice test on paper that becomes a multiple-choice test on the computer. But even then, comparability issues can arise.

A student who must read a text passage in order to answer a multiple-choice question, for example, might be able to read the entire passage on one page of the paper test, but on the computer, she must scroll up and down to do so. Such shifts can affect the performance of some students, said a longtime assessment expert at a major testing company. (Like most experts interviewed for this story, he agreed to speak only if his name was withheld because of his employer's contracts with the assessment consortia.)

Comparability challenges deepen when tests differ significantly in format, experts said. In the case of the two state consortia, their computer-based exams (with technology-enhanced items such as interactivity and animation, and longer, more complex performance tasks) will be able to represent ideas in ways that the paper versions cannot, so establishing comparability between the two will be tougher.

"When an assessment has types of items only available in one mode, it creates a greater challenge for establishing comparability, but it's a familiar one and it's generally a manageable one," said ACT's Mr. Patz.

The other expert, however, said that while the consortia's comparability challenge is "not a fatal problem, it needs to be thoughtfully negotiated and represented to anyone who will use those test scores."

That source said it's not possible to measure everything in the paper-and-pencil version that can be measured in the computer-based version.

"In the technical sense of 'comparable,' the two might not be comparable," he said. "If you were successful in measuring the same things, which would be a stretch if the computer-based version's items are truly innovative, it could well be the case that one [test] could be harder or easier than the other because of how the items are presented."

Writing From Scratch

Assessment specialists outlined various ways to establish comparability between the paper and computer versions of a test. One is to use a set of common items in both, so test designers can compare student performance on those items in the two modes. Another is to randomly assign students to take one or the other mode of the test. Better yet, a study group of students can be selected to take both the paper and computer versions. Consortium officials said such methods are being planned or considered for field tests next spring.
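The common-items approach the specialists describe can be illustrated with a toy "mean-sigma" linking, one of the simplest classical equating techniques. The data and function below are hypothetical, and the consortia's actual studies rely on far richer psychometric models; this sketch only shows the underlying idea: use the anchor items' means and standard deviations in each mode to map one mode's scores onto the other's scale.

```python
# Minimal sketch of mean-sigma linking on common ("anchor") items.
# All data here are hypothetical; real equating studies use IRT-based models.
from statistics import mean, stdev

def mean_sigma_link(anchor_paper, anchor_computer):
    """Return a function mapping paper-mode scores onto the computer-mode
    scale, using the anchor items' means and standard deviations."""
    m_p, s_p = mean(anchor_paper), stdev(anchor_paper)
    m_c, s_c = mean(anchor_computer), stdev(anchor_computer)
    slope = s_c / s_p                  # rescale spread
    intercept = m_c - slope * m_p      # recenter mean
    return lambda score: slope * score + intercept

# Anchor-item totals for two samples of students (made-up numbers):
paper_anchor = [12, 14, 15, 17, 18, 20]
computer_anchor = [14, 16, 17, 19, 20, 22]

link = mean_sigma_link(paper_anchor, computer_anchor)
print(link(16))  # a paper score of 16 maps to 18.0 on the computer scale here
```

The same machinery underpins the other designs mentioned: random assignment and take-both-forms studies simply give cleaner estimates of the means and spreads being aligned.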

Testing experts also said it's best to create assessment questions from scratch for the paper-based assessment, rather than building paper versions of test items originally designed for the computer.

"You can't replicate the interactivity of the computer environment on paper," said one testing expert. "You need to build alternate forms of the test that measure the same standards [on paper]."

Mr. Willhoft from Smarter Balanced said that his group is adapting items written for the online environment to paper. Mr. Nellhaus from PARCC said its developers are writing paper items from scratch to use in place of technology-enhanced items on the computer, but more traditional item types can be used in both modes.

PARCC's field test next spring will include paper-based as well as computer-based exams, Mr. Nellhaus said. The Smarter Balanced field test will include paper forms only for a small group of students, to study comparability, Mr. Willhoft said.

"There's no denying that there will be some items that will be difficult to translate into the paper environment," said Mr. Willhoft. One of the consortium's math items, for instance, asks students to click on images of a cylindrical shape and a rectangular one in an exercise about volume. "But there's nothing inherent in a given standard that requires a certain kind of interactive item," he said. "You can measure the same standard in different ways."

Smarter Balanced faces an extra layer of complexity in comparability because its test is computer-adaptive, meaning it adjusts questions to the test-taker's skill level.

"With an adaptive test, you see right away what questions a kid needs," said Lauress L. Wise, a principal scientist with the Monterey, Calif.-based Human Resources Research Organization, which has performed quality assurance and evaluation on testing systems such as the National Assessment of Educational Progress. "With paper and pencil, you'd have to offer a lot more questions, a longer test, to make it comparable to that. If you can't do that, you won't be measuring the end points [of achievement] as well."

Mr. Willhoft acknowledged that the paper version of the Smarter Balanced test will be "less precise, with a larger measurement error" at those points in the spectrum.
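The adaptive mechanism Mr. Wise describes can be sketched as a toy item-selection loop. This is not the Smarter Balanced algorithm, which is built on item-response-theory models; it is a hypothetical illustration of the core idea that each response nudges an ability estimate, which in turn picks the next question.

```python
# Toy sketch of adaptive item selection (illustration only, not a real
# testing algorithm): after each response, present the unused item whose
# difficulty is closest to the current ability estimate, and nudge the
# estimate up or down depending on whether the answer was correct.

def adaptive_test(item_difficulties, answers_correctly, start=0.0, step=0.5):
    """item_difficulties: difficulty values on an arbitrary scale.
    answers_correctly: callable(difficulty) -> bool simulating a student."""
    ability = start
    remaining = list(item_difficulties)
    while remaining:
        item = min(remaining, key=lambda d: abs(d - ability))
        remaining.remove(item)
        if answers_correctly(item):
            ability += step
        else:
            ability -= step
        step *= 0.8  # smaller adjustments as evidence accumulates
    return ability

# A hypothetical strong student who answers anything at difficulty <= 1.5:
estimate = adaptive_test([-2, -1, 0, 1, 2], lambda d: d <= 1.5)
```

The loop converges toward the student's level with few items, which is exactly why a fixed paper form needs many more questions to measure the extremes of achievement as precisely.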

In seeking comparability, a key consideration is what kinds of conclusions will be drawn from the scores on the two types of tests, said Mr. Wise. The degree of comparability takes on added significance when high-stakes decisions are based on the results, he said.

"If this were a graduation test, and some kids were getting denied diplomas because they took one form or another, you could make a plausible argument why there could be a lawsuit," Mr. Wise said. "That could get sticky."

Quality of Tasks

The fact that paper-and-pencil tests might be more widely used in lower-income areas is something that officials at the Education Trust, which advocates school improvement for disadvantaged students, are keeping an eye on. But those potential questions of equity revolve more around the quality of the assessment, and the teaching that goes with it, than around the mode of the test, they say.

Christina Theokas, the organization's director of research, said she worries that if the paper test is less complex and instructionally rich than the computer version, classroom instruction could mirror that.

But students aren't necessarily at a disadvantage just by taking a paper-and-pencil test, said Sonja Brookins Santelises, the Education Trust's vice president of K-12 policy and practice. Top-notch paper tests such as NAEP and Massachusetts' statewide exams demonstrate that, she said. The important thing to watch is not the mode in which a test is administered, Ms. Santelises said, but "the quality of the task" and how well students are prepared for it.

"You can do a rudimentary task on a computer and have it not be beneficial, and you can have a paper-and-pencil task that's instructionally rigorous and very beneficial," she said. "Are students going to have access to the kind of experiences and curriculum that prepare them for those kinds of tasks? Are teachers being prepared and supported to do that?"

Ms. Santelises added: "We need to stay focused on the teaching and learning, rather than on whether we have the right technology to give a test."

Take the test: Try your hand at interactive sample questions from the Smarter Balanced consortium.

Coverage of the implementation of the Common Core State Standards and the common assessments is supported in part by a grant from the GE Foundation, at www.ge.com/foundation. Education Week retains sole editorial control over the content of this coverage.
A version of this article appeared in the October 30, 2013 edition of Education Week as Transition to Online Tests Sparks Fears
