Details can derail tests

Noa Rosinplotz, 12, an eighth-grader in Washington, D.C., said questions on last spring’s standardized tests were confusing.

Not all testing mistakes stem from high-level work such as writing questions or calculating scores.

Everything from shipping to printing to proofing to distributing and collecting test booklets can derail a test.

Testing executives say their master project schedules contain thousands of tasks.

Test papers that come in from humid places can stick together – distorting results – if not dried out before being scanned. One mislaid character in lengthy computer coding can lead to monumental errors.

And no matter how organized contractors are, they still interact with hundreds, if not thousands, of schools that must handle and return every child’s test paper in exactly the same way.

There are just so many ways that things can go wrong. And they have.

One test contractor misspelled the word “assessment” on reports it planned to send parents — two years in a row.

And a few years ago, Robert Lee, chief analyst of Massachusetts’ testing program, noticed something odd while perusing the data on students’ responses for one of the state’s tests. He saw a surprising number of times when students had apparently bubbled in two answers.

He asked the contractor why. The company found the scanners hadn’t been cleaned or had their light bulbs changed regularly enough, leading to incorrect reads of students’ answer sheets.

“That’s stuff that we caught,” Lee said. “That’s when the process is working.”

Some testing industry experts say online tests will help reduce errors. But others warn new technology introduces a host of new variables into an already complicated process — and some are likely to misbehave.

The first year Indiana began using letter grades to rate schools, it suffered major interruptions in online testing that affected thousands of students, who were kicked off their computers in the middle of the exams.

The problems unnerved educators.

The state “is taking schools out behind the barn and shooting them when they find performance to be unacceptable,” Dan Rice, a school technology director in Elkhart, wrote in a heated 2011 e-mail to contractor CTB/McGraw-Hill, “and the tool they’re using to determine that performance level is itself unacceptable.”

“Our kids deserve better, CTB,” he wrote.

Another technology coordinator in Winamac sent the state an e-mail relaying fears by teachers whose evaluations would soon include test scores. The problems had “turned some teachers totally against” the online version of the test, she said.

“This is a test upon which you would base a portion of teacher evaluations?” she wrote. “A high stakes test should not have all these problems.”

This spring, disruptions in online testing happened again in Indiana and other states.

“It’s like having a recall on a car,” said Matthew Johnson, a testing expert and professor at Teachers College, Columbia University, in New York City, “and having the exact thing happen a year later.”

Ellen Haley, president of CTB/McGraw-Hill, said Indiana’s 2011 glitch was different from the one that occurred this year. “It shouldn’t have happened, I’m very upset that it did happen,” she said. “The good news is we were able to find it right away, correct it right away.”

While the testing industry has worked to improve core practices, including scoring, more attention is needed to mundane, operational details, said Chris Domaleski, a senior associate at the National Center for the Improvement of Educational Assessment.

Testing specialists are not always “going to catch a printing error in a test book or a glitch in the computer delivery system that renders an item incorrectly,” he said. Yet such errors can undermine public confidence just as much as a scoring problem.

“To the public the nature or cause of an error is, understandably, a matter of indifference,” Domaleski wrote in an e-mail. “Any error in any part of the system simply signals the test is ‘broken.’”