Editor’s note: This is one of the original stories the AJC wrote on suspected cheating at Georgia schools, including 12 in Atlanta. It was published in October 2009:

Statistically unlikely state test scores are showing up in more classrooms, suggesting the cheating investigation that has engulfed four schools might be about to widen.

An Atlanta Journal-Constitution investigation found 19 public elementary schools statewide with extraordinary gains or drops in scores between spring last year and this year. A dozen were in Atlanta.

In West Manor and Peyton Forest elementary schools, for instance, students went from among the bottom performers statewide to among the best over the course of one year. The odds of making such a leap were less than 1 in a billion.

This summer, state officials found strong evidence of cheating at four schools statewide in an investigation that followed a December AJC story about improbable gains on state tests.

In the most recent analysis, the AJC again used statistics to look for schools with test score changes far outside the normal range. The newspaper compared students’ scores in one grade versus their scores in the next. Some improved astronomically, but others deteriorated sharply.

“Changes of that magnitude are just extremely suspicious,” said Walt Haney, a testing expert and professor at Boston College.

Atlanta officials said they do not believe cheating occurred. Yet questionable scores appeared last school year in more than one in five of the district’s 57 elementary schools, the AJC analysis showed. At some, multiple tests stood out, with scores moving up or down erratically. Experts say children’s scores are normally fairly stable between grades.

“There had to be something considerable that happened that you would swing that much in a single year,” said Kathleen Mathers, executive director of the Governor’s Office of Student Achievement.

Her office is scrutinizing state test scores for the sort of anomalies that could signal test-tampering. It will use its findings — due this month — to decide where to audit, she said.

This is the first time state officials have undertaken such a broad search for test cheats.

Several Atlanta principals attributed unexpected score changes to factors such as good instructional programs, talented or struggling teachers, or changes in the student population.

Atlanta Deputy Superintendent Kathy Augustine said the district has no plans to check the validity of the scores highlighted by the AJC.

“I don’t have any reason to look at that,” she said. “We expect outliers every year.”

She said the district’s use of testing data to guide instruction and good teacher training are among the strategies that have helped schools make steady progress. Also, high rates of student turnover at some schools in question could create surprising score jumps, she said.

A check of several schools outside Atlanta with similarly high turnover, however, found none with such unusual test results.

Last summer, critics chastised Atlanta for its handling of cheating allegations at Deerwood Academy, one of the schools where state officials said they had uncovered evidence of likely test-tampering. Superintendent Beverly Hall said the district found no proof — a stance that drew a rebuke from Gov. Sonny Perdue.

Augustine said the district would investigate if it had evidence of cheating.

Besides the Atlanta schools, Heards Ferry in Fulton County was the only other school in the metro area with such unexpected scores, the AJC found.

A meteoric jump

The AJC examined scores on state reading, math and language arts tests for students in grades 3 through 5. The newspaper compared students’ scores from 2008 with how they did in spring 2009.

The state Criterion-Referenced Competency Tests are Georgia’s main measure of academic ability through eighth grade. The Atlanta elementary schools in question include one that state Superintendent Kathy Cox praised effusively in May as a hardworking school with an “absolutely no-excuses attitude.”

“By the way, they’re knocking the socks off of the test scores,” Cox said of Peyton Forest Elementary at a state Board of Education meeting. “They’re just a shining star.”

Indeed, when state test results arrived a few weeks later, some scores’ rise was meteoric.

Peyton third-graders’ math results last year were among the lowest in the state. But as fourth-graders this spring, they placed fourth in math out of nearly 1,200 schools statewide, outpacing dozens of affluent suburban classrooms.

The feat was even more surprising given that two months before the state test, 94 percent of Peyton fourth-graders scored at the lowest of four levels on the district’s own practice math tests.

Peyton Principal Karen Barlow-Brown said the increases were partly due to a former third-grade math teacher who was ineffective last year and a talented fourth-grade teacher this year. She also said the school doesn’t use the practice tests as a predictor of state test results.

“That is really an insult,” she said when asked whether tests might have been altered.

Such dramatic gains in such a short amount of time, however, are abnormal at best, experts said.

“It’s very hard to explain these huge gains,” said Tom Haladyna, a professor emeritus at Arizona State University and testing expert who reviewed the AJC’s findings. “You have to wonder: Is this the greatest school in the world?”

Schools that attribute such rare gains to a successful program have a responsibility to show others what they did, he said. “The whole world wants to know this,” he said. “If we could get this out of every class in your state and every other state, wouldn’t that be fantastic?”

An Atlanta district spokesman asked a reporter to call Michael Casserly, executive director of the Council of the Great City Schools. The group, of which Atlanta is a member, supports urban systems.

Casserly had not seen the AJC’s analysis but said he disagreed with experts who said the scores were questionable.

Casserly said some schools might teach their curricula differently from others in the state, or the changes might be random, or another factor such as teacher turnover could differ in Atlanta.

“If you’re after one single explanation, you are on the verge of badly misleading the public on the basis of a very bogus analysis,” he said.

Leapfrogging peers

If falsified, scores can disguise serious academic problems, said Eric Cochling, vice president of public policy at the nonprofit Georgia Family Council.

Parents need valid test scores to make key decisions for their children, such as whether to change schools or teachers, or get remedial help, Cochling said.

“How do they know what their child needs, ultimately, if they can’t rely on the test results?” he asked. “It seems it sets these kids up for failure.”

Atlanta’s West Manor Elementary made some of the most astonishing gains this year.

In fourth grade last year, students’ poor scores ranked 830th statewide on the math test. This year, fifth-graders not only caught up to their peers but sped past them; they scored the highest statewide.

Their average score grew by nearly 90 points year to year, data show. Statewide, the average rise was about 15 points.

Practice tests again suggest a disconnect with the CRCT. Sixty percent of West Manor fifth-graders were still scoring at the lowest level in February practice tests. Not only did every student pass the CRCT in April, but 89 percent scored at the top “exceeds” level.

Principal Cheryl Twyman attributed gains to “the hard work of the teachers and students — that’s a given.” She declined to discuss the results further.

Parent Sharon Shannon Bussie said she has seen West Manor teachers push students to achieve and doesn’t believe they would cheat.

“This school is quite different,” she said. “If you’re an underachiever, you might as well not go here.”

Another puzzling result came from Atlanta’s Toomer Elementary.

Last year, Toomer’s fourth-graders scored best in the state on the English/language arts CRCT, which focuses on writing. Toomer’s average score was so high that no other school came within 14 points.

But this year, Toomer fifth-graders struggled with the test of concepts such as grammar and sentence structure. Their average score plummeted 58 points.

Haladyna said researchers rarely see such a steep drop. “Kids don’t go backward in their learning,” he said.

Interim Principal Hezekiah Wardlow said school staff realized some of its scores had dipped, but not to the extent made clear in the AJC analysis. He said the school has small grades, and three or four children leaving can have a big impact on scores.

To be sure, test scores can be affected by shifts in a school district’s boundaries or other events that change the makeup of the student population. A Fulton district spokeswoman noted that Heards Ferry’s attendance boundaries changed last year.

Some Atlanta schools have seen deep declines in student enrollment after housing projects closed in recent years. Blalock Elementary, which served children living in the Bankhead Courts housing project, had four subject tests with astronomical gains this year, the AJC’s analysis found.

On one, more than 96 percent of fifth-graders scored at the “exceeds” level in math, compared with 36 percent statewide. Former Principal Frances Thompson said last year was an unusual one for her school. Steep drops in enrollment as Bankhead Courts emptied meant more attention for students who stayed, she said, adding she has no concern that cheating might have occurred.

“Our class sizes were so much smaller, and we did use that to our advantage,” she said. “We were able to address the needs of the students very, very closely.”

Blalock closed at the end of the school year.

Peyton Forest is not the only school in question that has won awards, money or visits from dignitaries because of test scores.

Top federal education officials have visited Atlanta’s Capitol View and F.L. Stanton — which both had tests that were extreme outliers in the AJC analysis.

Capitol View Principal Arlene Snowden said she did not believe the gains the AJC cited were unusual. She said factors such as strong teaching programs and stellar staff made the difference.

“We accept no excuses from our children. I have a very highly competent staff. ... We look for teachers who know how to teach the Capitol View way,” she said. “We want everyone to be successful.”

Georgia School Superintendent Cox would not comment on the questionable scores because of the state’s investigation, a spokesman said, adding the state would act if cheating were found.

The student achievement office, which is independent of Cox’s agency, is scrutinizing both spring CRCT scores and the results from summer retests taken by students who failed on their first try, Mathers said.

She said the state may use two approaches in addition to statistical analyses. One examines erasure marks for unusual numbers of answers changed to correct. The other looks for unexpected patterns of responses, such as a class where students get all the hard questions right but the easy ones wrong.

This summer, state investigators said an erasure analysis revealed strong evidence that adults at four schools had cheated on CRCT retests.

DeKalb County police charged two school administrators with falsifying state documents. The state board of education revoked the four schools’ status as having met federal standards, or made “adequate yearly progress.”

Rumors persist

Unlike most districts, Atlanta hands out bonuses of up to $2,000 per educator to schools that meet targets for improving test scores. Last week, the district announced more than two dozen schools earned bonuses this year, including eight that the AJC found had highly unexpected score changes.

Rumors of cheating have swirled for years in the district. Some teachers have said they are afraid to report problems because they fear retaliation.

This summer, Superintendent Hall said she did not believe cheating was “pervasive” in the district and attributed anonymous complaints about it to disgruntled employees who resented being held accountable.

Former Atlanta teacher Joan Shensky said she reported finding a student with an illicit answer key that a teacher had distributed to other fifth-grade teachers at Collier-Usher Elementary in 2005.

“I was horrified, horrified,” she said in an interview.

District records show a teacher was sanctioned. Shensky said she wasn’t punished for speaking up but felt like an outcast afterward. She left for a teaching job in another system in 2007.

“I felt ostracized after that,” she said. “I was not comfortable.”

Steep gains

These charts show the change in two Atlanta schools’ average CRCT scores and the average change for all schools statewide. Compare schools’ soaring CRCT scores with the results of the district’s practice tests, taken about two months earlier, on which students did much worse. On practice tests, “unsatisfactory” is the lowest of four levels and means less than 55 percent of answers were correct.

Standard deviation shows how unexpected a score change is. The odds of a four standard deviation change are worse than 1 in 31,000. The odds of a five standard deviation change are worse than 1 in 3 million. The odds of a six standard deviation change are worse than 1 in 1 billion.

West Manor Elementary School fifth-grade math

CRCT gain: 6.2 standard deviations

Odds: Less than 1 in a billion

Practice test results

January 2008 fourth-grade math: 57 percent unsatisfactory

February 2009 fifth-grade math: 60 percent unsatisfactory

Peyton Forest Elementary School fourth-grade math

CRCT gain: 6.1 standard deviations

Odds: Less than 1 in a billion

Practice test results

January 2008 third-grade math: 68 percent unsatisfactory

February 2009 fourth-grade math: 94 percent unsatisfactory

Source: AJC analysis of Georgia Department of Education and Atlanta Public Schools data

How we got the story

To detect unusual CRCT score changes, the AJC used a statistical technique called linear regression to compare average 2009 scores at each elementary school with comparable scores from the previous grade the year before.

This analysis found that the 2008 scores consistently explained about 80 percent of the variation in 2009 scores. The analysis also produced a mathematical formula that describes the general relationship between 2008 and 2009 scores. For example, an average 2008 fifth-grade reading score of 800 would predict a 2009 score of 802 points. A 2008 score of 850 would predict a 2009 score of 840.
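The two example predictions above pin down a fitted line, which can be reproduced directly. The slope and intercept below are back-derived from the article's two examples for illustration; the AJC's actual fitted coefficients for each grade and subject were not published.

```python
# Line implied by the article's examples (800 -> 802, 850 -> 840):
# slope = (840 - 802) / (850 - 800) = 0.76
# intercept = 802 - 0.76 * 800 = 194
# Back-derived for illustration only, not the AJC's published coefficients.
def predicted_2009(avg_2008, slope=0.76, intercept=194.0):
    return slope * avg_2008 + intercept

print(predicted_2009(800))  # 802.0
print(predicted_2009(850))  # 840.0
```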

The differences between actual and predicted scores were converted into a measure that can be plotted on a normal probability curve, or “bell curve,” to find the probability of that difference occurring by chance. A score greater than four “standard deviations,” for example, has an approximate probability of 0.0032 percent, or odds of less than 1 in 31,000. A score greater than six standard deviations has a probability of 0.000000099 percent, or odds of less than 1 in 1 billion.
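Those probabilities correspond to the one-sided upper tail of the standard normal curve, and the figures quoted above can be checked with a few lines of standard-library Python:

```python
import math

def tail_probability(z):
    """One-sided upper-tail probability of a standard normal
    distribution at z standard deviations -- the chance of a
    score difference at least that extreme arising by chance."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (4, 5, 6):
    p = tail_probability(z)
    print(f"{z} SD: probability {p:.2e}, odds worse than 1 in {1 / p:,.0f}")
```

Running this reproduces the article's numbers: about 1 in 31,000 at four standard deviations and about 1 in 1 billion at six.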

There are limits to this analysis. Data publicly available from the state do not permit tracking students’ individual scores from year to year. And because we were able to look at average scores only, student mobility could create score variations not accounted for by the formula derived from the regression. This is especially true for schools and grades with smaller enrollments.

To counter these limits, we excluded cases with fewer than 20 students. We also singled out only those schools with a difference of four standard deviations or more between predicted and actual scores.
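Taken together, the screening steps described in this section — the regression fit, standardized residuals, the 20-student minimum and the four-standard-deviation cutoff — can be sketched as follows. The data layout and function name are hypothetical, not the AJC's actual code:

```python
import math

def flag_outlier_schools(schools, min_students=20, threshold=4.0):
    """Flag schools whose actual 2009 average sits at least `threshold`
    standard deviations from the regression's prediction, skipping
    groups smaller than `min_students`. `schools` holds hypothetical
    (name, n_students, avg_2008, avg_2009) tuples."""
    usable = [s for s in schools if s[1] >= min_students]
    xs = [s[2] for s in usable]
    ys = [s[3] for s in usable]
    n = len(usable)
    # Ordinary least-squares fit of 2009 averages on 2008 averages.
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Residual standard error, with n - 2 degrees of freedom for the fit.
    sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return [s[0] for s, r in zip(usable, residuals) if abs(r) / sd >= threshold]
```

A school lying roughly on the line shared by its peers produces a small standardized residual; one far off it, like the six-standard-deviation cases described above, is flagged.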

Typically, cases greater than two standard deviations from the average are considered outliers.
