Figures Lie

The New York Times never fails to explore the nuances of the economy and politics, but it is surprisingly naive about testing. It rarely analyzes the results of achievement testing in the schools the way it scrutinizes the economic data that routinely lead the front page.

One of today’s editorials, entitled “Honest Testing,” begins, “Congress did the right thing with the No Child Left Behind Act of 2002, when it required states to document student performance in yearly tests in exchange for federal aid. Parents and policy makers need to know how well their schools are doing.” (July 24, 2010). The editorial deplores the continuing decline in the rigor of the New York state math and English tests compared with the NAEP (National Assessment of Educational Progress) tests over the same period. It concludes that the only reason for eighth grade math scores to rise 20 percentage points in one year is that the test has become more predictable and easier to prepare for. The Times may be right in this case, but there are dozens of reasons for test scores to rise and fall, and the media seldom examines any but the most obvious ones.

One reason everyone respects the NAEP is its very cautious approach to reporting test scores, especially scores that compare states and cities. While local test scores may be greeted only with jubilation or dismay at the annual district trends in reading and math, the NAEP always qualifies its results by reporting the demographic and cognitive characteristics of the test takers. For example, the 2007 Nation’s Report Card on Writing reported that the writing scores of eighth graders differed by 29 points depending on the schooling level of their parents (from “not finishing high school” to “graduated from college”). This tells us more about the scores than how one state fared against another. The gap between these two groups is 30 percent higher than any other achievement gap studied, whether by gender or race.

On the other hand, state scores take on greater significance when qualified by the demography of the state. One factor seldom examined in the media is “exclusion rates,” by which states exempt from testing certain students whose disabilities or second-language challenges might invalidate their test performance. Both No Child Left Behind and the NAEP have policies about excluding students from tests, but the states hold the ultimate power to decide which students are excluded. Obviously, the exclusion of large numbers of students with language deficiencies will have a positive effect on a state’s average writing score.

With great delicacy, the NAEP cautions:

While the effect of exclusion is not precisely known, the validity of comparisons of performance results could be affected if exclusion rates are comparatively high or vary widely over time. In the 2007 writing assessment, overall exclusion rates (for both students with disabilities and English language learners) were 3 percent at both grades 8 and 12, state exclusion rates at grade 8 varied from 1 to 7 percent, and the 10 urban school districts excluded from 2 to 11 percent. (The Nation’s Report Card: Writing 2007, p. 7)

Suppose we compared the writing performance of eighth graders on the NAEP in 2002 and 2007. One state, let’s call it “Massachusetts,” showed an increase in scores from 163 to 167, while another state we’ll call “New York” showed an increase from 151 to 154. Both states exceeded the national average improvement, which was two points. In the media, New York would be considered an also-ran, while Massachusetts would be honored for improving on writing scores already among the nation’s best.

But what if it were known that Massachusetts increased its percentage of students excluded for learning disabilities from 4 percent in 2002 to 6 percent in 2007, while New York decreased its percentage of excluded LD students from 4 to 2 percent? What if that shift gave Massachusetts an impressive average score of 139 among its tested LD students, while New York, with its 120 average LD score, was closer to the national LD average of 118? Wouldn’t a reasonable inference be that the increase in the overall writing score of Massachusetts students might be largely the result of excluding a larger percentage of students from taking the test in 2007?
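For readers who want to see the arithmetic, here is a rough sketch of the adjustment, using the hypothetical Massachusetts figures above. The one ingredient I have to supply is the score the excluded students would have earned had they been tested; I borrow the hypothetical national LD average of 118. None of this is actual NAEP data.

    # A back-of-the-envelope check on the hypothetical Massachusetts gain.
    # Reported averages and exclusion rates come from the scenario above;
    # the score assumed for excluded students (118) is an added assumption,
    # not an actual NAEP figure.

    def full_population_mean(reported_mean, excluded_share, assumed_excluded_mean):
        """Estimate the average for all students, tested and excluded alike."""
        tested_share = 1.0 - excluded_share
        return tested_share * reported_mean + excluded_share * assumed_excluded_mean

    ASSUMED_EXCLUDED_SCORE = 118  # hypothetical national LD average

    ma_2002 = full_population_mean(163, 0.04, ASSUMED_EXCLUDED_SCORE)  # about 161.2
    ma_2007 = full_population_mean(167, 0.06, ASSUMED_EXCLUDED_SCORE)  # about 164.1

    print(f"Reported gain: {167 - 163} points")
    print(f"Estimated gain for all students: {ma_2007 - ma_2002:.1f} points")
    # Under this assumption, roughly a quarter of the reported four-point gain
    # reflects who took the test rather than how well students wrote.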

It would be cynical to say that the 6 percent of students with disabilities in Massachusetts were unfairly excluded from taking the writing test. Very likely they were legitimately excluded. But it would be just as unfair to claim that the rise in the state’s writing score was attributable to improvements in the teaching and learning of writing in its middle schools.

In NAEP testing, Massachusetts was among the three states with the highest exclusion rates for learning disabilities in middle school writing. Massachusetts, Kentucky, and Texas all excluded 6 percent. The national average for excluding students with learning disabilities on this writing test was 3 percent. If we were examining Kentucky’s eighth grade writing scores over the nine-year period (1998–2007), we might want to consider that the state increased its exclusion rate from 2 to 4 to 6 percent over those years. While the exclusion rate was relatively stable in Texas over the same period, its writing scores declined with each administration, from six points higher than the national average to three points below it.
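The same arithmetic can be run in the other direction. Suppose a hypothetical state’s eighth graders wrote exactly as well at each of three administrations, but its exclusion rate climbed from 2 to 4 to 6 percent, as Kentucky’s did. The sketch below uses invented scores (a fixed population average of 150 and an assumed average of 115 for the excluded students, neither of them Kentucky’s actual numbers) to show how much of a reported “gain” exclusion alone could produce.

    # How much could rising exclusion alone move a reported average?
    # Hypothetical numbers: the full population's true average is fixed at 150
    # across three administrations, the excluded students would have averaged
    # 115, and only the exclusion rate changes (2, 4, then 6 percent). These
    # are not Kentucky's actual scores.

    TRUE_POPULATION_MEAN = 150
    ASSUMED_EXCLUDED_MEAN = 115

    for year, excluded_share in [(1998, 0.02), (2002, 0.04), (2007, 0.06)]:
        tested_share = 1.0 - excluded_share
        # The reported average covers only the tested students.
        reported = (TRUE_POPULATION_MEAN
                    - excluded_share * ASSUMED_EXCLUDED_MEAN) / tested_share
        print(f"{year}: exclusion {excluded_share:.0%}, reported average {reported:.1f}")

    # The reported average climbs from about 150.7 to 152.2, a "gain" of a
    # point and a half with no change at all in how well the students write.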

This is not the kind of test score analysis that fascinates the readers of the daily newspaper, but it is no more complicated than explaining what a changing unemployment rate might mean for real employment trends during a recession, and it is no less important for understanding the raw data. It is really a question of accurate reporting, not just of writing for an audience of non-professionals.

It would be gratifying to read about educational test score trends in the national and local media accompanied by an intelligent analysis of the reasons behind them, rather than seeing them reported at face value, like the pulse and blood pressure of a sick patient.