May.22.2023

Confusion on PIRLS reporting – some outlets make major mistakes

By Sally Larsen

The Progress in International Reading Literacy Study (PIRLS) results were released last Tuesday, generating the usual flurry of media reports. PIRLS selects a random sample of schools and students around Australia, and assesses the reading comprehension of Year 4 students. The sampling strategy ensures that the results are as representative of the Australian population of Year 4 students as possible.

These latest results come from the round of testing that took place in 2021, amid the considerable disruptions to schooling that came with the COVID-19 pandemic. Indeed, the official report released by the team at ACER acknowledged the impacts on schools, teachers and students, especially given that PIRLS was undertaken in the second half of the 2021 school year – a period that included the longest interruptions to face-to-face schooling in the two largest states (NSW and Victoria).

Notwithstanding these disruptions, the PIRLS results showed no decline in the average score for Australian students since the previous testing round in 2016, maintaining the improvement in the average recorded since the first PIRLS round Australia participated in (2011). The chart below shows the figure from the ACER report (Hillman et al., 2023, p.22).

The y-axis on this chart is centred at the historical mean (a score of 500) and spans one standard deviation above and below the mean on the PIRLS scale (1SD = 100). The dashed line between 2016 and 2021 is explained in the report: 

“Due to differences in the timing of the PIRLS 2021 assessment and the potential impact of COVID-19 and school closures on the results for PIRLS 2021, the lines between the 2016 and 2021 cycles are dashed.” (Hillman et al., 2023, p.22).

Despite these results, and the balanced reporting in ACER’s official report – reiterated in its media release and its piece in The Conversation – the major newspapers around Australia still found something negative to write about. Indeed, the initial reporting collectively pushed a common theme of large-scale educational decline.

The Sydney Morning Herald ran with the headline: ‘Falling through the cracks’: NSW boys fail to keep up with girls in reading. While it’s true that the average difference between girls and boys has increased since 2011 (from 14 scale-score points to 25 in 2021), boys in NSW are by no means the worst-performing group. Girls’ and boys’ average reading scores mirror the general trend in PIRLS: improvement after 2011, followed by largely consistent results thereafter (see Figure 2.11 from the PIRLS report below). Observed gender gaps in standardised tests are a persistent, and as yet unresolved, problem – one that researchers and teachers the world over have been considering for decades. The phrase ‘falling through the cracks’ implies that no one is looking out for boys’ reading achievement, an idea that couldn’t be further from the truth.
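As a rough back-of-the-envelope guide – and assuming the scale properties noted above (a historical mean of 500 and a standard deviation of 100) continue to hold – a scale-score difference can be expressed in standard deviation units simply by dividing by 100:

\[
\text{gap in SD units} \approx \frac{\text{scale-score difference}}{100} = \frac{25}{100} = 0.25
\]

In other words, the 2021 gender gap amounts to roughly a quarter of a standard deviation of student achievement.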

Similarly, under the dramatic headline Nine-year-olds’ literacy at standstill, The Australian Financial Review also ran with the gender-difference story, but at least indicated that there had been no marked change since 2016. The Age took a slightly different tack, proclaiming Victorian results slip as other states hold steady, notwithstanding that a) the Victorian average was the second highest nationally after the ACT, and b) Victorian students had by far the longest time in lockdown and remote learning during 2021.

Perhaps the most egregious reporting came from The Australian. The story claimed that the PIRLS results showed “twice as many children floundered at the lowest level of reading, compared with the previous test in 2016 … with 14 per cent ranked as ‘below low’ and 6 per cent as ‘low’”. These alarming figures were accompanied by a graph showing the ‘below low’ proportion in a dangerous red. The problem is that whoever created the graph got the numbers wrong: the article reversed the proportions of students in the two lowest categories, so the 14 per cent actually belongs to the ‘low’ benchmark, not ‘below low’.

A quick check of the official ACER report shows where they went wrong. The figure below shows the percentages of Australian students at each of the five benchmarks in the 2021 round of tests (top panel) and the 2016 round (bottom panel), taken directly from the respective years’ reports. The proportions in the bottom two categories – and indeed all the categories – have remained stable over the five-year span. This is pretty remarkable considering the disruption to face-to-face schooling that many Year 4 children would have experienced during 2021.

But, apart from the unforgivable lack of attention to detail, why is this poor reporting a problem? Surely everyone knows that news articles must have an angle, and that disaster stories sell? 

The key problem, I think, is the reach of these stories relative to that of the official reporting released by ACER, and by implication, the impact they have on public perceptions of schools and teachers. If politicians and policymakers are amongst the audiences of the media reports, but never access the full story presented in the ACER reports, what conclusions are they drawing about the efficacy of Australian schools and teachers? How does this information feed into the current round of reviews being undertaken by the federal government – including the Quality Initial Teacher Education Review and the Review to Inform a Better and Fairer Education System? If the information is blatantly incorrect, as in The Australian’s story, is this record ever corrected?

The thematic treatment of the PIRLS results in the media echoes Nicole Mockler’s work on media portrayals of teachers. Mockler found that portrayals of teachers in news media over the last 25 years were predominantly negative, continually calling into question the quality of the teaching profession as a whole. A similar theme is evident to even a casual observer of media reporting on standardised assessment results.

Another problem is the proliferation of poor causal inferences about standardised assessment results on social media platforms – often from people who should know better. Newspapers use words like ‘failed’, ‘floundered’ and ‘slipped’, and suddenly everyone wants to attribute causes to these phenomena, apparently without questioning the accuracy of the reporting in the first place. The causes of increases or declines in population average scores on standardised assessments are complex and multifaceted. It’s unlikely that one specific intervention or alteration (even if it’s your favourite one) will cause substantial change at a population level, and gathering evidence to show that any educational intervention works is enormously difficult.

Notwithstanding the many good stories – the successes and the improvements that are evident in the data – my prediction is that next time there’s a standardised assessment to report on, news media will find the negative angle and run with it. Stay tuned for NAPLAN results 2023.

Sally Larsen is a Lecturer in Learning, Teaching and Inclusive Education at the University of New England. Her research is in the area of reading and maths development across the primary and early secondary school years in Australia, including investigating patterns of growth in NAPLAN assessment data. She is interested in educational measurement and quantitative methods in social and educational research. You can find her on Twitter @SallyLars_27

Republish this article for free, online or in print, under Creative Commons licence.
