Dealing with the disease: The urgent need for exam reform

8/18/2020

Every so often a crisis appears in education which causes us to stop and think. The A-Level crisis of August 2020 needs to be one of those moments. Although it has been portrayed as the catastrophic result of changes brought in haste due to Covid-19, the systems which have underpinned the current crisis have been in place for decades. The examinations system is the sick man of education. What we have been witnessing over the last week is the tragic outcome of a diseased system, the underlying issues of which have festered away unchecked and untreated for far too long. It’s time to look for a cure. Let me explain...

The crisis
First, a very brief overview of the specific crisis this summer. During the coronavirus lockdown, formal examinations of pupils were cancelled by the DfE. A decision was taken to ensure students were still graded despite not sitting exams (we could discuss the problems in this too, but there is no space here). The statement from Gavin Williamson (below) really should have raised more questions and scrutiny at the time. The notion that grades for 2020 would be indistinguishable from other years despite students not sitting exams, and the notion that “grading” in the usual way was the best outcome for students, were assumptions which should have been more robustly challenged. However, too many were unwilling to think through the potential consequences or were blinded by their faith in what they believed to be a robust and functioning examinations system which achieved fairness in normal years.

Despite the obvious issues, processes were put in place to provide students with grades. Centres were asked to undertake a process of providing a likely grade for each of their students in each subject had they sat the exams (a centre assessed grade or CAG). They were also asked to rank their students against each other in each subject area.

Once grades were submitted, Ofqual, in line with the DfE’s request, set about standardising the imagined examinations. A controversial algorithm was developed which redistributed the CAGs received by students according to the historic performance profile of the school. You can read much more on this process in the blog by FFT Education DataLab.
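To make the mechanism concrete, here is a deliberately simplified sketch of rank-based allocation. It is not Ofqual's actual model, which was considerably more involved; the function name, the student labels and the historic percentages are all invented for illustration. It only shows the core idea described above: the school's past grade profile fixes how many of each grade are available, and the teacher ranking decides who gets them.

```python
def allocate_grades(ranked_students, historic_pct):
    """Hand out grades by rank so the cohort matches a historic profile.

    ranked_students: names ordered best-first (the teacher ranking).
    historic_pct: (grade, percentage) pairs, best grade first, summing to 100.
    """
    n = len(ranked_students)
    allocated = {}
    cumulative = 0
    start = 0
    for grade, pct in historic_pct:
        cumulative += pct
        # Number of students who should hold this grade or better,
        # rounded half up using integer arithmetic.
        end = (cumulative * n + 50) // 100
        for student in ranked_students[start:end]:
            allocated[student] = grade
        start = end
    return allocated


# Hypothetical centre: ten students, listed best-first, with their CAGs.
cags = {"s1": "A", "s2": "A", "s3": "A", "s4": "A", "s5": "B",
        "s6": "B", "s7": "B", "s8": "B", "s9": "C", "s10": "C"}
ranking = list(cags)  # dicts preserve insertion order
historic_profile = [("A", 20), ("B", 50), ("C", 30)]  # invented figures

final = allocate_grades(ranking, historic_profile)
changed = [s for s in ranking if final[s] != cags[s]]
print(final)
print("Grades changed:", changed)
```

In this made-up centre the teachers judged four students to be working at an A, but the historic profile only allows two, so s3 and s4 drop to a B and s8 drops to a C. Nothing about those three students' own work enters the calculation.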

The result of all this was that when students received their grades on 13th August, many were shocked by what they found. Despite the many reassurances of fairness from the DfE and Ofqual, only 58.7% of centre assessed grades were retained. A tiny handful of grades (2.2%) were upgraded by one, but over 35% of grades were reduced by one and a further 3.5% by two or more. Naturally, this caused outrage.

Given Ofqual’s remit to maintain grade distributions and prevent inflation, it was inevitable that fairness would have to go out of the window. When teachers provided centre assessed grades, they were asked to submit grades which “reflect a fair, reasonable and carefully considered judgement of the most likely grade a student would have achieved if they had sat their exams this summer and completed any non-exam assessment.” Given this remit, many teachers explored evidence and agonised over the most appropriate grades to provide. Of course, in a real exam situation, some students do not perform to their “ability” for a whole range of reasons. But no teacher in their right mind would have wanted to predict which of their B grade students might have choked in the exam, chosen the wrong question, not slept the night before, or run out of time. That would have been unfair and unprofessional (despite claims by Ofqual, the DfE and the Telegraph that teachers were to blame for the whole fiasco).

The result of teachers’ deliberations over grading was that Ofqual were provided with exactly what the regulator asked for: a fair assessment of the abilities of students based on teacher evidence and judgement. Of course, the grade distribution was much more positive than the distribution would have been had exams been sat. This is because during a normal exam season the grade boundaries are set to maintain distributions in line with previous years. In other words, an A Grade does not exist until all the exams have been marked. By inverting that process Ofqual created a huge issue. Given that their instruction from the DfE was to keep the grading as close as possible to previous years to prevent grade inflation, major changes to CAGs were required. Arguably it might have been more transparent if Ofqual had provided schools with their allocated grades beforehand, i.e. “based on historic data you can award 6 A grades; 26 B grades” etc.
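The point that a grade "does not exist" until marking is complete can be illustrated with a small sketch. This is an assumption-laden toy, not the awarding bodies' real procedure: the function name, the marks and the target percentages are invented, and tied marks are ignored. It simply places each boundary wherever the target share of candidates happens to fall.

```python
def set_boundaries(marks, target_pct):
    """Find the lowest mark that earns each grade under fixed grade shares.

    marks: raw marks for the whole cohort, in any order.
    target_pct: (grade, percentage) pairs, best grade first, summing to 100.
    """
    ordered = sorted(marks, reverse=True)
    n = len(ordered)
    boundaries = {}
    cumulative = 0
    for grade, pct in target_pct:
        cumulative += pct
        # Candidates who should receive this grade or better (rounded half up).
        count = (cumulative * n + 50) // 100
        if count == 0:
            continue  # cohort too small for anyone to receive this grade
        boundaries[grade] = ordered[count - 1]
    return boundaries


targets = [("A", 20), ("B", 50), ("C", 30)]  # invented target distribution
marks = [92, 88, 81, 79, 75, 70, 66, 61, 55, 48]
harder_paper = [m - 10 for m in marks]  # same cohort, tougher paper

print(set_boundaries(marks, targets))
print(set_boundaries(harder_paper, targets))
```

With the first set of marks the A boundary lands at 88; when every candidate scores ten marks fewer, it lands at 78 and the same two candidates still receive the A. The boundary follows the distribution rather than any fixed standard of performance, which is why starting from teacher-assigned grades and working backwards could not reproduce the usual profile.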

The issue of being fair to students and keeping grade inflation in line with historic figures was always going to cause problems. It is of course perfectly true that the same proportions of grades were given out as in previous years (indeed there was a small increase in awards at the top end). In a normal year however, the differentiation between students getting an A or a B grade in a subject would have been determined by the actual performance of that student in their exams, not by a system of statistical manipulation based on a school’s historic performance. As one commentator put it, students were judged by the ghosts of students past. The process in effect became a lottery based on historic data and teacher ranking (itself a problematic process, especially when teachers had ranked with little knowledge of how exactly this was to be used).

Recognising the disease
The outrage over results this year is perfectly understandable. However, many commentators are prone to assume that it is somehow unique. But the crisis did not appear from the ether in March 2020; rather, it was a symptomatic outpouring of a much more pernicious disease. The reality is that our exams system has been eating itself away with its own contradictions and injustices while Ofqual, the DfE, and many others carry on as if nothing is wrong. We cannot ignore it any longer. Unless we admit there is a problem there will be no cure.

This year’s results have grabbed the headlines because students were affected across the board and through no evident fault of their own. Yet equally outrageous miscarriages of fairness have been occurring in many subjects at GCSE and A-Level for years.

I could list many examples here of the ways in which specifications are poorly set and defined; how teaching time limits are poorly monitored at GCSE; how exams have driven content in some schools for far too long; how the norm referencing approach ignores what students can do in favour of how they rank against others in their national cohort; or how the exam support system has encouraged teachers to game results and push their pupils up in those rankings. In this blog, though, I want to focus on the sickness at the heart of the marking and grading of exams.

In 2018 Ofqual published research exploring the reliability of marking in reformed GCSE and A-Level examinations. They explored how far markers were in line with the “definitive marks” awarded by the principal examiner, as well as how far markers were in line with the “definitive grade” awarded by the principal. What they found in some subjects was just as shocking as the awarding crisis we have just experienced.

Whilst subjects like maths were found to have a 94% rate of agreement between markers and principals on the final grade, in many subjects the rate of agreement was below 70%. In History, the probability of a marker giving the same grade as the principal examiner across GCSE and A-Level exams was just 55%. Or in other words, there was a 45% chance that a student would receive an incorrect A-Level grade. This is actually worse than the 42% of pupils who received grades other than their CAGs during the recent summer series. The lottery we have seen in 2020 has been happening to students in History, English, Sociology, Geography, and RE (and to a lesser, though still worrying, extent in Psychology and Biology) every single year for many, many years. Whole generations of students have sat papers in History and English with only just over a 50% chance that their grades will reflect their actual performance (and this is before we consider the impact of norm referencing which can move grade boundaries to ridiculous levels). Yet at the same time the commentary on exams each year has obsessed over grade inflation: an issue which is of course very easy to control and manipulate in a norm referenced system (something I wrote about back in 2012). Meanwhile Ofsted and the DfE have repeatedly dismissed worries about grading of exams and accused schools of trying to game grades through speculative re-marking of scripts.

So, what does all this tell us? Well, it probably tells us that the line which gets spouted about the “inflated” grades this year damaging confidence only means anything if you also believe that historic exams are accurate reflections of students’ attainment. For around 45% of students in some subjects, we know that is unlikely to be the case. It also tells us that we need a meaningful review of our approach to examination in this country. As I have written in the past, we need to consider carefully what the purposes of examination are and consider the best systems to meet those purposes.

The crisis of 2020 may well be a blessing in disguise. We can no longer pretend that the exams system in the UK is fit and healthy. But we must ensure that treatment comes soon. We cannot continue to drag the sick man on still further. To do so would be to fail yet another generation of young people.

Follow-up reading

For more on the marking issues noted in this blog please read my previous blog, “The Gilded Age”.

For some thoughts on the purposes of examinations and the issues of norm referencing see: “Searching for Gold”.

And for some musings on where we might go next see: “After the Gold Rush”.

1 Comment

Julie

8/26/2020 11:34:46 am

Thank you for writing this. You are so right! There is so much to be angry about on behalf of our students. And yet no one seems to listen! I’ve not even had a reply from my MP from when I wrote to her last week about the problematic year to come. It’s so exasperating!
