
How comparative judgement changed my marking and my teaching
  • Exams and Assessments

Written by Tom Spreyer, COBIS Peer Accreditor

Comparative judgement (CJ) is an approach to assessment in which teachers compare pairs of student work and decide which is better overall, rather than awarding marks against detailed criteria. When many such comparisons are combined, statistical modelling produces a reliable rank order of work across a cohort. CJ is increasingly used to assess complex outcomes, such as extended writing, where quality is difficult to define precisely and traditional marking can be inconsistent.
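The statistical step described above is often done with a Bradley–Terry-style model (Pollitt's adaptive CJ uses a closely related Rasch formulation). As a rough illustration of how many pairwise "which is better?" decisions become a single rank order, here is a minimal sketch assuming a plain Bradley–Terry fit; all names and data are illustrative, and real CJ platforms such as No More Marking handle judge allocation, adaptivity and reliability statistics on top of this.

```python
from collections import defaultdict

def bradley_terry(comparisons, n_iter=100):
    """Estimate a quality score per script from (winner, loser) pairs
    using the classic minorisation-maximisation update."""
    wins = defaultdict(int)          # times each script was judged better
    pair_counts = defaultdict(int)   # times each unordered pair was compared
    scripts = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        scripts.update((winner, loser))

    scores = {s: 1.0 for s in scripts}
    for _ in range(n_iter):
        new = {}
        for s in scripts:
            denom = 0.0
            for t in scripts:
                if t == s:
                    continue
                n = pair_counts[frozenset((s, t))]
                if n:
                    denom += n / (scores[s] + scores[t])
            new[s] = wins[s] / denom if denom else scores[s]
        total = sum(new.values())
        # Rescale so scores stay comparable across iterations.
        scores = {s: v * len(scripts) / total for s, v in new.items()}
    return scores

# Ten illustrative judgements over four pieces of work.
judgements = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
              ("B", "D"), ("C", "D"), ("A", "B"), ("B", "D"),
              ("A", "C"), ("C", "D")]
scores = bradley_terry(judgements)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # → ['A', 'B', 'C', 'D']
```

The point of the sketch is simply that no judge ever assigns a mark: each decision is relative, and the model aggregates many such decisions into a cohort-wide order of quality.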

I first encountered comparative judgement almost by accident. At the time, I was a Deputy Head at Rugby School Thailand, leading professional development and trying to address a familiar challenge in international schools: how to assess students’ understanding fairly in a context where many are working with English as an Additional Language (EAL).

Like many teachers, I was increasingly uneasy about the limits of traditional marking. Extended written work, particularly in subjects such as history, often seemed to reward fluency of expression as much as, or more than, depth of understanding. For EAL students, this could be particularly distorting. I wanted an assessment approach that would help us see what students actually understood, rather than what their English allowed them to demonstrate on the page.

Comparative judgement offered a way in. Rather than relying on detailed criteria, it allowed teachers to make holistic judgements about quality and to see patterns of understanding across a cohort with far greater clarity (Pollitt, 2012). It is difficult to gain even a passing understanding of CJ without developing an admiration for the work of one of its most significant proponents, Daisy Christodoulou. Christodoulou argues that this form of relative judgement is not a shortcut, but a more cognitively natural and reliable way for teachers to evaluate complex work, particularly where quality is difficult to define precisely (Christodoulou, 2025). I began experimenting with CJ in Year 9 history, initially as a way of improving reliability and reducing marking time. What I did not anticipate was how profoundly it would shape my teaching.

One of the most influential texts in my professional development has been Black and Wiliam’s Inside the Black Box (1998). That paper reframed assessment as a tool for improving learning rather than simply measuring it, and it strongly influenced my understanding of responsive teaching: teaching that adapts to evidence of what students know and can do. CJ aligned powerfully with this idea. Because it produces a fine-grained picture of relative quality across a cohort, patterns in understanding became visible far more quickly than through conventional marking. When I reviewed the rank order, I could see clusters of students who were securely grasping causation and explanation, others who were partially secure but inconsistent, and some who were relying on narrative rather than analysis. This information was available almost immediately, without weeks of marking and moderation.

Christodoulou and Breen (2023) note that one of the strengths of CJ is its capacity to surface differences in quality without forcing teachers to deconstruct work into atomised criteria. That insight resonated strongly with my experience. The impact on my planning was immediate. Lessons became more responsive, reteaching more targeted, and feedback conversations more precise. Rather than guessing where misunderstandings lay, I could see them clearly and act on them.

As I continued to develop CJ across the 11–18 age range, both in international settings and later in UK schools, I found its diagnostic power especially strong for EAL learners. CJ does not remove language from the assessment, nor should it, but it reduces the extent to which language proficiency dominates judgements of quality. This aligns with Christodoulou’s more recent writing, which emphasises that CJ allows teachers to attend to what matters most in a given domain, rather than being overly constrained by proxies for quality such as surface accuracy or formulaic structure (Christodoulou, 2025). In practice, this meant I could identify students whose conceptual understanding was strong even when their written expression was still developing. For those pupils, the appropriate response was not to reteach content, but to focus on disciplinary vocabulary, sentence structures, and opportunities for oral rehearsal. Conversely, where CJ revealed weaker understanding masked by fluent writing, teaching could be redirected accordingly. In both cases, assessment information led directly to better instructional decisions.

This experience mirrors wider research suggesting that comparative judgement can offer greater reliability and validity than traditional marking for complex outcomes such as writing (Bramley, 2015; Christodoulou, 2016). Work by the Bell Foundation has further shown how CJ can support more accurate scaling of language proficiency for EAL learners, precisely because it captures developmental progression more reliably than atomised criteria (Bell Foundation, 2017).

Over time, the cumulative effect of this more responsive approach became evident. Internally, we saw clearer progression in students’ work and a narrowing of gaps between EAL and non-EAL learners. More importantly, students’ writing improved in the ways that mattered most: clarity of argument, relevance of evidence, and coherence of explanation. At GCSE level, while CJ itself is not used in public examinations, the improved alignment between teaching, assessment and understanding translated into stronger and more consistent outcomes.

Comparative judgement is not a silver bullet. It does not replace strong subject knowledge, a coherent curriculum, or explicit teaching of academic language. Nor does it remove the need for professional dialogue about standards. What it does offer is a way of seeing student work more clearly and acting on that insight more quickly.

For me, CJ has been less about revolutionising marking and more about strengthening teaching. By freeing assessment from the false precision of marks and levels, it has allowed professional judgement, responsiveness and equity to come back to the fore. In linguistically diverse classrooms, that clarity matters, not just for fairness, but for learning itself.


References

The Bell Foundation (2017) The EAL assessment framework: Research and development. London: The Bell Foundation.
Black, P. and Wiliam, D. (1998) Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), pp. 139–148.
Bramley, T. (2015) Investigating the reliability of marking of GCSE English writing using comparative judgement. Cambridge: Cambridge Assessment.
Christodoulou, D. (2015) Comparative judgement: 21st century assessment. Daisy Christodoulou blog.
Christodoulou, D. (2016) Comparative judgement: practical tips for in-school use. Daisy Christodoulou blog.
Christodoulou, D. (2016) Making good progress? The future of assessment for learning. Oxford: Oxford University Press.
Christodoulou, D. (2025) What is comparative judgement and why does it work? No More Marking Substack.
Christodoulou, D. and Breen, J. (2023) Comparative judgement compared with traditional writing assessments. No More Marking Substack.
Pollitt, A. (2012) The method of adaptive comparative judgement. Assessment in Education: Principles, Policy & Practice, 19(3), pp. 281–300.