I first wrote this post back in September 2015 and updated it in June 2016.
Over the last three years, I have written a number of posts about assessing without levels. Here’s a guide to them.
First of all, what were the problems with national curriculum levels that led to them being abolished? And were there any good things about them that we need to retain? It’s often claimed that one of the good things about NC levels was that they gave us a shared language, but I argue in Do national curriculum levels give us a shared language? that they don’t. Instead, they provide us with the illusion of a common language, which is actually very misleading.
In Why national curriculum levels need replacing, I extended this argument, looking at research from the early 90s which showed that even then, national curriculum levels obscured the reality of pupils’ achievement.
One of the main reasons that NC levels were so confusing is that they were based on prose performance descriptors. Any replacement for NC levels which is based on similar descriptors is likely to run into similar problems. This post, Problems with performance descriptors, extends the argument against NC levels to all kinds of performance descriptors. And in The adverb problem, I look at how hard it is to define excellence in terms of adverbs or adjectives – what does it mean to say that a pupil writes effectively or brilliantly or originally?
David Didau responded to this argument, accepting the major points but also putting forward a replacement for levels that was based on performance descriptors. This was a pattern I was starting to see again and again: people would accept my criticisms of NC levels, but then produce a replacement which was essentially a rehashed version of levels. In this post, I tried to tackle this head on, by saying that Assessment is difficult, but it is not mysterious.
In response to that post, many people quite rightly claimed that if I wanted people to accept my critique of performance descriptors, I had to come up with some alternatives. This was the subject of the next two posts. In Assessment alternatives 1: using questions instead of criteria, I looked at how you could use banks of questions to replace descriptors. Instead of asking teachers if pupils can compare two fractions and see which is bigger, ask the pupils if 3/4 is bigger than 4/5, or if 5/7 is bigger than 5/9. And if you get a computer to mark the questions for you, you’ve cut down on a lot of workload too. Multiple choice questions can be very helpful for this kind of approach. I’ve written a three-part guide to them here.
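To make the question-bank idea concrete, here is a minimal sketch in Python of how a computer might mark the fraction questions above. The question bank, the function names and the pupil responses are all hypothetical, made up for illustration; a real system would need far more questions and a way of recording results over time.

```python
from fractions import Fraction

# Hypothetical question bank: each item asks which of two fractions is bigger.
QUESTIONS = [
    ("3/4", "4/5"),
    ("5/7", "5/9"),
]


def correct_answer(left, right):
    """Return the larger of two fractions given as strings such as '3/4'."""
    return left if Fraction(left) > Fraction(right) else right


def mark(responses):
    """Mark one pupil's responses, giving True or False for each question."""
    return [
        response == correct_answer(left, right)
        for (left, right), response in zip(QUESTIONS, responses)
    ]


# Hypothetical pupil who picks the fraction with the bigger numbers each time.
pupil_responses = ["4/5", "5/9"]
results = mark(pupil_responses)
print(results, "-", sum(results), "out of", len(QUESTIONS))
```

The made-up pupil here gets the first question right and the second wrong, which tells a teacher something specific about a likely misconception. A prose descriptor such as "can compare two fractions" cannot do that.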
Again, many people responded to that post by saying that whilst it might work for maths and science, it wasn’t really applicable to essay-based assessments that you get in English and the humanities. In Assessment alternatives 2: using pupil work instead of criteria, I suggested that for these types of assessment you could use pupil work instead of criteria. Instead of marking an essay against a set of criteria, compare essays against each other and against exemplar work.
In the next two posts I looked at some of the theoretical reasons why performance descriptors don’t work. First, in Marking essays and poisoning dogs, I look at research which shows that human judgment is relative, not absolute. This is the theory which underpins comparative judgment, a really interesting new way of assessing the quality of complex tasks like essays. The website No More Marking allows you to use comparative judgment for free.
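For readers who want to see how a pile of relative judgments can become a rank order, here is a minimal sketch in Python. It assumes a Bradley-Terry model, which is a common choice for this kind of paired comparison data; I am not claiming this is the exact method No More Marking uses. The essay labels and the judgments are invented for illustration.

```python
from collections import defaultdict


def bradley_terry(judgments, iterations=100):
    """Estimate a quality score per essay from (winner, loser) pairs.

    Uses the standard iterative update for the Bradley-Terry model.
    Assumes every essay has won and lost at least one comparison.
    """
    essays = sorted({essay for pair in judgments for essay in pair})
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    for winner, loser in judgments:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    scores = {essay: 1.0 for essay in essays}
    for _ in range(iterations):
        new_scores = {}
        for essay in essays:
            denominator = 0.0
            for other in essays:
                count = pair_counts.get(frozenset((essay, other)), 0)
                if other != essay and count:
                    denominator += count / (scores[essay] + scores[other])
            new_scores[essay] = wins[essay] / denominator
        # Rescale so that the scores always average 1.
        total = sum(new_scores.values())
        scores = {e: s * len(essays) / total for e, s in new_scores.items()}
    return scores


# Hypothetical judgments: each pair of essays is compared twice,
# and each tuple is (essay judged better, essay judged worse).
judgments = [
    ("A", "B"), ("B", "A"), ("A", "C"), ("A", "C"),
    ("A", "D"), ("A", "D"), ("B", "C"), ("C", "B"),
    ("B", "D"), ("B", "D"), ("C", "D"), ("D", "C"),
]
scores = bradley_terry(judgments)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Notice that no judge ever has to say what absolute mark an essay deserves. Each judgment is only "this one is better than that one", and the scale comes out of combining many such decisions.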
Second, in Tacit knowledge, I look at the research around tacit knowledge, which shows that it is hard to describe expertise and excellence using words. When we try to do so, as with performance descriptors, the result is often not quality but absurdity. For example, pupils who have been coached with certain rules of mark schemes sometimes end up writing essays which read less like a good essay and more like the mark scheme itself.
I’ve also written in more detail here about exactly how No More Marking and comparative judgment work. I am really convinced that this method of assessment has huge potential. It is much more reliable and efficient than traditional methods of marking writing. Not only that, but it also removes the distorting reliance on performance descriptors. We’ve seen just how distorting such descriptors can be this year at primary, where the interim assessment frameworks have led to tick-box style teaching that doesn’t reward real quality. I’ve written about this problem in more depth here, in Best fit is not the problem.
The two books I’ve found most helpful in thinking about assessment are Measuring Up by Daniel Koretz, and Principled Assessment Design by Dylan Wiliam. My review of Wiliam’s book is here. My review of Koretz’s book is in three parts: part one is How useful are tests?, part two is Validity and reliability, and part three is Why teaching to the test is so bad.
As you can probably tell from the titles in my review of Koretz’s book, I think teaching to the test is hugely problematic, and I think that in many ways tests are used very badly in the English system. However, despite all these problems, I think that tests do have real benefits which other forms of assessment cannot match. I summarised some of these in a debate at Intelligence Squared in October 2015 where I spoke against the motion: Let’s End the Tyranny of the Test. The video of the debate is here. I also wrote four blog posts summarising some of the arguments I made in this debate. The first post is a review of the debate. The second, Tests are inhuman – and that is what is so good about them, looks at how tests are fairer than teacher assessment because teacher assessment is biased against disadvantaged pupils. The third, Why is teacher assessment biased?, looks in more detail at why this bias happens. The fourth, Character assessment: a middle class ramp, looks at how the popular idea of assessing character risks the same kinds of bias.
In this post, from March 2014, I listed different approaches to life after levels that were being taken by different schools. I hope to be able to update this soon with further developments from the past two years.
From March to July 2015 I was part of the government’s Commission on Assessment without Levels. My take on the final report can be found here.
Finally, I gave a speech at Research Ed in September 2015 summarising a lot of these ideas. The slides are here, and the video is below.