
New report by the Sutton Trust: What Makes Great Teaching

Today the Sutton Trust and the University of Durham have published a fascinating new report called What Makes Great Teaching? It sets out to answer that title question, as well as looking at ways we can measure great teaching, and how that could be used to promote better learning. Here is my short summary of some key points from the report.

1. What is effective teaching? This report is very honest about the fact that we don’t have as clear an idea of what good teaching is as we think we do. I think this is an important point to make. Too often, reports like this one start from the point of assuming that everyone knows what good teaching is, and that the challenge is finding the time/money/will/methodology to implement changes. This report is saying that actually, there are a lot of misconceptions about what good teaching is, and as such, reform efforts could end up doing more harm than good. We need to think more clearly and critically about what good teaching is – and this report does that. As well as listing what effective teaching practices are, it also lists what ineffective practices are. This list has already received some media attention (including a Guardian article with a bit from me), as it says that some popular practices such as learning styles and discovery learning are not backed up by evidence. The report draws its evidence from a wide range of sources, including knowledge from cognitive psychology. It cites Dan Willingham quite a lot, and quotes his wonderful line that memory is the residue of thought. As regular readers will know, I think cognitive psychology has a lot to offer education, so it is great to see it getting so much publicity in this report.

2. How can we measure good teaching? According to this report, the focus should always be on student outcomes (not necessarily just academic ones). This can also be a bit of a hard truth. If a group of teachers work really hard at mastering a particular technique or teaching approach, and they do master it and use it in all their lessons, it can be tempting to define this as success. But this report says – no. The focus has to be on student outcomes. Although we can devise proxy measures to stand in for student outcomes, we need to check back regularly against the outcomes themselves to see if those assumptions still hold true. The report is also honest about the fact that a lot of the current ways we measure teaching are flawed. That’s why we need to use more than one measure, to check them against each other, and to be very careful about the purposes we put these measurements to. The report suggests that our current measures are probably only suitable for low-stakes purposes, and that they certainly can’t be used for both formative and summative measures at the same time (or ‘fixing’ and ‘firing’ as they call it).

3. How can we improve measurement? Although the report is very cautious about the current state of measurement tools, it offers some useful thoughts about how we could improve this state of affairs. First, school leaders need to be able to understand the strengths and limitations of all these various data sources. According to the report, there is ‘the need for a high level of assessment and data skills among school leaders. The ability to identify and source ‘high-quality’ assessments, to integrate multiple sources of information, applying appropriate weight and caution to each, and to interpret the various measures validly, is a non-trivial demand.’ Also, student assessment needs to be improved. If we always want to be checking the effect of our practices on student outcomes, we need a better way of measuring those outcomes. The report gives this tantalising suggestion: that the profession could create ‘a system of crowd-sourced assessments, peer-reviewed by teachers, calibrated and quality assured using psychometric models, and using a range of item formats’. It would be great to hear more details about this proposal, and perhaps about how CEM or the Sutton Trust could provide the infrastructure and/or training to get such a system off the ground.
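The ‘psychometric models’ the report mentions for calibrating crowd-sourced assessments are typically item response models such as the Rasch model, in which the probability of a pupil answering an item correctly depends only on the gap between the pupil’s ability and the item’s difficulty. The report gives no implementation details, so this is purely my own illustrative sketch of the idea:

```python
import math

def rasch_p_correct(ability: float, difficulty: float) -> float:
    """Rasch model: probability that a pupil of the given ability
    answers an item of the given difficulty correctly."""
    return 1 / (1 + math.exp(-(ability - difficulty)))

# A pupil whose ability exactly matches an item's difficulty
# has a 50% chance of answering it correctly
print(rasch_p_correct(0.0, 0.0))  # 0.5

# An easier item (lower difficulty) raises that chance
print(round(rasch_p_correct(0.0, -1.0), 3))
```

Fitting such a model to many teachers’ items and pupils’ responses is what would let a crowd-sourced item bank be placed on a common difficulty scale.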

One of the authors of the paper is Rob Coe, and I think this report builds on his 2013 Durham Lecture, Improving Education: A Triumph of Hope over Experience. This lecture was also sceptical about a lot of recent attempts to measure and define good teaching, as can be seen in the following two slides from the lecture.

[Two slides from the lecture: Figure 6, ‘Mistaking School Improvement’, and Figure 8, ‘Poor Proxies’.]

I recommended this lecture to a friend who said something along the lines of ‘yes, this is great – but it’s so depressing! All it says is that we have got everything wrong for the last 20 years and that education research is really hard. Where are the solutions?’ I think this paper offers some of those solutions, and I would recommend it to anyone interested in improving their practice or their school.

Daily Politics soapbox – facts are vital

Today I was on the Daily Politics soapbox talking about why facts are vital for learning. Click on the image below to see the video on the BBC website.


For more information about the research I refer to, see my book, Seven Myths about Education, available here.

The short video was filmed at the Ragged School Museum in East London. It is a lovely little museum just round the back of the Mile End Road. The building was one of Dr Barnardo’s original ragged schools in the late 19th century, set up to educate the poor of the East End. It closed in 1908, and in 1990 it was turned into a museum. As well as some permanent displays, children can take part in an authentic Victorian lesson, taught by the rather formidable lady in the video. My mother and father grew up not far from this school, although I feel I should point out they are not quite old enough to have actually attended it. I also grew up in East London nearby and can remember visiting the museum as a child. It is definitely worth visiting, or taking a school trip to.

In defence of norm-referencing

A couple of weeks ago Ofqual published their consultation on new GCSE grades. A lot of the media debate has focussed on the new 1-9 grading structure, but tucked away in the consultation document there is a lot of very interesting information about how examiners make judgments.

I’ve written before on this blog about the difference between norm-referencing and criterion-referencing. Briefly, norm-referencing is when you allocate a fixed percentage of grades each year. (Update – Dylan Wiliam has pointed out in the comments that this is not the correct definition of norm-referencing – see here for his comment). Each year, the top 10% get A grades, next 10% B, etc. It’s a zero sum game: only a certain number of pupils can get the top grade, and a certain number have to get the lowest grade. This seems intrinsically unfair because however hard an individual works and however highly they achieve, they are not really going to be judged on the merits of their own work but on how it stacks up against those around them. More than x% of pupils might be performing brilliantly, but they can’t be recognised by this system. It seems much fairer to set out what it is you want pupils to know and do in order to achieve a certain grade, and to give them the grade if they meet those criteria. That’s criterion-referencing.
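The difference is easy to see in miniature: a norm-referenced boundary is a percentile of that year’s scores, while a criterion-referenced boundary is a fixed cut score. This toy sketch is my own illustration – the cohort, the 10% fraction and the cut score of 80 are all invented numbers:

```python
def norm_referenced_A(scores, top_fraction=0.10):
    """Top 10% get an A, whatever the absolute scores are."""
    cutoff_index = max(1, int(len(scores) * top_fraction))
    boundary = sorted(scores, reverse=True)[cutoff_index - 1]
    return [s >= boundary for s in scores]

def criterion_referenced_A(scores, cut_score=80):
    """Everyone at or above the fixed cut score gets an A."""
    return [s >= cut_score for s in scores]

cohort = [55, 62, 71, 78, 81, 84, 85, 88, 90, 95]
print(sum(norm_referenced_A(cohort)))       # the number of As is fixed in advance
print(sum(criterion_referenced_A(cohort)))  # the number of As depends on performance
```

The practical catch the post goes on to describe is that the fixed cut score only means the same thing each year if the test is of identical difficulty each year – which is exactly what is so hard to guarantee.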

The old O-level allocated fixed percentages of grades, and when it was abolished, the new GCSE was supposed to be criterion-referenced. I say ‘supposed’, because whilst criterion-referencing sounds much fairer and better, in practice it is fiendishly difficult and so ‘pure’ criterion-referencing has never really been implemented. Criteria have to be interpreted in the form of tests and questions, and it is exceptionally hard to create tests, or even questions, of comparable difficulty year after year – even in seemingly ‘objective’ subjects like maths or science.

We are not the only country to have this problem. The Ofqual report references the very interesting example of New Zealand. Their attempt at pure criterion referencing in 2005 led to serious problems. A New Zealand academic wrote this report about it, which includes a number of interesting points.

Taken at face value, criterion-referenced assessment appears to have much to recommend it (the performance demonstrated is a well-specified task open to interpretation) and norm-referencing very little to recommend it (the level of performance must be gauged from the relative position obtained), nevertheless, there are difficulties that make the introduction of criterion-referenced assessment in areas like reading, mathematics, and so on, much less smooth than this view might lead one to anticipate.

Likewise, in his book Measuring Up (which I reviewed in three parts here, here and here), the American assessment expert Daniel Koretz outlines some of the flaws with criterion-referenced assessments. The basic flaw at the very heart of criterion-referencing may be that we are ill-equipped to make absolute judgments. In the words of Donald Laming, ‘there is no absolute judgment. All judgments are comparisons of one thing with another.’

As a result, our system has never been purely criterion-referenced. Tim Oates says this of the system we use at the moment:

‘In fact, we don’t really have a clear term for the approach that we actually use. ‘Weak-criterion referencing’ has been suggested: judgement about students meeting a standard, mixed with statistical information about what kinds of pupils took the examination.’

Ofqual are proposing to continue with this approach, but to improve it. I support their direction of travel, but I wonder if they couldn’t have gone a bit further – say, for example, actually reintroducing fixed grades.

One argument against fixed allocations of grades is that they won’t allow you to recognise genuine improvement in the system – or indeed genuine decline. If the top x% always get the top grade, you have no idea if standards are improving or declining. However, this argument no longer holds water because Ofqual are proposing to bring in a national reference test:

The performance of the students who take the test will provide a useful additional source of information about the performance of the cohort (rather than individual students) for exam boards awarding new GCSEs. If, overall, students’ performance in the reference test improves on previous years (or indeed declines) this may provide evidence to support changing the proportion of students in the national cohort achieving higher or lower GCSE grades in that year. At present such objective and independent evidence is not available when GCSE awards are made.

I think the reference test is an excellent idea. Ideally, in the long-term it could assume the burden of seeing if overall standards are improving, leaving GCSEs free to measure the performance of individual pupils. In that case, why not have fixed grades for GCSEs? Alan Smithers makes a similar point in the Guardian here.

One reason why Ofqual might not have wanted to reintroduce fixed allocations of grades at the moment is because, despite all the real technical flaws with criterion-referencing which I have outlined above, there is still an element of hostility to norm-referencing amongst many educationalists. In my experience, many people think that norm-referencing is ‘ideological’ – that the only people who advocate it are those who want to force pupils to compete against each other.

Nothing could be further from the truth. Norm-referencing has some basic technical advantages which make it a sensible and pragmatic choice. The Finnish system, for example, which is often seen as being opposed to the ideas of competition and pupil ranking, has a norm-referenced final exam where the ‘number of top grades and failed grades in each exam is approximately 5 percent.’ Not only that, but as the example of New Zealand shows, those countries who have experimented with fully criterion-referenced exams have faced serious problems. If we refuse to acknowledge the genuine strengths of norm-referencing, we risk closing down many promising solutions to assessment problems.

Why national curriculum levels need replacing

One of the main reasons why people say we need to keep national curriculum levels is because they provide a common language.

I am all in favour of a common language, but levels did not provide this, as I have argued before here. Since I wrote that last post, I have come across this fascinating paper by Peter Pumfrey. It was written nearly twenty years ago, when levels were first introduced. It looks at the results of pupils in the KS1 reading tests. It is summarised by Bonnie Macmillan in Why School Children Can’t Read:

An investigation comparing pupils’ standardised reading scores with their level of attainment on national curriculum tests is starkly illustrative.  Children who had been assessed as having attained level 2 (the average expected for their age) on national curriculum tests were found to have reading ages, determined from standardised testing, ranging from 5.7 to 12.9 years. That is, within the group of pupils all categorised as level 2, there was an incredible 7 year range in the actual reading abilities represented. Similarly, those categorised as level 1 were found to have reading ages ranging from 5.7 to 9.6 years.

Even though I was well aware of all the problems with levels, I was still astonished to read this. Not only does the level 2 category include pupils of such differing attainment as to be practically meaningless, it also significantly overlaps with the level 1 category. That doesn’t look to me as though levels are giving us a common and shared understanding.

Although I know of no similar research which has been done more recently, a look at the distribution of levels in the KS2 tests suggests that something similar is going on. In the KS2 tests, approximately 15% of pupils get a level 3 or below, 50% get a 4 and 35% get a 5. So the band of pupils achieving a level 4 – that is, national expectations – runs from approximately the 16th to the 65th percentile. I suspect if we did a reading age test on all of these pupils, we would find huge variations in their results. Anecdotally, I know of plenty of secondary schools who find that some of their level 4s have difficulty with reading and are placed in their catch-up reading classes. So again, how useful is it, and how much of a ‘common language’ do levels provide, if the level 4 range runs from pupils who are still struggling with reading and writing up to pupils who are confident readers and writers? One of the reasons why so many secondary schools reassess their pupils on entry (CAT4, for example, is used in over 50% of UK secondaries) is because the KS2 SATs do not provide a common language or that much in the way of useful information.
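The percentile arithmetic behind that claim is just the approximate national proportions stacked cumulatively – each level’s band runs from the cumulative share below it to the cumulative share including it. A quick sketch using the rounded figures above:

```python
# Approximate national distribution of KS2 levels (rounded figures from the text)
proportions = [("level 3 or below", 0.15), ("level 4", 0.50), ("level 5", 0.35)]

# Stack the shares cumulatively to get each level's percentile band
bands = {}
lower = 0.0
for level, share in proportions:
    bands[level] = (lower, lower + share)
    lower += share

for level, (lo, hi) in bands.items():
    print(f"{level}: roughly the {lo:.0%} to {hi:.0%} percentile range")
```

Half the national cohort sits inside the level 4 band – which is why a single label for all of them can hide such a wide spread of actual attainment.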

These vague bands cause further problems for secondary schools because they are used as the baseline for measuring progress across secondary. Expected progress for all level 4s is a C at GCSE. 84% of pupils in the top third of that level 4 category do go on to achieve a C or above at GCSE. But only 50% of those in the bottom third of the level 4 category do. (These figures are for English; they are similar for Maths). Schools that have a lot of pupils clustered in the bottom part of the level 4 category are being held to very tough attainment targets. Schools with lots of pupils clustered at the top of that level get relatively easy attainment targets. And of course, in practice, schools will not get even spreads of level 4 pupils. Schools in some areas will take on a disproportionate number of ‘low’ level 4s, whereas other schools will get a disproportionate number of ‘high’ level 4s.
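The effect of that clustering on school targets can be put in numbers. The 84% and 50% conversion rates for the top and bottom thirds come from the paragraph above; the middle-third rate and the two school mixes below are my own assumptions, purely for illustration:

```python
# C-or-above rates at GCSE English for thirds of the level 4 band
# (top and bottom thirds from the text; middle third is an assumed interpolation)
rates = {"low": 0.50, "middle": 0.67, "high": 0.84}

def expected_C_rate(mix):
    """Expected fraction of a school's level 4 intake reaching a C,
    given the school's mix of level 4 sub-bands (shares summing to 1)."""
    return sum(share * rates[band] for band, share in mix.items())

# Two hypothetical schools with identical headline 'level 4' intakes
school_mostly_low  = {"low": 0.6, "middle": 0.3, "high": 0.1}
school_mostly_high = {"low": 0.1, "middle": 0.3, "high": 0.6}

print(round(expected_C_rate(school_mostly_low), 2))
print(round(expected_C_rate(school_mostly_high), 2))
```

Both schools report the same proportion of level 4 pupils, yet their realistic conversion rates differ by many percentage points – a gap that a single ‘expected progress’ target simply ignores.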

So, in conclusion, national curriculum levels do not provide a common language and this results in many pernicious effects. As for what could provide a common language, I will return to this in my next post.

Replacing national curriculum levels

Life beyond levels? Life after levels? Life without levels?  Lots of teachers, senior leaders and academics have come up with some interesting ideas for what should replace national curriculum levels. Here’s a summary of some of those ideas.

  • Michael Fordham is a former history teacher and now works at Cambridge’s education department. He has written three articles which put forward a possible system for assessing history – one, two, three.
  • Alison Peacock is the head of Wroxham Primary School, which moved away from levels a while ago. In this post she expresses a worry that any list of aims she writes up will become APP under another name.
  • Alison Peacock was also a part of the NAHT commission who recently released a report on this.
  • The NAHT report attracted quite a few comments. I’m in broad agreement with David Thomas’s post here, particularly the point he makes about how easy it is to say you should assess pupils according to objective criteria, and how hard it is to actually achieve this. (See below for Paul Bambrick-Santoyo’s work on this). Gifted Phoenix also commented on it here.
  • Tom Sherrington is a secondary head. I like the focus here on taking actual samples of pupil work as definitions of standards.
  • Phil Stock is an English teacher who has shared his department’s plans. They involve new rubrics for assessing reading and writing, and the use of multiple choice questions.
  • David Thomas is a head of maths and in this proposal he notes that there is a tension between providing teachers and students with useful feedback and providing teachers and students with a system that is easy to understand.
  • Joe Kirby is an English teacher at a London secondary. A lot of the ideas in this post are ones Joe and I have discussed together. This post and this one expand on the issues.
  • Michael Tidd has put forward a proposal for primary assessment here, and has made more specific proposals about a mastery approach to assessment here, with an interesting comparison to a game of Jenga.
  • Alex Quigley has a draft model for assessing English here.
  • Chris Waterworth has written about a possible approach for primary assessment. I’m less of a fan of this approach, as it suggests simply using the current level descriptors and APP grids, just without the levels. The problem with level descriptors is well described by …
  • …Paul Bambrick-Santoyo, who has written some fascinating things about the difficulty of using prose descriptions of standards as a guide for assessments. Pages 6-8 of Driven by Data explain exactly what the flaws are.
  • Rob Coe has come up with a list of 47 criteria you should consider before you let a test into your classroom.
  • GL Assessments have two excellent articles on their website about assessing without levels. The first one, here, explains the purpose of standardised tests and how they could feature in a world without levels. The second article, here, offers a case study of St Peter’s Collegiate School in Wolverhampton which abolished levels in 2009.
  • Finally, a bit of light relief: this clip from This is Spinal Tap reminds us that whatever scale we use, it has to have some underlying meaning.
  • 1st May update: the DfE have announced the winners of their assessment innovation fund. David Thomas’s plan, mentioned above, is one of the winners.

I know there are some people who are disappointed that levels are going, fearing that we will lose a common language. I am not worried at all. I’m delighted at how many people are seeing the abolition of levels as an opportunity. I am also much less worried about the loss of a common language, because I don’t think levels really did provide a common language. I have written about this before here. Since I wrote that, I came across this paper by Peter Pumfrey, which shows that a group of pupils who achieved a level 2 in the KS1 teacher reading assessments had reading ages ranging from 5 to 10. In these circumstances, can we really say that levels provided a common language? Rather, it seems to me that they provided the illusion of a common language, which is actually far worse than having no common language at all.

Seven Myths about Education – out now

My book, Seven Myths about Education, was published this week by Routledge.

It was actually first published as an ebook by The Curriculum Centre in June 2013.  There are a couple of new additions to this version. There are some slight alterations to chapter two, and, more significantly, forewords by E.D. Hirsch and Dylan Wiliam.

There’s a nice review of it here by Joe Kirby.

Before publication I wrote a short summary of each chapter. Here they are.

After publication, I also wrote three posts restating the evidence base for the book and responding to some of my critics.

Evidence base part one
Evidence base part two
Evidence base part three

Teaching knowledge is not indoctrination

Myth 7 of my book is ‘Teaching knowledge is indoctrination’. I found lots of influential educationalists who believed this, but I did also feel that it was not the most pervasive of the myths I identified. Generally, I find the problem is not that people think that teaching Romeo and Juliet is indoctrinating pupils with the cultural products of dead white European men. Rather, the problem is that Ofsted think that making puppets is an acceptable way of teaching Romeo and Juliet.

Most of the influential people who believe in this myth wrote their first works in the 1970s. Whilst many of them are still around today, I think their work shows signs of being a bit dated. Significantly, one of the most important promoters of this myth, Michael Young, has actually recanted. In 1971 he edited a collection of essays called Knowledge and Control which was one of the seminal works of the ‘teaching knowledge is indoctrination’ school. He’s since published a book called ‘Bringing Knowledge Back In’ which argues for the importance of teaching what he calls ‘powerful knowledge’.

So, even though some traces of this myth persist, I would have said that on the whole, the belief that knowledge is indoctrination was going the way of the mullet, Love thy Neighbour, and other unlamented aspects of the 1970s. Just as I was thinking that, however, up popped this article in the Guardian by the deputy head teacher Tait Coles.

Interestingly, Mr Coles and I seem to have very similar aims for education.

Teachers can’t ignore the contexts, culture, histories and meanings that students bring to their school. Working class students and other minority groups need an education that prepares them with the knowledge of identifying the problems and conflicts in their life and the skills to act on that knowledge so they can improve their current situations.

I agree with all this. Where Mr Coles and I would depart is the best way to achieve these aims. Mr Coles compares two different approaches to curriculum and pedagogy – those of E.D. Hirsch and Paulo Freire. For him, Paulo Freire’s methods are far superior.

In contrast, ED Hirsch’s Core Knowledge Curriculum is a ‘hegemonic vision produced for and by the white middle class to help maintain the social and economic status quo’ and ‘teaching a “core knowledge” instils a culture of conformity and an insipid, passive absorption of carefully selected knowledge among young people…Schools that adopt this method become nothing more than pipelines producing robotic citizens, perpetuating the vision of a capitalist society and consequently preventing social mobility.’

Mr Coles offers no evidence for this assertion, and it is hard to see any way in which this criticism is justified. I would genuinely like to see the evidence and logic which led him to this conclusion. I don’t want to accuse him of not having read the Core Knowledge curriculum, but it is quite hard to see how anyone could have read it and come up with this conclusion.  The CK curriculum includes speeches by Martin Luther King and Sojourner Truth, units on reformers such as Lucretia Mott and Elizabeth Cady Stanton and texts such as Jacob Riis’s How the Other Half Lives, a seminal work of photojournalism which exposed the inequalities of late 19th century New York. If you commissioned someone to design a curriculum that ‘deliberately failed to consider the values and beliefs of any other particular race, class or gender’ and they came back to you with the Core Knowledge Curriculum, you’d send them away to start again. Anyone who has read it or seen it in action will know that the CK curriculum is inclusive, global and multicultural. Indeed, its global and multicultural focus has seen it become the target of criticism from the religious right in America.

Another thing you wouldn’t know from reading Mr Coles’s piece is that Hirsch and Freire both have progressive aims. Where they differ is in how they think you should achieve such aims. As Mr Coles acknowledges in his article, ‘critical pedagogy isn’t a prescriptive set of practices – it’s a continuous moral project that enables young people to develop a social awareness of freedom.’ This vagueness can make it hard to work out the practices Freire advocated. As part of my research for my book Seven Myths about Education, I read some of Freire’s works and those of Freirean practitioners and attempted to pin down exactly what he was proposing. I concluded that for Freire, the very act of transmitting knowledge is suspect and regressive. Instead, his critical pedagogy involves teachers working with the knowledge pupils already have and with the knowledge pupils are able to discover independently.  The problem with this is that the knowledge pupils can discover independently is always going to be limited. Discovery learning is a wholly inefficient way of acquiring knowledge. The knowledge pupils already have is always going to be unequal, and unfortunately may also diverge along socio-economic lines. In modern Britain, it is also the case that the type of knowledge pupils will pick up from the environment will very likely come from the mass media, whose primary focus is often entertainment, not truth. As Harry Webb has argued, a pupil whose only knowledge about Winston Churchill is from the mass media would be in no position to critique Churchill’s reputation. Mr Coles himself accepts that education should ‘challenge the accepted social truths purveyed by media.’ However, a Freirean discovery-based critical pedagogy will not achieve this. It will actually just give more power to media distortions. 
Thus, if we are so worried about indoctrination that we teach pupils no knowledge, one result is that we actually end up outsourcing the transmission of knowledge to the mass media, which is far more likely to result in indoctrination and bias. Another consequence is that we end up entrenching and reinforcing existing class divisions. Interestingly, one of the secretaries to the Plowden Report recognised this. ‘This view of education, naturalistic, heuristic and developmental as it was, was in some unremarked conflict with the Committee’s thinking about education as a redistributive agency.’ In short, discovery learning and social justice are in conflict.

The alternative to this approach is to accept that knowledge transmission does carry with it the risk of indoctrination, but that it is also an inevitable part of teaching, and the foundation of all skill. Given these three things, then teachers and schools should take great care over the selection of that knowledge and should most certainly not leave it up to the chance of a pupil’s background or the whims of a TV producer.  (There are important questions that will remain about how you choose the knowledge, who chooses the knowledge, and what knowledge you end up choosing – but these are questions that have to be answered, not questions that demolish the possibility of teaching knowledge. They are real questions, not rhetorical ones. I discuss some of the answers to them in chapter 7 of my book, and will discuss it at more length in my next blog.)

Broadly speaking, the former approach is taken by Freire, and the latter by Hirsch (and those in the early labour movement); the former approach is not backed by evidence, and the latter is. Thus, whilst Hirsch and Freire both have progressive aims, Freire’s methods simply haven’t been as effective as Hirsch’s. If we compare the empirical and theoretical evidence in favour of a Hirsch style curriculum and a Freire style curriculum, I am afraid there is no contest. The principles of the CK curriculum are based on a solid understanding of cognitive psychology and the specific curriculum has performed excellently in practice in a number of research studies, including the impressive Core Knowledge Language Arts programme which was shown to be particularly beneficial for precisely the types of disadvantaged pupils Mr Coles is worried about. There is no such evidence in favour of Freire’s pedagogy.

So, to sum up, whilst Hirsch and Freire may both be motivated by the right ideas, only Hirsch is motivated by the right methods.

From my perspective, one of the good things about Mr Coles’s article is that whilst it grossly misrepresents Hirsch, it doesn’t ignore him. Five years ago it was hard to find someone in English education who had heard of Hirsch, or Dan Willingham, or any of the evidence in favour of a content rich curriculum. As I say in my book, the real ‘hegemonic vision’ is not in Hirsch’s curriculum, but in the exclusion of Hirsch and others like him from so many teacher training curriculums. For years, the education establishment has not had to argue against people who opposed its world view because it had effectively airbrushed them out of the debate. In the last few years, things have changed. People can no longer ignore the accumulation of evidence against so many of these dominant ideologies. Instead, they misrepresent and attack this evidence. Moving from being ignored to being attacked may not seem like an improvement, but it is. For every person who reads Mr Coles’s articles and nods in agreement, I think there will be one whose interest is piqued enough to want to find out more about this Hirsch chap. The playing field is levelling. ‘Let truth and falsehood grapple; who ever knew truth put to the worse in a free and open encounter?’

