Myth 7 of my book is ‘Teaching knowledge is indoctrination’. I found lots of influential educationalists who believed this, but I did also feel that it was not the most pervasive of the myths I identified. Generally, I find the problem is not that people think that teaching Romeo and Juliet is indoctrinating pupils with the cultural products of dead white European men. Rather, the problem is that Ofsted think that making puppets is an acceptable way of teaching Romeo and Juliet.
Most of the influential people who believe in this myth wrote their first works in the 1970s. Whilst many of them are still around today, I think their work shows signs of being a bit dated. Significantly, one of the most important promoters of this myth, Michael Young, has actually recanted. In 1971 he edited a collection of essays called Knowledge and Control which was one of the seminal works of the ‘teaching knowledge is indoctrination’ school. He’s since published a book called ‘Bringing Knowledge Back In’ which argues for the importance of teaching what he calls ‘powerful knowledge’.
So, even though some traces of this myth persist, I would have said that on the whole, the belief that knowledge is indoctrination was going the way of the mullet, Love Thy Neighbour, and other unlamented aspects of the 1970s. Just as I was thinking that, however, up popped this article in the Guardian by the deputy head teacher Tait Coles.
Interestingly, Mr Coles and I seem to have very similar aims for education. He writes:
Teachers can’t ignore the contexts, culture, histories and meanings that students bring to their school. Working class students and other minority groups need an education that prepares them with the knowledge of identifying the problems and conflicts in their life and the skills to act on that knowledge so they can improve their current situations.
I agree with all this. Where Mr Coles and I part company is on the best way to achieve these aims. Mr Coles compares two different approaches to curriculum and pedagogy – those of E.D. Hirsch and Paulo Freire. For him, Freire’s methods are far superior.
In contrast, E.D. Hirsch’s Core Knowledge Curriculum is a ‘hegemonic vision produced for and by the white middle class to help maintain the social and economic status quo’ and ‘teaching a “core knowledge” instils a culture of conformity and an insipid, passive absorption of carefully selected knowledge among young people…Schools that adopt this method become nothing more than pipelines producing robotic citizens, perpetuating the vision of a capitalist society and consequently preventing social mobility.’
Mr Coles offers no evidence for this assertion, and it is hard to see any way in which this criticism is justified. I would genuinely like to see the evidence and logic which led him to this conclusion. I don’t want to accuse him of not having read the Core Knowledge curriculum, but it is quite hard to see how anyone could have read it and come up with this conclusion. The CK curriculum includes speeches by Martin Luther King and Sojourner Truth, units on reformers such as Lucretia Mott and Elizabeth Cady Stanton and texts such as Jacob Riis’s How the Other Half Lives, a seminal work of photojournalism which exposed the inequalities of late 19th century New York. If you commissioned someone to design a curriculum that ‘deliberately failed to consider the values and beliefs of any other particular race, class or gender’ and they came back to you with the Core Knowledge Curriculum, you’d send them away to start again. Anyone who has read it or seen it in action will know that the CK curriculum is inclusive, global and multicultural. Indeed, its global and multicultural focus has seen it become the target of criticism from the religious right in America.
Another thing you wouldn’t know from reading Mr Coles’s piece is that Hirsch and Freire both have progressive aims. Where they differ is in how they think you should achieve such aims. As Mr Coles acknowledges in his article, ‘critical pedagogy isn’t a prescriptive set of practices – it’s a continuous moral project that enables young people to develop a social awareness of freedom.’ This vagueness can make it hard to work out the practices Freire advocated. As part of my research for my book Seven Myths about Education, I read some of Freire’s works and those of Freirean practitioners and attempted to pin down exactly what he was proposing. I concluded that for Freire, the very act of transmitting knowledge is suspect and regressive. Instead, his critical pedagogy involves teachers working with the knowledge pupils already have and with the knowledge pupils are able to discover independently.

The problem with this is that the knowledge pupils can discover independently is always going to be limited. Discovery learning is a wholly inefficient way of acquiring knowledge. The knowledge pupils already have is always going to be unequal, and unfortunately may also diverge along socio-economic lines. In modern Britain, it is also the case that the type of knowledge pupils pick up from the environment will very likely come from the mass media, whose primary focus is often entertainment, not truth. As Harry Webb has argued, a pupil whose only knowledge about Winston Churchill came from the mass media would be in no position to critique Churchill’s reputation. Mr Coles himself accepts that education should ‘challenge the accepted social truths purveyed by media.’ However, a Freirean discovery-based critical pedagogy will not achieve this. It will actually just give more power to media distortions.

Thus, if we are so worried about indoctrination that we teach pupils no knowledge, one result is that we end up outsourcing the transmission of knowledge to the mass media, which is far more likely to result in indoctrination and bias. Another consequence is that we end up entrenching and reinforcing existing class divisions. Interestingly, one of the secretaries to the Plowden Report recognised this: ‘This view of education, naturalistic, heuristic and developmental as it was, was in some unremarked conflict with the Committee’s thinking about education as a redistributive agency.’ In short, discovery learning and social justice are in conflict.
The alternative to this approach is to accept that knowledge transmission does carry with it the risk of indoctrination, but that it is also an inevitable part of teaching, and the foundation of all skill. Given these three things, teachers and schools should take great care over the selection of that knowledge and should most certainly not leave it up to the chance of a pupil’s background or the whims of a TV producer. (There are important questions that remain about how you choose the knowledge, who chooses the knowledge, and what knowledge you end up choosing – but these are questions that have to be answered, not questions that demolish the possibility of teaching knowledge. They are real questions, not rhetorical ones. I discuss some of the answers to them in chapter 7 of my book, and will discuss them at more length in my next blog.)
Broadly speaking, the former approach is taken by Freire, and the latter by Hirsch (and those in the early labour movement); the former approach is not backed by evidence, and the latter is. Thus, whilst Hirsch and Freire both have progressive aims, Freire’s methods simply haven’t been as effective as Hirsch’s. If we compare the empirical and theoretical evidence in favour of a Hirsch-style curriculum and a Freire-style curriculum, I am afraid there is no contest. The principles of the CK curriculum are based on a solid understanding of cognitive psychology, and the specific curriculum has performed excellently in practice in a number of research studies, including the impressive Core Knowledge Language Arts programme, which was shown to be particularly beneficial for precisely the types of disadvantaged pupils Mr Coles is worried about. There is no such evidence in favour of Freire’s pedagogy.
So, to sum up: whilst Hirsch and Freire may both be motivated by the right ideas, only Hirsch advocates the right methods.
From my perspective, one of the good things about Mr Coles’s article is that whilst it grossly misrepresents Hirsch, it doesn’t ignore him. Five years ago it was hard to find someone in English education who had heard of Hirsch, or Dan Willingham, or any of the evidence in favour of a content rich curriculum. As I say in my book, the real ‘hegemonic vision’ is not in Hirsch’s curriculum, but in the exclusion of Hirsch and others like him from so many teacher training curriculums. For years, the education establishment has not had to argue against people who opposed its world view because it had effectively airbrushed them out of the debate. In the last few years, things have changed. People can no longer ignore the accumulation of evidence against so many of these dominant ideologies. Instead, they misrepresent and attack this evidence. Moving from being ignored to being attacked may not seem like an improvement, but it is. For every person who reads Mr Coles’s articles and nods in agreement, I think there will be one whose interest is piqued enough to want to find out more about this Hirsch chap. The playing field is levelling. ‘Let truth and falsehood grapple; who ever knew truth put to the worse in a free and open encounter?’
In my previous two posts (here and here), I looked at the structure of my book and restated some of the evidence I’d used to make the claim that a certain set of ideas were dominant in English education. In this post, I want to restate the evidence I used to back up my second claim: that these ideas are misguided. Essentially, the evidence here is fairly straightforward and derives mostly from cognitive psychology. In summary, working memory is limited; long-term memory is powerful; and we remember what we think about. Here’s a summary of just some of the evidence about this.
Dan Willingham – Why Don’t Students Like School?
Kirschner, Sweller & Clark – Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching
Herbert Simon – Skill in Chess; Expert and Novice Performance in Solving Physics Problems
John Anderson – A Simple Theory of Complex Cognition
Here are two other excellent articles which I don’t cite in my book but which deal with very similar ideas.
Greg Yates – “How Obvious”: Personal reflections on the database of educational psychology and effective teaching research
Richard E. Mayer – Should There Be a Three-Strikes Rule Against Pure Discovery Learning? The Case for Guided Methods of Instruction
These three key facts about memory have huge implications for classroom practice. It’s because working memory is so limited that projects, authentic activities and discovery learning are so problematic. It’s because long-term memory is so powerful that we need to make sure pupils commit facts to memory, and that pupils who don’t already have background knowledge are not left to devise their own education. And it’s because we remember what we think about that we need to make sure all our classroom activities focus our pupils’ attention on what we want them to remember.
The myths I identify, and the practice they have influenced, are in direct opposition to all of this evidence. I will give just three examples of this here – one example of a lesson from Ofsted, one example of a statement from a popular educationalist, and one example of a popular non-governmental curriculum. I’m not going to repeat in detail the problems with each example – I’ve just followed each up with a quotation from the research literature.
Lesson from an Ofsted report
‘Pupils first matched each of the diverse group of party guests (baby mice through to a giant) to various balloons. Then they had to measure string of differing lengths (5cm to 2m) for tying onto the balloons for each guest. The higher level teaching assistant encouraged good debate between the pupils around whether the string should be measured and cut before tying, or tied first and then measured. She did not steer them towards the other approach when they decided to measure and tie the string first. The pupils wrestled with measuring the string after tying it to the balloons which enabled them to appreciate the difficulty of measuring accurately once the string was attached to the balloon. They also realised that some of the string was used up in tying it to the balloon. This led to good discussion around which approach should be taken. The pupils revised their strategy for the task, which they went on to complete successfully.’ (Link here.)
Evidence from research
‘Controlled experiments almost uniformly indicate that when dealing with novel information, learners should be explicitly shown what to do and how to do it.’ (Kirschner, Sweller, Clark)
‘Children who have the advantage of clear instructional cues will achieve understanding more readily than children expected to acquire knowledge via less directive teaching methods.’ (Yates)
Statement from a popular educationalist

‘Knowledge is changing so fast that we cannot give young people what they will need to know, because we do not know what it will be. Instead we should be helping them to develop supple and nimble minds, so that they will be able to learn whatever they need to.’ (Link here.)
Evidence from research
‘In every domain that has been explored, considerable knowledge has been found to be an essential prerequisite to expert skill.’ (Simon, Expert and novice performance in solving physics problems)
‘Data from the last 30 years lead to a conclusion that is not scientifically challengeable: thinking well requires knowing facts, and that’s true not simply because you need something to think about. The very processes that teachers care about most — critical thinking processes like reasoning and problem solving — are intimately intertwined with factual knowledge that is in long-term memory (not just in the environment).’ (Willingham, Why Don’t Students Like School?)
RSA Opening Minds curriculum
‘Children plan their work, organise their own time and explore their own ways of learning.’ (Link here.)
‘With the primary sector’s more cross-cutting and discovery-based approach to teaching and learning we expect there to be a natural fit with how Opening Minds has evolved at Key Stage 3.’ (Link here.)
Evidence from research
‘Constructivism too often is seen in terms of student centred inquiry learning, problem-based learning and task based learning and common jargon words include “authentic”, “discovery” and “intrinsically motivated learning”. The role of the constructivist teacher is claimed to be more of facilitation to provide opportunities for individual students to acquire knowledge and construct meaning through their own activities, and through discussion, reflection and the sharing of ideas with other learners with minimal corrective intervention. These kinds of statements are almost directly opposite to the successful recipe for teaching and learning as will be developed in the following chapters.’ (Hattie, Visible Learning, p.26)
‘In all major domains, an accumulation of effective methods has occurred for teaching the accumulated knowledge and skills…teachers know how and to what degree of mastery the simpler tasks have to be acquired to serve as the building blocks of more complex skill. Unlike the beginners themselves, teachers can foresee the future demands and avoid the need for complete relearning of previously obtained skill.’ (K. Anders Ericsson, link here)
E.D. Hirsch has used the insights we have from cognitive psychology to formulate some ‘middle axioms’ for classroom practice. (I blog about this here.) Middle axioms are general theoretical principles that can guide our practice. His axioms are as follows.
• Prior knowledge as a prerequisite to effective learning.
• The right mix of generalization and example.
• Attention determines learning.
• Rehearsal (repetition) is usually necessary for retention.
• Automaticity (through rehearsal) is essential to higher skills.
• Implicit instruction of beginners is usually less effective.
I would suggest that these middle axioms are a better guide to educational practice than the seven myths I identify. Unfortunately, it’s the latter which currently guide much of our education system.
I have noticed that a common response to my book has been a) to deny the existence of the myths I’ve outlined and b) to claim that they are not myths after all.
This is not only rather illogical, it’s also something I anticipated prior to publication in this blog post.
Very often, I’d give a brief outline of what I thought about education and explain what that meant in practical terms – for example, teaching discrete grammar lessons. I would then get two responses, often from the same person. First, the person would say that most schools do what I am asking already, so what I am proposing isn’t anything new – eg, they would say, all schools teach grammar anyway. Second, they would say that I was backward looking and wanted to take education back to the 19th century. Self-evidently, both these criticisms cannot be true. If all good schools already do what I am asking, then I can’t be advocating a return to the 19th century. If I am advocating a return to the 19th century, then schools can’t all be doing what I am asking.
Similar criticisms would emerge whenever I tried to give an example of something I thought was wrong. So, I might say that a lesson where pupils learnt about Romeo and Juliet through making puppets was not very effective. Again, I would get two criticisms: first, the person would say that I was attacking a straw man and that nobody really taught like that. Second, they would say that making puppets to teach Romeo and Juliet was very effective. Again, you can’t really make both criticisms. If you think that I am attacking a straw man, then you are implicitly conceding that teaching Romeo and Juliet through the use of puppets is not effective; going on to argue that such a method is effective is contradictory. In fact, both criticisms are wrong. Making puppets to teach Romeo and Juliet is not a straw man: it is an example which has been cited as best practice. It is also, in actual fact, ineffective practice.
Since I wrote that, I have come across a paper by the cognitive psychologist Greg Yates where he records getting exactly the same response to one of his early research findings. In his words,
At the seminar, various critics noted the findings as (a) obvious, and (b) in conflict with Piagetian theory. A strange thing for a young graduate’s findings to be seen simultaneously as obvious and at variance with one of the field’s major statements.
In order to try and forestall these types of criticism, I realised that what I needed to do was to show people not just that statement x, y and z were myths, but also that lots of people actually believed in statements x, y and z. Hence, the structure of my book. I am making two claims: one, that people believe in these myths; two, that they are indeed myths. Only once I have shown beyond doubt that people believe in a myth do I explain that it is a myth.
Gratifyingly, a number of people have said that they found this structure very useful in helping them to understand the way theory influences practice. However, despite structuring the book like this, I’ve still encountered the logical fallacy I outline at the start. First, my critics will deny that anyone believes in the myths I outline and attack the evidence I’ve used to show this. OK, fair enough – I disagree, of course, but so far so logical. But secondly, they then say that the myths I’ve outlined are not, in fact, myths.
That’s a bit like saying this. ‘Daisy, stop being stupid. No-one believes in statement x. Statement x is a straw man. You’ve created this straw man from your limited view of classroom practice. If you had more experience, you’d realise that no-one believed in statement x. Oh, and by the way, statement x is true! It must be, I read a book about it!’ On the one hand, my critics claim that I have a distorted view of the reality of classroom practice. In the next breath, they defend this exact same reality in the exact same terms as I have described it!
Here, for example, is Tom Sherrington. First, hardly anyone believes in statement x.
‘For me, the myths just don’t ring true as a general description of the state of our schooling or the issues with it.’
‘There are quite a few references to RSA’s Opening Minds and passing references to Guy Claxton. But their ideas are only used directly in a tiny sample of schools; they don’t represent the system in any way.’
Second, statement x is true.
‘I have to be open minded about the eventual outcomes [of projects].’
‘The example of Y4s talking about health and safety prior to going on a trip seemed perfectly reasonable.’
‘This Y9 History lesson, comparing bombing campaigns, sounds great to me.’
There are, obviously, some logical ways you can argue against my book, even if I think they are wrong. You can argue that my myths don’t exist – ie, that the evidence I produce to substantiate the myths isn’t strong enough, and the myths are not the problem that I claim. That is, not many people really do believe in statement x. I discuss the evidence that I use for this first claim and the criticism I’ve faced for it here.
Or, you can say that the myths are not myths after all. That is, they are sound and rational beliefs. In other words, people are right to believe in statement x because it is, after all, true.
You could also argue that I am wrong on both counts. In order to do this, you’d have to say that hardly anyone believes in the myths I am outlining – which is a real shame, because they aren’t myths, they’re the best way to teach, and everyone should believe in them. In this case, the fact that you disagree with me on the first claim is actually of less importance. The disagreement over the second claim is more important.
But what you can’t do is say that my depiction of the myths is a straw man – and then go on to say that they aren’t myths. That is, you can’t argue that statement x is a straw man AND that statement x is in fact true. You can’t claim I am attacking a straw man, before going on to show that the alleged ‘straw man’ is something you are in complete agreement with. If you do so, you are inadvertently giving me evidence for my first claim – that these myths really do exist – and proving that the alleged straw man is not so straw after all.
For me – and I think, in reality, for most of my critics – the really interesting claim is the second one. Even if you don’t think the myths are the systemic problem that I do, there is enough evidence to show that they are present in places. So it would be nice if we could at least agree on this, and then move on to the more interesting and educationally meaningful second claim – are the myths really myths? Are the types of lessons I criticise really deserving of criticism? That will be the subject of next week’s post.
In Seven Myths about Education, I make two claims: first, that in English education, a certain set of ideas about education are predominant; second, that these ideas are misguided. Finding the evidence to prove the second point was relatively straightforward. It is scientifically well-established that working memory is limited and that long-term memory plays a significant role in the human intellect. This has clear implications for classroom practice, implications which others have made and which I was happy to recap.
However, there was no one evidence base which could prove (or indeed, disprove) the first claim. Instead, I identified a range of different evidence bases to prove my point that in English education, the seven ideas I discuss are predominant. (In doing so, I also defined clearly what these ideas meant in theory and practice.) Here’s a recap.
• The writings of prominent theorists, and the proof that such theorists are indeed prominent (eg book sales, presence on government committees, the judgment of their peers, etc.)
• The advice given in popular teacher training textbooks, and the proof that such textbooks are popular (eg book sales, presence on reading lists)
• The National Curriculum
• Ofsted reports – I created an appendix of 228 exemplar lessons that were described in Ofsted subject reports from the last three years. These exemplar lessons were in turn drawn from the thousands of lesson observations done by Ofsted inspectors over the previous 3-5 years. See here for more information about this and a link to the appendix, which is available for free online.
• Popular non-governmental curriculums (and the evidence of such popularity, eg the number of schools using them)
• Examples of lessons from popular resource sharing websites
I could have included a lot, lot more examples from the latter category which would have proved my point, and then some. I deliberately chose not to because it is hard to tell how popular or influential such lessons are. Had I used a lot of these, it would have been easier to accuse me of simply cherry picking the worst examples I could find on the entire internet. So I went easy on these, and only used a few examples from such websites, and then only from websites I could demonstrate were popular.
Given this range of evidence, I found it odd that some people criticised the book as being reliant on anecdotal evidence. As I say in the introduction, I do add in the odd anecdote to try and liven up the text, but only when I have clearly established that my anecdotal experience is in line with the evidence. In his review of my book, Tom Sherrington refers to my book being based on ‘personal anecdotal experience…from a very specific teaching situation’. That is absolutely not the case. It’s also particularly baffling that Tom Sherrington would say this given that the only counter-evidence he brings to bear is his own anecdotal experience. His strongest counter-argument is that ‘for me, the myths just don’t ring true’.
Interestingly, when I first started writing the book, my instinct was that my own personal anecdotal experience would bear little relation to the wider system, simply because I couldn’t believe that an entire system would have endorsed beliefs that were so completely at odds with all the available evidence. I actually set out planning to write a book that critiqued certain parts of the education system. It was a genuine surprise to me to see clear examples of bad practice being endorsed as good practice on a system-wide level.
In a largely positive review, Michael Fordham did criticise my use of Ofsted reports as ‘a reliable and unproblematic account of pedagogy’. I think he is right to say that Ofsted reports are problematic. In the book I do in fact discuss the strengths and weaknesses of Ofsted reports, conceding that,
The one flaw with these reports is that Ofsted inspections are pre-announced. This means that teachers can, and do, put on a show for Ofsted. It has long been a complaint of many teachers that the kind of lesson Ofsted grade as outstanding is simply not possible to repeat consistently. So, when I give an example of an outstanding lesson from an Ofsted report, I do not mean to suggest that lessons exactly like this are going on all the time in every lesson. However, given the power we have seen Ofsted have, given the cottage industry of Ofsted preparation and given the fact that most schools will organise their own internal observation systems around Ofsted criteria, it is still fair to say that if Ofsted demand a certain type of lesson, that matters.
Also, I am not just using Ofsted reports as evidence for how teachers teach. I am also using them as evidence for how teachers are told to teach.
So, Ofsted’s inspection reports and subject reports are a fairly reliable guide to what actually happens in schools and a very reliable guide to what teachers are told to do.
Given that I am trying to prove what the dominant views are in the English education system, finding out how teachers are told to teach is just as important as finding out how they actually do teach.
Finding completely reliable and valid evidence of how teachers in England teach is always going to be difficult. Even a well-funded research study would run into methodological problems. Any large-scale observation programme, however low-stakes, would face the Hawthorne effect. However, whilst finding completely reliable and valid evidence is hard, I still think the Ofsted reports, backed up by the other sources listed above, offer a fairly reliable and valid picture.
Also, finding reliable and valid evidence of how teachers in England are told to teach is nothing like as methodologically difficult. Essentially, it is as straightforward as looking at what Ofsted and the government tell teachers. I would argue that even if teachers completely and utterly ignored this advice, the advice would still tell us something important about the education establishment. And in any case, there is good evidence that teachers and schools do not and cannot ignore Ofsted. Since I first published the book, I think events have reconfirmed the central importance of Ofsted and their judgments in the education system.
I am always interested in discovering new sources of evidence about classroom practice. If people think there are other important sources I’ve missed, I would really like to know. In Tom Sherrington’s case, he seems to be suggesting that his own personal experience is a more valid and reliable source of evidence than the list I’ve outlined above. In Michael Fordham’s case, he doesn’t put forward any alternative. If anyone wants to suggest any others in the comments thread, I’d be glad to hear them. (Two very interesting sources which in the end I didn’t use were the PISA and TIMSS teacher, school and student surveys. These are really fascinating and contain a lot of interesting data. Liz Truss and Laura McInerney, amongst others, have used these surveys to make interesting points. But they didn’t have the level of detail about classroom practice that I was looking for. TIMSS have done video surveys of some classrooms, but unfortunately not, so far as I know, of English classrooms.)
In any case, as I will go on to discuss in my next post, a lot of the arguments about the evidence base for my first claim are actually just a rather illogical smokescreen for people who are really interested in challenging my second claim.
Over Christmas I read Nate Silver’s excellent book The Signal and the Noise. Silver runs the FiveThirtyEight politics blog and became famous for his uncannily accurate predictions of US elections. Before predicting elections, he predicted the success of baseball players and teams. Before that, he made money playing online poker. His book is a distillation of what he has learned in these various pursuits about the art and science of prediction. As well as discussing politics, baseball and poker, he also talks about chess, weather, climate change, earthquakes and terrorist attacks.
He doesn’t discuss education, but there is one particular aspect of his book which I think has big implications for education research: that of the role of statistical analysis. The basic thesis of Silver’s book is that we can’t just predict using statistics alone. We need a theory.
This was initially quite surprising for me, as I came to the book with the knowledge that Silver was famous for his sophisticated statistical analyses and wonkish approach to problems. His successful political predictions became all the more famous because he was an outsider, not from a political background but a statistical one. I believe that in the early days of his blog, seasoned Beltway observers dismissed his predictions on this ground. I therefore was expecting the book to be an endorsement of detached statistical methods and an attack on overly emotive predictions by biased ‘experts’ too immersed in their field to analyse it objectively.
And in some ways, it is that book. Silver does criticise political pundits for their laughably inaccurate predictions, and notes rather drily that one of the reasons he was able to make such a splash with his political predictions is that the bar was set so low. However, in other fields, he is much less critical of established experts. Notably, in the chapter on baseball he goes out of his way to show that there was much less of a difference than you might think between the new stats-based analysts and the old-style player scouts. Silver is one of the new-style baseball analysts of the kind profiled in the book and film Moneyball. The stereotype is that the grizzled old baseball scouts were caught unawares by a bunch of laptop-wielding geeks with an Excel model. Silver has a bit of fun with this stereotype, but basically declines to endorse it. He argues that the traditional scouts were always aware of stats, and that the modern ‘sabermetricians’ understood the nuances of baseball. So whilst he is willing to criticise traditional political experts, he is less willing to criticise traditional baseball experts.
And on the question of whether detached statistical analyses on their own will suffice for prediction, Silver is clear that they cannot. Indeed, he argues that, paradoxically, the more data we uncover, the harder it may be to make accurate predictions.
This is why our predictions may be more prone to failure in the era of Big Data. As there is an exponential increase in the amount of available information, there is likewise an exponential increase in the number of hypotheses to investigate. For instance, the US government now publishes data on about 45,000 economic statistics. If you want to test for relationships between all combinations of two pairs of these statistics – is there a causal relationship between the bank prime loan rate and the unemployment rate in Alabama? – that gives you literally one billion hypotheses to test. But the number of meaningful relationships in the data – those that speak to causality rather than correlation and testify to how the world really works – is orders of magnitude smaller. Nor is it likely to be increasing at nearly so fast a rate as the information itself; there isn’t any more truth in the world than there was before the Internet or printing press. [Amen to that – click here to read chapter 3 of my book, which is about just this point].
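As a quick back-of-the-envelope check of Silver’s figure (my own sketch, not from the book), the number of distinct pairs you can form from 45,000 statistics really is about a billion:

```python
from math import comb

n_stats = 45_000  # economic statistics published by the US government, per Silver

# Distinct pairs of statistics you could test for a relationship:
# C(45000, 2) = 45000 * 44999 / 2
n_pairs = comb(n_stats, 2)

print(f"{n_pairs:,}")  # 1,012,477,500 -- "literally one billion hypotheses"
```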
In fact, Silver directly takes on one of the more extreme predictions of futurologists – Chris Anderson’s prediction that Big Data would make the scientific method obsolete. Silver calls Anderson’s view ‘badly mistaken’, because ‘the numbers have no way of speaking for themselves. We speak for them. We imbue them with meaning.’
He titles another section of his book ‘data is useless without context’. If you only ever seek to find statistical significance, and never bother to think about the plausibility of your finding, then in a world of Big Data you will end up with a lot of odd predictions.
Thus, you will see apparently serious papers published on how toads can predict earthquakes, or how big-box stores like Target beget racial hate groups – papers which apply frequentist tests to produce “statistically significant” (but manifestly ridiculous) findings.
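To see how easily pure noise yields ‘significant’ findings, here is a toy simulation (my own illustration, not an example from Silver). It generates series that are random by construction, tests every pair for correlation, and still ‘discovers’ hundreds of relationships:

```python
import random

random.seed(0)

# 200 series of pure random noise, 30 observations each.
# By construction, there are NO real relationships in this data.
n_series, n_obs = 200, 30
data = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_series)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# |r| > 0.361 is roughly the two-tailed p < 0.05 threshold for n = 30.
n_tests = n_series * (n_series - 1) // 2
hits = sum(1 for i in range(n_series) for j in range(i + 1, n_series)
           if abs(pearson(data[i], data[j])) > 0.361)

print(f"{hits} 'significant' correlations out of {n_tests:,} tests")
# Roughly 5% of the 19,900 tests pass -- about a thousand 'findings', all noise.
```

Without a prior theory about which relationships are plausible, there is no way to tell these ‘findings’ from real ones.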
To return to education, I think the worry is that we introduce tests of statistical significance without a good working theory of how learning happens. Without this theoretical understanding, we are more likely to conduct meaningless tests, mistake correlation for causation and confuse statistical significance with causal significance. This is something that E.D. Hirsch has written an absolutely brilliant article about, and which I have blogged about before here.
To recap: in his article, Hirsch takes as his test case class sizes, perhaps one of the most popular issues in education, and one of the best researched. He notes that a methodologically sound study of class sizes in Tennessee showed that reducing class sizes had a positive impact on achievement. And yet, when California rolled out a hugely expensive programme to reduce class sizes, it had little impact. Hirsch’s point is that the Tennessee study, whilst methodologically robust, did not probe the root causes of its statistically significant finding. This is exactly the kind of thing Silver warns against. In his words, ‘statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes.’ That deeper thinking about root causes was absent in the case of the class size study.
Hirsch notes that we do have a strong theory from cognitive science about how pupils learn. We can use this theory to guide our teaching. He argues that we can use the first principles of cognitive science to derive certain ‘middle axioms’ or ‘reliable general principles’ that can guide our day-to-day teaching. Here is his list of reliable general principles (in the article he discusses each at length).
• Prior knowledge as a prerequisite to effective learning.
• The right mix of generalization and example.
• Attention determines learning.
• Rehearsal (repetition) is usually necessary for retention.
• Automaticity (through rehearsal) is essential to higher skills.
• Implicit instruction of beginners is usually less effective.
It seems to me this is an excellent and easily accessible summary of what we know from cognitive science. If we used these as a basis for devising RCTs and as a starting point for discussing the findings we get from them, I think we would be doing well.
These middle axioms would also make an excellent basis for teacher training or a course of CPD, but as far as I know they have not been used in this way. This brings me to one final point about Hirsch. The article that these ‘middle axioms’ are taken from is a small masterpiece. In just a few thousand words, Hirsch combines his mastery of a range of different intellectual fields – educational history & theory, cognitive psychology, the scientific method – and produces some genuine theoretical and practical insights. There are many good people working in education, but there are none I have read who are capable of this kind of scholarship. All of us in education are lucky to be able to profit from his insights.
Andrew Old frequently tells me that the way to get lots of people visiting your blog is to mention Ofsted in the title, so here goes. If you are an Ofsted-watcher, you might be interested in my book, particularly in the appendix. Although my book won’t be published until March 5th, you can still access the online appendix for free here. The appendix consists of lessons from Ofsted reports. I went through Ofsted subject reports dating from June 2010 to May 2012 and extracted all the descriptions of lessons from them. In total there are 228 descriptions of lessons, curriculum units and schemes of work. This was a fairly tedious piece of work, and if it’s of use to anyone else, I’d be very pleased.
The main thesis of my book is that child-centred education is the orthodoxy in English schools. The Ofsted lesson descriptions are one of the main sources of evidence for this claim, and are also one of the main ways in which such an orthodoxy is maintained, and indeed enforced. You can read what I have written about some of the lessons Ofsted praise here, and here are some links to summaries of each chapter of my book.
This is part three of my summary of Daniel Koretz’s book Measuring Up: What Educational Testing Really Tells Us. Part one, How useful are tests?, is here and part two, Validity and reliability, is here.

In my last post I spoke about how tests are only ever a proxy for what we want to measure. Koretz goes into great detail on this point and comes up with a brilliant analogy to help you understand it. Tests are about making a measurement, and generally, tests are trying to measure something huge. The technical term for what we are trying to measure is the domain. The domain that tests are trying to measure is the extent of a pupil’s knowledge and skills and their ability to apply these skills in the real world in the future. The domain is vast, and normally we have just a two or three hour exam with which to measure it.

It’s for this reason that you often hear critics of tests ask, ‘how can you measure everything a pupil can do in just one two-hour exam?’ And it is for this reason, of course, that tests are so daunting – I can vividly remember before one of my A-levels thinking how odd it was that all the work I had done over the last two years and all of the life I was to lead over the next three were going to be largely determined by what I did in one three-hour exam. Often, critics move from noting this point to dismissing tests altogether – as Peter Tait seems to do here, at the end of a recent article in the Telegraph, where he quotes a famous A.S. Neill line about the limitations of tests.
If we have to have an exam at eleven, let us make it one for humour, sincerity, imagination, character – and where is the examiner who could test such qualities?
It is correct to note that tests can never measure an entire domain. But it is not correct to dismiss tests on this basis. Tests can only ever sample the domain they measure. But thanks to statistical techniques and the skill of test-setters, this sample can provide a fairly reliable proxy for the entire domain. Koretz is also keen to point out, contra Neill, that exams can measure things that are of value: ‘the evidence shows that standardized tests can measure a great deal that is of value.’

Koretz’s analogy here is with opinion polling. Opinion polls are trying to measure vast domains. In Britain, opinion pollsters are trying to work out the voting intentions of 40 million voters. They do so on the basis of a sample of 1,000 voters. On the face of it, this seems ridiculous. How on earth can you poll 1,000 people out of a population of 40 million and hope to get any kind of accurate measure of the 40 million? But of course, you can. Opinion pollsters use statistical techniques to ensure that their sample is representative, and as a result, more often than not they are right. Of course, they aren’t always right, just as exam results aren’t always right. Measuring a sample is certainly less accurate than measuring the entire domain. But done well, opinion polls and exams can both be extremely accurate. And where there is inaccuracy or uncertainty, we have methods of measuring those too.

On the face of it, it seems absurd that 1,000 people can tell us about 40 million, and that two hours can tell us about a person’s skills and knowledge. But if the poll or test is designed carefully enough, then this small sample is enough to tell us quite a lot. Of course, careful design in both cases involves a lot of complexity. Before the 2012 US election, there were lots of arguments about the correct way of polling and about Nate Silver’s particularly uncanny methods of prediction. And the recent PISA tests came in for criticism over their statistical complexities. Koretz’s argument is that these complexities are important and can’t just be dismissed.
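The basic arithmetic behind the 1,000-voter poll is simple, and, counter-intuitively, the size of the electorate barely enters into it. Here is a sketch using the standard textbook formula (real pollsters layer weighting and stratification on top of this, which is where the complexity comes in):

```python
from math import sqrt

n = 1_000  # sample size
p = 0.5    # assumed vote share; a 50/50 split is the worst case for error
z = 1.96   # multiplier for a 95% confidence interval

# Margin of error for a simple random sample. Note that the population size
# (40 million voters) appears nowhere: once the population is large,
# only the sample size matters.
moe = z * sqrt(p * (1 - p) / n)

print(f"margin of error: ±{moe:.1%}")  # ±3.1%
```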
Many people simply dismiss these complexities, treating them as unimportant precisely because they seem technical and esoteric…This proclivity to associate the arcane with the unimportant is both ludicrous and pernicious. When we are ill, most of us rely on medical treatments that reflect complex, esoteric knowledge of all manner of physiological and biomedical processes that very few of us understand well. Yet few of us would tell our doctors that their knowledge, or that of the biomedical researchers who designed the drugs we take, can’t possibly be important because to our uninformed ears it is arcane.
In the case of opinion polls, it would be prohibitively expensive and impractical to survey the entire electorate every week or so. However, eventually the entire electorate is measured, in the form of a general election. In the case of exams, it is essentially impossible to measure the entire domain.
If we wanted to know whether schools successfully imparted the skills and dispositions needed to use algebra successfully in later work, we would go observe students later in life to see whether they used algebra when appropriate and whether they were successful in their applications of it.
There are clearly all kinds of practical problems with this. You’d only get a measure years after pupils had finished school, it would be hard to standardise such tests, and it would be hugely expensive. And some things you might want to measure would be hard to observe. The result is that we need standardized tests which elicit a certain behaviour and which are the same for everyone. These tests are proxies or samples for that far wider domain which is essentially impossible to measure on its own.

So, Koretz’s point is that tests are only ever a sample of the wider domain. There is no final ‘general election’ which will give us the actual measure of the domain. We can only ever sample from the domain, and then use that sample to make an inference about the wider domain. We can never measure the domain in the way we can measure a table or weigh some ingredients. However, thanks to the power of statistics and the accumulated research we have about exams, these samples can allow us to make very valid inferences about the domain.

But what happens if, for some reason, the sample stops being a good proxy for the wider domain? This is of course why cheating is such a problem for test-setters. If a pupil can get hold of the test questions in advance and work out the answers, then they can score a top mark on the test without having the mastery of the domain that a top mark is meant to imply. This isn’t just true of educational assessment. Koretz gives lots of real examples of situations where knowing the sample that is used for making inferences has led to those inferences being compromised. For example, the US postal service uses a random sample of 1,000 addresses to check the speed of postal deliveries. Some workers found out the sample addresses and made sure that those addresses received a very speedy delivery. Those addresses did indeed receive a very speedy delivery, but the inference they allowed you to make about the entire domain was compromised.

So far, I guess a lot of these implications are fairly obvious. We all know that if someone gets hold of a test paper in advance, their scores on it are no longer valid. And we are all aware that tests can’t cover everything – that’s why they need to be secret, so that you end up learning and revising everything because you don’t know just what will turn up on the exam.

However, there are two particular ways in which I think Koretz goes even further than this, and in so doing makes an argument that challenges a lot of very common teaching practice. First, the point that Koretz is making with the above analogies and examples is that the domain is vast. It isn’t just that the domain is bigger than the test – that much, as I have said, is obvious. It’s that the domain is even bigger than the syllabus. In fact, the domain is even bigger than the school curriculum. Second, the point Koretz is making with the postal service example is that if you teach to the test, then a pupil may well genuinely improve on those test items. But the point of a test score is not actually to tell you how well pupils have done on those particular items. The point of a test score is to allow you to make an inference about the wider domain. And the implication of this is that it isn’t just outright cheating which compromises the validity of the inferences we make from tests. Certain kinds of test preparation compromise validity too.

Here is the logic: the test is a sample from the domain. The syllabus is also a sample from the domain. In order for the test to provide a valid inference about how a pupil will perform on the entire domain, teaching must be geared towards the domain. If teaching is geared towards the test, that compromises the result. But even if teaching is geared towards the syllabus, that can compromise the result too.
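Before turning to Koretz’s examples, here is a toy simulation of this logic (my own sketch, with invented numbers): a pupil taught only the sampled items outscores a pupil with far greater mastery of the domain, and the score stops supporting any inference about the domain.

```python
import random

random.seed(1)

DOMAIN = list(range(100))         # pretend the domain comprises 100 topics
TEST = random.sample(DOMAIN, 10)  # the exam can only sample 10 of them

def test_score(mastered):
    """Fraction of the sampled test items the pupil has mastered."""
    return sum(topic in mastered for topic in TEST) / len(TEST)

# Pupil A is taught to the domain: masters a random 60 of the 100 topics.
pupil_a = set(random.sample(DOMAIN, 60))

# Pupil B is taught to the test: masters exactly the 10 sampled topics.
pupil_b = set(TEST)

print(f"A: test score {test_score(pupil_a):.0%}, domain mastery {len(pupil_a)}%")
print(f"B: test score {test_score(pupil_b):.0%}, domain mastery {len(pupil_b)}%")
# B scores 100% while mastering a tenth of the domain; A's score (around 60%)
# tracks A's genuine mastery. B's mark licenses no inference about the domain.
```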
Koretz gives some examples of how teaching to the test and syllabus, whilst not cheating, nevertheless results in inflated exam scores. He ranks activities in the following order:

1. Working more effectively
2. Teaching more
3. Working harder
4. Reallocation
5. Alignment
6. Coaching
7. Cheating

Koretz says that 1–3 result in genuine gains, 7 always results in false gains, and 4–6 can result in either genuine or false gains depending on how they are used. Reallocating time generates false gains if you are taking time away from things that are also an important part of the domain. Alignment is when you match the teaching to the test syllabus. ‘Coaching refers to focussing instruction on small details of the test, many of which have no substantive meaning.’ I think all three of these tactics are used in the most damaging ways in English schools.

When it comes to coaching, I have often noticed how pupils who cannot tell you one date from history are able to tell you the number of marks available for each question on a history exam paper. This is perhaps one reason why memorisation gets such a bad name. Memorisation of the right things – for example, times tables and verb tables – is extraordinarily valuable, but memorisation of the wrong things – for example, marks-to-minutes ratios and exam cheats and hints – is clearly not.

In terms of alignment and reallocation, here are some examples from my own experience. The syllabus cannot tell you everything that a pupil needs to know in order to do well on the exam, so aligning teaching to the test or the syllabus will result in false gains. As I’ve argued in this post, the syllabus is not the curriculum, and nor should it be the curriculum. This is particularly the case in exams taken in later years. For example, to do well on a history GCSE paper you need to be able to write well. But the syllabus provides very little guidance or detail on writing well – and nor should it.

Another example: when I studied A-level history, one of the modules was the Russian Revolution. The syllabus made no mention of medieval absolutism or the difference in land distribution between early medieval England and Russia. Yet the very first lesson I had on Russian history was on these topics, and these topics were enormously helpful in getting me to understand the part of Russian history that was actually in the syllabus. I think that if that teacher’s scheme of work were analysed by a headteacher or senior leader keen to improve grades, those introductory lessons on early medieval Europe would be the first to be ‘reallocated’ or ‘aligned’ out of existence. I worry that even without a headteacher or senior leader getting involved, the pressure to do well on the exam might lead the teacher to make those cuts herself. And I worry that making those cuts in order to focus on ‘exam skills’ might even improve results on the exam even as it reduced the pupil’s wider understanding of the domain – ie, even as it reduced the validity of the inference we can make from the exam result.

One more example: when I studied A-level history, one of the modules was on the Vietnam War.
Essentially, we worked out from looking at past papers and the syllabus that there were basically six essay questions you could get in the final exam. There were two essays each year, and you could make a fair bet that they wouldn’t repeat the two from the previous year. There was one question that hadn’t been asked at all, so you felt confident that it would come up. That meant there were three other essay questions which you felt fairly sure would come up. Now, this kind of analysis was something we did at the very end of the course, after we’d learnt all the material. But it would have been entirely possible for the teacher to have done that analysis at the start and then just taught us the answers to those four essay questions. In fact, it would have been entirely possible for her to have given us model answers to those four questions and told us to go away and learn them. We could have done that and got very, very good grades on the exam without cheating. It would also have taken a lot less time than actually learning about the Vietnam War. The gains from this approach would have been particularly valuable for weaker students.

I think these kinds of tactics go on in schools now, and I also think that in some cases they are even recommended and praised. I think one of the reasons why such tactics are praised is that they involve the teacher and the student working very hard. It’s not easy to write model answers, and it isn’t easy to memorise them. Clearly, such a tactic is not cheating. But it does completely compromise the validity of the test.

Koretz’s argument is that because tests can only ever be samples of the domain, there is no possibility of an optimal test. There is no way we can measure the whole domain. Hence, all tests will be to some extent imperfect, and if a teacher tries hard to game them, they will be able to. I accept this. This is one way in which Koretz has genuinely changed my mind. I used to think that the problem of excessive test prep was one of badly designed tests. That is, I used to think that it was OK to teach to the test if the test was worth teaching to. Koretz takes on this exact point and shows that it is false. He’s convinced me. Even the best test in the world is only a sample, and samples can be gamed.

However, there is one point on which I would depart from him. Whilst I accept that all tests are to some extent imperfect, there are clearly varying degrees of imperfection. Some of the tests Koretz describes are so flawed that they are crying out to be gamed. For example, he describes one teacher who didn’t teach her pupils about irregular polygons because they never came up on the test: only regular polygons appeared. Koretz is highly critical of the teacher for this. I see his point, but I think we can also blame the test-setter here; including irregular polygons surely can’t be that much of a burden. I accept that as a profession, we have to move away from teaching to the test, and that policymakers also have to stop encouraging it. But I think that assessment experts and test designers like Koretz must also design exams that are difficult to game.

A couple of years ago, I was at a conference where a deputy head of a successful school presented the amazing exam results his school had achieved over the past few years, and outlined some of the ways he’d achieved them.
In the question and answer session afterwards, I asked how he could be sure that these results were down to improved teaching and learning and not just teaching to the test. His reply was that all of his teachers and students were working harder, and that it was so much easier nowadays for pupils and teachers to download past papers from the internet and work on their areas of weakness. I wasn’t totally convinced by this argument, but at the time it did seem to be making a fairly good point, which I’ve since heard many other people make: ‘pupils nowadays have better resources which they can access more quickly than in the past and this contributes to them learning more and becoming smarter.’

But having read Koretz, this argument does not stand up at all. Teaching which focusses on past papers and test prep is not teaching to the domain. It’s teaching to the sample. The improvements it generates are not likely to be genuine. What is also particularly worrying about this example is that even when asked to identify a form of teaching and learning which was not test prep, this deputy referred to methods of test prep. Teaching to the test has become teaching and learning. It is hard for many people to have a model of improved teaching and learning which is not teaching to the test.

It is for this reason that I am so keen on schools using a rigorous and detailed curriculum. The curriculum is not as wide as the domain either, but it is much wider than either the syllabus or the exam. Unlike the syllabus and exam, it is designed with teaching and learning in mind, not assessment. As Tim Oates has shown here, and as I have tried to argue here, if you have a vague curriculum then the result is not ‘teacher freedom’. The result is that the syllabus and/or the test become the curriculum, with hugely damaging consequences.