I think technology has great potential to transform education, but I am frustrated by how ineffective so much educational technology really is. For more on this, see my Guardian article here. Recently, I read a fascinating book about how big data could transform education, which described a lot of what I think are the more effective uses of education technology. It’s called Learning with Big Data, by Kenneth Cukier and Viktor Mayer-Schönberger, and it gives some really good examples of how data analysis of in-class formative assessment could improve teaching.
The best part of the book is where it follows the work of Professor Ng, a computer scientist at Stanford who is a co-founder of Coursera.
By tracking homework and tests done on a computer or tablet, he can identify specific areas where a student needs extra help. He can parse the data across the entire class to see how the whole cohort is learning, and adjust his lessons accordingly. He can even compare that information with other classes from other years, to determine what is most effective…For example, in tracking the sequence of video lessons that students see, a puzzling anomaly surfaced. A large fraction of students would progress in order, but after a few weeks of class, around lesson 7, they’d return to lesson 3. Why?
He investigated a bit further and saw that lesson 7 asked students to write a formula in linear algebra. Lesson 3 was a refresher class on math. Clearly a lot of students weren’t confident in their math skills. So Professor Ng knew to modify his class so it could offer more math review at precisely those points when students tend to get discouraged— points that the data alerted him to.
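The pattern Professor Ng spotted can be surfaced with very simple analysis. Here is a minimal sketch of the idea (hypothetical viewing logs and function names, not Coursera’s actual pipeline): scan each student’s lesson-viewing sequence for backward jumps, then tally which (from, to) pairs are most common across the cohort.

```python
from collections import Counter

def backward_jumps(sequence):
    """Yield (from_lesson, to_lesson) pairs where a student
    moved back to an earlier lesson."""
    for prev, nxt in zip(sequence, sequence[1:]):
        if nxt < prev:
            yield (prev, nxt)

def most_common_returns(sequences, n=3):
    """Tally backward jumps across all students' viewing
    sequences and return the n most frequent."""
    counts = Counter()
    for seq in sequences:
        counts.update(backward_jumps(seq))
    return counts.most_common(n)

# Hypothetical viewing logs: most students progress in order,
# but several double back from lesson 7 to lesson 3.
logs = [
    [1, 2, 3, 4, 5, 6, 7, 3, 7, 8],
    [1, 2, 3, 4, 5, 6, 7, 3, 4, 7],
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
]
print(most_common_returns(logs))  # (7, 3) tops the list
```

A teacher looking at that output would know exactly which lesson to bolster with extra review, which is the point of the anecdote.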
This, I think, is very powerful stuff. It really does have the potential to dramatically improve teaching and learning and to help identify what the most effective teaching methods are. There is more on this type of activity in the book, as well as interviews with the founders of Duolingo and Khan Academy, two of my favourite educational apps.
What I liked less about that book is that at times, it fell into lazy and entirely erroneous clichés of the ‘Shift Happens’ sort. The worst example was a throwaway comment in an otherwise excellent discussion of how Professor Ng uses quizzes.
He interlaces the video classes with pop quizzes. It’s not to see if his charges are paying attention; such archaic forms of classroom discipline don’t concern him. Instead, he wants to see if they’re comprehending the material— and if they’re getting stuck, exactly where, for each person individually.
So, according to this, checking to see if pupils are paying attention is an archaic form of classroom discipline which great educators should not concern themselves with. Really? Not only does this just feel wrong, it also contradicts lots of the modern (and very un-archaic) research on the topic. Paying attention to things is how we remember them, and remembering things is how we learn. If we don’t pay attention, we don’t learn. In fact, research on the importance of paying attention is probably the most practically useful research for teachers. Dan Willingham goes so far as to say that ‘the most general and useful idea that cognitive psychology can offer teachers’ is to ‘review each lesson plan in terms of what the student is likely to think about’. Far from being an archaic form of classroom discipline, making sure that students pay attention is absolutely vital. One of the main reasons why the pop quizzes described above are so powerful is precisely because they help focus attention on the right things.
To be fair to this book, it is relatively free of this kind of error – certainly much freer than most of the books and articles I read about ed tech, where mere possession of an iPad will transform your intellectual capacities. This book does, for example, include the following important and salutary warning:
Learning will continue to require concentration, dedication, and energy.
But still, whilst the error about paying attention may be a brief one, it is there and it is important. Before we even start to think about how to use technology in the classroom, we need a clear understanding of what causes learning to happen. Only then can we start to think about how technology can enhance or improve that process. If we start with magical thinking about education, then we will end up applying technology in unhelpful ways, and the technology itself will get a bad name amongst many educators. We can see this happening at the moment – because so many uses of technology are based on misapprehensions and fail badly, many teachers become sceptical about all uses of technology. So, whilst I remain a fan of edtech, books like this actually convince me that whilst it’s important, it’s a second-tier issue: the most important issue is to establish clearly what causes learning.
Nicky Morgan’s comments today have started a debate over whether pupils really do need to learn their times tables by the end of primary. I think they should and I’m not going to rehearse the arguments here.
What I do want to do is to ask what other maths facts it’s useful for pupils to know by heart? The new national curriculum specifically says that pupils should memorise the number bonds up to 20 and the times tables up to 12, but are there other facts it is worthwhile memorising?
I’m going to start by saying fraction / decimal equivalences. I’m not talking about ones like 0.5, 0.25, etc, which are obviously helpful but which most pupils will just know (I hope!). I also think that memorising the decimal equivalences of less common fractions is useful: in particular, fractions with denominators of 6, 7, 8, 12 and 15. Statistics in newspaper articles and everyday life are very often reported as fractions in these terms – for example, one in seven adults has a subscription to Netflix, or one in every 12 pounds is spent at Tesco (I made those up by the way). Being able to flick instantly back and forward from that to the percentage is really useful. The reverse is also useful. A lot of data are reported as precise percentages, and being able to flick mentally from this to a fraction often helps with understanding. If someone tells you that Andy Carroll wins 84% of aerial duels, it can help to think that this means he loses about one in every six of them. (Also a made-up stat).
It’s also a classic example of why it isn’t enough to know how to work it out. You might know how to convert a fraction to a decimal, but by the time you’d worked it out, you’d have forgotten what the context of the statistic was. The person who did know that 1/12 is 8.3% can move on to considering whether Tesco’s dominant market share is a cause for concern, estimating the share of other big chains and wondering what that might look like as an absolute sum of money. As ever, knowing stuff off by heart enables critical thinking rather than stifling it.
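For what it’s worth, the equivalences in question are easy to tabulate yourself. A quick sketch (the Netflix and Tesco figures above were invented, so this is just the arithmetic, not a data source):

```python
from fractions import Fraction

# Percentage equivalents of the 'one in n' fractions that
# crop up in news reporting.
for n in (6, 7, 8, 12, 15):
    print(f"1/{n} = {100 / n:.1f}%")  # e.g. 1/12 = 8.3%

# And the reverse: reading a precise percentage as a rough
# fraction. Winning 84% of duels means losing 16%, i.e.
# roughly one in every six.
lost = 1 - 0.84
print(Fraction(1, round(1 / lost)))
```

Of course, the whole argument of the post is that the fluent person does this in their head, not in an interpreter; the code is just a way of generating the list to memorise.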
Some other suggestions: the 75 times tables. The person who suggested this one did so as a technique for winning the numbers game on Countdown. I wouldn’t recommend that we reorganise education around winning TV quiz shows (god forbid) but since I took this advice and learnt my 75 times tables, I have found them useful in more ways than expected. I suspect this is the case with a lot of these things – it’s only once you learn them that you fully appreciate how useful they are. A bit of the Dunning-Kruger effect, perhaps.
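For anyone tempted to follow the Countdown advice, the table itself is short, and generating it shows why it is so learnable (a trivial sketch):

```python
# The 75 times table up to 12. Every product ends in 00, 25,
# 50 or 75 (each product is a multiple of 25), which is what
# makes the table quick to learn and to use.
for i in range(1, 13):
    print(f"75 x {i} = {75 * i}")
```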
Any maths teachers out there, please leave your suggestions in the comments. Square numbers? Other times tables?
In modern discussions of education, the value of education is quite often defined in economic terms. We saw this very recently with Nicky Morgan’s comment that we could link subjects to later earnings to determine their ‘true worth’. The idea that public spending on education is justified by its impact on GDP is shared by many across the political spectrum.
There are huge problems with this consensus. First, as Alison Wolf showed in her brilliant book Does Education Matter?, the link between more years of education and greater GDP is not as clear-cut as it might seem. At a basic level, universal literacy and numeracy are important for the economy, and at the elite level, expertise in science and technology drives innovation. But beyond that, it is far more complex than some glib statements suggest. In Wolf’s words:
‘We know that basic literacy and numeracy matter a great deal, and that the labour market rewards mathematical skills. We also know that technical progress depends on the best scientific and technological research; but that there is no evidence that education spills over to raise productivity in a general, economy-wide way.’
Not only that, but as Wolf remarks at the very end of her book, the idea that education must always be justified and defended on purely economic terms is a very recent one.
‘Our preoccupation with education as an engine of growth has not only narrowed the way we think about social policy. It has also narrowed – dismally and progressively – our vision of education itself. This book reflects that narrowing…The contribution of education to economic life is an important subject, and an interesting subject, and it can actually be investigated empirically. But it is only one aspect of education, not the entirety, and it does not deserve the overwhelming emphasis which it now enjoys. Reading modern political speeches and official reports and then setting them alongside those of twenty-five, let alone fifty or a hundred, years ago is a revelation. Contemporary writers may pay a sentence or two of lip-service to the other objectives of education before passing on to their real concern with economic growth. Our recent forebears, living in significantly poorer times, were occupied above all with the cultural, moral and intellectual purposes of education. We impoverish ourselves by our indifference to these…The history of public education in any modern democratic state concerns issues of identity and citizenship quite as much as the instilling of more or less utilitarian skills…The role that schools play in creating citizens, and in passing on to new generations both an understanding of their own history and society and particular moral, intellectual or religious values, should concern any modern state with a public education system.’
Modern liberal-democratic societies depend on a well-informed and well-educated citizenry. They depend on knowledge: on people knowing the contours of contemporary debates, the functions of government, the history of civilisation, the difference between the supernatural and the natural, the language and literature of their society, and much more.
As Thomas Jefferson said:
If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be.
And R.H. Tawney:
No one can be fully at home in the world unless, through some acquaintance with literature and art, the history of society and the revelations of science, he has seen enough of the triumphs and tragedies of mankind to realize the heights to which human nature can rise and the depths to which it can sink.
Education may be important for prosperity, but it is vital for democracy.
Today the Sutton Trust and the University of Durham have published a fascinating new report called What Makes Great Teaching? It sets out to answer that title question, as well as looking at ways we can measure great teaching, and how that could be used to promote better learning. Here is my short summary of some key points from the report.
1. What is effective teaching? This report is very honest about the fact that we don’t have as clear an idea of what good teaching is as we think we do. I think this is an important point to make. Too often, reports like this one start from the point of assuming that everyone knows what good teaching is, and that the challenge is finding the time/money/will/methodology to implement changes. This report is saying that actually, there are a lot of misconceptions about what good teaching is, and as such, reform efforts could end up doing more harm than good. We need to think more clearly and critically about what good teaching is – and this report does that. As well as listing what effective teaching practices are, it also lists what ineffective practices are. This list has already received some media attention (including a Guardian article with a bit from me), as it says that some popular practices such as learning styles and discovery learning are not backed up by evidence. The report draws its evidence from a wide range of sources, including knowledge from cognitive psychology. It cites Dan Willingham quite a lot, and quotes his wonderful line that memory is the residue of thought. As regular readers will know, I think cognitive psychology has a lot to offer education, so it is great to see it getting so much publicity in this report.
2. How can we measure good teaching? According to this report, the focus should always be on student outcomes (not necessarily just academic ones). This can also be a bit of a hard truth. If a group of teachers work really hard at mastering a particular technique or teaching approach, and they do master it and use it in all their lessons, it can be tempting to define this as success. But this report says – no. The focus has to be on student outcomes. Although we can devise proxy measures which can stand in for student outcomes, we need to check back regularly against the student outcomes to see whether those assumptions still hold. The report is also honest about the fact that a lot of the current ways we measure teaching are flawed. That’s why we need to use more than one measure, to check them constantly against each other, and to be very careful about the purposes we put these measurements to. The report suggests that our current measures are probably only suitable for low-stakes purposes, and that they certainly can’t be used for both formative and summative measures at the same time (or ‘fixing’ and ‘firing’ as they call it).
3. How can we improve measurement? Although the report is very cautious about the current state of measurement tools, it offers some useful thoughts about how we could improve this state of affairs. First, school leaders need to be able to understand the strengths and limitations of all these various data sources. According to the report, there is ‘the need for a high level of assessment and data skills among school leaders. The ability to identify and source ‘high-quality’ assessments, to integrate multiple sources of information, applying appropriate weight and caution to each, and to interpret the various measures validly, is a non-trivial demand.’ Also, student assessment needs to be improved. If we always want to be checking the effect of our practices on student outcomes, we need a better way of measuring those outcomes. The report gives this tantalising suggestion: that the profession could create ‘a system of crowd-sourced assessments, peer-reviewed by teachers, calibrated and quality assured using psychometric models, and using a range of item formats’. It would be great to hear more details about this proposal, and perhaps about how CEM or the Sutton Trust could provide the infrastructure and/or training to get such a system off the ground.
One of the authors of the paper is Rob Coe, and I think this report builds on his 2013 Durham Lecture, Improving Education: A Triumph of Hope over Experience. This lecture was also sceptical about a lot of recent attempts to measure and define good teaching, as can be seen in the following two slides from the lecture.
I recommended this lecture to a friend who said something along the lines of ‘yes, this is great – but it’s so depressing! All it says is that we have got everything wrong for the last 20 years and that education research is really hard. Where are the solutions?’ I think this paper offers some of those solutions, and I would recommend it to anyone interested in improving their practice or their school.
Today I was on the Daily Politics soapbox talking about why facts are vital for learning. Click on the image below to see the video on the BBC website.
The short video was filmed at the Ragged School Museum in East London. It is a lovely little museum just round the back of the Mile End Road. The building was one of Doctor Barnardo’s original ragged schools in the late 19th century, set up to educate the poor of the East End. It closed in 1908, and in 1990 it was turned into a museum. As well as some permanent displays, children can take part in an authentic Victorian lesson, taught by the rather formidable lady in the video. My mother and father grew up not far from this school, although I feel I should point out they are not quite old enough to have actually attended it. I also grew up in East London not far from the museum and can remember visiting it as a child. It is definitely worth visiting, or taking a school trip to.
A couple of weeks ago Ofqual published their consultation on new GCSE grades. A lot of the media debate has focussed on the new 1-9 grading structure, but tucked away in the consultation document there is a lot of very interesting information about how examiners make judgments.
I’ve written before on this blog about the difference between norm-referencing and criterion-referencing. Briefly, norm-referencing is when you allocate a fixed percentage of grades each year. (Update – Dylan Wiliam has pointed out in the comments that this is not the correct definition of norm-referencing – see here for his comment). Each year, the top 10% get A grades, next 10% B, etc. It’s a zero-sum game: only a certain number of pupils can get the top grade, and a certain number have to get the lowest grade. This seems intrinsically unfair because however hard an individual works and however highly they achieve, they are not really going to be judged on the merits of their own work but on how it stacks up against those around them. More than x% of pupils might be performing brilliantly, but they can’t be recognised by this system. It seems much fairer to set out what it is you want pupils to know and do in order to achieve a certain grade, and to give them the grade if they meet those criteria. That’s criterion-referencing.
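To make the zero-sum nature concrete, here is a minimal sketch of norm-referenced grading (illustrative cutoffs and cohorts of my own invention, not any exam board’s actual method): grades are fixed shares of the rank order, so the same raw mark can earn different grades in different years.

```python
def norm_reference(scores, shares=((0.10, "A"), (0.10, "B"), (0.80, "C"))):
    """Allocate grades as fixed proportions of the cohort,
    highest scorers first. Shares are illustrative only."""
    ranked = sorted(scores, reverse=True)
    grades, i = {}, 0
    for share, grade in shares:
        n = round(share * len(ranked))
        for score in ranked[i:i + n]:
            grades.setdefault(score, grade)
        i += n
    for score in ranked[i:]:  # mop up any rounding remainder
        grades.setdefault(score, shares[-1][1])
    return grades

# The same raw mark, 70, lands a different grade depending
# purely on how the rest of the cohort performed.
strong_year = [90, 85, 80, 78, 75, 74, 72, 71, 70, 60]
weak_year   = [70, 65, 60, 58, 55, 54, 52, 51, 50, 40]
print(norm_reference(strong_year)[70])  # → 'C'
print(norm_reference(weak_year)[70])    # → 'A'
```

That dependence on the cohort is exactly the unfairness described above – and also, as the rest of the post argues, the source of norm-referencing’s statistical robustness.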
The old O-level allocated fixed percentages of grades, and when it was abolished, the new GCSE was supposed to be criterion-referenced. I say ‘supposed’, because whilst criterion-referencing sounds much fairer and better, in practice it is fiendishly difficult and so ‘pure’ criterion-referencing has never really been implemented. Criteria have to be interpreted in the form of tests and questions, and it is exceptionally hard to create tests, or even questions, of comparable difficulty year after year – even in seemingly ‘objective’ subjects like maths or science.
We are not the only country to have this problem. The Ofqual report references the very interesting example of New Zealand. Their attempt at pure criterion referencing in 2005 led to serious problems. A New Zealand academic wrote this report about it, which includes a number of interesting points.
Taken at face value, criterion-referenced assessment appears to have much to recommend it (the performance demonstrated is a well-specified task not open to interpretation) and norm-referencing very little to recommend it (the level of performance must be gauged from the relative position obtained), nevertheless, there are difficulties that make the introduction of criterion-referenced assessment in areas like reading, mathematics, and so on, much less smooth than this view might lead one to anticipate.
Likewise, in his book Measuring Up (which I reviewed in three parts here, here and here), the American assessment expert Daniel Koretz outlines some of the flaws with criterion-referenced assessments. The basic flaw at the very heart of criterion-referencing may be that we are ill-equipped to make absolute judgments. In the words of Donald Laming, ‘there is no absolute judgment. All judgments are comparisons of one thing with another.’
As a result, our system has never been purely criterion-referenced. Tim Oates says this of the system we use at the moment:
‘In fact, we don’t really have a clear term for the approach that we actually use. ‘Weak-criterion referencing’ has been suggested: judgement about students meeting a standard, mixed with statistical information about what kinds of pupils took the examination.’
Ofqual are proposing to continue with this approach, but to improve it. I support their direction of travel, but I wonder if they couldn’t have gone a bit further – say, for example, actually reintroducing fixed grades.
One argument against fixed allocations of grades is that they don’t allow you to recognise genuine improvement in the system – or indeed genuine decline. If the top x% always get the top grade, you have no idea if standards are improving or declining. However, this argument no longer holds water because Ofqual are proposing to bring in a national reference test:
The performance of the students who take the test will provide a useful additional source of information about the performance of the cohort (rather than individual students) for exam boards awarding new GCSEs. If, overall, students’ performance in the reference test improves on previous years (or indeed declines) this may provide evidence to support changing the proportion of students in the national cohort achieving higher or lower GCSE grades in that year. At present such objective and independent evidence is not available when GCSE awards are made.
I think the reference test is an excellent idea. Ideally, in the long-term it could assume the burden of seeing if overall standards are improving, leaving GCSEs free to measure the performance of individual pupils. In that case, why not have fixed grades for GCSEs? Alan Smithers makes a similar point in the Guardian here.
One reason why Ofqual might not have wanted to reintroduce fixed allocations of grades at the moment is that, despite all the real technical flaws with criterion-referencing which I have outlined above, there is still an element of hostility to norm-referencing amongst many educationalists. In my experience, many people think that norm-referencing is ‘ideological’ – that the only people who advocate it are those who want to force pupils to compete against each other.
Nothing could be further from the truth. Norm-referencing has some basic technical advantages which make it a sensible and pragmatic choice. The Finnish system, for example, which is often seen as being opposed to the ideas of competition and pupil ranking, has a norm-referenced final exam where the ‘number of top grades and failed grades in each exam is approximately 5 percent.’ Not only that, but as the example of New Zealand shows, those countries who have experimented with fully criterion-referenced exams have faced serious problems. If we refuse to acknowledge the genuine strengths of norm-referencing, we risk closing down many promising solutions to assessment problems.
One of the main reasons why people say we need to keep national curriculum levels is because they provide a common language.
I am all in favour of a common language, but levels did not provide this, as I have argued before here. Since I wrote that last post, I have come across this fascinating paper by Peter Pumfrey. It was written nearly twenty years ago, when levels were first introduced. It looks at the results of pupils in the KS1 reading tests. It is summarised by Bonnie Macmillan in Why School Children Can’t Read:
An investigation comparing pupils’ standardised reading scores with their level of attainment on national curriculum tests is starkly illustrative. Children who had been assessed as having attained level 2 (the average expected for their age) on national curriculum tests were found to have reading ages, determined from standardised testing, ranging from 5.7 to 12.9 years. That is, within the group of pupils all categorised as level 2, there was an incredible 7 year range in the actual reading abilities represented. Similarly, those categorised as level 1 were found to have reading ages ranging from 5.7 to 9.6 years.
Even though I was well aware of all the problems with levels, I was still astonished to read this. Not only does the level 2 category include pupils of such differing attainment as to be practically meaningless, it also significantly overlaps with the level 1 category. That doesn’t look to me as though levels are giving us a common and shared understanding.
Although I know of no similar research which has been done more recently, a look at the distribution of levels in the KS2 tests suggests that there is something similar going on. In the KS2 tests, approximately 15% of pupils get a level 3 or below, 50% get a 4 and 35% get a 5. So the number of pupils achieving a level 4 – that is, national expectations – runs from approximately the 16th to the 65th percentile. I suspect if we did a reading age test on all of these pupils, we would find huge variations in their results. Anecdotally, I know of plenty of secondary schools who find that some of their level 4s have difficulty with reading and are placed in their catch-up reading classes. So again, how useful is it, and how much of a ‘common language’ do levels provide, if the level 4 range runs from pupils who are still struggling with reading and writing up to pupils who are confident readers and writers? One of the reasons why so many secondary schools reassess their pupils on entry (CAT4, for example, is used in over 50% of UK secondaries) is because the KS2 SATs do not provide a common language or that much in the way of useful information.
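The percentile arithmetic above is simple to reproduce. A quick sketch, using the approximate distribution quoted in this post rather than official DfE figures: cumulative shares give each level’s percentile band.

```python
# Approximate KS2 level distribution quoted in the post:
# ~15% at level 3 or below, ~50% at level 4, ~35% at level 5.
distribution = [("3 or below", 15), ("4", 50), ("5", 35)]

bands, cumulative = {}, 0
for level, share in distribution:
    # Each level occupies the slice of percentiles between the
    # running total so far and the running total plus its share.
    bands[level] = (cumulative + 1, cumulative + share)
    cumulative += share
    low, high = bands[level]
    print(f"Level {level}: roughly percentiles {low} to {high}")
```

On these figures, level 4 alone covers roughly half the national cohort, which is why treating it as a single data point about a pupil is so misleading.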
These vague bands cause further problems for secondary schools because they are used as the baseline for measuring progress across secondary. Expected progress for all level 4s is a C at GCSE. 84% of pupils in the top third of that level 4 category do go on to achieve a C or above at GCSE. But only 50% of those in the bottom third of the level 4 category do. (These figures are for English; they are similar for Maths). Schools who have a lot of pupils clustered in the bottom part of the level 4 category are being held to very tough attainment targets. Schools with lots of pupils clustered at the top of that level get relatively easy attainment targets. And of course, in practice, schools will not get fair spreads of level 4 pupils. Schools in some areas will take on a disproportionate number of ‘low’ level 4s, whereas other schools will get a disproportionate number of ‘high’ level 4s.
So, in conclusion, national curriculum levels do not provide a common language and this results in many pernicious effects. As for what could provide a common language, I will return to this in my next post.