Category Archives: Uncategorized

Global Education and Skills Forum 2018

Last weekend I spoke at the Global Education and Skills Forum in Dubai, arguing for the motion in the following debate: ‘“I can just Google it” is making us stupid.’ You can see the video here. I’ve put a transcript of my speech below, together with references.

In a letter to a friend, the ancient philosopher Seneca recounted the story of a rich Roman merchant who wanted to appear as though he was a very well-read man. This merchant decided that instead of actually reading books himself, he would hire a team of slaves to do it for him. “He spent an enormous amount of money on slaves: one of them to know Homer by heart, another to know Hesiod, while he assigned one apiece to each of the nine lyric poets.” Then he used these slaves to give his dinner guests nightmares: “He would have these fellows at his elbow so that he could continually be turning to them for quotations from these poets which he might repeat to the company.”[1]

Of course, no one nowadays has slaves to remember things for them. But we do all feel very comfortable with the idea that we can outsource our memories to Google. In my book, Seven Myths about Education, I devoted a chapter to collecting examples of technologists and educationalists telling us that remembering things just isn’t necessary in a world with ubiquitous smartphones.[2]

These people are wrong, and they are dangerously wrong. And it is not just ancient writers like Seneca who tell us they are wrong. There is a whole body of modern scientific literature which makes the same point. Somewhat ironically, a great deal of this research derives from the work of Herbert Simon, one of the pioneers of artificial intelligence and modern computing. What we know from this research about how the brain works is that memory and attention are two vital parts of our intellectual equipment.[3] We also know that memory and attention are under siege from modern technology like never before.[4] Let us consider these two vital components in turn: why do they matter, and why are they under threat from technology?

First, memory. Our memories matter because we need facts stored in long-term memory in order to be able to think. This is because our working memory – what you might think of as consciousness – is extremely limited and can handle only about four to seven new items of information. That isn’t nearly enough to do anything complex like driving a car, or reading a book. But we can cheat working memory’s limitations – not by hiring a bunch of slaves or using Google, but by committing facts to long-term memory. This is why memorising times tables matters. When you solve a complex real-world maths problem, you have to process a lot of information in working memory. If you also have to stop every second to type the times tables into your smartphone, your working memory will quickly be overwhelmed, and you will not be able to solve the problem. You’ll forget what the start of the problem was by the time you get to the end.[5] As one group of researchers has said, long-term memory is the seat of human intellectual skill.[6] What we know influences how we see the world, how we think and how we reason. Intuition and creativity are the product of large, well-memorised bodies of knowledge clashing against each other.[7] We can’t outsource this stuff.

If memory is so important, how do we make memories? The simplest answer is that we remember what we pay attention to – and that brings me to the second thing I want to talk about – attention.[8] If we pay attention to something, we are more likely to remember it. Our attention determines our memories. And nearly all of the major technology companies make their money by harvesting our attention, and selling it to advertisers.[9] These companies have invented increasingly sophisticated methods of grabbing our attention, even if it involves distorting the truth, manufacturing outrage, and exploiting loneliness.[10] In the process, they don’t just distract our attention: they degrade its quality. Think how hard it is to concentrate on a book after spending an hour or so on social media.[11] Recent research shows that even the sight of a switched-off phone makes it harder to focus.[12] Given the vital importance of attention for forming memories, a system that is built on stealing and degrading our attention cannot make us smarter.

At this point, people might typically say, but what about the good uses of technology? What about the Khan Academys, the Duolingos, the Courseras? What about Andrew’s platform Cerego, which uses the science of learning to design educational content that really will stick in long-term memory? And I agree that these kinds of websites are fantastic. They give billions of people access to quality educational content at low or even no cost, which is amazing. We on this side of the house are absolutely not opposed to educational technology. I work for an ed tech company. In my previous jobs as an English teacher I was always experimenting with different methods of online learning. What we are opposed to are misconceptions like the one in the title of this debate: that you can just Google it. Or, as one Google executive said recently, ‘I don’t know why children are learning the quadratic equation. I don’t know why they can’t just ask Google for the answer.’ (See footnote 2.) And in fact, the reason we are so particularly opposed to misconceptions like this one is that they damage good education technology. They make it harder for the really powerful and effective methods of education technology to fulfil their potential, because the really effective education technology is not about outsourcing memory, but about making the process of memorisation as effective, efficient and fun as possible.

Not only that, but good forms of education technology are also being damaged by the tech companies’ insatiable appetite for attention. Online education courses have a phenomenally high drop-out rate. One study from 2014 showed that just 13% of people who enrol on an online course complete it.[13] Why is this? Plenty of reasons have been put forward, but I would like to suggest one important reason: because these courses are delivered online, they are competing with everything else the internet has to offer – the instant social updates, the flash shopping discounts, the cat videos, Donald Trump’s Twitter feed. It isn’t enough to create fantastic educational content for free.[14] In order for it to make people smarter, people have to pay attention to it. And large numbers of them simply aren’t.

Of course one could imagine a world in which technology was used to make us smarter. I would happily sketch for you the outlines of a world where technology did make us smarter.[15] The point is that this is not the world we currently live in. The technology we use prioritises entertainment, outrage, distraction and convenience ahead of learning. By and large, the big money in technology is not going towards helping children to learn their times tables in the most efficient and fun way possible. It is going towards encouraging children to take another selfie, and to forget about the times tables because there’s a robot who will do it for them.

Seneca concluded his story of the Roman merchant with the following moral: “A sound mind can neither be bought nor borrowed.”[16] I would add the following modern update: “A sound mind can neither be bought, nor borrowed, nor outsourced to the cloud.” And until we recognise that truth, Google will continue to make us stupider.

[1] Seneca. Letters from a Stoic. Ed. Robin Campbell. Penguin, 1969. Letter XXVII.

[2] Christodoulou, Daisy. Seven Myths about Education. Routledge, 2014, chapter 4. Seven Myths was published in 2014; plenty of similar claims have been made since then, including, for example, this from Jonathan Rochelle, Google’s director of education apps: “Referring to his own children, he said: ‘I cannot answer for them what they are going to do with the quadratic equation. I don’t know why they are learning it.’ He added, ‘And I don’t know why they can’t ask Google for the answer if the answer is right there.’”

[3] E.g. see Frantz, R. “Herbert Simon. Artificial intelligence as a framework for understanding intuition.” Journal of Economic Psychology 2003; 24: 265–277. Simon also wrote explicitly about education here: Anderson, J.R., Reder, L.M. and Simon, H.A. “Applications and misapplications of cognitive psychology to mathematics education.” Texas Education Review 2000; 1: 29–49. I discuss this paper in my blog post here.

[4] E.g. see Wu, Tim. The Attention Merchants: The Epic Scramble to Get Inside Our Heads. Vintage, 2017; also Teixeira, Thales S. “The rising cost of consumer attention: why you should care, and what you can do about it” (2014). Simon also commented on the economics of attention here: Simon, Herbert A. “Designing organizations for an information-rich world” (1971): 37–72. “In an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.”

[5] Cowan N. “The magical number 4 in short-term memory: A reconsideration of mental storage capacity.” Behavioral and Brain Sciences 2001; 24: 87–114; Cowan N. Working Memory Capacity: Essays in Cognitive Psychology. Hove: Taylor and Francis, 2005. See also Miller G.A. “The magical number seven, plus or minus two: Some limits on our capacity for processing information.” Psychological Review 1956; 63: 81–97; More recently, Professor Daniel Willingham has written this New York Times article about this exact issue.

[6] Sweller J., van Merriënboer J.J.G. and Paas F.G.W.C. Cognitive architecture and instructional design. Educational Psychology Review 1998; 10: 251–296.

[7] Larkin, J., McDermott, J., Simon, D. P., & Simon, H. A. “Expert and novice performance in solving physics problems.” Science, 1980; 208(4450), 1335-1342, p.1335.

[8] Willingham, D.T. Why Don’t Students Like School? San Francisco: Jossey-Bass, 2009, p. 53. William James also discusses attention in chapter 11 of The Principles of Psychology: ‘My experience is what I agree to attend to.’

[9] As Tristan Harris argues, the advertising model which underpins the modern technology economy means that companies ‘have an unbounded interest in getting more of people’s time on a screen’.

[10] See for example this article from the Guardian, which investigates YouTube’s ‘Most Recommended’ algorithm, and this one on how Facebook uses information on users’ emotional states. See also Jean Twenge’s article in the Atlantic and her book iGen: Why Today’s Super-connected Kids are Growing Up Less Rebellious, More Tolerant, Less Happy–and Completely Unprepared for Adulthood–and what that Means for the Rest of Us. Simon and Schuster, 2017: “The more time teens spend looking at screens, the more likely they are to report symptoms of depression.” See also Tromholt, Morten. “The Facebook experiment: Quitting Facebook leads to higher levels of well-being.” Cyberpsychology, Behavior, and Social Networking 19.11 (2016): 661–666.

[11] One small-scale study showed that undergraduates switch windows on their computers every 11 seconds on average. Yeykelis, Leo, James J. Cummings, and Byron Reeves. “The Fragmentation of Work, Entertainment, E-Mail, and News on a Personal Computer: Motivational Predictors of Switching Between Media Content.” Media Psychology (2017): 1-26.

[12] Ward, Adrian F., et al. “Brain drain: the mere presence of one’s own smartphone reduces available cognitive capacity.” Journal of the Association for Consumer Research 2.2 (2017): 140-154.

[13] Onah, Daniel FO, Jane Sinclair, and Russell Boyatt. “Dropout rates of massive open online courses: behavioural patterns.” EDULEARN14 proceedings (2014): 5825-5834.

[14] It should also be pointed out that whilst there is a lot of brilliant educational content on the internet, there are also a lot of educational claims made for websites, activities and games that are unlikely to lead to real learning. In his book Deep Work, Cal Newport points out the ‘absurdity of the now common idea that exposure to simplistic, consumer-facing products—especially in schools—somehow prepares people to succeed in a high-tech economy. Giving students iPads or allowing them to film homework assignments on YouTube prepares them for a high-tech economy about as much as playing with Hot Wheels would prepare them to thrive as auto mechanics.’ Newport also argues for the importance of attention, seeing uninterrupted ‘deep work’ as one of the main creators of value in the modern economy. Newport, Cal. Deep work: Rules for focused success in a distracted world. Hachette UK, 2016.

[15] For some suggestions, see the final chapter of my second book, Making Good Progress?: The future of Assessment for Learning. Oxford University Press, 2017.

[16] Seneca, ibid.


Research Ed 2017

This was the fifth national Research Ed conference, and in my mind they’ve started becoming a bit like FA Cup Finals or Christmas – recurring events that start to blur into one. “Oh, South Hampstead – was that the one where Ben Riley from Deans for Impact visited and it all kicked off about grammars?” “No, that was Capital City 2016. South Hampstead 2015 was the one where Eric Kalenze visited and where James Murphy taught us the Maori word for green.” Etc. Looking back at my notes from 2013, I find that Ben Goldacre warned then against the ‘energy-zappers’ who will criticise everything you do – too true.

  • The title of my talk was: Improving assessment: the key to education reform.
  • You can download my slides here: Research Ed 2017
  • The livestream is here.
  • If you’re interested in finding out more about comparative judgement, one of the things I talked about, then there are still a few places left on our London training day later this week.

As ever, it is inspiring to meet so many people who are so committed and excited about the cause of research in education, and to be able to talk and share ideas with them. I always come away from these conferences with my mind buzzing with new ideas. Research Ed has only been around for four years, but I cannot imagine the world of education without it. Here’s to many more brilliant conferences.

Workload and English mocks

You can also read this post on the No More Marking blog here.

Last weekend, I posted a question to English teachers on Twitter, asking how long it takes them to mark a GCSE English mock paper.

Most of the answers were in the range of 10 to 30 minutes per script. People also pointed out that the time it took to mark mocks varied depending on whether or not you wrote lengthy comments at the bottom of each script.

My own experience of marking the old spec GCSE English Language papers was that it took me about 15 minutes to mark each paper, which included some fairly brief comments. I also found it difficult to mark for more than about 90 minutes to two hours in one go, and if I did try to mark for longer than that, I would get slower and need to take more frequent breaks.

If we take 15 minutes, therefore, as a relatively conservative estimate, that means that if you teach 28 pupils, it will take you 7 hours to mark those scripts. That doesn’t include any moderation. If we assume a 90-minute moderation session for each mock, plus 90 minutes to go back and apply the insights from moderation, we are looking at a total of 10 hours.

That’s for one English Language paper. There are two English Language papers and two English Literature papers. So if you want pupils to do a complete set of English mocks, that’s a total of 40 hours of marking for the teacher.

With the old specification, which included a lot of coursework, I think most English teachers spent the bulk of year 10 teaching and marking coursework essays, and didn’t get on to doing mocks until year 11. I was really pleased when coursework was abolished, as I felt it would free up so much more time for teachers to plan and teach, instead of marking and administering coursework. However, it does appear as though a lot of this gained time has now been replaced with equally time-consuming mock marking, with mocks being introduced more and more in year 10. Many schools have three assessment points a year. If you were to do two mock papers three times a year in both year 10 and year 11, then a teacher who taught one year 10 class and one year 11 class would spend 120 hours a year marking GCSE mocks. That’s three normal working weeks, or nearly 10% of the contracted 1,265 annual hours of directed time.
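For anyone who wants to check or adapt these numbers for their own timetable, here is a minimal back-of-the-envelope sketch of the arithmetic in Python. The per-script time, moderation times and class size are just the estimates used above, so substitute your own figures.

```python
MINUTES_PER_SCRIPT = 15        # conservative estimate from above
PUPILS_PER_CLASS = 28
MODERATION_HOURS = 1.5         # moderation session per mock paper
APPLYING_INSIGHTS_HOURS = 1.5  # going back over scripts after moderation

marking_hours = MINUTES_PER_SCRIPT * PUPILS_PER_CLASS / 60                    # 7.0
hours_per_paper = marking_hours + MODERATION_HOURS + APPLYING_INSIGHTS_HOURS  # 10.0

# A complete mock set: two Language papers plus two Literature papers.
hours_per_mock_set = 4 * hours_per_paper                                      # 40.0

# Two papers, three assessment points a year, one year 10 and one year 11 class.
annual_mock_hours = 2 * 3 * 2 * hours_per_paper                               # 120.0

DIRECTED_TIME_HOURS = 1265
print(f"Full mock set: {hours_per_mock_set:.0f} hours")
print(f"Annual mock marking: {annual_mock_hours:.0f} hours "
      f"({annual_mock_hours / DIRECTED_TIME_HOURS:.1%} of directed time)")
```

Running this confirms the totals above: 40 hours per full mock set, and 120 hours a year, or 9.5% of directed time.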

In our first No More Marking Progress to GCSE English training days last week, we looked at how schools could use comparative judgement to reduce the amount of time it takes to mark an English mock paper. The exact amount of time it takes to judge a set of scripts using comparative judgement will depend on the ratio of English teachers to pupils in your school. But we think that at worst, using comparative judgement will halve the amount of time it takes to grade a set of GCSE English papers; that is, it will take 5 hours instead of 10. The best-case scenario is that we can get it down to 2 hours. That includes built-in moderation, as well as time to discuss the results with your department and prepare whole-class formative feedback. You can read more about the pilot, and how to sign up for it, here.

Of course, workload is not the only issue we should consider when looking at planning assessment calendars and marking policies. At No More Marking, we like to evaluate the effectiveness of an assessment by looking at these three things.

  • Efficiency and impact on workload
  • Reliability – is the assessment consistent?
  • Validity – does the assessment allow us to make helpful inferences about pupils, and does it help pupils and teachers to improve?

In future blog posts, we’ll consider how reliable and valid traditional mock marking is. But for now, it’s clear that on the measure of efficiency, traditional mock marking doesn’t do that well.

Sharing Standards 2016-17: The results

In July, I will be leaving my role at Ark Schools to work for No More Marking as Director of Education. 

Over the last 6 months, No More Marking have been working with primary schools in England on a pilot of comparative judgement for year 6 writing called Sharing Standards. Comparative judgement is a quick and reliable method of marking open tasks like essays and stories. The easiest way to understand it is to try out the demo on the No More Marking website, but you can also read my explanation of it on this blog here.

The results of this pilot were published last Tuesday, and you can read the full report here.

Overall, 199 schools participated in the pilot, and a total of 8,512 writing portfolios were judged. 1,649 teachers in those schools did the judging, and the reliability of their judgements was 0.84. This allowed for the creation of a measurement scale featuring every portfolio, and then for the application of a national grade set: Working Towards, Expected Standard and Greater Depth. The overview report on the No More Marking website features exemplars of the portfolios at each threshold. Here’s a piece from the portfolio that was judged as the best.

[Image: an extract from the top-judged portfolio – a piece of writing about Shackleton]

80% of the judgements teachers made were comparisons between pupils at their own school; the other 20% were comparisons between pupils from other schools. These cross-school judgements allowed for the creation of a national scale. But because teachers were never asked to compare one of their own pupils directly against a pupil from another school, it wasn’t possible for them to favour pupils from their own school.
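For readers curious about the mechanics, here is a minimal sketch of how a single measurement scale can be estimated from pairwise judgements, using a simple Bradley-Terry-style model fitted by gradient ascent. This is an illustration of the general technique only – not No More Marking’s actual implementation – and the judgement data, function names and grade thresholds are all invented for the example.

```python
import math

def fit_scale(judgements, n_scripts, lr=0.05, epochs=500):
    """Estimate a score for each script from (winner, loser) index pairs.

    Models P(winner beats loser) = sigmoid(score_winner - score_loser),
    and fits the scores by gradient ascent on the log-likelihood.
    """
    scores = [0.0] * n_scripts
    for _ in range(epochs):
        for winner, loser in judgements:
            # Predicted probability that the winner beats the loser.
            p = 1.0 / (1.0 + math.exp(scores[loser] - scores[winner]))
            # Push the winner up and the loser down by the prediction error.
            scores[winner] += lr * (1.0 - p)
            scores[loser] -= lr * (1.0 - p)
    mean = sum(scores) / n_scripts
    return [s - mean for s in scores]  # centre the scale at zero

def grade(score):
    # Invented thresholds standing in for the national grade set.
    if score > 1.0:
        return "Greater Depth"
    if score > -1.0:
        return "Expected Standard"
    return "Working Towards"

# Invented example: five scripts and ten teacher judgements.
judgements = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 4),
              (3, 2), (3, 4), (0, 4), (1, 3), (2, 4)]
scores = fit_scale(judgements, n_scripts=5)
for s in sorted(range(5), key=lambda i: -scores[i]):
    print(f"script {s}: {scores[s]:+.2f} -> {grade(scores[s])}")
```

In a real judging window the pairs would be allocated by the system, and it is the 20% of cross-school comparisons described above that link every school’s portfolios onto the same national scale.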

The other nice thing about this structure was that it allowed teachers to see tasks and pupil work from other schools. I particularly noted the popularity of tasks that asked pupils to write from the point of view of a character in a novel, and the variety of novels selected as the basis for this task. And in discussions with teachers afterwards, it was interesting to try to pick out the aspects that made such types of writing more or less successful. Very often it was subtle uses of syntax or vocabulary that made the difference. For example, some pupils trying to capture the voice of Bruno in ‘The Boy in the Striped Pyjamas’ would use the same very precise and measured sentence structure as Bruno. Others would get this right, but then fall down by using modern slang terms that just didn’t ring true.

And this brings me to the most exciting next step for comparative judgement. As Jon Brunskill writes here, once you have the fascinating data set of accurately graded portfolios, you can then ask: now what? Why are some pieces of writing better than others? What aspects of writing matter, and how can we teach them? Of course, good teachers have always been doing this, but it’s also always been made harder by the way that traditional methods of marking writing lead to disagreement and disputes. If you can’t get reliable agreement on what good writing is, it’s obviously going to be much harder to teach good writing.

Take a look at the exemplar portfolios here and start this process yourself! Next year, No More Marking will be running similar national assessment windows for all primary year groups. See here for more details about how to participate.

 

How do bad ideas about assessment lead to workload problems?

This is part 7 of a series of blogs on my new book, Making Good Progress?: The future of Assessment for Learning. Click here to read the introduction to the series.

Bad ideas can cause workload problems. If you have a flawed understanding of how a system works, the temptation is to work harder to try and make the system work, rather than to look at the deeper reasons why it isn’t working.

The DfE runs a regular teacher diary survey. In the survey from 2010, primary teachers recorded spending 5 hours per week on assessment. By 2013, they were spending 10 hours per week on assessment. Confusion and misperceptions around assessment are creating a lot of extra work – but there is no evidence that this extra work is providing any real benefits.

So what are the bad assessment ideas which are creating workload but not generating any improvements? Here are a few ideas.

Over-reliance on prose descriptors when grading work
Like a lot of teachers, I used to really dislike marking. But when I stopped to think about it, I realised that I actually really liked reading pupils’ work. It was the process of sitting there with the mark scheme, trying to work out a grade and derive feedback from it, that I disliked. And it turns out there is a good reason for that: the human mind is not good at making these kinds of absolute judgements. The result is miserable teachers and not very accurate grades. There is a better way: comparative judgement.

Over-reliance on prose descriptors when giving feedback
Prose descriptors are equally unhelpful for giving feedback. A lot of the guidance that comes with descriptors recommends using the language of the descriptors with pupils, or at least using ‘pupil-friendly’ variations of the descriptor. The result is that teachers end up writing out whole paragraphs at the end of a pupil’s piece of work: ‘Well done: you’ve displayed an emerging knowledge of the past, but in order to improve, you need to develop your knowledge of the past.’

These kinds of comments are not very useful as feedback because whilst they may be accurate, they are not helpful. How is a pupil supposed to respond to such feedback? As Dylan Wiliam says, feedback like this is like telling an unsuccessful comedian that they need to be funnier.

I like the approach being pioneered by a few schools which involves reading a class’s responses, identifying the aspects they all struggled with, and reteaching those in the next lesson. If this response is recorded on a simple proforma, that can hopefully suffice for accountability purposes too.

Mistrust of short-answer questions and MCQs
Short-answer questions and multiple-choice questions (MCQs) can’t assess everything, clearly. But they can do some things really well, and they also have the bonus of being very, very easy to mark. A good multiple-choice question is not easy to write, to be fair. But once you have written it, you can use it again and again with limited effort, and you can use MCQs that have been created by others too. Unlike feedback based on prose descriptors, if you use MCQs to give feedback then pupils can actively do something helpful in response to your feedback.

How can we measure progress in lessons?

This is part 6 of a series of blogs on my new book, Making Good Progress?: The future of Assessment for Learning. Click here to read the introduction to the series.

With national curriculum levels, it was possible to use the same system of measurement in exams as in individual lessons.

For example, national curriculum tests at the end of years 2 and 6 were measured using national curriculum levels. But you could also use NC levels to measure progress in individual lessons and at the end of terms. You could have a task at the end of a lesson, for instance, and then tell pupils that in order to be a level 4a, they would need to perform in a certain way on the task; to be a 5c, they would need to reach a certain standard, and so on.

You can see the attraction of this approach: it is coherent, because you are always using and talking about the same grades. It’s also great for accountability. When Ofsted call, you can offer them ‘real-time’ assessment data based on performance from the most recent lesson.

However, in practice this system led to confusion. Pupils might be able to perform at a certain level at the end of a particular lesson. But when they came to sit their test at the end of the unit or the end of the year, they might not be at that level. As Rob Coe says here, levels started to take on very different meanings depending on how they were being used. Far from providing a coherent and unified system, levels were providing the illusion of a coherent system: everyone was talking about the same thing, but meaning something very different.

So what is the answer? I don’t think exam grades can be used to measure progress in lessons. What happens in the lesson is an ‘input’, if you like, and what happens in the exam is an ‘output’. It makes no sense to try to measure both on the same scale. Here is an analogy: we know that if you eat more, you put on weight. But we don’t measure food intake and weight output with the same scale, even though we know there is a link between them. We measure food with calories, and weight with kilograms. Similarly, we have to record what happens in lessons in a different way to what happens in the final assessment.

If you do try to measure activities in an individual lesson with the same scale as the final exam grade, then I think one of two things can happen. One is that you use activities in the classroom which are best suited to learning and to formative assessment: for example, in English you might use a spelling test. Activities like spelling tests are not very well suited for getting a grade, so the grade you get from them is very inaccurate, and causes a lot of confusion. The second option is to start to alter all of the activities you do in class so that they more closely resemble exam tasks. So you get rid of the spelling test, and get pupils to do a piece of extended writing instead. This makes it more likely (although not certain) that the grades you get in class will be accurate. But it means that you are now hugely restricted in the types of activities you can do in class. You have effectively turned every lesson into a summative assessment.

We should record in-lesson progress in the way that is most suitable for the tasks we want to use. And lots of very useful activities are not capable of being recorded as a grade or a fraction of a grade.

How can we close the knowing-doing gap?

This is part 4 of a series of blogs on my new book, Making Good Progress?: The future of Assessment for Learning. Click here to read the introduction to the series.

One frequent criticism of memorisation is that it doesn’t lead to understanding. For example, a pupil can memorise a rule of grammar, or a definition of a word, but still have no idea how to use the rule or the word in practice. This is a real problem. I would say almost every pupil I have ever taught knew that a sentence began with a capital letter. They ‘knew’ this in the sense that if I asked them ‘what does a sentence begin with?’ they would instantly respond ‘a capital letter’. Yet many, many fewer of them would reliably begin every sentence they wrote with a capital letter. This is a classic example of a knowing-doing gap, where a pupil knows something, but doesn’t do it.

Frequently, I see people using examples like this one to prove that explicit instruction and teaching knowledge do not work, and to argue that we should use more authentic and real-world teaching practices instead. For example, if we just ask pupils to read lots of well-written books and articles, and do lots of writing, they will implicitly understand that sentences begin with a capital letter, and use capital letters in the same way in their own work too. Unfortunately, this kind of unstructured, discovery approach overloads working memory. If only it were possible to pick up the rules of the apostrophe simply by reading a lot of books – the world would be a lovelier place, but it isn’t the world we live in.

So what is the answer? The approach with the best record of closing the knowing-doing gap is direct instruction. I will discuss one specific direct instruction programme here, Expressive Writing, as it is the one I know best. Expressive Writing aims to teach the basic rules of writing. Whilst it does introduce some rules and definitions, this is a small part of the programme. The bulk of the programme is made up of pupils doing a lot of writing, but the writing they do is very structured and carefully sequenced.  The programme begins with the absolute basics: pupils are given a list of verbs in the present tense, and have to convert them to the past tense. Then they are given sentences that use those same verbs in the present tense, and they have to cross out the words and replace them with the past tense verb. Then pupils are given sentences with a blank which they have to fill in with either a past or present tense verb. As the programme continues, pupils are expected to do bigger and longer pieces of writing. Each activity is carefully sequenced so that pupils are never expected to learn too many new things at once, or take on board too many new ideas, and there is frequent repetition so that important rules are not just things that pupils ‘know’ – they are things that become habits.

To sum up, there is such a thing as the knowing-doing gap. Pupils can memorise a definition and not know what a word means, and they can memorise a rule and not know how to apply it. But this does not mean either that memorisation is redundant, or that discovery learning is better at producing true understanding. The way to close the knowing-doing gap is through memorisation and practice, but through memorisation and practice of the right things, broken down in the right ways. Expressive Writing offers one way of doing this for writing, and I think Isabel Beck’s work does something similar for vocabulary. In the next post, we’ll look at how this approach can be applied to the creation of formative and summative assessments.