What do exams and opinion polls have in common?

A lot.

Daniel Koretz, Professor of Education at Harvard University, uses polls as an analogy to explain how exams actually work. Opinion polls sample the views of a small number of people in order to work out the views of a much larger population. Exams are analogous, in that they feature a small sample of questions from a much wider ‘domain’ of knowledge and skill. In Measuring Up, Koretz says this:

The full range of skills or knowledge about which the test provides an estimate – analogous to the votes of the entire population of voters in [an opinion poll] – is generally called the domain by those in the trade. Just as it is not feasible for the pollster to obtain information from the entire population, it is not feasible for a test to measure an entire domain exhaustively, because the domains are generally too large. Instead we create an achievement test, which covers a small sample from the domain, just as the pollster selects a small sample from the population.

Since first reading Koretz’s book (see my review here) I’ve used the analogy quite a lot. I used it this week to explain something to a colleague. She stopped and looked at me like I was crazy. ‘Daisy’, she said, ‘I think you need to get a new analogy.’

I know what she means. After a week where opinion polls have been torn to shreds over their failure to predict the result of the 2015 UK general election, it seems perverse for me to keep using them as an analogy. But actually, the failure of the recent opinion polls makes the analogy all the more useful, because just as opinion polls can and do get things wrong, we need to acknowledge that the similar structure of exams means that they can and do get things wrong too. Exams are susceptible to the same kinds of errors that opinion polls are. In fact, in one way, exams are even more susceptible to error than opinion polls. In the case of opinion polls, we can check the validity of the poll result because eventually the domain is measured, in the form of the election. In the case of exams, there is no final equivalent measure of the domain. Imagine if there never was an election, and all we ever had were opinion polls of differing types. That’s what exams are like.

Plenty of reasons have been put forward for the failure of the polls in this general election. One of the most popular is the idea that, for whatever reason, people did not tell the pollsters who they were really planning to vote for. The analogy with tests would be where a pupil is, for whatever reason, not interested in answering the items on the test to the best of their ability. In Koretz’s words,

Just as the accuracy of a poll depends on respondents’ willingness to answer frankly, the accuracy of a test score depends on the attitudes of the test-takers – for example, their motivation to perform well.

Whilst this may have been the problem with the UK opinion polls, I don’t think it is a major problem with UK tests. Enough important outcomes depend on the tests for pupils to be motivated to do well on them. Of course, we should never forget that variation in performance on the day is always one of the major factors in exam-score unreliability, and pupils who feel under too much pressure to perform may fail to produce their best work. But by and large, I don’t think this is the major problem with tests at the moment. However, there are two aspects of this sample and domain structure which I think do cause serious problems.

Here is Koretz again:

In the same way that the accuracy of a poll depends on seemingly arcane details about the wording of survey questions, the accuracy of a test score depends on a host of often arcane details about the wording of items, the wording of ‘distractors’ (wrong answers to multiple choice items), the difficulty of the items, the rubric (criteria and rules) used to score students’ work, and so on… If there are problems with any of these aspects of testing, the results from the small sample of behaviour that constitutes the test will provide misleading estimates of students’ mastery of the larger domain. We will walk away believing that Dewey will beat Truman after all. Or, to be precise, we will believe that Dewey did beat Truman already.

Or, if I can update the analogy, we will believe that Ed Miliband is currently in negotiations with Nicola Sturgeon and Nick Clegg about forming the next government, and David Cameron is sunbathing in Ibiza. A great blog post here by Dan Hodges (written before the election) explains some of the ways in which some of the arcane details of opinion polls can be manipulated to get a certain result. Similarly, changes in arcane details of exam structure can change the value of the results we get from them. For me, there are two particular problems: poor test design, and teaching to the test. I’ve written about the problems of poor test design and teaching to the test on my blog before, here. I also have an article being published soon in Changing Schools where I discuss this at greater length. Here, I will add just one more point, about coursework. Coursework and controlled assessments are a perfect example of the problems with poor test design and teaching to the test.

The essential problem with coursework and controlled assessment is that they allow the teacher and pupils to know what the test sample is in advance. When I taught Great Expectations as a coursework text, I knew what the final ‘sample’ from the novel was – the final essay would be on the first chapter of the novel. To extend the opinion poll analogy, it’s as if the final result of the election depended not on a vote of the population, nor on an anonymous sample of the population, but on a sample of 1,000 voters whose names were known in advance to all of the political parties. Even if the sample were well chosen, this would clearly be problematic. There would be an obvious incentive to neglect the views of anyone not in that sample. Some political parties might not want to behave so dishonourably, but they would almost certainly be forced to, because if they didn’t, another party would do so and therefore win the ‘election’. I think the analogies with coursework and teaching are clear. A teacher might not want to focus their instruction on the first chapter of Great Expectations, as they realise that doing so will not help pupils in the long run, particularly those pupils who want to study English Literature at A-level. But if they think that every other teacher is doing so, they may feel that they have no choice, as those pupils who do get the targeted instruction will probably get better grades. Thus, poor test design, in the shape of coursework and controlled assessment, encourages teaching to the test and distorts the validity of the final test score. Obviously, the whole problem is exacerbated by high-stakes testing, which places great weight on those test scores. But I think the poor test design is a major feature here too, and hopefully the analogy with opinion polls makes it clearer why this is such a problem.
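
For readers who like to see the sample-and-domain point in numbers, here is a rough sketch in Python. Everything in it is an assumption made up for illustration: the function names `score` and `simulate`, a ‘domain’ of 200 items, a 10-item test, and a pupil who has time to learn 100 items. It compares a pupil who studies without knowing the test sample with one who knows the sample in advance. Both pupils know exactly half the domain, so any difference in their scores reflects the test design, not what they know.

```python
import random


def score(known_items, test_items):
    """Fraction of the test items that the pupil can answer."""
    return sum(item in known_items for item in test_items) / len(test_items)


def simulate(domain_size=200, test_size=10, study_budget=100, seed=0):
    """Compare a 'blind' pupil with one who knows the test sample in advance.

    All sizes are illustrative assumptions, not figures from any real exam.
    """
    rng = random.Random(seed)
    domain = range(domain_size)

    # The test is a small random sample of items from the whole domain.
    test_items = rng.sample(domain, test_size)
    tested = set(test_items)

    # Blind pupil: cannot target the test, so study is spread at random.
    blind = set(rng.sample(domain, study_budget))

    # Targeted pupil: learns every tested item first, then spends the
    # remaining study time on randomly chosen untested items.
    untested = [item for item in domain if item not in tested]
    targeted = tested | set(rng.sample(untested, study_budget - test_size))

    return {
        # Both pupils know the same share of the domain.
        "true_mastery": study_budget / domain_size,
        "blind_score": score(blind, test_items),
        "targeted_score": score(targeted, test_items),
    }


if __name__ == "__main__":
    print(simulate())
```

In this toy model the targeted pupil always gets full marks on the test while knowing only half the domain, whereas the blind pupil’s score is a noisy estimate of what they actually know. That is roughly what I mean when I say that a sample known in advance distorts the validity of the score.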

6 thoughts on “What do exams and opinion polls have in common?”

  1. Sirkku Evans

    Hi Daisy,

    Thanks for your latest blog — keep them coming!

    Did you see the 9 May (11 May online) book review in the Saturday Telegraph by Katharine Birbalsingh?

    The title of her article is ‘The Creative Schools Myth’ and the book she is reviewing was written by Ken Robinson & Lou Aronica with the subtitle ‘Revolutionizing Education from the Ground Up’.

    A quote from Katharine’s critique: “It is precisely because these sorts of ideas have had enormous influence over the past 40 years that our schools are as broken as they are.”

    She also says, “At my free school in Brent, Michaela Community School, we believe in the opposite of what Robinson thinks works…. Creativity only comes by standing on the shoulders of giants… The truth is that only knowledge can set children free.”

    Perhaps you already know about Michaela Community School? Here are people who are spreading the message — that ‘the emperor has no clothes’ — such an encouragement!

    Best regards, Sirkku (orig. from Finland)

  2. Pingback: Coursework Destroys Education | The Traditional Teacher

  3. Robert Craigen

    Since your London ResearchEd talk I have thought about these things. It seems to me that sometimes one may purposefully promote teaching (or learning) to the test. For example, in our linear algebra exams there are always questions demanding the skill of Gaussian elimination. That acts as a statement of value: it communicates to our students that we place value on them having mastered this algorithm. We want students studying from old exams because we wish to have this reinforced for them. In analysis courses there always appear relatively routine (but nontrivial) questions involving “epsilon-delta” proofs. We want students to understand that they will be tested on this, because we want them to master the concepts, which are not easy, and only come with practice.

    It is the practice in many North American universities for final exams to comprise a large portion of one’s term grade — generally 50%. Exams serve more purposes than assessment. In our case we regard them as motivators, a carrot-and-stick mechanism. Learn this and you will achieve an easy reward. Fail to do that and it will cost you.

    This is part of a mixed strategy. I am of the school that says students should not know which part of the domain will be tested. That is, when they ask (as they always do) in the last 2 weeks of the course “which sections will be on the exam?” my only answer is, generally, “all of them”. Or, more honestly, “I wish you to prepare as if all elements of the course will be tested.” It is a sad reality that many students do not begin to earnestly work on mastering knowledge, concepts and procedures until they begin their exam work. So saying “Section X will not be tested this term” may be tantamount to not teaching Section X in the first place. Rarely, but from time to time, I will tell students, when a certain topic comes up, “This is not part of this course, I simply thought it would add colour for you; you needn’t take notes.” But for the most part my view is, “I would not teach it if I didn’t want you to learn it, and if I want you to learn it I also want you to prepare as if you will be tested on it.”

    The validity of the test as a measure of mastery of the domain is an important issue, but almost orthogonal to this. I also want the test to be a useful addition to the collection of past tests as a guide for students to our level and breadth of expectation that they master that domain.

  4. Pingback: Wellington Festival of Education 2015 – review | The Wing to Heaven
