Saturday, 10 October 2009

Adjusting multiple choice examinations

Multiple choice examinations have several benefits. They have no intra- or inter-marker variability; in fact, the marking can be automated. And I wouldn't be surprised if they evaluate knowledge of the material as effectively as any other system.

They need to be well written.
• The correct answer needs to be clearly more correct than other options.
• The correct answer should not be guessable from the construction of the question.
• The order of the answer options should be random.
• A reasonable number of options needs to be given.
• The same number of options should be given for every question.
• A significant number of questions needs to be included.
The problem with multiple choice questions is the chance element. This can be reduced by increasing the number of questions.
If we have 20 questions with 4 options for each question, then random guessing will lead to people getting 20 / 4 = 5 correct on average (exam mark = 25%). However, the spread of correct answers will be quite wide: some will get 1 correct (5%), others 10 (50%). With 200 questions, people will get 50 correct on average (exam mark still = 25%), but the spread is much narrower: some may get 40 correct (20%), others 60 correct (30%).

Thus both exams, when taken by people ignorant of the topic, will give an average mark of 25%, but the chance of any particular individual getting a high mark is much greater with a smaller number of questions.
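To put a number on that chance, here is a minimal sketch that assumes every answer is an independent random guess, so the number correct follows a binomial distribution. The function name prob_at_least and the 40% threshold are my own choices for illustration.

```python
from math import comb


def prob_at_least(n_questions, n_options, threshold):
    """Probability that pure guessing gives at least `threshold` correct answers.

    Assumes each question is guessed independently with a 1-in-n_options
    chance of being right (a binomial model).
    """
    p = 1 / n_options
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(threshold, n_questions + 1)
    )


# Chance of reaching a raw mark of 40% or more by guessing alone.
short_exam = prob_at_least(20, 4, 8)    # at least 8 of 20
long_exam = prob_at_least(200, 4, 80)   # at least 80 of 200

print(f"20 questions:  {short_exam:.4f}")
print(f"200 questions: {long_exam:.2e}")
```

Under this model the short exam lets roughly one guesser in ten reach 40%, while on the long exam it is extremely rare.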

This seems obvious based on the examples above. Mathematically, the spread of marks shrinks as the number of questions grows: the standard deviation of the percentage mark is inversely proportional to the square root of the number of questions.
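The square-root relationship can be checked with a short sketch. It assumes the same binomial model of independent guesses, for which the standard deviation of the fraction correct is sqrt(p(1 - p)/N) with p = 1/R. The function name guess_mark_sd is hypothetical.

```python
from math import sqrt


def guess_mark_sd(n_questions, n_options):
    """Standard deviation of the raw mark (as a fraction) under pure guessing.

    Assumes a binomial model: sd = sqrt(p * (1 - p) / N), where p = 1 / R.
    """
    p = 1 / n_options
    return sqrt(p * (1 - p) / n_questions)


# Ten times as many questions shrinks the spread by sqrt(10), about 3.16.
for n in (20, 200, 2000):
    print(n, round(100 * guess_mark_sd(n, 4), 1))
```

For 4 options this gives a spread of about 9.7 percentage points with 20 questions, 3.1 with 200 and 1.0 with 2000.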

The other issue is standardising the results. Because people are likely to get 25% of the answers correct by chance (for 4 options), one could subtract 25% from the final mark. So if you get 25% as a raw mark, you likely didn't know the answer to any of the questions; that is, your knowledge is 0%. So we subtract 25% from your mark to get your adjusted mark, which is 0%.

However, if you get 100%, it is unlikely you knew 75% and got the other 25% correct by chance. Rather, you get the ones you know correct, and you tend to get about a quarter of the ones you don't know correct. So if you know 50% of the questions you will get 50% plus a quarter of the remaining 50%, that is 12.5%, which gives you a total of 50% + 12.5% = 62.5%. So a raw mark of 62.5% needs to be scaled back to 50%. And a raw mark of 100% means you know all the answers, so it does not need to be scaled back at all.
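The arithmetic in that paragraph can be sketched as a small function. It assumes every known question is answered correctly and every unknown one is a pure guess; the name expected_raw_mark is my own.

```python
def expected_raw_mark(known_fraction, n_options=4):
    """Expected raw mark (as a fraction) for someone who knows `known_fraction`
    of the questions and guesses the rest at random.

    Assumes known questions are always answered correctly.
    """
    return known_fraction + (1 - known_fraction) / n_options


for known in (0.0, 0.5, 1.0):
    print(known, expected_raw_mark(known))
# 0.0 0.25
# 0.5 0.625
# 1.0 1.0
```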

So we need to adjust the raw marks linearly to get adjusted marks.
• Let N be the number of questions.
• Let R be the number of options.
• Let X be the number of questions correct.
• Let Y be the adjusted number of questions correct.
Then
• X/N is the raw mark.
• Y/N is the adjusted mark.
• N/R is the chance number of correct answers.
When X = N/R, the mark needs to be adjusted to zero, i.e. Y = 0.
When X = N, the mark needs no adjustment, i.e. Y = N and Y/N = 1 (= 100%).

The number of questions correct equals the number of questions known plus the remaining number of questions divided by the number of options.

X = Y + (N – Y)/R

Rearranging for Y we get

Y = (RX – N)/(R – 1)

Or as a mark

Y/N = 100% × (RX – N)/(N(R – 1))

And any negative adjusted mark is set to zero.
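Putting it together, here is a minimal sketch of the adjustment. It uses Y = (RX – N)/(R – 1), which comes from rearranging X = Y + (N – Y)/R, and sets negative results to zero. The function name adjusted_mark is hypothetical.

```python
def adjusted_mark(correct, n_questions, n_options):
    """Adjusted mark as a fraction between 0 and 1.

    correct     -- X, the number of questions answered correctly
    n_questions -- N, the number of questions
    n_options   -- R, the number of options per question
    """
    # Y = (R*X - N) / (R - 1): the estimated number of questions actually known
    adjusted_correct = (n_options * correct - n_questions) / (n_options - 1)
    # Anyone scoring below the chance level is given zero
    return max(0.0, adjusted_correct / n_questions)


print(adjusted_mark(5, 20, 4))     # chance-level raw mark of 25%
print(adjusted_mark(125, 200, 4))  # raw mark of 62.5%
print(adjusted_mark(20, 20, 4))    # raw mark of 100%
print(adjusted_mark(3, 20, 4))     # below chance level
```

These four calls give 0.0, 0.5, 1.0 and 0.0, matching the worked examples above.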