
How to Create a Multiple Choice Test That Actually Measures Learning

April 24, 2026 · 8 min read · Sarah Mitchell

The Problem With Most Multiple Choice Tests

Pick up any multiple choice test and you'll likely find the same problems: questions that are too easy for anyone who attended class, trick questions that punish careful readers, and distractors so obviously wrong that students with zero content knowledge can score 60% by elimination.

These tests don't measure learning. They measure test-taking skill — pattern recognition, elimination strategies, and the ability to decode what the question writer was thinking.
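That elimination effect is easy to quantify. The quick simulation below (purely illustrative, with hypothetical parameters) shows how much a student can score by guessing once weak distractors let them rule out options:

```python
import random

def guessing_score(n_questions=50, n_options=4, eliminated=0, trials=10_000):
    """Average score for a student who guesses at random after
    eliminating `eliminated` obviously wrong distractors per question."""
    remaining = n_options - eliminated
    correct = sum(
        random.randrange(remaining) == 0
        for _ in range(trials * n_questions)
    )
    return correct / (trials * n_questions)

# With all four options in play, blind guessing hovers near 25%.
# Eliminate two weak distractors per item and it jumps to about 50% --
# before the student has demonstrated any content knowledge at all.
print(f"{guessing_score(eliminated=0):.0%}")
print(f"{guessing_score(eliminated=2):.0%}")
```

Every obviously wrong distractor you leave in is a free percentage-point subsidy to students who know nothing.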

A well-constructed multiple choice test can do much better. It can reveal precisely what students understand, identify specific misconceptions, and generate data that improves your teaching. Here's how to build one.

Anatomy of a High-Quality Multiple Choice Question

Every multiple choice question has three parts:

The stem — the question or incomplete statement that defines what's being measured

The correct answer — the one defensible correct response

Distractors — wrong answers that are plausible to students who partially understand the content

Most test-writing problems originate in poorly constructed distractors. If your distractors are obviously wrong, students don't need to know the content to score well.

8 Rules for Writing Better Stems

Rule 1: The stem should be a complete question

❌ "The American Civil War..."

✅ "What was the primary economic cause of the American Civil War?"

Complete questions communicate exactly what knowledge is being tested.

Rule 2: Test one concept per question

❌ "What caused the Civil War and what were its effects on the Southern economy?"

✅ "Which of the following best describes the role of slavery in causing the American Civil War?"

Compound questions make it impossible to know what a wrong answer means.

Rule 3: Avoid negative phrasing when possible

❌ "Which of the following is NOT a cause of the Civil War?"

✅ "Which of the following was the PRIMARY cause of the Civil War?"

Negative questions are harder to parse and measure test-taking skill more than content knowledge. If you must use negatives, bold or capitalize "NOT."

Rule 4: Put the cognitive work in the stem, not the options

The options should be short; the stem should do the heavy lifting. Students who get long options right are often making decisions based on reading patterns, not content mastery.

Rule 5: Avoid "all of the above" and "none of the above"

These options teach test-taking strategies rather than content. Students learn that if they're unsure, marking "all of the above" is statistically safe.

Rule 6: Match difficulty to Bloom's level

  • Remember/Understand: "What is the term for...?"
  • Apply: "Given this scenario, which approach would...?"
  • Analyze: "Which of the following conclusions is best supported by this data?"

Higher-order questions reveal deeper understanding and are harder to guess correctly.

Rule 7: Ensure only one answer is defensibly correct

Have a colleague try to argue for a wrong answer. If they can, revise the question. Trick questions create resentment without generating useful data.

Rule 8: Use consistent grammar

All options should complete the stem grammatically. Grammatical inconsistencies reveal correct answers to students who have no content knowledge.

5 Rules for Writing Better Distractors

Rule 1: Base distractors on real misconceptions

The most powerful distractors represent answers that students who partially understand the material would plausibly choose. Ask yourself: "What do students commonly get wrong about this topic?"

Rule 2: Make distractors similar in length and complexity to the correct answer

A long, detailed correct answer stands out. Keep all options roughly equal in length.

Rule 3: Avoid "always," "never," "all," and "none" in distractors

These absolute terms are almost always false — students learn to eliminate them without thinking.

Rule 4: Use three to four options, not more

Research shows that four options (one correct, three distractors) provide optimal discrimination without making questions unnecessarily difficult to write. Five or more options rarely improve discrimination and significantly increase writing time.

Rule 5: Randomize the position of correct answers

If you consistently put correct answers in position B or C, students will learn the pattern. Randomize correct answer positions across the test.
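Randomization is easy to automate if your questions live in a structured format. The sketch below assumes a simple dict layout (hypothetical, not any particular tool's schema) and regenerates the answer key after shuffling, so you never hand-track which letter is correct:

```python
import random

def shuffle_options(question):
    """Shuffle one question's options and report where the
    correct answer landed (A-D)."""
    options = [question["correct"]] + question["distractors"]
    random.shuffle(options)
    key = "ABCD"[options.index(question["correct"])]
    return {"stem": question["stem"], "options": options, "key": key}

# Example question -- content is illustrative only.
question = {
    "stem": "What was the primary economic cause of the American Civil War?",
    "correct": "Conflict over extending slavery into new territories",
    "distractors": [
        "Disputes over transcontinental railroad routes",
        "Federal tariff policy, considered in isolation",
        "Competition over western mining claims",
    ],
}
shuffled = shuffle_options(question)
print(shuffled["key"])  # varies run to run -- that's the point
```

Because the key is derived from the shuffled list, the same function also produces multiple scrambled versions of a test with correct answer keys for each.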

Using AI to Generate First-Draft Questions

Writing high-quality multiple choice questions from scratch is time-consuming. AI tools like SimpleQuizMaker can generate first-draft questions in seconds, which you then review and refine.

The workflow:

  • Paste your lesson content or learning objectives
  • Generate 15–20 questions
  • Review each question against the rules above
  • Keep strong questions, revise weak ones, discard any that can't be fixed
  • Add 2–3 of your own questions targeting known misconceptions in your class

AI generation typically produces 70–80% usable questions on the first pass. The review step takes 5–10 minutes — far less than writing from scratch.

How to Analyze Your Test After Students Take It

Two metrics tell you most of what you need to know:

Item difficulty (p-value): The proportion of students who answered correctly. Ideal range: 0.30–0.80. Questions with p < 0.30 are too hard; questions with p > 0.80 are too easy and don't discriminate.

Item discrimination: Do high-performing students get the question right more often than low-performing students? A question that low performers get right more often than high performers is probably flawed.
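Both metrics take only a few lines of code. This is a minimal sketch assuming responses are stored as a 0/1 matrix (one row per student, one column per question), using the common upper/lower-27% group method for discrimination:

```python
def item_analysis(responses):
    """Difficulty (p-value) and discrimination for each item.
    responses[s][i] is 1 if student s answered item i correctly.
    Discrimination compares the top ~27% of students against the
    bottom ~27%, ranked by total score."""
    n_students, n_items = len(responses), len(responses[0])
    group = max(1, round(0.27 * n_students))
    ranked = sorted(responses, key=sum, reverse=True)
    upper, lower = ranked[:group], ranked[-group:]
    stats = []
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        disc = (sum(row[i] for row in upper) - sum(row[i] for row in lower)) / group
        stats.append({"p_value": round(p, 2), "discrimination": round(disc, 2)})
    return stats

# Six students, two questions: both items land at p = 0.50
# with strong positive discrimination.
data = [[1, 1], [1, 1], [1, 0], [0, 1], [0, 0], [0, 0]]
print(item_analysis(data))
```

Flag any item with a p-value outside 0.30–0.80, or with discrimination at or below zero, for revision before the next administration.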

SimpleQuizMaker's dashboard shows you which questions your class struggled with — the equivalent of item difficulty analysis without the spreadsheet work.

Frequently Asked Questions

How many questions should a multiple choice test have?

For a 50-minute class period: 30–40 questions (approximately 1 minute per question). For a 90-minute exam: 50–75 questions. For a quick formative check: 5–10 questions. Match length to available time and the weight of the assessment.

Should I curve my multiple choice tests?

Curving is a symptom of a test design problem. If most students score below 70%, the test is likely too hard or measuring things you didn't teach. Fix the test rather than curve the scores — the data you get from a well-calibrated test is more useful.

How do I prevent cheating on multiple choice tests?

Create two versions of the test with questions in different order, use random distractor ordering, include some application questions that require content knowledge rather than fact recall, and space students appropriately.

Can AI generate multiple choice questions that are as good as human-written ones?

AI generates excellent first drafts that capture key concepts from your content. Human review is still important — primarily to check that distractors reflect real misconceptions in your specific class and that the correct answer is truly defensible. The combination of AI speed and human review produces better tests than either alone.

What percentage of a test should be multiple choice vs. other formats?

Research suggests 60–70% multiple choice for broad content coverage and efficiency, 20–30% short answer for depth and application, and 10% extended response for higher-order thinking. Adjust based on your subject and the time you have for grading.


Sarah Mitchell

Curriculum Designer & Former High School Teacher

Ready to create your first quiz?

Use AI to generate quizzes from your own study materials in seconds.

Try SimpleQuizMaker Free