
What Is Item Difficulty? The Quiz Quality Metric Most Teachers Ignore

May 30, 2026 · 4 min · Sarah Mitchell · Curriculum Designer & Former High School Teacher


Short answer. Item difficulty (denoted *p*) is the proportion of students who answered a given quiz question correctly. A question that 80% of students got right has *p* = 0.80. The name is counterintuitive: a *higher* value means an *easier* question. Even so, it is a foundational statistic for evaluating quiz quality.
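The definition above is simple enough to compute by hand, but a minimal sketch may help make it concrete. The function name `item_difficulty` and the list-of-booleans input format are assumptions made for illustration, not part of any particular quiz tool.

```python
def item_difficulty(responses):
    """Return p, the proportion of correct responses for one question.

    `responses` is a list of booleans, one per student (True = correct).
    This input format is an assumption made for the example.
    """
    if not responses:
        raise ValueError("need at least one response")
    # True counts as 1 and False as 0, so the sum is the number correct.
    return sum(responses) / len(responses)


# Ten students, eight of whom answered correctly.
q1 = [True] * 8 + [False] * 2
print(item_difficulty(q1))  # 0.8
```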

What the values mean

p = 1.0: Every student got it right. Either trivially easy, or a freebie. Not useful for distinguishing students.

p = 0.80-0.90: Easy. Good for confidence-building, foundational checks.

p = 0.50-0.70: Medium. Most useful for assessment, because these items separate students who know the material from those who don't.

p = 0.20-0.40: Hard. Good for stretch questions; use sparingly.

p < 0.20: Either too hard for the cohort, or a broken question.

p = 0: No one got it right. Almost certainly a broken question.
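The bands above can be turned into a small lookup. Note that the listed ranges leave gaps (for example, between 0.70 and 0.80), so the cut points below are one assumed way of closing them: each band is extended down to the start of the next one. The function name `difficulty_band` and the label strings are hypothetical.

```python
def difficulty_band(p):
    """Map an item difficulty value to a rough label.

    Cut points at 0.80, 0.50, and 0.20 are assumptions that extend the
    listed bands to cover the gaps between them.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p == 1.0:
        return "everyone correct"
    if p >= 0.80:
        return "easy"
    if p >= 0.50:
        return "medium"
    if p >= 0.20:
        return "hard"
    if p > 0.0:
        return "too hard or broken"
    return "no one correct"


for p in (1.0, 0.85, 0.6, 0.3, 0.1, 0.0):
    print(p, difficulty_band(p))
```

If your own cohort or subject calls for different thresholds, adjusting the three cut points is the only change needed.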

How to use it

For a typical quiz:

Aim for an average difficulty of 0.5-0.7 across all items

Don't have all easy items (p > 0.85) — the quiz doesn't differentiate

Don't have all hard items (p < 0.4) — the quiz frustrates without measuring

Mix difficulty levels so that scores spread out across the range of student ability
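These quiz-level guidelines can be checked in a few lines once you have a *p* value for each item. The sketch below is illustrative only: the function name `quiz_difficulty_report` and the dictionary keys are assumptions, while the thresholds (0.5 to 0.7, 0.85, and 0.4) come from the list above.

```python
from statistics import mean


def quiz_difficulty_report(p_values):
    """Summarize a quiz's item difficulties against the guidelines above.

    `p_values` is a non-empty list of per-item difficulty values.
    """
    avg = mean(p_values)
    return {
        "average": avg,
        # Target band for the average difficulty across all items.
        "average_in_target": 0.5 <= avg <= 0.7,
        # Every item easy: the quiz will not differentiate students.
        "all_easy": all(p > 0.85 for p in p_values),
        # Every item hard: the quiz frustrates without measuring.
        "all_hard": all(p < 0.4 for p in p_values),
    }


report = quiz_difficulty_report([0.9, 0.75, 0.6, 0.55, 0.3])
print(report["average_in_target"], report["all_easy"], report["all_hard"])
# True False False
```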

Item difficulty + item discrimination

Item difficulty alone is incomplete. A question with p = 0.50 might be:

A great question (top half of class gets it right, bottom half doesn't) — high [item discrimination](/blog/what-is-item-discrimination)

A coin-flip question (every student has 50/50 odds) — low item discrimination

A *reverse-discrimination* broken question (bottom half gets it right, top half misses) — negative discrimination

Use both metrics together. A good quiz has medium-difficulty items with positive discrimination.
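To see how three questions with the same difficulty can behave so differently, here is a minimal sketch that computes both numbers. It assumes a simple upper-lower index: split the class into top and bottom halves by total quiz score, then subtract the bottom half's proportion correct from the top half's. The half split mirrors the description above; other group sizes are also used in practice, and LMS statistics pages may compute discrimination differently. The function name and input format are hypothetical.

```python
def difficulty_and_discrimination(total_scores, item_correct):
    """Return (p, D) for one question.

    total_scores: each student's overall quiz score.
    item_correct: parallel list of booleans for this question.
    D is p(top half) minus p(bottom half), an assumed upper-lower index.
    """
    n = len(total_scores)
    if n != len(item_correct) or n < 2:
        raise ValueError("need two parallel lists with at least 2 students")
    # Rank student indices by total score, highest first.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    half = n // 2  # with an odd class size, the middle student is left out
    top, bottom = order[:half], order[-half:]
    p = sum(item_correct) / n
    p_top = sum(item_correct[i] for i in top) / half
    p_bottom = sum(item_correct[i] for i in bottom) / half
    return p, p_top - p_bottom


scores = [90, 80, 50, 40]
print(difficulty_and_discrimination(scores, [True, True, False, False]))  # (0.5, 1.0)
print(difficulty_and_discrimination(scores, [True, False, True, False]))  # (0.5, 0.0)
print(difficulty_and_discrimination(scores, [False, False, True, True]))  # (0.5, -1.0)
```

All three questions have p = 0.5, but only the first one separates stronger students from weaker ones.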

How modern quiz tools surface this

Most LMSs and quiz platforms show item difficulty after each quiz administration:

Canvas, Blackboard, Moodle: quiz statistics page

SimpleQuizMaker: per-question analytics across all submissions

After every quiz, scan the difficulty distribution. The items with p > 0.95 (nearly everyone right) and p < 0.15 (almost no one right) are usually the ones to review or rewrite.
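If your platform lets you export per-question statistics, the scan can be automated. The sketch below assumes you already have a mapping from question label to *p*; the function name `items_to_review` and the sample data are made up for illustration, and the default thresholds are the 0.95 and 0.15 values mentioned above.

```python
def items_to_review(p_by_item, high=0.95, low=0.15):
    """Return the items whose difficulty falls outside the usable range.

    p_by_item: dict mapping a question label to its difficulty value.
    """
    return {item: p for item, p in p_by_item.items() if p > high or p < low}


# Hypothetical statistics for a five-question quiz.
stats = {"Q1": 0.98, "Q2": 0.72, "Q3": 0.55, "Q4": 0.10, "Q5": 0.0}
print(items_to_review(stats))  # {'Q1': 0.98, 'Q4': 0.1, 'Q5': 0.0}
```

A flagged item is only a prompt to look at the question; whether to rewrite it is still a judgment call.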

Common mistakes

Confusing the name. Higher value = easier. Not intuitive. Many teachers reverse this in their heads.

Ignoring difficulty data when grading. The "everyone missed Q5" pattern usually means Q5 is broken, not that everyone failed to learn.

Treating low p as a teaching problem when it's a question problem. Verify the question is well-written before concluding students don't know the material.

Related reading

[What Is Item Discrimination?](/blog/what-is-item-discrimination)

[How to Write Good Quiz Questions](/blog/how-to-write-good-quiz-questions)

[How to Write Hard Quiz Questions](/blog/how-to-write-hard-quiz-questions)

[Quiz Analytics — Teacher Guide](/blog/quiz-analytics-teacher-guide)

