On a gumdrop cake fail and multiple points of assessment
What can a failed gumdrop cake remind us about assessment?
I’m a pretty good baker and love to indulge myself when there’s time, like last month’s holiday season. For me, baking is partly about eating (of course!) but also about tradition, hospitality, and comfort.
Just before Christmas, I set out to make a gumdrop cake. It was an unmitigated disaster. When I turned it out of the pan, it collapsed. (See embarrassing photo at right).
Based on that single point of baking, a casual observer could determine that I’m a lousy baker. In fact, I should be barred from the kitchen and given directions to the closest bakery for all subsequent treats. This wouldn’t be a fair representation of my skills, just a snapshot of a single – bad! – evening.
It’s the same for our system of assessment in the UG program: no single assessment determines a student’s progress. We use multiple points of assessment, both in preclerkship classes and through clerkship rotations, to ensure we have an accurate portrait of a student’s performance over time. Admittedly, some assessments are higher stakes than others, but no single assessment will determine a student’s fate in the program.
Anyone can have an “off” day – for any number of reasons. What’s important following poor performance is to take stock of what happened, reflect on what may have contributed to the poor outcome, and make a plan for next time.
I was really upset. I’d made this many times. I was “good” at this. Had I somehow lost my baking mojo? Plus, I was embarrassed — as well as annoyed with myself for wasting all kinds of butter, sugar, eggs, flour and gumdrops!
My adult daughter gamely offered this advice: “Sometimes a new recipe takes a few times to get right.” Except it wasn’t a new recipe. I’ve made this gumdrop cake dozens of times for over two decades. What could possibly have gone wrong? I reread the recipe (photocopied from my mother’s handwritten book) and my scrawled notes in the margins. I’d used mini-gummy-bears in place of the “baking gums”. In trying to be cute and expedient (didn’t have to chop those up!), I’d sabotaged my own cake. I’d also forgotten to put the pan of water on the bottom rack, but I thought that was likely pretty minor.
For students after a poor assessment, that same reflection can help: did I study or practice enough? Was it efficient study/practice? Was I under the weather? Did I have enough sleep? These self-reflection questions will vary based on the type of assessment, but it boils down to this: What can I learn from this assessment experience and what can I do differently next time?
I waited over a week before I attempted the gumdrop cake again. In the meantime, I (successfully) made four kinds of cookies, a triple-ginger pound cake, and a slew of banana breads. Then, I bought the right kind of baking gumdrops and remembered to follow ALL the instructions, and it turned out just fine. In fact, I sent some to my parents in New Brunswick and my mother judged it “delicious”.
With thanks to Eleni Katsoulas, Assessment & Evaluation Consultant, for her continued counsel on assessment practices.
Improving existing MCQs
By Theresa Suart & Eleni Katsoulas
Writing and editing test questions is an ongoing challenge for most instructors. Creating solid multiple choice questions (MCQs) that adequately address learning objectives can be a time-consuming endeavor.
Sometimes you may have existing questions that are pretty good, but not quite where you need them to be. Similar to a house reno versus new construction, sometimes it might be worth investing the time to improve what you already have. How do you know which questions need attention and how can you rework them?
Previous exams are analyzed to determine which questions work well and which don’t. This can provide some guidance about questions that can be improved.
To select questions for an MCQ renovation, you can start with checking out the statistics from last year’s exams (available from your curricular coordinator or from Eleni).
Two statistics are useful indicators for selecting individual questions for tweaking, rewriting or other fixes: Item Difficulty and Discrimination Index.
Item difficulty checks whether questions are too easy or too hard. This statistic measures the proportion of exam takers who answered the question correctly.
Discrimination index differentiates among test takers with high and low levels of knowledge based on their overall performance on the exam. (Did people who scored well on the exam get it right? Did people who scored poorly get it right?)
These two statistics are closely intertwined: If questions are too easy or too hard (see item difficulty), they won’t provide much discrimination amongst examinees.
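As a rough illustration of how these two statistics can be computed, here is a minimal Python sketch. It rests on several assumptions that are not taken from our exam reports: the function names are made up, the discrimination index uses the common "upper group minus lower group" method with the top and bottom 27% of exam takers, and the cut-offs in `needs_review` are hypothetical examples rather than Queen's policy. The statistics you receive from your curricular coordinator may be calculated differently.

```python
from typing import List


def item_difficulty(item_correct: List[bool]) -> float:
    """Proportion of exam takers who answered this item correctly."""
    return sum(item_correct) / len(item_correct)


def discrimination_index(item_correct: List[bool],
                         total_scores: List[float],
                         group_fraction: float = 0.27) -> float:
    """Upper-lower method (an assumption, not necessarily the local method):
    proportion correct in the top-scoring group minus the proportion
    correct in the bottom-scoring group."""
    n = len(total_scores)
    k = max(1, round(n * group_fraction))  # size of each comparison group
    # Order exam takers from lowest to highest overall exam score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


def needs_review(difficulty: float, discrimination: float,
                 too_easy: float = 0.90, too_hard: float = 0.30,
                 min_discrimination: float = 0.20) -> bool:
    """Flag an item for a possible rewrite. All cut-offs are hypothetical."""
    return (difficulty > too_easy
            or difficulty < too_hard
            or discrimination < min_discrimination)


# Made-up data: ten exam takers, their overall exam scores,
# and whether each answered one particular item correctly.
total_scores = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
item_correct = [True, True, True, True, False, True, False, False, True, False]

difficulty = item_difficulty(item_correct)
discrimination = discrimination_index(item_correct, total_scores)
print(f"difficulty={difficulty:.2f}, discrimination={discrimination:.2f}, "
      f"review={needs_review(difficulty, discrimination)}")

# An item that everyone answers correctly cannot discriminate at all.
everyone_correct = [True] * 10
easy_difficulty = item_difficulty(everyone_correct)
easy_discrimination = discrimination_index(everyone_correct, total_scores)
print(f"difficulty={easy_difficulty:.2f}, discrimination={easy_discrimination:.2f}, "
      f"review={needs_review(easy_difficulty, easy_discrimination)}")
```

With the made-up data, the first item has a difficulty of 0.60 and a discrimination index of about 0.67, so it is not flagged. The second item, which every exam taker answered correctly, has a difficulty of 1.00 and a discrimination index of 0, which shows how the two statistics are linked.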
If questions from previous years’ tests were deemed too easy or too hard, or had a low discrimination index, they’re ripe for a rewrite. Once you have a handful of questions to rewrite, where do you start? Recall that every MCQ has three parts and any of these could be changed:
- The stem (the set-up for the question)
- The lead-in (the question or start of the sentence to be finished with the answer)
- The options (correct answer and three plausible but incorrect distractors*)
The statistics can inform what changes could be necessary to improve the questions. For one-on-one help with this, feel free to contact Eleni. In the meantime, here are some general suggestions:
Ways to change the stem:
- Can you change the clinical scenario in the stem to change the question but use the same distractors? (For example, a stem that asks for the most likely diagnosis in a patient presenting with confusion, where the correct answer is dementia, can be rewritten so that the correct answer becomes delirium.)
- Ensure the stem includes all information needed to answer the question.
- Is there irrelevant information that needs to be removed?
Ways to change the lead-in:
- Decide if the question is to test recall, comprehension, or application.
- Recall questions should be used sparingly for mid-terms and finals (but are the focus for RATs)
- Verbs for comprehension questions include: predict, estimate, explain, indicate, distinguish. How can these be used with an MCQ? For example: “Select the best estimate of…” or “Identify the best explanation…”
- You can use the same stem, but change the lead-in (and then, of course, the answers). For example, if you had a stem where you described a particular rash and asked students to arrive at the correct diagnosis, you can keep the stem, but change the lead-in to be about management (and then rewrite your answers/distractors).
Ways to change one or more distractors:
- Avoid grammatical cues such as a/an or singular/plural differences
- Check that the answer and the distractors are homogeneous: all should be diagnoses, or all tests, or all treatments, not a mix.
- Make the distractors a similar length to the correct answer
- Ensure the distractors are reasonably plausible, not wildly outrageous responses
- Skip “none of the above” and “all of the above” as distractors
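Some of the checks in the list above are mechanical enough to automate as a first pass. The Python sketch below is only an illustration: the `MCQ` class, the `check_options` function, and the length ratio of 2.0 are all hypothetical, and the sample questions (including the distractors) are invented for the example. Plausibility and homogeneity still need a human reader.

```python
import re
from dataclasses import dataclass
from typing import List

# Phrases the suggestions above recommend skipping as options.
BANNED_OPTIONS = {"none of the above", "all of the above"}


@dataclass
class MCQ:
    """Hypothetical container for the three parts of an MCQ."""
    stem: str
    lead_in: str
    options: List[str]   # correct answer plus distractors
    answer_index: int    # position of the correct answer in options


def check_options(q: MCQ, max_length_ratio: float = 2.0) -> List[str]:
    """Return a list of possible problems; an empty list means none found."""
    problems = []

    # SAC policy (per the footnote) limits MCQs to four options.
    if len(q.options) != 4:
        problems.append("expected four options")

    for opt in q.options:
        if opt.strip().lower().rstrip(".") in BANNED_OPTIONS:
            problems.append(f'avoid "{opt}" as an option')

    # Grammatical cue: a lead-in ending in "a" or "an" can give away the answer.
    words = re.findall(r"[a-z]+", q.lead_in.lower())
    if words and words[-1] in {"a", "an"}:
        problems.append("lead-in ends with a/an")

    # Length cue: the correct answer should not stand out by its length.
    # The ratio used here is an arbitrary, hypothetical threshold.
    correct_len = len(q.options[q.answer_index])
    distractor_lens = [len(o) for i, o in enumerate(q.options)
                       if i != q.answer_index]
    if distractor_lens:
        if (correct_len > max_length_ratio * max(distractor_lens)
                or correct_len * max_length_ratio < min(distractor_lens)):
            problems.append("correct answer length differs from distractors")

    return problems


# Invented examples for illustration only.
good = MCQ(
    stem="An older patient presents with acute confusion.",
    lead_in="What is the most likely diagnosis?",
    options=["Delirium", "Dementia", "Depression", "Psychosis"],
    answer_index=0,
)
bad = MCQ(
    stem="An older patient presents with acute confusion.",
    lead_in="The patient most likely has an",
    options=["Infection of the urinary tract causing acute confusion",
             "Ulcer",
             "None of the above"],
    answer_index=0,
)

good_problems = check_options(good)
bad_problems = check_options(bad)
for problem in bad_problems:
    print(problem)
```

The first question passes every check. The second trips all four: it has three options, includes "None of the above", ends its lead-in with "an", and has a correct answer far longer than either distractor.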
As you dig into question rewriting, remember the Education Team is available to assist. Feel free to get in touch.
Watch for MCQ Writing 2.0 later this spring.
* Yes, there could be more than three distractors, but not at Queen’s UGME. The Student Assessment Committee (SAC) policy limits MCQs to four options.
Everything you need to know about exam question types in our curriculum!
Are all exam questions created equal? Not really: different types of questions test different levels of understanding. In the UGME program, we use a variety of exam questions to assess student learning, broadly classified as multiple-choice questions (MCQs) and short-answer questions (SAQs). But within these broad categories are a range of types of questions designed to test different levels of cognition. We use these different types of questions at different points both within courses and within the program.
Based on Bloom’s Taxonomy
Bloom’s taxonomy is a classification system used to define and distinguish different levels of human cognition: thinking, learning, and understanding. The taxonomy was first published in 1956 by Benjamin Bloom and colleagues; a revised version, led by Lorin Anderson and David Krathwohl, followed in 2001. In the original version, there are six levels of cognitive behaviours that explain thinking skills and abilities of learners. The original six levels of cognition as described by Bloom are: knowledge, comprehension, application, analysis, synthesis and evaluation. Educators have used Bloom’s taxonomy to inform or guide the development of assessment, such as with the construction of MCQs. MCQs are widely used for measuring knowledge, comprehension and application of learning outcomes. Our curriculum uses MCQs in different assessment formats, for different purposes, and those are described below.
You may hear acronyms and terms about assessment in our UGME program: RATs, MCQs, SAQs, Key Features. Here is a brief description of each:
Readiness Assessment Tests (RATs)
RATs used in our curriculum often consist of 10-15 multiple-choice questions that are linked directly to the readings (and/or prior lectures). A RAT focuses on foundational concepts that will be important for following SGL activities. MCQs found on a RAT test for knowledge (i.e., recall of information) and less for application of knowledge. Examples of verbs used in the question stem that would test knowledge include: define, list, label, recall, select, name, outline, or match.
Multiple-choice questions (MCQs): on midterms and finals
There are three components to an MCQ: the stem, lead-in question, and options that consist of one correct answer and typically three distractors (wrong answers). The stem should be directly linked to a learning objective assigned to a course. MCQs that are used on midterms and final exams often test for comprehension and application of knowledge; this goes beyond the recall of information that is typical of MCQs on RATs. Some multiple-choice questions may assess simple recall, depending on the learning objectives of the course, but these should be kept to a minimum. Verbs used in the question stem to test comprehension include: predict, estimate, explain, indicate, distinguish, or give examples. Verbs that would test application include prompts such as: solve, compute, illustrate, interpret, demonstrate, or compare.
Short-answer Questions (SAQs)
SAQs typically are composed of a case scenario followed by a prompt that requires a written answer that varies in length from one or two words to several sentences. SAQs often test the higher cognitive skills in Bloom’s taxonomy. Final examinations in our curriculum are typically composed of a mix of MCQs and SAQs. To test analysis, verbs in the question stem include: explain, arrange, select, infer, calculate, or distinguish. Verbs such as develop, design, plan, devise, formulate, or generalize test for synthesis, whereas verbs in the question stem to test evaluation include: argue, assess, estimate, justify, predict, compare, conclude, or defend.
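The verb lists given above for RATs, MCQs and SAQs can be collected into a simple lookup. The Python sketch below is only an illustration: the `BLOOM_VERBS` table copies the verbs from this article, while the `levels_for` function name is made up. It also shows that several verbs (for example explain, select, estimate) appear under more than one level, so the verb on its own does not settle which level a question tests.

```python
# Verbs listed in this article, grouped by Bloom's original six levels.
BLOOM_VERBS = {
    "knowledge": ["define", "list", "label", "recall", "select", "name",
                  "outline", "match"],
    "comprehension": ["predict", "estimate", "explain", "indicate",
                      "distinguish", "give examples"],
    "application": ["solve", "compute", "illustrate", "interpret",
                    "demonstrate", "compare"],
    "analysis": ["explain", "arrange", "select", "infer", "calculate",
                 "distinguish"],
    "synthesis": ["develop", "design", "plan", "devise", "formulate",
                  "generalize"],
    "evaluation": ["argue", "assess", "estimate", "justify", "predict",
                   "compare", "conclude", "defend"],
}


def levels_for(verb: str) -> list:
    """Return every level (in Bloom's order) whose list includes the verb."""
    verb = verb.strip().lower()
    return [level for level, verbs in BLOOM_VERBS.items() if verb in verbs]


for v in ["define", "explain", "estimate", "justify"]:
    print(v, "->", levels_for(v))
```

A verb that maps to two levels, such as "explain", is a prompt to look at the rest of the question: the stem and the expected answer decide whether it is testing comprehension or analysis.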
Key Features Questions
Key features problems are used by the Medical Council of Canada for the assessment of clinical decision-making skills in the MCCQE Part 1. Key features problems have a case scenario usually followed by two or three questions, each question testing one or more key features. A key feature is defined as a critical step in the resolution of a clinical problem, and key-feature problems consist of clinical case scenarios followed by questions that focus only on those critical steps. While knowledge is an important feature for effective problem solving, the challenge posed by key features problems is the application of knowledge to guide clinical decision-making. For each question, instructions may require selection of whatever number of responses is appropriate to the clinical tasks being assessed, and there may be more than one response in the answer key. The development of key features problems for clinical decision-making is being piloted in the Clerkship curriculum courses this year.
How do we administer our tests?
Queen’s Undergraduate Medical Education has moved to an electronic exam system called ExamSoft for the administration of midterms and final exams in Preclinical and the Clerkship curricular courses. Medical students no longer write exams on paper; rather they do it all on laptops. This greatly facilitates marking of exams, and it means we are no longer managing huge volumes of paper and deciphering student handwriting.
- Page, G., Bordage, G. & Allen, T. (1995). Developing Key-feature problems and examinations to assess clinical decision-making skills. Academic Medicine, 70 (3).
- Laura April McEwen, OHSE 2011, MCQ Checklist
Using the IDEAL banks of questions for your assessments
Obtaining IDEAL Consortium Questions
Queen’s School of Medicine has joined the IDEAL Consortium, an international assessment item-sharing collaboration among Schools of Medicine. The Consortium has 27 member schools from 11 countries. Queen’s and UBC are currently the only Canadian members.
The IDEAL Restricted Question Bank contains over 20,625 assessment items including 17,109 MCQs, 539 short-answer questions and 461 OSCE stations. Collectively, members contribute about 4,000 new questions to the restricted and non-restricted question banks annually.
Restricted Bank: Please contact your Curricular Coordinator to request sets of restricted bank questions in your subject area or questions on particular topics. (Zdenka Ko for Year 1, Tara Hartman for Year 2, Jane Gordon for Clerkship Rotations and Candace Trott for “C” courses in clerkship.) Restricted bank questions need to be kept secure, so they can only be used on final examinations. A Word document containing the questions (as well as their answers and “item numbers”) will be couriered to you, or you can request that a secure MEdTech community be created for you to share restricted questions with other faculty members in your course.
To use restricted questions on final exams, simply provide your Curricular Coordinator with the item number of each question and the order in which you would like the questions to appear on the final exam. If you are sharing restricted questions via a secure MEdTech community, you can copy and paste your question selections into a Word document and upload it to the Curricular Coordinator’s folder in the secure community. It is important that the IDEAL restricted bank questions not be emailed except in password-protected Word files. The restricted questions must not be viewed by students except during the writing of final exams.
You can specify edits to any of the IDEAL items – including OSCE stations. If you edit the items yourself, please highlight your edits so that your Curricular Coordinator can transfer the edits to the local copy of the IDEAL bank.
The old LXR bank contained many duplicate and triplicate questions, so please let your Curricular Coordinator know the origin of each exam question (IDEAL? LXR? Original? From a colleague?). We especially need to know, for copyright and item submission reasons, if any questions did not originate at Queen’s. Questions that did not originate at Queen’s will be marked, “Do not submit to IDEAL”, but can be stored in the local copy of the IDEAL bank and used on Queen’s exams.
Unrestricted Bank: Unrestricted bank items can be used in online quizzes, in clicker sessions, on midterms, etc. Students can have full access to all unrestricted bank questions. Currently the MEdTech team is creating an interface for the unrestricted bank so that faculty members will have full access to the questions. At present, requests for emailed sets of unrestricted bank questions can be sent to Catherine Isaacs (firstname.lastname@example.org).