Designing multiple choice test items


What will be covered
•
Multiple choice items
•
Alternatives in assessment
Quiz
You know not to run with scissors. Did you also know that scissors
are linked to many different superstitions? Show how sharp you
are by selecting the one FALSE superstition from the statements
below.
A. Using scissors on New Year's Day will double your
fortune.
B. Placing scissors under a patient's pillow will cut their
pain.
C. Nailing a pair of scissors in the open position above a
door will protect the house from witchcraft.
D. Dropping a pair of scissors is a warning that a lover is
unfaithful.
Quiz answer: A
Information for this quiz comes from the commercial web site
Uncommon Scissors, whose Scissors History and Superstitions
explains that scissors may have been in use in Egypt "as far
back as 1500 BC." They also note that, centuries later, "as
calligraphy spread throughout the Islamic countries, concave
blades were developed to cut paper. Scissors became a part
of everyone's life and not just for the use of guilds and the
wealthy." Uncommon Scissors also shares many old
superstitions about scissors, including how they could cut a
patient's pain, protect a house from witchcraft, or signal a
lover's infidelity. The one false statement above is the first
one; the actual superstition maintains that using scissors on
New Year's Day will "cut off fortune."
Designing multiple choice test items
Week 8
Multiple choice
•
MC items are all receptive, or selective: test-takers choose
from a set of responses rather than creating a response.
(Other receptive item types include true-false questions and
matching lists)
•
Every MC item has a stem, which presents a stimulus, and
several options or alternatives (usually between three and
five) to choose from.
•
One of those options, the key, is the correct response, while
the others serve as distractors.
Multiple choice items
•
A preferred mode for large-scale tests, because MC
items provide an "objective" means for
determining correct or incorrect responses.
•
Scoring procedures are streamlined (for either
scannable computerized scoring or hand-
scoring with a hole-punched grid) for fast
turnaround time.
Weaknesses in multiple-choice items
•
The technique tests only recognition knowledge.

•
Guessing may have a considerable effect on test
scores.
•
The technique severely restricts what can be tested.
•
It is very difficult to write successful items.
•
Washback may be harmful.
•
Cheating may be facilitated.
However, the two principles that stand out in
support of MC items are practicality and reliability.
Guidelines for designing MC items
1. Design each item to measure a specific
objective.
2. State both stem and options as simply and
directly as possible.
Do not use superfluous words. Another
rule of succinctness is to remove needless
redundancy from your options.
Guidelines (cont.)
3. Make certain that the intended answer is clearly the
only correct one. Eliminating unintended possible
answers is often the most difficult problem of
designing MC items. With only a minimum of context
in each stem, a wide variety of responses may be
perceived as correct.
A CBT is a method of administering tests in which responses are …
recorded, assessed, or both.

a. Electrically
b. Appropriately
c. Personally
d. Carefully
Guidelines (cont.)
4. Use item indices to accept, discard, or revise items.
The appropriate selection and arrangement of suitable MC
items on a test can best be accomplished by measuring items
against three indices:
•
Item facility (IF) (or Item difficulty)
•
Item discrimination (ID)/Item
differentiation
•
Distractor efficiency
Item facility (IF)
•
is the extent to which an item is easy or
difficult for the proposed group of test-takers.
•
An item that is too easy (e.g., 99% of the test-takers get it right)
•
An item that is too difficult (e.g., 99% get it wrong)
→ Neither does anything to separate high-ability and
low-ability test-takers.
Item facility (IF)
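The formula looks like this:
IF = (number of test-takers answering the item correctly) / (total number of test-takers responding to that item)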
Item Facility (IF)
•
There is no absolute IF value that must be met.
•
Appropriate test items will generally have IFs that
range between .15 and .85.
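The IF calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definition of IF as the number of test-takers answering the item correctly divided by the number responding to it. The function names `item_facility` and `within_suggested_range` and the sample figures are hypothetical, not taken from the slides.

```python
def item_facility(num_correct: int, num_responding: int) -> float:
    """Proportion of test-takers who answered the item correctly."""
    if num_responding <= 0:
        raise ValueError("num_responding must be positive")
    if not 0 <= num_correct <= num_responding:
        raise ValueError("num_correct must be between 0 and num_responding")
    return num_correct / num_responding


def within_suggested_range(if_value: float, low: float = 0.15, high: float = 0.85) -> bool:
    """True if the IF falls inside the suggested .15 to .85 range."""
    return low <= if_value <= high


# Hypothetical data: 13 of 20 test-takers answered the item correctly.
if_value = item_facility(13, 20)
print(f"IF = {if_value:.2f}, in range: {within_suggested_range(if_value)}")
```

Because no absolute IF value must be met, the range check is best read as a prompt to review an item, not as an automatic rule for discarding it.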
Item Discrimination (ID)
•
is the extent to which an item differentiates between high-
and low-ability test-takers.
•
An item on which high-ability students and low-ability
students score equally well would have poor ID
because it did not discriminate between the two
groups.
•
An item that garners correct responses from most of
the high-ability group and incorrect responses from
most of the low-ability group has good discrimination
power.
Item Discrimination (ID)
Item #                              Correct   Incorrect
High-ability students (top 10)         7          3
Low-ability students (bottom 10)       2          8
Item Discrimination (ID)
•
ID = (7 - 2) / 10 = 5 / 10 = 0.50 → The result tells us that
the item has a moderate level of ID.
•
High discriminating power would approach 1.0, and no
discriminating power at all would be zero.
•
In most cases, you would want to discard an item that
scored near zero.
•
As with IF, no absolute rule governs the establishment
of acceptable and unacceptable ID indices.
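The worked ID example above can be reproduced with a short Python sketch. It assumes the high-ability and low-ability groups are the same size, as in the example (10 each), and the function name `item_discrimination` is illustrative only.

```python
def item_discrimination(high_correct: int, low_correct: int, group_size: int) -> float:
    """(correct in high group - correct in low group) / size of one group.

    Assumes both ability groups contain group_size test-takers.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (high_correct - low_correct) / group_size


# Figures from the example above: 7 of the top 10 and 2 of the bottom 10 correct.
example_id = item_discrimination(high_correct=7, low_correct=2, group_size=10)
print(f"ID = {example_id:.2f}")  # prints "ID = 0.50"
```

An item on which both groups score equally well returns 0.0, which matches the "no discriminating power" case described above.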
Distractor Efficiency
is the extent to which
•
the distractors “lure” a sufficient number of
test-takers, especially lower-ability ones, and
•
those responses are somewhat evenly
distributed across all distractors.
Note: C is the correct response
Choices                       A   B   C*  D   E
High-ability students (10)    0   1   7   0   2
Low-ability students (10)     3   5   2   0   0
DE (cont.)
The item might be improved in two ways:
a) Distractor D doesn’t fool anyone. Therefore it probably has no
utility. A revision might provide a distractor that actually attracts a
response or two.
b) Distractor E attracts more responses (2) from the high-ability group
than the low-ability group (0). Why are good students choosing this
one? Perhaps it includes a subtle reference that entices the high
group but is “over the head” of the low group, and therefore the
latter students don’t even consider it.
•
The other two distractors (A and B) seem to be fulfilling their
function of attracting some attention from the lower-ability
students.
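The same review of distractors can be sketched in Python using the response counts from the table above. This is only an illustration: the function name `review_distractors` and the three simple rules it applies (no takers, at least as many high-ability as low-ability takers, otherwise functioning) are assumptions made for the example, not a standard procedure from the slides.

```python
def review_distractors(high: dict[str, int], low: dict[str, int], key: str) -> dict[str, str]:
    """Return a short note for each distractor based on who chose it."""
    notes = {}
    for option in high:
        if option == key:
            continue  # the key is the correct response, not a distractor
        h, l = high[option], low.get(option, 0)
        if h + l == 0:
            notes[option] = "attracts no one; consider replacing"
        elif h >= l:
            notes[option] = "attracts at least as many high-ability as low-ability test-takers; review"
        else:
            notes[option] = "functioning: mainly attracts low-ability test-takers"
    return notes


# Response counts from the table above (C is the correct response).
high_group = {"A": 0, "B": 1, "C": 7, "D": 0, "E": 2}
low_group = {"A": 3, "B": 5, "C": 2, "D": 0, "E": 0}

for option, note in review_distractors(high_group, low_group, key="C").items():
    print(option, "->", note)
```

With these counts, D is flagged as attracting no one, E is flagged for review, and A and B are reported as functioning, in line with the discussion above.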
Alternatives in assessment
•
Portfolios
•
Journals
•
Conferences and interviews
•
Observations
•
Self- and peer-assessments
Portfolios
•
Is “a purposeful collection of students’ work
that demonstrates … their efforts, progress,
and achievements in given areas” (Genesee &
Upshur, 1996, p. 99)
Portfolios (cont.)
Portfolios include materials such as
•
Essays and compositions in draft and final forms;
•
Reports, project outlines;
•
Poetry and creative prose;
•
Artwork, photos, newspapers or magazine clippings;
•
Audio and/or video recordings of presentations,
demonstrations, etc.
•
Journals, diaries, and other personal reflections;
•
Tests, test scores, and written homework exercises;
•
Notes on lectures; and
•
Self- and peer-assessments – comments, evaluations,
and checklists
Attributes of Portfolios
•
Collecting
•
Reflecting
•
Assessing
•
Documenting
•
Linking
•
Evaluating
Benefits of Portfolios
•
Foster intrinsic motivation, responsibility and ownership,
•
Promote student-teacher interaction with the teacher as
facilitator,
•
Individualize learning and celebrate the uniqueness of each
student,
•
Provide tangible evidence of a student’s work,
•
Facilitate critical thinking, self-assessment, and revision
processes,
•
Offer opportunities for collaborative work with peers,
•
Permit assessment of multiple dimensions of language
learning.
