By Matt Ferris, MA, MBA, CAE, ELS, National Board of Certification and Recertification for Nurse Anesthetists (NBCRNA), and Nathan Thompson, PhD, Assessment Systems Corporation (ASC)
For organizations using or contemplating computer-based testing, technology-enhanced items (TEIs) have the potential to improve fidelity and reliability and spark the creativity of item writers. They offer an excellent opportunity to improve your assessment program, but are not without concerns. This article provides a brief introduction to some benefits and potential pitfalls of TEIs.
What Are TEIs?
TEIs are most broadly construed as any items that use technology to create interactions with content that are impossible using paper and pencil. Hotspot items, for instance, which require examinees to select a spot on a graphic, are certainly TEIs. But the old-school multiple-choice question (MCQ) that includes a video clip could also be considered a TEI.
Therefore, we might consider two categories of TEIs: those that merely enhance a traditional item format or move it onto a computer screen, and those that are truly designed with current technology in mind. The most notorious offender in the first group is the “gridded” item from K-12 assessment, which literally takes the antiquated bubble sheet and puts it on the computer screen.
More recently, a new generation of TEIs has emerged that is truly in the second camp. These TEIs simulate or closely approximate tasks the candidate will need to perform, such as operating complex equipment, utilizing special software (or even common software like MS Excel) and writing computer code. These are often called innovative item types or formats.
Developing TEIs: Content Perspective
Traditional MCQs have been criticized for tending to assess memory or comprehension, as opposed to higher-level thinking skills [1,2]. They have been accused of rewarding guessing and of measuring test-taking ability along with the targeted knowledge [3,4]. Whether or not one believes such criticism is justified, the perception persists among the test-taking public. The National Board of Certification and Recertification for Nurse Anesthetists (NBCRNA) moved toward TEIs in 2008 largely to mitigate such concerns, pushed along by the availability of item-banking tools that greatly facilitate writing, reviewing and maintaining TEIs.
When NBCRNA began developing drag-and-drop matching and ordering items, hotspots, short-answer numerical and multiple-correct response items, volunteers still used Word templates for first drafts. Most item-writing tools now provide off-the-shelf templates for these types of TEIs, though this should not be taken for granted when choosing a vendor. But even if much of the basic design is now a nonissue, psychometric consultation and usability testing with representative candidates are still advisable to help answer many fine-tuning questions, such as:
- For TEIs that have multiple response elements, such as drag-and-drop items, will partial credit be given?
- For multiple-correct response items, will examinees be told the correct number of responses to select, and will delivery technology prevent selection of too many or too few?
- With short-answer items, how will candidates’ creative entry of correct answers (misspellings; varied use of decimal places, fractions, etc.) be managed?
Such details are important to the psychometric value TEIs will yield. They are also of perspiration-provoking interest to candidates experiencing TEIs for the first time, and fairness dictates communicating them early.
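To make the scoring questions above concrete, here is a minimal Python sketch of three possible rules: proportional partial credit for a drag-and-drop item, an all-or-nothing alternative, and tolerant scoring of a typed numerical answer. The function names, the proportional rule and the tolerance parameter are illustrative assumptions, not NBCRNA's scoring rules or any vendor's API.

```python
from fractions import Fraction


def partial_credit(response, key, max_points=1.0):
    """Award points in proportion to the elements placed correctly.

    `response` and `key` map each draggable element to a target slot.
    Proportional credit is only one possible rule (an assumption here).
    """
    if not key:
        raise ValueError("key must contain at least one element")
    correct = sum(1 for element, target in key.items()
                  if response.get(element) == target)
    return max_points * correct / len(key)


def all_or_nothing(response, key):
    """Dichotomous alternative: full credit only if every element is correct."""
    if not key:
        raise ValueError("key must contain at least one element")
    return 1.0 if all(response.get(e) == t for e, t in key.items()) else 0.0


def parse_numeric(text):
    """Read a typed answer such as '0.50', '.5' or '1/2' as an exact Fraction.

    Returns None when the entry cannot be interpreted as a number.
    """
    try:
        return Fraction(text.strip())
    except (ValueError, ZeroDivisionError):
        return None


def score_numeric(text, key, tolerance="0"):
    """Score 1 if the typed value is within `tolerance` of the keyed value."""
    value = parse_numeric(text)
    if value is None:
        return 0
    return 1 if abs(value - Fraction(key)) <= Fraction(tolerance) else 0


# Hypothetical four-element ordering item with one element misplaced.
key = {"step_a": 1, "step_b": 2, "step_c": 3, "step_d": 4}
response = {"step_a": 1, "step_b": 2, "step_c": 4, "step_d": 3}
print(partial_credit(response, key), all_or_nothing(response, key))
print(score_numeric("1/2", "0.5"), score_numeric("half", "0.5"))
```

Whichever rule is chosen, writing it down this explicitly makes it easier to tell candidates in advance how such entries will be handled.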
Answering these questions also makes possible detailed item-writing specifications that increase quality and provide a framework for creativity. Item-writing workshops help writers apply those specifications, and can even be moved online, potentially as continuing education (CE), as NBCRNA has done. Even without that level of investment, an organization needs, at a minimum, clear guidelines for TEI writers, with many examples to inspire adaptations of basic models.
Developing TEIs: Psychometric Perspective
The most important consideration in developing a TEI should be its psychometric properties, not the pedagogy or the technology [5]. Otherwise, the TEI is just expensive window dressing. The design of the TEI should revolve around a cognitive model with a scoring algorithm that is intended to maximize the distinction between candidates of different ability. Scoring is often self-evident for a simple TEI, but one that simulates complex software should have a clear cognitive model with detailed scoring rules. In K-12 testing, evidence-based selected response (EBSR) items are a counterexample. They violate item independence as well as the basic psychometric assumption that candidates with higher ability should score higher on the item: examinees can earn exactly one point only by guessing, which puts them at the same ability level as examinees who earn zero points.
Remain wary of the pitfall of overengineering for the sake of technology itself. This can lead to an assessment with a lower ratio of reliability or precision to item development cost or testing time. For example, watch out for a TEI that discriminates one and a half times better than a traditional item but costs three times as much and takes examinees three times as long. With some well-known TEI types, NBCRNA has found higher difficulty or longer response times but better discrimination, still providing a positive trade-off [6]. Another more complex, custom TEI, however, has shown less benefit relative to NBCRNA’s cost. Russell [7] presents a framework for evaluating the utility of TEIs.
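The trade-off in the example of one and a half times the discrimination at three times the cost and time can be worked through with a few lines of Python. This is only a back-of-the-envelope sketch: dividing the gain ratio by the cost or time ratio is an assumption made here for illustration, and the function name is hypothetical. It is not the framework Russell describes.

```python
def relative_efficiency(discrimination_ratio, cost_ratio, time_ratio):
    """Compare a TEI with a traditional item, where each argument is
    (TEI value) / (traditional item value).

    A result below 1.0 means the TEI returns less discrimination per
    unit of cost (or testing time) than the traditional item.
    """
    if min(discrimination_ratio, cost_ratio, time_ratio) <= 0:
        raise ValueError("all ratios must be positive")
    return {
        "per_unit_cost": discrimination_ratio / cost_ratio,
        "per_unit_time": discrimination_ratio / time_ratio,
    }


# The example from the text: 1.5x discrimination, 3x cost, 3x examinee time.
print(relative_efficiency(1.5, 3.0, 3.0))
# {'per_unit_cost': 0.5, 'per_unit_time': 0.5}
```

On this simple measure, the hypothetical TEI delivers half the discrimination per dollar and per minute of a traditional item, which is the kind of result that should prompt a closer look before adoption.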
Rollout and Maintenance
The first question in rolling out TEIs (in fact, the first step of all) is whether your delivery platform can support what you have in mind. As noted, most vendors can handle a number of TEI formats out of the box. But if you have a new idea for a simulation or complex vignette that will be custom-developed, talk to your vendor first.
After creating the new item types, organizations should pilot them with examinees and other stakeholders to evaluate performance, training needs, usability and other practical issues. Providing sample items and online tutorials for candidates using the new format is also recommended. Ongoing use of the TEIs should include a feedback loop from the candidates, as well as frequent performance checks, just as with traditional items. Candidates may need reassurance as to effects of TEIs on passing rates, and performance data can help with this too.
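As one illustration of what a routine performance check might involve, the sketch below computes two classical item statistics from pilot data: difficulty (the mean item score) and discrimination (the correlation between item score and total score, which is the point-biserial when the item is scored 0/1). The function names and sample data are hypothetical, and operational programs may rely on other statistics or on item response theory instead.

```python
from math import sqrt


def item_difficulty(item_scores):
    """Mean item score; for 0/1 items this is the proportion correct."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores, total_scores):
    """Pearson correlation between item scores and total test scores.

    Returns None when either set of scores has no variance, because the
    correlation is undefined in that case.
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("score lists must have the same length")
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical pilot data: one TEI scored 0/1 and each examinee's total score.
tei_scores = [1, 1, 0, 0]
totals = [10, 8, 5, 3]
print(item_difficulty(tei_scores))
print(item_discrimination(tei_scores, totals))
```

Running the same functions on TEIs and traditional items from the same form gives a like-for-like comparison. A common refinement, not shown here, is to subtract the item's own score from the total before correlating.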
Producing and maintaining an MCQ exam with reliable, defensible scores that are valid for their intended purpose is no small feat in itself. But if testing is already computerized, and item-writing tools support it, it is worth considering whether TEIs could raise an exam to the next level with a manageable investment of effort. However, the step forward can be a lot of work with potentially little reward, and therefore should be evaluated closely. As Bryant [8] notes, many organizations roll out TEIs without evaluating psychometrics, cost-benefit ratio, or impact on candidates. TEIs are definitely not all alike in these respects, so innovation should be tempered with caution.
References
1. Martinez, M.E. (1999). Cognition and the question of test item format. Educational Psychologist, 34(4), 207-218.
2. Stanger-Hall, K.F. (2012). Multiple-Choice Exams: An Obstacle for Higher-Level Thinking in Introductory Science Classes. CBE Life Sci Educ, 11(3), 294-306.
3. Dulger, M., & Deniz, H. (2017). Assessing the Validity of Multiple-Choice Questions in Measuring Fourth Graders' Ability to Interpret Graphs about Motion and Temperature. International Journal of Environmental and Science Education, 12(2), 177-193.
4. McKenna, P. (2018). Multiple Choice Questions: Answering Correctly and Knowing the Answer. Paper presented at the International Association for Development of the Information Society (IADIS) International Conference on e-Learning, Madrid, Spain, July 17-19, 2018.
5. Parshall, C.G., Spray, J.A., Kalohn, J.C., & Davey, T. (2002). Practical Considerations in Computer-Based Testing. New York: Springer-Verlag.
6. Muckle, T. (2012). Beyond Multiple Choice: Strategies for Planning and Implementing an Innovative Item Initiative. Washington, DC: Institute for Credentialing Excellence.
7. Russell, M. (2016). A Framework for Examining the Utility of Technology-Enhanced Items. Journal of Applied Testing Technology, 17(1), 20-32.
8. Bryant, W. (2017). Developing a Strategy for Using Technology-Enhanced Items in Large-Scale Standardized Tests. Practical Assessment, Research, and Evaluation, 22(1). Available at: https://scholarworks.umass.edu/pare/vol22/iss1/1.