Across the Industry

Beyond the Inbox: Comparing Longitudinal Assessments as an Alternative to Email Surveys

Professional practice analysis (PPA), or job task analysis, is a critical process conducted to ensure the content validity of certification assessments (American Educational Research Association et al., 2014). PPAs form the key evidential linkage between the certification assessment and clinical or other professional practice settings. This article summarizes the process and outcomes of the most recent PPA performed for the National Board of Certification and Recertification for Nurse Anesthetists’ (NBCRNA) Maintaining Anesthesia Certification (MAC). We explored a principled assessment design approach to collecting PPA data and evaluated whether a longitudinal assessment modality could provide a reliable and representative alternative to traditional survey methods.

Traditional PPAs use a mixed-method strategy, involving both focus groups and surveys, which collect validation data via rating scales. Like many certifying organizations, the NBCRNA has relied on web-based surveys to collect PPA data from practitioners. However, these methods often face limitations such as low response rates, incomplete data, response sets and recall bias. Previous PPAs conducted by the NBCRNA for the Continued Professional Certification Assessment (CPCA) over the last 10 years yielded response rates of 12% in 2015 and 13.6% in 2019 (Ferris et al., 2021). Low participation rates can limit the generalizability of findings and reduce confidence that the data accurately represents the overall population being studied.

Longitudinal Assessment Pilot

To overcome the limitations of traditional PPAs, we launched a pilot using MAC Check, a continuous, on-demand longitudinal assessment platform (MAC Check, n.d.). The MAC Check replaced the previous point-in-time CPCA, which was administered every eight years either at a test center or through online remote proctoring.

We approached this pilot as a feasibility study framed by principled assessment design, which emphasizes embedding the means of evolving an assessment’s design and development directly into the assessment process itself (Mislevy, Steinberg & Almond, 1999). In our case, this meant collecting frequency ratings immediately after each MAC Check item was answered, eliminating the need for a separate PPA survey.

Before launching the pilot, we assembled a panel of subject matter experts to review and update the MAC Check content outline. This outline organizes the assessment into core domains, subdomains and knowledge elements relevant to nurse anesthesia practice. Each item in the MAC Check was then systematically mapped to a corresponding domain, subdomain or knowledge element in the revised outline, ensuring alignment across the item pool and allowing us to group and analyze item-level frequency ratings.
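To illustrate the structure, here is a minimal sketch of such a mapping in Python (the item IDs and outline codes are hypothetical, not NBCRNA’s actual identifiers):

```python
# Minimal sketch of an item-to-outline mapping (hypothetical IDs and codes).
# Each MAC Check item points to one node of the revised content outline,
# written here as "domain.subdomain.knowledge_element".
item_to_outline = {
    "ITEM-00017": "2.1.1",
    "ITEM-00342": "2.1.1",
    "ITEM-01125": "3.4.2",
}
```

Keeping one outline code per item aligns the item pool with the outline, so item-level ratings can later be grouped at any level of the hierarchy.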

During the three-month data collection period, MAC Check participants rated, based on their personal experience, how often they use the knowledge tested by each item in practice. The frequency rating scale included four options: “never or rarely,” “monthly,” “weekly” or “daily.” We then aggregated and analyzed the ratings to identify patterns of knowledge use across the nurse anesthesia profession.

Figure 1 shows how MAC Check items were mapped to the content outline and how frequency data was aggregated. For example, we calculated the frequency measure for knowledge element 2.1.1 by averaging the ratings from the 84 items mapped to that specific topic in the content outline. If more than 50% of participants rated an item “never or rarely,” it was flagged for review. The expert panel then reconvened to evaluate these flagged areas and to formalize recommendations for any further revisions needed to the MAC Check content outline.
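As an illustration, here is a minimal sketch of this aggregation and flagging logic in Python, assuming the four scale options are coded 0–3 and each item’s ratings are stored as a list of labels (the numeric coding, function names and data layout are assumptions for illustration, not the platform’s actual implementation):

```python
from collections import defaultdict

# Ordinal coding for the four-point frequency scale (an illustrative
# assumption; the article does not specify any numeric coding).
SCALE = {"never or rarely": 0, "monthly": 1, "weekly": 2, "daily": 3}

def flag_items(ratings_by_item, threshold=0.5):
    """Flag items where more than `threshold` of participants chose
    'never or rarely'. Assumes each item has at least one rating."""
    return [
        item_id
        for item_id, ratings in ratings_by_item.items()
        if ratings.count("never or rarely") / len(ratings) > threshold
    ]

def element_frequency(ratings_by_item, item_to_outline):
    """Average the coded ratings of every item mapped to each knowledge
    element (e.g., the 84 items mapped to element 2.1.1)."""
    pooled = defaultdict(list)
    for item_id, ratings in ratings_by_item.items():
        pooled[item_to_outline[item_id]].extend(SCALE[r] for r in ratings)
    return {code: sum(vals) / len(vals) for code, vals in pooled.items()}
```

Using the hypothetical mapping sketched earlier, element_frequency pools the ratings of every item tagged 2.1.1 before averaging, mirroring the aggregation shown in Figure 1, while flag_items applies the 50% “never or rarely” review threshold.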

Results of the Initial Pilot

We collected 573,297 frequency ratings across 1,737 MAC Check items from 21,338 participants between Nov. 1, 2024, and Jan. 31, 2025. We flagged only five knowledge elements for review because more than half of the respondents rated them as “never or rarely” used in practice. Our expert panel examined the flagged areas to determine whether further revisions to the content outline were necessary.

Additionally, the data showed that respondents used the knowledge in all four primary domains with similar regularity. This finding supported our decision to assign equal weight to each primary domain in the MAC Check.

By capturing frequency ratings immediately after each item, the longitudinal assessment format provided a detailed and timely view of how certified registered nurse anesthetists (CRNAs) apply their knowledge. This method reduced the potential for recall bias often associated with traditional surveys. It also increased engagement: because MAC Check is required for certification renewal, participants interacted with the assessment more frequently and consistently than they would with a one-time survey.

Successes and Future Considerations

We integrated frequency ratings directly into the MAC Check as part of our principled assessment design approach. This allowed us to collect self-reported data from a large, representative sample of practicing CRNAs undergoing certification renewal. By embedding data collection within the assessment itself, we avoided the inherent limitations of optional surveys and drew participation from individuals who might otherwise be less engaged. Because MAC Check is required for certification renewal, we saw consistent engagement without follow-up reminders. This design also eliminated the need for email outreach, which can exasperate survey recipients.

The large sample size and close demographic alignment with the broader CRNA population strengthened the representativeness of our findings. This made it easier for the panel to interpret the data and increased their confidence in the decisions made during content outline revisions. Additionally, the continuous nature of data collection provided a more dynamic view of how knowledge is applied in real time, allowing us to detect emerging practice trends that a one-time survey might miss.

We discussed how to interpret item-level data when making decisions about broader domains, subdomains and knowledge elements. Panel members expressed concern that low-frequency ratings on individual items could skew results and asked to see the specific items that were rated lower. To address this, we limited the panel’s role to evaluating the content outline itself based on aggregated patterns across multiple items rather than individual outlier items. As part of our continuous quality improvement plan, we will reevaluate the content outline every five years as part of each PPA cycle, or sooner if participant feedback warrants.

While the longitudinal assessment format offers many advantages, it requires a dedicated platform, ongoing resources and IT infrastructure to manage the data. We recognize that participant fatigue is a potential concern, especially given the mandatory nature of MAC Check; however, its self-paced design helps mitigate this risk. To monitor for signs of response fatigue or declining engagement, we included an optional survey that collects feedback on the participant experience, helping us ensure that the format remains effective and user-friendly.

While traditional surveys may be quicker to administer and require a lower initial resource investment, our experience suggests the longitudinal assessment format is a feasible alternative. Organizations should weigh these benefits against the resource demands and consider their specific goals, as well as the user experience, when choosing a data collection method for their PPA process. Table 1 offers a high-level comparison of traditional survey methods and the longitudinal format to help inform such decisions.

Table 1. Comparison of Traditional Surveys and a Longitudinal Assessment Format for Professional Practice Analysis (PPA) Data Collection.

Feature | Traditional Survey | Longitudinal Assessment Format
Data Collection | One-time, retrospective, sent via email invitations | Continuous, real-time, integrated within the assessment platform
Response Rate | Lower participation and higher sampling error | Higher participation and lower sampling error
Data Granularity | Domain/subdomain level | Item level
Bias Risk and Fatigue | Higher recall/selection bias and survey fatigue | Lower recall bias and response fatigue
Engagement | Optional, one-time participation; follow-up email reminders often needed | Required, ongoing; no follow-up reminders needed
Resource Needs | Lower initial cost | Higher ongoing investment
Practice Trends | Static snapshot | Dynamic, evolving insights

The Alternative We’ve Been Waiting For?

We designed this feasibility study to pilot whether a principled assessment design framework could serve as an effective alternative to traditional survey-based methods, moving beyond the recipient’s email inbox to maximize engagement. By integrating the collection of frequency ratings into the MAC Check, we eliminated the need for a separate email campaign with a link to a lengthy survey, as well as follow-up reminders.

We believe this model can be adapted across a variety of professional contexts by organizations using longitudinal assessments or considering their implementation. Beyond enhancing the validity of certification decisions, this approach demonstrated the broader utility of longitudinal assessment platforms for credentialing organizations seeking alternative strategies for their PPA data collection. Future research could explore the scalability of this method across different credentialing sectors to provide broader comparative context for the findings of this study.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. American Educational Research Association. Retrieved August 19, 2025, from https://www.testingstandards.net/uploads/7/6/6/4/76643089/standards_2014edition.pdf

Ferris, M., Gill, C., Preston, J., & Browne, M. (2021). Validity evidence for the Continued Professional Certification Assessment: Professional practice analysis of the National Board of Certification and Recertification for Nurse Anesthetists. Journal of Nursing Regulation, 11(4), 51–62. https://doi.org/10.1016/S2155-8256(20)30172-1

MAC Check. (n.d.). Retrieved August 19, 2025, from https://www.nbcrna.com/certification-programs/mac/mac-check

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (1999). Evidence-centered design: A framework for educational assessment (CSE Technical Report 470). National Center for Research on Evaluation, Standards, and Student Testing (CRESST), University of California, Los Angeles. Retrieved August 19, 2025, from https://files.eric.ed.gov/fulltext/ED427190.pdf

Disclosure: Assistance of Microsoft Copilot, an AI platform based on OpenAI’s GPT-4 architecture with enterprise data protection, was used to help rewrite some technical sentences for clarity and readability as well as summarize information into a table format. All content was reviewed and edited by the authors to ensure accuracy and appropriateness.

This work was accepted as a poster presentation during the 2025 I.C.E. Exchange meeting in Arizona.
