By Dennis Maynes and Alison Foster
In 1962, R.W. Hamming, one of America’s great mathematicians of the 20th century and a pioneer in scientific computing, wrote: “The purpose of computing is insight, not numbers.” Our relationship with computing and data has changed greatly since Hamming penned his motto. In Hamming’s time, only a select few had access to computers that were maintained in large buildings with huge electrical transformers and specialized air conditioning and ventilation equipment. Now nearly everyone uses a computer carried in their pocket or purse. Despite these changes, Hamming’s philosophy on the purpose of computing remains starkly relevant. The power of computing transcends compilation of information; indeed, the analysis of that information is essential for gaining insight.
Today’s credentialing exams face an unceasing barrage of attacks from individuals who want either to improve their scores by cheating or to steal and share secure exam content. Because of these attacks, trust in exams has been lost, both in their scores and in their validity. Insights gained from analysis of observational data can help us regain lost trust and restore exam integrity.
What Is Big Data?
Big data can be defined as “a set of data that is too complex to collect and analyze by a single person, even with the help of several computers.” Analysis of big data that has been properly gathered, organized and processed becomes a source of ongoing discovery. Using these large data sets, we can uncover important test security patterns and trends.
Big data comes in many forms. In this article, two types of observational big data are discussed: those gleaned by searching the internet for disclosed exam items and those gathered by observing the test administration process.
Observations from Searching the Internet for Test Content
One way the internet challenges test security is the pervasive threat of exam content disclosure. With the click of a camera and the quick entry of a URL, an item’s image may be distributed almost instantaneously to hundreds or thousands of websites. The task of searching for stolen exam items on an internet with more than 4 billion indexed web pages quickly exceeds the capacity of a single individual, even when multiple computers are used. Even when computers are used to automatically extract data and information, however, the value of human analysis and observation must not be overlooked. Whereas automated data processing allows us to sift enormous quantities of data, analysts and observers can discern patterns that are not readily assimilated by computers.
Much can be learned by patrolling the internet for exam content. This is both a passive and an active task. The passive portion includes scouring the web for stolen test content, extracting information from potentially infringing websites, and assembling the observations to help testing program managers deal with potential test security breaches. During this passive phase, observers and analysts gain greater insights by employing automated tools and critically analyzing the data. The active portion includes contacting website administrators, sending out Digital Millennium Copyright Act (DMCA) notifications, and persuading individuals who may have disclosed exam content to change their behavior. As infringement patterns on websites become more apparent, it becomes easier to respond quickly to copyright infringement by websites and individuals. By efficiently quantifying, categorizing and analyzing observations from this activity, testing programs can better understand item harvesting threats and the vulnerabilities associated with test theft.
Observations from Proctors and Site Monitors
Because administering and taking tests is a human activity, many opportunities exist for observing and analyzing the test administration process. A central responsibility of proctors or invigilators of secure tests is to administer tests in accordance with standardized test procedures and ensure that test security is maintained. In addition to ensuring that tests are administered properly and according to testing rules, proctors or invigilators record and report any testing irregularities and incidents. Site monitors can observe the testing process and verify that 1) test security procedures have been followed throughout the test and 2) proctors did their jobs properly and reported any observed irregularities. These observations can help determine the extent to which test security procedures were followed and whether security weaknesses were present.
The observations of site monitors, especially when the exam is administered at more than one location, cannot be gathered by a single individual. As a result, effective site monitoring requires a team of individuals at multiple test administration sites to identify any anomalous patterns that may be present.
Leveraging Computational Power
To efficiently quantify and categorize the data gathered from internet searches and site monitoring, both types of data should be entered into one or more applications designed for that purpose. The application should allow for easy filtering and quick tabulation, which can be used to extract the most relevant data for understanding the nature of security threats and prioritizing follow-up tasks such as investigations. The data fields in these applications need to be carefully designed so that quantities may be computed (e.g., counts of testing irregularities, amount and accuracy of harvested content, and number of individuals who might be involved). The applications also need to compile qualitative information in the form of comments and suggestions for analysis purposes.
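The filtering and tabulation described above can be illustrated with a minimal sketch. It assumes a simple record layout; the field names, category labels and sample records are all hypothetical and are not taken from any particular application.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observation record. All field names are hypothetical."""
    source: str        # "site_monitor" or "web_monitor"
    site: str          # test center or website where it was observed
    category: str      # e.g. "cheat_sheet", "item_disclosure"
    comment: str = ""  # free-text qualitative note


def tabulate(observations, source=None):
    """Count observations per (site, category), optionally filtered by source."""
    selected = [o for o in observations if source is None or o.source == source]
    return Counter((o.site, o.category) for o in selected)


# Invented sample records, for illustration only.
observations = [
    Observation("site_monitor", "Center A", "cheat_sheet", "notes under keyboard"),
    Observation("site_monitor", "Center A", "cheat_sheet"),
    Observation("site_monitor", "Center B", "unauthorized_device"),
    Observation("web_monitor", "example-forum.test", "item_disclosure", "12 items posted"),
]

counts = tabulate(observations, source="site_monitor")
for (site, category), n in counts.most_common():
    print(f"{site}: {category} = {n}")
```

Keeping the category as a fixed label makes counting straightforward, while the free-text comment preserves the qualitative detail for later review.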
It is important to realize that the data obtained from both observational methods can be analyzed in tandem. For example, if a site monitor is concerned that a security breach has occurred and an examinee might have harvested test questions, web monitors can search the internet to see how widespread the test content is online and determine the extent of the breach.
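As a rough sketch of this kind of combined analysis, the example below takes the exam and date from a site monitor's incident report and summarizes the web-monitoring findings for the same exam from that date onward. The record fields, exam identifiers, websites and dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WebFinding:
    """One web-monitoring finding. All field names are hypothetical."""
    exam_id: str
    website: str
    found_on: date
    items_posted: int


def breach_extent(exam_id, incident_date, findings):
    """Summarize findings for one exam on or after the incident date."""
    related = [
        f for f in findings
        if f.exam_id == exam_id and f.found_on >= incident_date
    ]
    return {
        "websites": sorted({f.website for f in related}),
        # An item posted on two websites is counted twice here.
        "item_postings": sum(f.items_posted for f in related),
    }


# Invented sample data, for illustration only.
findings = [
    WebFinding("EXAM-101", "braindump-site.test", date(2016, 5, 4), 12),
    WebFinding("EXAM-101", "study-forum.test", date(2016, 5, 9), 5),
    WebFinding("EXAM-101", "study-forum.test", date(2016, 4, 1), 3),
    WebFinding("EXAM-202", "braindump-site.test", date(2016, 5, 5), 7),
]

extent = breach_extent("EXAM-101", date(2016, 5, 2), findings)
print(extent)
```

Filtering by date is only a rough heuristic: content found after an incident may still come from an earlier, unrelated breach, so a summary like this is a starting point for investigation rather than a conclusion.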
After the observations have been gathered, they can be analyzed and reviewed to strengthen test security. Test security is an ongoing process composed of four steps: protect, detect, respond and improve. Observational data are an important element of each step.
In the protect stage of the test security process, exams are protected by preventing and deterring security breaches. The presence of observers and monitors deters cheating through fear of being caught. An individual intending to use a cheat sheet might reconsider when there is a strong likelihood of being caught. Individuals trying to share stolen test content may be deterred by web monitoring when discovery and prosecution are real possibilities. Thus, the process of collecting observational data alone can protect the exam by preventing and deterring various types of cheating and theft.
A primary role of observation is to detect and document test security breaches. In addition to direct observation, analysis of observational data can be used to detect security breaches. These data can come in many forms, including proctor reports, stored proctoring videos, chat logs and web-monitoring reports. By analyzing observed instances of cheating and theft uncovered through web monitoring and by proctors, it is possible to uncover trends and patterns associated with these breaches.
After a test security breach is discovered, it is important to determine an appropriate and effective response. At times, observers can take positive, corrective action as soon as a breach is detected. For example, web patrollers can respond by sending takedown messages, starting an investigation, and collecting data on efforts to get that content removed from the internet. Site monitors can report irregularities that prompt test sponsors to require proctors to immediately change and improve test administration procedures. For instance, security vulnerabilities detected at the start of a long testing window can be addressed quickly, thereby protecting later exams from similar threats. Other suitable responses may be formulated only after trends and patterns of test security breaches are understood, and test sponsors may require the collection of additional data from other sources before they can achieve such understanding.
The data gathered by online or in-person observations are critical for improving test security processes and procedures. Once threats have been discovered and analyzed, the resulting insights can be used to prevent and deter similar threats in the future. Improved procedures for proctoring and informed tactics for web monitoring can and should be derived from the gathered data and introduced as quickly as possible.
Summary and Conclusions
The first part of this article (“Big Data Speaks Up for Test Security,” published in the Q2 2016 issue of ICE Digest) discussed how data obtained from test results (i.e., data forensics) can be used to improve test security. This second part has discussed how observational data may be collected and analyzed to improve test security. Applications designed for the purpose of gathering and categorizing observational data are very useful, because these data can be difficult to analyze quantitatively. It is important that testing professionals recognize that analysis of data is a powerful tool, one that can yield insights for countering threats that undermine the validity and usefulness of exams.