Detection of aberrant testing behaviour in unproctored CAT via a verification test


BALTA E., UÇAR A.

International Journal of Assessment Tools in Education, vol.12, no.3, pp.681-700, 2025 (ESCI, TRDizin)

  • Publication Type: Article / Full Article
  • Volume: 12 Issue: 3
  • Publication Date: 2025
  • DOI: 10.21449/ijate.1598330
  • Journal Name: International Journal of Assessment Tools in Education
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Central & Eastern European Academic Source (CEEAS), Education Abstracts, ERIC (Education Resources Information Center), Directory of Open Access Journals, TR DİZİN (ULAKBİM)
  • Pages: pp.681-700
  • Hakkari University Affiliated: Yes

Abstract

Unproctored Computerized Adaptive Testing (CAT) is gaining traction due to its convenience, flexibility, and scalability, particularly in high-stakes assessments. However, the lack of a proctor can give rise to aberrant testing behavior, which can impair the validity of test scores. This paper explores the use of a verification test to detect aberrant testing behavior in unproctored CAT environments. The study uses multiple measures to detect aberrant response patterns in CAT via a paper-and-pencil (P&P) test, and compares the sensitivity and specificity of the l_z person-fit statistic (PFS) under no-stage and two-stage methods (in the two-stage method, l_z is applied after the Kullback–Leibler divergence (KLD) measure) across different conditions. Three factors were manipulated: the aberrance percentage, the aberrance scenario, and the aberrant examinee’s ability range. In all scenarios, the specificity of l_z in classifying examinees was higher than its sensitivity in both no-stage and two-stage analyses. However, the sensitivity of l_z was higher in the two-stage analysis.
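
For readers unfamiliar with the quantities named in the abstract, the following is a minimal Python sketch of the standardized log-likelihood person-fit statistic l_z and of sensitivity and specificity. It is an illustration only, not the authors' implementation: the abstract does not state the IRT model, so a two-parameter logistic (2PL) model is assumed here, and the function names (`p_2pl`, `lz`, `sensitivity_specificity`) and the item parameters in the usage note are hypothetical. The KLD screening step of the two-stage method is not shown.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under an ASSUMED 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lz(responses, theta, items):
    """Standardized log-likelihood person-fit statistic l_z.

    responses: list of 0/1 item scores
    theta:     ability estimate for the examinee
    items:     list of (a, b) tuples (discrimination, difficulty); hypothetical values

    Large negative values suggest a response pattern that fits the model poorly.
    Undefined (division by zero) if every item has P = 0.5 at this theta.
    """
    l0 = expected = variance = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)      # observed log-likelihood
        expected += p * math.log(p) + q * math.log(q)      # its expectation
        variance += p * q * math.log(p / q) ** 2           # its variance
    return (l0 - expected) / math.sqrt(variance)


def sensitivity_specificity(flagged, truly_aberrant):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(flagged, truly_aberrant))
    tp = sum(1 for f, t in pairs if f and t)
    fn = sum(1 for f, t in pairs if not f and t)
    tn = sum(1 for f, t in pairs if not f and not t)
    fp = sum(1 for f, t in pairs if f and not t)
    return tp / (tp + fn), tn / (tn + fp)
```

As a usage illustration with made-up item parameters, an examinee at `theta = 0` who answers the easy items correctly and the hard items incorrectly receives a higher l_z than one with the reversed pattern: `lz([1, 1, 0, 0, 0], 0.0, items) > lz([0, 0, 1, 1, 1], 0.0, items)` for `items = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.5), (1.0, 1.0), (1.0, 2.0)]`. Flagging examinees whose l_z falls below a chosen cutoff and comparing the flags against known aberrance labels yields the sensitivity and specificity values of the kind the study compares.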