KDD 2025
Paper-Level Computerized Adaptive Testing for High-Stakes Examination via Multi-Objective Optimization
Mingjia Li, Junkai Tong, Yiyang Huang, Yifei Ding, Hong Qian, Aimin Zhou
Abstract
Computerized Adaptive Testing (CAT) is a testing technique that accurately infers students' proficiency levels using a relatively small number of questions. Most existing CAT systems operate on a question-level adaptive paradigm, which is suitable for practice scenarios. However, in computerized standardized high-stakes examinations such as the GRE and GMAT, this paradigm faces several challenges: (1) the lack of comparability in exam results, (2) high implementation costs due to the reliance on real-time interactions and the financial burden of maintaining a CAT system, and (3) the difficulty of balancing diagnosis quality, attribute coverage, and question exposure. To address these challenges, we propose Paper-level Computerized Adaptive Testing (PCAT) and a corresponding evaluation method. PCAT divides an exam into multiple testing stages, where examinees adaptively receive test papers of varying difficulty based on their performance in previous stages. The paper assembly problem in PCAT is solved using a population-based multi-objective optimization (MOO) approach. PCAT offers several advantages. First, the paper-level adaptive mechanism ensures that the questions faced by examinees depend solely on their performance in the earlier stages, maintaining adaptability while enhancing the comparability of results across different examinees. Second, PCAT replaces the selection strategy module in traditional CAT with an assembly module, allowing computationally intensive tasks such as cognitive diagnosis and paper assembly to be completed offline before the exam, eliminating the need for real-time interactions. Additionally, the population-based MOO approach generates a set of high-quality solutions in one run, meeting the demands of frequent administration of standardized high-stakes exams like the GRE and reducing the financial burden of maintaining a large-scale CAT system. Finally, MOO naturally