ASE 2025

From Sparse to Structured: A Diffusion-Enhanced and Feature-Aligned Framework for Coincidental Correctness Detection

Huan Xie, Chunyan Liu, Yan Lei, Zhenyu Wu, Jinping Wang

Abstract

Coincidental correctness (CC) refers to test cases that execute faulty code but still produce the expected outputs. This phenomenon introduces noise into the data used by software testing-related tasks. As demonstrated in the literature, CC has a negative impact on test suite reduction, test case prioritization, fault localization, and automated program repair. Thus, it is essential to detect CC and mitigate its impact. Although CC is commonly observed across a large number of programs, CC test cases are typically sparse within each program’s test suite. In other words, CC test cases generally make up only a small portion of the passing test cases: the proportion varies from 3.27% to 31.74% within Defects4J V1.4. This results in a highly imbalanced distribution of CC versus non-CC test cases, which poses challenges for accurate detection. To address this issue, we propose a Diffusion-Enhanced and Feature-Aligned Framework for Coincidental Correctness detection, named DEFACC, to obtain more structured representations of test cases. Specifically, DEFACC first introduces a diffusion-based generation module. This module generates new CC samples from the original samples to alleviate the class imbalance and enhance the diversity of CC samples. However, the generated feature samples may deviate from the distribution of real CC samples, and such shifts can hurt model reliability and generalization. To resolve this, DEFACC integrates a feature alignment module based on the Maximum Mean Discrepancy (MMD) loss. This module enforces distributional consistency between generated and original CC samples during training. Together, these components move the augmented sample set from sparse to structured, making it not only quantitatively balanced but also semantically faithful. Experimental results show that DEFACC significantly improves the performance of existing CC detection methods and provides a stronger representation foundation for accurate fault localization.
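
To illustrate the quantity that an MMD-based alignment term penalizes, the following is a minimal sketch of a biased squared-MMD estimate between real and generated feature vectors. It is not the authors' implementation. The abstract does not specify the kernel, the bandwidth, or the feature dimensionality, so the Gaussian (RBF) kernel, the value of `sigma`, the synthetic Gaussian data, and all function names here are assumptions chosen for illustration only.

```python
import math
import random


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors.

    The kernel choice is an assumption; the abstract does not name one.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y), x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(real, generated, sigma=1.0):
    """Biased estimate of the squared MMD between two sample sets.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with x, x' drawn from `real` and y, y' drawn from `generated`.
    """
    return (mean_kernel(real, real, sigma)
            + mean_kernel(generated, generated, sigma)
            - 2.0 * mean_kernel(real, generated, sigma))


def sample_gaussian(n, dim, mean, rng):
    """Hypothetical stand-in for test-case feature vectors."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
real_cc = sample_gaussian(50, 4, 0.0, rng)   # stands in for original CC features
aligned = sample_gaussian(50, 4, 0.0, rng)   # generated, same distribution
shifted = sample_gaussian(50, 4, 2.0, rng)   # generated, drifted distribution

mmd_aligned = mmd_squared(real_cc, aligned, sigma=2.0)
mmd_shifted = mmd_squared(real_cc, shifted, sigma=2.0)
print("aligned:", mmd_aligned)
print("shifted:", mmd_shifted)
```

In this sketch the drifted set yields a noticeably larger estimate than the set drawn from the same distribution as the real samples, which is the behavior a training loss built on MMD would exploit to pull generated samples back toward the real CC distribution.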