ICLR2026

Detective SAM: Adaptive AI-Image Forgery Localization

Gert Lek, Nicolas van Schaik, Chaoyi Zhu, Pin-Yu Chen, Robert Birke, Lydia Y. Chen

Abstract

Image forgery localization in the generative AI era poses new challenges: modern editing pipelines produce photorealistic, semantically coherent manipulations that evade conventional detectors, while model capabilities evolve rapidly. In response, we develop Detective SAM, a framework built on SAM2, a foundation model for image segmentation. Detective SAM integrates perturbation-driven forensic clues with lightweight feature adapters and a mask adapter to convert these clues into forgery masks via automatic prompting. Moreover, to keep up with the rapidly evolving capabilities of diffusion models, we introduce AutoEditForge, an automated diffusion edit generation pipeline spanning four edit types. It supplies high-quality data to maintain localization accuracy under newly released editors and enables periodic, up-to-date fine-tuning of Detective SAM. Across four benchmark datasets and seven baselines, Detective SAM delivers stable out-of-distribution performance, averaging 34.68 IoU / 42.03 F1, a 38.94% relative IoU gain over the best baseline. Further, we show that state-of-the-art edits cause localization systems to collapse. With 500 AutoEditForge samples, Detective SAM quickly adapts and restores performance, enabling practical, low-friction updates as editing models improve. The pretrained weights, AutoEditForge, and evaluation script are available at the GitHub repository.

This paradigm shift, brought on by diffusion models, initiated a surge of research on stronger forensic clues. Part of this work shows empirical success with training-free (Ricker et al., 2024; Tsai et al., 2024a; He et al., 2024) and zero-shot (Cozzolino et al., 2024) methods that rely on explicit perturbation artifacts in the embedding space of foundation models. Image foundation models learn embeddings through large-scale self-supervision (Dosovitskiy et al., 2021; Oquab et al., 2024).

CoCoGLIDE. This small evaluation set contains 512 GLIDE-based edits (Nichol et al., 2022).
We use all 512 samples for out-of-distribution testing.

UltraEdit. This dataset serves as an additional out-of-distribution (OOD) benchmark that uses the SDXL-Turbo model. We use the region-based (locally edited) subset, which contains 100,000 samples with pixel-level ground-truth masks, and draw a random subset of 10,000 samples from it.
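
As a rough illustration of the evaluation setup described above, the sketch below draws a reproducible random subset of sample indices and scores a predicted mask against a pixel-level ground-truth mask with IoU and F1. It is a minimal sketch, not the released evaluation script. The function names `sample_subset` and `mask_scores`, the seed, the nested-list mask representation, and the convention for two empty masks are all assumptions made for the example.

```python
import random


def sample_subset(num_samples, subset_size, seed=0):
    """Return sorted, distinct indices in [0, num_samples), drawn without replacement."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible (seed is an assumption)
    return sorted(rng.sample(range(num_samples), subset_size))


def mask_scores(pred, target):
    """Pixel-level IoU and F1 between two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # forged pixel correctly predicted
            elif p:
                fp += 1  # predicted forged, actually authentic
            elif t:
                fn += 1  # forged pixel that was missed
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Hypothetical subsampling: 10,000 indices out of 100,000 samples.
indices = sample_subset(100_000, 10_000, seed=0)

# Toy 2x4 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0, 0],
        [0, 1, 0, 0]]
target = [[1, 0, 0, 0],
          [0, 1, 1, 0]]
iou, f1 = mask_scores(pred, target)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

The scores here lie in [0, 1]. Multiplying by 100 gives values on the percentage scale used for the reported 34.68 IoU / 42.03 F1. How per-image scores are averaged across a dataset is not specified in this section and is left out of the sketch.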