ICLR 2026

Co-occurring Associated REtained concepts in Diffusion Unlearning

Miso Kim, Georu Lee, Yunji Kim, Hoki Kim, Jinseong Park, Woojin Lee

Abstract

Unlearning has emerged as a key technique for mitigating harmful content generation in diffusion models. However, existing methods often remove not only the target concept but also benign co-occurring concepts. For example, unlearning nudity can unintentionally suppress the concept of person, preventing the model from generating images that contain people. We define these undesirably suppressed co-occurring concepts, which must be preserved, as CARE (Co-occurring Associated REtained concepts). We then introduce the CARE score, a general metric that directly quantifies their preservation across unlearning tasks. On this foundation, we propose ReCARE (Robust erasure for CARE), a framework that explicitly safeguards CARE while erasing only the target concept. ReCARE automatically constructs the CARE-set, a curated vocabulary of benign co-occurring tokens extracted from target images, and leverages this vocabulary during training for stable unlearning. Extensive experiments across various target concepts (Nudity, Van Gogh style, and Tench object) demonstrate that ReCARE achieves state-of-the-art overall performance in balancing robust concept erasure, overall utility, and CARE preservation.