ICLR 2026

Co-occurring Associated REtained concepts in Diffusion Unlearning

Miso Kim, Georu Lee, Yunji Kim, Hoki Kim, Jinseong Park, Woojin Lee

Abstract

Unlearning has emerged as a key technique to mitigate harmful content generation in diffusion models. However, existing methods often remove not only the target concept but also benign concepts that co-occur with it. For example, unlearning nudity can unintentionally suppress the concept of person, preventing the model from generating images of people. We define these undesirably suppressed co-occurring concepts, which must be preserved, as $\textbf{CARE}$ ($\textbf{C}$o-occurring $\textbf{A}$ssociated $\textbf{RE}$tained concepts). We then introduce the $\textbf{CARE score}$, a general metric that directly quantifies their preservation across unlearning tasks. Building on this foundation, we propose $\textbf{ReCARE}$ ($\textbf{R}$obust $\textbf{e}$rasure for $\textbf{CARE}$), a framework that explicitly safeguards CARE while erasing only the target concept. ReCARE automatically constructs the CARE-set, a curated vocabulary of benign co-occurring tokens extracted from target images, and leverages this vocabulary during training for stable unlearning. Extensive experiments across various target concepts ($\textit{Nudity}$, $\textit{Van Gogh}$ style, and $\textit{Tench}$ object) demonstrate that ReCARE achieves state-of-the-art overall performance in balancing robust concept erasure, general utility, and CARE preservation.