WWW2026

From Random Forget-Sets to Realistic Natural-Language Deletion Requests

Imran Ahsan, Jinsung Kim, Mucheol Kim

Abstract

Most graph unlearning evaluations delete a random fraction k% of nodes or edges. In practice, however, controllers act on natural-language erasure requests. To support evaluation under such text inputs, we release Text2Forget, a DSAR-style natural-language erasure-request corpus and benchmark for CoraFull, PubMed, and Yelp, together with a resolver that maps request text to concrete graph targets. Our pipeline designs law-motivated, retrieval-grounded requests that use human-readable anchors and resolves each request to real nodes and edges via a hybrid BM25-plus-embedding index, yielding thousands of requests per dataset. Because this corpus is large, a deterministic ranking score selects a small, actionable top-k subset for evaluation. A plug-and-play harness (GCN, GAT, and GIN with UtU, GNNDelete, GIF, and Retrain) reports privacy via membership-inference AUC (MI-AUC) on forgotten items, averaged over the selected requests and backbones. Across datasets, request-driven forget-sets induce membership-inference and task-accuracy behavior broadly comparable to standard random-forget and retraining baselines, indicating that the Text2Forget corpus can serve as a practical benchmark within existing privacy–utility trade-offs. The released package includes the corpus, resolved targets, confidence scores, seeds, and scripts, supporting reproducible, text-driven unlearning studies. We release our package at https://github.com/ImranAhsan23/Text2Forget.
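
To make the privacy metric concrete, the following minimal sketch computes a membership-inference AUC from attack scores using the standard rank-based (Mann-Whitney) definition of AUC. It is illustrative only and is not taken from the released package: the function name `mi_auc` and the example scores are hypothetical, and the attack that produces the scores is assumed to exist elsewhere.

```python
def mi_auc(member_scores, nonmember_scores):
    """Return the AUC of an attack that scores items for membership.

    AUC equals the probability that a randomly chosen member (here, a
    forgotten item) receives a higher score than a randomly chosen
    non-member; ties count as one half.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical attack scores, for illustration only.
forgotten = [0.9, 0.4, 0.6]   # items named in a deletion request
held_out = [0.5, 0.3, 0.7]    # items never seen in training
auc = mi_auc(forgotten, held_out)
```

Under this definition, a value near 0.5 means the attack cannot tell forgotten items from unseen ones. Averaging over requests and backbones, as the abstract describes, would amount to taking the mean of such per-run values.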