EMNLP 2025
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hyunkyung Bae, Hwanhee Lee
Abstract
Text-to-SQL aims to convert natural language questions into executable SQL queries. While previous approaches, such as skeleton-masked selection, have demonstrated strong performance by retrieving similar training examples to guide large language models (LLMs), they struggle in real-world scenarios where such examples are unavailable. To overcome this limitation, we propose Self-Augmentation in-context learning with Fine-grained Example selection for Text-to-SQL (SAFE-SQL), a novel unsupervised framework that enhances SQL generation by generating and intelligently filtering self-augmented examples. SAFE-SQL leverages an LLM to generate diverse Text-to-SQL examples, which are then filtered by a novel fine-grained mechanism using criteria for semantic similarity, structural alignment, and reasoning path quality to curate high-quality in-context learning examples. Leveraging these carefully selected self-generated examples, SAFE-SQL significantly surpasses previous zero-shot and few-shot Text-to-SQL frameworks, achieving superior execution accuracy. Notably, our approach demonstrates substantial performance gains in challenging extra hard and unseen scenarios, where conventional methods often struggle.
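The filter-then-prompt flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names the three criteria (semantic similarity, structural alignment, reasoning path quality) but not how they are scored or combined, so the `Candidate` fields, the equal-weight average in `relevance`, the `threshold` and `k` values, and the prompt layout in `build_prompt` are all assumptions made for illustration. In a real pipeline the scores would presumably come from an LLM or a similarity model; here they are supplied by hand.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    """A self-generated Text-to-SQL example with hypothetical scores in [0, 1]."""
    question: str
    sql: str
    reasoning: str
    semantic: float           # similarity of the question to the target question
    structural: float         # alignment of the SQL structure with the target
    reasoning_quality: float  # quality of the reasoning path


def relevance(c: Candidate, weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted mean of the three criteria (the weighting is an assumption)."""
    w_sem, w_struct, w_reason = weights
    total = w_sem + w_struct + w_reason
    score = (w_sem * c.semantic
             + w_struct * c.structural
             + w_reason * c.reasoning_quality)
    return score / total


def select_examples(candidates: List[Candidate],
                    threshold: float = 0.8,
                    k: int = 3) -> List[Candidate]:
    """Keep candidates at or above the threshold, best first, at most k."""
    kept = [c for c in candidates if relevance(c) >= threshold]
    kept.sort(key=relevance, reverse=True)
    return kept[:k]


def build_prompt(question: str, examples: List[Candidate]) -> str:
    """Assemble an in-context prompt from the selected examples."""
    parts = []
    for ex in examples:
        parts.append(
            f"Question: {ex.question}\n"
            f"Reasoning: {ex.reasoning}\n"
            f"SQL: {ex.sql}\n"
        )
    # The target question goes last, with the SQL left for the model to fill in.
    parts.append(f"Question: {question}\nSQL:")
    return "\n".join(parts)
```

Calling `select_examples` on a list of scored candidates and passing the result to `build_prompt` yields a few-shot prompt built only from self-generated examples, which is the setting the abstract targets when no training examples are available for retrieval.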