ICLR 2026

Attend to the Active: Structure-Aware Dynamic Attention in LLMs for Compositional Instruction Following

Fangrui Lv, Yulei Qin, Ruixin Hong, Jian Liang, Jinyang Wu, Ke Li, Xing Sun, Changshui Zhang

Abstract

Large language models (LLMs) have demonstrated strong instruction-following capabilities; however, they often struggle with compositional instructions that involve multiple interleaved yet logically independent sub-tasks. These sub-tasks are typically organized in mutually exclusive structures, such as branching, chaining, or paralleling, where only one sub-task should be active at each generation step while the others remain dormant. Despite their inactivity, dormant sub-tasks can inadvertently attract the model's attention due to structural entanglement within the input context or intermediate representations, leading to interference that compromises output fidelity. To address this challenge, we propose ATA, a structure-aware dynamic attention mechanism grounded in compositional structures, which dynamically identifies the active sub-task during generation while suppressing attention to inactive ones. By precisely steering the model's focus, ATA mitigates interference and explicitly enhances model adherence to the active sub-task. Importantly, ATA operates within a single forward pass without requiring parameter updates. Extensive experiments show that ATA consistently enhances LLMs' instruction-following ability across various compositional structures, effectively mitigating attention distraction and demonstrating strong generalization ability.

* Corresponding author.

[Figure: a "Wrong Generation" example for the branching instruction "If the work contains any animal, the description should be in Chinese. Otherwise, the description should be in English."]
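To make the idea of suppressing attention to dormant sub-tasks concrete, here is a minimal toy sketch in pure Python. It is not the authors' implementation of ATA. It assumes that the token span of each sub-task is already known and that the active sub-task is given, whereas the abstract states that ATA identifies it dynamically. The names `suppress_dormant`, `spans`, `penalty`, and the branch labels are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def suppress_dormant(logits, spans, active, penalty=5.0):
    """Subtract `penalty` from the logits of tokens in dormant sub-task spans.

    logits:  raw attention scores of one query over the context tokens.
    spans:   dict mapping a sub-task name to its (start, end) token range,
             with `end` exclusive (assumed to be known in advance).
    active:  name of the sub-task assumed to be active at this step.
    penalty: hypothetical suppression strength, not a value from the paper.
    """
    adjusted = list(logits)
    for name, (start, end) in spans.items():
        if name == active:
            continue  # leave the active sub-task untouched
        for i in range(start, end):
            adjusted[i] -= penalty
    return adjusted


def mass(weights, span):
    """Total attention weight that falls inside a token span."""
    start, end = span
    return sum(weights[start:end])


# Toy context: 2 shared tokens followed by two branches of 3 tokens each.
logits = [0.5, 0.2, 1.0, 0.8, 0.9, 1.4, 1.1, 1.2]
spans = {"branch_chinese": (2, 5), "branch_english": (5, 8)}

before = softmax(logits)
after = softmax(suppress_dormant(logits, spans, active="branch_chinese"))

print("dormant mass before:", round(mass(before, spans["branch_english"]), 3))
print("dormant mass after: ", round(mass(after, spans["branch_english"]), 3))
```

Because the adjustment is applied only to attention scores at inference time, a sketch like this needs no parameter updates, which matches the training-free setting described above. How the active sub-task is detected and how strongly dormant spans are suppressed are left open here.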