ICLR 2026

A2ASecBench: A Protocol-Aware Security Benchmark for Agent-to-Agent Multi-Agent Systems

Tianhao Li, Chuangxin Chu, Yujia Zheng, Bohan Zhang, Neil Zhenqiang Gong, Chaowei Xiao

Abstract

Multi-agent systems (MAS) built on large language models (LLMs) increasingly rely on agent-to-agent (A2A) protocols to enable capability discovery, task orchestration, and artifact exchange across heterogeneous stacks. While these protocols promise interoperability, they also introduce new vulnerabilities. In this paper, we present the first comprehensive security evaluation of A2A-MAS. We develop a taxonomy and threat model that categorize risks into supply-chain manipulations and protocol-logic weaknesses, and we detail six concrete attacks spanning all A2A stages and components, with impacts on confidentiality, integrity, and availability. Building on this taxonomy, we introduce A2ASecBench, the first A2A-specific security benchmark framework capable of probing diverse and previously unexplored attack vectors. Our framework incorporates a dynamic adapter layer for deployment across heterogeneous agent stacks and downstream workloads, alongside a joint safety-utility evaluation methodology that explicitly measures the trade-off between harmlessness and helpfulness by pairing adversarial trials with benign tasks. We empirically validate our framework using official A2A Project demos across three representative high-stakes domains (travel, healthcare, and finance), demonstrating that the identified attacks are both pervasive and highly effective, consistently bypassing default safeguards. These findings highlight the urgent need for protocol-level defenses and standardized benchmarking to secure the next generation of agentic ecosystems.

https://safo-lab.github.io/A2ASecBench/

Published as a conference paper at ICLR 2026

[...] Mao et al., 2025; Du et al., 2025; Vaziry et al., 2025), and multiple enterprise-grade products from different vendors have also emerged (detailed in Appendix D). These developments demonstrate that the A2A protocol is already having a tangible real-world impact.
However, the A2A ecosystem opens up a protocol-level threat surface that lies beyond the reach of prompt-centric defenses. As shown in Figure 1, threats can arise in the supply chain during discovery and selection (misleading capability claims or cloaked functions) and throughout task orchestration and artifact exchange (lifecycle manipulation, flooding, and malicious payloads embedded in artifacts). The risk is exacerbated by A2A's opaque execution model, in which agents collaborate via declared capabilities and exchanged context without exposing internal logic, memory, or proprietary tools, so identity and capability claims are difficult to verify independently (A2A Project, 2024). Once admitted, a spoofed or cloaked agent can induce a client to submit sensitive inputs, misroute or hijack tasks, withhold or corrupt partial results, launch denial-of-service (DoS)-style task floods, or return artifacts that trigger downstream code execution or data exfiltration, thereby compromising confidentiality, integrity, and availability.
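As an illustration only, the threat surface described above can be written down as a small data structure that groups each named threat by risk category and A2A stage. This is a minimal sketch, not the benchmark's implementation: the names `Category`, `Stage`, `Threat`, `THREATS`, and `threats_by_stage` are hypothetical, and the assignment of lifecycle manipulation and flooding to the task-orchestration stage, as well as the labeling of all post-discovery threats as protocol-logic weaknesses, is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The two risk categories named in the taxonomy."""
    SUPPLY_CHAIN = "supply-chain manipulation"
    PROTOCOL_LOGIC = "protocol-logic weakness"


class Stage(Enum):
    """A2A stages mentioned in the text."""
    DISCOVERY_AND_SELECTION = "discovery and selection"
    TASK_ORCHESTRATION = "task orchestration"
    ARTIFACT_EXCHANGE = "artifact exchange"


@dataclass(frozen=True)
class Threat:
    name: str
    category: Category
    stage: Stage


# Threat names are taken from the paragraph above; the category and
# stage assigned to each one are assumptions made for illustration.
THREATS = [
    Threat("misleading capability claims",
           Category.SUPPLY_CHAIN, Stage.DISCOVERY_AND_SELECTION),
    Threat("cloaked functions",
           Category.SUPPLY_CHAIN, Stage.DISCOVERY_AND_SELECTION),
    Threat("lifecycle manipulation",
           Category.PROTOCOL_LOGIC, Stage.TASK_ORCHESTRATION),
    Threat("flooding",
           Category.PROTOCOL_LOGIC, Stage.TASK_ORCHESTRATION),
    Threat("malicious payloads embedded in artifacts",
           Category.PROTOCOL_LOGIC, Stage.ARTIFACT_EXCHANGE),
]


def threats_by_stage(threats):
    """Group threat names by stage, keeping every stage as a key."""
    grouped = {stage: [] for stage in Stage}
    for threat in threats:
        grouped[threat.stage].append(threat.name)
    return grouped


if __name__ == "__main__":
    for stage, names in threats_by_stage(THREATS).items():
        print(f"{stage.value}: {', '.join(names)}")
```

Keeping every stage as a key, even when no threat is assigned to it, makes gaps in coverage visible when such a table is extended with additional attacks.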