ICLR2026

Minimax Sample Complexity of Graph Neural Networks: Lower Bounds and Structural Effects

Ahmad Ghasemi, Hossein Pishro-Nik

Abstract

Graph Neural Networks (GNNs) achieve strong empirical performance across domains, yet their fundamental statistical behavior remains poorly understood. This paper develops a minimax analysis of ReLU message-passing GNNs with explicit architectural assumptions, in both inductive (graph-level) and transductive (node-level) settings. For arbitrary graphs without structural constraints, we show that the worst-case generalization error scales as $\sqrt{\log d / n}$ with sample size $n$ and input dimension $d$ , matching the $1/\sqrt{n}$ behavior of feed-forward networks. Under a spectral--homophily condition combining strong label homophily and bounded spectral expansion, we prove a stronger minimax lower bound of $d/\log n$ for transductive node prediction. We complement these results with a systematic empirical study on three large-scale benchmarks (ogbn_arxiv, ogbn_products_50k, Reddit_50k) and two controlled synthetic datasets representing the worst-case and structured regimes of our theory. All benchmark graphs we study fall in the slow-mixing, bottlenecked regime captured by our spectral-homophily condition, and ratio-based scaling tests show error decay consistent with the $d/\log n$ rate in real and structured settings, while the worst-case synthetic dataset follows the $\sqrt{\log d / n}$ curve. Together, these results indicate that practical GNN tasks often operate in the spectral-homophily regime, where our lower bound $d/\log n$ is tight and effective sample complexity is driven by graph topology rather than universal $1/\sqrt{n}$ behavior.