ICLR2026

Minimax Sample Complexity of Graph Neural Networks: Lower Bounds and Structural Effects

Ahmad Ghasemi, Hossein Pishro-Nik

Abstract

Graph Neural Networks (GNNs) achieve strong empirical performance across domains, yet their fundamental statistical behavior remains poorly understood. This paper develops a minimax analysis of ReLU message-passing GNNs with explicit architectural assumptions, in both inductive (graph-level) and transductive (node-level) settings. For arbitrary graphs without structural constraints, we show that the worst-case generalization error scales as logd/n\sqrt{\log d / n} with sample size nn and input dimension dd, matching the 1/n1/\sqrt{n} behavior of feed-forward networks. Under a spectral--homophily condition combining strong label homophily and bounded spectral expansion, we prove a stronger minimax lower bound of d/lognd/\log n for transductive node prediction. We complement these results with a systematic empirical study on three large-scale benchmarks (ogbn_arxiv, ogbn_products_50k, Reddit_50k) and two controlled synthetic datasets representing the worst-case and structured regimes of our theory. All benchmark graphs we study fall in the slow-mixing, bottlenecked regime captured by our spectral-homophily condition, and ratio-based scaling tests show error decay consistent with the d/lognd/\log n rate in real and structured settings, while the worst-case synthetic dataset follows the logd/n\sqrt{\log d / n} curve. Together, these results indicate that practical GNN tasks often operate in the spectral-homophily regime, where our lower bound d/lognd/\log n is tight and effective sample complexity is driven by graph topology rather than universal 1/n1/\sqrt{n} behavior.