WWW2026

GFMixer: Decoupled Temporal Gradient and Fourier-Aware Attention for Time Series Forecasting

Lin Zhang, Qing Li, Jingmei Zhao

Abstract

Multivariate time series forecasting is fundamental to web-scale systems. However, frequency-domain forecasters face two structural challenges: (i) frequency bias, where low-amplitude yet informative temporal cues are submerged in noise and neglected during training; and (ii) spectral degradation, where standard neural transformations distort high-amplitude periodic structures and thereby weaken the predictive signal. To address both issues, we propose GFMixer, a decoupled dual-path architecture. GFMixer first applies a Temporal Gradient Block (TGB) that captures low-amplitude information through adaptive selection based on temporal-geometric evidence. A Fourier-Aware Attention Block (FAB) then represents high-amplitude, multi-frequency information to mitigate spectral degradation. Finally, a Gradient Aggregation Block (GAB) integrates both information streams via time-aligned residual connections to produce the forecast. Extensive evaluations on seven standard benchmarks (the four ETT subsets, Weather, Electricity, and Traffic) show that GFMixer secures 37 first-place rankings and the lowest average MSE overall. Beyond standalone performance, GFMixer also serves as a plug-and-play module that consistently improves mainstream backbones. Code is available at: https://github.com/superlin30/GFMixer.
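The frequency-bias issue named above can be illustrated with a small numerical sketch. This is not the GFMixer implementation: the synthetic signal, the amplitudes and frequencies, the helper `amplitude_spectrum`, and the use of a circular first difference as a stand-in for a "temporal gradient" are all assumptions chosen for illustration. The sketch shows that a low-amplitude component contributes very little to an MSE objective, and that differencing in time raises its weight relative to a slower, high-amplitude component.

```python
import numpy as np

def amplitude_spectrum(x):
    """One-sided amplitude spectrum of a real 1-D signal (hypothetical helper)."""
    n = len(x)
    amp = np.abs(np.fft.rfft(x)) * 2.0 / n
    amp[0] /= 2.0          # the DC bin is not doubled
    if n % 2 == 0:
        amp[-1] /= 2.0     # neither is the Nyquist bin for even n
    return amp

# Assumed toy signal: one strong slow cycle plus one weak fast cycle.
n = 512
t = np.arange(n)
strong = 1.00 * np.sin(2 * np.pi * 8 * t / n)    # high-amplitude periodic structure
weak = 0.05 * np.sin(2 * np.pi * 60 * t / n)     # low-amplitude cue
x = strong + weak

amp = amplitude_spectrum(x)

# MSE incurred by a forecaster that ignores one component entirely.
mse_ignore_weak = np.mean((x - strong) ** 2)     # error left by dropping the weak cue
mse_ignore_strong = np.mean((x - weak) ** 2)     # error left by dropping the strong cycle

# Circular first difference as an assumed stand-in for a temporal gradient.
grad = np.roll(x, -1) - x
grad_amp = amplitude_spectrum(grad)

ratio_raw = amp[60] / amp[8]            # weak-to-strong amplitude ratio in the raw signal
ratio_grad = grad_amp[60] / grad_amp[8] # same ratio after differencing

print(f"MSE if weak cue is ignored:   {mse_ignore_weak:.5f}")
print(f"MSE if strong cycle ignored:  {mse_ignore_strong:.5f}")
print(f"weak/strong amplitude, raw:   {ratio_raw:.3f}")
print(f"weak/strong amplitude, grad:  {ratio_grad:.3f}")
```

In this toy setting, ignoring the weak cue costs 400 times less MSE than ignoring the strong cycle, so a purely error-driven model has little incentive to fit it. Differencing raises the weak component's relative amplitude only because the weak cue was placed at a higher frequency, which is itself an assumption of the example.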