KDD 2025

MTM: A Multi-Scale Token Mixing Transformer for Irregular Multivariate Time Series Classification

Shuhan Zhong, Weipeng Zhuo, Sizhe Song, Guanyao Li, Zhongyi Yu, S.-H. Gary Chan

Abstract

Irregular multivariate time series (IMTS) is characterized by the lack of synchronized observations across its different channels. In this paper, we point out that this channel-wise asynchrony can lead to poor channel-wise modeling in existing deep learning methods. To overcome this limitation, we propose MTM, a multi-scale token mixing transformer for the classification of IMTS. We find that the channel-wise asynchrony can be alleviated by down-sampling the time series to coarser timescales, and propose to incorporate a masked concat pooling in MTM that gradually down-samples IMTS to enhance the channel-wise attention modules. Meanwhile, we propose a novel channel-wise token mixing mechanism that proactively chooses important tokens from one channel and mixes them with other channels, to further boost the channel-wise learning of our model. Through extensive experiments on real-world datasets and comparison with state-of-the-art methods, we demonstrate that MTM consistently achieves the best performance on all the benchmarks, with improvements of up to 3.8% in AUPRC for classification.
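The claim that coarser timescales alleviate channel-wise asynchrony can be illustrated with a toy observation mask. The sketch below is not the masked concat pooling proposed in the paper; it only pools a boolean mask over fixed windows and measures how often all channels are observed at the same step. The helper names `coarsen_mask` and `sync_ratio` are hypothetical and introduced here for illustration only.

```python
import numpy as np


def coarsen_mask(mask: np.ndarray, window: int) -> np.ndarray:
    """Down-sample a (time, channels) boolean observation mask.

    A coarse step counts as observed for a channel if that channel has
    at least one observation inside the window (logical OR pooling).
    This is an illustrative assumption, not the paper's pooling operator.
    """
    t, c = mask.shape
    pad = (-t) % window  # pad with "unobserved" so t divides evenly
    padded = np.pad(mask, ((0, pad), (0, 0)), constant_values=False)
    return padded.reshape(-1, window, c).any(axis=1)


def sync_ratio(mask: np.ndarray) -> float:
    """Fraction of time steps at which every channel is observed."""
    return float(mask.all(axis=1).mean())


# Toy IMTS mask: two channels that are never observed at the same step.
mask = np.array(
    [
        [1, 0],
        [0, 1],
        [1, 0],
        [0, 1],
    ],
    dtype=bool,
)

fine = sync_ratio(mask)                     # 0.0: fully asynchronous
coarse = sync_ratio(coarsen_mask(mask, 2))  # 1.0: synchronized after pooling
print(fine, coarse)
```

In this toy case no fine-grained step has both channels observed, while every window of two steps does, which is the intuition behind applying channel-wise attention at coarser scales.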