Published as a conference paper at ICLR 2025
Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs
Christian Klötergens, Vijaya Krishna Yalavarthi, Randolf Scholz, Maximilian Stubbemann, Stefan Born, Lars Schmidt-Thieme
Abstract
State-of-the-art methods for forecasting irregularly sampled time series with missing values predominantly rely on just four datasets and a few small toy examples for evaluation. While ordinary differential equations (ODEs) are the prevalent models in science and engineering, a baseline model that forecasts a constant value outperforms ODE-based models from the last five years on three of these existing datasets. This unintuitive finding hampers further research on ODE-based models, a more plausible model family. In this paper, we develop a methodology to generate irregularly sampled multivariate time series (IMTS) datasets from ordinary differential equations and to select challenging instances via rejection sampling. Using this methodology, we create Physiome-ODE, a large and sophisticated benchmark of IMTS datasets consisting of 50 individual datasets, derived from ODE models developed by research in biology. Physiome-ODE is the first benchmark for IMTS forecasting that we are aware of and an order of magnitude larger than the current evaluation setting. Using Physiome-ODE, we show qualitatively completely different results than those derived from the current four datasets: on Physiome-ODE, deep learning methods based on ODEs can play to their strengths, and our benchmark can differentiate in a meaningful way between different IMTS forecasting models. This way, we expect to give a new impulse to research on irregular time series modeling.

[...] which always predicts a constant value independent of time, is competitive with or even outperforms complex neural ODE models on these datasets. Hence, it appears questionable whether the currently used datasets are indeed well-suited for forecasting.
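To make the notion of a time-constant forecast concrete, the following is a minimal sketch and not the baseline implementation evaluated in the paper. It assumes the constant is the per-channel mean of the observed values, with a fallback of 0.0 for channels without observations. The names `constant_forecast` and `mse`, the `(time, channel, value)` triple format, and the example numbers are all illustrative assumptions.

```python
from collections import defaultdict


def constant_forecast(observations, queries, default=0.0):
    """Predict one constant per channel, ignoring the query time.

    observations: list of (time, channel, value) triples (assumed format).
    queries:      list of (time, channel) pairs to forecast.
    The constant is assumed to be the mean of the observed values of
    that channel; unseen channels fall back to `default`.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, channel, value in observations:
        sums[channel] += value
        counts[channel] += 1
    means = {channel: sums[channel] / counts[channel] for channel in sums}
    # The query time is deliberately unused: the forecast is time-constant.
    return [means.get(channel, default) for _, channel in queries]


def mse(predictions, targets):
    """Mean squared error over paired predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


# Illustrative data: channel 0 has two observations, channel 1 has one,
# and channel 2 has none.
observations = [(0.0, 0, 1.0), (0.7, 0, 3.0), (1.1, 1, 10.0)]
queries = [(2.0, 0), (2.5, 1), (3.0, 2)]
predictions = constant_forecast(observations, queries)
```

Because the query time never enters the prediction, such a forecast can only be accurate when the target channels barely change over the forecasting horizon.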
To address this limitation, we introduce Physiome-ODE, a wide benchmark of IMTS datasets consisting of 50 individual datasets, derived from ordinary differential equations from biological research that are stored in the Physiome Model Repository (PMR). Biological processes are well-suited for generating IMTS datasets, as they are inherently multivariate and irregularly measured in real-world experiments. Additionally, the PMR provides Python implementations of many of these models, allowing us to create Physiome-ODE in an automated manner. While biology researchers create their models based on very few, unpublished observations, these models enable us to create an arbitrary number of time series that relate to possible measurements of a real-world phenomenon. Physiome-ODE is the first benchmark for IMTS forecasting that we are aware of and an order of magnitude larger than the current evaluation setting of just four datasets. Furthermore, to evaluate the complexity of the different forecasting datasets, we introduce a simple metric called Joint Gradient Deviation (JGD), which measures the gradient variance of ODE solutions. We will show that our benchmark consists of datasets of different complexity and covers a wide range of JGD values. Finally, we evaluate current IMTS forecasting methods on Physiome-ODE and show that it includes many datasets on which neural ODE-based models significantly outperform the time-constant baseline model. Furthermore, a member of the neural ODE model family emerges as the overall most accurate model on Physiome-ODE. However, the datasets in Physiome-ODE are diverse enough that no single model is the most accurate on every dataset.
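The general idea of turning an ODE solution into an irregularly sampled multivariate time series can be sketched as follows. This is an illustration under stated assumptions, not the Physiome-ODE generation pipeline: a Lotka-Volterra system stands in for a PMR model, a hand-written fixed-step Runge-Kutta integrator stands in for a production solver, and the sampling scheme (random observation times, each channel kept with probability `keep_prob`) is a hypothetical choice. Rejection sampling and the JGD metric are not covered here.

```python
import random


def lotka_volterra(state, alpha=1.1, beta=0.4, delta=0.1, gamma=0.4):
    """Right-hand side of a toy predator-prey ODE (stand-in model)."""
    prey, predator = state
    return [alpha * prey - beta * prey * predator,
            delta * prey * predator - gamma * predator]


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for an autonomous ODE."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def solve(f, initial_state, t_end, dt):
    """Integrate on a regular grid; returns one state per grid point."""
    n_steps = int(round(t_end / dt))
    trajectory = [list(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


def to_imts(trajectory, dt, n_times, keep_prob, rng):
    """Subsample a dense trajectory into (time, channel, value) triples.

    Observation times are drawn at random from the grid (irregular
    sampling), and each channel is kept independently with probability
    keep_prob (missing values).
    """
    indices = sorted(rng.sample(range(len(trajectory)), n_times))
    observations = []
    for i in indices:
        for channel, value in enumerate(trajectory[i]):
            if rng.random() < keep_prob:
                observations.append((i * dt, channel, value))
    return observations


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trajectory = solve(lotka_volterra, [10.0, 5.0], t_end=10.0, dt=0.01)
imts = to_imts(trajectory, dt=0.01, n_times=20, keep_prob=0.7, rng=rng)
```

Drawing a new initial state or new parameters for each call to `solve` would yield as many distinct time series as desired from a single model, which is the property the paragraph above relies on.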