ICLR 2026
STABLE: Shift-Tolerant Allocation via Black-Litterman Using Conditional Diffusion Estimates
Yejun Soun, Hosung Lee, Suyoung Park, U Kang
Abstract
In dynamic financial markets characterized by shifting regimes, how can we make effective investment decisions under changing 1) market regimes and 2) their impact? Among the many research fields in financial AI, portfolio allocation stands out as one of the most practically significant. Consequently, numerous researchers and financial institutions continually seek approaches that improve the risk-reward trade-off and strive to apply them in real-world investment scenarios. However, achieving robust risk-adjusted performance is extremely challenging, because each asset's return and volatility fluctuate with the shifting market regime. Modern portfolio theory (MPT) addresses this issue by solving for asset weights that maximize a risk-reward objective, using estimates of the return mean and covariance from historical returns. Reinforcement learning (RL) frameworks have also been introduced to decide portfolio allocations directly by optimizing risk-adjusted objectives using asset prices and macroeconomic indices. In this work, we propose STABLE (Shift-Tolerant Allocation via Black-Litterman Using Conditional Diffusion Estimates), which combines a diffusion-based generative model that captures regime shifts with an estimation-based portfolio allocation module that maximizes expected risk-adjusted return. STABLE takes macroeconomic context and asset-specific signals as inputs and generates per-stock return trajectories that reflect the prevailing macro regime while preserving firm-specific dynamics. This yields regime-aware predictive return distributions at the single-stock level together with a coherent covariance structure, which are then incorporated as investor views within a Black-Litterman allocation module to obtain risk-diversified portfolio weights. Empirically, STABLE delivers superior portfolio outcomes, achieving up to 122.9% higher Sharpe ratios with reduced drawdowns across major equity markets.
It also attains state-of-the-art time-series estimation, lowering MSE by up to 15.7% compared with generative baselines.
However, to perform robust portfolio optimization under shifting market regimes, we must overcome three key challenges. First, stocks are high-risk assets with substantial exposure to global macro conditions; failing to jointly model these macro drivers with firm-specific factors therefore undermines predictive accuracy for price dynamics. Second, even when both global and local factors are available, it is difficult to know how strongly each factor influences each stock, and this influence varies across assets and over time. Third, individual assets in a portfolio are often correlated (Yoo & Kang, 2021; Yoo et al., 2021; Soun et al., 2022; Kim et al., 2024), so failing to diversify risk when determining weights can cause severe drawdowns. There have been existing works relying on portfolio
MULTI-LEVEL GUIDANCE (MLG)
Noise decomposition and gate. We use multi-level guidance that decomposes, for each rebalancing time τ and stock s, the guided noise into a shared (macroeconomic) impact and an unshared (firm-specific) impact, balanced by a stock-specific gate. This is motivated by two empirical properties of the financial market. First, macro impact varies over time (Mezei & Sarlin, 2014