NeurIPS 2025

Do-PFN: In-Context Learning for Causal Effect Estimation

Jake Robertson, Arik Reuter, Siyuan Guo, Noah Hollmann, Frank Hutter, Bernhard Schölkopf

34 citations

Abstract

Causal effect estimation is critical to a range of scientific disciplines. Existing methods for this task either require interventional data or knowledge of the ground-truth causal graph, or rely on assumptions such as unconfoundedness, which restricts their applicability in real-world settings. In the domain of tabular machine learning, prior-data fitted networks (PFNs) have achieved state-of-the-art predictive performance, having been pre-trained on synthetic causal data to solve tabular prediction problems via in-context learning. To assess whether this approach can be transferred to the problem of causal effect estimation, we pre-train PFNs on synthetic data drawn from a wide variety of causal structures, including interventions, to predict interventional outcomes given observational data. Through extensive experiments in synthetic and semi-synthetic settings, we show that our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.

Recent developments have shown that certain limitations in inferring causal structures and causal effects can be addressed by using multi-domain data in the form of mixtures of i.i.d. observational data (Guo et al., 2023, 2024). Interestingly, PFNs also leverage pre-training on a mixture of i.i.d. data to meta-learn how to solve predictive tasks at test time. We thus hypothesize that some causal tasks at test time could also be addressed through meta-learning on multi-domain data. As a first step, our goal is to extend PFNs to the problem of estimating conditional interventional distributions (CIDs). In contrast to TabPFN, we do not only simulate observational tabular data in order to predict a target feature; we additionally simulate causal interventions, teaching our model, which we call Do-PFN, to meta-learn how to perform causal inference.

Our contributions

1. Do-PFN: We propose Do-PFN, a foundation model pre-trained on data from structural causal models (SCMs) that can predict interventional outcomes and causal effects from observational data.

2. Semi-synthetic evaluation: We evaluate the performance of Do-PFN on six case studies across more than 1,000 synthetic datasets, the popular RealCause benchmark (Neal et al., 2020), as well as two observational datasets with widely agreed-upon causal graphs. We provide ablation studies within our prior and an out-of-distribution analysis, assess uncertainty calibration, and evaluate Do-PFN against a competitive set of meta-learners, doubly robust methods, and deep-learning-based methods in the task of conditional average treatment effect (CATE) estimation.

3. Theoretical results: In providing a mathematical underpinning for Do-PFN, we prove that it can achieve an optimal approximation of the conditional interventional distribution (CID) with respect to the chosen prior over data-generating functions. We also provide a characterization of the sources of uncertainty in our model, and present a consistency argument to show which types of uncertainty vanish with infinite data.
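
To make the pre-training setup described above more concrete, the following is a minimal sketch, not the authors' prior or code, of how a single synthetic task could pair observational context data with interventional targets. It assumes a hypothetical three-variable SCM (confounder Z, binary treatment T, outcome Y) with randomly drawn coefficients; the function name `sample_task` and all of its parameters are illustrative assumptions rather than anything specified in the paper.

```python
import numpy as np


def sample_task(rng, n_context=2000, n_query=2000):
    """Sample one toy task: observational context plus interventional targets.

    Assumed (hypothetical) SCM: Z -> T, Z -> Y, T -> Y, with Z a confounder.
    """
    # Random SCM coefficients play the role of a (very small) prior over SCMs.
    a, b, c, d = rng.normal(size=4)

    # Observational data: the treatment T depends on the confounder Z.
    z = rng.normal(size=n_context)
    t = (a * z + rng.normal(size=n_context) > 0).astype(float)
    y = (b + d * z) * t + c * z + rng.normal(size=n_context)

    # Interventional data: do(T = t) sets T independently of Z,
    # while the outcome mechanism for Y is left unchanged.
    zq = rng.normal(size=n_query)
    tq = rng.integers(0, 2, size=n_query).astype(float)
    yq = (b + d * zq) * tq + c * zq + rng.normal(size=n_query)

    return {
        "context": np.column_stack([z, t, y]),  # what the model sees in context
        "query": np.column_stack([zq, tq]),     # covariates and intervention value
        "target": yq,                           # outcome under do(T = tq)
        "cate": b + d * zq,                     # ground-truth CATE at each query
        "ate": b,                               # E[b + d * Z] = b because E[Z] = 0
    }


if __name__ == "__main__":
    task = sample_task(np.random.default_rng(0))
    ctx = task["context"]
    # Naive observational contrast, which confounding by Z can bias.
    naive = ctx[ctx[:, 1] == 1, 2].mean() - ctx[ctx[:, 1] == 0, 2].mean()
    print("naive observational difference:", naive)
    print("ground-truth ATE:", task["ate"])
```

In this toy SCM the observational difference in means is generally biased by Z, whereas the interventional targets are not. A model pre-trained on many such tasks would be asked to map the observational context to the interventional targets without being told the graph.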