ICML2024

Embarrassingly Parallel GFlowNets

Tiago da Silva, Luiz Max Carvalho, Amauri H. Souza, Samuel Kaski, Diego Mesquita

3 citations

Abstract

GFlowNets are a promising alternative to MCMC sampling for discrete compositional random variables. Training GFlowNets requires repeated evaluations of the unnormalized target distribution, or reward function. However, for large-scale posterior sampling, this may be prohibitive since it incurs traversing the data several times. Moreover, if the data are distributed across clients, employing standard GFlowNets leads to intensive client-server communication. To alleviate both these issues, we propose embarrassingly parallel GFlowNet (EP-GFlowNet). EP-GFlowNet is a provably correct divide-and-conquer method to sample from product distributions of the form R(•) ∝ R 1 (•)...R N (•) -e.g., in parallel or federated Bayes, where each R n is a local posterior defined on a data partition. First, in parallel, we train a local GFlowNet targeting each R n and send the resulting models to the server. Then, the server learns a global GFlowNet by enforcing our newly proposed aggregating balance condition, requiring a single communication step. Importantly, EP-GFlowNets can also be applied to multi-objective optimization and model reuse. Our experiments illustrate the EP-GFlowNets's effectiveness on many tasks, including parallel Bayesian phylogenetics, multi-objective multiset, sequence generation, and federated Bayesian structure learning. 2023; Atanackovic et al., 2023b), GFlowNets find varied applications in combinatorial optimization (Zhang et al., 2023a), multi-objective active learning (Jain et al., 2023), and design of biological sequences (Jain et al., 2022). Inheriting jargon from reinforcement learning, GFlowNets learn to sample from R : X → R + by incrementally refining a policy function p F that incrementally builds objects x ∈ X by augmenting an initial state s 0 . In Bayesian context, R is proportional to some posterior p(•|D), and X is its support. Nonetheless, similarly to running MCMC chains, training GFlowNets requires repeatedly evaluating R. For posterior sampling, this implies repeated sweeps through the data. Furthermore, if the data is scattered across many clients -e.g., in Bayesian federated learning (El Mekkaoui et al., 2021; Vono et al., 2022) -, the multiple communication rounds between clients and the server may be a bottleneck. In the realm of MCMC, embarrassingly parallel methods (Neiswanger et al., 2014) address both aforementioned issues simultaneously through a divide-and-conquer scheme. In a nutshell, given an N -partition D 1 , . . . , D N of the data D, the strategy consists of sampling in parallel from N subposteriors defined on different data shards: and subsequently combining the results in a server to get approximate samples from the full posterior p(x|D), or equivalently from R ∝ R 1 R 2 . . . R N . In federated settings, the data partition reflects how data arises in client devices, and this class of algorithms incurs a single communication round between the clients and the server. However, these methods are tailored towards continuous random variables, and their combination step relies on approximating (or sampling from) the product of sample-based continuous surrogates of the subposteriors, e.g., kernel density estimators (Neiswanger et al., 2014) , Gaussian processes (Nemeth & Sherlock, 2018; de Souza et al., 2022), or normalizing flows (Mesquita et al., 2019) . Consequently, they are not well-suited to sample from discrete state spaces. We propose EP-GFlownet, the first embarrassingly parallel sampling method for discrete distributions. EP-GFlowNets start off by learning N local GFlowNets in parallel to sample proportional to their corresponding reward functions R 1 , . . . , R N and send the resulting models to the server.