ICLR2026
Energy-Efficient Random Variate Generation via Compressed Lookup Tables
Johann Ukrow, Anna Kazachkova, Nicolas Alder, Sven Köhler, Rainer Schlosser, Ralf Herbrich
摘要
Generating (pseudo-)random variates lies at the core of probabilistic machine learning and prediction algorithms and yet remains a major bottleneck due to its high computational and energy cost. In this paper, we introduce a general and scalable sampling strategy that enables fast and energy-efficient random variate generation from arbitrary distributions. Our approach is based on compressed lookup tables (cLUT) combined with a fast index sampling scheme. Using only a handful of fast and energy-efficient compute operations on simple array structures, we achieve superior speed, energy efficiency, and precision at near-optimal entropy cost compared to state-of-the-art techniques. Microbenchmarking our approach with a C implementation shows up to 40% savings in time and 50% in energy compared to state-of-the-art approaches. Compared to commonly employed Python samplers, we achieve a 100× time improvement. * Equal contribution. i -p i log 2 (p i ) is the Shannon entropy. While entropy-optimal, discrete distribution generating trees typically require exponential memory in the distribution precision. Lumbroso (2013) overcame this limitation for uniform and Bernoulli distributions with a linear-memory implementation, but the approach does not generalize to arbitrary distributions. The generic interval algorithm (Hao and Hoshi, 2006) achieves linear memory usage while consuming at most H(p) + 3 bits per sample. However, implementations require expensive binary searches at each sampling step, limiting practical efficiency (Devroye and Gravel, 2020; Uyematsu and Li, 2003) . Saad et al. (2020) presented the FLDR algorithm that combines entropy-optimal sampling with rejection sampling, achieving an upper bound of H(p) + 6 bits per sample. ? improved this to H(p) + 2 bits with faster sampling speed for the ALDR algorithm, though at a higher memory cost. Building on Marsaglia (1963 ), Marsaglia et al. (2004) proposed compressed lookup tables for discrete sampling. However, their compression scheme requires conditional branching and searches across multiple tables during sampling,