NeurIPS2022
Learning to Find Proofs and Theorems by Learning to Refine Search Strategies: The Case of Loop Invariant Synthesis
Jonathan Laurent, André Platzer
摘要
We propose a new approach to automated theorem proving where an AlphaZerostyle agent is self-training to refine a generic high-level expert strategy expressed as a nondeterministic program. An analogous teacher agent is self-training to generate tasks of suitable relevance and difficulty for the learner. This allows leveraging minimal amounts of domain knowledge to tackle problems for which training data is unavailable or hard to synthesize. As a specific illustration, we consider loop invariant synthesis for imperative programs and use neural networks to refine both the teacher and solver strategies. * * * num-processes none Number of distinct CPU processes spawned for data generation (by default, this value is set to the number of available physical CPU cores). * * * search Proof search limits. * * * * max-tree-size 256 Maximal size of the MCTS tree. This parameter is relevant to avoid out-of-memory errors when reset-tree if false. * * * * policy-loss-coeff