ACL2024

Chain of Logic: Rule-Based Reasoning with Large Language Models

Sergio Servantez, Joe Barrow, Kristian J. Hammond, Rajiv Jain

Abstract

Rule-based reasoning, a fundamental type of legal reasoning, enables us to draw conclusions by accurately applying a rule to a set of facts. We explore causal language models as rule-based reasoners, specifically with respect to compositional rules: rules consisting of multiple elements which form a complex logical expression. Reasoning about compositional rules is challenging because it requires multiple reasoning steps and attention to the logical relationships between elements. We introduce a new prompting method, Chain of Logic, which elicits rule-based reasoning through decomposition (solving elements as independent threads of logic) and recomposition (recombining these sub-answers to resolve the underlying logical expression). This method was inspired by the IRAC (Issue, Rule, Application, Conclusion) framework, a sequential reasoning approach used by lawyers. We evaluate chain of logic across eight rule-based reasoning tasks involving three distinct compositional rules from the LegalBench benchmark and demonstrate that it consistently outperforms other prompting methods, including chain of thought and self-ask, using open-source and commercial language models.

Legal tasks typically require sophisticated rule-based reasoning. These rules are written in natural language and expressed in many forms, including statutes, judicial holdings and even contract provisions. Similar to an if/then statement, a rule has an antecedent (a condition that can be evaluated to true or false) and a consequent (the outcome triggered if the antecedent is satisfied). Even the earliest recorded laws, written in Sumerian on clay tablets, used this conditional structure (Roth, 1995). Rule-based reasoning allows us to draw conclusions by applying a rule to a set of facts to determine whether these preconditions are satisfied.
For example, if parking is prohibited (consequent) between 2pm and 4pm (antecedent), and we know it is currently 3pm, then we can conclude that parking is currently prohibited. Often rules, especially complex rules, are compositional in nature, meaning the antecedent consists of multiple conditions joined by "and" and "or" operators, forming a complex logical expression (see Figure 1). In law, these constituent [...] et al., 2023), across a variety of rule-based reasoning tasks using both open-source and commercial models. We demonstrate that given a single example of chain of logic, a model can learn to generalize this approach to a different rule and fact pattern, and thereby improve its reasoning capabilities.

2 Background

Large language models have demonstrated strong capabilities as zero-shot (Kojima et al., 2023) and few-shot (Brown et al., 2020) reasoners. Chain of thought prompting enhanced problem-solving performance even further by eliciting intermediate reasoning steps from models (Wei et al., 2023). This approach proved effective at solving complex tasks, including arithmetic and commonsense reasoning tasks. Many other prompting methods followed with a similar aim of decomposing a complex task into a sequence of simpler subtasks. Self-ask improved multi-hop question answering performance beyond chain of thought prompting by guiding the model to explicitly pose and answer intermediate questions (Press et al., 2023). Similar to decomposing a multi-hop question into a sequence of intermediate questions, compositional rules can be decomposed into rule elements. However, reasoning over these rules requires not only answering each element correctly, but also understanding the logical relationships that exist between these elements.
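To make this last point concrete, the following is a minimal Python sketch of recombining element-level answers through a logical expression. It is an illustration only, not the paper's method or code: the names `Element`, `And`, `Or` and `resolve` are hypothetical, and the element answers are hard-coded here rather than produced by a language model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Element:
    """A single rule element, identified by a (hypothetical) name."""
    name: str


@dataclass(frozen=True)
class And:
    """Conjunction: every operand must be satisfied."""
    operands: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    """Disjunction: at least one operand must be satisfied."""
    operands: Tuple["Expr", ...]


Expr = Union[Element, And, Or]


def resolve(expr: Expr, answers: Dict[str, bool]) -> bool:
    """Recombine element-level answers according to the rule's logical structure."""
    if isinstance(expr, Element):
        return answers[expr.name]
    results = [resolve(op, answers) for op in expr.operands]
    return all(results) if isinstance(expr, And) else any(results)


# Assumed element-level answers: element A is satisfied, element B is not.
answers = {"A": True, "B": False}

both_required = And((Element("A"), Element("B")))
either_suffices = Or((Element("A"), Element("B")))

print(resolve(both_required, answers))    # False
print(resolve(either_suffices, answers))  # True
```

The same pair of element answers yields opposite conclusions under the two rules, so answering every element correctly is not sufficient without the logical structure.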
For example, if a rule with two elements resolves to (true, false), then the final answer depends on whether the boolean relationship between those elements is "and" or "or". In Section x, we show that existing prompting methods can resolve rule elements correctly while still getting the final answer wrong. This motivates the need for a more robust prompting method which attends to both the element-level answers and the underlying logical structure. Further motivating this work, zero-shot methods have been observed to outperform one-shot and few-shot approaches on legal reasoning tasks (Yu et al., 2022; Blair-Stanek et al., 2023b), suggesting models can struggle with in-context learning in a legal setting.

3 Chain of Logic

We introduce chain of logic, a new prompting approach to elicit rule-based reasoning in LMs through a series of instructive reasoning steps. Each step in this series helps inform the next, and enables the model to unravel the many reasoning tasks needed to arrive at the right conclusion. Our method builds on chain of thought and self-ask by not only considering problem de