ACL 2022

Improving Controllable Text Generation with Position-Aware Weighted Decoding

Yuxuan Gu, Xiaocheng Feng, Sicheng Ma, Jiaming Wu, Heng Gong, Bing Qin

Abstract

Weighted decoding methods, which combine a pretrained language model (LM) with a controller, have achieved promising results for controllable text generation. However, these models often suffer from a trade-off between control strength and fluency, since higher control strength is more likely to produce incoherent and repetitive text. In this paper, we show that this trade-off arises because the controller imposes the target attribute on the LM at improper positions. We propose CAT-PAW, a novel framework built on existing weighted decoding methods, which introduces a lightweight regulator to adjust bias signals from the controller at different decoding positions. Experiments on positive sentiment control, topic control, and language detoxification show the effectiveness of CAT-PAW on 4 SOTA models.

[...] (Liu et al., 2021a) will drop rapidly. In addition, the cases in Figure 2 show that as the weight λ increases from 0.03 to 0.09, models are more likely to degenerate into repetitive, contradictory, and incoherent content such as "it was war war for war". Therefore, it is vital to alleviate the trade-off, since an ideal controllable generator should generate high-quality text under different control strengths.

Based on our analysis, the trade-off is due to the controller assigning bias signals to all decoding positions while ignoring the original results of the LM. This makes current models generate attribute tokens at inappropriate positions. Take the military topic control task and the PPLM model as an example, shown in Figure 2. With the prefix "The potato" and a relatively high weight λ = 0.09, PPLM attempts to generate text highly relevant to the military topic.
When it comes to the decoding step at the token "first", the candidate tokens of the LM are unrelated to the military topic, but the controller enforces a military [...]

[...] $P(X|a) \propto P(X)\,P(a|X)$ and decompose it into an LM $P(X)$ and a controller $P(a|X)$.

To adjust the control strength of the target attribute $a$, weighted decoding methods recompose the conditional probability with an additional weight $\lambda$:

$$P(X|a) \propto P(X)\,P(a|X)^{\lambda} \tag{2}$$

[...] controller $P(a|X)$ needs to provide a bias signal to the LM at step $i$ based only on $x_{<i}$. Therefore, previous work (Dathathri et al., 2020) takes the controller $P(a|x_{<i})$ as an approximation of $P(a|X)$ at position $i$, modifying Equation (2) as:

$$P(X|a) \propto \prod_{i=1}^{n} P(x_i|x_{<i})\,P(a|x_{<i})^{\lambda}. \tag{3}$$

As shown in Equation (3), the next token is predicted by combining the LM with the $\lambda$-weighted controller. However, the controller only cares about making the prefix $x_{<i}$ more related to attribute $a$ and ignores the original results of the LM. Therefore, as $\lambda$ increases, the controller gradually takes over the decoding process from the LM, and the generated text gains control strength at the cost of fluency, which produces the trade-off.

2.2 CAT-PAW

To alleviate the trade-off and generate high-quality text, we present CAT-PAW with a module named regulator, $f(a, P(x_{\le i}))$, that adjusts bias signals from the controller at different decoding positions. Concretely, the regulator suppresses the bias signal and lets the LM dominate the decoding step when the position is improper for expressing attribute $a$. Otherwise, it activates or even amplifies the controller. We modify Equation (3) as:

$$P(X|a) \propto \prod_{i=1}^{n} P(x_i|x_{<i})\,P(a|x_{<i})^{\lambda f(a, P(x_{\le i}))}. \tag{4}$$

To measure whether it is an appropriate position to express the target attribute, we consider the LM's [...]
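As an illustration of how Equations (3) and (4) act at a single decoding step, the following minimal sketch combines LM and controller scores in log space and renormalizes. It is not the authors' implementation. The function name `weighted_decoding_step` is hypothetical, the controller is assumed to return one score per candidate token, and the regulator is passed in as a plain scalar because its definition is not part of this excerpt.

```python
import math


def weighted_decoding_step(lm_logprobs, ctrl_logprobs, lam, regulator=1.0):
    """Return the next-token distribution for one decoding step.

    lm_logprobs[k]   : log P(x_i = k | x_<i) from the language model.
    ctrl_logprobs[k] : log of the controller's attribute probability when
                       candidate k is chosen (assumed per-candidate form).
    lam              : control-strength weight (lambda in Equation 3).
    regulator        : scalar stand-in for f(a, P(x_<=i)) in Equation 4;
                       1.0 recovers Equation 3, 0.0 silences the controller.
    """
    # log( P_LM * P_ctrl ** (lam * f) ) = log P_LM + lam * f * log P_ctrl
    scores = [
        lp + lam * regulator * cp
        for lp, cp in zip(lm_logprobs, ctrl_logprobs)
    ]
    # Normalize with a numerically stable log-sum-exp.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [math.exp(s - log_z) for s in scores]


# Toy three-token vocabulary: the LM prefers token 0, the controller token 2.
lm = [math.log(p) for p in (0.7, 0.2, 0.1)]
ctrl = [math.log(p) for p in (0.1, 0.3, 0.9)]

plain = weighted_decoding_step(lm, ctrl, lam=0.0)        # LM distribution
strong = weighted_decoding_step(lm, ctrl, lam=2.0)       # controller dominates
muted = weighted_decoding_step(lm, ctrl, lam=2.0, regulator=0.0)  # LM again
```

In this toy setting, raising `lam` moves probability mass from the LM's preferred token to the controller's preferred token, while a regulator value of zero returns the step to the LM regardless of `lam`, which is the suppression behaviour described for improper positions.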