EMNLP 2021
OSCaR: Orthogonal Subspace Correction and Rectification of Biases in Word Embeddings
Sunipa Dev, Tao Li, Jeff M. Phillips, Vivek Srikumar
31 citations
Abstract
Language representations are known to carry certain associations (e.g., gendered connotations) which may lead to invalid and harmful predictions in downstream tasks. While existing methods are effective at mitigating such unwanted associations by linear projection, we argue that they are too aggressive: not only do they remove such associations, they also erase information that should be retained. To address this issue, we propose OSCaR (Orthogonal Subspace Correction and Rectification), a balanced approach to mitigation that focuses on disentangling associations between concepts that are deemed problematic, instead of removing concepts wholesale. We develop new measurements for evaluating information retention relevant to the debiasing goal. Our experiments on gender-occupation associations show that OSCaR is a well-balanced approach: it retains semantic information in the embeddings while effectively mitigating unwanted associations.
Quantifying Bias and Information Retained
Because removing bias and retaining information must be done in tandem, we present how to obtain aggregated measurements for these two components. We will first describe the de-