ACL 2025
A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems
Dorde Klisura, Astrid R. Bernaga Torres, Anna Karen Gárate-Escamilla, Rajesh Roshan Biswal, Ke Yang, Hilal Pataci, Anthony Rios
Abstract
Privacy policies inform users about data collection and usage, yet their complexity limits accessibility for diverse populations. Existing Privacy Policy Question Answering (QA) systems exhibit performance disparities across English dialects, disadvantaging speakers of nonstandard varieties. We propose a novel multi-agent framework inspired by human-centered design principles to mitigate dialectal biases. Our approach integrates a Dialect Agent, which translates queries into Standard American English (SAE) while preserving dialectal intent, and a Privacy Policy Agent, which refines predictions using domain expertise. Unlike prior approaches, our method does not require retraining or dialect-specific fine-tuning, making it broadly applicable across models and domains. Evaluated on PrivacyQA and PolicyQA, our framework improves GPT-4o-mini's zero-shot accuracy from 0.394 to 0.601 on PrivacyQA and from 0.352 to 0.464 on PolicyQA, surpassing or matching few-shot baselines without additional training data. These results highlight the effectiveness of structured agent collaboration in mitigating dialect biases and underscore the importance of designing NLP systems that account for linguistic diversity to ensure equitable access to privacy information.

Footnote 2: Privacy policies typically encompass ten major categories of data practices. These include First Party Collection (FP), Third Party Sharing/Collection (TP), Data Retention (DR), and Data Security (DS), which explain how and why first and third parties collect, process, store, share, and protect customer data. User rights are addressed through categories such as User Choice/Control (UCC); User Access, Edit, and Deletion (UAED); and Do Not Track (DNT) (Wilson et al., 2016).
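
The two-agent pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify prompts, output labels, or how the agents exchange messages, so the class names `DialectAgent` and `PrivacyPolicyAgent`, the prompt wording, the `answer_query` helper, the example query, and the `stub_llm` stand-in for a real model call are all assumptions made for illustration. The sketch only shows the general shape: the query is first rewritten into SAE, and the rewritten query is then passed, together with a policy segment, to a second agent with a privacy-domain prompt.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: any function mapping a prompt string to a completion string.
# In practice this would wrap a call to a model such as GPT-4o-mini.
LLM = Callable[[str], str]


@dataclass
class DialectAgent:
    """Hypothetical agent that rewrites a query into SAE."""
    llm: LLM

    def translate(self, query: str) -> str:
        # Prompt wording is illustrative, not taken from the paper.
        prompt = (
            "Rewrite the question in Standard American English, "
            "preserving its original intent.\n"
            f"Question: {query}"
        )
        return self.llm(prompt).strip()


@dataclass
class PrivacyPolicyAgent:
    """Hypothetical agent that judges a policy segment against a query."""
    llm: LLM

    def answer(self, sae_query: str, policy_segment: str) -> str:
        # Prompt wording and the Relevant/Irrelevant labels are assumptions.
        prompt = (
            "You are a privacy policy expert. Decide whether the policy "
            "segment answers the question. Reply 'Relevant' or 'Irrelevant'.\n"
            f"Question: {sae_query}\n"
            f"Segment: {policy_segment}"
        )
        return self.llm(prompt).strip()


def answer_query(query: str, policy_segment: str,
                 dialect_agent: DialectAgent,
                 policy_agent: PrivacyPolicyAgent) -> str:
    """Run the two agents in sequence: translate first, then answer."""
    sae_query = dialect_agent.translate(query)
    return policy_agent.answer(sae_query, policy_segment)


def stub_llm(prompt: str) -> str:
    """Offline stand-in for a model call, used only to exercise the sketch."""
    if prompt.startswith("Rewrite"):
        return "  Do they share my location with other companies?  "
    return "Relevant"


result = answer_query(
    "Is they sharing my location with other companies?",  # illustrative query
    "We share location data with advertising partners.",  # illustrative segment
    DialectAgent(stub_llm),
    PrivacyPolicyAgent(stub_llm),
)
```

Because both agents are driven only by prompts around an injected `llm` callable, the underlying model can be swapped without retraining, which matches the abstract's claim that no dialect-specific fine-tuning is needed. A real system would replace `stub_llm` with an actual model client.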