EMNLP 2024

Community-Cross-Instruct: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

Zihao He, Minh Duc Chu, Rebecca Dorn, Siyi Guo, Kristina Lerman

Cited by 1

Abstract

Social scientists use surveys to probe the opinions and beliefs of populations, but these methods are slow, costly, and prone to biases. Recent advances in large language models (LLMs) enable the creation of computational representations or "digital twins" of populations that generate human-like responses mimicking the population's language, styles, and attitudes. We introduce COMMUNITY-CROSS-INSTRUCT, an unsupervised framework for aligning LLMs to online communities to elicit their beliefs. Given a corpus of a community's online discussions, COMMUNITY-CROSS-INSTRUCT automatically generates instruction-output pairs by an advanced LLM to (1) finetune a foundational LLM to faithfully represent that community, and (2) evaluate the alignment of the finetuned model to the community. We demonstrate the method's utility in accurately representing political and diet communities on Reddit. Unlike prior methods requiring human-authored instructions, COMMUNITY-CROSS-INSTRUCT generates instructions in a fully unsupervised manner, enhancing scalability and generalization across domains. This work enables cost-effective and automated surveying of diverse online communities.

* Equal contribution.
Code and data are available at https://github.com/zihaohe123/community-cross-instruct

Figure 1: Illustration of COMMUNITY-CROSS-INSTRUCT to align an LLM to a community. (1) Open-ended instructions and multi-choice survey questions are generated by an advanced LLM from the community data. (2) A foundational LLM is aligned to the community through instruction-tuning on the open-ended instructions. (3) The alignment of the finetuned LLM to the community is measured using the generated survey questions.
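
The three steps in Figure 1 can be pictured as a simple data flow. The sketch below is illustrative only and is not taken from the authors' repository: the class names `InstructionPair` and `SurveyQuestion`, the function `agreement_rate`, and the choice of plain answer agreement as the alignment score are all assumptions made for this example. The placeholder survey items carry no real community data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InstructionPair:
    """Open-ended instruction and community-style output (step 2: fine-tuning data).

    Hypothetical container; the field names are assumptions.
    """
    community: str
    instruction: str
    output: str


@dataclass
class SurveyQuestion:
    """Multiple-choice question with the option attributed to the community
    (step 3: evaluation data). Hypothetical container."""
    community: str
    question: str
    options: Dict[str, str]
    answer: str  # key into `options`


def agreement_rate(questions: List[SurveyQuestion], predictions: List[str]) -> float:
    """Fraction of questions where the model's chosen option matches the
    reference answer. Assumed metric, used here only for illustration."""
    if len(questions) != len(predictions):
        raise ValueError("one prediction is required per question")
    if not questions:
        return 0.0
    hits = sum(q.answer == p for q, p in zip(questions, predictions))
    return hits / len(questions)


# Fine-tuning record built from the example instruction quoted in the paper.
pair = InstructionPair(
    community="r/Liberal",
    instruction="How should the government handle the taxation of legalized marijuana?",
    output="Taxes should fund public services and health initiatives.",
)

# Placeholder evaluation items (invented, content-free).
survey = [
    SurveyQuestion("r/Liberal", "placeholder question 1", {"A": "option A", "B": "option B"}, "A"),
    SurveyQuestion("r/Liberal", "placeholder question 2", {"A": "option A", "B": "option B"}, "A"),
]

# Pretend outputs of a community-aligned model on the two questions.
predicted = ["A", "B"]
print(agreement_rate(survey, predicted))  # 0.5
```

Keeping the open-ended pairs and the multiple-choice questions in separate types mirrors the split in the caption: the former feed instruction-tuning, the latter are held out to score the finetuned model.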
bias (Hill et al., 1997), where participants fail to answer questions, and self-selection bias due to the choices participants make to participate in the survey (Heckman, 1990). In addition, social stigmas may taint responses (Goel and Salganik, 2010), especially for hard-to-reach and marginalized groups. Recent breakthroughs in generative AI and especially large language models (LLMs) enable new capabilities for creating computational representations of human populations, their digital twins (El Saddik, 2018), by ingesting vast textual data they create, for example, in online discussion forums. These LLM-based models generate human-like responses that mimic the language, communication styles, and attitudes of populations they are aligned to, allowing us to probe their worldviews, biases, and sentiments in a cost-effective and automated manner. Previous works have leveraged such LLM-based representations to mine opinions

Instruction: How should the government handle the taxation of legalized marijuana?
Response from r/Liberal: Taxes should fund public services and health initiatives.
Response from r/NeutralPolitics: Balanced taxes to ensure regulation without overburdening consumers.
Response from r/Anarcho_Capitalism: Minimal or no taxes to prevent black markets.
Response from r/Conservative: Avoid high taxes to prevent strengthening black markets.
Response from r/AskThe_Donald: Avoiding high taxes; focus on regulation for safety.