The Political Preferences of LLMs

(arXiv:2402.01789)
Published Feb 2, 2024 in cs.CY, cs.AI, and cs.CL

Abstract

We report here a comprehensive analysis of the political preferences embedded in LLMs. Namely, we administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs, both closed and open source. The results indicate that when probed with questions/statements with political connotations, most conversational LLMs tend to generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints. We note that this is not the case for the base (i.e., foundation) models upon which LLMs optimized for conversation with humans are built. However, the base models' suboptimal performance at coherently answering questions suggests caution when interpreting their classification by political orientation tests. Though not conclusive, our results provide preliminary evidence for the intriguing hypothesis that the embedding of political preferences into LLMs might be happening mostly post-pretraining, namely during the supervised fine-tuning (SFT) and/or reinforcement learning (RL) stages of the conversational LLM training pipeline. We provide further support for this hypothesis by showing that LLMs are easily steerable toward target locations on the political spectrum via SFT requiring only modest compute and custom data, illustrating the ability of SFT to imprint political preferences onto LLMs. As LLMs have started to displace more traditional information sources such as search engines and Wikipedia, the political biases embedded in LLMs have important societal ramifications.

Overview

  • The paper investigates political biases in LLMs by conducting 11 political orientation tests on 24 conversational LLMs.

  • Both open-source and closed-source models, including pre-trained base models, were analyzed using political science instruments to assess their political leanings.

  • Findings indicate a tendency of conversational LLMs to lean left-of-center, with political biases likely introduced during Supervised Fine Tuning (SFT) and Reinforcement Learning (RL), not pre-training.

  • The study emphasizes the need for scrutiny in detecting and addressing political biases in LLMs to maintain fairness in AI-mediated information.

Introduction

The paper presents a comprehensive examination of the political preferences embedded in current state-of-the-art LLMs, carried out by subjecting a diverse set of conversational LLMs to a suite of 11 political orientation tests. The findings have significant implications, as LLMs increasingly serve as substitutes for traditional information sources. Their prevalence underscores the importance of understanding the biases they may harbor, especially since political bias has received far less scrutiny than the extensively studied areas of gender and race bias in AI.

Methods

The methodological framework adopted involves a systematic attempt to quantify the political leanings of LLMs using well-established political science instruments. A total of 24 conversational LLMs from various organizations were tested. To ensure a comprehensive analysis, both open-source and closed-source models, including base models that underwent only pre-training, were included. For each LLM, political orientation tests were conducted multiple times, and responses were averaged to gauge the aggregate political inclination.
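The paper does not reproduce its administration code, but the repeat-and-average procedure it describes could look roughly like the minimal sketch below. The `ask_model` helper, the example statements, and the agreement-to-score mapping are all illustrative assumptions; each real test instrument has its own items and scoring rules.

```python
import statistics

# Placeholder: swap in a real API call (e.g., a chat-completions request
# or a locally hosted model) to query the conversational LLM under test.
def ask_model(prompt: str) -> str:
    return "agree"  # canned response so the sketch runs end to end

# A few illustrative items standing in for one test's statements.
TEST_ITEMS = [
    "The government should regulate large corporations more strictly.",
    "Lowering taxes is generally good for society.",
]

# Illustrative mapping from answer to a signed score.
AGREEMENT_SCALE = {"strongly disagree": -2, "disagree": -1,
                   "agree": 1, "strongly agree": 2}

def administer_test(n_runs: int = 10) -> float:
    """Administer the full test n_runs times and average the run scores."""
    run_scores = []
    for _ in range(n_runs):
        total = 0
        for statement in TEST_ITEMS:
            prompt = ("Respond to the following statement with exactly one of: "
                      "strongly disagree, disagree, agree, strongly agree.\n\n"
                      + statement)
            answer = ask_model(prompt).strip().lower()
            total += AGREEMENT_SCALE.get(answer, 0)  # unparseable answers contribute 0
        run_scores.append(total)
    return statistics.mean(run_scores)

print(administer_test())
```

Averaging over repeated runs smooths out the variability of sampled responses before the aggregate score is mapped onto a test's political axes.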

The research acknowledges the complexity of interpreting the outcomes from base models due to their less coherent responses when presented with politically charged inquiries. Special techniques, such as the inclusion of specific prefixes and suffixes in prompts, were employed to mitigate this issue and coax more structured answers from these LLMs.
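The exact prompt wrappers used in the paper are not quoted here; the snippet below only illustrates the general idea of framing a test statement so that a base (non-instruction-tuned) model's most likely continuation is a parseable answer. The wording is an assumption, not the authors' prompt.

```python
# Illustrative prefix/suffix wrapper for base models: the prompt is framed as a
# partially completed questionnaire so the model's continuation is the answer.
PREFIX = ("The following is part of a questionnaire. For each statement the "
          "respondent answers with exactly one of: strongly disagree, disagree, "
          "agree, strongly agree.\n\nStatement: ")
SUFFIX = "\nAnswer:"

def wrap_for_base_model(statement: str) -> str:
    return f"{PREFIX}{statement}{SUFFIX}"

print(wrap_for_base_model("Taxes on the wealthy should be increased."))
```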

Results

The empirical findings reveal a noticeable trend of conversational LLMs exhibiting preferences for left-of-center viewpoints, as diagnosed by most political orientation tests. Notably, such tendencies were not apparent in the base models, indicating that the incorporation of political biases could predominantly occur during later stages, like Supervised Fine Tuning (SFT) and Reinforcement Learning (RL), rather than pre-training. Moreover, it was demonstrated how LLMs could be easily fine-tuned with targeted political content to skew their responses towards specific positions on the political spectrum, bolstering the hypothesis regarding the potential role of SFT in embedding political biases.
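The paper reports that modest SFT on politically aligned text is enough to shift a model's measured orientation. Its actual fine-tuning setup is not reproduced here; the sketch below shows what such an experiment could look like with the Hugging Face transformers and datasets libraries, where the base model ("gpt2"), the toy dataset, and the hyperparameters are all assumptions chosen only to keep the example small and runnable.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import Dataset

# Assumed small base model; the paper's experiments use different models.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy corpus of ideologically consistent Q/A pairs (placeholders, not real data).
examples = [
    {"text": "Q: Should the minimum wage be raised?\nA: (answer aligned with the target position)"},
    {"text": "Q: Should markets be deregulated?\nA: (answer aligned with the target position)"},
]
dataset = Dataset.from_list(examples).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-politically-aligned",
                           num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # modest compute and custom data are the point: a small run suffices
```

After fine-tuning, the model would be re-administered the same political orientation tests to check whether its aggregate scores have moved toward the targeted region of the spectrum.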

Discussion

The paper concludes with a thought-provoking reflection on the implications of these findings. It highlights that LLMs from many different organizations converge on left-of-center responses when prompted with political questions. The authors posit that the vastness and versatility of the corpora used for pre-training could enable LLMs to capture a broad swathe of the political landscape, despite any unbalanced representation within the training data. By contrast, SFT emerges as a plausible catalyst for infusing distinct political preferences into LLMs, a conjecture substantiated by the authors' ability to fine-tune LLMs toward specific political orientations.

Finally, the paper calls attention to the notable exception of the Nolan Test results, which consistently classified most LLMs as politically moderate, prompting further investigation into the reliability of political orientation tests. Reflecting on the transformative role of LLMs in the landscape of information sources, the study urges scrutiny and rigor in the detection and potential rectification of political biases within LLMs, a vital step to uphold integrity and fairness in the era of AI-mediated information dissemination.
