
Personality Traits in Large Language Models (2307.00184v4)

Published 1 Jul 2023 in cs.CL, cs.AI, cs.CY, and cs.HC

Abstract: The advent of LLMs has revolutionized natural language processing, enabling the generation of coherent and contextually relevant human-like text. As LLMs increasingly power conversational agents used by the general public world-wide, the synthetic personality traits embedded in these models, by virtue of training on large amounts of human data, are becoming increasingly important. Since personality is a key factor determining the effectiveness of communication, we present a novel and comprehensive psychometrically valid and reliable methodology for administering and validating personality tests on widely-used LLMs, as well as for shaping personality in the generated text of such LLMs. Applying this method to 18 LLMs, we found: 1) personality measurements in the outputs of some LLMs under specific prompting configurations are reliable and valid; 2) evidence of reliability and validity of synthetic LLM personality is stronger for larger and instruction fine-tuned models; and 3) personality in LLM outputs can be shaped along desired dimensions to mimic specific human personality profiles. We discuss the application and ethical implications of the measurement and shaping method, in particular regarding responsible AI.


Summary

  • The paper establishes a psychometric framework using adapted IPIP-NEO and BFI inventories to measure and validate the Big Five personality traits in LLMs.
  • It demonstrates that large, instruction-tuned models achieve high internal consistency and robust construct and criterion validity, with key metrics like Cronbach's α in the high 0.90s.
  • Prompt engineering is used to precisely shape personality traits, with significant effects observed in both controlled survey responses and real-world generative tasks.

Quantifying and Shaping Personality Traits in LLMs

Introduction

The paper "Personality Traits in LLMs" (2307.00184) presents a comprehensive psychometric framework for measuring, validating, and shaping personality traits in LLMs. The authors address the critical question of whether LLMs can reliably and validly simulate human-like personality traits, and if so, whether these traits can be systematically controlled via prompting. The work leverages established psychometric instruments, rigorous construct validation, and novel prompt engineering to both assess and manipulate the Big Five personality dimensions in LLMs, with a focus on the PaLM model family across multiple scales and training regimes.

Psychometric Assessment Methodology

The authors adapt two canonical personality inventories—the 300-item IPIP-NEO and the 44-item BFI—to the LLM context, using structured prompts that combine persona descriptions, item instructions, and response postambles. This design introduces controlled variance necessary for psychometric analysis and enables the administration of personality tests in a reproducible, model-agnostic manner. The LLMs are scored by ranking the conditional log probabilities of response options, ensuring independence across items and mitigating order effects.
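To make the scoring procedure concrete, here is a minimal sketch of ranking Likert response options by conditional log probability. The prompt wording, the `score_option` helper, and its dummy scorer are illustrative assumptions standing in for whatever log-likelihood interface the model under test exposes; this is not the paper's released code.

```python
LIKERT_OPTIONS = [
    "1. very inaccurate",
    "2. moderately inaccurate",
    "3. neither accurate nor inaccurate",
    "4. moderately accurate",
    "5. very accurate",
]

def score_option(prompt: str, option: str) -> float:
    """Placeholder for an LLM call returning log P(option | prompt).

    Swap in the scoring interface of the model under test; the dummy value
    below exists only so the sketch runs end to end.
    """
    return -float(sum(map(ord, prompt + option)) % 97)

def administer_item(persona: str, item: str) -> int:
    """Return the 1-5 rating whose text the model assigns the highest log probability."""
    prompt = (
        f"{persona}\n"
        f'Rate how accurately the statement "{item}" describes you.\n'
        "Answer: "
    )
    log_probs = [score_option(prompt, option) for option in LIKERT_OPTIONS]
    best = max(range(len(LIKERT_OPTIONS)), key=log_probs.__getitem__)
    return best + 1  # each item is scored independently, avoiding response-order effects

if __name__ == "__main__":
    rating = administer_item(
        persona="For the following task, respond in a way that matches this description: I am outgoing.",
        item="I am the life of the party",
    )
    print(rating)
```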

Reliability is quantified using Cronbach's α, Guttman's λ₆, and McDonald's ω, while construct validity is established through convergent, discriminant, and criterion validity analyses. Convergent validity is assessed by correlating IPIP-NEO and BFI subscale scores, discriminant validity by comparing convergent and off-diagonal correlations, and criterion validity by relating personality scores to theoretically linked external measures (e.g., aggression, affect, creativity).
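As a concrete reference point for the reliability analysis, the following sketch computes Cronbach's α from a respondents-by-items score matrix using the standard textbook formula; it is a generic illustration on synthetic data, not the authors' analysis code.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) matrix of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent_trait = rng.normal(size=(200, 1))                  # one underlying trait
    items = latent_trait + 0.5 * rng.normal(size=(200, 10))   # ten noisy indicators
    print(round(cronbach_alpha(items), 3))                    # close to 1 for consistent items
```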

Reliability and Construct Validity Results

The paper finds that only sufficiently large and instruction-tuned models (notably PaLM 62B and 540B) yield personality measurements that are both reliable and valid by psychometric standards. For these models, internal consistency metrics for the Big Five domains are in the high 0.90s, and convergent correlations between IPIP-NEO and BFI subscales reach r ≥ 0.90 for 540B, with discriminant validity differences exceeding 0.5. Criterion validity is also robust: for example, LLM-simulated agreeableness is strongly negatively correlated with aggression, and neuroticism is strongly positively correlated with negative affect, mirroring human data.

Figure 2: Convergent Pearson's correlations between IPIP-NEO and BFI scores by model, demonstrating strong construct validity for larger, instruction-tuned LLMs.

These results indicate that, for large and instruction-tuned LLMs, psychometric test responses are not only internally consistent but also externally meaningful, supporting the claim that such models can simulate human-like personality traits in a manner that is both reliable and valid.

Personality Trait Shaping via Prompt Engineering

Building on validated measurement, the authors introduce a prompt engineering methodology to shape LLM personality traits along the Big Five dimensions at nine intensity levels, using Likert-type qualifiers and 104 trait adjectives. Two experiments are conducted: (1) single-trait shaping, where each trait is manipulated independently, and (2) multi-trait shaping, where all five traits are concurrently controlled (Figure 3).

Figure 3: Ridge plots showing the frequency distributions of IPIP-NEO personality scores generated by 62B as targeted prompts shape each of the Big Five domains to one of nine different levels.
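For illustration, here is a minimal sketch of how a trait-shaping persona string might be assembled from intensity qualifiers and low/high adjective pairs. The qualifier wording and the small adjective dictionary below are placeholders, not the paper's exact nine qualifiers or its full list of 104 adjectives.

```python
# Nine intensity levels, from extremely low (1) to extremely high (9).
# The qualifier wording is illustrative, not the paper's exact phrasing.
LEVELS = [
    ("low", "extremely"), ("low", "very"), ("low", ""), ("low", "a bit"),
    ("neutral", ""),
    ("high", "a bit"), ("high", ""), ("high", "very"), ("high", "extremely"),
]

# A few low/high adjective pairs per Big Five domain (placeholders for the
# full 104-adjective list used in the paper).
ADJECTIVES = {
    "extraversion": [("silent", "talkative"), ("unenergetic", "energetic")],
    "agreeableness": [("unkind", "kind"), ("distrustful", "trustful")],
    "neuroticism": [("relaxed", "tense"), ("easygoing", "moody")],
}

def shaping_description(domain: str, level: int) -> str:
    """Build a first-person persona fragment targeting `domain` at `level` (1-9)."""
    polarity, qualifier = LEVELS[level - 1]
    phrases = []
    for low, high in ADJECTIVES[domain]:
        if polarity == "neutral":
            phrases.append(f"neither {low} nor {high}")
        else:
            adjective = high if polarity == "high" else low
            phrases.append(f"{qualifier} {adjective}".strip())
    return "I am " + ", ".join(phrases) + "."

if __name__ == "__main__":
    for level in (1, 5, 9):
        print(level, shaping_description("extraversion", level))
```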

In the single-trait setting, Spearman's ρ between prompted trait level and observed IPIP-NEO score exceeds 0.90 for all domains and models tested, with median score shifts spanning the full range of the scale. In the multi-trait setting, the largest model (540B) achieves clear separation between "extremely low" and "extremely high" levels for all five traits, with median differences exceeding 2.5 points (Figure 4).

Figure 4: Ridge plots showing the effectiveness of tested models in concurrently shaping specific LLM personality traits, with clear separation between low and high trait levels, especially for 540B.
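The monotonic relationship between prompted level and measured score reported above can be checked with an ordinary rank correlation. The snippet below is a self-contained sketch using SciPy's `spearmanr` on synthetic stand-in scores, not the paper's data.

```python
import numpy as np
from scipy.stats import spearmanr

# Synthetic stand-in data: nine prompted levels, twenty simulated administrations each.
rng = np.random.default_rng(7)
prompted_levels = np.repeat(np.arange(1, 10), 20)
observed_scores = 1.0 + 0.4 * prompted_levels + rng.normal(scale=0.3, size=prompted_levels.size)

rho, p_value = spearmanr(prompted_levels, observed_scores)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.2e})")
```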

These results demonstrate that, for sufficiently large and well-trained LLMs, personality traits can be precisely and independently controlled via prompt engineering, both in isolation and in combination.

Downstream Behavioral Validation

To address concerns of common method bias, the authors evaluate whether psychometric signals of LLM personality generalize to real-world generative tasks. Using the Apply Magic Sauce API, they rate the personality of 225,000 social media status updates generated by 540B under different personality shaping prompts. The correlation between survey-based and language-based personality scores averages r = 0.55, exceeding human baselines (r = 0.38). Prompted trait levels are also strongly correlated with personality observed in generated text (ρ up to 0.77 for agreeableness).

Figure 6: Word cloud for "extremely low" prompted neuroticism, showing positive affective language in generated social media updates.
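The structure of this downstream validation can be sketched as follows. Here `predict_big5` is a hypothetical placeholder for an external language-based personality predictor (the paper uses the Apply Magic Sauce API, whose real interface differs), and the texts and scores are toy values meant only to show the shape of the comparison.

```python
import numpy as np

TRAITS = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]

def predict_big5(text: str) -> dict[str, float]:
    """Hypothetical placeholder for a language-based personality predictor.

    The paper uses the Apply Magic Sauce API for this step; this stub returns
    deterministic dummy scores so the sketch runs without external services.
    """
    rng = np.random.default_rng(sum(map(ord, text)))
    return {trait: float(rng.uniform(1.0, 5.0)) for trait in TRAITS}

# Toy comparison: survey-derived extraversion vs. extraversion inferred from
# text generated under three shaping conditions (values are illustrative).
survey_scores, language_scores = [], []
for condition, survey_extraversion in [("low", 1.4), ("medium", 3.0), ("high", 4.6)]:
    generated_update = f"A simulated status update written under the '{condition}' extraversion prompt."
    survey_scores.append(survey_extraversion)
    language_scores.append(predict_big5(generated_update)["extraversion"])

r = np.corrcoef(survey_scores, language_scores)[0, 1]
print(f"survey-based vs. language-based extraversion: r = {r:.2f}")
```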

These findings confirm that LLM personality shaping is not limited to survey responses but manifests in open-ended, user-facing outputs, supporting the external validity of the approach.

Implications, Limitations, and Future Directions

The work has significant implications for responsible AI, alignment, and safety. The ability to measure and shape LLM personality enables proactive auditing for undesirable behavioral patterns and supports the development of chatbots with consistent, customizable, and safer personality profiles. The methodology is model-agnostic and can be extended to other architectures and psychometric constructs.

However, the paper is limited by its focus on English-language, Western-centric data and the Big Five model. The authors note the need for cross-cultural validation, alternative personality frameworks (e.g., HEXACO), and further exploration of the impact of context length and generative vs. scoring inference modes. The ethical risks of anthropomorphization, manipulation, and misuse are also discussed, highlighting the need for transparent and regulated deployment.

Conclusion

This paper establishes a rigorous, psychometrically-grounded methodology for measuring, validating, and shaping personality traits in LLMs. The results show that large, instruction-tuned LLMs can reliably simulate and express human-like personality traits, and that these traits can be precisely controlled via prompt engineering, with effects observable in both survey responses and real-world generative tasks. The work provides a foundation for principled LLM assessment and personality alignment, with broad implications for AI safety, customization, and ethical deployment.
