Are Large Language Models Moral Hypocrites? A Study Based on Moral Foundations (2405.11100v2)
Abstract: LLMs have taken centre stage in debates on Artificial Intelligence. Yet there remains a gap in how to assess LLMs' conformity to important human values. In this paper, we investigate whether state-of-the-art LLMs, GPT-4 and Claude 2.1 (Gemini Pro and LLAMA 2 did not generate valid results), are moral hypocrites. We employ two research instruments based on Moral Foundations Theory: (i) the Moral Foundations Questionnaire (MFQ), which investigates which values are considered morally relevant in abstract moral judgements; and (ii) the Moral Foundations Vignettes (MFVs), which evaluate moral cognition in concrete scenarios related to each moral foundation. We characterise conflicts in values across these different levels of abstraction in moral evaluation as hypocrisy. We found that both models displayed reasonable consistency within each instrument relative to humans, but they displayed contradictory, hypocritical behaviour when we compared the abstract values endorsed in the MFQ to the evaluations of concrete moral violations in the MFVs.
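To make the comparison concrete, here is a minimal sketch (not the authors' code) of the kind of analysis the abstract describes: collect a model's numeric ratings on MFQ-style relevance items and MFV-style vignettes, average them per moral foundation, and flag foundations whose abstract and concrete rankings diverge. The `ask_model` callable, the item lists, and the rating scales are illustrative assumptions, not the paper's actual protocol.

```python
# Minimal sketch of an MFQ-vs-MFV hypocrisy check for an LLM.
# `ask_model`, item lists, and scales are hypothetical placeholders.
from statistics import mean

FOUNDATIONS = ["care", "fairness", "loyalty", "authority", "sanctity"]

def foundation_means(ask_model, items):
    """Average the model's numeric ratings per foundation.

    `items` is a list of (foundation, prompt) pairs; `ask_model` is a
    hypothetical callable wrapping an LLM API call that returns one
    numeric rating for one prompt."""
    scores = {f: [] for f in FOUNDATIONS}
    for foundation, prompt in items:
        scores[foundation].append(ask_model(prompt))
    return {f: mean(v) for f, v in scores.items() if v}

def ranking(scores):
    """Map each foundation to its rank (0 = most strongly endorsed)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {f: i for i, f in enumerate(ordered)}

def hypocrisy_gaps(mfq_scores, mfv_scores):
    """Rank-order disagreement between abstract (MFQ) and concrete
    (MFV) evaluations; a large absolute gap marks a foundation the
    model endorses in the abstract but judges leniently in concrete
    violations, or vice versa."""
    r_abstract, r_concrete = ranking(mfq_scores), ranking(mfv_scores)
    return {f: r_abstract[f] - r_concrete[f]
            for f in r_abstract if f in r_concrete}
```

Because the MFQ and MFVs use different response scales, comparing rank orders rather than raw means is one simple way to operationalize the abstract-vs-concrete contrast; the paper's actual statistical analysis may differ.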
- Moral Foundations of Large Language Models. In The AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI. New York, NY, USA.
- Exploring the Psychology of LLMs’ Moral and Legal Reasoning. Artificial Intelligence, 104145.
- Anthropic. 2023. Claude 2. https://www.anthropic.com/index/claude-2.
- Foundations of Morality in Iran. Evolution and Human Behavior, 41(5): 367–384.
- Morality beyond the WEIRD: How the Nomological Network of Morality Varies across Cultures. Journal of Personality and Social Psychology, 125(5): 1157–1188.
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. arXiv:2204.05862.
- Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
- On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, 610–623. New York, NY, USA: Association for Computing Machinery. ISBN 978-1-4503-8309-7.
- Sparks of Artificial General Intelligence: Early Experiments with GPT-4.
- Moral Foundations Vignettes: A Standardized Stimulus Database of Scenarios Based on Moral Foundations Theory. Behavior Research Methods, 47(4): 1178–1198.
- Hypocrisy and Moral Seriousness. American Philosophical Quarterly, 31(4): 343–349.
- Multiple Moral Foundations Predict Responses to Sacrificial Dilemmas. Personality and Individual Differences, 85: 60–65.
- Can AI Language Models Replace Human Participants? Trends in Cognitive Sciences, 27(7): 597–600.
- Dobolyi, D. 2023. Moral Foundations Theory. moralfoundations.org.
- European Union Agency for Law Enforcement Cooperation. 2023. ChatGPT: The Impact of Large Language Models on Law Enforcement. LU: Publications Office.
- Frimer, J. 2019. Moral Foundations Dictionary 2.0.
- Gabriel, I. 2020. Artificial Intelligence, Values, and Alignment. Minds and Machines, 30(3): 411–437.
- Gemini Team, Google. 2023. Gemini: A Family of Highly Capable Multimodal Models. arXiv:2312.11805.
- Moral Foundations Theory: The Pragmatic Validity of Moral Pluralism. In Devine, P.; and Plant, A., eds., Advances in Experimental Social Psychology, volume 47, 55–130. Academic Press.
- Liberals and Conservatives Rely on Different Sets of Moral Foundations. Journal of Personality and Social Psychology, 96(5): 1029–1046.
- Mapping the Moral Domain. Journal of Personality and Social Psychology, 101(2): 366.
- Policy Shaping: Integrating Human Feedback with Reinforcement Learning. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, 2625–2633. Red Hook, NY, USA: Curran Associates Inc.
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification. arXiv:2307.11031.
- LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models. arXiv:2308.11462.
- Evaluating Large Language Models in Generating Synthetic HCI Research Data: A Case Study. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, 1–19. New York, NY, USA: Association for Computing Machinery. ISBN 978-1-4503-9421-5.
- Ideology Justifies Morality: Political Beliefs Predict Moral Foundations. American Journal of Political Science, 63(4): 788–806.
- The Extended Moral Foundations Dictionary (eMFD): Development and Applications of a Crowd-Sourced Approach to Extracting Moral Intuitions from Text. Behavior Research Methods, 53(1): 232–246.
- Hutson, M. 2023. Guinea Pigbots: Doing Research with Human Subjects Is Costly and Cumbersome. Can AI Chatbots Replace Them? Science, 381(6654): 121–123.
- Hypocrisy and Moral Authority. Journal of Ethics and Social Philosophy, 12(2): 191–222.
- Testing Measurement Invariance of the Moral Foundations Questionnaire Across 27 Countries. Assessment, 27(2): 365–372.
- Understanding Libertarian Morality: The Psychological Dispositions of Self-Identified Libertarians. PLOS ONE, 7(8): e42366.
- Predicting Demographics, Moral Foundations, and Human Values from Digital Behaviours. Computers in Human Behavior, 92: 428–445.
- Kittay, E. F. 1982. On Hypocrisy. Metaphilosophy, 13(3-4): 277–289.
- Many Labs 2: Investigating Variation in Replicability Across Samples and Settings. Advances in Methods and Practices in Psychological Science, 1(4): 443–490.
- Kozlov, M. 2023. First Global Survey Reveals Who Is Doing ‘Gain of Function’ Research on Pathogens and Why. Nature, 621(7980): 668–669.
- Hypocritical Flip-Flop, or Courageous Evolution? When Leaders Change Their Moral Minds. Journal of Personality and Social Psychology, 113(5): 730–752.
- Exploring the Use of Large Language Models for Improving the Awareness of Mindfulness. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, CHI EA ’23, 1–7. New York, NY, USA: Association for Computing Machinery. ISBN 978-1-4503-9422-2.
- The Sources of Four Commonly Reported Cutoff Criteria: What Did They Really Say? Organizational Research Methods, 9(2): 202–220.
- Translation and Validation of the Moral Foundations Vignettes (MFVs) for the Portuguese Language in a Brazilian Sample. Judgment and Decision Making, 15(1): 149–158.
- Martínez, E. 2024. Re-Evaluating GPT-4’s Bar Exam Performance. Artificial Intelligence and Law.
- To Protect Science, We Must Use LLMs as Zero-Shot Translators. Nature Human Behaviour, 7(11): 1830–1832.
- Truth Machines: Synthesizing Veracity in AI Language Models. AI & Society.
- The Moral Foundations Taxonomy: Structural Validity and Relation to Political Ideology in Sweden. Personality and Individual Differences, 76: 28–32.
- OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774.
- Diminished Diversity-of-Thought in a Standard Large Language Model. arXiv:2302.07267.
- INACIA: Integrating Large Language Models in Brazilian Audit Courts: Opportunities and Challenges. arXiv:2401.05273.
- GPT Is an Effective Tool for Multilingual Psychological Text Analysis.
- Whose Opinions Do Language Models Reflect? arXiv:2303.17548.
- Savelka, J. 2023. Unlocking Practical Applications in Legal Domain: Evaluation of GPT for Zero-Shot Semantic Annotation of Legal Texts. In Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law (ICAIL 2023). Braga, Portugal: ACM.
- The Theory of Dyadic Morality: Reinventing Moral Judgment by Redefining Harm. Personality and Social Psychology Review, 22(1): 32–70.
- Towards Understanding Sycophancy in Language Models. arXiv:2310.13548.
- Simmons, G. 2022. Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity. arXiv:2209.12106.
- Large Language Models Encode Clinical Knowledge. Nature, 620(7972): 172–180.
- Intuitive Ethics and Political Orientations: Testing Moral Foundations as a Theory of Political Ideology. American Journal of Political Science, 61(2): 424–437.
- Selective Annotation Makes Language Models Better Few-Shot Learners. In The Eleventh International Conference on Learning Representations.
- Large Language Models in Medicine. Nature Medicine, 29(8): 1930–1940.
- Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv:2307.09288.
- Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning. arXiv:2301.11916.
- Emergent Abilities of Large Language Models. arXiv:2206.07682.
- Validation of the Moral Foundations Questionnaire in Turkey and Its Relation to Cultural Schemas of Individualism and Collectivism. Personality and Individual Differences, 99: 149–154.
- Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, 1–21. New York, NY, USA: Association for Computing Machinery. ISBN 978-1-4503-9421-5.
- Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. arXiv:2309.01219.
- Fine-Tuning Language Models from Human Preferences. arXiv:1909.08593.