On the Decision-Making Abilities in Role-Playing using Large Language Models (2402.18807v1)

Published 29 Feb 2024 in cs.CL and cs.AI

Abstract: LLMs are now increasingly utilized for role-playing tasks, especially in impersonating domain-specific experts, primarily through role-playing prompts. When interacting in real-world scenarios, the decision-making abilities of a role significantly shape its behavioral patterns. In this paper, we concentrate on evaluating the decision-making abilities of LLMs post role-playing, thereby validating the efficacy of role-playing. Our goal is to provide metrics and guidance for enhancing the decision-making abilities of LLMs in role-playing tasks. Specifically, we first use LLMs to generate virtual role descriptions corresponding to the 16 personality types of the Myers-Briggs Type Indicator (abbreviated as MBTI), representing a segmentation of the population. Then we design specific quantitative operations to evaluate the decision-making abilities of LLMs post role-playing from four aspects: adaptability, exploration-exploitation trade-off ability, reasoning ability, and safety. Finally, we analyze the association between the performance of decision-making and the corresponding MBTI types through GPT-4. Extensive experiments demonstrate stable differences in the four aspects of decision-making abilities across distinct roles, signifying a robust correlation between decision-making abilities and the roles emulated by LLMs. These results underscore that LLMs can effectively impersonate varied roles while embodying their genuine sociological characteristics.
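
The abstract describes the evaluation pipeline only at a high level. As a rough illustration of how one of the four aspects (the exploration-exploitation trade-off) might be probed for each MBTI role, the Python sketch below builds a role-playing system prompt per personality type and scores the role-played model on a two-armed Bernoulli bandit. This is a minimal sketch under stated assumptions: the prompt template, the `query_llm` placeholder, the bandit parameters, and the average-reward score are illustrative choices, not the paper's actual prompts, tasks, or metrics.

```python
import random

def query_llm(system_prompt: str, user_prompt: str) -> str:
    # Placeholder: swap in a real LLM call via your own client wrapper.
    # The random stub keeps the sketch runnable without external services.
    return random.choice(["A", "B"])

# The 16 MBTI personality types used to instantiate virtual roles.
MBTI_TYPES = [
    "ISTJ", "ISFJ", "INFJ", "INTJ", "ISTP", "ISFP", "INFP", "INTP",
    "ESTP", "ESFP", "ENFP", "ENTP", "ESTJ", "ESFJ", "ENFJ", "ENTJ",
]

def role_prompt(mbti: str) -> str:
    """Build a simple role-playing system prompt for one MBTI type
    (an assumed template, not the paper's)."""
    return (
        f"You are a person whose MBTI personality type is {mbti}. "
        "Stay in character and answer every question as this person would."
    )

def bandit_trial(mbti: str, arm_probs=(0.3, 0.7), rounds: int = 20) -> float:
    """Probe exploration-exploitation behaviour of a role-played model on a
    two-armed Bernoulli bandit; return the average reward over all rounds."""
    history, total_reward = [], 0
    for t in range(rounds):
        user = (
            f"Round {t + 1}. You may pull slot machine A or B. "
            f"Your past pulls and rewards: {history}. "
            "Reply with exactly one letter: A or B."
        )
        answer = query_llm(role_prompt(mbti), user).strip().upper()
        arm = 0 if answer.startswith("A") else 1
        reward = 1 if random.random() < arm_probs[arm] else 0
        history.append(("AB"[arm], reward))
        total_reward += reward
    return total_reward / rounds

if __name__ == "__main__":
    # Compare average reward across roles; higher values suggest the role
    # converges on the better arm rather than exploring indefinitely.
    for mbti in MBTI_TYPES:
        print(mbti, round(bandit_trial(mbti), 3))
```

Differences in average reward across the 16 role prompts would be one crude proxy for the role-dependent decision-making differences the paper reports; the other aspects (adaptability, reasoning, safety) would need separate probes.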
