Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games (2402.13399v2)
Abstract: A universal feature of human societies is the adoption of systems of rules and norms in the service of cooperative ends. How can we build learning agents that do the same, so that they may flexibly cooperate with the human institutions they are embedded in? We hypothesize that agents can achieve this by assuming there exists a shared set of norms that most others comply with while pursuing their individual desires, even if they do not know the exact content of those norms. By assuming shared norms, a newly introduced agent can infer the norms of an existing population from observations of compliance and violation. Furthermore, groups of agents can converge to a shared set of norms, even if they initially diverge in their beliefs about what the norms are. This in turn enables the stability of the normative system: because agents can bootstrap common knowledge of the norms, the norms come to be widely adhered to, which lets new entrants learn them rapidly. We formalize this framework in the context of Markov games and demonstrate its operation in a multi-agent environment via approximately Bayesian rule induction of obligative and prohibitive norms. Using our approach, agents are able to rapidly learn and sustain a variety of cooperative institutions, including resource management norms and compensation for pro-social labor, promoting collective welfare while still allowing agents to act in their own interests.
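The idea of inferring norms from observed compliance and violation can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes each candidate prohibitive norm is an independent yes/no hypothesis updated by Bayes' rule, and the function names, action labels, and the `base_rate` and `noise` parameters are all hypothetical choices made for illustration.

```python
def update_norm_belief(prior, violated, base_rate=0.3, noise=0.05):
    """One Bayesian update of P(norm is in force) from a single observed step.

    violated:  True if the observed agent did what the candidate norm forbids.
    base_rate: assumed chance of that behavior when no norm forbids it.
    noise:     assumed chance of that behavior when the norm is in force.
    """
    like_norm = noise if violated else 1.0 - noise
    like_no_norm = base_rate if violated else 1.0 - base_rate
    evidence = prior * like_norm + (1.0 - prior) * like_no_norm
    return prior * like_norm / evidence


def infer_norms(observed_actions, candidate_prohibitions, prior=0.1):
    """Track an independent belief for each candidate prohibited action."""
    beliefs = {c: prior for c in candidate_prohibitions}
    for action in observed_actions:
        for c in candidate_prohibitions:
            beliefs[c] = update_norm_belief(beliefs[c], violated=(action == c))
    return beliefs


# Hypothetical population: agents fish often, rest occasionally, never hunt.
observed = ["fish"] * 18 + ["rest"] * 2
beliefs = infer_norms(observed, ["hunt", "fish", "rest"])
for norm, p in beliefs.items():
    print(f"P(prohibit {norm}) = {p:.3f}")
```

Under these assumptions, an action that is never observed ("hunt") accumulates evidence for a prohibition, while a frequently observed action ("fish") is quickly ruled out as prohibited. A rarely observed action ("rest") stays uncertain, which reflects the point that occasional violations do not by themselves rule out a norm.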