Designing and Evaluating Dialogue LLMs for Co-Creative Improvised Theatre (2405.07111v1)
Abstract: Social robotics researchers are increasingly interested in multi-party trained conversational agents. With a growing demand for real-world evaluations, our study presents LLMs deployed in a month-long live show at the Edinburgh Festival Fringe. This case study investigates human improvisers co-creating with conversational agents in a professional theatre setting. We explore the technical capabilities and constraints of on-the-spot multi-party dialogue, providing comprehensive insights from both audience and performer experiences with AI on stage. Our human-in-the-loop methodology underlines the challenges these LLMs face in generating context-relevant responses and stresses the crucial role of the user interface. Audience feedback indicates an evolving interest in AI-driven live entertainment and direct human-AI interaction, along with a diverse range of expectations about AI's conversational competence and utility as a creativity support tool. Human performers express immense enthusiasm and varied satisfaction, while evolving public opinion highlights mixed emotions about AI's role in the arts.