Designing and Evaluating Dialogue LLMs for Co-Creative Improvised Theatre (2405.07111v1)

Published 11 May 2024 in cs.CL

Abstract: Social robotics researchers are increasingly interested in multi-party trained conversational agents. With a growing demand for real-world evaluations, our study presents LLMs deployed in a month-long live show at the Edinburgh Festival Fringe. This case study investigates human improvisers co-creating with conversational agents in a professional theatre setting. We explore the technical capabilities and constraints of on-the-spot multi-party dialogue, providing comprehensive insights from both audience and performer experiences with AI on stage. Our human-in-the-loop methodology underlines the challenges of these LLMs in generating context-relevant responses, stressing the user interface's crucial role. Audience feedback indicates an evolving interest for AI-driven live entertainment, direct human-AI interaction, and a diverse range of expectations about AI's conversational competence and utility as a creativity support tool. Human performers express immense enthusiasm, varied satisfaction, and the evolving public opinion highlights mixed emotions about AI's role in arts.

Summary

  • The paper introduces a novel approach using LLMs as interactive partners in live improvised theatre.
  • It documents the deployment of ChatGPT-3.5, PaLM 2, and Llama 2 in diverse performance formats with real-time human curation.
  • Survey feedback from audiences and performers highlights both the creative potential and mechanical limitations of AI dialogue.

Introduction

AI isn't just for smart speakers or chess engines anymore; it has been making inroads into more creative, social, and interactive areas. One fascinating example is this paper, which puts LLM-driven conversational agents into interactive improvised theatre. Imagine watching a live improv show where one performer's lines are written by an AI! The paper details the deployment of these conversational agents during a month-long series of live performances at the Edinburgh Festival Fringe.

The Experiment: AI in Live Theatre

Setting the Stage

Improvised theatre is a dynamic and unpredictable environment, making it an excellent playground for experimenting with AI co-creativity. In these Fringe performances, teams of professional human improvisers shared the stage with conversational agents powered by three different LLMs: ChatGPT-3.5 (OpenAI), PaLM 2 (Google), and Llama 2 (Meta). The AI's lines were delivered by a human actor referred to as the "Cyborg," who received them via an earpiece and acted them out on stage.

Challenges

The complexity of live, multi-party dialogue presented several hurdles:

  1. Speech Recognition: Multiple microphones were needed to identify different speakers on stage.
  2. Physical Context: AI needed to understand not just words but gestures, tone, and other physical cues.
  3. Timely Responses: The AI's responses had to be appropriately timed, so the system relied on continuous speech recognition, supplemented with contextual metadata typed live by an operator.
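
To make the third point concrete, here is a minimal sketch of how speaker-labelled transcripts and operator-typed notes might be merged into a single prompt. It is an illustration only, not the paper's implementation: the `Turn` class, the `build_prompt` function, the `[Scene note]` prefix, and the character names are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One recognized utterance; `speaker` is assumed to come from the microphone used."""
    time_s: float
    speaker: str
    text: str


def build_prompt(turns, operator_notes, ai_character, max_turns=6):
    """Merge per-microphone transcripts and operator notes into one prompt string."""
    # Sort by timestamp so lines from separate microphones interleave correctly,
    # then keep only the most recent turns to bound the prompt length.
    recent = sorted(turns, key=lambda t: t.time_s)[-max_turns:]
    lines = [f"[Scene note] {note}" for note in operator_notes]
    lines += [f"{t.speaker}: {t.text}" for t in recent]
    # End with the AI character's name so a model would continue as that character.
    lines.append(f"{ai_character}:")
    return "\n".join(lines)


# Hypothetical scene data, deliberately out of chronological order.
turns = [
    Turn(2.1, "Alex", "So, what do you do for a living?"),
    Turn(0.4, "Sam", "Welcome to speed dating!"),
]
prompt = build_prompt(turns, ["Alex is miming pouring wine."], "Robin")
print(prompt)
```

The printed prompt lists the scene note first, then Sam's and Alex's lines in time order, and ends with `Robin:` as the cue for the next line. Operator notes stand in for the physical context that speech recognition alone cannot capture.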

A human-in-the-loop system allowed a curator to select the best response from the AI's generated lines during performances, ensuring the output was contextually relevant.
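
The curation step can be sketched roughly as follows. This is a hedged illustration under stated assumptions rather than the system used in the shows: the `curate` function, the `choose` callback (standing in for the curator's interface), and the `max_words` limit are hypothetical names and parameters introduced here.

```python
def curate(candidates, choose, max_words=25):
    """Return the candidate line the curator picks, or None if nothing is chosen."""
    seen = set()
    options = []
    for line in candidates:
        cleaned = " ".join(line.split())  # collapse stray whitespace
        key = cleaned.lower()
        # Skip empty lines, case-insensitive duplicates, and overly long lines
        # so the curator only sees a short list of usable options.
        if not cleaned or key in seen or len(cleaned.split()) > max_words:
            continue
        seen.add(key)
        options.append(cleaned)
    if not options:
        return None
    # Hypothetical callback: the curator's UI returns an index into `options`,
    # or None to reject every candidate.
    index = choose(options)
    if index is None or not 0 <= index < len(options):
        return None
    return options[index]


# Example with a stand-in curator who always picks the first option.
picked = curate(["I love  pigeons.", "i love pigeons.", ""], lambda opts: 0)
print(picked)
```

Returning `None` when the curator rejects everything leaves room for the human performers to carry the scene until a usable line appears.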

Putting AI to the Test: Formats

To explore how these AI systems could cope in the intense setting of live improvised theatre, various games were designed:

  1. Speed Dating: AI had to perform rapid-fire dialogues with different characters.
  2. Wedding Speech: AI helped generate coherent, humorous speeches incorporating both scripted and audience inputs.
  3. Couples' Therapy and Meet the Parents: AI had to juggle conversations involving multiple interaction dynamics.
  4. Hero's Journey: A complex narrative where AI had to participate in an evolving long-form story.

Audience and Performer Surveys

Surveys were conducted to evaluate the audience's perception of AI in live performance and to gauge the performers' experiences.

Audience Feedback

Audience responses revealed a mixed bag of fascination and skepticism:

  • People were generally curious about AI's role and capabilities.
  • There was excitement about AI's potential in creative fields, but less optimism about its storytelling abilities.
  • AI's responses were viewed as somewhat machine-like and often required human improvisers to work around its limitations.

Performer Feedback

Performers noted various challenges and enjoyments:

  • AI often provided non-sequiturs, adding a layer of unpredictability that improvisers had to creatively integrate.
  • Some performers found AI responses too mechanical, missing the nuanced understanding a human partner would bring.
  • Yet, AI often spurred unexpected and humorous outcomes, making scenes more dynamic.

Practical and Theoretical Implications

Practical Implications

From a practical standpoint, this experiment highlights several potential areas for enhancing human-AI collaboration in real-time creative settings:

  • Enhancing Context Understanding: Improved speech recognition and context-setting mechanisms could make AI interactions more fluid.
  • Refined Curatorial Tools: Developing better UI tools for real-time curation could allow faster, more intuitive scene management.

Theoretical Implications

The research also provides insights into AI's evolving role in social and creative contexts:

  • Human-Centered AI: The study highlights the importance of human-in-the-loop systems for guiding AI, which made the performances more enjoyable and coherent.
  • Public Perception: The study suggests that live exposure to AI can demystify its capabilities and limitations, contributing to a more informed public discourse around AI technologies.

Future Developments

Enhanced Multi-Party Dialogue

Future iterations could focus on:

  • Advanced Turn-Taking Algorithms: Improving the AI’s ability to manage and participate effectively in multi-party conversations.
  • Physically Interactive Systems: Incorporating non-verbal cues like gestures and facial expressions to make AI interactions more lifelike.
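
As a point of comparison for what "advanced" turn-taking would need to improve on, a very simple baseline rule might look like the sketch below. It is not taken from the paper; the `should_ai_speak` function, its parameters, and the 1.5-second default are assumptions chosen purely for illustration.

```python
def should_ai_speak(last_speaker, last_text, silence_s, ai_name, min_gap_s=1.5):
    """Baseline rule for whether the AI character should take the next turn."""
    if last_speaker == ai_name:
        return False  # avoid back-to-back lines from the AI character
    addressed = ai_name.lower() in last_text.lower()
    # Take the turn right away when addressed by name; otherwise wait for a pause.
    return addressed or silence_s >= min_gap_s
```

A rule like this ignores gaze, gesture, and overlapping speech, which is exactly where the multi-party setting described above becomes difficult.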

Application Beyond Theatre

These findings have broader implications than just theatre:

  • Social Robotics: Use cases in social robots where AI can engage in authentic, multi-party dialogues.
  • Education and Training: AI-driven participation in creative learning environments to assist with social and communication skills.

Conclusion

By thrusting AI into the limelight of live theatre, this paper sheds light on both the capabilities and limitations of conversational LLMs in complex, real-world settings. It opens up avenues for future research and development and emphasizes the importance of human-AI collaboration. Whether for entertainment or more serious applications, a role for AI in our social and creative lives looks both feasible and increasingly worth exploring.