Emergent Mind

Abstract

We interviewed twenty professional comedians who perform live shows in front of audiences and who use artificial intelligence in their artistic process as part of 3-hour workshops on "AI x Comedy" conducted at the Edinburgh Festival Fringe in August 2023 and online. The workshop consisted of a comedy writing session with LLMs, a human-computer interaction questionnaire to assess the Creativity Support Index of AI as a writing tool, and a focus group interrogating the comedians' motivations for and processes of using AI, as well as their ethical concerns about bias, censorship and copyright. Participants noted that existing moderation strategies used in safety filtering and instruction-tuned LLMs reinforced hegemonic viewpoints by erasing minority groups and their perspectives, and qualified this as a form of censorship. At the same time, most participants felt the LLMs did not succeed as a creativity support tool, producing bland and biased comedy tropes, akin to "cruise ship comedy material from the 1950s, but a bit less racist". Our work extends scholarship about the subtle difference between, on the one hand, harmful speech, and, on the other hand, "offensive" language as a practice of resistance, satire and "punching up". We also interrogate the global value alignment behind such language models, and discuss the importance of community-based value alignment and data ownership to build AI tools that better suit artists' needs.

Evaluation of LLMs for comedy writing support using a Likert scale for survey questions.

Overview

  • The paper evaluates the potential and limitations of LLMs like ChatGPT and Bard in comedy writing, based on an empirical study involving professional comedians.

  • The study utilized structured comedy writing sessions, surveys, and focus group discussions to assess the effectiveness of LLMs in generating humorous content and to capture participants' experiences and attitudes.

  • Findings indicate that while LLMs can assist with preliminary drafts and structural support, they lack the creativity, contextual understanding, and cultural sensitivity needed for high-quality comedy, highlighting the need for community-based value alignment and better data governance.

Evaluating the Use of LLMs as Creativity Support Tools for Comedy Writing

The paper "A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians" by Mirowski et al. addresses the potential and limitations of LLMs in the domain of comedy writing. The researchers conducted an empirical study with twenty professional comedians to assess the utility and drawbacks of using LLMs like ChatGPT and Bard as tools for generating humorous content. The study incorporated structured comedy-writing sessions, surveys evaluating the creativity support provided by LLMs, and focus group discussions to understand participants' perceptions of various aspects of using AI in comedy. The findings reveal significant challenges and considerations in adopting LLMs for creative applications, particularly in a culturally sensitive and context-dependent domain such as comedy.

Study Design and Methodology

The study was carefully structured to capture a broad spectrum of comedians' experiences and attitudes towards using LLMs. The participants included professional comedians who were familiar with AI tools. The study involved:

  1. Comedy Writing Sessions: Participants engaged in writing comedy sketches using LLMs. They were encouraged to generate material they would be comfortable presenting in live shows, focusing on setups, structures, and joke generation.
  2. Surveys:
  • Quantitative Evaluation: Utilizing a Likert scale and the Creativity Support Index (CSI) to measure various dimensions like helpfulness, expressiveness, immersion, and collaboration facilitated by the LLMs.
  • Qualitative Questions: Open-ended questions to capture participants' direct experiences and thoughts on AI-generated content.

  3. Focus Group Discussions: This qualitative component probed the participants' creative processes, ethical considerations regarding AI use, moderation and censorship concerns, and the perceived effectiveness of LLMs in producing high-quality comedic material.

Quantitative Findings

The quantitative results, as summarized by the Creativity Support Index, indicated mediocre performance of the LLMs in supporting comedy writing, with an average score of 54.6 out of 100. Notably:

  • Comedians who had previously used AI tools or performed AI-written material rated the LLMs more positively.
  • Users who generated content in multiple languages also reported a higher CSI.

While some participants found the AI-generated content helpful for preliminary drafts and structuring, the consensus was that the output lacked the uniqueness and expressivity required for high-quality comedy. Participants did not feel substantial ownership of or pride in the AI-generated material, indicating that current LLMs remain limited in producing engaging and authentic comedic content.
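The Creativity Support Index cited above follows a standard scoring scheme (Cherry and Latulipe's CSI instrument): six factors, each rated via two agreement statements, plus fifteen paired factor comparisons that act as per-respondent weights. As a hedged sketch of how such a score yields a value out of 100, assuming the conventional 10-point statement scale and standard formula (the paper's exact administration may differ; the example respondent below is hypothetical):

```python
# Sketch of Creativity Support Index (CSI) scoring, assuming the
# standard instrument: six factors, two agreement statements per
# factor (each rated 1-10), and 15 paired comparisons assigning
# each factor a count of 0-5 that acts as its weight.

FACTORS = ["Collaboration", "Enjoyment", "Exploration",
           "Expressiveness", "Immersion", "ResultsWorthEffort"]

def csi_score(ratings, pair_counts):
    """ratings: factor -> (stmt1, stmt2), each on a 1-10 scale.
    pair_counts: factor -> times chosen among the 15 paired comparisons.
    Returns a CSI on a 0-100 scale."""
    assert sum(pair_counts.values()) == 15, "expect 15 paired comparisons"
    total = 0
    for f in FACTORS:
        factor_score = sum(ratings[f])        # ranges 2..20
        total += factor_score * pair_counts[f]
    # Max total is 20 * 15 = 300, so divide by 3 to scale to 0-100.
    return total / 3.0

# Hypothetical middling respondent, roughly in line with the
# study's 54.6 average:
ratings = {f: (6, 5) for f in FACTORS}
counts = {"Collaboration": 1, "Enjoyment": 2, "Exploration": 3,
          "Expressiveness": 4, "Immersion": 2, "ResultsWorthEffort": 3}
print(csi_score(ratings, counts))  # 11 * 15 / 3 = 55.0
```

Because the paired-comparison counts weight each factor by how much the respondent values it, the same statement ratings can yield different CSI scores for users with different creative priorities.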

Qualitative Insights

The focus group discussions provided nuanced perspectives on the limitations and potential of LLMs in comedy writing. Several critical themes emerged:

  1. Quality and Blandness of Generated Outputs: Participants overwhelmingly characterized the AI-generated material as uninspired and generic. They noted the difficulty in steering LLMs towards producing specific or interesting content that felt fresh and original.
  2. Creative Agency and Moderation: The intrinsic moderation and censorship imposed by current LLMs were seen as significant hindrances. Comedians expressed frustration over the tools' inability to handle offensive or edgy humor, which is often a staple in the comedy domain. There was a strong preference for comedians to have control over the moderation process, rather than relying on the AI's pre-tuned safety filters.
  3. Marginalization and Representation: The default outputs of LLMs were perceived to reinforce hegemonic viewpoints, often sidelining minority perspectives. Attempts to generate content involving non-dominant identities were met with superficial adjustments, revealing the models' inability to authentically capture diverse cultural nuances.
  4. Contextual Understanding: Effective comedy relies heavily on context—personal experiences, audience demographics, and cultural settings. Participants highlighted the fundamental limitation of LLMs in lacking situational awareness and the ability to incorporate subtext, which are crucial elements in delivering humor.
  5. Ownership and Ethical Concerns: While there was less concern over ownership due to the low quality of outputs, the ethical implications of training models on copyrighted material raised significant questions. The participants debated the balance between enhancing model performance and respecting intellectual property rights.

Discussion and Implications

The study's findings underline the complexity of aligning LLMs with the specific creative needs of comedians. The current global, one-size-fits-all approach towards fine-tuning these models, often based on broad criteria like being "Harmless, Helpful, and Honest" (HHH), proves inadequate for creative tasks that demand cultural sensitivity and contextual understanding. The authors suggest a pivot towards community-based value alignment, where AI tools are fine-tuned with data and feedback from specific user groups to better reflect their norms and values.

Additionally, incorporating more explicit relational contexts into LLM moderation mechanisms could mitigate the overly cautious filtering that currently stifles creativity. There is also a call for transparency and improvements in data governance to responsibly curate training datasets and empower artists in understanding and controlling the AI tools they use.

Conclusion

The paper by Mirowski et al. offers a comprehensive examination of the role and limitations of LLMs in comedy writing. The empirical insights gathered from professional comedians reveal that while LLMs can assist with preliminary drafting and structural support, they fall short in generating genuinely humorous and contextually rich material. Achieving effective AI-assisted creative tools requires addressing issues of cultural value alignment, context incorporation, and ethical data usage, paving the way for more nuanced and adaptable AI systems tailored to the creative industries.
