Emergent Mind

SARD: A Human-AI Collaborative Story Generation

(2403.01575)
Published Mar 3, 2024 in cs.HC and cs.AI

Abstract

Generative artificial intelligence (GenAI) has ushered in a new era for storytellers, providing a powerful tool to ignite creativity and explore uncharted narrative territories. As technology continues to advance, the synergy between human creativity and AI-generated content holds the potential to redefine the landscape of storytelling. In this work, we propose SARD, a drag-and-drop visual interface for generating a multi-chapter story using LLMs. Our evaluation of the usability of SARD and its creativity support shows that while node-based visualization of the narrative may help writers build a mental model, it imposes unnecessary mental overhead on the writer and becomes a source of distraction as the story becomes more elaborate. We also found that AI generates stories that are less lexically diverse, irrespective of the complexity of the story. We identified some patterns and limitations of our tool that can guide the development of future human-AI co-writing tools.

Overview

  • The paper introduces SARD, a visual interface that combines human creativity with AI to generate multi-chapter stories, highlighting both the potential and challenges of AI in creative writing.

  • SARD offers a storyboard-based authoring tool that uses generative AI models such as GPT-4 to help writers structure narratives.

  • Evaluations of SARD reveal issues such as cognitive overload due to its node-based visualization, a reduction in lexical diversity, and a desire for more user control over the narrative generation process.

  • Future directions suggest improving user interaction with AI prompts, refining the node-based system to reduce cognitive load, and leveraging LLMs for more responsive story element generation.

Human-AI Collaborative Story Generation through SARD

Introduction

Storytelling depends on human creativity, and generative AI (GenAI) technologies such as LLMs are opening new ways to develop narratives. The paper introduces SARD, a visual drag-and-drop interface that uses LLMs to generate multi-chapter stories, and examines how well it combines human creativity with AI generation. The evaluation surfaces three challenges: the limits of node-based narrative visualization, the low lexical diversity of AI-generated content, and the cognitive load placed on writers. These findings show how difficult it is to integrate AI into creative processes and offer guidance for future human-AI co-writing tools.

Methodology and System Design

SARD is built around a storyboard-based authoring tool that communicates with generative AI models over a REST API and WebSockets. Users start a story by selecting a genre and a narrative structure, then populate it with characters, events, and relationships through a drag-and-drop interface. Prompts to GPT-4 are used both to generate descriptive content from images and to keep the narrative coherent across chapters.
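The summary does not reproduce SARD's prompts, so the following is only a minimal sketch of one way cross-chapter coherence could be handled: each chapter prompt carries short summaries of the chapters written so far. The function names, the prompt wording, and the `generate` and `summarize` callables are all hypothetical stand-ins (for example, for a call to a GPT-4 endpoint), not part of the paper.

```python
from typing import Callable, List


def build_chapter_prompt(
    genre: str,
    structure: str,
    chapter_number: int,
    events: List[str],
    previous_summaries: List[str],
) -> str:
    """Assemble a prompt for one chapter (hypothetical wording)."""
    lines = [
        f"Genre: {genre}",
        f"Narrative structure: {structure}",
        f"Write chapter {chapter_number}.",
    ]
    # Carry earlier chapters forward so the model can stay consistent.
    if previous_summaries:
        lines.append("Story so far:")
        lines.extend(f"- {summary}" for summary in previous_summaries)
    lines.append("Events to cover, in order:")
    lines.extend(f"{i}. {event}" for i, event in enumerate(events, start=1))
    return "\n".join(lines)


def generate_story(
    genre: str,
    structure: str,
    chapters: List[List[str]],
    generate: Callable[[str], str],
    summarize: Callable[[str], str],
) -> List[str]:
    """Generate chapters one at a time, feeding summaries into later prompts.

    `generate` and `summarize` are assumed to wrap whatever model is used.
    """
    texts: List[str] = []
    summaries: List[str] = []
    for number, events in enumerate(chapters, start=1):
        prompt = build_chapter_prompt(genre, structure, number, events, summaries)
        text = generate(prompt)
        texts.append(text)
        summaries.append(summarize(text))
    return texts
```

Passing the model in as a callable keeps the sketch independent of any particular API client, and makes it easy to substitute a stub when inspecting the prompts.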

Key aspects of SARD include:

  • A canvas that serves as a visual storyboard for adding and linking narrative elements.
  • Options for setting the narrative genre and structure, defining the boundaries within which AI generates content.
  • Nodes for adding characters, events, and actions, providing a structured approach to story development.
  • Event ordering functionality, ensuring logical narrative progression.
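The elements listed above can be pictured as a small graph. The sketch below is an illustration under assumptions, not SARD's actual data model: the `Storyboard` class, its method names, and the use of "happens before" constraints resolved by a topological sort are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set, Tuple


@dataclass
class Storyboard:
    """Hypothetical storyboard: character and event nodes plus links."""

    characters: Set[str] = field(default_factory=set)
    events: Dict[str, str] = field(default_factory=dict)  # id -> description
    participates: List[Tuple[str, str]] = field(default_factory=list)
    precedes: List[Tuple[str, str]] = field(default_factory=list)

    def add_character(self, name: str) -> None:
        self.characters.add(name)

    def add_event(self, event_id: str, description: str) -> None:
        self.events[event_id] = description

    def link(self, character: str, event_id: str) -> None:
        """Connect a character node to an event node."""
        if character not in self.characters or event_id not in self.events:
            raise KeyError("both nodes must exist before linking")
        self.participates.append((character, event_id))

    def order(self, earlier: str, later: str) -> None:
        """Record that one event must happen before another."""
        if earlier not in self.events or later not in self.events:
            raise KeyError("both events must exist before ordering")
        self.precedes.append((earlier, later))

    def ordered_events(self) -> List[str]:
        """Return event ids in an order that respects every constraint.

        Raises graphlib.CycleError if the constraints contradict each other.
        """
        graph: Dict[str, Set[str]] = {event_id: set() for event_id in self.events}
        for earlier, later in self.precedes:
            graph[later].add(earlier)
        return list(TopologicalSorter(graph).static_order())
```

A topological sort is one simple way to guarantee a logically consistent progression; a tool could equally store an explicit position for each event.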

Evaluation and Findings

Two studies were conducted to evaluate SARD: an examination of usability and collaboration, followed by an assessment of story quality. The results show a nuanced relationship between the tool's usability and the creative support it offers. Although sentiment about system usability was moderate, the findings reveal a gap between the intended and actual support for creative expression and collaboration with AI.

Key findings include:

  • Node-based visualization helps writers build a mental model of the narrative, but it adds mental overhead and becomes a distraction as narratives grow more complex.
  • SARD-generated stories exhibit reduced lexical diversity, irrespective of story complexity, which calls into question how much AI enhances narrative creativity.
  • User feedback suggests a desire for greater control over the narrative generation process, pointing towards necessary refinements in the interactivity and transparency of AI-driven prompts and outputs.
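The summary does not say which lexical diversity measure the study used. As an illustration only, the sketch below computes two common measures: the type-token ratio (unique words divided by total words) and a moving-average variant that is less sensitive to text length. The tokenizer and the default window size are assumptions chosen for this example.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Unique tokens divided by total tokens; 0.0 for empty text."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def moving_average_ttr(text: str, window: int = 50) -> float:
    """Average the type-token ratio over every window of `window` tokens.

    Falls back to the plain ratio when the text is shorter than the window.
    """
    tokens = tokenize(text)
    if len(tokens) <= window:
        return type_token_ratio(text)
    ratios = [
        len(set(tokens[i : i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

For example, `type_token_ratio("the cat saw the cat")` gives 0.6, since three distinct words appear among five tokens. Comparing such scores between AI-generated and human-written stories of similar length is one way a difference in lexical diversity could be quantified.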

Implications and Future Directions

The study of SARD's capabilities and limitations offers insight into AI-assisted storytelling. It opens several avenues for future investigation: enhancing user control and contribution in AI-collaborative processes, refining the cognitive ergonomics of authoring interfaces, and addressing qualitative aspects of AI-generated content, specifically creativity, diversity, and alignment with user intentions.

Anticipated future work includes:

  • Enabling direct user interaction with AI prompts, facilitating a more intimate co-creative process.
  • Refinement of the node-based system to reduce cognitive load and improve narrative visualization.
  • Leveraging LLMs for more versatile and responsive story element generation, potentially through natural text input describing story plots.

Conclusion

The study of SARD examines the feasibility and challenges of integrating AI into creative writing. Through its design and two evaluations, the research clarifies current limitations and points toward tools in which AI augments human creativity in a way that matches the user's cognitive and creative expectations.
