
EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation (2310.08185v1)

Published 12 Oct 2023 in cs.CL and cs.AI

Abstract: Plan-and-Write is a common hierarchical approach in long-form narrative text generation, which first creates a plan to guide the narrative writing. Following this approach, several studies rely on simply prompting LLMs for planning, which often yields suboptimal results. In this paper, we propose a new framework called Evaluation-guided Iterative Plan Extraction for long-form narrative text generation (EIPE-text), which extracts plans from the corpus of narratives and utilizes the extracted plans to construct a better planner. EIPE-text has three stages: plan extraction, learning, and inference. In the plan extraction stage, it iteratively extracts and improves plans from the narrative corpus and constructs a plan corpus. We propose a question answer (QA) based evaluation mechanism to automatically evaluate the plans and generate detailed plan refinement instructions to guide the iterative improvement. In the learning stage, we build a better planner by fine-tuning with the plan corpus or in-context learning with examples in the plan corpus. Finally, we leverage a hierarchical approach to generate long-form narratives. We evaluate the effectiveness of EIPE-text in the domains of novels and storytelling. Both GPT-4-based evaluations and human evaluations demonstrate that our method can generate more coherent and relevant long-form narratives. Our code will be released in the future.


Summary

  • The paper introduces a novel evaluation-guided iterative plan extraction framework that refines narrative plans through QA-based feedback.
  • The methodology combines tree-structured plan sketching with reinforcement-like iterative refinement, yielding coherent and domain-specific narratives.
  • Experimental results show improved narrative coherence by 5.0% and relevance by 13.3%, outperforming state-of-the-art methods.

Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

Introduction

The paper "EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation" (2310.08185) presents a framework for generating coherent and relevant long-form narratives. EIPE-text addresses a weakness of existing hierarchical (plan-and-write) approaches: they rely on simply prompting LLMs for planning, which often yields suboptimal results without domain-specific adaptation. The proposed system improves planning by automatically extracting high-quality plans from narrative corpora and using them to train or prompt a better planner (Figure 1).

Figure 1: A Comprehensive Visual Overview of the EIPE-text Framework. The Plan Extraction stage initiates with Plan Sketching, followed by QA-based Evaluation, refining, and constructing a plan corpus for learning.

Methodology

Plan Extraction

EIPE-text's methodology comprises three distinct stages: plan extraction, learning, and inference. Plan extraction begins with a sketching step in which an LLM drafts tree-structured plans from the narrative corpus. QA pairs are then generated from the source narrative, and a QA-based evaluation mechanism checks how well each plan answers them, assessing the plan's quality and alignment with the source; failed questions yield detailed refinement instructions that guide iterative improvement (Figure 2).

Figure 2: An Example of the Plan Refinement Process.

This refinement step is crucial: it corrects the errors surfaced by the evaluation and ensures the plan preserves the narrative's thematic integrity. The process repeats until the plan meets a predefined evaluation standard, forming a self-improving loop that resembles reinforcement learning and steadily raises plan quality.
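The extract-evaluate-refine loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the LLM calls are stubbed out, and all helper names (`sketch_plan`, `generate_qa_pairs`, `answer_from_plan`, `refine`) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    """One node of a tree-structured plan."""
    summary: str
    children: list = field(default_factory=list)

def sketch_plan(narrative: str) -> PlanNode:
    # Stub: a real system would prompt an LLM to draft a tree-structured plan.
    return PlanNode(summary=narrative[:40])

def generate_qa_pairs(narrative: str) -> list:
    # Stub: QA pairs probing whether the plan captures the narrative's content.
    return [("What is the story about?", narrative[:40])]

def answer_from_plan(plan: PlanNode, question: str) -> str:
    # Stub: answer each question using only the plan, not the full narrative.
    return plan.summary

def refine(plan: PlanNode, failures: list) -> PlanNode:
    # Stub: a real system would turn failed QA pairs into refinement
    # instructions and prompt an LLM to revise the plan accordingly.
    return plan

def extract_plan(narrative: str, threshold: float = 0.9,
                 max_iters: int = 5) -> PlanNode:
    """Iteratively refine a plan until its QA accuracy clears the threshold."""
    plan = sketch_plan(narrative)
    qa_pairs = generate_qa_pairs(narrative)
    for _ in range(max_iters):
        failures = [(q, a) for q, a in qa_pairs
                    if answer_from_plan(plan, q) != a]
        accuracy = 1 - len(failures) / len(qa_pairs)
        if accuracy >= threshold:
            break
        plan = refine(plan, failures)
    return plan
```

The loop's termination criterion (QA accuracy over a threshold, capped by an iteration budget) mirrors the self-improving behavior the paper describes, with the accuracy curve of Figure 3 tracking exactly this quantity.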

Learning

In the learning stage, EIPE-text builds the planner in one of two ways: fine-tuning an LLM on the plan corpus, or in-context learning with demonstration examples drawn from it. Fine-tuning yields a planner that generalizes across domains, while in-context learning allows rapid adaptation to a specific style from a small set of selected demonstrations, supporting domain-specific, high-quality plan generation.
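The in-context-learning path can be illustrated by assembling a few-shot prompt from the plan corpus. This is a hedged sketch: the corpus record format and the word-overlap selection heuristic are assumptions for illustration (a real system might use embedding similarity), not the paper's method.

```python
def select_demos(plan_corpus, topic, k=2):
    """Pick the k corpus entries whose topics best match the query topic."""
    def overlap(entry):
        # Naive relevance score: number of shared lowercase words.
        return len(set(topic.lower().split()) &
                   set(entry["topic"].lower().split()))
    return sorted(plan_corpus, key=overlap, reverse=True)[:k]

def build_planner_prompt(plan_corpus, topic, k=2):
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    demos = select_demos(plan_corpus, topic, k)
    parts = ["Write a tree-structured plan for the given topic.\n"]
    for d in demos:
        parts.append(f"Topic: {d['topic']}\nPlan:\n{d['plan']}\n")
    parts.append(f"Topic: {topic}\nPlan:\n")
    return "\n".join(parts)
```

The resulting prompt ends at `Plan:`, leaving the model to complete the plan for the new topic in the style of the retrieved demonstrations.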

Inference

The inference stage consists of two principal activities: plan generation and narrative generation. The planner first generates a plan from an input topic; the narrative is then written section by section following the plan's structure. This hierarchical approach keeps the narrative logically organized and detailed, improving coherence and relevance (Figure 3).

Figure 3: Average accuracy curve of iterative refinement process.
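The two-step inference stage can be sketched as a plan generation call followed by a depth-first expansion of the plan tree. Again this is illustrative only: `call_llm` stands in for a real model call, and the dict-based plan format is an assumption.

```python
def call_llm(prompt: str) -> str:
    # Stub LLM; a real system would call GPT-4 or a fine-tuned planner here.
    return f"[text for: {prompt}]"

def expand(node: dict) -> str:
    """Depth-first traversal: leaves become prose, internal nodes
    concatenate their children's text, preserving the plan's structure."""
    if not node.get("children"):
        return call_llm(node["summary"])
    return "\n\n".join(expand(child) for child in node["children"])

def generate_narrative(topic: str, planner) -> str:
    plan = planner(topic)  # stage 1: plan generation from the input topic
    return expand(plan)    # stage 2: hierarchical narrative generation
```

Because each leaf is expanded under its parent's context in plan order, the generated text inherits the plan's logical organization, which is the mechanism the paper credits for improved coherence.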

Experimental Results

The authors tested EIPE-text in the domains of novels and storytelling, demonstrating notable improvements over state-of-the-art baselines such as RecurrentGPT. Evaluations by both GPT-4 and human judges found the narratives generated by EIPE-text more coherent and relevant; in the human evaluation, coherence improved by 5.0% and relevance by 13.3%.

Implications

The implications of EIPE-text extend beyond practical applications in narrative generation to theoretical contributions in improving LLM planning capabilities. By integrating QA-based evaluations with iterative refinement, the framework offers a robust method for enhancing narrative coherence in diverse domains, including scriptwriting and journalism.

The paper suggests that EIPE-text could serve as a cornerstone for future AI systems that require narrative generation with high domain specificity, potentially leading to more creative and expressive writing tasks.

Limitations and Future Research

The effectiveness of EIPE-text largely depends on the reasoning capabilities of the underlying LLMs, thus limiting its application to models like GPT-4 and Claude that possess strong reasoning faculties. Additionally, the framework's reliance on domain-specific data implies challenges in out-of-domain generalization.

Future research may explore expansions of EIPE-text across diverse narrative domains and investigate methods for boosting its out-of-domain performance and scalability.

Conclusion

EIPE-text successfully addresses the inherent challenges in long-form narrative text generation, offering a sophisticated framework for plan extraction, learning, and inference. Its rigorous evaluation mechanism and iterative refinement process significantly enhance planning and narrative outcomes, paving the way for further exploration and advancements in AI-driven storytelling and text generation.
