Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling (2402.17019v4)

Published 26 Feb 2024 in cs.CL and cs.HC

Abstract: Making legal knowledge accessible to non-experts is crucial for enhancing general legal literacy and encouraging civic participation in democracy. However, legal documents are often challenging to understand for people without legal backgrounds. In this paper, we present a novel application of LLMs in legal education to help non-experts learn intricate legal concepts through storytelling, an effective pedagogical tool in conveying complex and abstract concepts. We also introduce a new dataset LegalStories, which consists of 294 complex legal doctrines, each accompanied by a story and a set of multiple-choice questions generated by LLMs. To construct the dataset, we experiment with various LLMs to generate legal stories explaining these concepts. Furthermore, we use an expert-in-the-loop approach to iteratively design multiple-choice questions. Then, we evaluate the effectiveness of storytelling with LLMs through randomized controlled trials (RCTs) with legal novices on 10 samples from the dataset. We find that LLM-generated stories enhance comprehension of legal concepts and interest in law among non-native speakers compared to only definitions. Moreover, stories consistently help participants relate legal concepts to their lives. Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment. Our work has strong implications for using LLMs in promoting teaching and learning in the legal field and beyond.


Summary

  • The paper presents an expert-in-the-loop pipeline integrating LLMs and human expertise to generate engaging legal stories and evaluative questions.
  • It shows that LLM-generated narratives increase comprehension, retention, and engagement, especially for non-native speakers, compared to traditional legal definitions.
  • The study’s RCTs and error analyses highlight GPT-4 as superior and emphasize the need for expert feedback in refining AI-assisted educational content.

Introduction

The paper "Leveraging LLMs for Learning Complex Legal Concepts through Storytelling" explores the application of LLMs in legal education, focusing specifically on storytelling as a medium to explain complex legal concepts. Storytelling has long been an effective pedagogical tool, making abstract concepts more relatable and understandable. This paper evaluates how non-experts can benefit from LLM-generated stories to enhance their comprehension and interest in intricate legal doctrines.

Expert-in-the-loop Pipeline

The research introduces an expert-in-the-loop pipeline for generating and refining legal educational content. This pipeline integrates LLMs with human expertise to generate stories and multiple-choice questions based on legal definitions sourced from Wikipedia. As depicted in Figure 1, the system follows three main stages: story generation, question generation, and expert critique.

Figure 1: Illustration of the expert-in-the-loop pipeline. The left section demonstrates the procedure to produce an LLM-generated story from the concept.
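
The pipeline can be read as a generate-then-review loop. The sketch below is a hypothetical Python rendering of that loop; the data structure, callable names, and review logic are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the expert-in-the-loop pipeline; the data structure,
# callables, and review loop are illustrative assumptions, not the authors'
# actual implementation.
from dataclasses import dataclass, field


@dataclass
class LegalStoryItem:
    doctrine: str                  # name of the legal doctrine
    definition: str                # definition sourced from Wikipedia
    story: str = ""                # LLM-generated explanatory story
    questions: list = field(default_factory=list)  # multiple-choice questions


def run_pipeline(doctrine, definition, generate_story, generate_questions,
                 expert_review, max_rounds=3):
    """Generate a story and questions, then iterate on expert critique."""
    item = LegalStoryItem(doctrine=doctrine, definition=definition)
    item.story = generate_story(doctrine, definition)
    item.questions = generate_questions(doctrine, item.story)
    for _ in range(max_rounds):
        feedback = expert_review(item)   # human expert critiques the output
        if feedback is None:             # no issues found: accept the item
            return item
        # Regenerate the questions, conditioning on the expert's feedback
        # (assumes a generator that accepts a feedback argument).
        item.questions = generate_questions(doctrine, item.story,
                                            feedback=feedback)
    return item
```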

Story Generation

The LLMs demonstrated the ability to produce engaging narratives that explain legal doctrines effectively within a constrained word count. The generated stories are evaluated for ease of comprehension, relevance, and factual accuracy.
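
As a concrete illustration, a story-generation step might look like the following sketch using the OpenAI Python SDK; the prompt wording, the 300-word limit, and the model name are assumptions for illustration, not the paper's exact configuration.

```python
# Hypothetical story-generation call; the prompt wording, word limit, and
# model name are assumptions for illustration, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_story(doctrine: str, definition: str) -> str:
    """Ask an LLM to explain a legal concept through a short story."""
    prompt = (
        f"Legal concept: {doctrine}\n"
        f"Definition: {definition}\n\n"
        "Write a short story (under 300 words) that illustrates this legal "
        "concept for a reader with no legal background. The story should be "
        "easy to follow, relevant to everyday life, and factually accurate."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```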

Question Generation

The paper leverages pedagogical research in cognitive learning to design three types of questions, each targeting a different cognitive level: concept (understanding), prediction (application), and limitation (evaluation). LLM-generated questions underwent expert review to ensure their reliability and validity in testing comprehension of legal concepts.
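
A question-generation step covering the three question types might look like the sketch below; the prompt phrasing, the JSON output schema, and the mapping of each type to a learning goal are illustrative assumptions rather than the paper's exact design.

```python
# Hypothetical question-generation step for the three question types; the
# prompt phrasing and JSON schema are illustrative assumptions, not the
# paper's exact design.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed mapping of question type to pedagogical goal.
QUESTION_TYPES = {
    "concept": "tests understanding of what the concept means",
    "prediction": "asks the reader to apply the concept to a new scenario",
    "limitation": "asks the reader to evaluate where the concept does not apply",
}

def generate_questions(doctrine: str, story: str) -> list[dict]:
    """Generate one multiple-choice question per question type."""
    questions = []
    for goal in QUESTION_TYPES.values():
        prompt = (
            f"Story about '{doctrine}':\n{story}\n\n"
            f"Write one multiple-choice question that {goal}. "
            "Return only JSON with keys 'question', 'options' "
            "(a list of 4 strings), and 'answer' (index of the correct option)."
        )
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        # A production system would validate the JSON before trusting it.
        questions.append(json.loads(response.choices[0].message.content))
    return questions
```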

Evaluation and Results

A two-fold evaluation assessed story quality and question integrity. Human evaluations and linguistic complexity metrics highlighted the benefits and limitations of the different LLMs, with GPT-4 emerging as the most proficient.
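
To give a sense of what a linguistic complexity measurement looks like in practice, the sketch below scores two hypothetical texts with the textstat package; the paper's exact metrics may differ, so treat this as one plausible instantiation with made-up example texts.

```python
# Illustrative readability scoring with the textstat package; the paper's
# exact linguistic complexity metrics may differ from these formulas, and
# the example texts are fabricated.
import textstat  # pip install textstat

definition = ("Res judicata bars the parties from relitigating a claim "
              "that has already been finally decided by a court.")
story = ("Maria sued her neighbor over a broken fence and lost. A year "
         "later she tried to bring the same case again, but the judge "
         "dismissed it: the dispute had already been decided.")

for label, text in [("definition", definition), ("story", story)]:
    print(f"{label}: Flesch reading ease = "
          f"{textstat.flesch_reading_ease(text):.1f}, "
          f"Flesch-Kincaid grade = {textstat.flesch_kincaid_grade(text):.1f}")
```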

Human Evaluation of Stories

Evaluation results showed that LLM-generated stories were more readable and coherent than raw legal definitions, enhancing participants' engagement and understanding (Figure 2).

Figure 2: Distribution of questions with or without issues generated by LLaMA 2, GPT-3.5, and GPT-4.

Error Analysis

While generating questions, the LLMs occasionally produced flawed or confusing items, underscoring the need for expert feedback to continuously refine educational outputs (Figure 3).

Figure 3: Distribution of different issues among the questions generated by LLaMA 2, GPT-3.5, and GPT-4.

Randomized Controlled Trials

The paper conducts RCTs to validate the effectiveness of storytelling for legal concept comprehension among native and non-native English speakers. The study design compares a control group (definition only) with a treatment group (definition plus story), assessing comprehension, relevance to participants' lives, interest, and retention.
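
A minimal sketch of how the control-versus-treatment comparison could be analyzed is shown below; the scores are fabricated and the choice of a two-sample t-test is an assumption, since the paper's statistical procedure is not detailed here.

```python
# Minimal sketch of comparing comprehension scores between the control
# (definition-only) and treatment (definition + story) groups; the scores
# are fabricated and the two-sample t-test is an assumed analysis choice.
from scipy import stats

control_scores = [0.4, 0.6, 0.5, 0.7, 0.5, 0.6]    # hypothetical accuracies
treatment_scores = [0.7, 0.8, 0.6, 0.9, 0.8, 0.7]  # hypothetical accuracies

t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```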

Findings

Non-native speakers showed improved comprehension and retention with LLM-generated stories, establishing storytelling as a potent tool for enhancing legal education. Results indicated that stories help people relate concepts to personal experiences, which enriches learning beyond traditional definitions.

Implications and Future Directions

The successful application of LLMs in legal storytelling opens pathways for broader integration in educational contexts. Future developments may include refining these tools to address specific educational requirements, moving towards more personalized and adaptive learning models.

Conclusion

The paper presents a promising methodology for leveraging LLMs to improve comprehension of complex legal concepts through storytelling, particularly highlighting how non-native speakers benefit from this approach. This sets a precedent for future work in applying AI and LLMs to educational domains, advocating for their continued development and integration in learning processes.
