DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services

Published 20 Sep 2023 in cs.CL | (2309.11325v2)

Abstract: We propose DISC-LawLLM, an intelligent legal system utilizing LLMs to provide a wide range of legal services. We adopt legal syllogism prompting strategies to construct supervised fine-tuning datasets in the Chinese Judicial domain and fine-tune LLMs with legal reasoning capability. We augment LLMs with a retrieval module to enhance models' ability to access and utilize external legal knowledge. A comprehensive legal benchmark, DISC-Law-Eval, is presented to evaluate intelligent legal systems from both objective and subjective dimensions. Quantitative and qualitative results on DISC-Law-Eval demonstrate the effectiveness of our system in serving various users across diverse legal scenarios. The detailed resources are available at https://github.com/FudanDISC/DISC-LawLLM.

Summary

The paper introduces DISC-LawLLM, a fine-tuned LLM that combines legal syllogism prompting with retrieval augmentation to improve legal reasoning in the Chinese judicial domain, where the underlying legal knowledge changes over time.
The system is built in two steps: Supervised Fine-Tuning of the Baichuan-13B-Base model (13.2B parameters) on the specialized DISC-Law-SFT dataset, followed by integration of a retrieval module. The resulting system outperforms existing general and legal LLMs in the paper's objective evaluation.
Objective and subjective evaluations show marked improvements in legal information extraction and judgment prediction, with practical benefits for legal professionals and law students.

DISC-LawLLM: Advancing Legal Services with LLMs

The paper introduces DISC-LawLLM, a system that uses LLMs to provide a wide range of intelligent legal services. The authors apply a legal syllogism prompting strategy to construct fine-tuning data, then fine-tune an LLM on that data to improve legal reasoning in the Chinese judicial context. A retrieval module lets DISC-LawLLM draw on external legal knowledge, which addresses the fact that legal databases change over time.

Methodology and Dataset Construction

The authors construct a supervised fine-tuning dataset, DISC-Law-SFT, made up of distinct subsets that target legal reasoning and the integration of domain-specific knowledge. The dataset is drawn from multiple sources, including public NLP legal task datasets, raw legal text, and open-source instruction datasets. The authors use GPT-3.5-turbo to make outputs consistent with the structure of a legal syllogism, producing instruction samples for tasks such as legal information extraction, judgment prediction, and text summarization.
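The dataset construction step can be illustrated with a minimal sketch. It assumes the usual three-part form of a legal syllogism (major premise: the applicable law; minor premise: the case facts; conclusion: the judgment). The template wording, the `SFTSample` fields, and both function names are hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass, asdict

# Hypothetical prompt template; the paper's actual wording is not reproduced here.
SYLLOGISM_TEMPLATE = (
    "Rewrite the reference answer as a legal syllogism.\n"
    "Major premise: the applicable legal provision.\n"
    "Minor premise: the relevant facts of the case.\n"
    "Conclusion: the legal judgment that follows.\n\n"
    "Case facts: {facts}\n"
    "Reference answer: {answer}\n"
)


@dataclass
class SFTSample:
    """One instruction-tuning record (field names are an assumption)."""
    instruction: str
    input: str
    output: str


def build_syllogism_prompt(facts: str, answer: str) -> str:
    """Fill the template that would be sent to a refining model."""
    return SYLLOGISM_TEMPLATE.format(facts=facts.strip(), answer=answer.strip())


def make_sft_sample(instruction: str, facts: str, refined_output: str) -> dict:
    """Package the refined answer as a plain dict ready to be written as JSON."""
    sample = SFTSample(instruction.strip(), facts.strip(), refined_output.strip())
    return asdict(sample)
```

In this sketch the call to the refining model is left out on purpose: `build_syllogism_prompt` produces the text that would be sent to it, and `make_sft_sample` stores whatever comes back.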

Training and Model Architecture

DISC-LawLLM is built in two primary steps: Supervised Fine-Tuning (SFT) and Retrieval Augmentation. The base model is Baichuan-13B-Base, which has 13.2 billion parameters and is fine-tuned on DISC-Law-SFT. Retrieval Augmentation is implemented by integrating an external retrieval framework that queries an evolving legal knowledge base at inference time, so that the model can cite accurate and current legal references.
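The retrieval step can be sketched as follows. This is an illustration under stated assumptions, not the paper's retriever: it ranks documents by Jaccard overlap of word sets, which stands in for whatever retrieval framework the authors actually use, and the function names and prompt layout are hypothetical.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase word set; Chinese text would need a proper word segmenter."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, knowledge_base: list, k: int = 2) -> list:
    """Return up to k documents ranked by Jaccard overlap with the query."""
    query_tokens = tokenize(query)
    scored = []
    for doc in knowledge_base:
        doc_tokens = tokenize(doc)
        union = query_tokens | doc_tokens
        score = len(query_tokens & doc_tokens) / len(union) if union else 0.0
        scored.append((score, doc))
    # Stable sort: documents with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def augment_prompt(query: str, knowledge_base: list, k: int = 2) -> str:
    """Prepend retrieved references to the question before it reaches the model."""
    references = retrieve(query, knowledge_base, k)
    if not references:
        return query  # nothing relevant found: pass the question through unchanged
    numbered = "\n".join(f"[{i}] {ref}" for i, ref in enumerate(references, 1))
    return f"References:\n{numbered}\n\nQuestion: {query}"
```

Because the knowledge base is passed in on every call, updating the legal texts requires no retraining, which is the property the paragraph above attributes to retrieval augmentation.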

Evaluation Framework

The authors propose a comprehensive evaluation framework, DISC-Law-Eval, which provides both objective and subjective assessments. The objective evaluation tests legal knowledge and reasoning with multiple-choice questions drawn from various legal exams. The subjective evaluation follows a question-answering paradigm in which GPT-3.5 scores each model answer on accuracy, completeness, and clarity.
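The two kinds of score can be computed along these lines. This is a minimal sketch: it assumes answers are given as strings of option letters and that the referee returns one numeric rating per criterion; the function names and the data layout are hypothetical, and the rating scale is left open.

```python
from statistics import mean

CRITERIA = ("accuracy", "completeness", "clarity")


def objective_accuracy(predictions: list, gold: list) -> float:
    """Fraction of questions whose predicted option set equals the gold set."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    # Comparing sets makes "AB" and "BA" count as the same multi-answer choice.
    correct = sum(set(p) == set(g) for p, g in zip(predictions, gold))
    return correct / len(gold)


def subjective_summary(ratings: list) -> dict:
    """Average referee ratings per criterion, plus the mean across criteria."""
    summary = {c: mean(r[c] for r in ratings) for c in CRITERIA}
    summary["average"] = mean(summary[c] for c in CRITERIA)
    return summary
```

Exact-match scoring on the option set is a deliberately strict choice for multi-answer questions: a partially correct selection earns no credit.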

Results and Implications

The results show that DISC-LawLLM surpasses existing general and legal LLMs in the objective evaluation and outperforms GPT-3.5-turbo in multiple legal domains. This points to stronger legal reasoning, particularly on complex legal tasks. In the subjective evaluation, DISC-LawLLM achieves higher average scores across the rated dimensions, which supports its applicability in real-world scenarios.

Practical and Theoretical Contributions

From a practical perspective, DISC-LawLLM offers advantages over traditional legal systems: it simplifies tasks for legal professionals, makes legal consultation more accessible, and serves as a study aid for law students. Theoretically, the paper contributes to the field of LegalAI by demonstrating how fine-tuning on legal syllogism data, combined with a retrieval mechanism, can enhance LLM capabilities in a specialized domain.

Future Directions

This paper opens avenues for extending DISC-LawLLM to other legal systems and languages, with the potential to integrate even broader repositories of legal knowledge. Future developments could explore multi-modal inputs and deeper integration with court databases to further enrich the system's applicability and reliability in diverse legal contexts.

Overall, DISC-LawLLM represents a significant step forward in utilizing LLMs for legal applications, setting a robust foundation for future advancements in AI-driven legal services.

GitHub

GitHub - FudanDISC/DISC-LawLLM: DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services (457 stars)