Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model (2306.16092v2)
Abstract: AI legal assistants based on LLMs can provide accessible legal consulting services, but the hallucination problem poses potential legal risks. This paper presents Chatlaw, an innovative legal assistant utilizing a Mixture-of-Experts (MoE) model and a multi-agent system to enhance the reliability and accuracy of AI-driven legal services. By combining knowledge graphs with manual screening, we construct a high-quality legal dataset to train the MoE model. This model routes different legal issues to different experts, optimizing the accuracy of legal responses. Additionally, Standard Operating Procedures (SOPs), modeled after real law firm workflows, significantly reduce errors and hallucinations in legal services. Our MoE model outperforms GPT-4 on LawBench and the Unified Qualification Exam for Legal Professionals by 7.73% in accuracy and 11 points, respectively, and also surpasses other models across multiple dimensions in real-case consultations, demonstrating the system's robust capability for legal consultation.
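The core MoE idea sketched in the abstract, a gating network scoring experts per input and combining only the top-k, can be illustrated with a minimal toy layer. This is not the paper's actual architecture: the expert functions, gate vectors, and top-k renormalization below are illustrative assumptions chosen only to show the routing mechanism.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


class MoELayer:
    """Toy Mixture-of-Experts layer (illustrative, not from the paper).

    A gating network scores each expert for the input; only the top-k
    experts run, and their outputs are mixed by renormalized gate weights.
    """

    def __init__(self, experts, gate_weights, k=2):
        self.experts = experts            # list of callables: x -> vector
        self.gate_weights = gate_weights  # one gate vector per expert
        self.k = k

    def forward(self, x):
        # Gate score for each expert: dot product with its gate vector.
        scores = [sum(xi * wi for xi, wi in zip(x, w)) for w in self.gate_weights]
        probs = softmax(scores)
        # Keep only the top-k experts and renormalize their weights.
        topk = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[: self.k]
        norm = sum(probs[i] for i in topk)
        out = [0.0] * len(x)
        for i in topk:
            weight = probs[i] / norm
            out = [o + weight * yi for o, yi in zip(out, self.experts[i](x))]
        return out
```

In the full system, each "expert" would be a specialized sub-network for a class of legal issues (e.g. contract vs. criminal matters) rather than a simple function, and the gate would be learned jointly with the experts.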