Towards Unsupervised Question Answering System with Multi-level Summarization for Legal Text (2403.13107v2)

Published 19 Mar 2024 in cs.CL, cs.CY, and cs.LG

Abstract: This paper summarizes Team SCaLAR's work on SemEval-2024 Task 5: Legal Argument Reasoning in Civil Procedure. To address this binary classification task, made daunting by the complexity of the legal texts involved, we propose a simple yet novel similarity- and distance-based unsupervised approach to generate labels. Further, we explore multi-level fusion of LEGAL-BERT embeddings through an ensemble of CNN, GRU, and LSTM features. To handle the lengthy legal explanations in the dataset, we introduce T5-based segment-wise summarization, which successfully retained crucial information, enhancing the model's performance. Our unsupervised system achieved a 20-point increase in macro F1-score on the development set and a 10-point increase on the test set, which is promising given its uncomplicated architecture.
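
As a rough illustration of the pipeline the abstract describes, here is a minimal Python sketch (not the authors' code) of its two central ideas: T5-based segment-wise summarization of long legal explanations, and similarity-based unsupervised label generation over LEGAL-BERT embeddings. The checkpoints, segment length, mean pooling, and 0.5 decision threshold are illustrative assumptions; the paper's exact configuration and its CNN/GRU/LSTM fusion ensemble are not reproduced here.

import torch
from transformers import (AutoModel, AutoTokenizer,
                          T5ForConditionalGeneration, T5Tokenizer)

# T5-based segment-wise summarization: split a long explanation into
# fixed-size word segments, summarize each, and concatenate the summaries.
t5_tok = T5Tokenizer.from_pretrained("t5-base")   # assumed checkpoint
t5 = T5ForConditionalGeneration.from_pretrained("t5-base")

def summarize_segments(text: str, words_per_segment: int = 300) -> str:
    words = text.split()
    summaries = []
    for i in range(0, len(words), words_per_segment):
        segment = " ".join(words[i:i + words_per_segment])
        inputs = t5_tok("summarize: " + segment, return_tensors="pt",
                        truncation=True, max_length=512)
        out = t5.generate(**inputs, max_new_tokens=80, num_beams=4)
        summaries.append(t5_tok.decode(out[0], skip_special_tokens=True))
    return " ".join(summaries)

# Unsupervised label generation: embed the question and a candidate answer
# with LEGAL-BERT, then assign a pseudo-label from their cosine similarity.
bert_tok = AutoTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
bert = AutoModel.from_pretrained("nlpaueb/legal-bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    inputs = bert_tok(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = bert(**inputs).last_hidden_state   # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # mean-pooled sentence vector

def pseudo_label(question: str, answer: str, threshold: float = 0.5) -> int:
    sim = torch.cosine_similarity(embed(question), embed(answer), dim=0)
    return int(sim.item() >= threshold)             # 1 = plausible answer

In this sketch, a candidate answer's long explanation would first be compressed with summarize_segments, then scored against the question with pseudo_label; the resulting pseudo-labels could then supervise a downstream fused classifier of the kind the abstract describes.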

References (20)
  1. Bongard et al. 2022. The legal argument reasoning task in civil procedure. In Proceedings of the Natural Legal Language Processing Workshop 2022, pages 194–207, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
  2. Chalkidis et al. 2020. LEGAL-BERT: The muppets straight out of law school. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2898–2904, Online. Association for Computational Linguistics.
  3. Chung et al. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
  4. Devlin et al. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  5. Geng et al. 2021. Legal transformer models may not always help. arXiv preprint.
  6. He et al. 2021. DeBERTa: Decoding-enhanced BERT with disentangled attention. In International Conference on Learning Representations (ICLR 2021).
  7. Hochreiter and Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  8. Jacovi et al. 2018. Understanding convolutional neural networks for text classification. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 56–65, Brussels, Belgium. Association for Computational Linguistics.
  9. Kingma and Ba. 2017. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  10. Liu et al. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  11. Louis et al. 2023. Interpretable long-form legal question answering with retrieval-augmented large language models. arXiv preprint.
  12. Martinez-Gil. 2023. A survey on legal question-answering systems. Computer Science Review, 48(C).
  13. Mikolov et al. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (NIPS).
  14. Nair and Hinton. 2010. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML'10), pages 807–814, Madison, WI, USA. Omnipress.
  15. Paul et al. 2023. Pre-trained language models for the legal domain: A case study on Indian law. In Proceedings of the Nineteenth International Conference on Artificial Intelligence and Law (ICAIL '23), pages 187–196, New York, NY, USA. Association for Computing Machinery.
  16. Pennington et al. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar. Association for Computational Linguistics.
  17. Raffel et al. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
  18. Deep learning based weighted feature fusion approach for sentiment analysis. 2019. IEEE Access, 7:140252–140260.
  19. Vaswani et al. 2017. Attention is all you need. In Advances in Neural Information Processing Systems (NIPS 2017).
  20. Wolf et al. 2020. HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771.
Authors (3)
  1. M Manvith Prabhu (2 papers)
  2. Haricharana Srinivasa (2 papers)
  3. Anand Kumar M (7 papers)
Citations (2)
