Cause and Effect: Can Large Language Models Truly Understand Causality? (2402.18139v3)
Abstract: With the rise of Large Language Models (LLMs), it has become crucial to understand their capabilities and limitations in deciphering and explaining the complex web of causal relationships that language entails. Current methods rely on either explicit or implicit causal reasoning, yet there is a strong need for a unified approach that combines both to handle a wider array of causal relationships more effectively. This research proposes a novel architecture, the Context-Aware Reasoning Enhancement with Counterfactual Analysis (CARE-CA) framework, to enhance causal reasoning and explainability. The proposed framework incorporates an explicit causal detection module, built on ConceptNet and counterfactual statements, alongside implicit causal detection through LLMs. Our framework goes one step further with a layer of counterfactual explanations that accentuates the LLMs' understanding of causality. Knowledge from ConceptNet improves performance on multiple causal reasoning tasks, such as causal discovery, causal identification, and counterfactual reasoning, while the counterfactual sentences add explicit knowledge of "not caused by" scenarios. By combining these modules, our model aims to provide a deeper understanding of causal relationships, enabling enhanced interpretability. Evaluation on benchmark datasets shows improved performance across all metrics: accuracy, precision, recall, and F1 score. We also introduce CausalNet, a new dataset accompanied by our code, to facilitate further research in this domain.
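To make the described pipeline concrete, below is a minimal sketch of how the three components named in the abstract (an explicit, ConceptNet-style knowledge lookup; an implicit LLM judgment; and a counterfactual explanation layer) might fit together. All function names and interfaces here are hypothetical illustrations, not the paper's actual implementation: `kg_edges` stands in for ConceptNet `/r/Causes` assertions, and `llm` is any callable that maps a prompt to a text reply.

```python
# Hypothetical sketch of a CARE-CA-style pipeline; the paper's real module
# APIs are not specified in the abstract, so every name below is assumed.
from dataclasses import dataclass


@dataclass
class CausalVerdict:
    causal: bool       # does `cause` plausibly cause `effect`?
    explanation: str   # output of the counterfactual explanation layer


def explicit_causal_check(cause, effect, kg_edges):
    """Explicit module: look up a Causes edge in a ConceptNet-style graph.

    `kg_edges` is a set of (cause, effect) pairs standing in for
    ConceptNet /r/Causes assertions, e.g. {("rain", "wet ground")}.
    """
    return (cause, effect) in kg_edges


def implicit_causal_check(cause, effect, llm):
    """Implicit module: ask an LLM to judge the causal relation.

    `llm` is any callable prompt -> str; swap in a real model client.
    """
    prompt = f'Answer yes or no: does "{cause}" cause "{effect}"?'
    return llm(prompt).strip().lower().startswith("yes")


def counterfactual_explanation(cause, effect, causal):
    """Counterfactual layer: state the 'not caused by' scenario explicitly."""
    if causal:
        return (f'Had "{cause}" not occurred, '
                f'"{effect}" would likely not have occurred.')
    return f'"{effect}" would occur regardless of "{cause}"; no causal link.'


def care_ca(cause, effect, kg_edges, llm):
    """Combine the explicit (knowledge-graph) and implicit (LLM) signals,
    then attach a counterfactual explanation to the verdict."""
    causal = (explicit_causal_check(cause, effect, kg_edges)
              or implicit_causal_check(cause, effect, llm))
    return CausalVerdict(causal, counterfactual_explanation(cause, effect, causal))


if __name__ == "__main__":
    edges = {("rain", "wet ground")}
    mock_llm = lambda prompt: "no"  # placeholder; replace with a real model call
    print(care_ca("rain", "wet ground", edges, mock_llm))
```

The combination rule shown here (a simple OR over the two modules) is one plausible reading of "combining these modules"; the actual framework may weigh or condition the signals differently.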