ItD: Large Language Models Can Teach Themselves Induction through Deduction (2403.05789v1)
Abstract: Although LLMs show impressive performance on a wide range of Natural Language Processing tasks, researchers have found that they still have limited ability to conduct induction. Recent works mainly adopt ``post-processing'' paradigms to improve the inductive performance of LLMs (e.g., hypothesis search & refinement methods), but their performance remains constrained by the inherent inductive capability of the LLMs. In this paper, we propose a novel framework, Induction through Deduction (ItD), which enables LLMs to teach themselves induction through deduction. The ItD framework comprises two main components: a Deductive Data Generation module that generates induction data, and a Naive Bayesian Induction module that optimizes the fine-tuning and decoding of LLMs. Our empirical results demonstrate the effectiveness of ItD on two induction benchmarks, achieving relative performance improvements of 36% and 10% over the previous state of the art, respectively. Our ablation study verifies the effectiveness of the two key modules of ItD, and we further confirm that ItD generalizes across different LLMs and deductors. The data and code of this paper can be found at https://anonymous.4open.science/r/ItD-E844.
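The abstract does not spell out how the Naive Bayesian Induction module works. As a rough illustration only (the function names and toy numbers below are mine, not from the paper), a "naive Bayesian" treatment of induction typically scores each candidate rule by summing per-example log-likelihoods under a conditional-independence assumption, then selects the highest-scoring rule:

```python
def naive_bayes_score(per_example_logliks, log_prior=0.0):
    """Under the naive (conditional independence) assumption:
    log P(rule | examples) ∝ log P(rule) + sum_i log P(example_i | rule).
    `per_example_logliks` holds log P(example_i | rule) for one candidate rule.
    """
    return log_prior + sum(per_example_logliks)

def select_rule(candidates):
    """Pick the candidate rule with the highest naive-Bayes score.
    `candidates` maps rule name -> list of per-example log-likelihoods
    (e.g., token log-probabilities scored by an LLM)."""
    return max(candidates, key=lambda rule: naive_bayes_score(candidates[rule]))

# Toy example: a rule that explains every example well beats one that
# fits a single example strongly but the others poorly.
scores = {
    "add_two":   [-0.1, -0.2, -0.1],
    "times_three": [-0.05, -3.0, -2.5],
}
best = select_rule(scores)
```

This is only a sketch of the general naive-Bayes aggregation idea (cf. the NBCE reference cited by the paper); the actual module in ItD operates on LLM fine-tuning and decoding and may differ substantially.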