Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm (2402.10671v3)

Published 16 Feb 2024 in cs.CL

Abstract: In-context learning of large language models (LLMs) has achieved remarkable success in natural language processing, yet extensive case studies reveal that single-step chain-of-thought prompting faces challenges such as attention diffusion and inadequate performance on complex tasks like text-to-SQL. To improve the in-context learning capabilities of LLMs for text-to-SQL, we propose a workflow paradigm that enhances the attention and problem-solving scope of LLMs through decomposition. Specifically, an information determination module that eliminates redundant information and a new prompt structure based on problem classification greatly enhance the model's attention. Additionally, the inclusion of self-correction and active learning modules greatly expands the problem-solving scope of LLMs, thereby raising the upper limit of LLM-based approaches. Extensive experiments on three datasets demonstrate that our approach outperforms other methods by a significant margin, achieving improvements of about 2-3 percentage points over the existing baseline on the Spider Dev, Spider-Realistic, and Bird Dev datasets, and new SOTA results on the Spider Test dataset. Our code is available on GitHub: https://github.com/FlyingFeather/DEA-SQL.
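
The abstract describes a pipeline of decomposed stages: schema/information filtering, question classification, category-specific SQL generation, and self-correction. The following is a minimal sketch of such a workflow, assuming a generic `llm` callable that maps a prompt string to a completion; the function names, prompts, and question categories here are illustrative assumptions, not the authors' actual DEA-SQL implementation (see the GitHub link above for that).

```python
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # any prompt -> completion function

@dataclass
class Question:
    text: str
    schema: str  # serialized database schema (tables and columns)

def filter_schema(llm: LLM, q: Question) -> str:
    """Information determination: keep only schema elements relevant to the question."""
    prompt = (f"Schema:\n{q.schema}\n\nQuestion: {q.text}\n"
              "List only the tables and columns needed to answer the question.")
    return llm(prompt)

def classify_question(llm: LLM, q: Question) -> str:
    """Problem classification: route the question to a category-specific prompt."""
    prompt = (f"Question: {q.text}\n"
              "Classify as one of: simple, join, nested, aggregation.")
    return llm(prompt).strip().lower()

def generate_sql(llm: LLM, q: Question, schema_subset: str, category: str) -> str:
    """Generate SQL with a prompt template tailored to the question category."""
    prompt = (f"Category: {category}\nRelevant schema:\n{schema_subset}\n"
              f"Question: {q.text}\nWrite the SQL query.")
    return llm(prompt)

def self_correct(llm: LLM, q: Question, schema_subset: str, sql: str) -> str:
    """Self-correction: ask the model to check and, if needed, fix its own SQL."""
    prompt = (f"Schema:\n{schema_subset}\nQuestion: {q.text}\nSQL:\n{sql}\n"
              "If the SQL is incorrect, output a corrected query; otherwise repeat it.")
    return llm(prompt)

def text_to_sql(llm: LLM, q: Question) -> str:
    """Run the full decomposed workflow on one question."""
    schema_subset = filter_schema(llm, q)
    category = classify_question(llm, q)
    sql = generate_sql(llm, q, schema_subset, category)
    return self_correct(llm, q, schema_subset, sql)
```

The design point of the decomposition is that each stage sees only the information it needs (e.g., the generator sees a filtered schema rather than the full one), which is how the paper argues attention diffusion is reduced relative to a single monolithic prompt.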
