Tree Prompting: Efficient Task Adaptation without Fine-Tuning (2310.14034v1)

Published 21 Oct 2023 in cs.CL and cs.LG

Abstract: Prompting LMs is the main interface for applying them to new tasks. However, for smaller LMs, prompting provides low accuracy compared to gradient-based finetuning. Tree Prompting is an approach to prompting which builds a decision tree of prompts, linking multiple LM calls together to solve a task. At inference time, each call to the LM is determined by efficiently routing the outcome of the previous call using the tree. Experiments on classification datasets show that Tree Prompting improves accuracy over competing methods and is competitive with fine-tuning. We also show that variants of Tree Prompting allow inspection of a model's decision-making process.


Summary

  • The paper introduces Tree Prompting, which constructs decision trees out of LM prompts to enable task-specific adaptation without fine-tuning.
  • It reports an average accuracy improvement of 16.2% over basic few-shot prompting with smaller models, and a dynamic-prompt variant that makes the model's decisions inspectable.
  • The approach requires only a fixed number of LM queries at inference, pointing toward scalable and interpretable predictive modeling.

Tree Prompting: Efficient Task Adaptation without Fine-Tuning

Introduction

The study presented in "Tree Prompting: Efficient Task Adaptation without Fine-Tuning" (2310.14034) introduces an approach for adapting LMs to specific tasks without gradient-based fine-tuning. By constructing decision trees whose nodes are prompts, the method improves accuracy and efficiency on classification tasks. This essay covers the foundational concepts, experimental results, and future implications of Tree Prompting.

Conceptual Framework and Methodology

Tree Prompting builds a decision tree in which each node corresponds to a prompt evaluated by the LM. The outcome of each prompt routes the input to the next node, so a sequence of simple LM calls composes into a coherent decision-making pathway (Figure 1).

Figure 1: Illustration of Tree Prompting where a subset of training data is used to prompt the LM at each node to partition the input space.

Prompts are generated by sampling few-shot examples from the training data, a strategy inspired by bagging. Each prompt is converted into a binary decision split by a verbalizer, a function that maps the LM's output to a discrete outcome. The resulting decision tree distills the training data into a sparse yet effective model that requires only a fixed number of LM queries at inference.
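To make this construction concrete, the sketch below treats each prompt-plus-verbalizer as a binary feature and fits a standard decision tree over those features. The `query_lm` call, the keyword-matching verbalizer, and the use of scikit-learn's tree learner are illustrative assumptions, not the paper's exact implementation.

```python
import random
from sklearn.tree import DecisionTreeClassifier

def make_fewshot_prompt(texts, labels, k=4, seed=0):
    """Sample k training examples (bagging-style) to form one candidate prompt."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(texts)), k)
    return "\n".join(f"Input: {texts[i]}\nLabel: {labels[i]}" for i in idx)

def verbalize(lm_output):
    """Hypothetical verbalizer: map free-text LM output to a binary outcome."""
    return int("positive" in lm_output.lower())

def featurize(texts, prompts, query_lm):
    """One binary feature per (example, prompt) pair."""
    return [
        [verbalize(query_lm(f"{p}\nInput: {t}\nLabel:")) for p in prompts]
        for t in texts
    ]

# Fit a shallow tree over the prompt-outcome features (query_lm supplied by the user):
# prompts  = [make_fewshot_prompt(X_train, y_train, seed=s) for s in range(40)]
# features = featurize(X_train, prompts, query_lm)
# tree     = DecisionTreeClassifier(max_depth=3).fit(features, y_train)
```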

Experimental Results and Analysis

Classification Accuracy

Experiments on 13 classification datasets show that Tree Prompting outperforms conventional few-shot and ensemble prompting baselines across LM sizes, with an average accuracy improvement of 16.2% over basic few-shot prompting when using smaller models such as GPT-2 Small. While it does not always match the stability of gradient fine-tuning, Tree Prompting is competitive with fine-tuned models and excels on specific tasks, demonstrating how tree structures can maximize classification performance (Figure 2).

Figure 2: Performance as a function of the number of LM evaluations per example (#LM calls), showing Tree Prompting’s efficiency.
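The bounded-cost property follows directly from the tree structure: at inference, only the prompts along one root-to-leaf path are evaluated, so the number of LM calls is at most the tree depth. A minimal routing sketch, with hypothetical node fields, verbalizer, and `query_lm`:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    prompt: Optional[str] = None   # internal node: prompt to evaluate
    left: Optional["Node"] = None  # branch taken when the verbalizer returns 0
    right: Optional["Node"] = None # branch taken when the verbalizer returns 1
    label: Optional[str] = None    # leaf node: predicted class

def predict(node: Node, x: str, query_lm: Callable[[str], str]) -> str:
    """Route one input through the tree; LM calls are bounded by tree depth."""
    while node.label is None:
        out = query_lm(f"{node.prompt}\nInput: {x}\nLabel:")
        node = node.right if "positive" in out.lower() else node.left
    return node.label
```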

Interpretability and Dynamic Prompts

A key advantage of decision trees is their interpretability: with Tree Prompting, each decision node can be inspected to understand how the model arrives at a prediction. A variant that generates prompts dynamically at each node further exposes the underlying decision process (Figure 3).

Figure 3: Tree Prompting tree with dynamic prompts showing the decision-making process on MR dataset.
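Because every internal node is a human-readable prompt, the path taken for a given input doubles as an explanation. The sketch below extends the hypothetical `Node` routing above to record that path alongside the prediction:

```python
def explain(node: Node, x: str, query_lm) -> tuple[str, list[str]]:
    """Return the prediction plus the (prompt, outcome) trace that produced it."""
    trace = []
    while node.label is None:
        out = query_lm(f"{node.prompt}\nInput: {x}\nLabel:")
        went_right = "positive" in out.lower()
        trace.append(f"prompt={node.prompt!r} -> {'yes' if went_right else 'no'}")
        node = node.right if went_right else node.left
    return node.label, trace
```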

Comparison with kNN Prompting

Compared against kNN Prompting, Tree Prompting performs better on most datasets by routing inputs through explicit decision pathways rather than relying on nearest-neighbor retrieval alone. The tree structure also offers more flexibility in partitioning the input space without predefined verbalizer constraints (Figure 4).

Figure 4: Example tree for the MR dataset demonstrating the efficacy of Tree Prompting using dynamic prompt selection.

Implications and Future Directions

Tree Prompting offers an adaptable framework for applying LMs without computationally expensive gradient-based fine-tuning. Its efficient inference and compatibility with large LMs make it attractive in resource-constrained settings. Future directions include generalizing beyond text classification, improving the underlying decision-tree algorithms, and incorporating computational-cost constraints into node routing.

Tree Prompting's combination of prompt generation and tree-structure learning offers a scalable, interpretable method for predictive modeling with LMs. Future research could explore modular configurations, extend the approach to tasks such as function calling, or optimize traversal of complex decision trees to further improve efficiency and interpretability.

Conclusion

In summary, Tree Prompting marks a notable shift in how LMs are adapted to specific tasks. As an efficient alternative to fine-tuning, it serves as both a conceptual framework and a practical tool for machine-learning applications, and it invites further exploration of modular decision frameworks for more refined and interpretable task adaptation.
