In-Context Learning for Text Classification with Many Labels (2309.10954v2)
Abstract: In-context learning (ICL) with LLMs on tasks with many labels is challenging because the limited context window makes it difficult to fit a sufficient number of examples in the prompt. In this paper, we use a pre-trained dense retrieval model to bypass this limitation, giving the model only a partial view of the full label space at each inference call. Testing with recent open-source LLMs (OPT, LLaMA), we set new state-of-the-art performance in few-shot settings on three common intent classification datasets, without fine-tuning. In certain cases, we also surpass fine-tuned performance on fine-grained sentiment classification. We analyze performance across the number of in-context examples and different model scales, showing that larger models are necessary to make effective and consistent use of longer contexts for ICL. Through several ablations, we analyze the model's use of: a) the similarity of the in-context examples to the current input, b) the semantic content of the class names, and c) the correct correspondence between examples and labels. We demonstrate that all three are needed to varying degrees depending on the domain, contrary to certain recent works.
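Below is a minimal sketch of the retrieval-based demonstration selection the abstract describes: for each test input, a dense retriever picks the most similar labeled examples, so the prompt exposes the LLM to only a partial view of the full label space. The retriever checkpoint, prompt template, and the toy example pool are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: retrieval-augmented in-context learning for many-label classification.
# Assumes the sentence-transformers library; the model name and example pool
# below are placeholders, not the paper's actual setup.
from sentence_transformers import SentenceTransformer, util

# Hypothetical labeled pool (in practice: the few-shot training set,
# spanning many intent or sentiment labels).
pool = [
    ("how do I reset my password", "password_reset"),
    ("my card was charged twice", "billing_dispute"),
    ("what time do you open on sundays", "opening_hours"),
    ("I want to close my account", "account_closure"),
]

retriever = SentenceTransformer("all-mpnet-base-v2")  # assumed retriever checkpoint
pool_texts = [text for text, _ in pool]
pool_emb = retriever.encode(pool_texts, convert_to_tensor=True)

def build_prompt(query: str, top_k: int = 2) -> str:
    """Retrieve the top_k most similar labeled examples and format an ICL prompt."""
    query_emb = retriever.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, pool_emb, top_k=top_k)[0]
    demos = [pool[hit["corpus_id"]] for hit in hits]
    lines = [f"Input: {text}\nLabel: {label}" for text, label in demos]
    lines.append(f"Input: {query}\nLabel:")  # the LLM completes the label
    return "\n\n".join(lines)

print(build_prompt("I think someone billed me twice this month"))
```

The resulting prompt would then be passed to a causal LLM (e.g., OPT or LLaMA) and the generated continuation matched against the label set; the exact prompt format, retriever, and number of retrieved demonstrations are design choices the paper studies, not fixed by this sketch.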