"In-Context Learning" or: How I learned to stop worrying and love "Applied Information Retrieval" (2405.01116v1)

Published 2 May 2024 in cs.IR

Abstract: With the increasing ability of LLMs, in-context learning (ICL) has evolved as a new paradigm for NLP, where instead of fine-tuning the parameters of an LLM specific to a downstream task with labeled examples, a small number of such examples is appended to a prompt instruction for controlling the decoder's generation process. ICL, thus, is conceptually similar to a non-parametric approach, such as $k$-NN, where the prediction for each instance essentially depends on the local topology, i.e., on a localised set of similar instances and their labels (called few-shot examples). This suggests that a test instance in ICL is analogous to a query in IR, and similar examples in ICL retrieved from a training set relate to a set of documents retrieved from a collection in IR. While standard unsupervised ranking models can be used to retrieve these few-shot examples from a training set, the effectiveness of the examples can potentially be improved by re-defining the notion of relevance specific to its utility for the downstream task, i.e., considering an example to be relevant if including it in the prompt instruction leads to a correct prediction. With this task-specific notion of relevance, it is possible to train a supervised ranking model (e.g., a bi-encoder or cross-encoder), which potentially learns to optimally select the few-shot examples. We believe that the recent advances in neural rankers can potentially find a use case for this task of optimally choosing examples for more effective downstream ICL predictions.


Summary

  • The paper argues that integrating IR principles, such as dynamic example scoring, can boost predictive accuracy in In-Context Learning.
  • The study introduces an adaptive ICL method that mirrors k-nearest neighbors by adjusting the few-shot prompt based on local data similarities.
  • The research highlights the use of supervised neural rankers and Query Performance Prediction to optimize example selection and reduce bias.

In-Context Learning and Information Retrieval in AI

This essay provides an analysis of the paper titled "In-Context Learning" or: How I learned to stop worrying and love "Applied Information Retrieval" (2405.01116). This work examines the convergence between In-Context Learning (ICL) — a burgeoning paradigm in NLP facilitated by LLMs — and established techniques in the field of Information Retrieval (IR). The authors propose that the principles and advancements in IR can significantly enhance the effectiveness of ICL.

Introduction

The paper presents a novel perspective by arguing that the foundation of In-Context Learning (ICL) aligns closely with principles found in Information Retrieval (IR). It juxtaposes the role ICL plays in NLP with non-parametric approaches such as k-Nearest Neighbors (k-NN), emphasizing the reliance on local similarities within a dataset. In ICL, a few labeled examples from a training set are appended to a prompt to guide an LLM's generation, shifting the prediction process from fine-tuning a pre-trained model to prompt-based guidance (Figure 1).

Figure 1: A workflow diagram illustrating how three verticals of IR research fit into the workflow of in-context learning (ICL).

While traditional IR focuses on relevance ranking, which can be leveraged to retrieve few-shot examples for ICL, this research introduces the concept of "usefulness-specific relevance", where an example is considered valuable if its inclusion in the prompt improves predictive accuracy. The work highlights how advanced neural rankers might be trained to select optimal demonstrations for ICL, stressing that IR offers robust methodologies for enhancing ICL workflows.
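To make this utility-driven notion of relevance concrete, the following Python sketch (an illustration under assumed components, not the paper's implementation) shows how such labels could be collected: a candidate example is marked relevant for a training query if including it in a one-shot prompt leads the LLM to the correct answer. The llm_generate callable and the prompt template are placeholders.

```python
from typing import Callable, List, Tuple

def collect_utility_labels(
    train_queries: List[Tuple[str, str]],    # (input_text, gold_label) pairs
    candidate_pool: List[Tuple[str, str]],   # candidate few-shot examples with labels
    llm_generate: Callable[[str], str],      # placeholder for any LLM completion call
) -> List[Tuple[str, str, int]]:
    """Label a (query, example) pair as relevant (1) if adding the example
    to a one-shot prompt yields the correct prediction, else 0. The resulting
    triples can be used to train a bi-encoder or cross-encoder ranker."""
    triples = []
    for query_text, gold in train_queries:
        for example_text, example_label in candidate_pool:
            prompt = (
                f"Review: {example_text}\nSentiment: {example_label}\n\n"
                f"Review: {query_text}\nSentiment:"
            )
            prediction = llm_generate(prompt).strip().lower()
            triples.append((query_text, example_text, int(prediction == gold.lower())))
    return triples
```

In practice, the candidate pool would likely be limited to the top results of an unsupervised retriever (e.g., BM25) for each query to keep the number of LLM calls manageable.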

In-Context Learning: Conceptual Framework

In-Context Learning differs fundamentally from traditional supervised learning: rather than iteratively fine-tuning model parameters, it leverages a few labeled examples, supplied through prompt engineering, to guide the decoder's generative process. The prediction in ICL depends heavily on the context provided by the few-shot examples appended to the prompt, making it akin to k-NN, albeit with frozen LLM parameters (Figure 2). In this framework, a test instance acts as a query used to retrieve relevant, high-utility examples from the training set that align with the downstream task, whether classification or generative tasks such as QA and summarization.

Figure 2: Example workflow of In-Context Learning for sentiment classification.
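As a concrete illustration of the workflow in Figure 2 (a minimal sketch under assumed components rather than the paper's implementation), the snippet below retrieves the most similar training reviews with an off-the-shelf sentence encoder from the sentence-transformers library and assembles a few-shot prompt; llm_generate again stands in for any LLM call.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any bi-encoder can be substituted

def build_icl_prompt(test_text, train_texts, train_labels, k=4):
    """Retrieve the k most similar training examples and format a few-shot prompt."""
    corpus_emb = encoder.encode(train_texts, convert_to_tensor=True)
    query_emb = encoder.encode(test_text, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=k)[0]
    shots = [
        f"Review: {train_texts[h['corpus_id']]}\nSentiment: {train_labels[h['corpus_id']]}"
        for h in hits
    ]
    return "\n\n".join(shots) + f"\n\nReview: {test_text}\nSentiment:"

# prediction = llm_generate(build_icl_prompt("The plot was a mess.", texts, labels))
```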

Query Performance Prediction (QPP) and Adaptive ICL

The paper draws a parallel between ICL's demand for useful training examples and IR tasks such as ranking and Query Performance Prediction (QPP), which could enhance ICL by selecting relevant training examples more effectively. The authors suggest that both unsupervised and supervised retrieval models can be calibrated to select optimal few-shot examples, employing ranking objectives and predictors learned from training instances to improve the efficacy of example retrieval.
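One hedged way to link QPP to this setting (an illustrative bridge, not a method from the paper) is to apply a standard post-retrieval predictor such as Normalized Query Commitment (NQC) to the retrieval scores of the candidate examples, using the spread of the top scores as a proxy for how trustworthy the retrieved example set is for a given test instance.

```python
import statistics

def nqc(top_scores, collection_score):
    """Normalized Query Commitment: standard deviation of the top-k retrieval
    scores, normalized by a collection-level score (e.g., the score of the
    corpus treated as a single document)."""
    if len(top_scores) < 2 or collection_score == 0:
        return 0.0
    return statistics.pstdev(top_scores) / collection_score

# A low NQC suggests the retrieved candidates are not clearly separated from
# the rest of the pool, signalling that this test instance may need more
# (or more diverse) few-shot examples.
```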

Adaptive Example Selection

A proposed approach to optimizing the number of examples in a prompt is adaptive ICL, where the number of examples is determined dynamically for each test instance, much like adjusting the neighborhood size in k-NN for regions of varying density (Figure 3).

Figure 3: Motivation behind using a variable-sized neighborhood for k-NN classification.

This approach requires scoring each candidate example by its expected utility for yielding a correct prediction, a non-trivial task that the paper frames as a supervised learning problem. Since the prediction depends heavily on these localized examples, their selection plays a critical role in the model's decision-making.
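One simple way to realize this adaptivity (a sketch whose thresholding rule is an assumption, not the paper's algorithm) is to keep only the retrieved candidates whose predicted utility clears a threshold, up to a maximum budget, so that different test instances receive prompts of different lengths.

```python
def select_examples_adaptively(scored_examples, threshold=0.5, max_k=8):
    """scored_examples: list of (example, utility_score) pairs, where the score
    comes from a trained bi-/cross-encoder ranker or a QPP-style estimate.
    Returns a variable-length demonstration set for the current test instance."""
    ranked = sorted(scored_examples, key=lambda pair: pair[1], reverse=True)
    if not ranked:
        return []
    selected = [example for example, score in ranked if score >= threshold][:max_k]
    # Fall back to the single best candidate rather than a zero-shot prompt.
    return selected if selected else [ranked[0][0]]
```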

Practical and Theoretical Implications

The paper outlines how integrating IR strategies, particularly QPP and the faceted and diversified search paradigms, into ICL can improve downstream predictions. Such integration would allow systems to handle tasks with varying demands on training data, for instance by dynamically adapting the number of examples, an area where ICL can borrow heavily from the extensive IR literature.

Furthermore, the paper underscores the importance of balancing relevance and diversity among the selected examples to reduce bias towards particular topical aspects, which is essential for a balanced exploration of the input space. It also highlights the challenges of relying on retrieval score distributions alone and argues for supervised approaches built on neural rankers for more accurate ICL example selection.
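Balancing relevance and diversity when assembling the demonstration set can be sketched with a standard Maximal Marginal Relevance (MMR) loop; the trade-off parameter lam and the similarity inputs are illustrative assumptions rather than prescriptions from the paper.

```python
def mmr_select(query_sim, pairwise_sim, k=4, lam=0.7):
    """query_sim[i]: similarity of candidate i to the test instance.
    pairwise_sim[i][j]: similarity between candidates i and j.
    Greedily picks k candidates, trading off relevance against redundancy."""
    selected, remaining = [], list(range(len(query_sim)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```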

Conclusion

This paper positions itself at the intersection of In-Context Learning and Information Retrieval, postulating that enhanced retrieval and diversity mechanisms can contribute significantly to more effective ICL applications. By recommending IR methodologies such as Query Performance Prediction, supervised ranking, and faceted IR, the research opens new avenues for refining ICL. Continued exploration along these lines may further improve the adaptability and efficiency of LLM-based systems on complex downstream tasks, and the paper provides a framework to guide future work in merging these distinct yet synergistic fields.
