ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval (2402.15838v3)
Abstract: We propose ListT5, a novel reranking approach based on Fusion-in-Decoder (FiD) that handles multiple candidate passages at both training and inference time. We also introduce an efficient inference framework for listwise ranking based on m-ary tournament sort with output caching. We evaluate and compare our model on the BEIR benchmark for the zero-shot retrieval task, demonstrating that ListT5 (1) outperforms the state-of-the-art RankT5 baseline with a notable +1.3 gain in average NDCG@10, (2) is comparable in efficiency to pointwise ranking models and more efficient than previous listwise ranking models, and (3) overcomes the lost-in-the-middle problem of previous listwise rerankers. Our code, model checkpoints, and evaluation framework are fully open-sourced at https://github.com/soyoung97/ListT5.
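The m-ary tournament sort with output caching named in the abstract can be illustrated with a short sketch. This is a minimal reading of the idea under stated assumptions, not the authors' implementation: the hypothetical `pick_best` stands in for a listwise reranker call (e.g., one FiD forward pass over a group of m passages), candidates are plain integers, and a memo dict plays the role of output caching.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A listwise "match": given a group of candidate ids, return the id
# the reranker judges most relevant. (Hypothetical stand-in for the
# actual listwise reranker call.)
PickBest = Callable[[Sequence[int]], int]

def cached(pick_best: PickBest) -> PickBest:
    """Memoize matches by group membership, so matches untouched by a
    removed winner are not recomputed in later tournaments (the output
    caching idea). Assumes the winner depends only on which candidates
    are in the group, not on their order."""
    memo: Dict[Tuple[int, ...], int] = {}

    def wrapper(group: Sequence[int]) -> int:
        key = tuple(sorted(group))
        if key not in memo:
            memo[key] = pick_best(group)
        return memo[key]

    return wrapper

def tournament_top1(pool: List[int], pick_best: PickBest, m: int = 5) -> int:
    """One m-ary tournament: rerank groups of up to m candidates,
    advance each group's winner, and repeat until one remains."""
    while len(pool) > 1:
        pool = [pick_best(pool[i:i + m]) for i in range(0, len(pool), m)]
    return pool[0]

def tournament_top_k(candidates: List[int], pick_best: PickBest,
                     k: int = 10, m: int = 5) -> List[int]:
    """Top-k reranking: emit the tournament winner, remove it from the
    pool, and rerun; with caching, repeated matches cost nothing."""
    pool, ranked = list(candidates), []
    for _ in range(min(k, len(candidates))):
        winner = tournament_top1(list(pool), pick_best, m)
        ranked.append(winner)
        pool.remove(winner)
    return ranked

# Toy usage: relevance of candidate i is simply i, so the true top-3
# of candidates 0..99 is [99, 98, 97].
scores = {i: float(i) for i in range(100)}
pick = cached(lambda group: max(group, key=lambda c: scores[c]))
print(tournament_top_k(list(range(100)), pick, k=3, m=5))
```

Note that grouping in this sketch is positional, so removing a winner can shift later groups and reduce cache hits; a faithful implementation would presumably keep the tournament bracket fixed so that only matches along the removed winner's path are replayed.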
References
- MS MARCO: A human generated machine reading comprehension dataset.
- Overview of the TREC 2020 deep learning track.
- ExaRanker: Explanation-augmented neural ranker.
- FiD-Light: Efficient and effective retrieval-augmented text generation.
- Gautier Izacard and Edouard Grave. 2021. Leveraging passage retrieval with generative models for open domain question answering.
- Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4):422–446.
- Dense passage retrieval for open-domain question answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6769–6781, Online. Association for Computational Linguistics.
- FiD-Ex: Improving sequence-to-sequence models for extractive rationale generation.
- Lost in the middle: How language models use long contexts.
- Zero-shot listwise document reranking with a large language model.
- Keith McLuckie and Angus Barber. 1986. Tournament Sort, pages 68–86. Macmillan Education UK, London.
- Relational pooling for graph representations. In Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 4663–4673. PMLR.
- Large dual encoders are generalizable retrievers. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9844–9855, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Document ranking with a pretrained sequence-to-sequence model. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 708–718, Online. Association for Computational Linguistics.
- OpenAI. 2022. Introducing ChatGPT. https://openai.com/blog/chatgpt.
- The Expando-Mono-Duo design pattern for text ranking with pretrained sequence-to-sequence models.
- RankVicuna: Zero-shot listwise document reranking with open-source large language models.
- RankZephyr: Effective and robust zero-shot listwise reranking is a breeze!
- Large language models are effective text rankers with pairwise ranking prompting.
- Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
- Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks.
- Stephen Robertson and Hugo Zaragoza. 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval, 3(4):333–389.
- In defense of cross-encoders for zero-shot retrieval.
- Instruction distillation makes large language models efficient zero-shot rankers.
- Is ChatGPT good at search? Investigating large language models as re-ranking agents.
- Found in the middle: Permutation self-consistency improves listwise ranking in large language models.
- BEIR: A heterogenous benchmark for zero-shot evaluation of information retrieval models.
- Learning list-level domain-invariant representations for ranking.
- Dmitry Yarotsky. 2018. Universal approximations of invariant maps by neural networks.
- COCO-DR: Combating distribution shifts in zero-shot dense retrieval with contrastive and distributionally robust learning.
- RankT5: Fine-tuning T5 for text ranking with ranking losses.