Event GDR: Event-Centric Generative Document Retrieval (2405.06886v1)

Published 11 May 2024 in cs.IR, cs.AI, and cs.CL

Abstract: Generative document retrieval, an emerging paradigm in information retrieval, learns to build connections between documents and identifiers within a single model and has garnered significant attention. However, two challenges remain: (1) inner-content correlation is neglected during document representation; (2) explicit semantic structure is lacking during identifier construction. Events, with their rich relations and well-defined taxonomy, could help address both challenges. Inspired by this, we propose Event GDR, an event-centric generative document retrieval model that integrates event knowledge into this task. Specifically, we use a multi-agent exchange-then-reflection method for event knowledge extraction. For document representation, we model the document with events and relations to ensure comprehensiveness and inner-content correlation. For identifier construction, we map the events to a well-defined event taxonomy to construct identifiers with explicit semantic structure. Our method achieves significant improvement over the baselines on two datasets, and we hope it provides insights for future research.
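
To make the identifier-construction idea concrete, here is a minimal sketch of how extracted event types might be mapped onto a taxonomy path to form a structured identifier. It is an illustration under stated assumptions, not the paper's implementation: the `TAXONOMY` table, the `build_identifier` function, the "most frequent event type wins" rule, and the trailing document index are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical two-level event taxonomy: event type -> path from the root.
# These labels are illustrative only and are not taken from the paper.
TAXONOMY = {
    "Attack": ("Conflict", "Attack"),
    "Protest": ("Conflict", "Protest"),
    "Acquisition": ("Business", "Acquisition"),
    "Election": ("Politics", "Election"),
}


def build_identifier(event_types, doc_index):
    """Build a hierarchical identifier from a document's extracted event types.

    Assumption: the dominant (most frequent) known event type selects the
    taxonomy path, and a document index is appended to keep identifiers unique.
    """
    known = [e for e in event_types if e in TAXONOMY]
    if not known:
        # No recognizable event: fall back to a placeholder path.
        return ("Unknown", "Unknown", str(doc_index))
    # Ties are resolved in favor of the event type seen first.
    dominant, _ = Counter(known).most_common(1)[0]
    return TAXONOMY[dominant] + (str(doc_index),)


if __name__ == "__main__":
    print(build_identifier(["Attack", "Protest", "Attack"], 7))
    print(build_identifier([], 3))
```

In a sketch like this, documents that share a dominant event type also share an identifier prefix, which is one way an identifier can carry explicit semantic structure.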
