
Event GDR: Event-Centric Generative Document Retrieval

Published 11 May 2024 in cs.IR, cs.AI, and cs.CL | (2405.06886v1)

Abstract: Generative document retrieval, an emerging paradigm in information retrieval, learns to build connections between documents and identifiers within a single model and has attracted significant attention. However, two challenges remain: (1) inner-content correlation is neglected during document representation; (2) identifier construction lacks explicit semantic structure. Events, by contrast, have rich relations and a well-defined taxonomy, which can help address both challenges. Inspired by this, we propose Event GDR, an event-centric generative document retrieval model that integrates event knowledge into this task. Specifically, we use a multi-agent exchange-then-reflection method for event knowledge extraction. For document representation, we model each document with events and relations to ensure comprehensiveness and inner-content correlation. For identifier construction, we map the events to a well-defined event taxonomy to construct identifiers with explicit semantic structure. Our method achieves significant improvement over the baselines on two datasets, and we hope it provides insights for future research.
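
As a rough illustration of what an identifier with explicit semantic structure could look like, the following minimal sketch maps a document's extracted event types to a path in a small taxonomy. Everything here is an assumption made for illustration: the taxonomy contents, the `build_identifier` helper, the "most frequent event type" selection rule, and the `/` separator are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical two-level event taxonomy: event type -> path from root to leaf.
# The paper's actual taxonomy is not given in the abstract.
TOY_TAXONOMY = {
    "Attack": ("Conflict", "Attack"),
    "Protest": ("Conflict", "Protest"),
    "Acquisition": ("Business", "Acquisition"),
}


def build_identifier(event_types, taxonomy=TOY_TAXONOMY, sep="/"):
    """Return a taxonomy-path identifier for the most frequent known event type.

    This selection rule is an illustrative assumption, not the paper's method.
    """
    # Keep only event types that the taxonomy can place.
    known = [e for e in event_types if e in taxonomy]
    if not known:
        raise ValueError("no event type maps to the taxonomy")
    # Pick the dominant event type; ties go to the one seen first.
    dominant, _ = Counter(known).most_common(1)[0]
    # Join the root-to-leaf path so the identifier carries its hierarchy.
    return sep.join(taxonomy[dominant])


if __name__ == "__main__":
    print(build_identifier(["Attack", "Protest", "Attack"]))
```

In this sketch, two documents whose dominant events fall under the same parent category share an identifier prefix, which is the kind of structure a generative model could exploit when decoding identifiers token by token.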

