
Multi-Document Keyphrase Extraction: Dataset, Baselines and Review (2110.01073v2)

Published 3 Oct 2021 in cs.CL

Abstract: Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications. In contrast, multi-document keyphrase extraction has been infrequently studied, despite its utility for describing sets of documents, and its use in summarization. Moreover, no prior dataset exists for multi-document keyphrase extraction, hindering the progress of the task. Recent advances in multi-text processing make the task an even more appealing challenge to pursue. To stimulate this pursuit, we present here the first dataset for the task, MK-DUC-01, which can serve as a new benchmark, and test multiple keyphrase extraction baselines on our data. In addition, we provide a brief, yet comprehensive, literature review of the task.

Authors (4)
  1. Ori Shapira (16 papers)
  2. Ramakanth Pasunuru (32 papers)
  3. Ido Dagan (72 papers)
  4. Yael Amsterdamer (11 papers)
Citations (1)
