
DocOIE: A Document-level Context-Aware Dataset for OpenIE (2105.04271v2)

Published 10 May 2021 in cs.CL

Abstract: Open Information Extraction (OpenIE) aims to extract structured relational tuples (subject, relation, object) from sentences and plays a critical role in many downstream NLP applications. Existing solutions perform extraction at the sentence level, without referring to any additional contextual information. In reality, however, a sentence typically exists as part of a document rather than standalone; we often need to access relevant contextual information around the sentence before we can accurately interpret it. As there is no document-level context-aware OpenIE dataset available, we manually annotate 800 sentences from 80 documents in two domains (Healthcare and Transportation) to form the DocOIE dataset for evaluation. In addition, we propose DocIE, a novel document-level context-aware OpenIE model. Our experimental results based on DocIE demonstrate that incorporating document-level context helps improve OpenIE performance. Both the DocOIE dataset and the DocIE model are publicly released.

Authors (5)
  1. Kuicai Dong (17 papers)
  2. Yilin Zhao (17 papers)
  3. Aixin Sun (99 papers)
  4. Jung-Jae Kim (8 papers)
  5. Xiaoli Li (120 papers)
Citations (13)
