REXEL: An End-to-end Model for Document-Level Relation Extraction and Entity Linking (2404.12788v1)
Abstract: Extracting structured information from unstructured text is critical for many downstream NLP applications and is traditionally achieved by closed information extraction (cIE). However, existing approaches to cIE suffer from two limitations: (i) they are often pipelines, which makes them prone to error propagation, and/or (ii) they are restricted to the sentence level, which prevents them from capturing long-range dependencies and results in expensive inference times. We address these limitations by proposing REXEL, a highly efficient and accurate model for the joint task of document-level cIE (DocIE). REXEL performs mention detection, entity typing, entity disambiguation, coreference resolution, and document-level relation classification in a single forward pass to yield facts fully linked to a reference knowledge graph. It is on average 11 times faster than competitive existing approaches in a similar setting and performs competitively both when optimised for any of the individual subtasks and for a variety of combinations of joint tasks, surpassing the baselines by an average of more than 6 F1 points. This combination of speed and accuracy makes REXEL an accurate, cost-efficient system for extracting structured information at web scale. We also release an extension of the DocRED dataset to enable benchmarking of future work on DocIE, available at https://github.com/amazon-science/e2e-docie.
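The abstract describes a single forward pass that produces outputs for all five subtasks jointly. Below is a minimal sketch of what such a joint, multi-head architecture could look like; the class name, head layouts, dimensions, and the use of PyTorch are illustrative assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch of a single-forward-pass DocIE model in the spirit described above.
# All names, dimensions, and head designs are illustrative assumptions, not REXEL's actual code.
import torch
import torch.nn as nn


class DocIESketch(nn.Module):
    def __init__(self, hidden=768, num_bio_tags=3, num_types=10,
                 num_relations=97, num_kb_entities=1000):
        super().__init__()
        # Shared document encoder; a pretrained transformer would be used in practice,
        # a single linear layer stands in here to keep the sketch self-contained.
        self.encoder = nn.Linear(hidden, hidden)
        # One lightweight head per subtask, all evaluated in the same forward pass.
        self.mention_head = nn.Linear(hidden, num_bio_tags)         # mention detection (BIO tagging)
        self.type_head = nn.Linear(hidden, num_types)               # entity typing
        self.kb_embeddings = nn.Embedding(num_kb_entities, hidden)  # entity disambiguation targets
        self.coref_proj = nn.Linear(hidden, hidden)                 # bilinear-style coreference scoring
        self.relation_head = nn.Linear(2 * hidden, num_relations)   # document-level relation classification

    def forward(self, token_embs, mention_embs, entity_pair_embs):
        # In practice, mention and entity-pair representations would be pooled from the
        # encoder output; they are passed in directly here to keep the example short.
        h = self.encoder(token_embs)                                   # (T, hidden)
        bio_logits = self.mention_head(h)                              # (T, num_bio_tags)
        type_logits = self.type_head(mention_embs)                     # (M, num_types)
        ed_scores = mention_embs @ self.kb_embeddings.weight.T         # (M, num_kb_entities)
        coref_scores = mention_embs @ self.coref_proj(mention_embs).T  # (M, M) mention-pair scores
        rel_logits = self.relation_head(entity_pair_embs)              # (P, num_relations)
        return bio_logits, type_logits, ed_scores, coref_scores, rel_logits


# Toy usage: a 12-token document with 4 mentions and 6 candidate entity pairs.
model = DocIESketch()
outputs = model(torch.randn(12, 768), torch.randn(4, 768), torch.randn(6, 2 * 768))
```

Sharing a single encoder across the heads is what makes the one-pass setup cheap relative to a pipeline: each subtask only adds a small projection on top of the shared document representation.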
Authors: Nacime Bouziani, Shubhi Tyagi, Joseph Fisher, Jens Lehmann, Andrea Pierleoni