KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation

Published 13 Nov 2019 in cs.CL | (1911.06136v3)

Abstract: Pre-trained language representation models (PLMs) cannot well capture factual knowledge from text. In contrast, knowledge embedding (KE) methods can effectively represent the relational facts in knowledge graphs (KGs) with informative entity embeddings, but conventional KE models cannot take full advantage of the abundant textual information. In this paper, we propose a unified model for Knowledge Embedding and Pre-trained LanguagE Representation (KEPLER), which can not only better integrate factual knowledge into PLMs but also produce effective text-enhanced KE with the strong PLMs. In KEPLER, we encode textual entity descriptions with a PLM as their embeddings, and then jointly optimize the KE and language modeling objectives. Experimental results show that KEPLER achieves state-of-the-art performances on various NLP tasks, and also works remarkably well as an inductive KE model on KG link prediction. Furthermore, for pre-training and evaluating KEPLER, we construct Wikidata5M, a large-scale KG dataset with aligned entity descriptions, and benchmark state-of-the-art KE methods on it. It shall serve as a new KE benchmark and facilitate the research on large KG, inductive KE, and KG with text. The source code can be obtained from https://github.com/THU-KEG/KEPLER.


Authors (7)

Citations (607)


Summary

The paper introduces KEPLER, a unified model that simultaneously optimizes knowledge embedding and masked language modeling objectives to enhance both factual recall and language understanding.
It integrates entity descriptions from knowledge graphs into a Transformer-based PLM, yielding state-of-the-art results in relation classification, few-shot learning, and entity typing.
The approach demonstrates that joint optimization can bridge factual gaps in PLMs while retaining robust linguistic representations, paving the way for future enhancements in knowledge integration.


The paper presents KEPLER, a unified model that integrates knowledge embedding (KE) with pre-trained language models (PLMs) to improve both language representation and factual knowledge recall. It addresses a limitation of PLMs such as BERT and RoBERTa, which perform well on linguistic tasks but capture factual knowledge poorly. Conversely, conventional KE models represent relational facts from knowledge graphs (KGs) well but cannot take full advantage of the abundant textual information.

Methodology

KEPLER bridges the gap between PLMs and KE by bringing factual knowledge into the PLM. It does so by encoding the textual description of each KG entity with the PLM and using the result as that entity's embedding, then optimizing these embeddings jointly with the usual language modeling objective.

Specifically, KEPLER combines two key objectives in its framework:

Knowledge Embedding (KE) Objective: This component uses entity descriptions from KGs to generate embeddings and employs a scoring function akin to TransE to train these embeddings effectively.
Masked Language Modeling (MLM) Objective: Retaining this objective from traditional PLMs ensures that KEPLER's language representations remain robust and contextually aware.

KEPLER encodes entities and text into a unified semantic space using a Transformer model, maintaining the model structure of RoBERTa to avoid additional inference complexity.
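As a rough illustration of how the two objectives might combine, the sketch below computes a TransE-style distance and a negative-sampling KE loss over toy vectors, then adds the result to an MLM loss value. The vectors stand in for PLM encodings of entity descriptions. The function names, the margin `gamma`, the L1 norm, the particular negative-sampling form, and the unweighted sum are assumptions made for illustration, not details stated in this summary.

```python
import math


def transe_distance(h, r, t, p=1):
    """TransE-style distance ||h + r - t||_p; smaller means more plausible."""
    return sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def ke_loss(positive, negatives, gamma=4.0):
    """Negative-sampling loss for one true triple and its corrupted copies.

    `positive` is an (h, r, t) tuple of vectors; `negatives` is a list of
    such tuples. `gamma` is an assumed margin hyperparameter.
    """
    h, r, t = positive
    # Push the true triple's distance below the margin.
    loss = -log_sigmoid(gamma - transe_distance(h, r, t))
    # Push each corrupted triple's distance above the margin.
    for nh, nr, nt in negatives:
        loss -= log_sigmoid(transe_distance(nh, nr, nt) - gamma) / len(negatives)
    return loss


def joint_loss(ke, mlm):
    """Assumed unweighted sum of the KE and MLM losses."""
    return ke + mlm


# Toy vectors standing in for encoded entity descriptions (made up).
true_triple = ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])   # h + r == t
false_triple = ([1.0, 0.0], [0.0, 1.0], [5.0, 5.0])  # far from h + r

good = ke_loss(true_triple, [false_triple])
bad = ke_loss(false_triple, [true_triple])
total = joint_loss(good, mlm=2.0)  # 2.0 is a placeholder MLM loss value
```

In this toy setup the loss is small when the true triple satisfies h + r ≈ t and the corrupted triple does not, and large when the roles are swapped. In the actual model, gradients from both terms would flow into the same Transformer encoder.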

Experimental Evaluation

KEPLER was evaluated on various NLP tasks and knowledge integration scenarios, demonstrating its ability to incorporate factual knowledge without compromising language understanding capability.

NLP Tasks: KEPLER achieved state-of-the-art results on several challenging datasets, including TACRED for relation classification, FewRel for few-shot relation classification, and OpenEntity for entity typing.
Knowledge Embedding Tasks: On tasks such as link prediction in knowledge graphs, KEPLER showed enhanced capability, especially in an inductive setting where unseen entities are involved.
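Link prediction with a TransE-style score can be sketched as ranking candidate tail entities by distance to h + r. In the inductive setting, the vector for an unseen entity would come from encoding its description rather than from a lookup table, so the same ranking applies. The vectors, entity names, and function names below are made up for illustration.

```python
def transe_distance(h, r, t):
    """L1 TransE-style distance ||h + r - t||_1; smaller means more plausible."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def rank_tails(head_vec, rel_vec, candidates):
    """Return candidate names sorted from most to least plausible tail.

    `candidates` maps an entity name to its embedding vector.
    """
    return sorted(
        candidates,
        key=lambda name: transe_distance(head_vec, rel_vec, candidates[name]),
    )


# Toy embeddings; for unseen entities these would be produced by encoding
# their text descriptions (hypothetical values here).
head = [1.0, 0.0]
relation = [0.0, 1.0]
candidates = {
    "entity_a": [1.0, 1.0],
    "entity_b": [3.0, 0.0],
    "entity_c": [1.5, 1.0],
}

ranking = rank_tails(head, relation, candidates)
```

Here `ranking` places the candidate closest to `head + relation` first; an evaluation would then check where the true tail lands in that ordering.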

The paper also introduces Wikidata5M, a large-scale knowledge graph dataset aligned with entity descriptions to serve as a comprehensive benchmark for testing such models.

Results and Implications

KEPLER’s performance indicates that joint optimization of KE and MLM objectives can enhance a PLM’s ability to recall factual knowledge while maintaining linguistic robustness. The integration of KE with PLMs opens up new possibilities for building models that efficiently leverage both structured and unstructured data.

Future Directions

The suggested future work includes:

Exploring more sophisticated KE methods to enhance KEPLER's knowledge representation capabilities without increasing complexity.
Developing better knowledge probing methodologies to accurately assess the model's knowledge retention and retrieval capabilities across diverse factual datasets.

In summary, KEPLER shows that a single Transformer encoder trained jointly on KE and MLM objectives can serve both as a knowledge-enhanced PLM and as a text-based, inductive KE model, which makes it a useful framework for applications that draw on both textual and structured data.
