Text Generation from Knowledge Graphs with Graph Transformers

(1904.02342)
Published Apr 4, 2019 in cs.CL

Abstract

Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce. In this work, we address the problem of generating coherent multi-sentence texts from the output of an information extraction system, and in particular a knowledge graph. Graphical knowledge representations are ubiquitous in computing, but pose a significant challenge for text generation techniques due to their non-hierarchical nature, collapsing of long-distance dependencies, and structural variety. We introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing linearization or hierarchical constraints. Incorporated into an encoder-decoder setup, we provide an end-to-end trainable system for graph-to-text generation that we apply to the domain of scientific text. Automatic and human evaluations show that our technique produces more informative texts which exhibit better document structure than competitive encoder-decoder methods.

Overview

  • The paper introduces a novel approach for generating coherent multi-sentence scientific texts from knowledge graphs using a new graph-transforming encoder in an encoder-decoder setup.

  • The encoder, based on the Transformer architecture, leverages both global and local context within non-hierarchical knowledge graphs to inform each node.

  • The GraphWriter model preserves label information and includes a global context node to improve information flow within the graph.

  • Empirical evaluation shows that GraphWriter outperforms other methods, producing more coherent and informative text from knowledge graphs derived from scientific abstracts.

  • The study suggests future research directions for reducing redundancies and improving entity coverage in generated texts, with the AGENDA dataset laying the groundwork for further exploration.

Introduction

The domain of text generation from structured representations such as knowledge graphs has seen significant progress driven by new neural network architectures. In traditional text-generation tasks, maintaining coherence and relevance across multiple sentences poses substantial difficulties. Research by Koncel-Kedziorski et al. addresses these complexities by proposing a novel technique for generating coherent multi-sentence scientific texts from knowledge graphs.

Methodology

Their method employs an encoder-decoder setup in which a novel graph-transforming encoder takes center stage. This encoder adapts the influential Transformer architecture to the realm of graphs, deriving input from a knowledge graph without imposing linearization or hierarchical structure. This architectural choice enables the model, named GraphWriter, to leverage global and local contextual information within the graphs, which are intrinsically non-hierarchical and collapse long-distance dependencies into local connections.
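To make the idea concrete, here is a minimal sketch, not the authors' implementation, of a single self-attention head whose attention is masked by the graph's adjacency, so each node is updated from its neighbors rather than from a linearized token sequence. The function name, shapes, and random projection weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def graph_attention_head(node_states, adjacency):
    """Illustrative single self-attention head restricted to graph neighborhoods.

    node_states: (N, d) float tensor of node embeddings.
    adjacency:   (N, N) bool tensor, True where an edge exists.
    """
    n, d = node_states.shape
    # Illustrative projections; a real model would learn these weights.
    w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))

    q, k, v = node_states @ w_q, node_states @ w_k, node_states @ w_v
    scores = (q @ k.T) / d ** 0.5                      # (N, N) scaled dot products

    # Add self-loops so every node attends to at least itself,
    # then mask out all non-neighbor pairs.
    mask = adjacency | torch.eye(n, dtype=torch.bool)
    scores = scores.masked_fill(~mask, float("-inf"))

    attn = F.softmax(scores, dim=-1)                   # attention over neighbors only
    return attn @ v                                    # (N, d) contextualized node states
```

Because the mask comes from the graph itself, no linear ordering of nodes is ever imposed, which is the key departure from sequence-based encoders.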

In essence, GraphWriter integrates the self-attention mechanism, allowing it to contextually inform each node (entity or relation) in the knowledge graph based on its immediate connections and the global structure. The method preserves edge label information by converting each labeled edge into relation nodes, yielding an unlabeled bipartite graph, and introduces a global context node to encourage information flow across distant parts of the graph.
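As a rough illustration of this relabeling step, the sketch below converts (head, relation, tail) triples into an unlabeled graph with dedicated relation nodes, reversed counterparts, and a global context node. The function name and the node-naming scheme are hypothetical, not taken from the paper.

```python
def to_unlabeled_bipartite(entities, triples):
    """Promote each labeled edge to a pair of relation nodes (forward and
    reverse) and attach a global context node to every entity.

    entities: list of entity names.
    triples:  list of (head, relation, tail) tuples.
    Returns (nodes, edges) for the transformed, unlabeled graph.
    """
    nodes = list(entities) + ["<GLOBAL>"]
    edges = [("<GLOBAL>", e) for e in entities]   # global node reaches all entities
    for i, (head, rel, tail) in enumerate(triples):
        fwd, rev = f"{rel}_{i}", f"{rel}_{i}^-1"  # unique node per edge occurrence
        nodes += [fwd, rev]
        edges += [(head, fwd), (fwd, tail),       # head -> relation -> tail
                  (tail, rev), (rev, head)]       # and the reversed direction
    return nodes, edges

# Example: one USED-FOR triple becomes two relation nodes plus global links.
nodes, edges = to_unlabeled_bipartite(
    ["GraphWriter", "text generation"],
    [("GraphWriter", "USED-FOR", "text generation")])
```

Promoting labels to nodes lets the same attention machinery operate uniformly over entities and relations, while the global node gives every entity a two-hop path to every other.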

Experiments and Evaluation

The effectiveness of the proposed approach is tested on a collection of scientific abstracts paired with corresponding knowledge graphs. These graphs are constructed with a state-of-the-art information extraction system that identifies entities and their interrelations. The globally aware encoder and the attention-based decoder with a copy mechanism together form an end-to-end trainable system that generates informative and well-structured scientific text.
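The copy mechanism can be sketched in pointer-generator style: a learned switch mixes a distribution over the output vocabulary with a distribution over input entities. The snippet below is an illustrative approximation under assumed names and shapes, not the paper's exact formulation.

```python
import torch

def mixed_output_distribution(vocab_logits, entity_scores, switch_logit):
    """Mix vocabulary generation with entity copying at one decoding step.

    vocab_logits:  (V,) scores over the output vocabulary.
    entity_scores: (E,) attention scores over input entities.
    switch_logit:  scalar tensor; its sigmoid is the probability of copying.
    """
    p_copy = torch.sigmoid(switch_logit)
    vocab_dist = torch.softmax(vocab_logits, dim=-1)   # generate a word
    copy_dist = torch.softmax(entity_scores, dim=-1)   # copy an input entity
    # Final distribution over vocabulary tokens followed by entity slots.
    return torch.cat([(1 - p_copy) * vocab_dist, p_copy * copy_dist])
```

Copying is what lets the decoder reproduce rare scientific terms, which appear in the input graph but would be out-of-vocabulary for a plain softmax.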

The authors subject GraphWriter to rigorous human and automatic evaluations, highlighting its superior performance against several baselines, including a Graph Attention Network encoder and models that do not use the knowledge graph at all. They demonstrate quantitatively and qualitatively that incorporating structured knowledge into the text generation process enhances both the coherence and the informativeness of the generated texts.

Conclusions

This work opens up promising directions for future research. It paves the way for further exploration into eliminating redundancies and improving entity coverage in generated texts. The novel GraphWriter model, along with its accompanying dataset, AGENDA, establishes a new frontier in the graph-to-text generation landscape and provides a credible platform for subsequent research on document-plan-based text generation.
