GPT-GNN: Generative Pre-Training of Graph Neural Networks (2006.15437v1)

Published 27 Jun 2020 in cs.LG, cs.SI, and stat.ML

Abstract: Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs usually requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task to pre-train a GNN so that it can capture the structural and semantic properties of the graph. We factorize the likelihood of the graph generation into two components: 1) Attribute Generation and 2) Edge Generation. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on the billion-scale Open Academic Graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.

Authors (5)
  1. Ziniu Hu (51 papers)
  2. Yuxiao Dong (119 papers)
  3. Kuansan Wang (18 papers)
  4. Kai-Wei Chang (292 papers)
  5. Yizhou Sun (149 papers)
Citations (498)

Summary

  • The paper introduces GPT-GNN, a framework that pre-trains GNNs by generating both node attributes and graph structure, reducing reliance on labeled data.
  • It employs a permutation-based generative strategy that effectively captures complex semantic and structural graph information.
  • Empirical results on billion-scale datasets show gains of up to 9.1% over GNNs trained without pre-training across a range of downstream tasks.

GPT-GNN: Generative Pre-Training of Graph Neural Networks

The advancement of Graph Neural Networks (GNNs) has significantly influenced the modeling of graph-structured data. However, the training of GNNs traditionally relies on task-specific labeled data, which are often costly and difficult to acquire. The paper "GPT-GNN: Generative Pre-Training of Graph Neural Networks" addresses this challenge by introducing a generative pre-training framework for GNNs that leverages unlabeled data, thereby mitigating the dependency on extensive labeled datasets.

Key Contributions

The central contribution of this work is the development of the GPT-GNN framework, which aims to pre-train GNNs through a self-supervised graph generation task. This task involves two components: Attribute Generation and Edge Generation. By learning to generate both node attributes and graph structure, GPT-GNN captures the dependencies between node attributes and graph structure, which are critical for effective graph representation.
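
Read as equations, this amounts to generating each new node's attributes first and then its edges, conditioned on the graph generated so far. The following is a simplified sketch of that split, not the paper's exact conditional structure (which, for example, also distinguishes observed from generated edges):

```latex
% Simplified per-node decomposition: node i first generates its attributes X_i,
% then its edges E_i, given the partially generated graph (X_{<i}, E_{<i}).
p_\theta(X_i, E_i \mid X_{<i}, E_{<i})
  \approx p_\theta(X_i \mid X_{<i}, E_{<i}) \,
          p_\theta(E_i \mid X_{\le i}, E_{<i})
```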

The paper demonstrates that such a generative pre-training strategy significantly enhances the performance of GNNs across various downstream tasks. Empirical evidence from experiments conducted on billion-scale datasets like the Open Academic Graph and Amazon recommendation data indicates that GPT-GNN achieves up to 9.1% improvement over state-of-the-art GNN models that do not utilize pre-training.

Generative Pre-Training Approach

GPT-GNN leverages a permutation-based generative strategy in which the joint likelihood of node attributes and graph structure is factorized, under a sampled node ordering, into attribute generation and edge generation steps. By modeling the graph distribution through this two-step generative process, GPT-GNN encodes both the semantic and structural characteristics of the graph.
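
As an illustrative rendering of this factorization (simplified notation, not copied from the paper), the graph likelihood can be written autoregressively over nodes under a sampled ordering π, with X and E denoting node attributes and edges:

```latex
% Permutation-based autoregressive factorization (illustrative form):
% sample a node ordering \pi, then generate nodes one at a time.
p_\theta(X, E) = \mathbb{E}_{\pi}\!\left[ \prod_{i=1}^{|V|}
    p_\theta\big(X^{\pi}_i, E^{\pi}_i \mid X^{\pi}_{<i}, E^{\pi}_{<i}\big) \right]
```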

The generative pre-training learns node representations that predict not only the attributes of nodes but also their connections, a task akin to masked language modeling in NLP. This approach makes the pre-trained GNN more robust and allows it to adapt to different downstream tasks with minimal labeled data.
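
To make the two objectives concrete, the snippet below is a minimal PyTorch-style sketch of how an attribute-generation loss and an edge-generation loss could be combined on top of a generic GNN encoder. It is an illustration under simplifying assumptions, not the authors' implementation: `encoder`, `attr_head`, `masked_nodes`, `pos_edges`, and `neg_edges` are hypothetical placeholders, attribute generation is reduced to feature reconstruction, and edge generation to dot-product scoring with negative sampling.

```python
import torch
import torch.nn.functional as F


def gpt_gnn_pretrain_loss(encoder, attr_head, x, edge_index,
                          masked_nodes, pos_edges, neg_edges):
    """Illustrative joint pre-training loss, not the official GPT-GNN code.

    encoder      : any GNN mapping (node features, edge_index) to node embeddings
    attr_head    : MLP mapping embeddings back to the input feature space
    masked_nodes : indices of nodes whose attributes are hidden and must be generated
    pos_edges    : (2, P) tensor of real edges to predict (edge-generation targets)
    neg_edges    : (2, N) tensor of sampled non-edges used as negatives
    """
    # Hide the attributes of the nodes we want to generate.
    x_in = x.clone()
    x_in[masked_nodes] = 0.0

    h = encoder(x_in, edge_index)  # node embeddings from the partially masked graph

    # 1) Attribute generation: reconstruct the hidden node features.
    attr_loss = F.mse_loss(attr_head(h[masked_nodes]), x[masked_nodes])

    # 2) Edge generation: score real edges above sampled negatives.
    def score(edges):
        return (h[edges[0]] * h[edges[1]]).sum(dim=-1)

    logits = torch.cat([score(pos_edges), score(neg_edges)])
    labels = torch.cat([torch.ones(pos_edges.size(1), device=h.device),
                        torch.zeros(neg_edges.size(1), device=h.device)])
    edge_loss = F.binary_cross_entropy_with_logits(logits, labels)

    return attr_loss + edge_loss
```

In the actual framework the two steps are ordered and conditioned on one another rather than computed independently, and node and edge heterogeneity is handled explicitly; the sketch only conveys the overall shape of the joint objective.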

Experimental Validation

The paper provides a thorough experimental evaluation across both heterogeneous and homogeneous graph datasets. For instance, on the Open Academic Graph, tasks such as paper-field classification and author name disambiguation show notable improvements after pre-training. Furthermore, the evaluation of different transfer scenarios, such as time transfer and field transfer, demonstrates the flexibility and generalizability of the GPT-GNN model.

Implications and Future Directions

This work opens up new avenues for improving GNN efficiency and efficacy by reducing reliance on labeled data. From a practical standpoint, GPT-GNN can expedite advancements in fields like recommendation systems, bioinformatics, and social network analysis, where labeled data are scarce.

Theoretically, GPT-GNN provides a framework that can be extended to other types of neural networks where input data have an inherent graph structure. Future developments might explore integrating GPT-GNN with even more diverse forms of graph data or enhance the pre-training tasks to capture additional semantic nuances.

Additionally, the comparative analysis with other pre-training approaches such as Graph Infomax and GAE positions GPT-GNN as a superior alternative for tasks requiring deep semantic understanding of graph datasets.

In conclusion, GPT-GNN represents a significant stride in generative pre-training for graph neural networks, proposing a robust mechanism for initializing GNNs that renders them not only less dependent on labeled samples but also more adept at generalizing across varied domains. The implications of such advancements reach well beyond traditional graph applications, promising improvements in machine learning tasks that operate on complex, interconnected datasets.