SimMLP: Training MLPs on Graphs without Supervision (2402.08918v3)
Abstract: Graph Neural Networks (GNNs) have demonstrated their effectiveness in various graph learning tasks, yet their reliance on neighborhood aggregation during inference poses challenges for deployment in latency-sensitive applications, such as real-time financial fraud detection. To address this limitation, recent studies have proposed distilling knowledge from teacher GNNs into student Multi-Layer Perceptrons (MLPs) trained on node content, aiming to accelerate inference. However, these approaches often fail to adequately exploit structural information when inferring unseen nodes. To bridge this gap, we introduce SimMLP, a Self-supervised framework for learning MLPs on graphs, designed to fully integrate rich structural information into MLPs. Notably, SimMLP is the first MLP-learning method that can achieve equivalence to GNNs in the optimal case. The key idea is to employ self-supervised learning to align the representations encoded by graph context-aware GNNs and neighborhood dependency-free MLPs, thereby fully integrating structural information into MLPs. We provide a comprehensive theoretical analysis, demonstrating the equivalence between SimMLP and GNNs based on mutual information and inductive bias, and highlighting SimMLP's advanced structural learning capabilities. Additionally, we conduct extensive experiments on 20 benchmark datasets, covering node classification, link prediction, and graph classification, to showcase SimMLP's superiority over state-of-the-art baselines, particularly in scenarios involving unseen nodes (e.g., inductive and cold-start node classification) where structural insights are crucial. Our code is available at: https://github.com/Zehong-Wang/SimMLP.
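The core idea stated in the abstract, aligning a structure-aware GNN encoder with a structure-free MLP encoder through a self-supervised objective so that the MLP absorbs structural signal, can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the encoder architectures, the cosine alignment loss, and the names `MLPEncoder`, `SimpleGNN`, and `align_loss` are all hypothetical; the actual method is in the linked repository.

```python
# Minimal sketch of GNN-to-MLP representation alignment (illustrative only).
# The GNN sees node features plus structure; the MLP sees features only; a
# self-supervised loss pulls their node representations together.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLPEncoder(nn.Module):
    """Structure-free encoder: operates on node features alone."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                 nn.Linear(hid_dim, hid_dim))

    def forward(self, x):
        return self.net(x)

class SimpleGNN(nn.Module):
    """One mean-aggregation message-passing step followed by a linear layer."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)
        self.out = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj):
        # adj: dense row-normalized adjacency of shape (N, N)
        h = F.relu(self.lin(adj @ x))   # aggregate neighbor features
        return self.out(h)

def align_loss(z_mlp, z_gnn):
    """Cosine alignment between the MLP and GNN views of the same node."""
    z_mlp = F.normalize(z_mlp, dim=-1)
    z_gnn = F.normalize(z_gnn, dim=-1)
    return -(z_mlp * z_gnn).sum(dim=-1).mean()

# Toy usage: 5 nodes, 8 features, a random row-normalized adjacency.
x = torch.randn(5, 8)
adj = torch.rand(5, 5)
adj = adj / adj.sum(dim=1, keepdim=True)
mlp, gnn = MLPEncoder(8, 16), SimpleGNN(8, 16)
loss = align_loss(mlp(x), gnn(x, adj))
loss.backward()   # after training, only the MLP would be kept for inference
```

The design choice this sketch highlights is the one described in the abstract: structure enters only through the GNN branch during training, so at inference the MLP needs no neighborhood aggregation and can serve latency-sensitive or cold-start settings.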