Learning by Sampling and Compressing: Efficient Graph Representation Learning with Extremely Limited Annotations (2003.06100v2)

Published 13 Mar 2020 in cs.LG, cs.SI, and stat.ML

Abstract: Graph convolution network (GCN) attracts intensive research interest with broad applications. While existing work mainly focused on designing novel GCN architectures for better performance, few of them studied a practical yet challenging problem: How to learn GCNs from data with extremely limited annotation? In this paper, we propose a new learning method by sampling strategy and model compression to overcome this challenge. Our approach has multifold advantages: 1) the adaptive sampling strategy largely suppresses the GCN training deviation over uniform sampling; 2) compressed GCN-based methods with a smaller scale of parameters need fewer labeled data to train; 3) the smaller scale of training data is beneficial to reduce the human resource cost to label them. We choose six popular GCN baselines and conduct extensive experiments on three real-world datasets. The results show that by applying our method, all GCN baselines cut down the annotation requirement by as much as 90$\%$ and compress the scale of parameters more than 6$\times$ without sacrificing their strong performance. It verifies that the training method could extend the existing semi-supervised GCN-based methods to the scenarios with the extremely small scale of labeled data.