Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
124 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization (2007.05100v2)

Published 9 Jul 2020 in cs.LG and stat.ML

Abstract: With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) win lots of attention from the research and industry field because of their high accuracy. However, existing GNNs suffer from high memory footprints (e.g., node embedding features). This high memory footprint hurdles the potential applications towards memory-constrained devices, such as the widely-deployed IoT devices. To this end, we propose a specialized GNN quantization scheme, SGQuant, to systematically reduce the GNN memory consumption. Specifically, we first propose a GNN-tailored quantization algorithm design and a GNN quantization fine-tuning scheme to reduce memory consumption while maintaining accuracy. Then, we investigate the multi-granularity quantization strategy that operates at different levels (components, graph topology, and layers) of GNN computation. Moreover, we offer an automatic bit-selecting (ABS) to pinpoint the most appropriate quantization bits for the above multi-granularity quantizations. Intensive experiments show that SGQuant can effectively reduce the memory footprint from 4.25x to 31.9x compared with the original full-precision GNNs while limiting the accuracy drop to 0.4% on average.

Citations (44)

Summary

We haven't generated a summary for this paper yet.