Cross-Channel Intragroup Sparsity Neural Network (1910.11971v2)
Abstract: Modern deep neural networks rely on overparameterization to achieve state-of-the-art generalization, but overparameterized models are computationally expensive. Network pruning is often employed to obtain less demanding models for deployment. Fine-grained pruning removes individual weights in parameter tensors and can achieve a high model compression ratio with little accuracy degradation. However, it introduces irregularity into the computing dataflow and often does not improve inference efficiency in practice. Coarse-grained pruning removes network weights in groups, e.g. an entire filter; it realizes satisfactory inference speedup but often leads to significant accuracy degradation. This work introduces the cross-channel intragroup (CCI) sparsity structure, which avoids the inference inefficiency of fine-grained pruning while maintaining outstanding model performance. We then present a novel training algorithm designed to perform well under the constraint imposed by CCI-Sparsity. Through a series of comparative experiments, we show that the proposed CCI-Sparsity structure and the corresponding pruning algorithm outperform prior art in inference efficiency by a substantial margin, provided that suitable hardware acceleration becomes available.
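To make the contrast between the two baseline granularities concrete, here is a minimal pure-Python sketch. It is not the paper's CCI-Sparsity structure or its training algorithm, neither of which the abstract specifies. The function names `fine_grained_prune` and `filter_prune`, the magnitude-based selection criterion, and the toy 4x4 weight matrix (one row per filter) are all assumptions made for illustration.

```python
def fine_grained_prune(weights, sparsity):
    """Zero the smallest-magnitude individual weights (assumed criterion)."""
    # Rank every weight by absolute value, remembering its position.
    entries = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    k = int(len(entries) * sparsity)  # number of weights to remove
    pruned = [list(row) for row in weights]
    for _, i, j in entries[:k]:
        pruned[i][j] = 0.0
    return pruned


def filter_prune(weights, sparsity):
    """Zero whole filters (rows) with the smallest L1 norm (assumed criterion)."""
    norms = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(weights))
    k = int(len(weights) * sparsity)  # number of filters to remove
    dropped = {i for _, i in norms[:k]}
    return [
        [0.0] * len(row) if i in dropped else list(row)
        for i, row in enumerate(weights)
    ]


# Hypothetical toy layer: 4 filters with 4 weights each.
weights = [
    [0.9, -0.1, 0.4, 0.05],
    [0.2, -0.8, 0.03, 0.6],
    [0.07, 0.3, -0.5, 0.01],
    [-0.7, 0.02, 0.15, 0.35],
]

fine = fine_grained_prune(weights, 0.5)
coarse = filter_prune(weights, 0.5)

for name, result in (("fine-grained", fine), ("filter-level", coarse)):
    print(name)
    for row in result:
        print("  ", row)
```

Both results have the same number of zeros, but the fine-grained zeros sit at different positions in each row, which is the irregularity the abstract refers to, while the filter-level result discards entire rows and with them some large weights.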