MCA: Moment Channel Attention Networks (2403.01713v1)

Published 4 Mar 2024 in cs.CV

Abstract: Channel attention mechanisms endeavor to recalibrate channel weights to enhance the representational ability of networks. However, mainstream methods often rely solely on global average pooling as the feature squeezer, which significantly limits the overall potential of models. In this paper, we investigate the statistical moments of feature maps within a neural network. Our findings highlight the critical role of high-order moments in enhancing model capacity. Consequently, we introduce a flexible and comprehensive mechanism termed Extensive Moment Aggregation (EMA) to capture the global spatial context. Building upon this mechanism, we propose the Moment Channel Attention (MCA) framework, which efficiently incorporates multiple levels of moment-based information while minimizing additional computation costs through our Cross Moment Convolution (CMC) module. The CMC module uses a channel-wise convolution layer to capture multiple-order moment information as well as cross-channel features. The MCA block is designed to be lightweight and easily integrated into a variety of neural network architectures. Experimental results on classical image classification, object detection, and instance segmentation tasks demonstrate that our proposed method achieves state-of-the-art performance, outperforming existing channel attention methods.
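
To make the mechanism concrete, below is a minimal sketch of moment-based channel attention in PyTorch. It is an illustration, not the authors' implementation: each channel is squeezed with its mean (first moment) and variance (second central moment) as stand-ins for the paper's Extensive Moment Aggregation, and the two statistics are fused with a lightweight channel-wise 1D convolution standing in for the Cross Moment Convolution module. The class name MomentChannelAttention, the choice of moment orders, and the kernel size are all assumptions made for the example.

import torch
import torch.nn as nn

class MomentChannelAttention(nn.Module):
    """Illustrative moment-based channel attention (not the paper's exact MCA)."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        # A shared 1D convolution mixes the two moment statistics across
        # neighboring channels: (B, 2 moments, C) -> (B, 1, C).
        self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        flat = x.flatten(2)                      # (B, C, H*W)
        mean = flat.mean(dim=2)                  # first moment per channel
        var = flat.var(dim=2, unbiased=False)    # second central moment per channel
        stats = torch.stack([mean, var], dim=1)  # (B, 2, C)
        attn = self.sigmoid(self.conv(stats))    # (B, 1, C) channel weights
        return x * attn.view(b, c, 1, 1)         # recalibrate the feature map

if __name__ == "__main__":
    block = MomentChannelAttention()
    feats = torch.randn(4, 64, 32, 32)
    print(block(feats).shape)  # torch.Size([4, 64, 32, 32])

The key design point is that this squeeze step carries strictly more information than global average pooling alone; the paper's actual EMA aggregates additional moment orders, and its CMC module combines them through cross-channel convolution rather than the simple two-statistic fusion shown here.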

Authors (5)
  1. Yangbo Jiang (1 paper)
  2. Zhiwei Jiang (24 papers)
  3. Le Han (3 papers)
  4. Zenan Huang (8 papers)
  5. Nenggan Zheng (16 papers)
Citations (1)
