Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context (2311.17524v2)

Published 29 Nov 2023 in cs.CV

Abstract: Pixel-wise predictions are required in a wide variety of tasks such as image restoration, image segmentation, or disparity estimation. Common models involve several stages of data resampling, in which the resolution of feature maps is first reduced to aggregate information and then increased to generate a high-resolution output. Previous works have shown that resampling operations are subject to artifacts such as aliasing. During downsampling, aliases have been shown to compromise the prediction stability of image classifiers; during upsampling, they have been leveraged to detect generated content. However, the effect of aliases during upsampling has not yet been discussed with respect to the stability and robustness of pixel-wise predictions. While falling under the same term (aliasing), the challenges of correct upsampling in neural networks differ significantly from those of downsampling: when downsampling, some high frequencies cannot be correctly represented and have to be removed to avoid aliases. When upsampling for pixel-wise predictions, however, we actually require the model to restore high frequencies that cannot be encoded at lower resolutions. Applying findings from signal processing is therefore a necessary but not a sufficient condition for achieving the desired output. In contrast, we find that the availability of large spatial context during upsampling allows the model to provide stable, high-quality pixel-wise predictions, even when all filter weights are fully learned.
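
The abstract's distinction between aliasing during downsampling and the spectral replicas introduced during upsampling can be made concrete with a short, self-contained NumPy sketch. This is purely illustrative and not the paper's implementation: the test signal, the zero-insertion upsampler, and the fixed box filters below are assumptions chosen only to show that an interpolation filter with larger spatial support suppresses the high-frequency replicas created by upsampling more effectively than a narrow one.

```python
import numpy as np


def upsample_zero_insert(x, factor=2):
    """Upsample by inserting zeros between samples (the first step of a
    transposed convolution). This compresses the original spectrum and
    creates high-frequency replicas of the low-resolution content."""
    up = np.zeros(len(x) * factor)
    up[::factor] = x
    return up


def replica_energy(signal, factor=2):
    """Fraction of spectral energy in the band where upsampling replicas
    appear (for factor=2, the upper half of the spectrum)."""
    spec = np.abs(np.fft.rfft(signal))
    cut = len(spec) // factor
    return np.sum(spec[cut:] ** 2) / np.sum(spec ** 2)


rng = np.random.default_rng(0)
# A smooth 1-D stand-in for one row of a low-resolution feature map.
x = np.convolve(rng.standard_normal(128), np.ones(8) / 8, mode="same")

up = upsample_zero_insert(x)

# Fixed interpolation filters of different spatial extent (illustrative only;
# the paper studies fully learned upsampling filters).
narrow = np.ones(2) / 2   # small spatial context
wide = np.ones(8) / 8     # larger spatial context

print("replica energy, zero insertion only  :", replica_energy(up))
print("replica energy, narrow (2-tap) filter:",
      replica_energy(np.convolve(up, narrow, mode="same")))
print("replica energy, wide (8-tap) filter  :",
      replica_energy(np.convolve(up, wide, mode="same")))
```

Running the sketch, the fraction of spectral energy in the replica band should shrink as the filter widens, a one-dimensional analogue of the abstract's claim that larger spatial context during upsampling helps avoid spectral artifacts; the paper's setting differs in that the upsampling filters are fully learned rather than fixed.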
