LADDER: An Efficient Framework for Video Frame Interpolation (2404.11108v1)
Abstract: Video Frame Interpolation (VFI) is a crucial technique in applications such as slow-motion generation, frame rate conversion, and video frame restoration. This paper introduces an efficient video frame interpolation framework that aims to strike a favorable balance between efficiency and quality. Our framework follows the general paradigm of a flow estimator followed by a refinement module, while incorporating carefully designed components. First, we adopt depth-wise convolution with large kernels in the flow estimator, which simultaneously reduces the parameter count and enlarges the receptive field for encoding rich context and handling complex motion. Second, diverging from the common UNet-style (encoder-decoder) design for the refinement module, which we find redundant, our decoder-only refinement module directly enhances the result from coarse to fine features, offering a more efficient process. In addition, to address the challenge of handling high-definition frames, we introduce an HD-aware augmentation strategy during training, leading to consistent improvement on HD images. Extensive experiments are conducted on diverse datasets: Vimeo90K, UCF101, Xiph, and SNU-FILM. The results demonstrate that our approach achieves state-of-the-art performance with clear improvement while requiring far fewer FLOPs and parameters, reaching a better balance between efficiency and quality.
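The abstract's claim that depth-wise convolution with large kernels reduces parameters while enlarging the receptive field can be checked with simple arithmetic. The sketch below is illustrative only: the channel width (64), kernel size (7), layer count (3), and the pairing of the depth-wise layer with a 1x1 point-wise convolution are assumptions made for the example, not values or design details taken from the paper.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weights in a standard k x k convolution: every output channel sees every input channel."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_params(channels, k, bias=False):
    """Weights in a depth-wise k x k convolution: one k x k filter per channel."""
    return channels * k * k + (channels if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Depth-wise k x k convolution followed by a 1x1 point-wise convolution (assumed pairing)."""
    return depthwise_params(c_in, k, bias) + conv_params(c_in, c_out, 1, bias)


def receptive_field(num_layers, k):
    """Receptive field of num_layers stacked stride-1 convolutions with kernel size k."""
    return 1 + num_layers * (k - 1)


if __name__ == "__main__":
    channels, kernel = 64, 7  # hypothetical sizes chosen for illustration
    standard = conv_params(channels, channels, kernel)
    separable = depthwise_separable_params(channels, channels, kernel)
    print(f"standard {kernel}x{kernel} conv params: {standard}")
    print(f"depth-wise + point-wise params: {separable}")
    print(f"reduction factor: {standard / separable:.1f}x")
    # Larger kernels widen the receptive field for the same depth.
    print(f"receptive field, 3 layers of 3x3: {receptive_field(3, 3)}")
    print(f"receptive field, 3 layers of 7x7: {receptive_field(3, kernel)}")
```

Under these assumed sizes, the depth-wise variant needs a small fraction of the weights of a dense convolution with the same kernel, while three stacked 7x7 layers cover a 19-pixel window against 7 pixels for three 3x3 layers. The paper's actual configuration may differ.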