- The paper demonstrates that LiteFlowNet achieves competitive optical flow accuracy while reducing model size by 30 times compared to FlowNet2.
- It employs a pyramidal encoder-decoder architecture with feature warping to enable efficient coarse-to-fine flow estimation.
- The cascaded flow inference and feature-driven regularization yield sub-pixel accurate flow estimations suitable for real-time applications.
An Overview of LiteFlowNet: A Lightweight CNN for Optical Flow Estimation
The paper "LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation", by Tak-Wai Hui, Xiaoou Tang, and Chen Change Loy, addresses optical flow estimation, a critical task in computer vision. Unlike its predecessor FlowNet2, which comprises over 160 million parameters, LiteFlowNet achieves competitive, and in some cases superior, accuracy while reducing model size by 30 times and improving running speed by 1.36 times.
Key Contributions and Architecture
The authors propose several key architectural innovations:
- Pyramidal Feature Extraction: LiteFlowNet comprises an encoder-decoder framework. The encoder maps image pairs into pyramids of multi-scale features, facilitating coarse-to-fine flow estimation by the decoder.
- Feature Warping: Unlike FlowNet2, which warps images, LiteFlowNet warps feature maps directly. Warping in feature space shortens the feature-space distance between the two frames at each pyramid level, improving both computational efficiency and accuracy.
- Cascaded Flow Inference: At each pyramid level, a cascade of two lightweight networks first performs descriptor matching to obtain pixel-level flow, then refines the estimate to sub-pixel accuracy. Correcting the flow early in this way prevents large errors from propagating to subsequent, finer levels.
- Flow Regularization: The authors introduce a feature-driven local convolution layer to regularize flow fields, sharpening blurred flow boundaries and suppressing outliers. The layer adapts its per-pixel convolution kernels based on local features, the current flow estimate, and occlusion probabilities.
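The coarse-to-fine hand-off behind the pyramidal design can be sketched in a few lines: flow estimated at a coarse level is upsampled to the next finer resolution, with displacement values rescaled by the resolution ratio, before being refined there. The sketch below is illustrative only; `upsample_flow` is a hypothetical helper using nearest-neighbour upsampling, whereas the actual network uses learned layers.

```python
import numpy as np

def upsample_flow(flow):
    """Upsample a flow field (2, H, W) to (2, 2H, 2W).

    Nearest-neighbour sketch of the coarse-to-fine hand-off: each
    displacement is duplicated spatially and doubled in magnitude,
    since a 1-pixel shift at the coarse level spans 2 pixels at the
    finer level.
    """
    up = flow.repeat(2, axis=1).repeat(2, axis=2)
    return up * 2.0
```

In the full decoder, the upsampled flow initializes the inference at the finer level, so each level only needs to estimate a small residual correction.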
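Feature warping itself amounts to bilinearly sampling the second frame's feature map at locations displaced by the current flow estimate. A minimal numpy sketch, assuming channel-first tensors and a `(u, v)` flow layout (`warp_features` is a hypothetical helper, not the authors' implementation):

```python
import numpy as np

def warp_features(feat, flow):
    """Bilinearly warp a feature map (C, H, W) by a flow field (2, H, W).

    Each target location (x, y) samples the source features at
    (x + u, y + v), where flow[0] = u (horizontal) and flow[1] = v
    (vertical). Out-of-bounds samples are clamped to the border.
    """
    C, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)
    # Integer corners and fractional weights for bilinear interpolation.
    x0 = np.floor(sx).astype(int); y0 = np.floor(sy).astype(int)
    x1 = np.clip(x0 + 1, 0, W - 1); y1 = np.clip(y0 + 1, 0, H - 1)
    wx, wy = sx - x0, sy - y0
    # Blend the four neighbouring feature vectors.
    return (feat[:, y0, x0] * (1 - wx) * (1 - wy)
          + feat[:, y0, x1] * wx * (1 - wy)
          + feat[:, y1, x0] * (1 - wx) * wy
          + feat[:, y1, x1] * wx * wy)
```

Because this operates on compact multi-scale features rather than full-resolution images, each warp is cheap, which is part of what keeps the network lightweight.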
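The regularization idea can be illustrated as a local convolution whose kernel varies per pixel. In the sketch below, `weights` stands in for the adaptively predicted kernels (assumed here to be derived from features, flow, and occlusion cues, and normalized to sum to 1 at each pixel); this is an illustrative sketch of the mechanism, not the paper's exact layer.

```python
import numpy as np

def feature_driven_regularization(flow, weights, k=3):
    """Smooth a flow field (2, H, W) with per-pixel adaptive kernels.

    `weights` has shape (k*k, H, W): one k-by-k kernel per pixel,
    flattened along the first axis. Each output displacement is a
    weighted average of its neighbourhood, so kernels concentrated
    on same-object pixels preserve motion boundaries while still
    suppressing outliers.
    """
    _, H, W = flow.shape
    r = k // 2
    padded = np.pad(flow, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros_like(flow)
    offsets = [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    for i, (dy, dx) in enumerate(offsets):
        # Accumulate each neighbour, weighted by its per-pixel kernel entry.
        out += weights[i] * padded[:, r + dy : r + dy + H, r + dx : r + dx + W]
    return out
```

With uniform weights this reduces to ordinary box smoothing; the benefit of the feature-driven variant is that the kernels can differ at flow boundaries and occluded regions instead of blurring across them.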
Numerical Results and Analysis
Evaluations on established benchmarks such as Sintel and KITTI demonstrate LiteFlowNet's effectiveness. The model achieves results comparable or superior to FlowNet2 while dramatically reducing computational demands; notably, it outperforms FlowNet2 on the challenging Sintel final pass despite having far fewer parameters. Its small footprint and speed also make it suitable for real-time applications.
Implications and Future Directions
Practically, LiteFlowNet’s advancements suggest a significant leap in deploying CNN-based optical flow estimation in resource-constrained environments such as drones and embedded systems. Theoretically, the integration of feature-driven regularization and pyramid-based estimation presents a potential template for future networks addressing similar vision tasks.
Speculating further, a natural extension could involve exploring unsupervised learning paradigms to reduce dependency on labeled data. Additionally, the framework could be adapted for other computer vision applications such as video frame interpolation or scene flow estimation.
In summary, LiteFlowNet presents a robust, efficient alternative for optical flow estimation, paving the way for deploying CNN solutions across varied real-world applications without the prohibitive costs of larger models.