
ZeroFlow: Scalable Scene Flow via Distillation (2305.10424v8)

Published 17 May 2023 in cs.CV and cs.LG

Abstract: Scene flow estimation is the task of describing the 3D motion field between temporally successive point clouds. State-of-the-art methods use strong priors and test-time optimization techniques, but require on the order of tens of seconds to process full-size point clouds, making them unusable as computer vision primitives for real-time applications such as open-world object detection. Feedforward methods are considerably faster, running on the order of tens to hundreds of milliseconds for full-size point clouds, but require expensive human supervision. To address both limitations, we propose Scene Flow via Distillation, a simple, scalable distillation framework that uses a label-free optimization method to produce pseudo-labels to supervise a feedforward model. Our instantiation of this framework, ZeroFlow, achieves state-of-the-art performance on the Argoverse 2 Self-Supervised Scene Flow Challenge while using zero human labels by simply training on large-scale, diverse unlabeled data. At test time, ZeroFlow is over 1000x faster than label-free state-of-the-art optimization-based methods on full-size point clouds (34 FPS vs 0.028 FPS) and over 1000x cheaper to train on unlabeled data compared to the cost of human annotation ($394 vs ~$750,000). To facilitate further research, we release our code, trained model weights, and high-quality pseudo-labels for the Argoverse 2 and Waymo Open datasets at https://vedder.io/zeroflow.html
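
The two "over 1000x" claims can be checked directly from the figures quoted in the abstract. The short sketch below is only an arithmetic illustration; the variable names are hypothetical, and the human annotation cost is the approximate value given in the abstract.

```python
# Figures quoted in the abstract (variable names are illustrative only).
zeroflow_fps = 34.0               # feedforward ZeroFlow throughput
optimizer_fps = 0.028             # label-free optimization-based method
pseudo_label_cost_usd = 394.0     # cost of training on unlabeled data
human_label_cost_usd = 750_000.0  # approximate cost of human annotation

# Ratios behind the "over 1000x" claims.
speedup = zeroflow_fps / optimizer_fps
cost_ratio = human_label_cost_usd / pseudo_label_cost_usd

# Per-frame latency implied by each throughput figure.
optimizer_seconds_per_frame = 1.0 / optimizer_fps
zeroflow_ms_per_frame = 1000.0 / zeroflow_fps

print(f"speedup: {speedup:.0f}x")
print(f"cost ratio: {cost_ratio:.0f}x")
print(f"optimizer: {optimizer_seconds_per_frame:.1f} s/frame")
print(f"ZeroFlow: {zeroflow_ms_per_frame:.1f} ms/frame")
```

Both ratios exceed 1000, and the implied per-frame latencies (tens of seconds versus tens of milliseconds) match the ranges the abstract gives for optimization-based and feedforward methods.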
