Multi-Object Manipulation via Object-Centric Neural Scattering Functions (2306.08748v1)
Abstract: Learned visual dynamics models have proven effective for robotic manipulation tasks. Yet, it remains unclear how best to represent scenes involving multi-object interactions. Current methods decompose a scene into discrete objects, but they struggle with precise modeling and manipulation under challenging lighting conditions because they only encode appearance tied to specific illuminations. In this work, we propose using object-centric neural scattering functions (OSFs) as object representations in a model-predictive control framework. OSFs model per-object light transport, enabling compositional scene re-rendering under object rearrangement and varying lighting conditions. By combining this approach with inverse parameter estimation and graph-based neural dynamics models, we demonstrate improved model-predictive control performance and generalization in compositional multi-object environments, even in previously unseen scenarios and under harsh lighting.
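The planning loop the abstract describes (a learned dynamics model queried inside model-predictive control) can be sketched minimally with random-shooting MPC. This is an illustrative stand-in, not the paper's implementation: `toy_dynamics` is a hypothetical placeholder for the graph-based neural dynamics model, and the cost is a simple distance to a goal state.

```python
import numpy as np

def plan_action(dynamics_fn, state, goal, horizon=5, n_samples=256,
                action_dim=2, rng=None):
    """Random-shooting MPC: sample candidate action sequences, roll out the
    (learned) dynamics model, and return the first action of the
    lowest-cost sequence."""
    rng = np.random.default_rng(rng)
    # Candidate action sequences: (n_samples, horizon, action_dim)
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, action_dim))
    costs = np.zeros(n_samples)
    states = np.repeat(state[None, :], n_samples, axis=0)
    for t in range(horizon):
        states = dynamics_fn(states, actions[:, t])   # batched one-step prediction
        costs += np.linalg.norm(states - goal[None, :], axis=1)  # distance-to-goal cost
    best = int(np.argmin(costs))
    return actions[best, 0]

# Hypothetical stand-in for the learned dynamics: the object drifts with the action.
def toy_dynamics(states, actions):
    return states + 0.1 * actions

state = np.array([0.0, 0.0])
goal = np.array([1.0, 0.0])
first_action = plan_action(toy_dynamics, state, goal, rng=0)
```

In the paper's setting, `dynamics_fn` would be the graph neural dynamics model operating on OSF-derived object states, and the cost would compare predicted renderings or object poses against the goal configuration; sampling-based planners such as MPPI or CEM would replace the uniform sampling shown here.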