TaCOS: Task-Specific Camera Optimization with Simulation (2404.11031v3)

Published 17 Apr 2024 in cs.CV and cs.RO

Abstract: The performance of perception tasks is heavily influenced by imaging systems. However, designing cameras with high task performance is costly, requiring extensive camera knowledge and experimentation with physical hardware. Additionally, cameras and perception tasks are mostly designed in isolation, whereas recent methods that jointly design cameras and tasks have shown improved performance. We therefore present a novel end-to-end optimization approach that co-designs cameras with specific vision tasks. This method combines derivative-free and gradient-based optimizers to support both continuous and discrete camera parameters within manufacturing constraints. We leverage recent computer graphics techniques and physical camera characteristics to simulate the cameras in virtual environments, making the design process cost-effective. We validate our simulations against physical cameras and provide a procedurally generated virtual environment. Our experiments demonstrate that our method designs cameras that outperform common off-the-shelf options, and does so more efficiently than the state-of-the-art approach: on an example experiment it requires only 2 minutes to design a camera, compared with 67 minutes for the competing method. Because our approach supports camera design under manufacturing constraints, multi-camera systems, and unconventional cameras, we believe it can advance the fully automated design of cameras.
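The abstract's central idea, combining a derivative-free optimizer for discrete camera parameters with a gradient-based optimizer for continuous ones, can be sketched in miniature. The code below is an illustrative toy, not the paper's implementation: `task_loss`, its optimum, and the parameter names (`pixel_count`, `focal_length`) are hypothetical stand-ins for a real task-performance metric and real camera parameters. The outer loop enumerates discrete options without derivatives, while the inner loop runs gradient descent on the continuous parameter.

```python
import random

def task_loss(pixel_count, focal_length):
    # Hypothetical stand-in for task performance: minimized when the
    # discrete choice is 4 and the continuous parameter is 2.5.
    return (pixel_count - 4) ** 2 + (focal_length - 2.5) ** 2

def grad_focal(pixel_count, focal_length):
    # Analytic gradient of the toy loss w.r.t. the continuous parameter.
    return 2.0 * (focal_length - 2.5)

def optimize(discrete_choices, steps=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    best = (float("inf"), None, None)
    for pc in discrete_choices:          # derivative-free over discrete options
        f = rng.uniform(1.0, 10.0)       # random init of continuous parameter
        for _ in range(steps):           # gradient-based inner optimization
            f -= lr * grad_focal(pc, f)
        loss = task_loss(pc, f)
        if loss < best[0]:
            best = (loss, pc, f)
    return best

loss, pc, f = optimize([1, 2, 4, 8])
```

In the paper's setting the inner loop would backpropagate a task loss through a differentiable camera simulation, and the outer search would respect manufacturing constraints; here both are replaced by an analytic toy objective.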

