
Observation-Augmented Contextual Multi-Armed Bandits for Robotic Exploration with Uncertain Semantic Data (2312.12583v1)

Published 19 Dec 2023 in cs.RO and cs.LG

Abstract: For robotic decision-making under uncertainty, the balance between exploitation and exploration of available options must be carefully taken into account. In this study, we introduce a new variant of contextual multi-armed bandits called observation-augmented CMABs (OA-CMABs) wherein a decision-making agent can utilize extra outcome observations from an external information source. CMABs model the expected option outcomes as a function of context features and hidden parameters, which are inferred from previous option outcomes. In OA-CMABs, external observations are also a function of context features and thus provide additional evidence about the hidden parameters. Yet, if an external information source is error-prone, the resulting posterior updates can harm decision-making performance unless the presence of errors is considered. To this end, we propose a robust Bayesian inference process for OA-CMABs that is based on the concept of probabilistic data validation. Our approach handles complex mixture model parameter priors and hybrid observation likelihoods for semantic data sources, allowing us to develop validation algorithms based on recently developed probabilistic semantic data association techniques. Furthermore, to more effectively cope with the combined sources of uncertainty in OA-CMABs, we derive a new active inference algorithm for option selection based on expected free energy minimization. This generalizes previous work on active inference for bandit-based robotic decision-making by accounting for faulty observations and non-Gaussian inference. Our approaches are demonstrated on a simulated asynchronous search site selection problem for space exploration. The results show that even if incorrect observations are provided by external information sources, efficient decision-making and robust parameter inference are still achieved in a wide variety of experimental conditions.
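The idea of validating an external observation before it updates the parameter posterior can be illustrated with a small sketch. This is not the paper's algorithm: the abstract describes mixture priors, hybrid semantic likelihoods, and active inference, none of which are reproduced here. The sketch instead assumes a linear-Gaussian outcome model with a single Gaussian posterior over the hidden parameters, and uses a simplified weighting in the spirit of the probabilistic data association filter. All function names, the uniform "clutter" density for faulty observations, and the numeric values are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_update(mu, Sigma, x, y, noise_var):
    """Conjugate update of N(mu, Sigma) over theta for y = x @ theta + noise."""
    s = x @ Sigma @ x + noise_var            # predictive variance of y
    k = Sigma @ x / s                        # gain vector
    mu_new = mu + k * (y - x @ mu)
    Sigma_new = Sigma - np.outer(k, x @ Sigma)
    return mu_new, Sigma_new

def validity_weight(mu, Sigma, x, z, noise_var, p_valid, clutter_density):
    """Posterior probability that external observation z is valid.

    Assumption: a faulty observation is drawn from a uniform density
    (clutter_density); a valid one follows the linear-Gaussian model.
    """
    s = x @ Sigma @ x + noise_var
    lik = np.exp(-0.5 * (z - x @ mu) ** 2 / s) / np.sqrt(2.0 * np.pi * s)
    return p_valid * lik / (p_valid * lik + (1.0 - p_valid) * clutter_density)

def validated_update(mu, Sigma, x, z, noise_var, p_valid, clutter_density):
    """Mix the 'valid' and 'faulty' hypotheses, then moment-match to one Gaussian."""
    w = validity_weight(mu, Sigma, x, z, noise_var, p_valid, clutter_density)
    mu_v, Sigma_v = gaussian_update(mu, Sigma, x, z, noise_var)
    mu_new = w * mu_v + (1.0 - w) * mu
    d_v, d_0 = mu_v - mu_new, mu - mu_new
    Sigma_new = (w * (Sigma_v + np.outer(d_v, d_v))
                 + (1.0 - w) * (Sigma + np.outer(d_0, d_0)))
    return mu_new, Sigma_new, w

# Hypothetical example: two context features, standard-normal prior on theta.
mu0, Sigma0 = np.zeros(2), np.eye(2)
x = np.array([1.0, 0.0])
settings = dict(noise_var=0.1, p_valid=0.8, clutter_density=0.01)

# A plausible external observation is mostly trusted.
mu_good, Sigma_good, w_good = validated_update(mu0, Sigma0, x, 0.5, **settings)
# An implausible one is down-weighted, leaving the posterior nearly unchanged.
mu_bad, Sigma_bad, w_bad = validated_update(mu0, Sigma0, x, 50.0, **settings)
```

The design choice here is that a suspicious observation is neither accepted outright nor discarded by a hard gate: its influence scales with the computed validity probability. A naive update that treated the second observation as correct would shift the posterior mean far from the prior.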

Authors (2)
  1. Shohei Wakayama (6 papers)
  2. Nisar Ahmed (50 papers)
