Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces (2403.19925v1)

Published 29 Mar 2024 in cs.LG and cs.AI

Abstract: Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. While this method has shown competitive results, this paper investigates the integration of the Mamba framework, known for its advanced capabilities in efficient and effective sequence modeling, into the Decision Transformer architecture, focusing on the potential performance enhancements in sequential decision-making tasks. Our study systematically evaluates this integration by conducting a series of experiments across various decision-making environments, comparing the modified Decision Transformer, Decision Mamba, with its traditional counterpart. This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios.
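
The core idea in the abstract — keeping Decision Transformer's interleaved (return-to-go, state, action) token sequence but swapping its causal self-attention for a Mamba-style selective state-space layer — can be sketched concretely. The sketch below is illustrative only, not the paper's implementation: SelectiveSSM is a simplified input-dependent gated recurrence standing in for a real Mamba block (which uses a discretized continuous-time state-space model with a convolution and a hardware-aware parallel scan), and all class names, dimensions, and hyperparameters here are assumptions.

```python
# Minimal sketch (NOT the authors' code) of the Decision Mamba idea:
# embed interleaved (return-to-go, state, action) tokens as in Decision
# Transformer, then mix them causally with a selective state-space-style
# recurrence instead of causal self-attention.

import torch
import torch.nn as nn


class SelectiveSSM(nn.Module):
    """Simplified stand-in for a Mamba block: input-dependent gates
    modulate a per-channel linear recurrence h_t = a_t * h_{t-1} + b_t * x_t.
    The real Mamba layer is a discretized SSM with a parallel scan."""

    def __init__(self, d_model: int):
        super().__init__()
        self.to_a = nn.Linear(d_model, d_model)  # input-dependent decay gate
        self.to_b = nn.Linear(d_model, d_model)  # input-dependent input gate
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        a = torch.sigmoid(self.to_a(x))
        b = self.to_b(x)
        h = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.size(1)):  # causal, left-to-right scan
            h = a[:, t] * h + b[:, t] * x[:, t]
            outs.append(h)
        return self.out(torch.stack(outs, dim=1))


class DecisionMambaSketch(nn.Module):
    def __init__(self, state_dim, act_dim, d_model=128, n_layers=3):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)        # return-to-go token
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.layers = nn.ModuleList(
            [SelectiveSSM(d_model) for _ in range(n_layers)]
        )
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim).
        # Interleave tokens as (R_1, s_1, a_1, R_2, s_2, a_2, ...), exactly
        # as in Decision Transformer.
        B, T, _ = states.shape
        tokens = torch.stack(
            [self.embed_rtg(rtg),
             self.embed_state(states),
             self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1)
        for layer in self.layers:
            tokens = tokens + layer(tokens)  # residual sequence mixing
        # Predict a_t from each state token (positions 1, 4, 7, ...).
        return self.predict_action(tokens[:, 1::3])
```

In practice the SelectiveSSM stand-in would be replaced by an actual Mamba block (for example, from the open-source mamba-ssm package accompanying the Mamba paper); the token interleaving and per-timestep action prediction carry over from Decision Transformer unchanged.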

