
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning (2406.00645v2)

Published 2 Jun 2024 in cs.LG, cs.AI, and cs.CV

Abstract: In this work, we investigate how to leverage pre-trained visual-language models (VLM) for online Reinforcement Learning (RL). In particular, we focus on sparse reward tasks with pre-defined textual task descriptions. We first identify the problem of reward misalignment when applying VLM as a reward in RL tasks. To address this issue, we introduce a lightweight fine-tuning method, named Fuzzy VLM reward-aided RL (FuRL), based on reward alignment and relay RL. Specifically, we enhance the performance of SAC/DrQ baseline agents on sparse reward tasks by fine-tuning VLM representations and using relay RL to avoid local minima. Extensive experiments on the Meta-world benchmark tasks demonstrate the efficacy of the proposed method. Code is available at: https://github.com/fuyw/FuRL.
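
The core idea in the abstract is to use a pre-trained VLM's image-text similarity as a dense "fuzzy" shaping signal on top of the sparse task reward. Below is a minimal sketch of that idea; the encoder functions and the `beta` weighting are illustrative placeholders (any CLIP-style encoder could be plugged in), not the paper's exact FuRL implementation, which additionally fine-tunes the VLM representations for reward alignment and uses relay RL.

```python
# Sketch: VLM similarity as a fuzzy shaping reward for a sparse-reward task.
# The encoders are hypothetical placeholders; this is not the authors' code.
import numpy as np

def encode_image(obs_rgb: np.ndarray) -> np.ndarray:
    """Hypothetical VLM image encoder: maps an HxWx3 frame to an embedding."""
    raise NotImplementedError("plug in a pre-trained VLM image encoder")

def encode_text(task_description: str) -> np.ndarray:
    """Hypothetical VLM text encoder: maps the task description to an embedding."""
    raise NotImplementedError("plug in a pre-trained VLM text encoder")

def vlm_fuzzy_reward(obs_rgb: np.ndarray, task_description: str) -> float:
    """Cosine similarity between image and text embeddings, used as a dense signal."""
    img = encode_image(obs_rgb)
    txt = encode_text(task_description)
    return float(img @ txt / (np.linalg.norm(img) * np.linalg.norm(txt) + 1e-8))

def shaped_reward(sparse_reward: float, obs_rgb: np.ndarray,
                  task_description: str, beta: float = 0.1) -> float:
    """Combine the sparse environment reward with the VLM similarity.
    beta is an illustrative weighting coefficient, not a value from the paper."""
    return sparse_reward + beta * vlm_fuzzy_reward(obs_rgb, task_description)
```

Because the raw VLM similarity can be misaligned with actual task progress (the reward misalignment problem the abstract identifies), FuRL fine-tunes the VLM representations rather than using this similarity as-is.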

Citations (5)



GitHub

  1. GitHub - fuyw/FuRL (6 stars)