Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 134 tok/s
Gemini 2.5 Pro 41 tok/s Pro
GPT-5 Medium 30 tok/s Pro
GPT-5 High 23 tok/s Pro
GPT-4o 99 tok/s Pro
Kimi K2 190 tok/s Pro
GPT OSS 120B 425 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems (2407.13163v2)

Published 18 Jul 2024 in cs.IR and cs.AI

Abstract: Offline reinforcement learning (RL) is an effective tool for real-world recommender systems with its capacity to model the dynamic interest of users and its interactive nature. Most existing offline RL recommender systems focus on model-based RL through learning a world model from offline data and building the recommendation policy by interacting with this model. Although these methods have made progress in the recommendation performance, the effectiveness of model-based offline RL methods is often constrained by the accuracy of the estimation of the reward model and the model uncertainties, primarily due to the extreme discrepancy between offline logged data and real-world data in user interactions with online platforms. To fill this gap, a more accurate reward model and uncertainty estimation are needed for the model-based RL methods. In this paper, a novel model-based Reward Shaping in Offline Reinforcement Learning for Recommender Systems, ROLeR, is proposed for reward and uncertainty estimation in recommendation systems. Specifically, a non-parametric reward shaping method is designed to refine the reward model. In addition, a flexible and more representative uncertainty penalty is designed to fit the needs of recommendation systems. Extensive experiments conducted on four benchmark datasets showcase that ROLeR achieves state-of-the-art performance compared with existing baselines. The source code can be downloaded at https://github.com/ArronDZhang/ROLeR.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Github Logo Streamline Icon: https://streamlinehq.com
X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets

This paper has been mentioned in 4 tweets and received 4 likes.

Upgrade to Pro to view all of the tweets about this paper: