
CIM-PPO: Proximal Policy Optimization with Liu-Correntropy Induced Metric (2110.10522v3)

Published 20 Oct 2021 in cs.LG and cs.AI

Abstract: As a popular Deep Reinforcement Learning (DRL) algorithm, Proximal Policy Optimization (PPO) has demonstrated remarkable efficacy in numerous complex tasks. Depending on the penalty mechanism used in the surrogate objective, PPO can be classified into PPO with KL divergence (PPO-KL) and PPO with Clip (PPO-Clip). In this paper, we analyze the impact of the asymmetry of KL divergence on PPO-KL and show that, when this asymmetry is pronounced, it misguides the improvement of the surrogate. To address this issue, we represent PPO-KL in inner product form and demonstrate that the KL divergence is a Correntropy Induced Metric (CIM) in Euclidean space. Subsequently, we extend PPO-KL to the Reproducing Kernel Hilbert Space (RKHS), redefine the inner products in the RKHS, and propose the PPO-CIM algorithm. Moreover, this paper states that the PPO-CIM algorithm has a lower computation cost in the policy gradient and proves that PPO-CIM can guarantee that the new policy stays within the trust region, provided the kernel satisfies certain conditions. Finally, we design experiments on six MuJoCo continuous-action tasks to validate the proposed algorithm. The experimental results confirm that the asymmetry of KL divergence can affect the policy improvement of PPO-KL and show that PPO-CIM can perform better than both PPO-KL and PPO-Clip in most tasks.
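
The asymmetry the abstract refers to can be illustrated with a small numerical sketch. The snippet below is not the paper's formulation of PPO-CIM. It assumes two hypothetical discrete action distributions (`old_policy`, `new_policy`) and uses the standard sample-based definition of the correntropy induced metric with a Gaussian kernel normalized so that kappa(0) = 1; the kernel width `sigma` is an arbitrary choice made for illustration. It shows only that KL(p || q) and KL(q || p) generally differ, while the CIM gives the same value in both directions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def cim(p, q, sigma=1.0):
    """Correntropy induced metric with a Gaussian kernel (kappa(0) = 1).

    Uses the sample estimator sqrt(kappa(0) - mean_i kappa(p_i - q_i)).
    sigma is an illustrative kernel width, not a value from the paper.
    """
    correntropy = sum(
        math.exp(-((pi - qi) ** 2) / (2.0 * sigma ** 2)) for pi, qi in zip(p, q)
    ) / len(p)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - correntropy))


# Hypothetical action distributions over three discrete actions.
old_policy = [0.90, 0.05, 0.05]
new_policy = [0.40, 0.30, 0.30]

kl_forward = kl_divergence(old_policy, new_policy)
kl_reverse = kl_divergence(new_policy, old_policy)
cim_forward = cim(old_policy, new_policy)
cim_reverse = cim(new_policy, old_policy)

print(f"KL(old || new) = {kl_forward:.4f}, KL(new || old) = {kl_reverse:.4f}")
print(f"CIM(old, new)  = {cim_forward:.4f}, CIM(new, old)  = {cim_reverse:.4f}")
```

Because the kernel depends only on the squared difference between entries, swapping the arguments leaves the CIM unchanged, whereas the two KL values depend on which distribution supplies the weights.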
