Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
194 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
46 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Hadamard Representations: Augmenting Hyperbolic Tangents in RL (2406.09079v4)

Published 13 Jun 2024 in cs.LG

Abstract: Activation functions are one of the key components of a deep neural network. The most commonly used activation functions can be classed into the category of continuously differentiable (e.g. tanh) and piece-wise linear functions (e.g. ReLU), both having their own strengths and drawbacks with respect to downstream performance and representation capacity through learning. In reinforcement learning, the performance of continuously differentiable activations often falls short as compared to piece-wise linear functions. We show that the dying neuron problem in RL is not exclusive to ReLUs and actually leads to additional problems in the case of continuously differentiable activations such as tanh. To alleviate the dying neuron problem with these activations, we propose a Hadamard representation that unlocks the advantages of continuously differentiable activations. Using DQN, PPO and PQN in the Atari domain, we show faster learning, a reduction in dead neurons and increased effective rank.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets