Sparse-softmax: A Simpler and Faster Alternative Softmax Transformation (2112.12433v1)

Published 23 Dec 2021 in cs.LG and cs.CL

Abstract: The softmax function is widely used in artificial neural networks for multiclass classification problems: the softmax transformation enforces the outputs to be positive and sum to one, and the corresponding loss function allows the maximum likelihood principle to be used to optimize the model. However, softmax leaves the loss function a large margin to optimize over in high-dimensional classification, which degrades performance to some extent. In this paper, we provide an empirical study of a simple and concise softmax variant, namely sparse-softmax, to alleviate the problems that traditional softmax encounters in high-dimensional classification. We evaluate our approach on several interdisciplinary tasks; the experimental results show that sparse-softmax is simpler, faster, and produces better results than the baseline models.
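
The abstract contrasts the standard softmax transformation with a sparse variant for high-dimensional classification. The sketch below shows a plain softmax and, for illustration only, a hypothetical top-k "sparse-softmax" that normalizes over only the k largest logits and zeroes out the rest; the function name `sparse_softmax` and the parameter `k` are assumptions here, and the paper's exact formulation may differ.

```python
import numpy as np

def softmax(logits):
    """Standard softmax: exponentiate and normalize so the outputs
    are positive and sum to one."""
    z = logits - np.max(logits)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sparse_softmax(logits, k=3):
    """Hypothetical top-k sparse-softmax sketch: normalize only over
    the k largest logits and assign zero probability to the rest.
    This is an illustrative assumption, not the paper's definition."""
    topk = np.argsort(logits)[-k:]            # indices of the k largest logits
    probs = np.zeros_like(logits, dtype=float)
    z = logits[topk] - np.max(logits[topk])   # stabilized softmax over the top-k
    e = np.exp(z)
    probs[topk] = e / e.sum()
    return probs

logits = np.array([2.0, 1.0, 0.5, -1.0, 3.0])
print(softmax(logits))         # dense: every class gets non-zero probability
print(sparse_softmax(logits))  # sparse: only the top-k classes get probability
```

With many classes, restricting normalization to the top-k logits concentrates the loss on the relevant candidates, which is the intuition the abstract gives for why the variant can be simpler and faster than full softmax.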

Authors (7)
  1. Shaoshi Sun (1 paper)
  2. Zhenyuan Zhang (31 papers)
  3. BoCheng Huang (1 paper)
  4. Pengbin Lei (1 paper)
  5. Jianlin Su (31 papers)
  6. Shengfeng Pan (8 papers)
  7. Jiarun Cao (4 papers)
Citations (5)
