Robust descent using smoothed multiplicative noise (1810.06207v1)

Published 15 Oct 2018 in stat.ML and cs.LG

Abstract: To improve the off-sample generalization of classical procedures minimizing the empirical risk under potentially heavy-tailed data, new robust learning algorithms have been proposed in recent years, with generalized median-of-means strategies being particularly salient. These procedures enjoy performance guarantees in the form of sharp risk bounds under weak moment assumptions on the underlying loss, but typically suffer from a large computational overhead and substantial bias when the data happens to be sub-Gaussian, limiting their utility. In this work, we propose a novel robust gradient descent procedure which makes use of a smoothed multiplicative noise applied directly to observations before constructing a sum of soft-truncated gradient coordinates. We show that the procedure has competitive theoretical guarantees, with the major advantage of a simple implementation that does not require an iterative sub-routine for robustification. Empirical tests reinforce the theory, showing more efficient generalization over a much wider class of data distributions.
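
To make the abstract's description concrete, here is a minimal NumPy sketch of one hedged reading of the procedure: per-observation gradients are perturbed by multiplicative noise that is averaged out ("smoothed"), each coordinate is passed through a soft truncation, and the truncated values are averaged. The Catoni-type truncation psi, the log-normal noise model, the single shared scale, and all parameter names (scale, noise_sd, n_noise) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def psi(x):
    # Catoni-type soft truncation: approximately linear near zero,
    # growing only logarithmically for large |x| (an assumed choice).
    return np.sign(x) * np.log1p(np.abs(x) + 0.5 * x**2)

def robust_gradient(grads, scale=1.0, noise_sd=0.25, n_noise=16, rng=None):
    # grads: (n, d) array of per-observation gradient vectors.
    # Multiplicative log-normal noise is applied directly to the
    # observations and averaged out ("smoothed") before summing the
    # soft-truncated gradient coordinates. A single shared scale is a
    # simplification; the paper's construction may scale per coordinate.
    rng = np.random.default_rng() if rng is None else rng
    n, d = grads.shape
    noise = rng.lognormal(mean=0.0, sigma=noise_sd, size=(n_noise, n, d))
    smoothed = psi(noise * grads / scale).mean(axis=0)  # average over noise draws -> (n, d)
    return scale * smoothed.mean(axis=0)                # average over observations -> (d,)

# Example: one robust descent step for least squares with heavy-tailed noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.ones(5) + rng.standard_t(df=2, size=200)  # heavy-tailed residuals
w = np.zeros(5)
per_sample_grads = (X @ w - y)[:, None] * X          # (n, d) squared-loss gradients
w -= 0.1 * robust_gradient(per_sample_grads, rng=rng)
```

As the abstract notes, the appeal of this style of estimator is that it needs no iterative robustification sub-routine: the sketch above is a single vectorized pass over the per-sample gradients.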

Citations (27)
