Sub-sampled Newton Methods with Non-uniform Sampling (1607.00559v2)

Published 2 Jul 2016 in math.OC and stat.ML

Abstract: We consider the problem of finding the minimizer of a convex function $F: \mathbb{R}^d \rightarrow \mathbb{R}$ of the form $F(w) := \sum_{i=1}^{n} f_i(w) + R(w)$, where a low-rank factorization of $\nabla^2 f_i(w)$ is readily available. We consider the regime where $n \gg d$. As second-order methods prove to be effective in finding the minimizer to high precision, in this work we propose randomized Newton-type algorithms that exploit non-uniform sub-sampling of $\{\nabla^2 f_i(w)\}_{i=1}^{n}$, as well as inexact updates, as means to reduce the computational complexity. Two non-uniform sampling distributions, based on block norm squares and block partial leverage scores, are considered in order to capture important terms among $\{\nabla^2 f_i(w)\}_{i=1}^{n}$. We show that at each iteration, non-uniformly sampling at most $\mathcal{O}(d \log d)$ terms from $\{\nabla^2 f_i(w)\}_{i=1}^{n}$ is sufficient to achieve a linear-quadratic convergence rate in $w$ when a suitable initial point is provided. In addition, we show that our algorithms achieve a lower computational complexity and exhibit more robustness and better dependence on problem-specific quantities, such as the condition number, compared to similar existing methods, especially the ones based on uniform sampling. Finally, we empirically demonstrate that our methods are at least twice as fast as Newton's method with ridge logistic regression on several real datasets.
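To make the idea in the abstract concrete, here is a minimal Python sketch of a sub-sampled Newton iteration with block-norm-squares sampling. It is an illustration under stated assumptions, not the authors' implementation: it assumes ridge logistic regression with labels in {-1, +1}, so that $\nabla^2 f_i(w) = A_i^T A_i$ with a single-row block $A_i$; it uses the exact gradient, a unit step, and an exact linear solve rather than the inexact updates the abstract mentions. The function names and the sample size `s` are hypothetical choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient(w, X, y, lam):
    # Exact gradient of F(w) = sum_i log(1 + exp(-y_i x_i^T w)) + (lam/2)||w||^2.
    z = y * (X @ w)
    return -X.T @ (y * sigmoid(-z)) + lam * w

def sampled_hessian(w, X, lam, s, rng):
    # Row i of A is A_i, the factor with A_i^T A_i = Hessian of f_i at w.
    p = sigmoid(X @ w)
    A = np.sqrt(p * (1.0 - p))[:, None] * X
    # Block norm squares: sample term i with probability proportional to ||A_i||^2.
    norms = np.sum(A ** 2, axis=1)
    probs = norms / norms.sum()
    idx = rng.choice(X.shape[0], size=s, replace=True, p=probs)
    # Rescale each sampled term by 1/(s * p_i) so the estimate is unbiased.
    As = A[idx] / np.sqrt(s * probs[idx])[:, None]
    # The regularizer's Hessian (lam * I) is added exactly, not sampled.
    return As.T @ As + lam * np.eye(X.shape[1])

def subsampled_newton(X, y, lam=1.0, s=500, iters=30, seed=0):
    # Assumption: unit step size and w = 0 as the starting point.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        H = sampled_hessian(w, X, lam, s, rng)
        w = w - np.linalg.solve(H, gradient(w, X, y, lam))
    return w
```

Each iteration forms a $d \times d$ matrix from only `s` sampled rows instead of all $n$, which is where the savings come from when $n \gg d$. Computing the sampling probabilities still touches every row once per iteration in this sketch.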

Citations (113)
