
Computing Approximate $\ell_p$ Sensitivities

(2311.04158)
Published Nov 7, 2023 in cs.LG, cs.DS, and stat.ML

Abstract

Recent works in dimensionality reduction for regression tasks have introduced the notion of sensitivity, an estimate of the importance of a specific datapoint in a dataset, offering provable guarantees on the quality of the approximation after removing low-sensitivity datapoints via subsampling. However, fast algorithms for approximating $\ell_p$ sensitivities, which we show is equivalent to approximate $\ell_p$ regression, are known for only the $\ell_2$ setting, in which they are termed leverage scores. In this work, we provide efficient algorithms for approximating $\ell_p$ sensitivities and related summary statistics of a given matrix. In particular, for a given $n \times d$ matrix, we compute an $\alpha$-approximation to its $\ell_1$ sensitivities at the cost of $O(n/\alpha)$ sensitivity computations. For estimating the total $\ell_p$ sensitivity (i.e., the sum of $\ell_p$ sensitivities), we provide an algorithm based on importance sampling of $\ell_p$ Lewis weights, which computes a constant factor approximation to the total sensitivity at the cost of roughly $O(\sqrt{d})$ sensitivity computations. Furthermore, we estimate the maximum $\ell_1$ sensitivity, up to a $\sqrt{d}$ factor, using $O(d)$ sensitivity computations. We generalize all these results to $\ell_p$ norms for $p > 1$. Lastly, we experimentally show that for a wide class of matrices in real-world datasets, the total sensitivity can be quickly approximated and is significantly smaller than the theoretical prediction, demonstrating that real-world datasets have low intrinsic effective dimensionality.
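To make the quantity being approximated concrete, below is a minimal sketch (not the paper's algorithm) of a single exact $\ell_1$-sensitivity computation. The sensitivity of row $i$ is $\sigma_i(A) = \max_x |a_i^\top x| / \|Ax\|_1$, or equivalently $1/\sigma_i = \min\{\|Ax\|_1 : a_i^\top x = 1\}$, which is a linear program; this is the kind of "sensitivity computation" the abstract counts. The function name and the use of `scipy.optimize.linprog` are illustrative assumptions, not from the paper.

```python
# Hypothetical illustration: one exact l1-sensitivity computation via an LP.
# This is a sketch of the underlying quantity, not the paper's fast algorithm.
import numpy as np
from scipy.optimize import linprog

def l1_sensitivity(A, i):
    """Exact l1 sensitivity of row i:  sigma_i = max_x |a_i^T x| / ||A x||_1.

    Equivalently 1 / sigma_i = min { ||A x||_1 : a_i^T x = 1 }, a linear
    program in variables (x, t) with the constraint |A x| <= t elementwise.
    """
    n, d = A.shape
    # Decision vector z = [x (d entries), t (n entries)]; minimize sum(t).
    c = np.concatenate([np.zeros(d), np.ones(n)])
    # A x - t <= 0  and  -A x - t <= 0  together encode |A x| <= t.
    A_ub = np.block([[A, -np.eye(n)], [-A, -np.eye(n)]])
    b_ub = np.zeros(2 * n)
    # Normalization a_i^T x = 1 (row i is assumed to be nonzero).
    A_eq = np.concatenate([A[i], np.zeros(n)]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(None, None)] * d + [(0, None)] * n  # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return 1.0 / res.fun

# Usage: total l1 sensitivity of a small random matrix (known to be at most d).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
total = sum(l1_sensitivity(A, i) for i in range(A.shape[0]))
print(f"total l1 sensitivity: {total:.2f}  (dimension d = {A.shape[1]})")
```

Since each such LP touches the whole matrix, computing all $n$ sensitivities this way costs $n$ full regression solves; the paper's contribution is reducing the number of such computations (to $O(n/\alpha)$ for $\alpha$-approximate $\ell_1$ sensitivities, and to roughly $O(\sqrt{d})$ or $O(d)$ for the total and maximum sensitivity, respectively).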
