Emergent Mind

Probabilistic Polynomials and Hamming Nearest Neighbors

(1507.05106)
Published Jul 17, 2015 in cs.DS , cs.CC , and math.CO

Abstract

We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let $c(n) : \mathbb{N} \rightarrow \mathbb{N}$. Suppose we are given a database $D$ of $n$ vectors in ${0,1}{c(n) \log n}$ and a collection of $n$ query vectors $Q$ in the same dimension. For all $u \in Q$, we wish to compute a $v \in D$ with minimum Hamming distance from $u$. We solve this problem in $n{2-1/O(c(n) \log2 c(n))}$ randomized time. Hence, the problem is in "truly subquadratic" time for $O(\log n)$ dimensions, and in subquadratic time for $d = o((\log2 n)/(\log \log n)2)$. We apply the algorithm to computing pairs with maximum inner product, closest pair in $\ell_1$ for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.