Papers
Topics
Authors
Recent
2000 character limit reached

Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring (2004.05345v1)

Published 11 Apr 2020 in cs.DB and cs.DS

Abstract: Locality-Sensitive Hashing (LSH) is one of the most popular methods for $c$-Approximate Nearest Neighbor Search ($c$-ANNS) in high-dimensional spaces. In this paper, we propose a novel LSH scheme based on the Longest Circular Co-Substring (LCCS) search framework (LCCS-LSH) with a theoretical guarantee. We introduce a novel concept of LCCS and a new data structure named Circular Shift Array (CSA) for $k$-LCCS search. The insight of LCCS search framework is that close data objects will have a longer LCCS than the far-apart ones with high probability. LCCS-LSH is \emph{LSH-family-independent}, and it supports $c$-ANNS with different kinds of distance metrics. We also introduce a multi-probe version of LCCS-LSH and conduct extensive experiments over five real-life datasets. The experimental results demonstrate that LCCS-LSH outperforms state-of-the-art LSH schemes.

Citations (33)

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.