Any-k Algorithms for Enumerating Ranked Answers to Conjunctive Queries

(2205.05649)
Published May 11, 2022 in cs.DB, cs.DS, and cs.LO

Abstract

We study ranked enumeration for Conjunctive Queries (CQs) where the answers are ordered by a given ranking function (e.g., an ORDER BY clause in SQL). We develop "any-k" algorithms, which, without knowing the number k of desired answers, push down the ranking into joins by carefully ordering the computation of intermediate tuples and avoiding materialization of join answers until they are needed. For this to be possible, the ranking function needs to obey a particular type of monotonicity. Supported ranking functions include the common sum-of-weights, where answers are compared by the sum of input-tuple weights, as well as any commutative selective dioid. Our results extend a well-known unranked-enumeration dichotomy, which states that only free-connex CQs are tractable (under certain hardness hypotheses and for CQs without self-joins). For this class of queries and with n denoting the size of the input, the data complexity of our ranked enumeration approach for the time to the kth CQ answer is O(n + k log k), which is only a logarithmic factor slower than the O(n + k) unranked-enumeration guarantee. A core insight of our work is that ranked enumeration for CQs is closely related to Dynamic Programming and the fundamental task of path enumeration in a weighted DAG. We uncover a previously unknown tradeoff: one any-k algorithm has lower complexity when the number of returned answers is small, the other when their number is large. This tradeoff is eliminated under a stricter monotonicity property that we define and exploit for a novel algorithm that asymptotically dominates all previously known alternatives, including Eppstein's algorithm for sum-of-weights path enumeration. We empirically demonstrate the findings of our theoretical analysis in an experimental study that highlights the superiority of our approach over the join-then-rank approach that existing database systems follow.
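
The link between ranked enumeration, Dynamic Programming, and path enumeration in a weighted DAG can be illustrated with a small sketch. The code below is not one of the paper's any-k algorithms and does not achieve the stated O(n + k log k) bound; it is a simplified best-first enumeration for a chain of binary joins under sum-of-weights, written under our own assumptions. The function name `ranked_path_join` and the `(left, right, weight)` tuple layout are hypothetical. A bottom-up DP pass computes the cheapest completion for every join key, and a priority queue then expands partial answers in order of their best achievable total weight, so full join results are produced lazily and in ranked order without materializing the whole join.

```python
import heapq
from collections import defaultdict
from itertools import count

INF = float("inf")

def ranked_path_join(relations):
    """Lazily yield (total_weight, [tuples]) in non-decreasing weight order.

    Hypothetical input layout: `relations` is a list of stages, each a list of
    (left, right, weight) tuples; stage i joins stage i+1 on right == left.
    """
    n = len(relations)
    # Index every stage by its join key (the `left` value).
    index = [defaultdict(list) for _ in range(n)]
    for i, rel in enumerate(relations):
        for t in rel:
            index[i][t[0]].append(t)

    # Bottom-up DP: best[i][key] = minimum weight of any path that starts
    # at stage i with a tuple whose left value is `key`.
    best = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for key, tuples in index[i].items():
            m = INF
            for (_, right, w) in tuples:
                tail = 0 if i == n - 1 else best[i + 1].get(right, INF)
                m = min(m, w + tail)
            best[i][key] = m

    def tail_cost(i, t):
        # Cheapest way to complete an answer after picking tuple t at stage i.
        return 0 if i == n - 1 else best[i + 1].get(t[1], INF)

    tie = count()  # tie-breaker so the heap never compares tuple lists
    heap = []
    for t in relations[0]:
        tc = tail_cost(0, t)
        if tc < INF:  # skip tuples that cannot be extended to a full answer
            heapq.heappush(heap, (t[2] + tc, next(tie), t[2], [t]))

    while heap:
        _, _, prefix_w, prefix = heapq.heappop(heap)
        i = len(prefix)
        if i == n:
            # For complete prefixes the priority equals the exact weight.
            yield prefix_w, prefix
            continue
        for t in index[i].get(prefix[-1][1], []):
            tc = tail_cost(i, t)
            if tc < INF:
                w = prefix_w + t[2]
                heapq.heappush(heap, (w + tc, next(tie), w, prefix + [t]))
```

Because the function is a generator, a caller does not need to fix k in advance: wrapping it in `itertools.islice(ranked_path_join([R1, R2]), k)` stops after k answers, and continuing to iterate simply produces the next-best ones. The priority of each partial answer is a tight lower bound on the weight of its best completion, which is what keeps the output order correct.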
