
Abstract

Exhaustive enumeration of all possible join orders is often avoided, and most optimizers leverage heuristics to prune the search space. The design and implementation of heuristics are well-understood when the cost model is roughly linear, and we find that these heuristics can be significantly suboptimal when there are non-linearities in cost. Ideally, instead of a fixed heuristic, we would want a strategy to guide the search in a more data-driven way, tailoring the search to a specific dataset and query workload. Recognizing the link between classical Dynamic Programming enumeration methods and recent results in Reinforcement Learning (RL), we propose a new method for learning optimized join search strategies. We present our RL-based DQ optimizer, which currently optimizes select-project-join blocks. We implement three versions of DQ to illustrate the ease of integration into existing DBMSes: (1) a version built on top of Apache Calcite, (2) a version integrated into PostgreSQL, and (3) a version integrated into SparkSQL. Our extensive evaluation shows that DQ achieves plans with optimization costs and query execution times competitive with the native query optimizer in each system, but can execute significantly faster after learning (often by orders of magnitude).
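To make the abstract's link between Dynamic Programming enumeration and RL concrete, the sketch below is a hypothetical illustration in Python, not the authors' DQ code; the relation names, cardinalities, and toy cost function are assumptions. It shows Selinger-style subset DP for left-deep join ordering, where the optimal cost-to-go best[S] satisfies a Bellman-style recurrence over "actions" (which relation to join last). That cost-to-go is the quantity a learned Q-function can approximate in place of exhaustive enumeration.

# Hypothetical sketch, not the paper's implementation: subset DP for join
# ordering whose cost-to-go plays the role of a Q-function.
from itertools import combinations

def join_cost(left_set, right_rel, card):
    # Toy cost model (an assumption): product of estimated cardinalities.
    left_card = 1
    for r in left_set:
        left_card *= card[r]
    return left_card * card[right_rel]

def dp_join_order(relations, card):
    # best[S] = (min cost to join all relations in S, corresponding left-deep order)
    best = {frozenset([r]): (0, [r]) for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, size)):
            candidates = []
            for rel in subset:                      # "action": relation joined last
                rest = subset - {rel}
                sub_cost, sub_order = best[rest]
                candidates.append((sub_cost + join_cost(rest, rel, card),
                                   sub_order + [rel]))
            best[subset] = min(candidates)          # Bellman-style minimization
    return best[frozenset(relations)]

# Example with made-up cardinalities.
cost, order = dp_join_order(["A", "B", "C"], {"A": 100, "B": 10, "C": 1000})
print(cost, order)

In an RL formulation, a learned model estimates the same cost-to-go from features of the partial plan, so the planner can pick the next join greedily without enumerating every subset; according to the abstract, this is the role played by DQ's learned search strategy.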
