Shift-Table: A Low-latency Learned Index for Range Queries

Abstract

Indexing large-scale databases in main memory remains challenging. Learned index structures, in which the core components of classical indexes are replaced with machine learning models, have recently been suggested as a way to significantly improve performance for read-only range queries. However, a recent benchmark study shows that learned indexes achieve only limited performance improvements on real-world data and modern hardware. More specifically, a learned model either cannot capture the micro-level details and fluctuations of a data distribution, resulting in poor accuracy, or it fits the distribution only at the cost of a large model whose parameters do not fit in cache. As a consequence, querying a learned index on real-world data requires a substantial number of memory lookups, degrading performance. In this paper, we adopt a different approach to modeling a data distribution, one that complements the model-fitting approach of learned indexes. We propose Shift-Table, an algorithmic layer that captures the micro-level data distribution and resolves the local biases of a learned model at the cost of at most one memory lookup. Our approach combines the low latency of lookup tables with learned indexes and enables low-latency processing of range queries. Using Shift-Table, we achieve a speedup of 1.5x to 2x on real-world datasets compared to trained and tuned learned indexes.
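To make the general idea concrete, the following is a minimal, hypothetical sketch in Python of a learned index whose predictions are corrected by a lookup-table layer. The class name ShiftTableIndex, the trivial linear model, the bucket granularity (num_buckets), and the min/max-error encoding are illustrative assumptions, not the paper's actual design; they only show how a single table lookup can bound the local error of a coarse model.

```python
import bisect

class ShiftTableIndex:
    """Illustrative sketch: a linear model predicts a key's position
    in a sorted array, and a small per-bucket correction table (the
    "shift table") stores the min/max prediction error observed in
    each bucket. A query then costs one model evaluation, one table
    lookup, and a bounded local search. Assumes a non-empty list of
    numeric keys."""

    def __init__(self, keys, num_buckets=256):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Trivial linear model: position ~ slope * key + intercept.
        lo_key, hi_key = self.keys[0], self.keys[-1]
        self.slope = (n - 1) / (hi_key - lo_key) if hi_key > lo_key else 0.0
        self.intercept = -self.slope * lo_key
        # Shift table: per bucket of predicted positions, the range of
        # observed errors (true position minus predicted position).
        self.num_buckets = num_buckets
        errors = [[] for _ in range(num_buckets)]
        for true_pos, key in enumerate(self.keys):
            pred = self._predict(key)
            errors[self._bucket(pred)].append(true_pos - pred)
        self.shift = [(min(e), max(e)) if e else (0, 0) for e in errors]

    def _predict(self, key):
        pos = int(self.slope * key + self.intercept)
        return max(0, min(len(self.keys) - 1, pos))

    def _bucket(self, pos):
        return min(self.num_buckets - 1,
                   pos * self.num_buckets // len(self.keys))

    def find(self, key):
        """Return the position of `key` in the sorted array, or -1."""
        pred = self._predict(key)
        lo_err, hi_err = self.shift[self._bucket(pred)]  # the one lookup
        lo = max(0, pred + lo_err)
        hi = min(len(self.keys), pred + hi_err + 1)
        # By construction, [lo, hi) contains every stored key whose
        # prediction lands in this bucket, so the search is bounded.
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        return pos if pos < hi and self.keys[pos] == key else -1


idx = ShiftTableIndex([2, 3, 5, 7, 11, 13, 17, 19, 23, 29], num_buckets=4)
print(idx.find(13))  # -> 5 (position in the sorted array)
print(idx.find(4))   # -> -1 (absent key)
```

The design point the sketch illustrates is that the correction table is a flat array indexed arithmetically from the model's prediction, so resolving the model's local bias costs at most one extra memory lookup before the final bounded search, regardless of how coarse the learned model is.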
