F2: Designing a Key-Value Store for Large Skewed Workloads (2305.01516v2)
Abstract: Many real-world workloads present a challenging set of requirements: point operations requiring high throughput, working sets much larger than main memory, and natural skew in key access patterns for both reads and writes. We find that modern key-value designs are either optimized for memory-efficiency, sacrificing high-performance (LSM-tree designs), or achieve high-performance, saturating modern NVMe SSD bandwidth, at the cost of substantial memory resources or high disk wear (CPU-optimized designs). Unfortunately these designs are not able to handle meet the challenging demands of such larger-than-memory, skewed workloads. To this end, we present F2, a new key-value store that bridges this gap by combining the strengths of both approaches. F2 adopts a tiered, record-oriented architecture inspired by LSM-trees to effectively separate hot from cold records, while incorporating concurrent latch-free mechanisms from CPU-optimized engines to maximize performance on modern NVMe SSDs. To realize this design, we tackle key challenges and introduce several innovations, including new latch-free algorithms for multi-threaded log compaction and user operations (e.g., RMWs), as well as new components: a two-level hash index to reduce indexing overhead for cold records and a read-cache for serving read-hot data. Detailed experimental results show that F2 matches or outperforms existing solutions, achieving on average better throughput on memory-constrained environments compared to state-of-the-art systems like RocksDB (11.75x), SplinterDB (4.52x), KVell (10.56x), LeanStore (2.04x), and FASTER (2.38x). F2 also maintains its high performance across varying workload skewness levels and memory budgets, while achieving low disk write amplification.