AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval (2404.06004v2)
Abstract: Graph-based approximate nearest neighbor search (ANNS) algorithms are effective for large-scale vector retrieval. Among such methods, DiskANN achieves a good recall-speed tradeoff by using both DRAM and storage. DiskANN adopts product quantization (PQ) to reduce memory usage, but the memory footprint still grows in proportion to the dataset size. In this paper, we propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads the compressed vectors to the SSD index. Our method achieves $\sim$10 MB of memory usage during query search on billion-scale datasets without critical latency degradation. AiSAQ also reduces the index load time required to prepare for query search, which enables fast switching between multiple billion-scale indices. This method can be applied to retrievers for retrieval-augmented generation (RAG) and scaled out with multi-server systems for emerging datasets. Our DiskANN-based implementation is available on GitHub.
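The memory savings the abstract refers to come from product quantization: each D-dimensional float vector is split into M sub-vectors, and each sub-vector is replaced by the one-byte index of its nearest centroid in a small learned codebook. The sketch below is a minimal NumPy illustration of PQ training, encoding, and asymmetric distance computation; the parameter choices (M = 8 sub-spaces, 256 centroids) and the helper names `train_pq`, `encode_pq`, and `adc_distances` are illustrative assumptions and are not taken from the AiSAQ or DiskANN codebase.

```python
# Minimal sketch of product quantization (PQ) as used by DiskANN-style indices
# to shrink per-vector memory. Parameters are illustrative, not AiSAQ's config.
import numpy as np

def train_pq(train_vecs, M=8, K=256, iters=10):
    """Learn K centroids per sub-space with a few naive k-means iterations."""
    N, D = train_vecs.shape           # assumes N >= K and D divisible by M
    d = D // M                        # dimension of each sub-space
    codebooks = np.empty((M, K, d), dtype=np.float32)
    for m in range(M):
        sub = train_vecs[:, m * d:(m + 1) * d]
        cent = sub[np.random.choice(N, K, replace=False)].copy()
        for _ in range(iters):
            # assign each sub-vector to its nearest centroid, then recompute means
            assign = np.argmin(((sub[:, None] - cent[None]) ** 2).sum(-1), axis=1)
            for k in range(K):
                pts = sub[assign == k]
                if len(pts):
                    cent[k] = pts.mean(axis=0)
        codebooks[m] = cent
    return codebooks

def encode_pq(vecs, codebooks):
    """Compress each D-dim float vector into M one-byte centroid ids."""
    M, K, d = codebooks.shape
    codes = np.empty((len(vecs), M), dtype=np.uint8)
    for m in range(M):
        sub = vecs[:, m * d:(m + 1) * d]
        codes[:, m] = np.argmin(((sub[:, None] - codebooks[m][None]) ** 2).sum(-1), axis=1)
    return codes

def adc_distances(query, codes, codebooks):
    """Asymmetric distance computation: one lookup table per sub-space, summed over codes."""
    M, K, d = codebooks.shape
    table = np.stack([((codebooks[m] - query[m * d:(m + 1) * d]) ** 2).sum(-1)
                      for m in range(M)])            # shape (M, K)
    return table[np.arange(M), codes].sum(axis=1)    # one table lookup per stored code
```

Under the illustrative setting above (M = 8 one-byte codes per vector), a billion vectors still amount to roughly 8 GB of PQ codes. Keeping those codes in DRAM is what ties DiskANN's memory usage to the dataset scale; offloading them to the SSD-resident index is what brings the DRAM footprint down to the $\sim$10 MB reported in the abstract.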
- Datasets for approximate nearest neighbor search, 2010.
- Approximate nearest neighbor queries in fixed dimensions. In SODA (1993), vol. 93, Citeseer, pp. 271–280.
- Weaviate, 2022.
- Filtered-DiskANN: Graph algorithms for approximate nearest neighbor search with filters. In Proceedings of the ACM Web Conference 2023 (2023), pp. 3406–3416.
- DiskANN: Fast accurate billion-point nearest neighbor search on a single node. Advances in Neural Information Processing Systems 32 (2019).
- Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 1 (2010), 117–128.
- Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems 33 (2020), 9459–9474.
- LM-DiskANN: Low memory footprint in disk-native dynamic graph-based ANN indexing. In 2023 IEEE International Conference on Big Data (BigData) (2023), IEEE, pp. 5987–5996.
- KILT: a benchmark for knowledge intensive language tasks. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Online, June 2021), Association for Computational Linguistics, pp. 2523–2544.
- NeurIPS’23 competition track: Big-ANN, 2023.
- DiskANN: Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search, 2023.
- FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search. arXiv preprint arXiv:2105.09613 (2021).
- Milvus: A purpose-built vector data management system. In Proceedings of the 2021 International Conference on Management of Data (New York, NY, USA, 2021), SIGMOD ’21, Association for Computing Machinery, pp. 2614–2627.
- Text embeddings by weakly-supervised contrastive pre-training, 2022.