Matrix Bloom Filter: An Efficient Probabilistic Data Structure for 2-tuple Batch Lookup (1912.07153v1)

Published 16 Dec 2019 in cs.DS

Abstract: With the growing scale of big data, probabilistic structures are becoming increasingly popular for efficient approximate storage and query processing. For example, Bloom filters (BF) can achieve satisfactory performance for approximate membership queries at the expense of false positives. However, a standard Bloom filter handles only univariate data and single membership queries, which is insufficient for OLAP and machine learning applications. In this paper, we focus on a common multivariate data type, namely 2-tuples, or equivalently, key-value pairs. We design the matrix Bloom filter as a high-dimensional extension of the standard Bloom filter. This new probabilistic data structure can not only insert and look up a single 2-tuple efficiently, but also support these operations efficiently in batches, a key requirement for OLAP and machine learning tasks. To further balance insertion and query efficiency across different workload patterns, we propose two variants: the maximum adaptive matrix BF and the minimum storage matrix BF. Through both theoretical and empirical studies, we show that the matrix Bloom filter performs better on datasets with common statistical distributions, and that even without such distributions it merely degrades to a standard Bloom filter.
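To make the idea of a two-dimensional filter with batch lookup concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the class name `MatrixBloomSketch`, the row/column hashing scheme, the parameters `rows`, `cols`, and `num_hashes`, and the `batch_lookup` method are all assumptions. It also does not attempt the maximum adaptive or minimum storage variants mentioned in the abstract. The sketch hashes the key to row indices and the value to column indices, so a batch lookup for one key can compute the row indices once and reuse them.

```python
import hashlib


class MatrixBloomSketch:
    """Illustrative 2D bit-matrix filter for (key, value) pairs.

    Hypothetical design for explanation only; not the paper's construction.
    """

    def __init__(self, rows=256, cols=256, num_hashes=3):
        self.rows = rows
        self.cols = cols
        self.num_hashes = num_hashes
        self.bits = [[False] * cols for _ in range(rows)]

    @staticmethod
    def _index(item, salt, modulus):
        # Derive a deterministic index from a salted SHA-256 digest.
        digest = hashlib.sha256(f"{salt}:{item!r}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % modulus

    def _row_indices(self, key):
        return [self._index(key, f"row{i}", self.rows) for i in range(self.num_hashes)]

    def _col_indices(self, value):
        return [self._index(value, f"col{i}", self.cols) for i in range(self.num_hashes)]

    def insert(self, key, value):
        # Set one cell per hash function: (i-th row index, i-th column index).
        for r, c in zip(self._row_indices(key), self._col_indices(value)):
            self.bits[r][c] = True

    def contains(self, key, value):
        # True means "possibly present" (false positives can occur);
        # False means "definitely absent".
        return all(
            self.bits[r][c]
            for r, c in zip(self._row_indices(key), self._col_indices(value))
        )

    def batch_lookup(self, key, values):
        # Hash the key once, then test every candidate value against those rows.
        row_idx = self._row_indices(key)
        return [
            v for v in values
            if all(self.bits[r][c] for r, c in zip(row_idx, self._col_indices(v)))
        ]


mbf = MatrixBloomSketch()
mbf.insert("alice", 1)
mbf.insert("alice", 2)
print(mbf.contains("alice", 1))
print(mbf.batch_lookup("alice", [1, 2, 3]))
```

As with any Bloom-style filter, inserted pairs are always reported as possibly present, while pairs that were never inserted may occasionally be reported as well. The batch lookup therefore returns a superset of the truly stored values among the candidates.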

Citations (1)
