Parallel Weighted Random Sampling (1903.00227v3)

Published 1 Mar 2019 in cs.DS and cs.DC

Abstract: Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable parallel algorithms for building data structures that support sampling single items (alias tables, compressed data structures). This also yields a simplified and more space-efficient sequential algorithm for alias table construction. Our approaches to sampling $k$ out of $n$ items with/without replacement and to subset (Poisson) sampling are output-sensitive, i.e., the sampling algorithms use work linear in the number of different samples. This is also interesting in the sequential case. Weighted random permutation can be done by sorting appropriate random deviates. We show that this is possible with linear work using a nonlinear transformation of these deviates. Finally, we give a communication-efficient, highly scalable approach to (weighted and unweighted) reservoir sampling. This algorithm is based on a fully distributed model of streaming algorithms that might be of independent interest. Experiments for alias tables and sampling with replacement show near linear speedups both for construction and queries using up to 158 threads of shared-memory machines. An experimental evaluation of distributed weighted reservoir sampling on up to 256 nodes (5120 cores) also shows good speedups.
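To make the alias table mentioned in the abstract concrete, here is a minimal sketch of the classical sequential construction (in the style of Vose's method), assuming nonnegative weights with a positive sum. It is not the paper's parallel or space-efficient algorithm; the function names `build_alias_table` and `sample` are hypothetical and chosen only for illustration.

```python
import random


def build_alias_table(weights):
    """Build (prob, alias) arrays with the classical sequential alias method.

    Illustrative sketch only: this is NOT the parallel construction from the
    paper, and the function name is an assumption made for this example.
    """
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one item and a positive total weight")

    # Rescale so that the average bucket holds exactly 1.0 units of weight.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n          # probability of keeping the bucket's own item
    alias = list(range(n))    # fallback item for each bucket

    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]

    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # The large item donates the weight needed to fill bucket s.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)

    # Leftover buckets are full up to rounding error; they keep prob 1.0.
    return prob, alias


def sample(prob, alias, rng=random):
    """Draw one item index in O(1): pick a bucket, then the item or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Construction takes linear time and each query takes constant time, which is why alias tables are a natural building block for sampling single items. For example, `prob, alias = build_alias_table([1, 2, 3, 4])` followed by repeated calls to `sample(prob, alias)` returns index 3 about 40% of the time.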

Citations (27)
