Approximately Counting Subgraphs in Data Streams (2203.14225v1)
Abstract: Estimating the number of subgraphs in data streams is a fundamental problem that has received great attention in the past decade. In this paper, we give improved streaming algorithms for approximately counting the number of occurrences of an arbitrary subgraph $H$, denoted $# H$, when the input graph $G$ is represented as a stream of $m$ edges. To obtain our algorithms, we provide a generic transformation that converts constant-round sublinear-time graph algorithms in the query access model to constant-pass sublinear-space graph streaming algorithms. Using this transformation, we obtain the following results. 1. We give a $3$-pass turnstile streaming algorithm for $(1\pm \epsilon)$-approximating $# H$ in $\tilde{O}(\frac{m{\rho(H)}}{\epsilon2\cdot # H})$ space, where $\rho(H)$ is the fractional edge-cover of $H$. This improves upon and generalizes a result of McGregor et al. [PODS 2016], who gave a $3$-pass insertion-only streaming algorithm for $(1\pm \epsilon)$-approximating the number $# T$ of triangles in $\tilde{O}(\frac{m{3/2}}{\epsilon2\cdot # T})$ space if the algorithm is given additional oracle access to the degrees. 2. We provide a constant-pass streaming algorithm for $(1\pm \epsilon)$-approximating $# K_r$ in $\tilde{O}(\frac{m\lambda{r-2}}{\epsilon2\cdot # K_r})$ space for any $r\geq 3$, in a graph $G$ with degeneracy $\lambda$, where $K_r$ is a clique on $r$ vertices. This resolves a conjecture by Bera and Seshadhri [PODS 2020]. More generally, our reduction relates the adaptivity of a query algorithm to the pass complexity of a corresponding streaming algorithm, and it is applicable to all algorithms in standard sublinear-time graph query models, e.g., the (augmented) general model.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.