
Approximating Single-Source Personalized PageRank with Absolute Error Guarantees (2401.01019v1)

Published 2 Jan 2024 in cs.DS

Abstract: Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes $s$ and $t$ on a graph $G=(V,E)$, the PPR value $\pi(s,t)$ is defined as the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$, where the walk terminates with probability $\alpha$ at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node $s$ to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates $\hat{\pi}(s,t)$ satisfy $\max_{t\in V}\big|\hat{\pi}(s,t)-\pi(s,t)\big|\le\varepsilon$ for a given error bound $\varepsilon$. We propose an algorithm that achieves this with high probability, with an expected running time of

- $\widetilde{O}\big(\sqrt{m}/\varepsilon\big)$ for directed graphs, where $m=|E|$;
- $\widetilde{O}\big(\sqrt{d_{\mathrm{max}}}/\varepsilon\big)$ for undirected graphs, where $d_{\mathrm{max}}$ is the maximum node degree in the graph;
- $\widetilde{O}\left(n^{\gamma-1/2}/\varepsilon\right)$ for power-law graphs, where $n=|V|$ and $\gamma\in\left(\frac{1}{2},1\right)$ is the extent of the power law.

These sublinear bounds improve upon existing results. We also study the case when degree-normalized absolute error guarantees are desired, requiring $\max_{t\in V}\big|\hat{\pi}(s,t)/d(t)-\pi(s,t)/d(t)\big|\le\varepsilon_d$ for a given error bound $\varepsilon_d$, where the graph is undirected and $d(t)$ is the degree of node $t$. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$. This improves over the previously known $O(1/\varepsilon_d)$ complexity.
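The random-walk definition of $\pi(s,t)$ in the abstract can be made concrete with a plain Monte Carlo estimator. The sketch below is not the algorithm proposed in the paper; it is only the textbook baseline of simulating $\alpha$-discounted walks from $s$ and counting where they stop. The function name `monte_carlo_ppr`, its parameters, the adjacency-list input format, and the rule that a walk stops at a node with no out-neighbors are all assumptions made for illustration.

```python
import random
from collections import Counter


def monte_carlo_ppr(adj, source, alpha=0.2, num_walks=20000, seed=0):
    """Estimate pi(source, t) for every t by simulating alpha-discounted walks.

    adj maps each node to a list of its out-neighbors (hypothetical format).
    Returns a dict {t: estimate}; nodes never hit are omitted (estimate 0).
    """
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(num_walks):
        node = source
        # At each step the walk terminates with probability alpha;
        # otherwise it moves to a uniformly random out-neighbor.
        while rng.random() >= alpha:
            neighbors = adj[node]
            if not neighbors:
                # Assumption: a walk stuck at a dangling node stops there.
                break
            node = rng.choice(neighbors)
        hits[node] += 1
    # The fraction of walks that terminated at t estimates pi(source, t).
    return {t: count / num_walks for t, count in hits.items()}


# Toy directed 3-cycle 0 -> 1 -> 2 -> 0 with alpha = 0.5.
# The exact values are pi(0,0) = 4/7, pi(0,1) = 2/7, pi(0,2) = 1/7.
toy_graph = {0: [1], 1: [2], 2: [0]}
estimates = monte_carlo_ppr(toy_graph, source=0, alpha=0.5)
```

By standard concentration bounds, this baseline needs on the order of $1/\varepsilon^2$ walks (up to logarithmic factors) to reach absolute error $\varepsilon$ on all nodes, which gives a sense of why bounds that scale with $1/\varepsilon$ are of interest.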
