Message-Passing on Hypergraphs: Detectability, Phase Transitions and Higher-Order Information (2312.00708v2)
Abstract: Hypergraphs are widely adopted tools to examine systems with higher-order interactions. Despite recent advancements in methods for community detection in these systems, we still lack a theoretical analysis of their detectability limits. Here, we derive closed-form bounds for community detection in hypergraphs. Using a Message-Passing formulation, we demonstrate that detectability depends on hypergraphs' structural properties, such as the distribution of hyperedge sizes or their assortativity. Our formulation enables a characterization of the entropy of a hypergraph in relation to that of its clique expansion, showing that community detection is enhanced when hyperedges highly overlap on pairs of nodes. We develop an efficient Message-Passing algorithm to learn communities and model parameters on large systems. Additionally, we devise an exact sampling routine to generate synthetic data from our probabilistic model. With these methods, we numerically investigate the boundaries of community detection in synthetic datasets, and extract communities from real systems. Our results extend the understanding of the limits of community detection in hypergraphs and introduce flexible mathematical tools to study systems with higher-order interactions.
- A Poisson approximation of the binomial, $\mathrm{Pois}\left(N_{\#}\frac{\pi_{\#}}{\kappa_{d}}\right)$, is used if $N_{\#}>20$ and $N_{\#}\pi_{\#}/\kappa_{d}<0.1$, or if $N_{\#}>100$ and $N_{\#}\pi_{\#}/\kappa_{d}<10$.
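The rule for switching from the binomial to its Poisson approximation can be sketched in a few lines. This is a minimal illustration, not the authors' sampling routine: it assumes the underlying binomial has $N_{\#}$ trials and success probability $\pi_{\#}/\kappa_{d}$, so that its mean is $N_{\#}\pi_{\#}/\kappa_{d}$. The function names (`use_poisson_approximation`, `total_variation`) and the arguments `n_trials` and `p` are hypothetical. The second helper measures how far apart the two distributions are, which gives a rough sense of what the thresholds buy.

```python
import math


def use_poisson_approximation(n_trials: int, p: float) -> bool:
    """Apply the thresholds from the note, with mean = n_trials * p.

    Assumption: n_trials plays the role of N_# and p of pi_# / kappa_d.
    """
    mean = n_trials * p
    return (n_trials > 20 and mean < 0.1) or (n_trials > 100 and mean < 10)


def _log_binomial_pmf(k: int, n: int, p: float) -> float:
    # Log-space evaluation avoids overflow in the binomial coefficient.
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


def _log_poisson_pmf(k: int, lam: float) -> float:
    return -lam + k * math.log(lam) - math.lgamma(k + 1)


def total_variation(n_trials: int, p: float) -> float:
    """Total variation distance between Binomial(n, p) and Poisson(n * p).

    Requires 0 < p < 1. The Poisson mass above n_trials is added as a tail
    term, since the binomial puts no mass there.
    """
    lam = n_trials * p
    l1 = 0.0
    poisson_mass = 0.0
    for k in range(n_trials + 1):
        b = math.exp(_log_binomial_pmf(k, n_trials, p))
        q = math.exp(_log_poisson_pmf(k, lam))
        l1 += abs(b - q)
        poisson_mass += q
    tail = max(0.0, 1.0 - poisson_mass)
    return 0.5 * (l1 + tail)


if __name__ == "__main__":
    for n, p in [(21, 0.001), (101, 0.05), (50, 0.01)]:
        print(n, p, use_poisson_approximation(n, p), total_variation(n, p))
```

A sampler following this rule would draw from a Poisson distribution with mean `n_trials * p` whenever `use_poisson_approximation` returns `True`, and from the exact binomial otherwise.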