
Entropy of Exchangeable Random Graphs (2302.01856v2)

Published 3 Feb 2023 in cs.IT, math.CO, math.IT, math.ST, and stat.TH

Abstract: In this paper, we propose a complexity measure for exchangeable graphs by considering the graph-generating mechanism. Exchangeability for graphs implies distributional invariance under node permutations, making it a suitable default model for a wide range of graph data. For this well-studied class of graphs, we quantify complexity using graphon entropy. Graphon entropy is a graph property, meaning it is invariant under graph isomorphisms. Therefore, we focus on estimating the entropy of the generating mechanism for a graph realization, rather than selecting a specific graph feature. This approach allows us to consider the global properties of a graph, capturing its important graph-theoretic and topological characteristics, such as sparsity, symmetry, and connectedness. We introduce a consistent graphon entropy estimator that achieves the nonparametric rate for an arbitrary exchangeable graph with a smooth graphon representative. Additionally, we develop tailored entropy estimators for situations where more information about the underlying graphon is available, specifically for widely studied random graph models such as the Erdős–Rényi model, the Configuration Model, and the Stochastic Block Model. We determine their large-sample properties by providing a Central Limit Theorem for the first two, and a convergence rate for the third model. We also conduct a simulation study to illustrate our theoretical findings and demonstrate the connection between graphon entropy and graph structure. Finally, we investigate the role of our entropy estimator as a complexity measure for characterizing real-world graphs.
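To make the estimation idea concrete: for an Erdős–Rényi graph the graphon is a constant p, so graphon entropy reduces to the Bernoulli entropy h(p) = -p log p - (1-p) log(1-p), and a natural plug-in estimator replaces p with the observed edge density. For a Stochastic Block Model one can average h over block-pair densities. The sketch below is illustrative only, not the paper's implementation; the function names and weighting convention are our own assumptions.

```python
import numpy as np

def bernoulli_entropy(p):
    """h(p) = -p*log(p) - (1-p)*log(1-p), with h(0) = h(1) = 0."""
    p = np.clip(np.asarray(p, dtype=float), 0.0, 1.0)
    out = np.zeros_like(p)
    mask = (p > 0) & (p < 1)
    q = p[mask]
    out[mask] = -q * np.log(q) - (1 - q) * np.log(1 - q)
    return out

def er_entropy_estimate(A):
    """Plug-in estimate for an Erdos-Renyi graph: compute the edge
    density over the upper triangle of adjacency matrix A, then
    return the Bernoulli entropy of that density."""
    n = A.shape[0]
    p_hat = A[np.triu_indices(n, k=1)].mean()
    return float(bernoulli_entropy([p_hat])[0])

def sbm_entropy_estimate(A, labels):
    """Block-averaged plug-in estimate given a community assignment:
    estimate the edge density of each block pair and average the
    Bernoulli entropies, weighted by block-size proportions (an
    illustrative convention, not the paper's exact estimator)."""
    n = A.shape[0]
    labels = np.asarray(labels)
    est = 0.0
    for a in np.unique(labels):
        for b in np.unique(labels):
            ia = np.where(labels == a)[0]
            ib = np.where(labels == b)[0]
            if a == b:
                if len(ia) < 2:
                    continue  # no within-block pairs to average over
                sub = A[np.ix_(ia, ia)]
                p = sub[np.triu_indices(len(ia), k=1)].mean()
            else:
                p = A[np.ix_(ia, ib)].mean()
            w = (len(ia) / n) * (len(ib) / n)
            est += w * bernoulli_entropy([p])[0]
    return float(est)
```

With a single block, the SBM estimate collapses to the Erdős–Rényi one, matching the intuition that the Erdős–Rényi model is a one-block Stochastic Block Model.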

Authors (2)
  1. Anda Skeja
  2. Sofia C. Olhede
