Entropy of Exchangeable Random Graphs (2302.01856v2)
Abstract: In this paper, we propose a complexity measure for exchangeable graphs by considering the graph-generating mechanism. Exchangeability for graphs implies distributional invariance under node permutations, making it a suitable default model for a wide range of graph data. For this well-studied class of graphs, we quantify complexity using graphon entropy. Graphon entropy is a graph property, meaning it is invariant under graph isomorphisms. Therefore, we focus on estimating the entropy of the generating mechanism for a graph realization, rather than selecting a specific graph feature. This approach allows us to consider the global properties of a graph, capturing its important graph-theoretic and topological characteristics, such as sparsity, symmetry, and connectedness. We introduce a consistent graphon entropy estimator that achieves the nonparametric rate for any exchangeable graph with a smooth graphon representative. Additionally, we develop tailored entropy estimators for situations where more information about the underlying graphon is available, specifically for widely studied random graph models: the Erdős-Rényi model, the Configuration Model, and the Stochastic Block Model. We determine their large-sample properties by providing a Central Limit Theorem for the first two models and a convergence rate for the third. We also conduct a simulation study to illustrate our theoretical findings and demonstrate the connection between graphon entropy and graph structure. Finally, we investigate the role of our entropy estimator as a complexity measure for characterizing real-world graphs.
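As a rough illustration of the simplest case named in the abstract, the sketch below estimates the entropy of an Erdős-Rényi graph by a plug-in rule. It assumes the standard definition of graphon entropy as the integral of the binary entropy h(W(x, y)) over the unit square, so that a constant graphon W = p has entropy h(p). The function names, the use of natural logarithms, and the plug-in construction are assumptions made here for illustration; the abstract does not state the paper's exact estimator or normalization, which may differ.

```python
import math
import random


def binary_entropy(p):
    """Binary entropy h(p) in nats, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def sample_er_edges(n, p, rng):
    """Sample an undirected Erdős-Rényi graph on n nodes as a list of edges (i, j), i < j."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def plugin_er_entropy(n, edges):
    """Plug-in estimate (an assumed construction): edge density, then binary entropy."""
    n_pairs = n * (n - 1) // 2          # number of unordered node pairs, requires n >= 2
    p_hat = len(edges) / n_pairs        # empirical edge density
    return binary_entropy(p_hat)


rng = random.Random(0)                  # fixed seed so the run is reproducible
n, p = 200, 0.3                         # hypothetical graph size and edge probability
edges = sample_er_edges(n, p, rng)
estimate = plugin_er_entropy(n, edges)
truth = binary_entropy(p)
print(f"estimated entropy: {estimate:.4f}, h(p): {truth:.4f}")
```

The estimate depends on the graph only through its edge count, so it is unchanged by any relabeling of the nodes, which matches the isomorphism invariance described in the abstract.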
Authors: Anda Skeja, Sofia C. Olhede