Metrics matter in community detection (1901.01354v1)
Abstract: We present a critical evaluation of normalized mutual information (NMI) as an evaluation metric for community detection. NMI exaggerates the leximin method's performance on weak communities: Does leximin, in finding the trivial singletons clustering, truly outperform eight other community detection methods? Three NMI improvements from the literature are AMI, rrNMI, and cNMI. We show equivalences under relevant random models, and for evaluating community detection, we advise one-sided AMI under the $\mathbb{M}_{\mathrm{all}}$ model (all partitions of $n$ nodes). This work seeks (1) to start a conversation on robust measurements, and (2) to advocate evaluations which do not give "free lunch".
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.