Emergent Mind

Abstract

A measure of distance between two clusterings has important applications, including clustering validation and ensemble clustering. In general, such a distance measure provides a way to navigate the space of possible clusterings. A normalized clustering distance, also known as an agreement measure, is used mostly in cluster validation, where it compares a given clustering result against the ground-truth clustering. Clustering agreement measures are often classified into two families, pair-counting and information-theoretic measures, whose most widely used representatives are the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), respectively. This paper sheds light on the relation between these two families through a generalization. It further presents an alternative algebraic formulation for these agreement measures that incorporates an intuitive clustering distance, defined through the analogy between cluster overlaps and co-memberships of nodes in clusters. Unlike the original measures, this formulation extends easily to other settings, including overlapping clusters and clusters of inter-related data in complex networks. These two extensions are particularly important for finding clusters in social and information networks, also known as communities.
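
As a rough illustration of the two families named in the abstract, the sketch below computes ARI (pair-counting) and NMI (information-theoretic) from the same table of cluster overlaps. It uses the standard textbook definitions, not the generalized or algebraic formulation the paper proposes. The function names are hypothetical, and normalizing NMI by the arithmetic mean of the two entropies is an assumption, since several normalizations are in common use.

```python
from collections import Counter
from math import comb, log


def contingency(u, v):
    """Count the overlap n_ij between cluster i of u and cluster j of v."""
    return Counter(zip(u, v))


def adjusted_rand_index(u, v):
    """Pair-counting agreement: Rand index corrected for chance."""
    n = len(u)
    if n < 2:
        return 1.0  # no pairs to disagree on
    sum_ij = sum(comb(c, 2) for c in contingency(u, v).values())
    sum_a = sum(comb(c, 2) for c in Counter(u).values())
    sum_b = sum(comb(c, 2) for c in Counter(v).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both clusterings are trivial
    return (sum_ij - expected) / (max_index - expected)


def normalized_mutual_information(u, v):
    """Information-theoretic agreement: MI over the mean entropy (assumed)."""
    n = len(u)
    a, b = Counter(u), Counter(v)
    mi = sum(
        c / n * log(c * n / (a[i] * b[j]))
        for (i, j), c in contingency(u, v).items()
    )
    h_u = -sum(c / n * log(c / n) for c in a.values())
    h_v = -sum(c / n * log(c / n) for c in b.values())
    mean_entropy = (h_u + h_v) / 2
    if mean_entropy == 0:
        return 1.0  # both clusterings put every item in one cluster
    return mi / mean_entropy
```

Both functions read only the overlap counts and the cluster sizes, so two clusterings that are identical up to a relabeling of clusters receive the maximum score under either measure.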
