Emergent Mind

On the Analysis of a Label Propagation Algorithm for Community Detection

(1210.3735)
Published Oct 13, 2012 in cs.DC, cs.SI, and physics.soc-ph

Abstract

This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call \textsc{Max-LPA}, both in terms of its convergence time and in terms of the "quality" of the communities detected. \textsc{Max-LPA} is an instance of a class of community detection algorithms called \textit{label propagation} algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature; in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdős-Rényi random graphs with clusters $V_1, V_2, \ldots, V_k$, where the probability $p$ of an edge connecting nodes within a cluster $V_i$ is higher than $p'$, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on $p$ and $p'$ ($p = \Omega\left(\frac{1}{n^{1/4-\epsilon}}\right)$ for any $\epsilon > 0$, $p' = O(p^2)$, where $n$ is the number of nodes), \textsc{Max-LPA} detects the clusters $V_1, V_2, \ldots, V_k$ in just two rounds. Based on this and on empirical results, we conjecture that \textsc{Max-LPA} can correctly and quickly identify communities on clustered Erdős-Rényi graphs even when the clusters are much sparser, i.e., with $p = \frac{c \log n}{n}$ for some $c > 1$.
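The abstract does not spell out the update rule, so the following Python sketch rests on stated assumptions rather than on the paper's exact definition. It assumes that every node starts with a unique label, that all nodes update synchronously in each round, that a node adopts the most frequent label in its closed neighborhood (its neighbors plus itself), and that ties go to the largest label. The function names `clustered_er_graph`, `max_lpa_round`, and `max_lpa` are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def clustered_er_graph(cluster_sizes, p, p_prime, seed=None):
    """Sample a clustered Erdős-Rényi graph (illustrative helper).

    Nodes in the same cluster are joined with probability p,
    nodes in distinct clusters with probability p_prime.
    Returns (adjacency dict, list mapping node -> cluster index).
    """
    rng = random.Random(seed)
    cluster_of = [c for c, size in enumerate(cluster_sizes) for _ in range(size)]
    n = len(cluster_of)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if cluster_of[u] == cluster_of[v] else p_prime
            if rng.random() < prob:
                adj[u].add(v)
                adj[v].add(u)
    return adj, cluster_of


def max_lpa_round(adj, labels):
    """One synchronous round: every node reads the OLD labels only."""
    new_labels = {}
    for v, neighbors in adj.items():
        counts = Counter(labels[u] for u in neighbors)
        counts[labels[v]] += 1  # assumption: closed neighborhood includes v
        best = max(counts.values())
        # Assumption: ties are broken in favor of the largest label.
        new_labels[v] = max(lab for lab, c in counts.items() if c == best)
    return new_labels


def max_lpa(adj, rounds):
    """Run the assumed Max-LPA rule for a fixed number of rounds."""
    labels = {v: v for v in adj}  # assumption: unique initial labels
    for _ in range(rounds):
        labels = max_lpa_round(adj, labels)
    return labels
```

As a sanity check on a degenerate instance, `clustered_er_graph([4, 3], 1.0, 0.0)` yields two disjoint cliques, and `max_lpa(adj, 2)` then assigns a single label to each clique. This only illustrates the mechanics; it says nothing about the regime $p = \Omega(1/n^{1/4-\epsilon})$, $p' = O(p^2)$ that the paper analyzes.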
