Computing well-balanced spanning trees of unweighted networks (2205.06628v3)
Abstract: A spanning tree of a network or graph is a subgraph that connects all nodes with the least number or weight of edges. The spanning tree is one of the most straightforward techniques for network simplification and sampling, and for discovering its backbone or skeleton. Prim's algorithm and Kruskal's algorithm are well-known algorithms for computing a spanning tree of a weighted network, and are therefore also the default procedure for unweighted networks in the most popular network libraries. In this paper, we empirically study the performance of these algorithms on unweighted networks and compare them with different priority-first search algorithms. We show that the structure of a network, such as the distances between the nodes, is better preserved by a simpler algorithm based on breadth-first search. The spanning trees are also most compact and well-balanced as measured by classical graph indices. We support our findings with experiments on synthetic graphs and more than a thousand real networks, and demonstrate practical applications of the computed spanning trees. We conclude that if a spanning tree is to maintain the structure of an unweighted network, the breadth-first search algorithm should be the preferred choice, and it should be implemented as such in network libraries.