Link Prediction Using Supervised Machine Learning based on Aggregated and Topological Features (2006.16327v1)

Published 29 Jun 2020 in cs.SI

Abstract: Link prediction is an important task in social network analysis. A social network has different characteristics (features) that can be used for link prediction. In this paper, we evaluate the effectiveness of aggregated features and topological features for link prediction using supervised learning. Aggregated features are aggregation functions of the attributes of the nodes. Topological features describe the topology, or structure, of a social network and its underlying graph. We evaluated the effectiveness of these features by measuring the performance of five well-known supervised machine learning methods: J48 decision tree, multi-layer perceptron (MLP), support vector machine (SVM), logistic regression, and Naive Bayes (NB). We measured the performance of these methods with different feature sets on the DBLP dataset, using accuracy, area under the ROC curve (AUC), and F-measure as evaluation metrics. Our results indicate that the combination of aggregated and topological features yields the best performance. The selected features can be used to analyze almost any social network because they capture important characteristics of the network's underlying graph. The significance of our work is that the selected features can be very effective in the analysis of big social networks, where data sets usually contain millions or billions of instances. Using fewer, but more effective, features can help in the analysis of such networks.
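
To make the distinction between the two feature families concrete, the following is a minimal sketch of how a feature vector for one candidate node pair might be assembled. The abstract does not list the specific features used in the paper, so everything here is an assumption chosen for illustration: common neighbors, the Jaccard coefficient, and the Adamic-Adar index stand in for topological features, the sum of a per-author paper count stands in for an aggregated feature, and the toy graph and the helper names (`topological_features`, `aggregated_features`, `pair_features`) are hypothetical.

```python
import math

# Toy co-authorship graph as an adjacency map (hypothetical data, not DBLP).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

# Hypothetical node attribute: number of papers per author.
paper_count = {"a": 3, "b": 10, "c": 5, "d": 1}


def topological_features(g, u, v):
    """Features computed only from the graph structure around u and v."""
    nu, nv = g[u], g[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: shared neighbors with few links count more.
        # Neighbors of degree 1 are skipped to avoid dividing by log(1) = 0.
        "adamic_adar": sum(
            1.0 / math.log(len(g[w])) for w in common if len(g[w]) > 1
        ),
    }


def aggregated_features(attr, u, v):
    """Features computed by aggregating a node attribute over the pair."""
    return {"sum_papers": attr[u] + attr[v]}


def pair_features(g, attr, u, v):
    """Combine both families into one feature dictionary for the pair."""
    feats = topological_features(g, u, v)
    feats.update(aggregated_features(attr, u, v))
    return feats


# Candidate pair ("a", "d") is not currently linked in the toy graph.
features = pair_features(graph, paper_count, "a", "d")
print(features)
```

Feature dictionaries like this one, computed for linked and unlinked pairs, could then be passed to any of the supervised classifiers named in the abstract. Restricting `pair_features` to one family at a time would mimic, in spirit, the comparison of feature sets that the abstract describes.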
