Emergent Mind

Abstract

Clustering is a common task in unsupervised machine learning. Most algorithms aim to extract consistent clusters of data, but very few are designed to group data that lie at the same intersection of two or more features. This paper proposes a method to do so. The method first generates fuzzy clusters of data using a Fuzzy C-Means (FCM) algorithm. It then applies a filter that selects a range of minimum and maximum membership values, emphasizing the border data; a parameter μ defines the amplitude of this range. Finally, it applies a k-means algorithm to the membership values generated by the FCM, so that data with similar membership values regroup in a new crisp cluster. The algorithm also finds the optimal number of clusters for both the FCM and the k-means steps, according to the consistency of the clusters as measured by the Silhouette Index (SI). The result is a list of data and clusters that regroup data sharing the same intersection of two or more features. ck-means thus extracts very similar data that do not naturally fall in the same cluster but lie at the intersection of two or more clusters.
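The three stages described in the abstract (FCM, membership filter, k-means on the memberships) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not specify the filter, so `border_filter` is a hypothetical rule that keeps points whose largest membership lies within `mu` of 0.5. The toy data, the initial centers, and the fixed cluster counts (chosen by hand rather than by the Silhouette Index) are also assumptions made for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_ij is proportional to d_ij^(-2/(m-1))."""
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if min(dists) == 0.0:
            # Point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows

def fcm(points, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = [tuple(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        u = fcm_memberships(points, centers, m)
        new_centers = []
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]  # fuzzified weights for center j
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)))
        centers = new_centers
    return centers, fcm_memberships(points, centers, m)

def border_filter(u, mu=0.2):
    """Hypothetical filter (assumption): keep points whose top membership is within mu of 0.5."""
    return [i for i, row in enumerate(u) if abs(max(row) - 0.5) <= mu]

def kmeans(vectors, init_centroids, iters=50):
    """Plain Lloyd's k-means; here it is run on membership vectors, not raw coordinates."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(v, centroids[j]))
                  for v in vectors]
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (assumption): three compact groups plus four points lying between them.
points = [
    (0, 0), (1, 0), (0, 1), (1, 1),          # group A
    (10, 0), (11, 0), (10, 1), (11, 1),      # group B
    (20, 0), (21, 0), (20, 1), (21, 1),      # group C
    (5.5, 0.5), (5.4, 0.6),                  # between A and B
    (15.5, 0.5), (15.6, 0.4),                # between B and C
]

centers, u = fcm(points, init_centers=[(0, 0), (10, 0), (20, 0)])
border = border_filter(u, mu=0.2)            # indices of the border points
vecs = [u[i] for i in border]                # their membership vectors
labels = kmeans(vecs, init_centroids=[vecs[0], vecs[-1]])

for i, lab in zip(border, labels):
    print(points[i], "intersection group", lab)
```

In this toy setup the points between A and B end up in one crisp group and the points between B and C in another, because k-means sees only their membership vectors. A fuller version would scan candidate cluster counts for both stages and keep the ones with the best Silhouette Index, as the abstract describes.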
