Comparison of the methods to determine optimal number of cluster
Authors : Fatih Emre Öztürk, Neslihan Demirel
Pages : 34-45
Publication Date : 2023-06-30
Article Type : Research Article
Abstract : Clustering is an unsupervised learning technique that divides observations into groups based on their similarity. The most widely used clustering algorithm is k-means. However, this algorithm requires the number of clusters to be specified in advance. In this study, the most widely used methods for determining the number of clusters, namely Average Silhouette, Caliński-Harabasz, Davies-Bouldin and Dunn Index, were used. The performance of these methods was compared using the Rand Index and Meila's Variation of Information (MVI) criteria on nine real data sets for which the number of clusters was known in advance. According to these criteria, Average Silhouette gave more successful results than the other methods.
Keywords : Cluster analysis, Average Silhouette, Dunn Index, Davies-Bouldin
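Illustrative sketch (not the authors' code): the abstract describes choosing the number of clusters with an internal index such as Average Silhouette and then checking the chosen partition against known labels with the Rand Index. The pure-Python example below shows one way these two quantities could be computed. The function names rand_index and average_silhouette, the Euclidean distance, and the six toy points are assumptions made for illustration only; they are not taken from the study or its nine data sets.

```python
from itertools import combinations
from math import dist


def rand_index(labels_true, labels_pred):
    """Fraction of observation pairs on which two partitions agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two observations")
    # A pair agrees when both partitions put it together, or both split it.
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def average_silhouette(points, labels):
    """Mean silhouette width over all points (Euclidean distance assumed)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy data: three well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]
true_labels = [0, 0, 1, 1, 2, 2]

# Two hand-written candidate partitions standing in for k-means output.
candidates = {
    2: [0, 0, 1, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}

# Pick the number of clusters with the highest average silhouette,
# then score that partition against the known labels.
best_k = max(candidates, key=lambda k: average_silhouette(points, candidates[k]))
score = rand_index(true_labels, candidates[best_k])
print(best_k, score)
```

In a setup like the one in the abstract, the candidate partitions would come from running k-means for each candidate number of clusters; they are written by hand here so that the sketch stays deterministic and dependency-free.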