Table 4 Comparison of clustering performance across different datasets and clustering techniques

From: Improving clustering performance using independent component analysis and unsupervised feature learning

Performance of the applied processing on different datasets (NMI, ACC):

| Method | COIL20 | COIL100 | CMU-PIE | USPS | MNIST | REUTERS-10K |
|---|---|---|---|---|---|---|
| Baseline | | | | | | |
| K-means | 0.735, 0.597 | 0.822, 0.615 | 0.532, 0.239 | 0.659, 0.694 | 0.527, 0.553 | 0.356, 0.541 |
| Deep learning | | | | | | |
| AE+K-means (2016) | | | | | –, 0.818 | –, 0.666 |
| NMF-D (2014) | 0.692, – | 0.719, – | 0.920, 0.810 | 0.287, 0.382 | 0.152, 0.750 | |
| TSC-D (2016) | 0.928, 0.899 | | | | 0.651, 0.692 | |
| DEN (2014) | 0.870, 0.725 | | | | | |
| DBC (2017) | 0.895, 0.793 | 0.905, 0.775 | | 0.724, 0.743 | 0.917, 0.964 | |
| IEC (2016) | | 0.787, 0.546 | | 0.641, 0.767 | 0.542, 0.609 | |
| AEC (2013) | | | | 0.651, 0.715 | 0.669, 0.760 | |
| DCN (2016) | | | | | 0.810, 0.830 | |
| DEC (2016) | | | 0.924, 0.801 | 0.586, 0.619 | –, 0.818 | –, 0.722 |
| DCEC (2017) | | | | 0.826, 0.790 | 0.885, 0.890 | |
| DEPICT (2017) | | | 0.974, 0.883 | 0.927, 0.964 | 0.917, 0.965 | |
| JULE-SF (2016) | 1.000, – | 0.978, – | 0.984, 0.980 | 0.858, 0.922 | 0.906, 0.959 | |
| JULE-RC (2016) | 1.000, – | 0.985, – | 1.000, 1.000 | 0.913, 0.950 | 0.913, 0.964 | |
| VaDE (2016) | | | | | –, 0.945 | –, 0.798 |
| IMSAT (2017) | | | | | –, 0.984 | –, 0.719 |
| SpectralNet (2018) | | | | | 0.924, 0.971 | |
| Non-deep learning | | | | | | |
| AC-GDL (2012) | 0.865, – | 0.797, – | 0.934, 0.842 | 0.824, 0.867 | 0.017, 0.113 | |
| AC-PIC (2013) | 0.855, – | 0.840, – | 0.902, 0.797 | 0.840, 0.855 | 0.017, 0.015 | |
| SEC (2011) | | | | 0.511, 0.544 | 0.779, 0.804 | |
| LDMGI (2010) | | | | 0.563, 0.580 | 0.802, 0.842 | |
| Ours^a | 0.965, 0.930 | 0.962, 0.897 | 0.986, 0.937 | 0.868, 0.926 | 0.824, 0.882 | 0.460, 0.714 |
a. The maximum clustering performance obtained by the processing components proposed in this study is given for each dataset. Where available, both NMI and ACC are reported in each cell, with NMI first. A blank cell means the clustering method was not applied to that dataset. Italic font within a column indicates the maximum performance obtained for that dataset. The full names and references of the compared methods are: Deep Embedding Network (DEN) [11], Discriminatively Boosted Clustering (DBC) [55], Infinite Ensemble Clustering (IEC) [56], Autoencoder-based Clustering (AEC) [10], Deep Embedded Clustering (DEC) [8], Deep Clustering Network (DCN) [13], Deep Convolutional Embedded Clustering (DCEC) [57], Deep Embedded Regularized Clustering (DEPICT) [14], Variational Deep Embedding (VaDE) [15], autoencoder with K-means clustering (AE+K-means) [8], Information Maximizing Self-Augmented Training (IMSAT) [58], NMF with Deep learning model (NMF-D) [59], Joint Unsupervised Learning (JULE) "-SF" and "-RC" [16], Task-specific Deep Architecture for Clustering (TSC-D) [60], Graph Degree Linkage-based Agglomerative Clustering (AC-GDL) [61], Agglomerative Clustering via Path Integral (AC-PIC) [62], Spectral Embedded Clustering (SEC) [63], and Local Discriminant Models and Global Integration (LDMGI) [52].
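For reference, NMI and ACC are the two standard external clustering metrics reported in the table: NMI is normalized mutual information between predicted clusters and ground-truth classes, and ACC is the accuracy under the best one-to-one mapping of cluster labels to class labels. Below is a minimal sketch of how these metrics are commonly computed (not the authors' code); it assumes integer labels starting at 0 and uses scikit-learn for NMI and SciPy's Hungarian assignment for ACC. The helper name `clustering_accuracy` is illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """Unsupervised clustering accuracy (ACC): accuracy under the best
    one-to-one mapping of predicted cluster labels to ground-truth
    classes, found with the Hungarian algorithm."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = max(y_pred.max(), y_true.max()) + 1
    # Contingency table: w[i, j] counts samples in cluster i with class j.
    w = np.zeros((n, n), dtype=np.int64)
    for c, t in zip(y_pred, y_true):
        w[c, t] += 1
    # Hungarian algorithm on -w maximizes the total matched count.
    row_ind, col_ind = linear_sum_assignment(-w)
    return w[row_ind, col_ind].sum() / y_pred.size

# Usage with any clustering output, e.g. y_pred = KMeans(n_clusters=k).fit_predict(X):
# nmi = normalized_mutual_info_score(y_true, y_pred)
# acc = clustering_accuracy(y_true, y_pred)
```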