Abstract
Clustering samples by similarity remains a significant challenge, especially when the goal is to accurately capture underlying clusters of complex, arbitrary shape. Density-based clustering techniques are known to be well suited to capturing arbitrarily shaped clusters. However, a key limitation of these methods is the difficulty of automatically finding the optimal set of parameters for a given dataset's characteristics, which becomes even more challenging when the data contain inherent noise. In our recent work, we proposed a Differential Evolution-based DENsity CLUstEring (DE-DENCLUE) method to optimise DENCLUE parameters. This study evaluates the robustness of DE-DENCLUE in finding accurate clusters when the data are noisy. DE-DENCLUE is compared against three other density-based clustering algorithms across several synthetic and real datasets: density peaks clustering (DPC) based on weighted local density sequence and nearest neighbour assignment (DPCSA), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Variable Kernel Density Estimation–based DENCLUE (VDENCLUE). Clustering quality is measured with the Silhouette Index (SI), Davies–Bouldin Index (DBI), Adjusted Rand Index (ARI), and Adjusted Mutual Information (AMI). DE-DENCLUE consistently achieves superior SI, ARI, and AMI values compared with the other models on most datasets at different noise levels. In some cases, however, DPCSA performed better on DBI. In conclusion, the proposed method offers a reliable and noise-resilient clustering solution for complex datasets.
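To illustrate one of the external quality metrics named in the abstract, the Adjusted Rand Index can be computed from the contingency counts of two labelings. The following is a minimal sketch using only the Python standard library. It is not the authors' code, and the function name `adjusted_rand_index` is a hypothetical name chosen for illustration. A noise label such as `-1` is assumed to be treated as one more group.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two flat labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)

    # Contingency counts: how many samples share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of sample pairs placed together in both, in true, and in predicted.
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_sizes.values())
    sum_pred = sum(comb(c, 2) for c in pred_sizes.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: nothing to disagree on

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    maximum = 0.5 * (sum_true + sum_pred)         # largest attainable agreement
    if maximum == expected:
        # Degenerate case, e.g. both labelings put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (maximum - expected)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1, -1]      # -1 marks a noise point (assumed convention)
    predicted = [1, 1, 1, 0, 0, -1, -1]
    print(adjusted_rand_index(truth, predicted))
```

The index is invariant to how cluster labels are named, so a prediction that recovers the same grouping under different label values still scores 1.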
Original language | English |
---|---|
Article number | 3367 |
Number of pages | 38 |
Journal | Mathematics |
Volume | 12 |
Issue number | 21 |
Publication status | Published - 27 Oct 2024 |
Data Availability Statement
The original data presented in this study are openly available at http://cs.uef.fi/sipu/datasets/ (accessed on 29 August 2024) and https://archive.ics.uci.edu (accessed on 29 August 2024). The code is available under CC BY 4.0 licence at https://doi.org/10.6084/m9.figshare.26543863.
Funding
Shahzad Mumtaz was supported by the University of Aberdeen start-up fund (CF10834-23).
Funders | Funder number |
---|---|
University of Aberdeen | CF10834-23 |
Keywords
- DENCLUE algorithm
- differential evolution
- density-based clustering
- parameter optimisation
- noise robustness