Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to falling into local optima. To address this issue, we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), which combines the traditional k-modes clustering algorithm with the artificial bee colony approach. We first introduce a one-step k-modes procedure and then integrate it with the artificial bee colony approach to handle categorical data. In the search process performed by scout bees, we adopt a multi-source search, inspired by the idea of batch processing, to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated in a series of experiments against other popular algorithms for categorical data.
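To make the one-step k-modes procedure concrete, the following is a minimal sketch of a single assign-then-update pass of standard k-modes. It is not the authors' implementation: it assumes the usual simple matching (Hamming) dissimilarity and per-attribute majority vote for the mode update, and the names `hamming` and `one_step_k_modes` are hypothetical. The artificial bee colony search and the multi-source scout step are not shown.

```python
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_k_modes(data, modes):
    """Run one assign-then-update pass of k-modes (illustrative sketch).

    data  : list of tuples of categorical values
    modes : list of k tuples, the current cluster modes
    Returns (new_modes, labels, cost).
    """
    # Assignment: each object goes to the mode with the fewest mismatches.
    labels = [
        min(range(len(modes)), key=lambda j: hamming(row, modes[j]))
        for row in data
    ]

    # Update: the new mode takes the most frequent category per attribute.
    new_modes = []
    for j, old_mode in enumerate(modes):
        members = [row for row, lab in zip(data, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster keeps its previous mode.
            new_modes.append(tuple(old_mode))
            continue
        new_modes.append(
            tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
        )

    # Cost: total mismatches between objects and their cluster modes.
    cost = sum(hamming(row, new_modes[lab]) for row, lab in zip(data, labels))
    return new_modes, labels, cost
```

In a hybrid scheme of the kind described above, one plausible use is to treat each candidate set of modes as a food source and apply such a pass to refine it, with the returned cost serving as the quantity to minimise. That pairing is an assumption about how the pieces fit together, not a statement of the paper's exact design.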
Bibliographical note
Funding: This work was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 21127010 and 61202309 (http://www.nsfc.gov.cn/), the China Postdoctoral Science Foundation under Grant No. 2013M530956 (http://res.chinapostdoctor.org.cn), the UK Economic & Social Research Council (ESRC), award reference ES/M001628/1 (http://www.esrc.ac.uk/), the Science and Technology Development Plan of Jilin Province under Grant No. 20140520068JH (http://www.jlkjt.gov.cn), the Fundamental Research Funds for the Central Universities under Grant No. 14QNJJ028 (http://www.nenu.edu.cn), and the open project program of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University, under Grant No. 93K172014K07 (http://www.jlu.edu.cn). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.