A Text Multi-label Classification Scheme Based on Resampling and Ensemble Learning

Tianhao Wang, Tianrang Weng, Jiacheng Ji, Mingjun Zhong, Baili Zhang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

Medical dispute cases are highly specialized and closely tied to medicine, so in practice their mediation depends heavily on similar historical cases. Multi-label classification of legal documents can efficiently filter out irrelevant historical cases, which helps recommend similar historical cases faster and more accurately. However, class imbalance and label symbiosis in the data set directly affect multi-label classification of legal documents. This paper therefore presents a multi-label classification scheme based on resampling and ensemble learning. The scheme has two parts. In the first, to reduce the impact of label symbiosis on resampling, a resampling algorithm based on the average sparsity of the label set is proposed to reduce the imbalance of the data set. In the second, a multi-label classification algorithm based on ensemble learning is proposed that trains multiple base classifiers and combines them with a voting strategy of one vote, which can effectively improve classification performance. The experimental results show that the proposed scheme improves multi-label classification and is suitable not only for legal documents but also for other text data sets with imbalanced classes and label symbiosis problems.
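The abstract does not spell out how the "voting strategy of one vote" combines the base classifiers. One plausible reading, offered here only as an assumption, is that a label is assigned whenever at least one base classifier predicts it. The minimal Python sketch below illustrates that reading; the function name `combine_by_votes`, the `min_votes` parameter, and the example labels are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import List, Set


def combine_by_votes(predictions: List[Set[str]], min_votes: int = 1) -> Set[str]:
    """Combine multi-label predictions from several base classifiers.

    ASSUMPTION: "one vote" is read as min_votes=1, i.e. a label is kept
    if at least one base classifier predicts it. The paper may define
    the strategy differently.
    """
    # Count how many base classifiers predicted each label.
    counts = Counter(label for labels in predictions for label in labels)
    # Keep every label that reaches the vote threshold.
    return {label for label, votes in counts.items() if votes >= min_votes}


# Hypothetical outputs of three base classifiers for one document.
base_outputs = [
    {"surgery", "compensation"},
    {"surgery"},
    {"misdiagnosis"},
]

one_vote = combine_by_votes(base_outputs)               # union of all predicted labels
two_votes = combine_by_votes(base_outputs, min_votes=2)  # stricter threshold, for comparison
```

Under this reading, a low threshold favors recall on rare labels, which fits the abstract's concern with imbalanced data, though the paper itself should be consulted for the actual rule.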

Original language: English
Title of host publication: Advances in Artificial Intelligence and Security
Subtitle of host publication: 8th International Conference on Artificial Intelligence and Security, ICAIS 2022, Proceedings, Part II
Editors: Xingming Sun, Xiaorui Zhang, Zhihua Xia, Elisa Bertino
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 67-80
Number of pages: 14
ISBN (Electronic): 978-3-031-06761-7
ISBN (Print): 9783031067600
Publication status: Published - 8 Jul 2022
Event: 8th International Conference on Artificial Intelligence and Security, ICAIS 2022 - Qinghai, China
Duration: 15 Jul 2022 to 20 Jul 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1587 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 8th International Conference on Artificial Intelligence and Security, ICAIS 2022
Country/Territory: China
City: Qinghai
Period: 15/07/22 to 20/07/22

Bibliographical note

Funding Information:
Acknowledgement. This work was partly supported by the National Key R&D Program of China (2018YFC0830200), the Fundamental Research Funds for the Central Universities (2242018S30021 and 2242017S30023), and the Open Research Fund from the Key Laboratory of Computer Network and Information Integration in Southeast University, Ministry of Education, China.

Keywords

  • Class imbalance
  • Ensemble learning
  • Label symbiosis
  • Multi-label classification
  • Resampling algorithm
