TY - GEN
T1 - Improving Medical Image Classification in Noisy Labels Using only Self-supervised Pretraining
AU - Khanal, Bidur
AU - Bhattarai, Binod
AU - Khanal, Bishesh
AU - Linte, Cristian A.
N1 - Acknowledgments. Research reported in this publication was supported by the National Institute of General Medical Sciences Award No. R35GM128877 of the National Institutes of Health, the Office of Advanced Cyber Infrastructure Award No. 1808530 of the National Science Foundation, and the Division of Chemistry, Bioengineering, Environmental, and Transport Systems Award No. 2245152 of the National Science Foundation. We would like to thank the Research Computing team [31] at the Rochester Institute of Technology for providing computing resources for this research.
PY - 2023/10/1
Y1 - 2023/10/1
N2 - Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining method alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear whether methods that improve learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels—NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
AB - Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining method alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear whether methods that improve learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels—NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
KW - feature extraction
KW - label noise
KW - learning with noisy labels
KW - medical image classification
KW - self-supervised pretraining
KW - warm-up obstacle
UR - http://www.scopus.com/inward/record.url?scp=85174586575&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-44992-5_8
DO - 10.1007/978-3-031-44992-5_8
M3 - Published conference contribution
AN - SCOPUS:85174586575
SN - 9783031449918
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 78
EP - 90
BT - DEMI 2023: Data Engineering in Medical Imaging
A2 - Bhattarai, Binod
A2 - Ali, Sharib
A2 - Rau, Anita
A2 - Nguyen, Anh
A2 - Namburete, Ana
A2 - Caramalau, Razvan
A2 - Stoyanov, Danail
PB - Springer Science and Business Media Deutschland GmbH
T2 - 1st MICCAI Workshop on Data Engineering in Medical Imaging, DEMI 2023
Y2 - 8 October 2023 through 8 October 2023
ER -