TY - GEN
T1 - Improving Medical Image Classification in Noisy Labels Using only Self-supervised Pretraining
AU - Khanal, Bidur
AU - Bhattarai, Binod
AU - Khanal, Bishesh
AU - Linte, Cristian A.
N1 - Acknowledgments. Research reported in this publication was supported by the National Institute of General Medical Sciences Award No. R35GM128877 of the National Institutes of Health, the Office of Advanced Cyber Infrastructure Award No. 1808530 of the National Science Foundation, and the Division of Chemistry, Bioengineering, Environmental, and Transport Systems Award No. 2245152 of the National Science Foundation. We would like to thank the Research Computing team [31] at the Rochester Institute of Technology for providing computing resources for this research.
PY - 2023/10/1
Y1 - 2023/10/1
N2 - Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining method alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear whether methods that improve learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels—NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
AB - Noisy labels hurt deep learning-based supervised image classification performance as the models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining method alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear whether methods that improve learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels—NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
KW - feature extraction
KW - label noise
KW - learning with noisy labels
KW - medical image classification
KW - self-supervised pretraining
KW - warm-up obstacle
UR - http://www.scopus.com/inward/record.url?scp=85174586575&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-44992-5_8
DO - 10.1007/978-3-031-44992-5_8
M3 - Published conference contribution
AN - SCOPUS:85174586575
SN - 9783031449918
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 78
EP - 90
BT - DEMI 2023: Data Engineering in Medical Imaging
A2 - Bhattarai, Binod
A2 - Ali, Sharib
A2 - Rau, Anita
A2 - Nguyen, Anh
A2 - Namburete, Ana
A2 - Caramalau, Razvan
A2 - Stoyanov, Danail
PB - Springer Science and Business Media Deutschland GmbH
T2 - 1st MICCAI Workshop on Data Engineering in Medical Imaging, DEMI 2023
Y2 - 8 October 2023 through 8 October 2023
ER -