Deep Bayesian Self-Training

Fabio De Sousa Ribeiro; Francesco Calivá; Mark Swainson; Kjartan Gudmundsson; Georgios Leontidis; Stefanos Kollias

doi:10.1007/s00521-019-04332-4

Corresponding author: Georgios Leontidis

Research output: Contribution to journal › Article › peer-review

Abstract

Supervised deep learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of annotated data is often a challenge. In most real-world problems, manual annotation is practically intractable due to time/labour constraints; thus, the development of automated and adaptive data annotation systems is highly sought after. In this paper, we propose both (1) a deep Bayesian self-training methodology for automatic data annotation, which leverages predictive uncertainty estimates obtained using variational inference and modern neural network (NN) architectures, and (2) a practical adaptation procedure for handling high label variability between different dataset distributions through clustering of NN latent variable representations. An experimental study on both public and private datasets is presented, illustrating the superior performance of the proposed approach over standard self-training baselines and highlighting the importance of predictive uncertainty estimates in safety-critical domains.
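
As a rough illustration of how predictive uncertainty can gate automatic annotation, the sketch below accepts a pseudo-label only when the predictive entropy of the averaged class probabilities is low. It is a minimal sketch under stated assumptions, not the authors' implementation: this record does not specify the uncertainty measure, the threshold, or how the stochastic predictions are produced, so the entropy criterion, the function names, the max_entropy parameter, and the toy probabilities are all hypothetical.

```python
import numpy as np


def predictive_entropy(mc_probs):
    """Return mean class probabilities and their entropy per example.

    mc_probs has shape (T, N, C): T stochastic predictions (assumed here
    to come from sampling an approximate posterior), N unlabelled
    examples, C classes.
    """
    mean_probs = mc_probs.mean(axis=0)  # shape (N, C)
    # Small constant avoids log(0); entropy is in nats.
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    return mean_probs, entropy


def select_pseudo_labels(mc_probs, max_entropy):
    """Keep examples whose predictive entropy is at most max_entropy.

    max_entropy is a hypothetical threshold chosen for illustration.
    """
    mean_probs, entropy = predictive_entropy(mc_probs)
    keep = np.flatnonzero(entropy <= max_entropy)
    pseudo_labels = mean_probs[keep].argmax(axis=1)
    return keep, pseudo_labels, entropy


# Toy data (made up): 3 stochastic predictions, 2 examples, 2 classes.
mc_probs = np.array([
    [[0.98, 0.02], [0.60, 0.40]],
    [[0.99, 0.01], [0.40, 0.60]],
    [[0.97, 0.03], [0.55, 0.45]],
])
keep, pseudo_labels, entropy = select_pseudo_labels(mc_probs, max_entropy=0.3)
print(keep, pseudo_labels, entropy)
```

In this toy input the first example is predicted consistently and would be pseudo-labelled, while the second, whose predictions disagree across samples, would be left for manual annotation or a later round. In a full self-training loop, the accepted examples would be added to the labelled set before retraining.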

Original language	English
Pages (from-to)	4275-4291
Number of pages	17
Journal	Neural Computing and Applications
Volume	32
Early online date	10 Jul 2019
DOIs	https://doi.org/10.1007/s00521-019-04332-4
Publication status	Published - May 2020

Bibliographical note

Acknowledgements
The authors would like to thank Mr. George Marandianos, Mrs. Mamatha Thota and Mr. Samuel Bond-Taylor for manually annotating datasets used in this study and of course the reviewers for their constructive feedback that helped to improve the manuscript. We would also like to thank Professor Luc Bidaut for enabling this collaboration.

Funding
The research presented in this paper was funded by Engineering and Physical Sciences Research Council (Reference Number EP/R005524/1) and Innovate UK (Reference Number 102908), in collaboration with the Olympus Automation Limited Company, for the project Automated Robotic Food Manufacturing System.

Keywords

Machine Learning
Deep Learning
Representation learning
Bayesian CNN
Variational inference
Clustering
Self-training
Adaptation
Uncertainty weighting

Access to Document

10.1007/s00521-019-04332-4 (Licence: CC BY)

SousaRibeiro_et_al_NCA_DeepBayesianSelfTraining_VoR
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Final published version, 1.23 MB (Licence: CC BY)

Cite this

@article{d20f97865b024860a764b993cb131ec5,

title = "Deep Bayesian Self-Training",

abstract = "Supervised deep learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of annotated data is often a challenge. In most real-world problems, manual annotation is practically intractable due to time/labour constraints; thus, the development of automated and adaptive data annotation systems is highly sought after. In this paper, we propose both a (1) deep Bayesian self-training methodology for automatic data annotation, by leveraging predictive uncertainty estimates using variational inference and modern neural network (NN) architectures, as well as (2) a practical adaptation procedure for handling high label variability between different dataset distributions through clustering of NN latent variable representations. An experimental study on both public and private datasets is presented illustrating the superior performance of the proposed approach over standard self-training baselines, highlighting the importance of predictive uncertainty estimates in safety-critical domains.",

keywords = "Machine Learning, Deep Learning, Representation learning, Bayesian CNN, Variational inference, Clustering, Self-training, Adaptation, Uncertainty weighting",

author = "Ribeiro, {Fabio De Sousa} and Francesco Caliv{\'a} and Mark Swainson and Kjartan Gudmundsson and Georgios Leontidis and Stefanos Kollias",

note = "Acknowledgements The authors would like to thank Mr. George Marandianos, Mrs. Mamatha Thota and Mr. Samuel Bond-Taylor for manually annotating datasets used in this study and of course the reviewers for their constructive feedback that helped to improve the manuscript. We would also like to thank Professor Luc Bidaut for enabling this collaboration. Funding The research presented in this paper was funded by Engineering and Physical Sciences Research Council (Reference Number EP/R005524/1) and Innovate UK (Reference Number 102908), in collaboration with the Olympus Automation Limited Company, for the project Automated Robotic Food Manufacturing System.",

year = "2020",

month = may,

doi = "10.1007/s00521-019-04332-4",

language = "English",

volume = "32",

pages = "4275--4291",

journal = "Neural Computing and Applications",

issn = "0941-0643",

publisher = "Springer London",

}

TY - JOUR

T1 - Deep Bayesian Self-Training

AU - Ribeiro, Fabio De Sousa

AU - Calivá, Francesco

AU - Swainson, Mark

AU - Gudmundsson, Kjartan

AU - Leontidis, Georgios

AU - Kollias, Stefanos

N1 - Acknowledgements The authors would like to thank Mr. George Marandianos, Mrs. Mamatha Thota and Mr. Samuel Bond-Taylor for manually annotating datasets used in this study and of course the reviewers for their constructive feedback that helped to improve the manuscript. We would also like to thank Professor Luc Bidaut for enabling this collaboration. Funding The research presented in this paper was funded by Engineering and Physical Sciences Research Council (Reference Number EP/R005524/1) and Innovate UK (Reference Number 102908), in collaboration with the Olympus Automation Limited Company, for the project Automated Robotic Food Manufacturing System.

PY - 2020/5

Y1 - 2020/5

N2 - Supervised deep learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of annotated data is often a challenge. In most real-world problems, manual annotation is practically intractable due to time/labour constraints; thus, the development of automated and adaptive data annotation systems is highly sought after. In this paper, we propose both a (1) deep Bayesian self-training methodology for automatic data annotation, by leveraging predictive uncertainty estimates using variational inference and modern neural network (NN) architectures, as well as (2) a practical adaptation procedure for handling high label variability between different dataset distributions through clustering of NN latent variable representations. An experimental study on both public and private datasets is presented illustrating the superior performance of the proposed approach over standard self-training baselines, highlighting the importance of predictive uncertainty estimates in safety-critical domains.

KW - Machine Learning

KW - Deep Learning

KW - Representation learning

KW - Bayesian CNN

KW - Variational inference

KW - Clustering

KW - Self-training

KW - Adaptation

KW - Uncertainty weighting

UR - http://www.scopus.com/inward/record.url?scp=85069661389&partnerID=8YFLogxK

U2 - 10.1007/s00521-019-04332-4

DO - 10.1007/s00521-019-04332-4

M3 - Article

SN - 0941-0643

VL - 32

SP - 4275

EP - 4291

JO - Neural Computing and Applications

JF - Neural Computing and Applications

ER -
