Transformer For Image Quality Assessment

Junyong You, Jari Korhonen

Research output: Chapter in Book/Report/Conference proceeding > Published conference contribution

103 Citations (Scopus)

Abstract

The Transformer has become the standard method in natural language processing (NLP) and is attracting growing research interest in computer vision. In this paper, we investigate the application of Transformer in Image Quality (TRIQ) assessment. Following the original Transformer encoder employed in the Vision Transformer (ViT), we propose an architecture that places a shallow Transformer encoder on top of a feature map extracted by a convolutional neural network (CNN). Adaptive positional embedding is employed in the Transformer encoder to handle images of arbitrary resolution. Different Transformer architecture settings have been investigated on publicly available image quality databases, and we have found that the proposed TRIQ architecture achieves outstanding performance. The implementation of TRIQ is published on GitHub (https://github.com/junyongyou/triq).
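
The way a CNN feature map can feed a Transformer encoder at any input resolution may be easier to see in a small sketch. The code below is an illustration under stated assumptions, not the published TRIQ implementation: it assumes that "adaptive" positional embedding means keeping one table sized for the largest expected token count and using only its first rows for smaller images. The helper names (`zeros_feature_map`, `make_positional_table`, `feature_map_to_tokens`, `add_positional_embedding`) and the sizes are hypothetical, and random values stand in for learned weights.

```python
import random


def zeros_feature_map(height, width, channels):
    """Hypothetical stand-in for a CNN output: an H x W x C map of zeros."""
    return [[[0.0] * channels for _ in range(width)] for _ in range(height)]


def make_positional_table(max_tokens, dim, seed=0):
    """Stand-in for a learned table: one dim-sized vector per position.

    Assumption: the table is sized for the largest image expected.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.02, 0.02) for _ in range(dim)]
            for _ in range(max_tokens)]


def feature_map_to_tokens(feature_map):
    """Flatten an H x W x C feature map into H*W tokens of size C."""
    return [list(pixel) for row in feature_map for pixel in row]


def add_positional_embedding(tokens, table):
    """Add the first len(tokens) rows of the table to the tokens.

    Assumption: truncating a fixed table is how variable-length
    sequences are handled; the real model may differ.
    """
    if len(tokens) > len(table):
        raise ValueError("image yields more tokens than the table covers")
    return [[t + p for t, p in zip(token, position)]
            for token, position in zip(tokens, table)]


# Two "images" of different resolution share one positional table.
table = make_positional_table(max_tokens=64, dim=8)
small = add_positional_embedding(
    feature_map_to_tokens(zeros_feature_map(2, 3, 8)), table)
large = add_positional_embedding(
    feature_map_to_tokens(zeros_feature_map(4, 4, 8)), table)
print(len(small), len(large))  # 6 16
```

In this sketch the resulting token sequences would then pass through a shallow Transformer encoder. Sequence length varies with resolution, while the token width stays fixed by the CNN's channel count.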

Original language: English
Title of host publication: 2021 IEEE International Conference on Image Processing (ICIP 2021)
Publisher: IEEE Computer Society
Pages: 1389-1393
Number of pages: 5
ISBN (Electronic): 9781665441155
Publication status: Published - 23 Aug 2021
Event: 2021 IEEE International Conference on Image Processing, ICIP 2021 - Anchorage, United States
Duration: 19 Sept 2021 to 22 Sept 2021

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2021-September
ISSN (Print): 1522-4880

Conference

Conference: 2021 IEEE International Conference on Image Processing, ICIP 2021
Country/Territory: United States
City: Anchorage
Period: 19/09/21 to 22/09/21

Keywords

  • Attention
  • Hybrid model
  • Image quality assessment
  • Transformer
