TY - GEN
T1 - Gated Transformer Representing Region Importance for Image Quality Assessment
AU - You, Junyong
AU - Lin, Yuan
AU - Korhonen, Jari
PY - 2024/9/9
Y1 - 2024/9/9
N2 - Deep neural networks, particularly convolutional neural networks (CNNs), have shown significant promise in image quality assessment (IQA), yet the underlying workings of these models in IQA remain partially unexplored. This study unveils a novel positionally masked transformer, shedding light on how various regions of an image influence its overall quality. Surprisingly, the findings reveal that half of an image may exert only a marginal influence on image quality, while the remaining half proves vital. This observation has been extended to other CNN-based IQA models, unearthing a consistent pattern where specific image regions significantly shape overall quality. In a stride to understand these phenomena, three semantic measures: saliency, frequency, and objectness, have been identified, exhibiting a strong correlation with the importance of image regions in IQA. Building upon these insights, a new gated operation has been proposed, representing the fluctuating significance of regions in image quality. A gate, integrable into a transformer encoder for IQA, serves to pinpoint the crucial spatial regions, enhancing their impact by amplifying attention weights. The resulting gated transformer has been rigorously tested on publicly available IQA datasets, demonstrating exceptional performance and reinforcing the innovative nature of this approach. The success of this study paves the way for more intricate and insightful analyses of IQA.
AB - Deep neural networks, particularly convolutional neural networks (CNNs), have shown significant promise in image quality assessment (IQA), yet the underlying workings of these models in IQA remain partially unexplored. This study unveils a novel positionally masked transformer, shedding light on how various regions of an image influence its overall quality. Surprisingly, the findings reveal that half of an image may exert only a marginal influence on image quality, while the remaining half proves vital. This observation has been extended to other CNN-based IQA models, unearthing a consistent pattern where specific image regions significantly shape overall quality. In a stride to understand these phenomena, three semantic measures: saliency, frequency, and objectness, have been identified, exhibiting a strong correlation with the importance of image regions in IQA. Building upon these insights, a new gated operation has been proposed, representing the fluctuating significance of regions in image quality. A gate, integrable into a transformer encoder for IQA, serves to pinpoint the crucial spatial regions, enhancing their impact by amplifying attention weights. The resulting gated transformer has been rigorously tested on publicly available IQA datasets, demonstrating exceptional performance and reinforcing the innovative nature of this approach. The success of this study paves the way for more intricate and insightful analyses of IQA.
KW - Explainable AI (XAI)
KW - gated operation
KW - image quality assessment
KW - image region importance
KW - semantic measures
UR - http://www.scopus.com/inward/record.url?scp=85204979738&partnerID=8YFLogxK
U2 - 10.1109/IJCNN60899.2024.10650165
DO - 10.1109/IJCNN60899.2024.10650165
M3 - Published conference contribution
AN - SCOPUS:85204979738
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Joint Conference on Neural Networks, IJCNN 2024
Y2 - 30 June 2024 through 5 July 2024
ER -