DKTNet: Dual-Key Transformer Network for small object detection

Shoukun Xu, Jianan Gu, Yining Hua* (Corresponding Author), Yi Liu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)

Abstract

Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of their low resolution and noisy appearance, caused by occlusion, distant views, and similar factors. To tackle this issue, this paper proposes a novel transformer architecture, the Dual-Key Transformer Network (DKTNet). To improve feature attention, the coherence of the linear-layer outputs Q and V is enhanced by a dual key K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully connected computation in conventional transformer architectures, the 2D convolution better captures local details and global contextual information, and the 1D convolution significantly reduces network complexity. Experiments are conducted on both general and small object detection datasets, and comparison against state-of-the-art approaches demonstrates the advantages of these design choices.
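The channel-wise attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how K1 and K2 are derived or combined, so the sketch takes them as inputs and assumes an element-wise mean, and the scaling by the square root of N is borrowed from standard scaled dot-product attention. The function names are hypothetical, and plain Python lists are used only to keep the example dependency-free. The point it shows is that the attention map has shape C x C (channels by channels) rather than N x N (positions by positions).

```python
import math

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def softmax_rows(a):
    """Numerically stable softmax applied to each row independently."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def channel_attention(q, k1, k2, v):
    """Channel-wise attention over features of shape (C, N), with N = H * W.

    ASSUMPTION: the dual key is the element-wise mean of k1 and k2, and the
    scores are scaled by sqrt(N); neither detail is stated in the abstract.
    """
    n = len(q[0])
    k = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]
    scores = matmul(q, transpose(k))                     # shape (C, C)
    scale = math.sqrt(n)
    weights = softmax_rows([[s / scale for s in row] for row in scores])
    return weights, matmul(weights, v)                   # (C, C) and (C, N)

# Toy input: C = 2 channels, N = 4 flattened spatial positions.
q = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
k1 = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
k2 = [[0.5, 0.5, 0.5, 0.5], [1.0, 0.0, 0.0, 1.0]]
v = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
weights, out = channel_attention(q, k1, k2, v)
```

Each row of `weights` is a distribution over channels, so every output channel is a weighted mix of the input channels of `v`. The cost of the attention map grows with the number of channels rather than with the number of spatial positions.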

Original language: English
Pages (from-to): 29-41
Number of pages: 13
Journal: Neurocomputing
Volume: 525
Publication status: Published - 7 Mar 2023

Bibliographical note

Funding Information:
This work was supported by the National Natural Science Foundation of China under Grant 62001341 and the Natural Science Foundation of Jiangsu Province under Grant BK20221379.

Data Availability Statement

Data will be made available on request.

Keywords

  • Dual-key
  • Small object detection
  • Transformer
