Rethinking Attentive Object Detection via Neural Attention Learning

Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 33(2024), dated: 18., pages 1726-1739
Main Author: Ge, Chongjian (Author)
Other Authors: Song, Yibing, Ma, Chao, Qi, Yuankai, Luo, Ping
Format: Online Article
Language: English
Published: 2024
Collection: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Abstract: Visual attention advances object detection by attending neural networks to object representations. While existing methods incorporate empirical modules to empower network attention, we rethink attentive object detection from the network learning perspective in this work. We propose a NEural Attention Learning approach (NEAL) which consists of two parts. During the back-propagation of each training iteration, we first calculate the partial derivatives (a.k.a. the accumulated gradients) of the classification output with respect to the input features. We refine these partial derivatives to obtain attention response maps whose elements reflect the contributions to the final network predictions. Then, we formulate the attention response maps as extra objective functions, which are combined together with the original detection loss to train detectors in an end-to-end manner. In this way, we succeed in learning an attentive CNN model without introducing additional network structures. We apply NEAL to the two-stage object detection frameworks, which are usually composed of a CNN feature backbone, a region proposal network (RPN), and a classifier. We show that the proposed NEAL not only helps the RPN attend to objects but also enables the classifier to pay more attention to the premier positive samples. To this end, the localization (proposal generation) and classification mutually benefit from each other in our proposed method. Extensive experiments on large-scale benchmark datasets, including MS COCO 2017 and Pascal VOC 2012, demonstrate that the proposed NEAL algorithm advances the two-stage object detector over state-of-the-art approaches.
Description: Date Revised 08.03.2024
Published: Print-Electronic
Citation Status: PubMed-not-MEDLINE
ISSN: 1941-0042
DOI: 10.1109/TIP.2023.3251693
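
The two steps named in the abstract (gradients of the classification output with respect to the input features, then an attention-based term added to the detection loss) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the gradients are refined or what form the extra objective takes, so the ReLU(gradient x feature) refinement, the mean-squared-error attention loss, the weight `lam`, and every function name below are assumptions chosen only for illustration. The classifier is a tiny sigmoid-linear model in pure Python so that the gradient can be written analytically.

```python
import math


def classification_score(features, weights, bias):
    """Sigmoid score of a toy linear classifier over a C x H x W feature map."""
    z = bias
    for f_c, w_c in zip(features, weights):
        for f_row, w_row in zip(f_c, w_c):
            for f, w in zip(f_row, w_row):
                z += f * w
    return 1.0 / (1.0 + math.exp(-z))


def input_gradients(features, weights, bias):
    """Analytic d(score)/d(feature) for the toy model: s * (1 - s) * w."""
    s = classification_score(features, weights, bias)
    scale = s * (1.0 - s)
    return [[[scale * w for w in row] for row in w_c] for w_c in weights]


def attention_map(features, grads):
    """ASSUMED refinement: ReLU(gradient * feature), summed over channels,
    then divided by the maximum so values lie in [0, 1]."""
    h, w = len(features[0]), len(features[0][0])
    amap = [[0.0] * w for _ in range(h)]
    for f_c, g_c in zip(features, grads):
        for i in range(h):
            for j in range(w):
                amap[i][j] += max(0.0, f_c[i][j] * g_c[i][j])
    peak = max(max(row) for row in amap)
    if peak > 0.0:
        amap = [[v / peak for v in row] for row in amap]
    return amap


def attention_loss(amap, mask):
    """ASSUMED extra objective: mean squared error between the attention
    map and a binary object mask (1 inside the object, 0 outside)."""
    n = len(amap) * len(amap[0])
    return sum((a - m) ** 2
               for a_row, m_row in zip(amap, mask)
               for a, m in zip(a_row, m_row)) / n


def total_loss(detection_loss, amap, mask, lam=0.5):
    """Detection loss plus the weighted attention term (lam is hypothetical)."""
    return detection_loss + lam * attention_loss(amap, mask)


# One-channel 2 x 2 example; the object occupies only the top-left cell.
features = [[[1.0, 0.0], [0.0, 2.0]]]
weights = [[[0.5, 0.5], [-1.0, 0.25]]]
mask = [[1.0, 0.0], [0.0, 0.0]]

grads = input_gradients(features, weights, bias=0.0)
amap = attention_map(features, grads)
print(attention_loss(amap, mask))                   # 0.25
print(total_loss(1.0, amap, mask))                  # 1.125
```

In this example the attention map responds equally to the top-left and bottom-right cells, while the mask marks only the top-left one, so the attention term is nonzero and would push training toward the object region. The sketch only evaluates the combined objective; training a detector end to end, as the abstract describes, would require differentiating this objective with an automatic differentiation framework.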