Background-Aware Classification Activation Map for Weakly Supervised Object Localization

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 12, 4 Dec., pages 14175-14191
Main author: Zhu, Lei (Author)
Other authors: She, Qi; Chen, Qian; Meng, Xiangxi; Geng, Mufeng; Jin, Lujia; Zhang, Yibao; Ren, Qiushi; Lu, Yanye
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
Description
Abstract: Weakly supervised object localization (WSOL) relaxes the requirement of dense annotations for object localization by using image-level annotations to supervise the learning process. However, most WSOL methods focus only on forcing the object classifier to produce high activation scores on object parts without considering the influence of background locations, causing excessive background activations and an ill-posed search for the background score. To address this, our work proposes a novel mechanism called the background-aware classification activation map (B-CAM) to add background awareness to WSOL training. Besides aggregating an object image-level feature for supervision, B-CAM produces an additional background image-level feature to represent the pure-background sample. This additional feature provides background cues that let the object classifier suppress background activations on object localization maps. Moreover, B-CAM also trains a background classifier with image-level annotations to produce adaptive background scores when determining the binary localization mask. Experiments indicate the effectiveness of the proposed B-CAM on four different types of WSOL benchmarks: the CUB-200, ILSVRC, OpenImages, and VOC2012 datasets.
Description: Date Revised 07.11.2023
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN: 1939-3539
DOI: 10.1109/TPAMI.2023.3309621