Background-Aware Classification Activation Map for Weakly Supervised Object Localization

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 12, 04 Dec., pages 14175-14191
Main Author: Zhu, Lei (Author)
Other Authors: She, Qi, Chen, Qian, Meng, Xiangxi, Geng, Mufeng, Jin, Lujia, Zhang, Yibao, Ren, Qiushi, Lu, Yanye
Format: Online Article
Language: English
Published: 2023
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
Description
Summary: Weakly supervised object localization (WSOL) relaxes the requirement of dense annotations for object localization by using image-level annotation to supervise the learning process. However, most WSOL methods only focus on forcing the object classifier to produce high activation scores on object parts without considering the influence of background locations, causing excessive background activations and ill-posed background score searching. To address this, our work proposes a novel mechanism called the background-aware classification activation map (B-CAM) to add background awareness to WSOL training. Besides aggregating an object image-level feature for supervision, our B-CAM produces an additional background image-level feature to represent the pure-background sample. This additional feature provides background cues that let the object classifier suppress background activations on object localization maps. Moreover, our B-CAM also trains a background classifier with image-level annotation to produce adaptive background scores when determining the binary localization mask. Experiments indicate the effectiveness of the proposed B-CAM on four different types of WSOL benchmarks, including the CUB-200, ILSVRC, OpenImages, and VOC2012 datasets.
Description: Date Revised 07.11.2023
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1939-3539
DOI:10.1109/TPAMI.2023.3309621