Task-Aware Weakly Supervised Object Localization With Transformer

Weakly supervised object localization (WSOL) aims to predict both object locations and categories with only image-level class labels. However, most existing methods rely on class-specific image regions for localization, resulting in incomplete object localization. To alleviate this problem, we propo...

Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 7, 7 July, pages 9109-9121
Main author:	Meng, Meng (Author)
Other authors:	Zhang, Tianzhu, Zhang, Zhe, Zhang, Yongdong, Wu, Feng
Format:	Online article
Language:	English
Published:	2023
Collection access:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article


LEADER	01000caa a22002652c 4500
001	NLM355202816
003	DE-627
005	20250304150744.0
007	cr uuu---uuuuu
008	231226s2023 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TPAMI.2022.3230902 \|2 doi
028	5	2	\|a pubmed25n1183.xml
035			\|a (DE-627)NLM355202816
035			\|a (NLM)37015535
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Meng, Meng \|e verfasserin \|4 aut
245	1	0	\|a Task-Aware Weakly Supervised Object Localization With Transformer
264		1	\|c 2023
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 06.06.2023
500			\|a Date Revised 06.06.2023
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Weakly supervised object localization (WSOL) aims to predict both object locations and categories with only image-level class labels. However, most existing methods rely on class-specific image regions for localization, resulting in incomplete object localization. To alleviate this problem, we propose a novel end-to-end task-aware framework with a transformer encoder-decoder architecture (TAFormer) to learn class-agnostic foreground maps, including a representation encoder, a localization decoder, and a classification decoder. The proposed TAFormer enjoys several merits. First, the designed three modules can effectively perform class-agnostic localization and classification in a task-aware manner, achieving remarkable performance for both tasks. Second, an optimal transport algorithm is proposed to provide pixel-level pseudo labels to online refine foreground maps. To the best of our knowledge, this is the first work by exploring a task-aware framework with a transformer architecture and an optimal transport algorithm to achieve accurate object localization for WSOL. Extensive experiments with four backbones on two standard benchmarks demonstrate that our TAFormer achieves favorable performance against state-of-the-art methods. Furthermore, we show that the proposed TAFormer provides higher robustness against adversarial attacks and noisy labels
650		4	\|a Journal Article
700	1		\|a Zhang, Tianzhu \|e verfasserin \|4 aut
700	1		\|a Zhang, Zhe \|e verfasserin \|4 aut
700	1		\|a Zhang, Yongdong \|e verfasserin \|4 aut
700	1		\|a Wu, Feng \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on pattern analysis and machine intelligence \|d 1979 \|g 45(2023), 7 vom: 07. Juli, Seite 9109-9121 \|w (DE-627)NLM098212257 \|x 1939-3539 \|7 nnas
773	1	8	\|g volume:45 \|g year:2023 \|g number:7 \|g day:07 \|g month:07 \|g pages:9109-9121
856	4	0	\|u http://dx.doi.org/10.1109/TPAMI.2022.3230902 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 45 \|j 2023 \|e 7 \|b 07 \|c 07 \|h 9109-9121