Zero-Shot Image Recognition via Learning Dual Prototype Accordance Across Meta-Domains

Zero-shot learning (ZSL) aims to recognize unseen classes by transferring semantic knowledge from seen categories. However, existing methods often struggle with the persistent semantic gap caused by limited semantic descriptors and rigid visual feature modeling. In particular, modeling pre-defined c...


Bibliographic Details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 34(2025), day: 15, pages 6361-6373
Main author:	Ren, Bocheng (Author)
Other authors:	Yi, Yuanyuan, Zhang, Qingchen, Liu, Debin
Format:	Online article
Language:	English
Published:	2025
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000caa a22002652c 4500
001	NLM392639858
003	DE-627
005	20251008232028.0
007	cr uuu---uuuuu
008	250917s2025 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2025.3607588 \|2 doi
028	5	2	\|a pubmed25n1593.xml
035			\|a (DE-627)NLM392639858
035			\|a (NLM)40953417
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Ren, Bocheng \|e verfasserin \|4 aut
245	1	0	\|a Zero-Shot Image Recognition via Learning Dual Prototype Accordance Across Meta-Domains
264		1	\|c 2025
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 08.10.2025
500			\|a published: Print
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Zero-shot learning (ZSL) aims to recognize unseen classes by transferring semantic knowledge from seen categories. However, existing methods often struggle with the persistent semantic gap caused by limited semantic descriptors and rigid visual feature modeling. In particular, modeling pre-defined class-level attribute descriptions as ground truth hinders effective semantic-to-visual alignment to some extent. To mitigate these issues, we propose the Bilateral-guided Prototype Refinement Network (BPRN), a novel ZSL framework designed to refine dual prototypes across meta-domains of varying scales. Specifically, we first disentangle the relationships among class-level semantics and use them to generate corresponding pseudo-visual prototypes. Then, by leveraging distribution information across dual prototypes in different meta-domains, BPRN achieves bidirectional calibration between visual-to-semantic and semantic-to-visual modalities. Finally, a synthesized class-level representation derived from the refined dual prototypes is employed for inference, instead of relying on a single prototype. Extensive experiments conducted on five widely-used ZSL benchmark datasets demonstrate that BPRN consistently achieves competitive or even superior performance. Specifically, in the GZSL scenario, BPRN shows improvements of 2.1%, 7.3%, 6.1%, and 4.8% on AWA1, AWA2, SUN, and aPY, respectively, compared to existing embedding-based ZSL methods. Ablation studies and visualization analyses further validate the effectiveness of the proposed components
650		4	\|a Journal Article
700	1		\|a Yi, Yuanyuan \|e verfasserin \|4 aut
700	1		\|a Zhang, Qingchen \|e verfasserin \|4 aut
700	1		\|a Liu, Debin \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 34(2025) vom: 15., Seite 6361-6373 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnas
773	1	8	\|g volume:34 \|g year:2025 \|g day:15 \|g pages:6361-6373
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2025.3607588 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 34 \|j 2025 \|b 15 \|h 6361-6373