Zero-Shot Image Recognition via Learning Dual Prototype Accordance Across Meta-Domains

Bibliographic Details
Published in: IEEE Transactions on Image Processing : a publication of the IEEE Signal Processing Society, vol. 34 (2025), pp. 6361-6373
Main author: Ren, Bocheng (Author)
Other authors: Yi, Yuanyuan; Zhang, Qingchen; Liu, Debin
Format: Online article
Language: English
Published: 2025
Collection: IEEE Transactions on Image Processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Abstract: Zero-shot learning (ZSL) aims to recognize unseen classes by transferring semantic knowledge from seen categories. However, existing methods often struggle with the persistent semantic gap caused by limited semantic descriptors and rigid visual feature modeling. In particular, modeling pre-defined class-level attribute descriptions as ground truth hinders effective semantic-to-visual alignment to some extent. To mitigate these issues, we propose the Bilateral-guided Prototype Refinement Network (BPRN), a novel ZSL framework designed to refine dual prototypes across meta-domains of varying scales. Specifically, we first disentangle the relationships among class-level semantics and use them to generate corresponding pseudo-visual prototypes. Then, by leveraging distribution information across dual prototypes in different meta-domains, BPRN achieves bidirectional calibration between visual-to-semantic and semantic-to-visual modalities. Finally, a synthesized class-level representation derived from the refined dual prototypes is employed for inference, instead of relying on a single prototype. Extensive experiments conducted on five widely used ZSL benchmark datasets demonstrate that BPRN consistently achieves competitive or even superior performance. Notably, in the GZSL scenario, BPRN shows improvements of 2.1%, 7.3%, 6.1%, and 4.8% on AWA1, AWA2, SUN, and aPY, respectively, compared to existing embedding-based ZSL methods. Ablation studies and visualization analyses further validate the effectiveness of the proposed components.
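
The abstract describes the pipeline only at a high level. As a rough illustration of the dual-prototype idea (not the authors' implementation, which is not reproduced here), the sketch below assumes class-level attribute vectors as semantic prototypes and averaged backbone features as visual prototypes; the module names, the simple L2 calibration losses, the learned fusion weight, and all dimensions are hypothetical stand-ins.

```python
# Illustrative sketch only: a minimal dual-prototype pipeline in the spirit of
# the abstract. Semantic attributes are mapped to pseudo-visual prototypes,
# visual prototypes are mapped back to the semantic space (bidirectional
# calibration), and a fused class representation is used for nearest-prototype
# inference. All names and loss choices are hypothetical.
import torch
import torch.nn as nn


class DualPrototypeSketch(nn.Module):
    def __init__(self, attr_dim: int, feat_dim: int, hidden: int = 512):
        super().__init__()
        # Semantic-to-visual mapping: generates pseudo-visual prototypes
        # from class-level attribute descriptions.
        self.sem_to_vis = nn.Sequential(
            nn.Linear(attr_dim, hidden), nn.ReLU(), nn.Linear(hidden, feat_dim)
        )
        # Visual-to-semantic mapping for the reverse calibration direction.
        self.vis_to_sem = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, attr_dim)
        )
        # Mixing weight for fusing the two prototypes at inference.
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, class_attrs, visual_protos):
        # Pseudo-visual prototypes generated from semantics.   [C, feat_dim]
        pseudo_vis = self.sem_to_vis(class_attrs)
        # Semantic prototypes reconstructed from visual stats.  [C, attr_dim]
        pseudo_sem = self.vis_to_sem(visual_protos)
        # Bidirectional calibration loss (a simple L2 stand-in here).
        loss = ((pseudo_vis - visual_protos) ** 2).mean() + \
               ((pseudo_sem - class_attrs) ** 2).mean()
        # Synthesized class representation: convex combination of the
        # real and pseudo visual prototypes.
        fused = self.alpha * visual_protos + (1 - self.alpha) * pseudo_vis
        return fused, loss


def classify(features, fused_protos):
    # Nearest-prototype classification by cosine similarity.
    f = nn.functional.normalize(features, dim=-1)
    p = nn.functional.normalize(fused_protos, dim=-1)
    return (f @ p.t()).argmax(dim=-1)


if __name__ == "__main__":
    C, attr_dim, feat_dim = 50, 85, 2048   # AWA-like sizes, chosen arbitrarily
    attrs = torch.randn(C, attr_dim)
    vis_protos = torch.randn(C, feat_dim)
    model = DualPrototypeSketch(attr_dim, feat_dim)
    fused, loss = model(attrs, vis_protos)
    preds = classify(torch.randn(8, feat_dim), fused)
    print(fused.shape, loss.item(), preds.shape)
```

For unseen classes, only attribute vectors are available, so classification would rely on the pseudo-visual prototypes alone; the fusion step above applies where visual statistics exist.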
Description: Date Revised 08.10.2025
Published: Print
Citation Status: PubMed-not-MEDLINE
ISSN: 1941-0042
DOI: 10.1109/TIP.2025.3607588