Zero-Shot Image Recognition via Learning Dual Prototype Accordance Across Meta-Domains
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 34(2025) of: 15., pages 6361-6373 |
---|---|
Main author: | |
Other authors: | , , |
Format: | Online article |
Language: | English |
Published: | 2025 |
Collection access: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Subjects: | Journal Article |
Abstract: | Zero-shot learning (ZSL) aims to recognize unseen classes by transferring semantic knowledge from seen categories. However, existing methods often struggle with the persistent semantic gap caused by limited semantic descriptors and rigid visual feature modeling. In particular, modeling pre-defined class-level attribute descriptions as ground truth hinders effective semantic-to-visual alignment to some extent. To mitigate these issues, we propose the Bilateral-guided Prototype Refinement Network (BPRN), a novel ZSL framework designed to refine dual prototypes across meta-domains of varying scales. Specifically, we first disentangle the relationships among class-level semantics and use them to generate corresponding pseudo-visual prototypes. Then, by leveraging distribution information across dual prototypes in different meta-domains, BPRN achieves bidirectional calibration between visual-to-semantic and semantic-to-visual modalities. Finally, a synthesized class-level representation derived from the refined dual prototypes is employed for inference, instead of relying on a single prototype. Extensive experiments conducted on five widely used ZSL benchmark datasets demonstrate that BPRN consistently achieves competitive or even superior performance. Specifically, in the GZSL scenario, BPRN shows improvements of 2.1%, 7.3%, 6.1%, and 4.8% on AWA1, AWA2, SUN, and aPY, respectively, compared to existing embedding-based ZSL methods. Ablation studies and visualization analyses further validate the effectiveness of the proposed components. |
---|---|
Description: | Date Revised: 08.10.2025; published: Print; Citation Status: PubMed-not-MEDLINE |
ISSN: | 1941-0042 |
DOI: | 10.1109/TIP.2025.3607588 |