Semantics-Preserving Graph Propagation for Zero-Shot Object Detection

Most existing object detection models are restricted to detecting objects from previously seen categories, an approach that tends to become infeasible for rare or novel concepts. Accordingly, in this paper, we explore object detection in the context of zero-shot learning, i.e., Zero-Shot Object Dete...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - PP(2020) vom: 30. Juli
1. Verfasser: Yan, Caixia (VerfasserIn)
Weitere Verfasser: Zheng, Qinghua, Chang, Xiaojun, Luo, Minnan, Yeh, Chung-Hsing, Hauptmann, Alexander G
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2020
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article
Beschreibung
Zusammenfassung:Most existing object detection models are restricted to detecting objects from previously seen categories, an approach that tends to become infeasible for rare or novel concepts. Accordingly, in this paper, we explore object detection in the context of zero-shot learning, i.e., Zero-Shot Object Detection (ZSD), to concurrently recognize and localize objects from novel concepts. Existing ZSD algorithms are typically based on a simple mapping-transfer strategy that is susceptible to the domain shift problem. To resolve this problem, we propose a novel Semantics-Preserving Graph Propagation model for ZSD based on Graph Convolutional Networks (GCN). More specifically, we employ a graph construction module to flexibly build category graphs by incorporating diverse correlations between category nodes; this is followed by two semantics preserving modules that enhance both category and region representations through a multi-step graph propagation process. Compared to existing mapping-transfer based methods, both the semantic description and semantic structural knowledge exhibited in prior category graphs can be effectively leveraged to boost the generalization capability of the learned projection function via knowledge transfer, thereby providing a solution to the domain shift problem. Experiments on existing seen/unseen splits of three popular object detection datasets demonstrate that the proposed approach performs favorably against state-of-the-art ZSD methods
Beschreibung:Date Revised 27.02.2024
published: Print-Electronic
Citation Status Publisher
ISSN:1941-0042
DOI:10.1109/TIP.2020.3011807