6D-ViT : Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning

This paper presents 6D vision transformer (6D-ViT), a transformer-based instance representation learning network suitable for highly accurate category-level object pose estimation based on RGB-D images. Specifically, a novel two-stream encoder-decoder framework is dedicated to exploring complex and...

Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022) dated: 15., pages 6907-6921
Main Author: Zou, Lu (Author)
Other Authors: Huang, Zhangjin; Gu, Naijie; Wang, Guoping
Format: Online Article
Language: English
Published: 2022
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article