6D-ViT : Category-Level 6D Object Pose Estimation via Transformer-Based Instance Representation Learning
This paper presents 6D vision transformer (6D-ViT), a transformer-based instance representation learning network suitable for highly accurate category-level object pose estimation based on RGB-D images. Specifically, a novel two-stream encoder-decoder framework is dedicated to exploring complex and...
Veröffentlicht in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022) vom: 15., Seite 6907-6921 |
---|---|
1. Verfasser: | |
Weitere Verfasser: | , , |
Format: | Online-Aufsatz |
Sprache: | English |
Veröffentlicht: |
2022
|
Zugriff auf das übergeordnete Werk: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Schlagworte: | Journal Article |
Online verfügbar |
Volltext |