What Makes for Hierarchical Vision Transformer?
Recent studies indicate that hierarchical Vision Transformers (ViTs) with a macro architecture of interleaved non-overlapped window-based self-attention and shifted-window operations can achieve state-of-the-art performance in various visual recognition tasks, and challenge the ubiquitous convolutio...
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 10, 5 Oct., pp. 12714-12720 |
---|---|
Main author: | |
Other authors: | , , |
Format: | Online article |
Language: | English |
Published: | 2023 |
Collection access: | IEEE transactions on pattern analysis and machine intelligence |
Subjects: | Journal Article |
Online access |
Full text |