PSLT : A Light-Weight Vision Transformer With Ladder Self-Attention and Progressive Shift

Vision Transformer (ViT) has shown great potential for various visual tasks due to its ability to model long-range dependencies. However, ViT requires a large amount of computing resources to compute the global self-attention. In this work, we propose a ladder self-attention block with multiple branches...
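The truncated abstract does not reproduce the method's details, but the general idea it names, replacing a single quadratic-cost global self-attention with several cheaper branches operating on local, progressively shifted windows, can be illustrated with a rough NumPy sketch. All names, the channel split, and the shift scheme below are assumptions for illustration, not the paper's exact design:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x):
    # x: (N, C) tokens; builds a full N x N attention map,
    # so cost grows quadratically with the number of tokens N.
    attn = softmax(x @ x.T / np.sqrt(x.shape[1]))
    return attn @ x

def branch_windowed_attention(x, num_branches=4, window=16, shift_step=4):
    # Illustrative multi-branch local attention (an assumption, not PSLT's
    # exact block): channels are split across branches, each branch attends
    # only within fixed-size windows, and each branch's windows are shifted
    # progressively so that, combined, tokens can mix across window borders.
    N, C = x.shape
    outs = []
    for b, xb in enumerate(np.split(x, num_branches, axis=1)):
        shift = b * shift_step
        xs = np.roll(xb, shift, axis=0)           # progressive shift per branch
        ys = np.zeros_like(xs)
        for start in range(0, N, window):
            w = xs[start:start + window]          # attention only inside window
            ys[start:start + window] = global_attention(w)
        outs.append(np.roll(ys, -shift, axis=0))  # undo the shift
    return np.concatenate(outs, axis=1)
```

Per window the attention map is only `window x window`, so the total cost scales linearly in the number of tokens instead of quadratically, which is the kind of saving the abstract's "multiple branches" design targets.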

Detailed Description

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence. - 1979. - 45(2023), 9, 05 Sept., pp. 11120-11135
Main Author: Wu, Gaojie (Author)
Other Authors: Zheng, Wei-Shi, Lu, Yutong, Tian, Qi
Format: Online Article
Language: English
Published: 2023
Access to parent work: IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects: Journal Article