What Makes for Good Tokenizers in Vision Transformer?

The architecture of transformers, which have recently witnessed booming applications in vision tasks, has pivoted against the widespread convolutional paradigm. Relying on the tokenization process that splits inputs into multiple tokens, transformers are capable of extracting their pairwise relationships u...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), no. 11 (7 Nov.), pages 13011-13023
Main author: Qian, Shengju (Author)
Other authors: Zhu, Yi; Li, Wenbo; Li, Mu; Jia, Jiaya
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article