What Makes for Good Tokenizers in Vision Transformer?

The transformer architecture, which has recently seen booming application in vision tasks, has pivoted away from the widespread convolutional paradigm. Relying on a tokenization process that splits inputs into multiple tokens, transformers are capable of extracting their pairwise relationships u...


Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 11, dated 06 Nov., pages 13011-13023
Main author: Qian, Shengju (Author)
Other authors: Zhu, Yi; Li, Wenbo; Li, Mu; Jia, Jiaya
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article