VOLO: Vision Outlooker for Visual Recognition
Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition. With low efficiency in encoding fine-level features, the performance of ViTs is still inferior to the state-of-the-art CNNs when trained from scratch on a midsize dataset like ImageNet. Through experimental analys...
Bibliographic Details
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 5, dated 12 May, pages 6575-6586
Main author: | Yuan, Li (author)
Other authors: | Hou, Qibin; Jiang, Zihang; Feng, Jiashi; Yan, Shuicheng
Format: | Online article
Language: | English
Published: | 2023
Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence
Subjects: | Journal Article