Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition

Vision Transformers have recently been the most popular network architecture in visual recognition due to their strong ability to encode global information. However, their high computational cost when processing high-resolution images limits their application in downstream tasks. In this paper, we take a...

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 15 May
Main author: Hou, Qibin (Author)
Other authors: Lu, Cheng-Ze; Cheng, Ming-Ming; Feng, Jiashi
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article