Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Vision Transformers have recently been the most popular network architecture in visual recognition due to their strong ability to encode global information. However, their high computational cost when processing high-resolution images limits their application in downstream tasks. In this paper, we take a...
| Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 12 of: 15 Dec., pages 8274-8283 |
|---|---|
| Main author: | |
| Other authors: | , , |
| Format: | Online article |
| Language: | English |
| Published: | 2024 |
| Collection access: | IEEE transactions on pattern analysis and machine intelligence |
| Subjects: | Journal Article |
| Online access | Full text |