How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
The vision transformer (ViT) extends the success of transformer models from sequential data to images. The model decomposes an image into many smaller patches and arranges them into a sequence. Multi-head self-attention is then applied to the sequence to learn the attention between patches. Despite ma...
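As a point of reference for the mechanism the abstract describes (not the paper's own code), a minimal PyTorch sketch of the patch-to-sequence decomposition followed by multi-head self-attention might look as follows; the image size, patch size, embedding width, and head count are illustrative assumptions.

```python
# Minimal sketch of the ViT attention pipeline described in the abstract:
# split an image into patches, embed them as a token sequence, and apply
# multi-head self-attention over that sequence. Hyperparameters are assumed.
import torch
import torch.nn as nn

patch_size, embed_dim, num_heads = 16, 192, 3
img = torch.randn(1, 3, 224, 224)                      # one RGB image

# Patch embedding: a strided convolution cuts the 224x224 image into
# 14x14 = 196 non-overlapping 16x16 patches and projects each to 192-d.
to_patches = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
tokens = to_patches(img).flatten(2).transpose(1, 2)    # (1, 196, 192)

# Multi-head self-attention over the patch sequence; `attn` holds the
# patch-to-patch attention weights (averaged over heads by default).
mhsa = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
out, attn = mhsa(tokens, tokens, tokens, need_weights=True)
print(out.shape, attn.shape)                           # (1, 196, 192) (1, 196, 196)
```

The `attn` tensor is the patch-to-patch attention matrix that a visual analytics tool of this kind would inspect.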
Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 29(2023), 6, 05 June, pages 2888-2900
Main author: Li, Yiran (Author)
Other authors: Wang, Junpeng; Dai, Xin; Wang, Liang; Yeh, Chin-Chia Michael; Zheng, Yan; Zhang, Wei; Ma, Kwan-Liu
Format: Online article
Language: English
Published: 2023
Access to parent work: IEEE transactions on visualization and computer graphics
Keywords: Journal Article