How Does Attention Work in Vision Transformers? A Visual Analytics Attempt 
    
    
              
              Vision transformer (ViT) extends the success of transformer models from sequential data to images. The model decomposes an image into many smaller patches and arranges them into a sequence. Multi-head self-attention is then applied to the sequence to learn the attention between patches. Despite ma...
          
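The abstract's pipeline (split an image into patches, flatten them into a token sequence, apply self-attention to learn patch-to-patch attention) can be sketched as follows. This is a minimal single-head illustration with NumPy, not the paper's implementation; the image size, patch size, projection width, and all function names here are illustrative assumptions.

```python
import numpy as np

def image_to_patches(img, patch):
    # img: (H, W, C). Split into non-overlapping patch x patch tiles and
    # flatten each tile into one vector -> a sequence of patch tokens.
    H, W, C = img.shape
    seq = [
        img[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch, :].reshape(-1)
        for r in range(H // patch)
        for c in range(W // patch)
    ]
    return np.stack(seq)  # (num_patches, patch * patch * C)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over the patch sequence.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)             # rows sum to 1
    # A[i, j] is how much patch i attends to patch j.
    return A @ V, A

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8, 3))     # toy 8x8 RGB "image" (assumed size)
X = image_to_patches(img, patch=4)       # 4 patches, each 4*4*3 = 48 dims
d_model = X.shape[1]
Wq, Wk, Wv = (rng.standard_normal((d_model, 16)) for _ in range(3))
out, A = self_attention(X, Wq, Wk, Wv)
print(X.shape, A.shape)                  # (4, 48) (4, 4)
```

The attention matrix `A` (one per head in a real multi-head ViT) is exactly the kind of patch-to-patch weight structure a visual analytics tool for ViTs would inspect.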
    
                  
Bibliographic Details

| Published in: | IEEE transactions on visualization and computer graphics. - 1996. - Vol. 29 (2023), No. 6, 05 June, pp. 2888-2900 |
|---|---|
| First author: | Li, Yiran (author) |
| Other authors: | Wang, Junpeng; Dai, Xin; Wang, Liang; Yeh, Chin-Chia Michael; Zheng, Yan; Zhang, Wei; Ma, Kwan-Liu |
| Format: | Online article |
| Language: | English |
| Published: | 2023 |
| Parent work: | IEEE transactions on visualization and computer graphics |
| Subjects: | Journal Article |