Learning to Recognize Actions on Objects in Egocentric Video With Attention Dictionaries
We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame-level features by leveraging the verb-noun structure of action labels in egocentric video datasets. The core component is class activation pooling (CAP), a diff...
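The abstract describes CAP as a learned pooling operation that aggregates frame-level features into descriptors used for verb-noun prediction. The record does not give the paper's exact formulation, so the following is only a minimal PyTorch-style sketch of attention-dictionary pooling with separate verb and noun heads; all module names, tensor shapes, and hyperparameters here are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: pooling frame-level features with a learned attention
# dictionary, then classifying verbs and nouns. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPool(nn.Module):
    """Pools a T x H x W grid of frame features into K descriptor slots
    using K learned attention queries (an "attention dictionary")."""

    def __init__(self, feat_dim: int, num_slots: int):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_slots, feat_dim) * 0.02)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, H, W, C) frame-level features from a video backbone.
        B, T, H, W, C = feats.shape
        x = feats.reshape(B, T * H * W, C)                    # flatten space-time
        attn = torch.einsum("kc,bnc->bkn", self.queries, x)   # slot-location scores
        attn = F.softmax(attn / C ** 0.5, dim=-1)              # normalize over locations
        return torch.einsum("bkn,bnc->bkc", attn, x)           # (B, K, C) pooled descriptors


class VerbNounHead(nn.Module):
    """Classifies pooled descriptors into verb and noun labels, mirroring the
    verb-noun structure of egocentric action annotations."""

    def __init__(self, feat_dim: int, num_slots: int, num_verbs: int, num_nouns: int):
        super().__init__()
        self.pool = AttentionPool(feat_dim, num_slots)
        self.verb_fc = nn.Linear(feat_dim * num_slots, num_verbs)
        self.noun_fc = nn.Linear(feat_dim * num_slots, num_nouns)

    def forward(self, feats: torch.Tensor):
        z = self.pool(feats).flatten(1)          # (B, K*C) video-level representation
        return self.verb_fc(z), self.noun_fc(z)


if __name__ == "__main__":
    feats = torch.randn(2, 8, 7, 7, 256)         # dummy backbone features
    model = VerbNounHead(feat_dim=256, num_slots=3, num_verbs=97, num_nouns=300)
    verb_logits, noun_logits = model(feats)
    print(verb_logits.shape, noun_logits.shape)  # (2, 97) and (2, 300)
```

The vocabulary sizes and slot count in the usage stub are placeholders, not values taken from the paper.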
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence (1979-), vol. 45 (2023), no. 6, 11 June 2023, pp. 6674-6687
Format: Online article
Language: English
Published: 2023
Parent work: IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects: Journal Article