Learning to Recognize Actions on Objects in Egocentric Video With Attention Dictionaries

We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame level features by leveraging the verb-noun structure of action labels in egocentric video datasets. The core component is class activation pooling (CAP), a diff...


Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 45(2023), 6 (11 June), pages 6674-6687
Main author: Sudhakaran, Swathikiran (Author)
Other authors: Escalera, Sergio; Lanz, Oswald
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article