Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding

Videos capture events that typically contain multiple sequential, and simultaneous, actions even in the span of only a few seconds. However, most large-scale datasets built to train models for action recognition in video only provide a single label per video. Consequently, models can be incorrectly...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 12, 9 Dec., pages 9434-9445
Main author: Monfort, Mathew (Author)
Other authors: Pan, Bowen, Ramakrishnan, Kandan, Andonian, Alex, McNamara, Barry A, Lascelles, Alex, Fan, Quanfu, Gutfreund, Dan, Feris, Rogerio Schmidt, Oliva, Aude
Format: Online article
Language: English
Published: 2022
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't