Multi-Task Learning of Object States and State-Modifying Actions From Web Videos

We aim to learn to temporally localize object state changes and the corresponding state-modifying actions by observing people interacting with objects in long uncurated web videos. We introduce three principal contributions. First, we develop a self-supervised model for jointly learning state-modify...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 7, 5 June, pages 5114-5130
Main author: Soucek, Tomas (author)
Other authors: Alayrac, Jean-Baptiste; Miech, Antoine; Laptev, Ivan; Sivic, Josef
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article