Multi-Task Learning of Object States and State-Modifying Actions From Web Videos

We aim to learn to temporally localize object state changes and the corresponding state-modifying actions by observing people interacting with objects in long uncurated web videos. We introduce three principal contributions. First, we develop a self-supervised model for jointly learning state-modify...


Bibliographic details
Published in:	IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 7 (5 June), pages 5114-5130
Main author:	Soucek, Tomas (author)
Other authors:	Alayrac, Jean-Baptiste; Miech, Antoine; Laptev, Ivan; Sivic, Josef
Format:	Online article
Language:	English
Published:	2024
Access to the parent work:	IEEE transactions on pattern analysis and machine intelligence
Subjects:	Journal Article

Available online	Full text