Learning to Localize Sound Sources in Visual Scenes : Analysis and Applications

Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn to correlate the visual scene and sound, as well as localize the sound source only by observing them like humans? To investigate its empirical learnability, in this work we first present a novel unsup...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 43(2021), 5, 1 May, pages 1605-1619
Main author: Senocak, Arda (author)
Other authors: Oh, Tae-Hyun; Kim, Junsik; Yang, Ming-Hsuan; Kweon, In So
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.