Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers

In this article, we address the problem of tracking multiple speakers via the fusion of visual and auditory information. We propose to exploit the complementary nature and roles of these two modalities in order to accurately estimate smooth trajectories of the tracked persons, to deal with the parti...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 43(2021), 5, dated 21 May, pages 1761-1776
Main author: Ban, Yutong (Author)
Other authors: Alameda-Pineda, Xavier; Girin, Laurent; Horaud, Radu
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't