A compact representation of visual speech data using latent variables

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 36(2014), 1, dated 28 Jan., pages 181-7
First author: Zhou, Ziheng (author)
Other authors: Hong, Xiaopeng; Zhao, Guoying; Pietikäinen, Matti
Format: Online article
Language: English
Published: 2014
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't
Description
Abstract: The problem of visual speech recognition involves the decoding of the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model to provide a compact representation of visual speech data. The model uses latent variables to separately represent the interspeaker variations of visual appearances and those caused by uttering within images, and incorporates the structural information of the visual data through placing priors of the latent variables along a curve embedded within a path graph.
Description: Date Completed 30.06.2014
Date Revised 15.11.2013
published: Print
Citation Status MEDLINE
ISSN: 1939-3539
DOI: 10.1109/TPAMI.2013.173