A compact representation of visual speech data using latent variables
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 36(2014), no. 1 (28 Jan.), pages 181-7 |
---|---|
First author: | |
Other authors: | , , |
Format: | Online article |
Language: | English |
Published: | 2014 |
Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence |
Subjects: | Journal Article; Research Support, Non-U.S. Gov't |
Abstract: | The problem of visual speech recognition involves the decoding of the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model to provide a compact representation of visual speech data. The model uses latent variables to separately represent the interspeaker variations of visual appearances and those caused by uttering within images, and incorporates the structural information of the visual data through placing priors of the latent variables along a curve embedded within a path graph |
---|---|
Description: | Date Completed 30.06.2014; Date Revised 15.11.2013; published: Print; Citation Status: MEDLINE |
ISSN: | 1939-3539 |
DOI: | 10.1109/TPAMI.2013.173 |