3D Talking Face With Personalized Pose Dynamics

Recently, we have witnessed a boom in applications for 3D talking face generation. However, most existing 3D face generation methods can only generate 3D faces with a static head pose, which is inconsistent with how humans perceive faces. Only a few articles focus on head pose generation, but even t...

Bibliographic details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 29(2023), no. 2 (4 Feb.), pages 1438-1449
Main author: Zhang, Chenxu (author)
Other authors: Ni, Saifeng; Fan, Zhipeng; Li, Hongbo; Zeng, Ming; Budagavi, Madhukar; Guo, Xiaohu
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't
LEADER 01000naa a22002652 4500
001 NLM33147719X
003 DE-627
005 20231225213541.0
007 cr uuu---uuuuu
008 231225s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2021.3117484  |2 doi 
028 5 2 |a pubmed24n1104.xml 
035 |a (DE-627)NLM33147719X 
035 |a (NLM)34606458 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Zhang, Chenxu  |e verfasserin  |4 aut 
245 1 0 |a 3D Talking Face With Personalized Pose Dynamics 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 06.04.2023 
500 |a Date Revised 03.05.2023 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a Recently, we have witnessed a boom in applications for 3D talking face generation. However, most existing 3D face generation methods can only generate 3D faces with a static head pose, which is inconsistent with how humans perceive faces. Only a few articles focus on head pose generation, but even these ignore the attribute of personality. In this article, we propose a unified audio-driven approach to endow 3D talking faces with personalized pose dynamics. To achieve this goal, we establish an original person-specific dataset, providing corresponding head poses and face shapes for each video. Our framework is composed of two separate modules: PoseGAN and PGFace. Given an input audio, PoseGAN first produces a head pose sequence for the 3D head, and then, PGFace utilizes the audio and pose information to generate natural face models. With the combination of these two parts, a 3D talking head with dynamic head movement can be constructed. Experimental evidence indicates that our method can generate person-specific head pose sequences that are in sync with the input audio and that best match with the human experience of talking heads 
650 4 |a Journal Article 
650 4 |a Research Support, U.S. Gov't, Non-P.H.S. 
650 4 |a Research Support, Non-U.S. Gov't 
700 1 |a Ni, Saifeng  |e verfasserin  |4 aut 
700 1 |a Fan, Zhipeng  |e verfasserin  |4 aut 
700 1 |a Li, Hongbo  |e verfasserin  |4 aut 
700 1 |a Zeng, Ming  |e verfasserin  |4 aut 
700 1 |a Budagavi, Madhukar  |e verfasserin  |4 aut 
700 1 |a Guo, Xiaohu  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g 29(2023), 2 vom: 04. Feb., Seite 1438-1449  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnns 
773 1 8 |g volume:29  |g year:2023  |g number:2  |g day:04  |g month:02  |g pages:1438-1449 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2021.3117484  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 29  |j 2023  |e 2  |b 04  |c 02  |h 1438-1449