Dubbing Movies via Hierarchical Phoneme Modeling and Acoustic Diffusion Denoising
Given a piece of text, a video clip, and reference audio, the movie dubbing task (also known as Visual Voice Cloning, V2C) aims to generate speech that clones the reference voice and aligns well with the video in both emotion and lip movement, which is more challenging than conventional text-to-speech s...
Bibliographic Details
Published in: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 47(2025), 11, 8 Oct., pages 10361-10377
|
Main author: |
Li, Liang
(Author) |
Other authors: |
Cong, Gaoxiang,
Qi, Yuankai,
Zha, Zheng-Jun,
Wu, Qi,
Sheng, Quan Z,
Huang, Qingming,
Yang, Ming-Hsuan |
Format: | Online article
|
Language: | English |
Published: |
2025
|
Access to the parent work: | IEEE transactions on pattern analysis and machine intelligence
|
Subjects: | Journal Article |