Identity-Aware and Shape-Aware Propagation of Face Editing in Videos


Full description

Bibliographic details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 30(2024), 7, 06 July, pages 3444-3456
Main author: Jiang, Yue-Ren (Author)
Other authors: Chen, Shu-Yu, Fu, Hongbo, Gao, Lin
Format: Online article
Language: English
Published: 2024
Collection: IEEE transactions on visualization and computer graphics
Subjects: Journal Article
LEADER 01000caa a22002652c 4500
001 NLM355232707
003 DE-627
005 20250304151142.0
007 cr uuu---uuuuu
008 231226s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2023.3235364  |2 doi 
028 5 2 |a pubmed25n1183.xml 
035 |a (DE-627)NLM355232707 
035 |a (NLM)37018564 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Jiang, Yue-Ren  |e verfasserin  |4 aut 
245 1 0 |a Identity-Aware and Shape-Aware Propagation of Face Editing in Videos 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 28.06.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a The development of deep generative models has inspired various facial image editing methods, but many of them are difficult to apply directly to video editing due to challenges such as imposing 3D constraints, preserving identity consistency, and ensuring temporal coherence. To address these challenges, we propose a new framework operating on the StyleGAN2 latent space for identity-aware and shape-aware edit propagation on face videos. To reduce the difficulty of maintaining identity, keeping the original 3D motion, and avoiding shape distortions, we disentangle the StyleGAN2 latent vectors of human face video frames to decouple the appearance, shape, expression, and motion from identity. An edit encoding module maps a sequence of image frames to continuous latent codes with 3D parametric control and is trained in a self-supervised manner with an identity loss and triple shape losses. Our model supports propagation of edits in various forms: I. direct appearance editing on a specific keyframe, II. implicit editing of face shape via a given reference image, and III. existing latent-based semantic edits. Experiments show that our method works well on various in-the-wild videos and outperforms an animation-based approach and recent deep generative techniques.
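The core idea of latent-based edit propagation, as described in the abstract, can be illustrated with a minimal sketch: an edit made on one keyframe is expressed as an offset in latent space and applied to every frame's latent code. This is a simplified illustration only, not the paper's actual pipeline; the function and variable names (`propagate_edit`, `frame_latents`) are hypothetical, and the paper's method additionally disentangles identity, shape, expression, and motion, which this sketch omits.

```python
import numpy as np

def propagate_edit(frame_latents: np.ndarray,
                   key_idx: int,
                   edited_key_latent: np.ndarray) -> np.ndarray:
    """Propagate a keyframe edit to all frames via a shared latent offset.

    frame_latents: (T, D) array of per-frame latent codes.
    key_idx: index of the edited keyframe.
    edited_key_latent: (D,) latent code of the keyframe after editing.
    """
    # The edit is the offset between the edited and original keyframe latents.
    delta = edited_key_latent - frame_latents[key_idx]
    # Broadcasting adds the same offset to every frame's latent code,
    # so per-frame motion (the differences between frames) is preserved.
    return frame_latents + delta

# Toy usage: 4 frames with 3-dim latents; edit keyframe 0 along dimension 0.
lat = np.zeros((4, 3))
edited = lat[0] + np.array([1.0, 0.0, 0.0])
out = propagate_edit(lat, 0, edited)
```

Because only the offset is shared, frame-to-frame differences in the latent sequence (the motion) are left untouched, which is why latent-space propagation is a natural fit for video.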
650 4 |a Journal Article 
700 1 |a Chen, Shu-Yu  |e verfasserin  |4 aut 
700 1 |a Fu, Hongbo  |e verfasserin  |4 aut 
700 1 |a Gao, Lin  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g 30(2024), 7 vom: 06. Juli, Seite 3444-3456  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnas 
773 1 8 |g volume:30  |g year:2024  |g number:7  |g day:06  |g month:07  |g pages:3444-3456 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2023.3235364  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 30  |j 2024  |e 7  |b 06  |c 07  |h 3444-3456