3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation

In contrast to the traditional avatar creation pipeline, which is a costly process, contemporary generative approaches directly learn the data distribution from photographs. While plenty of works extend unconditional generative models and achieve some levels of controllability, it is still challenging to ensure multi-view consistency, especially in large poses. In this work, we propose a network that generates 3D-aware portraits while being controllable according to semantic parameters regarding pose, identity, expression and illumination. Our network uses neural scene representation to model 3D-aware portraits, whose generation is guided by a parametric face model that supports explicit control. While the latent disentanglement can be further enhanced by contrasting images with partially different attributes, there still exists noticeable inconsistency in non-face areas when animating expressions. We solve this by proposing a volume blending strategy in which we form a composite output by blending dynamic and static areas, with two parts segmented from the jointly learned semantic field. Our method outperforms prior arts in extensive experiments, producing realistic portraits with vivid expression in natural lighting when viewed from free viewpoints. It also demonstrates generalization ability to real images as well as out-of-domain data, showing great promise in real applications.


Bibliographic Details
Published in: IEEE Transactions on Visualization and Computer Graphics. - 1996. - 30(2024), 9, 16 Aug., pages 6020-6037
Main Author: Tang, Junshu (Author)
Other Authors: Zhang, Bo, Yang, Binxin, Zhang, Ting, Chen, Dong, Ma, Lizhuang, Wen, Fang
Format: Online Article
Language: English
Published: 2024
Access to parent work: IEEE Transactions on Visualization and Computer Graphics
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM363401962
003 DE-627
005 20240801232601.0
007 cr uuu---uuuuu
008 231226s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2023.3323578  |2 doi 
028 5 2 |a pubmed24n1488.xml 
035 |a (DE-627)NLM363401962 
035 |a (NLM)37847635 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Tang, Junshu  |e verfasserin  |4 aut 
245 1 0 |a 3DFaceShop  |b Explicitly Controllable 3D-Aware Portrait Generation 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 01.08.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a In contrast to the traditional avatar creation pipeline, which is a costly process, contemporary generative approaches directly learn the data distribution from photographs. While plenty of works extend unconditional generative models and achieve some levels of controllability, it is still challenging to ensure multi-view consistency, especially in large poses. In this work, we propose a network that generates 3D-aware portraits while being controllable according to semantic parameters regarding pose, identity, expression and illumination. Our network uses neural scene representation to model 3D-aware portraits, whose generation is guided by a parametric face model that supports explicit control. While the latent disentanglement can be further enhanced by contrasting images with partially different attributes, there still exists noticeable inconsistency in non-face areas when animating expressions. We solve this by proposing a volume blending strategy in which we form a composite output by blending dynamic and static areas, with two parts segmented from the jointly learned semantic field. Our method outperforms prior arts in extensive experiments, producing realistic portraits with vivid expression in natural lighting when viewed from free viewpoints. It also demonstrates generalization ability to real images as well as out-of-domain data, showing great promise in real applications. 
650 4 |a Journal Article 
700 1 |a Zhang, Bo  |e verfasserin  |4 aut 
700 1 |a Yang, Binxin  |e verfasserin  |4 aut 
700 1 |a Zhang, Ting  |e verfasserin  |4 aut 
700 1 |a Chen, Dong  |e verfasserin  |4 aut 
700 1 |a Ma, Lizhuang  |e verfasserin  |4 aut 
700 1 |a Wen, Fang  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g 30(2024), 9 vom: 16. Aug., Seite 6020-6037  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnns 
773 1 8 |g volume:30  |g year:2024  |g number:9  |g day:16  |g month:08  |g pages:6020-6037 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2023.3323578  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 30  |j 2024  |e 9  |b 16  |c 08  |h 6020-6037