NerfCap: Human Performance Capture With Dynamic Neural Radiance Fields

This paper addresses the challenge of human performance capture from sparse multi-view or monocular videos. Given a template mesh of the performer, previous methods capture the human motion by non-rigidly registering the template mesh to images with 2D silhouettes or dense photometric alignment. How...

Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 29(2023), 12, 30 Dec., pp. 5097-5110
Main author: Wang, Kangkan (Author)
Other authors: Peng, Sida, Zhou, Xiaowei, Yang, Jian, Zhang, Guofeng
Format: Online article
Language: English
Published: 2023
Access to the parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM345583280
003 DE-627
005 20231226025404.0
007 cr uuu---uuuuu
008 231226s2023 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2022.3202503  |2 doi 
028 5 2 |a pubmed24n1151.xml 
035 |a (DE-627)NLM345583280 
035 |a (NLM)36040949 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wang, Kangkan  |e verfasserin  |4 aut 
245 1 0 |a NerfCap  |b Human Performance Capture With Dynamic Neural Radiance Fields 
264 1 |c 2023 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 22.11.2023 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a This paper addresses the challenge of human performance capture from sparse multi-view or monocular videos. Given a template mesh of the performer, previous methods capture the human motion by non-rigidly registering the template mesh to images with 2D silhouettes or dense photometric alignment. However, the detailed surface deformation cannot be recovered from the silhouettes, while the photometric alignment suffers from instability caused by appearance variation in the videos. To solve these problems, we propose NerfCap, a novel performance capture method based on the dynamic neural radiance field (NeRF) representation of the performer. Specifically, a canonical NeRF is initialized from the template geometry and registered to the video frames by optimizing the deformation field and the appearance model of the canonical NeRF. To capture both large body motion and detailed surface deformation, NerfCap combines linear blend skinning with embedded graph deformation. In contrast to the mesh-based methods that suffer from fixed topology and texture, NerfCap is able to flexibly capture complex geometry and appearance variation across the videos, and synthesize more photo-realistic images. In addition, NerfCap can be pre-trained end to end in a self-supervised manner by matching the synthesized videos with the input videos. Experimental results on various datasets show that NerfCap outperforms prior works in terms of both surface reconstruction accuracy and novel-view synthesis quality 
650 4 |a Journal Article 
700 1 |a Peng, Sida  |e verfasserin  |4 aut 
700 1 |a Zhou, Xiaowei  |e verfasserin  |4 aut 
700 1 |a Yang, Jian  |e verfasserin  |4 aut 
700 1 |a Zhang, Guofeng  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g 29(2023), 12 vom: 30. Dez., Seite 5097-5110  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnns 
773 1 8 |g volume:29  |g year:2023  |g number:12  |g day:30  |g month:12  |g pages:5097-5110 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2022.3202503  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 29  |j 2023  |e 12  |b 30  |c 12  |h 5097-5110