Playing for 3D Human Recovery

Image- and video-based 3D human recovery (i.e., pose and shape estimation) have achieved substantial progress. However, due to the prohibitive cost of motion capture, existing datasets are often limited in scale and diversity. In this work, we obtain massive human sequences by playing the video game...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 12, 1 Dec., pages 10533-10545
Main author: Cai, Zhongang (Author)
Other authors: Zhang, Mingyuan, Ren, Jiawei, Wei, Chen, Ren, Daxuan, Lin, Zhengyu, Zhao, Haiyu, Yang, Lei, Loy, Chen Change, Liu, Ziwei
Format: Online article
Language: English
Published: 2024
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't
LEADER 01000caa a22002652 4500
001 NLM376777168
003 DE-627
005 20250104234458.0
007 cr uuu---uuuuu
008 240828s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3450537  |2 doi 
028 5 2 |a pubmed24n1651.xml 
035 |a (DE-627)NLM376777168 
035 |a (NLM)39190516 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Cai, Zhongang  |e verfasserin  |4 aut 
245 1 0 |a Playing for 3D Human Recovery 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 07.11.2024 
500 |a Date Revised 03.01.2025 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a Image- and video-based 3D human recovery (i.e., pose and shape estimation) have achieved substantial progress. However, due to the prohibitive cost of motion capture, existing datasets are often limited in scale and diversity. In this work, we obtain massive human sequences by playing the video game with automatically annotated 3D ground truths. Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios. More importantly, we study the use of game-playing data and obtain five major insights. First, game-playing data is surprisingly effective. A simple frame-based baseline trained on GTA-Human outperforms more sophisticated methods by a large margin. For video-based methods, GTA-Human is even on par with the in-domain training set. Second, we discover that synthetic data provides critical complements to the real data that is typically collected indoors. We highlight that our investigation into domain gap provides explanations for our data mixture strategies that are simple yet useful, which offers new insights to the research community. Third, the scale of the dataset matters. The performance boost is closely related to the additional data available. A systematic study on multiple key factors (such as camera angle and body pose) reveals that the model performance is sensitive to data density. Fourth, the effectiveness of GTA-Human is also attributed to the rich collection of strong supervision labels (SMPL parameters), which are otherwise expensive to acquire in real datasets. Fifth, the benefits of synthetic data extend to larger models such as deeper convolutional neural networks (CNNs) and Transformers, for which a significant impact is also observed. We hope our work could pave the way for scaling up 3D human recovery to the real world. 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
700 1 |a Zhang, Mingyuan  |e verfasserin  |4 aut 
700 1 |a Ren, Jiawei  |e verfasserin  |4 aut 
700 1 |a Wei, Chen  |e verfasserin  |4 aut 
700 1 |a Ren, Daxuan  |e verfasserin  |4 aut 
700 1 |a Lin, Zhengyu  |e verfasserin  |4 aut 
700 1 |a Zhao, Haiyu  |e verfasserin  |4 aut 
700 1 |a Yang, Lei  |e verfasserin  |4 aut 
700 1 |a Loy, Chen Change  |e verfasserin  |4 aut 
700 1 |a Liu, Ziwei  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 46(2024), 12 vom: 01. Dez., Seite 10533-10545  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:46  |g year:2024  |g number:12  |g day:01  |g month:12  |g pages:10533-10545 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3450537  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 46  |j 2024  |e 12  |b 01  |c 12  |h 10533-10545