3D Human Pose, Shape and Texture From Low-Resolution Images and Videos

3D human pose and shape estimation from monocular images has been an active research area in computer vision. Existing deep learning methods for this task rely on high-resolution input, which, however, is not always available in scenarios such as video surveillance and sports broadcasting. Two c...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 9, 3 Sept., pages 4490-4504
Main author: Xu, Xiangyu (Author)
Other authors: Chen, Hao; Moreno-Noguer, Francesc; Jeni, Laszlo A; De la Torre, Fernando
Format: Online article
Language: English
Published: 2022
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.
LEADER 01000naa a22002652 4500
001 NLM323443761
003 DE-627
005 20231225184210.0
007 cr uuu---uuuuu
008 231225s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2021.3070002  |2 doi 
028 5 2 |a pubmed24n1078.xml 
035 |a (DE-627)NLM323443761 
035 |a (NLM)33788678 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Xu, Xiangyu  |e verfasserin  |4 aut 
245 1 0 |a 3D Human Pose, Shape and Texture From Low-Resolution Images and Videos 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 08.08.2022 
500 |a Date Revised 14.09.2022 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a 3D human pose and shape estimation from monocular images has been an active research area in computer vision. Existing deep learning methods for this task rely on high-resolution input, which, however, is not always available in scenarios such as video surveillance and sports broadcasting. Two common approaches to dealing with low-resolution images are applying super-resolution techniques to the input, which may result in unpleasant artifacts, or simply training one model for each resolution, which is impractical in many realistic applications. To address these issues, this paper proposes a novel algorithm called RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme. The proposed method is able to learn 3D body pose and shape across different resolutions with a single model. The self-supervision loss enforces scale-consistency of the output, and the contrastive learning scheme enforces scale-consistency of the deep features. We show that both of these new losses provide robustness when learning in a weakly-supervised manner. Moreover, we extend the RSC-Net to handle low-resolution videos and apply it to reconstruct textured 3D pedestrians from low-resolution input. Extensive experiments demonstrate that the RSC-Net can achieve consistently better results than the state-of-the-art methods for challenging low-resolution images. 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
650 4 |a Research Support, U.S. Gov't, Non-P.H.S. 
700 1 |a Chen, Hao  |e verfasserin  |4 aut 
700 1 |a Moreno-Noguer, Francesc  |e verfasserin  |4 aut 
700 1 |a Jeni, Laszlo A  |e verfasserin  |4 aut 
700 1 |a De la Torre, Fernando  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 44(2022), 9 vom: 03. Sept., Seite 4490-4504  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:44  |g year:2022  |g number:9  |g day:03  |g month:09  |g pages:4490-4504 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2021.3070002  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 44  |j 2022  |e 9  |b 03  |c 09  |h 4490-4504