DyCrowd : Towards Dynamic Crowd Reconstruction from a Large-scene Video

3D reconstruction of dynamic crowds in large scenes has become increasingly important for applications such as city surveillance and crowd analysis. However, current works attempt to reconstruct 3D crowds from a static image, causing a lack of temporal consistency and inability to alleviate the typi...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2025), 19 Aug.
Main author: Wen, Hao (Author)
Other authors: Kang, Hongbo, Ma, Jian, Huang, Jing, Yang, Yuanwang, Lin, Haozhe, Lai, Yu-Kun, Li, Kun
Format: Online article
Language: English
Published: 2025
Parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000caa a22002652c 4500
001 NLM391447718
003 DE-627
005 20250828002342.0
007 cr uuu---uuuuu
008 250820s2025 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2025.3600465  |2 doi 
028 5 2 |a pubmed25n1546.xml 
035 |a (DE-627)NLM391447718 
035 |a (NLM)40828705 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wen, Hao  |e verfasserin  |4 aut 
245 1 0 |a DyCrowd  |b Towards Dynamic Crowd Reconstruction from a Large-scene Video 
264 1 |c 2025 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 19.08.2025 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a 3D reconstruction of dynamic crowds in large scenes has become increasingly important for applications such as city surveillance and crowd analysis. However, current works attempt to reconstruct 3D crowds from a static image, causing a lack of temporal consistency and inability to alleviate the typical impact caused by occlusions. In this paper, we propose DyCrowd, the first framework for spatio-temporally consistent 3D reconstruction of hundreds of individuals' poses, positions and shapes from a large-scene video. We design a coarse-to-fine group-guided motion optimization strategy for occlusion-robust crowd reconstruction in large scenes. To address temporal instability and severe occlusions, we further incorporate a VAE (Variational Autoencoder)-based human motion prior along with a segment-level group-guided optimization. The core of our strategy leverages collective crowd behavior to address long-term dynamic occlusions. By jointly optimizing the motion sequences of individuals with similar motion segments and combining this with the proposed Asynchronous Motion Consistency (AMC) loss, we enable high-quality unoccluded motion segments to guide the motion recovery of occluded ones, ensuring robust and plausible motion recovery even in the presence of temporal desynchronization and rhythmic inconsistencies. Additionally, in order to fill the gap of no existing well-annotated large-scene video dataset, we contribute a virtual benchmark dataset, VirtualCrowd, for evaluating dynamic crowd reconstruction from large-scene videos. Experimental results demonstrate that the proposed method achieves state-of-the-art performance in the large-scene dynamic crowd reconstruction task. The code and dataset will be available for research purposes 
650 4 |a Journal Article 
700 1 |a Kang, Hongbo  |e verfasserin  |4 aut 
700 1 |a Ma, Jian  |e verfasserin  |4 aut 
700 1 |a Huang, Jing  |e verfasserin  |4 aut 
700 1 |a Yang, Yuanwang  |e verfasserin  |4 aut 
700 1 |a Lin, Haozhe  |e verfasserin  |4 aut 
700 1 |a Lai, Yu-Kun  |e verfasserin  |4 aut 
700 1 |a Li, Kun  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g PP(2025) vom: 19. Aug.  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnas 
773 1 8 |g volume:PP  |g year:2025  |g day:19  |g month:08 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2025.3600465  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2025  |b 19  |c 08