MotionDiffuse : Text-Driven Human Motion Generation With Diffusion Model

Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions conditioned on natural languages. However, it remains challeng...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 6, 29 June, pages 4115-4128
First author: Zhang, Mingyuan (Author)
Other authors: Cai, Zhongang, Pan, Liang, Hong, Fangzhou, Guo, Xinying, Yang, Lei, Liu, Ziwei
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't
LEADER 01000caa a22002652 4500
001 NLM367759985
003 DE-627
005 20250103231835.0
007 cr uuu---uuuuu
008 240130s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3355414  |2 doi 
028 5 2 |a pubmed24n1650.xml 
035 |a (DE-627)NLM367759985 
035 |a (NLM)38285589 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Zhang, Mingyuan  |e verfasserin  |4 aut 
245 1 0 |a MotionDiffuse  |b Text-Driven Human Motion Generation With Diffusion Model 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 07.05.2024 
500 |a Date Revised 03.01.2025 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions conditioned on natural languages. However, it remains challenging to achieve diverse and fine-grained motion generation with various text inputs. To address this problem, we propose MotionDiffuse, one of the first diffusion model-based text-driven motion generation frameworks, which demonstrates several desired properties over existing methods. 1) Probabilistic Mapping. Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected. 2) Realistic Synthesis. MotionDiffuse excels at modeling complicated data distribution and generating vivid motion sequences. 3) Multi-Level Manipulation. MotionDiffuse responds to fine-grained instructions on body parts, and arbitrary-length motion synthesis with time-varied text prompts. Our experiments show MotionDiffuse outperforms existing SoTA methods by convincing margins on text-driven motion generation and action-conditioned motion generation. A qualitative analysis further demonstrates MotionDiffuse's controllability for comprehensive motion generation 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
700 1 |a Cai, Zhongang  |e verfasserin  |4 aut 
700 1 |a Pan, Liang  |e verfasserin  |4 aut 
700 1 |a Hong, Fangzhou  |e verfasserin  |4 aut 
700 1 |a Guo, Xinying  |e verfasserin  |4 aut 
700 1 |a Yang, Lei  |e verfasserin  |4 aut 
700 1 |a Liu, Ziwei  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 46(2024), 6 vom: 29. Juni, Seite 4115-4128  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:46  |g year:2024  |g number:6  |g day:29  |g month:06  |g pages:4115-4128 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3355414  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 46  |j 2024  |e 6  |b 29  |c 06  |h 4115-4128