LIA : Latent Image Animator

Previous animation techniques mainly focus on leveraging explicit structure representations (e.g., meshes or keypoints) for transferring motion from driving videos to source images. However, such methods are challenged by large appearance variations between source and driving data, and require complex additional modules to separately model appearance and motion. To address these issues, we introduce the Latent Image Animator (LIA), streamlined to animate high-resolution images. LIA is designed as a simple autoencoder that does not rely on explicit representations. Motion transfer in the pixel space is modeled as linear navigation of motion codes in the latent space. Specifically, such navigation is represented by an orthogonal motion dictionary learned in a self-supervised manner based on the proposed Linear Motion Decomposition (LMD). Extensive experimental results demonstrate that LIA outperforms the state of the art on the VoxCeleb, TaichiHD, and TED-talk datasets with respect to video quality and spatio-temporal consistency. In addition, LIA is well equipped for zero-shot high-resolution image animation.
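The core idea in the abstract, modeling motion transfer as linear navigation of a latent code along an orthogonal motion dictionary, can be sketched in a few lines. This is a minimal illustration under assumed names and shapes (`latent_dim`, `num_directions`, `navigate`), not the authors' actual implementation; in LIA the dictionary is learned in a self-supervised manner, whereas here it is simply a random orthonormal basis.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, num_directions = 512, 20  # illustrative sizes, not from the paper

# Orthogonal motion dictionary: rows are orthonormal direction vectors.
# QR decomposition of a random matrix yields an orthonormal column basis.
q, _ = np.linalg.qr(rng.standard_normal((latent_dim, num_directions)))
dictionary = q.T  # shape: (num_directions, latent_dim)

def navigate(z_source, magnitudes, directions):
    """Linear navigation in latent space:
    z_target = z_source + sum_i magnitudes[i] * directions[i]."""
    return z_source + magnitudes @ directions

z_src = rng.standard_normal(latent_dim)  # latent code of the source image
a = rng.standard_normal(num_directions)  # motion magnitudes from the driver
z_tgt = navigate(z_src, a, dictionary)

# Because the directions are orthonormal, each magnitude can be recovered
# by projecting the latent displacement back onto the dictionary.
recovered = dictionary @ (z_tgt - z_src)
```

Orthogonality is what makes the decomposition well defined: the displacement between any two latent codes has a unique expansion in the dictionary, so the projection above recovers the magnitudes exactly.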

Detailed Description

Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - PP(2024), dated 23 Aug.
Main Author: Wang, Yaohui (Author)
Other Authors: Yang, Di, Bremond, Francois, Dantcheva, Antitza
Format: Online Article
Language: English
Published: 2024
Host item: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM376652845
003 DE-627
005 20240824233248.0
007 cr uuu---uuuuu
008 240824s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3449075  |2 doi 
028 5 2 |a pubmed24n1511.xml 
035 |a (DE-627)NLM376652845 
035 |a (NLM)39178067 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Wang, Yaohui  |e verfasserin  |4 aut 
245 1 0 |a LIA  |b Latent Image Animator 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 23.08.2024 
500 |a published: Print-Electronic 
500 |a Citation Status Publisher 
520 |a Previous animation techniques mainly focus on leveraging explicit structure representations (e.g., meshes or keypoints) for transferring motion from driving videos to source images. However, such methods are challenged by large appearance variations between source and driving data, and require complex additional modules to separately model appearance and motion. To address these issues, we introduce the Latent Image Animator (LIA), streamlined to animate high-resolution images. LIA is designed as a simple autoencoder that does not rely on explicit representations. Motion transfer in the pixel space is modeled as linear navigation of motion codes in the latent space. Specifically, such navigation is represented by an orthogonal motion dictionary learned in a self-supervised manner based on the proposed Linear Motion Decomposition (LMD). Extensive experimental results demonstrate that LIA outperforms the state of the art on the VoxCeleb, TaichiHD, and TED-talk datasets with respect to video quality and spatio-temporal consistency. In addition, LIA is well equipped for zero-shot high-resolution image animation 
650 4 |a Journal Article 
700 1 |a Yang, Di  |e verfasserin  |4 aut 
700 1 |a Bremond, Francois  |e verfasserin  |4 aut 
700 1 |a Dantcheva, Antitza  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g PP(2024) vom: 23. Aug.  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:PP  |g year:2024  |g day:23  |g month:08 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3449075  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d PP  |j 2024  |b 23  |c 08