Text-Guided Human Image Manipulation via Image-Text Shared Space

Text is a new way to guide human image manipulation. Although natural and flexible, textual descriptions often suffer from spatial inaccuracy, ambiguity about appearance, and incompleteness. In this paper, we address these issues. To overcome inaccuracy, we use structured information...

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 10, from: 01 Oct., pages 6486-6500
First author: Xu, Xiaogang (author)
Other authors: Chen, Ying-Cong, Tao, Xin, Jia, Jiaya
Format: Online article
Language: English
Published: 2022
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000naa a22002652 4500
001 NLM326102515
003 DE-627
005 20231225193849.0
007 cr uuu---uuuuu
008 231225s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2021.3085339  |2 doi 
028 5 2 |a pubmed24n1086.xml 
035 |a (DE-627)NLM326102515 
035 |a (NLM)34061734 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Xu, Xiaogang  |e verfasserin  |4 aut 
245 1 0 |a Text-Guided Human Image Manipulation via Image-Text Shared Space 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 16.09.2022 
500 |a Date Revised 19.11.2022 
500 |a published: Print-Electronic 
500 |a Citation Status MEDLINE 
520 |a Text is a new way to guide human image manipulation. Although natural and flexible, textual descriptions often suffer from spatial inaccuracy, ambiguity about appearance, and incompleteness. In this paper, we address these issues. To overcome inaccuracy, we use structured information (e.g., poses) to identify the correct locations to manipulate, by disentangling the control of appearance and spatial structure. Moreover, we learn an image-text shared space with the derived disentanglement to improve the accuracy and quality of manipulation, by separating editing directions that are relevant to the textual instructions from irrelevant ones in this space. Our model generates a series of manipulation results by moving source images through this space with different degrees of editing strength; thus, to reduce the ambiguity of text, our model produces sequential outputs for manual selection. In addition, we propose an efficient pseudo-label loss to enhance editing performance when the text is incomplete. We evaluate our method on various datasets and demonstrate its precision and interactivity in manipulating human images. 
650 4 |a Journal Article 
700 1 |a Chen, Ying-Cong  |e verfasserin  |4 aut 
700 1 |a Tao, Xin  |e verfasserin  |4 aut 
700 1 |a Jia, Jiaya  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 44(2022), 10 vom: 01. Okt., Seite 6486-6500  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:44  |g year:2022  |g number:10  |g day:01  |g month:10  |g pages:6486-6500 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2021.3085339  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 44  |j 2022  |e 10  |b 01  |c 10  |h 6486-6500