Parts2Whole : Generalizable Multi-Part Portrait Customization

Multi-part portrait customization aims to generate realistic human images by assembling specified body parts from multiple reference images, with significant applications in digital human creation. Existing customization methods typically follow two approaches: 1) test-time fine-tuning, which learn...


Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 34(2025) dated: 07., pages 5241-5256
First author: Fan, Hongxing (Author)
Other authors: Huang, Zehuan; Wang, Lipeng; Chen, Haohua; Yin, Li; Sheng, Lu
Format: Online article
Language: English
Published: 2025
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000caa a22002652c 4500
001 NLM391241354
003 DE-627
005 20250828001213.0
007 cr uuu---uuuuu
008 250815s2025 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2025.3597037  |2 doi 
028 5 2 |a pubmed25n1546.xml 
035 |a (DE-627)NLM391241354 
035 |a (NLM)40811197 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Fan, Hongxing  |e verfasserin  |4 aut 
245 1 0 |a Parts2Whole  |b Generalizable Multi-Part Portrait Customization 
264 1 |c 2025 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 21.08.2025 
500 |a published: Print 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Multi-part portrait customization aims to generate realistic human images by assembling specified body parts from multiple reference images, with significant applications in digital human creation. Existing customization methods typically follow two approaches: 1) test-time fine-tuning, which learns concepts effectively but is time-consuming and struggles with multi-part composition; 2) generalizable feed-forward methods, which offer efficiency but lack fine control over appearance specifics. To address these limitations, we present Parts2Whole, a diffusion-based generalizable portrait generator that harmoniously integrates multiple reference parts into high-fidelity human images by our proposed multi-reference mechanism. To adequately characterize each part, we propose a detail-aware appearance encoder, which is initialized and inherits powerful image priors from the pre-trained denoising U-Net, enabling the encoding of detailed information from reference images. The extracted features are incorporated into the denoising U-Net by a shared self-attention mechanism, enhanced by mask information for precise part selection. Additionally, we integrate pose map conditioning to control the target posture of generated portraits, facilitating more flexible customization. Extensive experiments demonstrate the superiority of our approach over existing methods and applicability to related tasks like pose transfer and pose-guided human image generation, showcasing its versatile conditioning. Our project is available at https://huanngzh.github.io/Parts2Whole/ 
650 4 |a Journal Article 
700 1 |a Huang, Zehuan  |e verfasserin  |4 aut 
700 1 |a Wang, Lipeng  |e verfasserin  |4 aut 
700 1 |a Chen, Haohua  |e verfasserin  |4 aut 
700 1 |a Yin, Li  |e verfasserin  |4 aut 
700 1 |a Sheng, Lu  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 34(2025) vom: 07., Seite 5241-5256  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnas 
773 1 8 |g volume:34  |g year:2025  |g day:07  |g pages:5241-5256 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2025.3597037  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 34  |j 2025  |b 07  |h 5241-5256