Pro-PULSE : Learning Progressive Encoders of Latent Semantics in GANs for Photo Upsampling

The state-of-the-art photo upsampling method, PULSE, demonstrates that a sharp, high-resolution (HR) version of a given low-resolution (LR) input can be obtained by exploring the latent space of generative models. However, mapping an extreme LR input (16²) directly to an HR image (1024²) is too ambi...

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), day 11, pages 1230-1242
Main author:	Zhou, Yang (author)
Other authors:	Xu, Yangyang, Du, Yong, Wen, Qiang, He, Shengfeng
Format:	Online article
Language:	English
Published:	2022
Access to the parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM335506631
003	DE-627
005	20231225230014.0
007	cr uuu---uuuuu
008	231225s2022 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2022.3140603 \|2 doi
028	5	2	\|a pubmed24n1118.xml
035			\|a (DE-627)NLM335506631
035			\|a (NLM)35015636
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Zhou, Yang \|e verfasserin \|4 aut
245	1	0	\|a Pro-PULSE \|b Learning Progressive Encoders of Latent Semantics in GANs for Photo Upsampling
264		1	\|c 2022
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 20.01.2022
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a The state-of-the-art photo upsampling method, PULSE, demonstrates that a sharp, high-resolution (HR) version of a given low-resolution (LR) input can be obtained by exploring the latent space of generative models. However, mapping an extreme LR input (16²) directly to an HR image (1024²) is too ambiguous to preserve faithful local facial semantics. In this paper, we propose an enhanced upsampling approach, Pro-PULSE, that addresses the issues of semantic inconsistency and optimization complexity. Our idea is to learn an encoder that progressively constructs the HR latent codes in the extended W+ latent space of StyleGAN. This design divides the complex 64× upsampling problem into several steps, and therefore small-scale facial semantics can be inherited from one end to the other. In particular, we train two encoders: the base encoder maps latent vectors in W space and serves as the foundation of the HR latent vector, while the second, scale-specific encoder operates in W+ space and gradually replaces the vector produced by the base encoder at each scale. This process produces intermediate side-outputs, which inject deep supervision into the training of the encoder. Extensive experiments demonstrate superiority over the latest latent space exploration methods in terms of efficiency, quantitative quality metrics, and qualitative visual results.
650		4	\|a Journal Article
700	1		\|a Xu, Yangyang \|e verfasserin \|4 aut
700	1		\|a Du, Yong \|e verfasserin \|4 aut
700	1		\|a Wen, Qiang \|e verfasserin \|4 aut
700	1		\|a He, Shengfeng \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 31(2022) vom: 11., Seite 1230-1242 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:31 \|g year:2022 \|g day:11 \|g pages:1230-1242
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2022.3140603 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 31 \|j 2022 \|b 11 \|h 1230-1242
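
Note on the abstract (field 520): the progressive scheme it describes can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the placeholder string codes, and above all the choice of replacing three W+ layers per step starting from the coarsest layer are assumptions made purely for illustration; the record does not state how layers are assigned to scales. The sketch relies only on two facts: 16 to 1024 is a 64× factor, which is six successive doublings, and StyleGAN at 1024×1024 takes 18 per-layer style vectors in W+ space.

```python
NUM_LAYERS = 18  # StyleGAN at 1024x1024 takes 18 per-layer style vectors (W+)


def doubling_schedule(lr=16, hr=1024):
    """Resolutions visited when the 64x problem is split into 2x steps."""
    sizes = [lr]
    while sizes[-1] < hr:
        sizes.append(sizes[-1] * 2)
    return sizes


def progressive_wplus(base_code, scale_codes, layers_per_step):
    """Hypothetical progressive construction of a W+ code.

    Every layer starts from the base encoder's single W vector. At each
    scale, the scale-specific code overwrites the next group of layers.
    A snapshot after each step stands in for the intermediate side-output
    that could receive its own supervision signal.
    ASSUMPTION: the layer grouping (layers_per_step, coarse layers first)
    is illustrative and is not taken from the paper.
    """
    wplus = [base_code] * NUM_LAYERS
    side_outputs = []
    start = 0
    for code in scale_codes:
        end = min(start + layers_per_step, NUM_LAYERS)
        for i in range(start, end):
            wplus[i] = code
        side_outputs.append(list(wplus))
        start = end
    return wplus, side_outputs


schedule = doubling_schedule()
num_steps = len(schedule) - 1  # six doublings from 16 to 1024
codes = [f"w_scale_{s}" for s in schedule[1:]]  # placeholder codes
final_code, side_outputs = progressive_wplus("w_base", codes, layers_per_step=3)
```

In a real system the placeholder strings would be latent vectors produced by trained encoders, and each snapshot would be decoded by the generator and compared against a downsampled target at the matching scale.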