Semantic Probability Distribution Modeling for Diverse Semantic Image Synthesis


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - Vol. 45 (2023), No. 5, 1 May, pages 6247-6264
First author: Tan, Zhentao (author)
Other authors: Chu, Qi, Chai, Menglei, Chen, Dongdong, Liao, Jing, Liu, Qiankun, Liu, Bin, Hua, Gang, Yu, Nenghai
Format: Online article
Language: English
Published: 2023
Parent work: IEEE transactions on pattern analysis and machine intelligence
Keywords: Journal Article
Description
Abstract: Semantic image synthesis, the translation of semantic layouts into photo-realistic images, is a one-to-many mapping problem. Although impressive progress has been made recently, diverse semantic synthesis that can efficiently produce semantic-level or even instance-level multimodal results still remains a challenge. In this article, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at both the semantic and instance levels. We achieve this by modeling class-level conditional modulation parameters as continuous probability distributions instead of discrete values, and by sampling per-instance modulation parameters through instance-adaptive stochastic sampling that is consistent across the network. Moreover, we propose prior noise remapping, through linear perturbation parameters encoded from paired references, to facilitate supervised training and exemplar-based instance style control at test time. To further extend the user-interaction capabilities of the proposed method, we also introduce sketches into the network. In addition, two specially designed generator modules, the Progressive Growing Module and the Multi-Scale Refinement Module, can be used as general modules to improve the performance of complex scene generation. Extensive experiments on multiple datasets show that our method achieves superior diversity and comparable quality relative to state-of-the-art methods. Code is available at https://github.com/tzt101/INADE.git
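The core idea of the abstract, modeling class-level modulation parameters as continuous probability distributions and drawing per-instance samples from them, can be sketched as follows. This is a minimal illustration only: the class names, channel count, and fixed Gaussian parameters are invented for the example (the paper learns these end-to-end per class and channel), and it is not the authors' implementation.

```python
import math
import random

# Illustrative per-class Gaussian parameters (mean, log-variance) of a
# modulation vector; assumed values, not learned ones as in the paper.
CHANNELS = 4
class_dist = {
    "sky":  {"mu": [0.2] * CHANNELS, "logvar": [-1.0] * CHANNELS},
    "tree": {"mu": [-0.1] * CHANNELS, "logvar": [-0.5] * CHANNELS},
}

def sample_instance_modulation(class_name, noise=None, rng=random):
    """Draw one instance's modulation vector from its class distribution.

    Reusing the same `noise` vector at every layer keeps the sampled style
    consistent for that instance (the instance-adaptive stochastic sampling
    described in the abstract).
    """
    dist = class_dist[class_name]
    if noise is None:
        noise = [rng.gauss(0.0, 1.0) for _ in range(CHANNELS)]
    # Reparameterization: mu + std * noise, with std = exp(0.5 * logvar).
    return [m + math.exp(0.5 * lv) * n
            for m, lv, n in zip(dist["mu"], dist["logvar"], noise)]

# Two instances of the same class draw different styles...
a = sample_instance_modulation("tree")
b = sample_instance_modulation("tree")

# ...while a fixed noise vector reproduces the same style exactly,
# which is what enables consistent per-instance control.
z = [random.gauss(0.0, 1.0) for _ in range(CHANNELS)]
assert sample_instance_modulation("tree", z) == sample_instance_modulation("tree", z)
```

In the paper this sampling replaces the discrete modulation values of conditional normalization layers, so each instance in the layout receives its own coherent style while staying within its semantic class's learned distribution.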
Description: Date Completed 10.04.2023
Date Revised 10.04.2023
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1939-3539
DOI:10.1109/TPAMI.2022.3210085