Learning Layout and Style Reconfigurable GANs for Controllable Image Synthesis

With the remarkable recent progress on learning deep generative models, it becomes increasingly interesting to develop models for controllable image synthesis from reconfigurable structured inputs. This paper focuses on a recently emerged task, layout-to-image, whose goal is to learn generative mode...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on pattern analysis and machine intelligence. - 1979. - 44(2022), 9 vom: 25. Sept., Seite 5070-5087
1. Verfasser: Sun, Wei (VerfasserIn)
Weitere Verfasser: Wu, Tianfu
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2022
Zugriff auf das übergeordnete Werk:IEEE transactions on pattern analysis and machine intelligence
Schlagworte:Journal Article Research Support, U.S. Gov't, Non-P.H.S.
Beschreibung
Zusammenfassung:With the remarkable recent progress on learning deep generative models, it becomes increasingly interesting to develop models for controllable image synthesis from reconfigurable structured inputs. This paper focuses on a recently emerged task, layout-to-image, whose goal is to learn generative models for synthesizing photo-realistic images from a spatial layout (i.e., object bounding boxes configured in an image lattice) and its style codes (i.e., structural and appearance variations encoded by latent vectors). This paper first proposes an intuitive paradigm for the task, layout-to-mask-to-image, which learns to unfold object masks in a weakly-supervised way based on an input layout and object style codes. The layout-to-mask component deeply interacts with layers in the generator network to bridge the gap between an input layout and synthesized images. Then, this paper presents a method built on Generative Adversarial Networks (GANs) for the proposed layout-to-mask-to-image synthesis with layout and style control at both image and object levels. The controllability is realized by a proposed novel Instance-Sensitive and Layout-Aware Normalization (ISLA-Norm) scheme. A layout semi-supervised version of the proposed method is further developed without sacrificing performance. In experiments, the proposed method is tested in the COCO-Stuff dataset and the Visual Genome dataset with state-of-the-art performance obtained
Beschreibung:Date Completed 08.08.2022
Date Revised 14.09.2022
published: Print-Electronic
Citation Status MEDLINE
ISSN:1939-3539
DOI:10.1109/TPAMI.2021.3078577