Unsupervised Deep Exemplar Colorization via Pyramid Dual Non-Local Attention

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023), dated 13., pages 4114-4127
Main author: Wang, Hanzhang (Author)
Other authors: Zhai, Deming; Liu, Xianming; Jiang, Junjun; Gao, Wen
Format: Online article
Language: English
Published: 2023
Access to parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Summary: Exemplar-based colorization is a challenging task that adds colors to a target grayscale image with the aid of a reference color image, so as to keep the target's semantic content while adopting the reference's color style. To achieve visually plausible chromatic results, it is important to sufficiently exploit both the global color style and the semantic color information of the reference color image. However, existing methods either exploit the semantic color information clumsily or lack a dedicated fusion mechanism for decorating the target grayscale image with the reference semantic color information. Moreover, these methods usually use a single-stage encoder-decoder architecture, which results in the loss of spatial details. To remedy these problems, we propose an effective exemplar colorization strategy based on a pyramid dual non-local attention network that exploits long-range dependencies as well as multi-scale correlations. Specifically, two symmetrical branches of pyramid non-local attention blocks are tailored to achieve alignment from the target feature to the reference feature and from the reference feature to the target feature, respectively. A bidirectional non-local fusion strategy is further applied to obtain a sufficiently fused feature that achieves full semantic consistency between the multi-modal information. To train the network, we propose an unsupervised learning scheme that employs hybrid supervision, including pseudo-paired supervision from the reference color images and unpaired supervision from both the target grayscale and reference color images. Extensive experimental results demonstrate that our method achieves better photo-realistic colorization performance than state-of-the-art methods.
Description: Date Revised 20.07.2023
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1941-0042
DOI:10.1109/TIP.2023.3293777
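
Illustrative note: the summary describes two symmetrical non-local attention branches applied at several pyramid scales, one aligning in each direction between target and reference features. The following is a minimal NumPy sketch of that general idea only, not the authors' network. The function names (`nonlocal_align`, `pyramid_dual_nonlocal`), the average-pooling pyramid, the scaled dot-product similarity, and the assignment of which branch is called "target to reference" are all assumptions made for illustration; the fusion step and the training losses are not modeled.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def avg_pool(feat, factor):
    # Downsample an (H, W, C) map by averaging non-overlapping
    # factor x factor blocks (assumed pyramid construction).
    h, w, c = feat.shape
    h2, w2 = h // factor, w // factor
    feat = feat[: h2 * factor, : w2 * factor]
    return feat.reshape(h2, factor, w2, factor, c).mean(axis=(1, 3))


def nonlocal_align(query, source):
    # Warp `source` onto the spatial layout of `query`: every query
    # position becomes an attention-weighted average over ALL source
    # positions, which is what makes the operation non-local.
    hq, wq, c = query.shape
    q = query.reshape(-1, c)
    s = source.reshape(-1, c)
    attn = softmax(q @ s.T / np.sqrt(c), axis=-1)  # (Nq, Ns), rows sum to 1
    return (attn @ s).reshape(hq, wq, c), attn


def pyramid_dual_nonlocal(target, reference, factors=(1, 2)):
    # At each pyramid level, run both alignment directions.
    levels = []
    for f in factors:
        t = avg_pool(target, f)
        r = avg_pool(reference, f)
        ref_on_target, _ = nonlocal_align(t, r)  # reference laid out like target
        target_on_ref, _ = nonlocal_align(r, t)  # target laid out like reference
        levels.append((ref_on_target, target_on_ref))
    return levels
```

Calling `pyramid_dual_nonlocal(target, reference)` on two feature maps with the same channel count returns, per scale, one map shaped like the target and one shaped like the reference. The two images may differ in spatial size because attention is computed over flattened positions.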