LEADER |
01000naa a22002652 4500 |
001 |
NLM360487416 |
003 |
DE-627 |
005 |
20231226083122.0 |
007 |
cr uuu---uuuuu |
008 |
231226s2023 xx |||||o 00| ||eng c |
024 |
7 |
|
|a 10.1109/TIP.2023.3301332
|2 doi
|
028 |
5 |
2 |
|a pubmed24n1201.xml
|
035 |
|
|
|a (DE-627)NLM360487416
|
035 |
|
|
|a (NLM)37549080
|
040 |
|
|
|a DE-627
|b ger
|c DE-627
|e rakwb
|
041 |
|
|
|a eng
|
100 |
1 |
|
|a Lv, Xiaoqian
|e verfasserin
|4 aut
|
245 |
1 |
0 |
|a Unsupervised Low-Light Video Enhancement With Spatial-Temporal Co-Attention Transformer
|
264 |
|
1 |
|c 2023
|
336 |
|
|
|a Text
|b txt
|2 rdacontent
|
337 |
|
|
|a Computermedien
|b c
|2 rdamedia
|
338 |
|
|
|a Online-Ressource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Date Revised 16.08.2023
|
500 |
|
|
|a published: Print-Electronic
|
500 |
|
|
|a Citation Status PubMed-not-MEDLINE
|
520 |
|
|
|a Existing low-light video enhancement methods are dominated by Convolutional Neural Networks (CNNs) trained in a supervised manner. Due to the difficulty of collecting paired dynamic low/normal-light videos in real-world scenes, such methods are usually trained on synthetic, static, and uniform-motion videos, which undermines their generalization to real-world scenes. Additionally, these methods typically suffer from temporal inconsistency (e.g., flickering artifacts and motion blur) when handling large-scale motion, since the local perception property of CNNs limits their ability to model long-range dependencies in both the spatial and temporal domains. To address these problems, we propose, to the best of our knowledge, the first unsupervised method for low-light video enhancement, named LightenFormer, which models long-range intra- and inter-frame dependencies with a spatial-temporal co-attention transformer to enhance brightness while maintaining temporal consistency. Specifically, an effective yet lightweight S-curve Estimation Network (SCENet) is first proposed to estimate pixel-wise S-shaped non-linear curves (S-curves) that adaptively adjust the dynamic range of an input video. Next, to model the temporal consistency of the video, we present a Spatial-Temporal Refinement Network (STRNet) to refine the enhanced video. The core module of STRNet is a novel Spatial-Temporal Co-attention Transformer (STCAT), which exploits multi-scale self- and cross-attention interactions to capture long-range correlations in both the spatial and temporal domains among frames for implicit motion estimation. To achieve unsupervised training, we further propose two non-reference loss functions based on the invertibility of the S-curve and the noise independence among frames. Extensive experiments on the SDSD and LLIV-Phone datasets demonstrate that our LightenFormer outperforms state-of-the-art methods.
|
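As an illustrative aside to the abstract above: the co-attention idea it describes (self-attention within a frame for long-range spatial dependencies, cross-attention to a neighboring frame for temporal dependencies and implicit motion estimation) can be sketched in a few lines of PyTorch. This is a minimal sketch under assumed names and shapes only; CoAttentionBlock, its patch tokenization, and all hyperparameters are hypothetical and do not reproduce the paper's actual multi-scale STCAT module, which the abstract does not specify in detail.

import torch
import torch.nn as nn

class CoAttentionBlock(nn.Module):
    """Hypothetical sketch of one spatial-temporal co-attention step."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, ref: torch.Tensor, nbr: torch.Tensor) -> torch.Tensor:
        # ref, nbr: (batch, tokens, dim) -- frames flattened into patch tokens.
        q = self.norm1(ref)
        # Intra-frame self-attention: long-range spatial dependencies.
        ref = ref + self.self_attn(q, q, q, need_weights=False)[0]
        q = self.norm2(ref)
        # Inter-frame cross-attention: temporal correlations, acting as an
        # implicit motion/alignment step between the two frames.
        ref = ref + self.cross_attn(q, nbr, nbr, need_weights=False)[0]
        return ref

# Toy usage: two 64-channel frames tokenized into 16x16 = 256 patches each.
ref = torch.randn(1, 256, 64)
nbr = torch.randn(1, 256, 64)
out = CoAttentionBlock(dim=64)(ref, nbr)
print(out.shape)  # torch.Size([1, 256, 64])

The residual connection around each attention step is the standard transformer design; in this reading it would keep the refinement stage from discarding the already curve-adjusted reference frame.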
650 |
|
4 |
|a Journal Article
|
700 |
1 |
|
|a Zhang, Shengping
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Wang, Chenyang
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Zhang, Weigang
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Yao, Hongxun
|e verfasserin
|4 aut
|
700 |
1 |
|
|a Huang, Qingming
|e verfasserin
|4 aut
|
773 |
0 |
8 |
|i Enthalten in
|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
|d 1992
|g 32(2023) vom: 07., Seite 4701-4715
|w (DE-627)NLM09821456X
|x 1941-0042
|7 nnns
|
773 |
1 |
8 |
|g volume:32
|g year:2023
|g day:07
|g pages:4701-4715
|
856 |
4 |
0 |
|u http://dx.doi.org/10.1109/TIP.2023.3301332
|3 Volltext
|
912 |
|
|
|a GBV_USEFLAG_A
|
912 |
|
|
|a SYSFLAG_A
|
912 |
|
|
|a GBV_NLM
|
912 |
|
|
|a GBV_ILN_350
|
951 |
|
|
|a AR
|
952 |
|
|
|d 32
|j 2023
|b 07
|h 4701-4715
|