Adaptive Deep Reinforcement Learning-Based In-Loop Filter for VVC

Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 30(2021), dated: 03., pages 5439-5451
First author: Huang, Zhijie (Author)
Other authors: Sun, Jun, Guo, Xiaopeng, Shang, Mingyu
Format: Online article
Language: English
Published: 2021
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Summary: Deep learning-based in-loop filters have recently demonstrated substantial improvements in both coding efficiency and subjective quality in video coding. However, most existing deep learning-based in-loop filters tend to develop a sophisticated model in exchange for good performance, and they apply a single network structure to all reconstructed samples, which lacks sufficient adaptiveness to varied video content and limits their performance to some extent. In contrast, this paper proposes an adaptive deep reinforcement learning-based in-loop filter (ARLF) for versatile video coding (VVC). Specifically, we treat the filtering as a decision-making process and employ an agent to select an appropriate network by leveraging recent advances in deep reinforcement learning. To this end, we develop a lightweight backbone and use it to design a network set S containing networks of different complexities. A simple but efficient agent network is then designed to predict the optimal network from S, which makes the model adaptive to varied video content. To improve the robustness of our model, a two-stage training scheme is further proposed to train the agent and tune the network set. The coding tree unit (CTU) is treated as the basic unit of the in-loop filtering process. A CTU-level control flag is applied in the sense of rate-distortion optimization (RDO). Extensive experimental results show that our ARLF approach obtains on average 2.17%, 2.65%, 2.58%, 2.51% under all-intra, low-delay P, low-delay, and random access configurations, respectively. Compared with other deep learning-based methods, the proposed approach can achieve better performance with low computational complexity.
Description: Date Revised 09.06.2021
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN:1941-0042
DOI:10.1109/TIP.2021.3084345
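
Illustrative note: the CTU-level flow described in the summary (an agent picks one network from the set S for each CTU, and an RDO-style flag decides whether the filtered CTU is kept) might be sketched roughly as follows. This is not the authors' implementation. The box filters stand in for the CNNs in S, `toy_agent` is a hand-written variance rule standing in for the learned agent network, and the names `NETWORK_SET`, `toy_agent`, `filter_frame` and the default `ctu_size` are all assumptions made for illustration. The sketch also assumes the flag costs the same rate for either value, so the decision reduces to a distortion comparison.

```python
import numpy as np

def box_blur(block, radius):
    """Toy stand-in for a CNN filter: mean filter with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(block, radius, mode="edge")
    out = np.zeros_like(block, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + block.shape[0], dx:dx + block.shape[1]]
    return out / (k * k)

# Hypothetical network set S: two stand-in filters of increasing cost.
NETWORK_SET = [
    lambda b: box_blur(b, 1),
    lambda b: box_blur(b, 2),
]

def toy_agent(ctu):
    """Hypothetical policy: a variance threshold in place of a learned agent."""
    return 0 if np.var(ctu) > 100.0 else 1

def filter_frame(recon, original, ctu_size=128, agent=toy_agent):
    """Encoder-side sketch: per-CTU network choice plus an on/off flag."""
    recon = recon.astype(np.float64)
    original = original.astype(np.float64)
    out = recon.copy()
    flags, choices = [], []
    height, width = recon.shape
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            rec = recon[y:y + ctu_size, x:x + ctu_size]
            org = original[y:y + ctu_size, x:x + ctu_size]
            idx = agent(rec)                 # agent selects a network from S
            cand = NETWORK_SET[idx](rec)     # apply the selected filter
            # Assumed equal flag rate for both values, so only distortion
            # (sum of squared errors) decides whether filtering is enabled.
            use = bool(np.sum((cand - org) ** 2) < np.sum((rec - org) ** 2))
            if use:
                out[y:y + ctu_size, x:x + ctu_size] = cand
            flags.append(use)
            choices.append(idx)
    return out, flags, choices
```

Because a CTU is replaced only when the candidate lowers its squared error, the output of this sketch never has higher total distortion than the unfiltered reconstruction, which mirrors the purpose of a CTU-level control flag.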