Study of Subjective and Objective Quality Assessment of Audio-Visual Signals

The topics of visual and audio quality assessment (QA) have been widely researched for decades, yet nearly all of this prior work has focused only on single-mode visual or audio signals. However, visual signals rarely are presented without accompanying audio, including heavy-bandwidth video streamin...

Ausführliche Beschreibung

Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - (2020) vom: 21. Apr.
1. Verfasser: Min, Xiongkuo (VerfasserIn)
Weitere Verfasser: Zhai, Guangtao, Zhou, Jiantao, Farias, Mylene C Q, Bovik, Alan Conrad
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2020
Zugriff auf das übergeordnete Werk:IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Schlagworte:Journal Article
Beschreibung
Zusammenfassung:The topics of visual and audio quality assessment (QA) have been widely researched for decades, yet nearly all of this prior work has focused only on single-mode visual or audio signals. However, visual signals rarely are presented without accompanying audio, including heavy-bandwidth video streaming applications. Moreover, the distortions that may separately (or conjointly) afflict the visual and audio signals collectively shape user-perceived quality of experience (QoE). This motivated us to conduct a subjective study of audio and video (A/V) quality, which we then used to compare and develop A/V quality measurement models and algorithms. The new LIVE-SJTU Audio and Video Quality Assessment (A/V-QA) Database includes 336 A/V sequences that were generated from 14 original source contents by applying 24 different A/V distortion combinations on them. We then conducted a subjective A/V quality perception study on the database towards attaining a better understanding of how humans perceive the overall combined quality of A/V signals. We also designed four different families of objective A/V quality prediction models, using a multimodal fusion strategy. The different types of A/V quality models differ in both the unimodal audio and video quality prediction models comprising the direct signal measurements and in the way that the two perceptual signal modes are combined. The objective models are built using both existing state-of-the-art audio and video quality prediction models and some new prediction models, as well as quality-predictive features delivered by a deep neural network. The methods of fusing audio and video quality predictions that are considered include simple product combinations as well as learned mappings. Using the new subjective A/V database as a tool, we validated and tested all of the objective A/V quality prediction models. We will make the database publicly available to facilitate further research
Beschreibung:Date Revised 27.02.2024
published: Print-Electronic
Citation Status Publisher
ISSN:1941-0042
DOI:10.1109/TIP.2020.2988148