From Whole Video to Frames: Weakly-Supervised Domain Adaptive Continuous-Time QoE Evaluation
Published in: IEEE Transactions on Image Processing, vol. 31 (2022), pp. 4937-4951
Main author:
Other authors:
Format: Online article
Language: English
Published: 2022
Collection: IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society
Subjects: Journal Article
Abstract: Due to the rapid increase in video traffic and relatively limited delivery infrastructure, end users often experience dynamically varying quality over time when viewing streaming videos. The user quality-of-experience (QoE) must be continuously monitored to deliver an optimized service. However, modern approaches for continuous-time video QoE estimation require dense annotation of continuous-time QoE labels, which is labor-intensive and time-consuming. To cope with these limitations, we propose a novel weakly-supervised domain adaptation approach for continuous-time QoE evaluation that makes use of a small amount of continuously labeled data in the source domain and abundant weakly-labeled data (containing only retrospective QoE labels) in the target domain. Specifically, given a pair of videos from the source and target domains, an effective spatiotemporal segment-level feature representation is first learned by a combination of 2D and 3D convolutional networks. Then, a multi-task prediction framework is developed to simultaneously produce continuous-time and retrospective QoE predictions, where a quality attentive adaptation approach is investigated to effectively alleviate the domain discrepancy without hampering prediction performance. This approach is enabled by explicitly attending to video-level discrimination and segment-level transferability with respect to the domain discrepancy. Experiments on benchmark databases demonstrate that the proposed method significantly improves prediction performance under the cross-domain setting.
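The abstract outlines a three-part pipeline: segment-level features fused from 2D and 3D CNNs, a multi-task head producing both continuous-time (per-segment) and retrospective (per-video) QoE scores, and an attention-weighted adversarial component to reduce domain discrepancy. Below is a minimal PyTorch sketch of that structure, assuming precomputed segment features and a gradient-reversal domain discriminator; all module names, dimensions, and the gradient-reversal choice are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the multi-task QoE framework described in the
# abstract; dimensions and module choices are illustrative assumptions.
import torch
import torch.nn as nn
from torch.autograd import Function


class GradReverse(Function):
    """Gradient reversal layer, a common device for adversarial domain adaptation."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse and scale gradients flowing back to the feature extractor.
        return grad_output.neg() * ctx.lambd, None


class QoENet(nn.Module):
    def __init__(self, feat_2d=512, feat_3d=512, hidden=256):
        super().__init__()
        # Fuse per-segment 2D (appearance) and 3D (motion) CNN features.
        self.fuse = nn.Sequential(nn.Linear(feat_2d + feat_3d, hidden), nn.ReLU())
        # Segment-level attention, used both for pooling and for weighting
        # the adversarial loss (a stand-in for "quality attentive" adaptation).
        self.attn = nn.Linear(hidden, 1)
        # Head 1: continuous-time QoE, one score per segment.
        self.continuous_head = nn.Linear(hidden, 1)
        # Head 2: retrospective QoE, one score per video (attention-pooled).
        self.retrospective_head = nn.Linear(hidden, 1)
        # Domain discriminator placed behind the gradient-reversal layer.
        self.domain_head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, seg_feats_2d, seg_feats_3d, grl_lambda=1.0):
        # seg_feats_*: (batch, num_segments, feat_dim), precomputed CNN features.
        h = self.fuse(torch.cat([seg_feats_2d, seg_feats_3d], dim=-1))
        w = torch.softmax(self.attn(h), dim=1)                   # (B, S, 1)
        continuous = self.continuous_head(h).squeeze(-1)         # (B, S)
        retrospective = self.retrospective_head(
            (w * h).sum(dim=1)).squeeze(-1)                      # (B,)
        domain_logits = self.domain_head(
            GradReverse.apply(h, grl_lambda))                    # (B, S, 2)
        return continuous, retrospective, domain_logits, w
```

Under this sketch, source-domain videos would supervise both heads, target-domain videos (with only retrospective labels) would supervise only the retrospective head, and the attention weights would modulate the per-segment domain loss so that adaptation emphasizes transferable segments.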
Description: Date revised: 25.07.2022; Publication type: Print-Electronic; Citation status: PubMed-not-MEDLINE
ISSN: 1941-0042
DOI: 10.1109/TIP.2022.3190711