A Benchmark Dataset and Saliency-Guided Stacked Autoencoders for Video-Based Salient Object Detection

Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-bas...

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 27(2018), no. 1, 13 Jan., pp. 349-364
First author:	Jia Li (author)
Other authors:	Changqun Xia, Xiaowu Chen
Format:	Online article
Language:	English
Published:	2018
Access to the parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000caa a22002652 4500
001	NLM276970128
003	DE-627
005	20250222111350.0
007	cr uuu---uuuuu
008	231225s2018 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2017.2762594 \|2 doi
028	5	2	\|a pubmed25n0923.xml
035			\|a (DE-627)NLM276970128
035			\|a (NLM)29028198
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Jia Li \|e verfasserin \|4 aut
245	1	2	\|a A Benchmark Dataset and Saliency-Guided Stacked Autoencoders for Video-Based Salient Object Detection
264		1	\|c 2018
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Completed 11.12.2018
500			\|a Date Revised 11.12.2018
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel, and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 image-based classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD
650		4	\|a Journal Article
700	1		\|a Changqun Xia \|e verfasserin \|4 aut
700	1		\|a Xiaowu Chen \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 27(2018), 1 vom: 13. Jan., Seite 349-364 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:27 \|g year:2018 \|g number:1 \|g day:13 \|g month:01 \|g pages:349-364
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2017.2762594 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 27 \|j 2018 \|e 1 \|b 13 \|c 01 \|h 349-364