Adaptive Selection of Reference Frames for Video Object Segmentation

Video object segmentation is a challenging task in computer vision because the appearance of target objects can change drastically over the course of a video. To solve this problem, space-time memory (STM) networks are exploited to make use of the information from all the intermediate frames bet...

Bibliographic details
Published in:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), day 29, pages 1057-1071
Main author:	Hong, Lingyi (Author)
Other authors:	Zhang, Wei, Chen, Liangyu, Zhang, Wenqiang, Fan, Jianping
Format:	Online article
Language:	English
Published:	2022
Access to parent work:	IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects:	Journal Article


LEADER	01000naa a22002652 4500
001	NLM33501027X
003	DE-627
005	20231225224915.0
007	cr uuu---uuuuu
008	231225s2022 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1109/TIP.2021.3137660 \|2 doi
028	5	2	\|a pubmed24n1116.xml
035			\|a (DE-627)NLM33501027X
035			\|a (NLM)34965210
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Hong, Lingyi \|e verfasserin \|4 aut
245	1	0	\|a Adaptive Selection of Reference Frames for Video Object Segmentation
264		1	\|c 2022
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 20.01.2022
500			\|a published: Print-Electronic
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Video object segmentation is a challenging task in computer vision because the appearance of target objects can change drastically over the course of a video. To solve this problem, space-time memory (STM) networks are exploited to make use of the information from all the intermediate frames between the first frame and the current frame in the video. However, fully using the information from all the memory frames may make STM impractical for long videos. To overcome this issue, a novel method is developed in this paper to select the reference frames adaptively. First, an adaptive selection criterion is introduced to choose the reference frames with similar appearance and precise mask estimation, which can efficiently capture the rich information of the target object and overcome the challenges of appearance changes, occlusion, and model drift. Secondly, bi-matching (bi-scale and bi-direction) is conducted to obtain more robust correlations for objects of various scales and to prevent multiple similar objects in the current frame from being mismatched with the same target object in the reference frame. Thirdly, a novel edge refinement technique is designed by using an edge detection network to obtain smooth edges from the outputs of edge confidence maps, where the edge confidence is quantized into ten sub-intervals to generate smooth edges step by step. Experimental results on the challenging benchmark datasets DAVIS-2016, DAVIS-2017, YouTube-VOS, and a Long-Video dataset have demonstrated the effectiveness of our proposed approach to video object segmentation.
650		4	\|a Journal Article
700	1		\|a Zhang, Wei \|e verfasserin \|4 aut
700	1		\|a Chen, Liangyu \|e verfasserin \|4 aut
700	1		\|a Zhang, Wenqiang \|e verfasserin \|4 aut
700	1		\|a Fan, Jianping \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society \|d 1992 \|g 31(2022) vom: 29., Seite 1057-1071 \|w (DE-627)NLM09821456X \|x 1941-0042 \|7 nnns
773	1	8	\|g volume:31 \|g year:2022 \|g day:29 \|g pages:1057-1071
856	4	0	\|u http://dx.doi.org/10.1109/TIP.2021.3137660 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_NLM
912			\|a GBV_ILN_350
951			\|a AR
952			\|d 31 \|j 2022 \|b 29 \|h 1057-1071