Event-Based Semantic Segmentation With Posterior Attention

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 32(2023) from: 04., pages 1829-1842
Main author: Jia, Zexi (Author)
Other authors: You, Kaichao, He, Weihua, Tian, Yang, Feng, Yongxiang, Wang, Yaoyuan, Jia, Xu, Lou, Yihang, Zhang, Jingyi, Li, Guoqi, Zhang, Ziyang
Format: Online article
Language: English
Published: 2023
Collection access: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
Description
Abstract: In the past years, attention-based Transformers have swept across the field of computer vision, starting a new stage of backbones in semantic segmentation. Nevertheless, semantic segmentation under poor light conditions remains an open problem. Moreover, most papers about semantic segmentation work on images produced by commodity frame-based cameras with a limited framerate, hindering their deployment to auto-driving systems that require instant perception and response at milliseconds. An event camera is a new sensor that generates event data at microseconds and can work in poor light conditions with a high dynamic range. It looks promising to leverage event cameras to enable perception where commodity cameras are incompetent, but algorithms for event data are far from mature. Pioneering researchers stack event data as frames so that event-based segmentation is converted to frame-based segmentation, but characteristics of event data are not explored. Noticing that event data naturally highlight moving objects, we propose a posterior attention module that adjusts the standard attention by the prior knowledge provided by event data. The posterior attention module can be readily plugged into many segmentation backbones. Plugging the posterior attention module into a recently proposed SegFormer network, we get EvSegFormer (the event-based version of SegFormer) with state-of-the-art performance in two datasets (MVSEC and DDD-17) collected for event-based segmentation. Code is available at https://github.com/zexiJia/EvSegFormer to facilitate research on event-based vision.
Description: Date Revised 04.04.2025
published: Print-Electronic
Citation Status PubMed-not-MEDLINE
ISSN: 1941-0042
DOI: 10.1109/TIP.2023.3249579