RegSeg : An End-to-End Network for Multimodal RGB-Thermal Registration and Semantic Segmentation
Published in: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - PP(2024), 22 Nov. |
---|---|
Main author: | |
Other authors: | , , , , |
Format: | Online article |
Language: | English |
Published: | 2024 |
Access to the parent work: | IEEE transactions on image processing : a publication of the IEEE Signal Processing Society |
Subjects: | Journal Article |
Summary: | The misalignment between RGB and thermal images significantly impairs RGB-Thermal semantic segmentation accuracy. Current non-end-to-end methods treat RGB-Thermal registration independently of semantic segmentation, resulting in fusion errors, redundant computations, and poor real-time performance. Semantic segmentation accuracy directly correlates with registration precision: better registration yields more accurate segmentation. Moreover, regions with identical semantic labels, indicating the same object, tend to share similar registration offsets. Based on these correlations, we propose an end-to-end multimodal registration and segmentation method using flexible deformation fields. Our method utilizes a shared encoder for registration and semantic segmentation to reduce redundancy. Unlike traditional non-end-to-end approaches, it directly registers high-level perceptual features, thereby optimizing computational efficiency and real-time performance. Additionally, we employ a flexible deformation field to register RGB-Thermal data, addressing limitations of traditional affine transformations in handling non-coplanar and non-rigid registrations. However, the increased flexibility of deformation fields compared to affine transformations, and the sacrificing of geometric feature preservation, pose training challenges. To overcome this, we introduce a semantic alignment loss function to train the alignment module. This function calculates the semantic segmentation loss between the predictions from registered thermal features and RGB semantic labels. It shortens the gradient backpropagation path, aligning the objectives of registration and segmentation. We validate our end-to-end approach through extensive experiments, achieving significant performance enhancements. On the IR SEG dataset, our end-to-end method achieves state-of-the-art results with a mean Intersection over Union (mIoU) of 61.1% and a mean accuracy (mAcc) of 76.0%. |
---|---|
Description: | Date Revised 03.03.2025; published: Print-Electronic; Citation Status: Publisher |
ISSN: | 1941-0042 |
DOI: | 10.1109/TIP.2024.3501077 |
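
The summary reports results as mIoU and mAcc. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed from a pixel-level confusion matrix. It is not taken from the paper: the function names `confusion_matrix` and `miou_and_macc` are hypothetical, the toy labels are invented for illustration, and it assumes mAcc means the per-class accuracy (recall) averaged over classes, which is the usual convention in semantic segmentation.

```python
def confusion_matrix(labels, preds, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, preds):
        cm[t][p] += 1
    return cm


def miou_and_macc(cm):
    """Return (mIoU, mAcc) from a square confusion matrix.

    Assumption: classes that never appear (empty union or no ground-truth
    pixels) are skipped rather than counted as zero.
    """
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]                             # correctly labelled pixels of class c
        gt = sum(cm[c])                           # ground-truth pixels of class c
        pred = sum(cm[r][c] for r in range(n))    # pixels predicted as class c
        union = gt + pred - tp
        if union > 0:
            ious.append(tp / union)               # per-class IoU
        if gt > 0:
            accs.append(tp / gt)                  # per-class accuracy (recall)
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Toy example with three classes and six flattened "pixels" (invented data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
miou, macc = miou_and_macc(confusion_matrix(labels, preds, 3))
print(f"mIoU={miou:.3f} mAcc={macc:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2, and the per-class accuracies are 1/2, 1 and 1/2. Evaluation toolkits differ in how they treat ignored labels and absent classes, so reported figures depend on those conventions.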