Prompt-Based Modality Alignment for Effective Multi-Modal Object Re-Identification

A critical challenge for multi-modal Object Re-Identification (ReID) is the effective aggregation of complementary information to mitigate illumination issues. State-of-the-art methods typically employ complex and highly-coupled architectures, which unavoidably result in heavy computational costs. M...


Bibliographic Details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 34(2025), day 05, pages 2450-2462
Main author: Zhang, Shizhou (Author)
Other authors: Luo, Wenlong, Cheng, De, Xing, Yinghui, Liang, Guoqiang, Wang, Peng, Zhang, Yanning
Format: Online article
Language: English
Published: 2025
Access to the parent work: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000caa a22002652c 4500
001 NLM386681643
003 DE-627
005 20250509185820.0
007 cr uuu---uuuuu
008 250508s2025 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2025.3556531  |2 doi 
028 5 2 |a pubmed25n1396.xml 
035 |a (DE-627)NLM386681643 
035 |a (NLM)40193270 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Zhang, Shizhou  |e verfasserin  |4 aut 
245 1 0 |a Prompt-Based Modality Alignment for Effective Multi-Modal Object Re-Identification 
264 1 |c 2025 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 05.05.2025 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a A critical challenge for multi-modal Object Re-Identification (ReID) is the effective aggregation of complementary information to mitigate illumination issues. State-of-the-art methods typically employ complex and highly-coupled architectures, which unavoidably result in heavy computational costs. Moreover, the significant distribution gap among different image spectra hinders the joint representation of multi-modal features. In this paper, we propose a framework named as PromptMA to establish effective communication channels between different modality paths, thereby aggregating modal complementary information and bridging the distribution gap. Specifically, we inject a series of learnable multi-modal prompts into the Image Encoder and introduce a prompt exchange mechanism to enable the prompts to alternately interact with different modal token embeddings, thus capturing and distributing multi-modal features effectively. Building on top of the multi-modal prompts, we further propose Prompt-based Token Selection (PBTS) and Prompt-based Modality Fusion (PBMF) modules to achieve effective multi-modal feature fusion while minimizing background interference. Additionally, due to the flexibility of our prompt exchange mechanism, our method is well-suited to handle scenarios with missing modalities. Extensive evaluations are conducted on four widely used benchmark datasets and the experimental results demonstrate that our method achieves state-of-the-art performances, surpassing the current benchmarks by over 15% on the challenging MSVR310 dataset and by 6% on the RGBNT201. The code is available at https://github.com/FHR-L/PromptMA 
650 4 |a Journal Article 
700 1 |a Luo, Wenlong  |e verfasserin  |4 aut 
700 1 |a Cheng, De  |e verfasserin  |4 aut 
700 1 |a Xing, Yinghui  |e verfasserin  |4 aut 
700 1 |a Liang, Guoqiang  |e verfasserin  |4 aut 
700 1 |a Wang, Peng  |e verfasserin  |4 aut 
700 1 |a Zhang, Yanning  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 34(2025) vom: 05., Seite 2450-2462  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnas 
773 1 8 |g volume:34  |g year:2025  |g day:05  |g pages:2450-2462 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2025.3556531  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 34  |j 2025  |b 05  |h 2450-2462