Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking With Transformer

With the prevalent use of LiDAR sensors in autonomous driving, 3D point cloud object tracking has received increasing attention. In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames. Motivated by the success of transformers, we...
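The abstract (given in full in field 520 of the record below) describes Relation-Aware Sampling as keeping the search-region points that are relevant to the template during subsampling, rather than sampling at random. The following is a minimal illustrative sketch of that idea only, not the paper's actual implementation: the function name, the use of cosine similarity, and the top-k selection are assumptions made here for demonstration.

```python
import numpy as np

def relation_aware_sampling(search_feats, template_feats, num_samples):
    """Hypothetical sketch: keep the search-region points whose features are
    most similar to the template, instead of random subsampling.

    search_feats:   (N, C) per-point features of the search region
    template_feats: (M, C) per-point features of the template
    num_samples:    number of search points to keep
    """
    # Cosine similarity between every search point and every template point.
    s = search_feats / (np.linalg.norm(search_feats, axis=1, keepdims=True) + 1e-8)
    t = template_feats / (np.linalg.norm(template_feats, axis=1, keepdims=True) + 1e-8)
    sim = s @ t.T                      # (N, M) pairwise similarities

    # Score each search point by its best-matching template point,
    # then keep the top-scoring points (assumed selection rule).
    scores = sim.max(axis=1)           # (N,)
    keep = np.argsort(-scores)[:num_samples]
    return keep

# Toy usage: 1024 search points, 128 template points, 64-dim features.
rng = np.random.default_rng(0)
search = rng.standard_normal((1024, 64)).astype(np.float32)
template = rng.standard_normal((128, 64)).astype(np.float32)
idx = relation_aware_sampling(search, template, num_samples=256)
print(idx.shape)  # (256,)
```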

Detailed description

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 9, dated Aug. 01, pages 5921-5935
First author: Luo, Zhipeng (author)
Other authors: Zhou, Changqing, Pan, Liang, Zhang, Gongjie, Liu, Tianrui, Luo, Yueru, Zhao, Haiyu, Liu, Ziwei, Lu, Shijian
Format: Online article
Language: English
Published: 2024
Access to parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM369319060
003 DE-627
005 20240807232441.0
007 cr uuu---uuuuu
008 240306s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TPAMI.2024.3373693  |2 doi 
028 5 2 |a pubmed24n1494.xml 
035 |a (DE-627)NLM369319060 
035 |a (NLM)38442046 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Luo, Zhipeng  |e verfasserin  |4 aut 
245 1 0 |a Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking With Transformer 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 07.08.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a With the prevalent use of LiDAR sensors in autonomous driving, 3D point cloud object tracking has received increasing attention. In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames. Motivated by the success of transformers, we propose Point Tracking TRansformer (PTTR), which efficiently predicts high-quality 3D tracking results in a coarse-to-fine manner with the help of transformer operations. PTTR consists of three novel designs. 1) Instead of random sampling, we design Relation-Aware Sampling to preserve points relevant to the given template during subsampling. 2) We propose a Point Relation Transformer for effective feature aggregation and feature matching between the template and search region. 3) Based on the coarse tracking results, we employ a novel Prediction Refinement Module to obtain the final refined prediction through local feature pooling. In addition, motivated by the favorable properties of the Bird's-Eye View (BEV) of point clouds in capturing object motion, we further design a more advanced framework named PTTR++, which incorporates both the point-wise view and the BEV representation to exploit their complementary effect in generating high-quality tracking results. PTTR++ substantially boosts the tracking performance on top of PTTR with low computational overhead. Extensive experiments over multiple datasets show that our proposed approaches achieve superior 3D tracking accuracy and efficiency.
650 4 |a Journal Article 
700 1 |a Zhou, Changqing  |e verfasserin  |4 aut 
700 1 |a Pan, Liang  |e verfasserin  |4 aut 
700 1 |a Zhang, Gongjie  |e verfasserin  |4 aut 
700 1 |a Liu, Tianrui  |e verfasserin  |4 aut 
700 1 |a Luo, Yueru  |e verfasserin  |4 aut 
700 1 |a Zhao, Haiyu  |e verfasserin  |4 aut 
700 1 |a Liu, Ziwei  |e verfasserin  |4 aut 
700 1 |a Lu, Shijian  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on pattern analysis and machine intelligence  |d 1979  |g 46(2024), 9 vom: 01. Aug., Seite 5921-5935  |w (DE-627)NLM098212257  |x 1939-3539  |7 nnns 
773 1 8 |g volume:46  |g year:2024  |g number:9  |g day:01  |g month:08  |g pages:5921-5935 
856 4 0 |u http://dx.doi.org/10.1109/TPAMI.2024.3373693  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 46  |j 2024  |e 9  |b 01  |c 08  |h 5921-5935