APSNet: Toward Adaptive Point Sampling for Efficient 3D Action Recognition

Observing that it is still a challenging task to deploy 3D action recognition methods in real-world scenarios, in this work, we investigate the accuracy-efficiency trade-off for 3D action recognition. We first introduce a simple and efficient backbone network structure for 3D action recognition, in...

Bibliographic details
Published in: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. - 1992. - 31(2022), day: 28, pages 5287-5302
Main author: Liu, Jiaheng (Author)
Other authors: Guo, Jinyang, Xu, Dong
Format: Online article
Language: English
Published: 2022
Collection access: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Subjects: Journal Article
LEADER 01000caa a22002652c 4500
001 NLM344207404
003 DE-627
005 20250303153955.0
007 cr uuu---uuuuu
008 231226s2022 xx |||||o 00| ||eng c
024 7 |a 10.1109/TIP.2022.3193290  |2 doi 
028 5 2 |a pubmed25n1147.xml 
035 |a (DE-627)NLM344207404 
035 |a (NLM)35901004 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Liu, Jiaheng  |e verfasserin  |4 aut 
245 1 0 |a APSNet  |b Toward Adaptive Point Sampling for Efficient 3D Action Recognition 
264 1 |c 2022 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 17.08.2022 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a Observing that it is still a challenging task to deploy 3D action recognition methods in real-world scenarios, in this work, we investigate the accuracy-efficiency trade-off for 3D action recognition. We first introduce a simple and efficient backbone network structure for 3D action recognition, in which we directly extract the geometry and motion representations from the raw point cloud videos through a set of simple operations (i.e., coordinate offset generation and mini-PointNet). Based on the backbone network, we propose an end-to-end optimized network called adaptive point sampling network (APSNet) to achieve the accuracy-efficiency trade-off, which mainly consists of three stages: the coarse feature extraction stage, the decision making stage, and the fine feature extraction stage. In APSNet, we adaptively decide the optimal resolutions (i.e., the optimal number of points) for each pair of frames based on any input point cloud video under the given computational complexity constraint. Comprehensive experiments on multiple benchmark datasets demonstrate the effectiveness and efficiency of our newly proposed APSNet for 3D action recognition. 
650 4 |a Journal Article 
700 1 |a Guo, Jinyang  |e verfasserin  |4 aut 
700 1 |a Xu, Dong  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on image processing : a publication of the IEEE Signal Processing Society  |d 1992  |g 31(2022) vom: 28., Seite 5287-5302  |w (DE-627)NLM09821456X  |x 1941-0042  |7 nnas 
773 1 8 |g volume:31  |g year:2022  |g day:28  |g pages:5287-5302 
856 4 0 |u http://dx.doi.org/10.1109/TIP.2022.3193290  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 31  |j 2022  |b 28  |h 5287-5302