Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction...

Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 30 (2024), no. 10 (12 Sept.), pp. 6754-6769
Main author: Zhou, Kanglei (Author)
Other authors: Shum, Hubert P H; Li, Frederick W B; Liang, Xiaohui
Format: Online article
Language: English
Published: 2024
Access to parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article
LEADER 01000caa a22002652 4500
001 NLM365239178
003 DE-627
005 20240906232529.0
007 cr uuu---uuuuu
008 231226s2024 xx |||||o 00| ||eng c
024 7 |a 10.1109/TVCG.2023.3337868  |2 doi 
028 5 2 |a pubmed24n1525.xml 
035 |a (DE-627)NLM365239178 
035 |a (NLM)38032781 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Zhou, Kanglei  |e verfasserin  |4 aut 
245 1 0 |a Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising 
264 1 |c 2024 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Revised 05.09.2024 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction 
650 4 |a Journal Article 
700 1 |a Shum, Hubert P H  |e verfasserin  |4 aut 
700 1 |a Li, Frederick W B  |e verfasserin  |4 aut 
700 1 |a Liang, Xiaohui  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t IEEE transactions on visualization and computer graphics  |d 1996  |g 30(2024), 10 vom: 12. Sept., Seite 6754-6769  |w (DE-627)NLM098269445  |x 1941-0506  |7 nnns 
773 1 8 |g volume:30  |g year:2024  |g number:10  |g day:12  |g month:09  |g pages:6754-6769 
856 4 0 |u http://dx.doi.org/10.1109/TVCG.2023.3337868  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 30  |j 2024  |e 10  |b 12  |c 09  |h 6754-6769