Pose2Gaze : Eye-Body Coordination During Daily Activities for Gaze Prediction From Full-Body Poses

Human eye gaze plays a significant role in many virtual and augmented reality (VR/AR) applications, such as gaze-contingent rendering, gaze-based interaction, or eye-based activity recognition. However, prior works on gaze analysis and prediction have only explored eye-head coordination and were lim...

Description complète

Détails bibliographiques
Publié dans:	IEEE transactions on visualization and computer graphics. - 1996. - PP(2024) vom: 11. Juni
Auteur principal:	Hu, Zhiming (Auteur)
Autres auteurs:	Xu, Jiahui, Schmitt, Syn, Bulling, Andreas
Format:	Article en ligne
Langue:	English
Publié:	2024
Accès à la collection:	IEEE transactions on visualization and computer graphics
Sujets:	Journal Article

Description
Résumé:	Human eye gaze plays a significant role in many virtual and augmented reality (VR/AR) applications, such as gaze-contingent rendering, gaze-based interaction, or eye-based activity recognition. However, prior works on gaze analysis and prediction have only explored eye-head coordination and were limited to human-object interactions. We first report a comprehensive analysis of eye-body coordination in various human-object and human-human interaction activities based on four public datasets collected in real-world (MoGaze), VR (ADT), as well as AR (GIMO and EgoBody) environments. We show that in human-object interactions, e.g. pick and place, eye gaze exhibits strong correlations with full-body motion while in human-human interactions, e.g. chat and teach, a person's gaze direction is correlated with the body orientation towards the interaction partner. Informed by these analyses we then present Pose2Gaze - a novel eye-body coordination model that uses a convolutional neural network and a spatio-temporal graph convolutional neural network to extract features from head direction and full-body poses, respectively, and then uses a convolutional neural network to predict eye gaze. We compare our method with state-of-the-art methods that predict eye gaze only from head movements and show that Pose2Gaze outperforms these baselines with an average improvement of 24.0% on MoGaze, 10.1% on ADT, 21.3% on GIMO, and 28.6% on EgoBody in mean angular error, respectively. We also show that our method significantly outperforms prior methods in the sample downstream task of eye-based activity recognition. These results underline the significant information content available in eye-body coordination during daily activities and open up a new direction for gaze prediction
Description:	Date Revised 25.06.2024 published: Print-Electronic Citation Status Publisher
ISSN:	1941-0506
DOI:	10.1109/TVCG.2024.3412190