Integrating Both Parallax and Latency Compensation into Video See-through Head-mounted Display

Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - PP(2023), 27 Feb.
Main author: Ishihara, Atsushi (Author)
Other authors: Aga, Hiroyuki, Ishihara, Yasuko, Ichikawa, Hirotake, Kaji, Hidetaka, Kobayashi, Daita, Kobayashi, Toshimi, Nishida, Ken, Hamasaki, Takumi, Mori, Hideto, Kawasaki, Koichi, Morikubo, Yuki
Format: Online article
Language: English
Published: 2023
Access to parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article
Description
Summary: This work introduces a perspective-corrected video see-through mixed-reality head-mounted display with edge-preserving occlusion and low-latency capabilities. To realize the consistent spatial and temporal composition of a captured real world containing virtual objects, we perform three essential tasks: 1) to reconstruct captured images so as to match the user's view; 2) to occlude virtual objects with nearer real objects, to provide users with correct depth cues; and 3) to reproject the virtual and captured scenes to be matched and to keep up with users' head motions. Captured image reconstruction and occlusion-mask generation require dense and accurate depth maps. However, estimating these maps is computationally difficult, which results in longer latencies. To obtain an acceptable balance between spatial consistency and low latency, we rapidly generated depth maps by focusing on edge smoothness and disocclusion (instead of fully accurate maps), to shorten the processing time. Our algorithm refines edges via a hybrid method involving infrared masks and color-guided filters, and it fills disocclusions using temporally cached depth maps. Our system combines these algorithms in a two-phase temporal warping architecture based upon synchronized camera pairs and displays. The first phase of warping is to reduce registration errors between the virtual and captured scenes. The second is to present virtual and captured scenes that correspond with the user's head motion. We implemented these methods on our wearable prototype and performed end-to-end measurements of its accuracy and latency. We achieved an acceptable latency due to head motion (less than 4 ms) and spatial accuracy (less than 0.1° in size and less than 0.3° in position) in our test environment. We anticipate that this work will help improve the realism of mixed reality systems.
Description: Date Revised 07.04.2023
published: Print-Electronic
Citation Status Publisher
ISSN:1941-0506
DOI:10.1109/TVCG.2023.3247460
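
Illustrative note: the summary mentions two depth-based steps, filling disocclusions from temporally cached depth maps and occluding virtual objects with nearer real objects. The following is only a minimal sketch of those two ideas, not the authors' implementation. The function names, the use of None for a missing depth sample, and the assumption that the cached depth map is already aligned with the current view are all hypothetical choices made for illustration. The sketch is kept dependency-free and operates on small nested lists.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
DepthMap = List[List[Optional[float]]]


def fill_disocclusions(depth: DepthMap, cached_depth: DepthMap) -> DepthMap:
    """Replace missing depth samples (None) with the cached value.

    Assumption: cached_depth is already aligned with the current view.
    """
    return [
        [cached if current is None else current
         for current, cached in zip(depth_row, cached_row)]
        for depth_row, cached_row in zip(depth, cached_depth)
    ]


def composite(captured: List[List[Pixel]],
              virtual: List[List[Pixel]],
              real_depth: DepthMap,
              virtual_depth: DepthMap) -> List[List[Pixel]]:
    """Show a virtual pixel only where it is nearer than the real surface.

    A virtual depth of None means no virtual object covers that pixel.
    Where the real depth is unknown, the captured pixel is kept
    (a conservative choice made for this sketch).
    """
    out = []
    for cap_row, vir_row, rd_row, vd_row in zip(
            captured, virtual, real_depth, virtual_depth):
        out_row = []
        for cap, vir, rd, vd in zip(cap_row, vir_row, rd_row, vd_row):
            virtual_visible = vd is not None and rd is not None and vd < rd
            out_row.append(vir if virtual_visible else cap)
        out.append(out_row)
    return out
```

In this sketch the occlusion test is a plain per-pixel depth comparison. The edge refinement with infrared masks and color-guided filters and the two-phase temporal warping described in the summary are not modeled here.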