Joint Task-Recursive Learning for RGB-D Scene Understanding
RGB-D scene understanding under monocular camera is an emerging and challenging topic with many potential applications. In this paper, we propose a novel Task-Recursive Learning (TRL) framework to jointly and recurrently conduct three representative tasks therein containing depth estimation, surface...
| Publié dans: | IEEE transactions on pattern analysis and machine intelligence. - 1979. - 42(2020), 10 vom: 02. Okt., Seite 2608-2623 |
|---|---|
| Auteur principal: | |
| Autres auteurs: | , , , , |
| Format: | Article en ligne |
| Langue: | English |
| Publié: |
2020
|
| Accès à la collection: | IEEE transactions on pattern analysis and machine intelligence |
| Sujets: | Journal Article |
| Résumé: | RGB-D scene understanding under monocular camera is an emerging and challenging topic with many potential applications. In this paper, we propose a novel Task-Recursive Learning (TRL) framework to jointly and recurrently conduct three representative tasks therein containing depth estimation, surface normal prediction and semantic segmentation. TRL recursively refines the prediction results through a series of task-level interactions, where one-time cross-task interaction is abstracted as one network block of one time stage. In each stage, we serialize multiple tasks into a sequence and then recursively perform their interactions. To adaptively enhance counterpart patterns, we encapsulate interactions into a specific Task-Attentional Module (TAM) to mutually-boost the tasks from each other. Across stages, the historical experiences of previous states of tasks are selectively propagated into the next stages by using Feature-Selection unit (FS-Unit), which takes advantage of complementary information across tasks. The sequence of task-level interactions is also evolved along a coarse-to-fine scale space such that the required details may be refined progressively. Finally the task-abstracted sequence problem of multi-task prediction is framed into a recursive network. Extensive experiments on NYU-Depth v2 and SUN RGB-D datasets demonstrate that our method can recursively refines the results of the triple tasks and achieves state-of-the-art performance |
|---|---|
| Description: | Date Revised 04.09.2020 published: Print-Electronic Citation Status PubMed-not-MEDLINE |
| ISSN: | 1939-3539 |
| DOI: | 10.1109/TPAMI.2019.2926728 |