Diagnosing the cluster-based performance of large-scale deep neural network (DNN) models during training is essential for improving training efficiency and reducing resource consumption. However, it remains challenging due to the incomprehensibility of the parallelization strategy and the sheer volu...
Bibliographic Details
Published in: | IEEE transactions on visualization and computer graphics. - 1996. - 30(2024), no. 7, 10 June, pages 3915-3929
|
Main author: |
Wei, Yating
(Author) |
Other authors: |
Wang, Zhiyong,
Wang, Zhongwei,
Dai, Yong,
Ou, Gongchang,
Gao, Han,
Yang, Haitao,
Wang, Yue,
Cao, Caleb Chen,
Weng, Luoxuan,
Lu, Jiaying,
Zhu, Rongchen,
Chen, Wei |
Format: | Online article
|
Language: | English |
Published: |
2024
|
Access to the parent work: | IEEE transactions on visualization and computer graphics
|
Subjects: | Journal Article |