NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent years. Several questions naturally arise: whether a TTS system can achieve human-level quality, how to define and judge that quality, and how to achieve it. In this paper, we answer these questions by first defining th...


Bibliographic Details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), 6, dated 22 June, pages 4234-4245
Main author: Tan, Xu (Author)
Other authors: Chen, Jiawei, Liu, Haohe, Cong, Jian, Zhang, Chen, Liu, Yanqing, Wang, Xi, Leng, Yichong, Yi, Yuanhao, He, Lei, Zhao, Sheng, Qin, Tao, Soong, Frank, Liu, Tie-Yan
Format: Online article
Language: English
Published: 2024
Access to the parent work: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.