NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent years. This naturally raises several questions: whether a TTS system can achieve human-level quality, how to define and judge that quality, and how to achieve it. In this paper, we answer these questions by first defining th...

Bibliographic details
Published in: IEEE transactions on pattern analysis and machine intelligence. - 1979. - 46(2024), no. 6, 19 June, pages 4234-4245
Main author: Tan, Xu (Author)
Other authors: Chen, Jiawei; Liu, Haohe; Cong, Jian; Zhang, Chen; Liu, Yanqing; Wang, Xi; Leng, Yichong; Yi, Yuanhao; He, Lei; Zhao, Sheng; Qin, Tao; Soong, Frank; Liu, Tie-Yan
Format: Online article
Language: English
Published: 2024
Collection access: IEEE transactions on pattern analysis and machine intelligence
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.