Evaluate leading text-to-speech models – US English
Many of us have heard YouTube videos whose voices are generated by AI, and some of us actively use such APIs ourselves. But you may wonder how good the public text-to-speech (TTS) APIs really are. Answering that is a hard problem, even within the AI community. For such TTS evaluations, some groups use Word Error Rate (WER)…
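To make the WER idea concrete, here is a minimal sketch in Python. It assumes the usual setup for TTS evaluation, where the synthesized audio is transcribed by a speech recognizer and the transcript is compared with the original input text. The function name `word_error_rate` and the simple lowercase-and-split normalization are illustrative assumptions, not part of any particular evaluation toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch: normalization here is only lowercasing and
    whitespace splitting; real evaluations usually normalize punctuation
    and numbers as well.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    text = "the cat sat on the mat"        # text sent to the TTS API
    transcript = "the cat sit on mat"      # hypothetical ASR transcript
    print(f"WER: {word_error_rate(text, transcript):.3f}")
```

A lower score means the recognizer recovered more of the original words, which is taken as a rough proxy for intelligibility. Note that WER says nothing about naturalness or prosody, so it captures only one aspect of TTS quality.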