NLP Cloud Playground

What Is Speech Synthesis?

Speech synthesis (also known as text-to-speech or voice synthesis) turns a piece of text into audio. Let's see how to perform speech synthesis with Microsoft SpeechT5 on NLP Cloud.

Simply send a piece of text and the model generates the corresponding audio. Only English text is supported.
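As a rough illustration of "send text, get audio back", here is a minimal Python sketch that uses only the standard library. The base URL, the `speech-t5` model name, the `speech-synthesis` endpoint path, the `Token` authorization scheme, and the shape of the JSON response are assumptions made for this example; check them against the NLP Cloud API reference before relying on them. `build_synthesis_request` and `synthesize` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed values for illustration only; verify against the NLP Cloud API reference.
API_BASE = "https://api.nlpcloud.io/v1/gpu"
MODEL = "speech-t5"


def build_synthesis_request(text: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for speech synthesis."""
    if not text.strip():
        raise ValueError("text must not be empty")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{MODEL}/speech-synthesis",
        data=body,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text: str, token: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_synthesis_request(text, token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `synthesize("Hello, world.", token="<your token>")` would perform the network call. Building the request in a separate function makes it possible to inspect the URL, headers, and payload without contacting the service.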

Why Use Speech Synthesis?

Text-to-speech is increasingly used as the final stage of an AI pipeline, where it turns generated text into spoken output. Here are two examples:

Virtual Assistant

When combined with speech-to-text (see the OpenAI Whisper model, for example) and generative models, text-to-speech makes it possible to build fully fledged virtual assistants that understand the human voice and respond to it out loud.
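The pipeline described above can be sketched as three stages chained together: speech-to-text, text generation, then text-to-speech. The sketch below is deliberately provider-agnostic. `assistant_turn` and its three callable parameters are hypothetical names for this example, and each callable stands in for whatever model or API you actually use.

```python
from typing import Callable


def assistant_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    synthesize_speech: Callable[[str], bytes],
) -> bytes:
    """Run one voice-assistant turn: audio in, audio out."""
    user_text = transcribe(audio_in)         # speech-to-text stage
    reply_text = generate_reply(user_text)   # generative model stage
    return synthesize_speech(reply_text)     # text-to-speech stage


# Usage with trivial stand-ins in place of real models:
reply_audio = assistant_turn(
    b"fake-audio",
    transcribe=lambda audio: "what time is it",
    generate_reply=lambda text: f"You asked: {text}",
    synthesize_speech=lambda text: text.encode("utf-8"),
)
```

Passing the stages in as functions keeps the pipeline easy to test and lets each model be swapped independently.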

Accessibility

Reading text out loud is very useful for people who have difficulty reading.