Microsoft's new Text-to-Speech voices are more 'realistic, lifelike, and engaging'
The new Text-to-Speech (TTS) voices promise more realistic and lifelike user interactions.
What you need to know
- Microsoft recently introduced four "super realistic" Text-to-Speech voices designed for conversational scenarios.
- They are en-US-AndrewNeural, en-US-BrianNeural, en-US-EmmaNeural, and zh-CN-YunjieNeural, which are available in public preview across three regions: East US, Southeast Asia, and West Europe.
- Microsoft boasts that the new voices will complement "any application necessitating lifelike speech interactions."
- The new voices are meant to make interactions more realistic and engaging.
With the exponential growth of AI and its capabilities across the world, there's a rise in the demand for "naturalness and expressiveness in Text-to-Speech voices," according to Microsoft. The company recently announced four new voices: en-US-AndrewNeural, en-US-BrianNeural, en-US-EmmaNeural, and zh-CN-YunjieNeural.
The tech giant indicated that the new voices are designed for conversational scenarios to ensure user interactions are "more realistic, lifelike, and engaging." The four new voices are available in public preview in three regions: East US, Southeast Asia, and West Europe.
To demystify the difference between existing voices designed for general purposes and the new voices optimized for conversations, Microsoft also included several demos showcasing the different flavors of the newly incorporated voices.
Microsoft explained that developers can integrate the voices into existing applications through Azure OpenAI, the Azure Speech SDK, or the REST API, and can use the Azure Bot Framework to build intelligent bots that speak with the new Text-to-Speech (TTS) voices.
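For a rough idea of what the REST route might look like, here is a minimal Python sketch that prepares a synthesis request for one of the new voices without sending it. The helper names (build_ssml, build_tts_request), the lowercase region identifiers, the endpoint pattern, the header set, and the output format are assumptions based on Azure's general Speech service conventions, not details taken from Microsoft's announcement, so check them against the official documentation before relying on them.

```python
from urllib.request import Request
from xml.sax.saxutils import escape

# The three preview regions named in the article, written as Azure region
# identifiers (assumed spelling).
PREVIEW_REGIONS = ("eastus", "southeastasia", "westeurope")


def build_ssml(text: str, voice: str = "en-US-AndrewNeural") -> str:
    """Wrap plain text in a minimal SSML document that selects one voice."""
    # Derive the locale from the voice name: "en-US-AndrewNeural" -> "en-US".
    lang = "-".join(voice.split("-")[:2])
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">{escape(text)}</voice>'
        "</speak>"
    )


def build_tts_request(text: str, voice: str, region: str, key: str) -> Request:
    """Prepare (but do not send) a POST request to the assumed TTS endpoint."""
    if region not in PREVIEW_REGIONS:
        raise ValueError(f"{region!r} is not one of the preview regions")
    # Assumed endpoint pattern for the Speech service REST API.
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # placeholder Speech resource key
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",  # assumed format
        "User-Agent": "tts-preview-sketch",  # hypothetical client name
    }
    body = build_ssml(text, voice).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_tts_request(
        "Hello from a conversational voice.",
        voice="en-US-EmmaNeural",
        region="eastus",
        key="YOUR_SPEECH_KEY",
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the prepared request to urllib.request.urlopen with a valid key would, under these assumptions, return audio bytes that can be written to a .wav file. Keeping request construction separate from sending makes it easy to inspect the SSML first.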
Adding a natural and expressive touch
AI has had its share of wins and setbacks, with the balance tilting toward the latter. Several reports indicate that chatbots are getting dumber, losing accuracy, and shedding users.
Perhaps the debut of the new voices will help reverse this trend. Microsoft "offers over 400 neural voices covering more than 140 languages and locales," and those figures seem likely to grow over time.
Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You'll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.