Tech demo shows off the power of Microsoft's Azure Custom Neural Voice
Azure was used to create a convincing replication of a presenter's voice in a tech demo.
What you need to know
- A recent tech demo uses Azure Custom Neural Voice to imitate a presenter's actual voice.
- You can compare the presenter and the AI-generated voice within the video.
- The demo explains how to configure Dapr to communicate over gRPC.
Microsoft's Donovan Brown recently shared a video that uses Azure Custom Neural Voice to imitate his real voice. Brown is a partner program manager of Azure Incubations at Microsoft. His video illustrates how well Azure can replicate human speech.
The video itself is about how to configure Dapr to communicate over gRPC. It's a clear video that explains the process well, but everyday tech enthusiasts are probably more interested in the technology that went into creating the presentation than the contents of the video.
Azure Custom Neural Voice is a text-to-speech feature in Azure Cognitive Services. It lets organizations create a synthetic voice, such as the one used by the Flo virtual chatbot for the insurance company Progressive. When Custom Neural Voice came out of preview in February 2021, Microsoft explained how it could be used for chatbots, voice assistants, online learning, and in other areas.
Traditional methods of creating text-to-speech voices require around 10,000 lines of voice data. In contrast, Azure Custom Neural Voice can create a realistic voice with much less voice data.
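For readers curious what a text-to-speech request looks like in practice, the sketch below builds a small SSML (Speech Synthesis Markup Language) document that names the voice to speak with. It is only an illustration: the voice name `HypotheticalCustomVoice` and the helper `build_ssml` are made up for this example, the code does not call Azure, and it is not taken from Brown's video.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the W3C SSML and XML specifications.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Return an SSML string that asks for `text` to be read by `voice_name`.

    `voice_name` is a placeholder here; a real custom voice would have
    whatever name it was given when it was created.
    """
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice_name})
    # ElementTree escapes special characters such as "&" on output.
    voice.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical voice name, for illustration only.
    print(build_ssml("Configuring Dapr to communicate over gRPC.",
                     "HypotheticalCustomVoice"))
```

Building the markup with an XML library rather than string formatting keeps the narration text safely escaped, which matters when a script contains characters like `&` or `<`.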
Brown starts the video by speaking to the camera. The video then transitions to a tech demo that uses a synthetic voice based on Brown's real voice. Having both Brown's actual voice and the synthetic voice makes it easy to compare the two.
On Twitter, Brown explained that he shared some sentences with his wife and that neither of them could tell whether the clips were his actual voice or the synthetic one.
"When I first started playing with it there were some sentences I shared with my wife and we could not tell if it was me or not. It is unbelievable what we can do with Azure." Donovan Brown #BlackLivesMatter (@DonovanBrown), September 14, 2021
Brown also explains that creating a synthetic voice based on a person requires consent.
Sean Endicott is a tech journalist at Windows Central, specializing in Windows, Microsoft software, AI, and PCs. He's covered major launches, from Windows 10 and 11 to the rise of AI tools like ChatGPT. Sean's journey began with the Lumia 740, leading to strong ties with app developers. Outside writing, he coaches American football, utilizing Microsoft services to manage his team. He studied broadcast journalism at Nottingham Trent University and is active on X @SeanEndicott_ and Threads @sean_endicott_.