Microsoft now has an AI that can turn hours of audio into text instantly — and businesses will love it
MAI-Transcribe-1 boasts speech-to-text accuracy across 25 of the world's most spoken languages.
Microsoft is doubling down on its generative AI efforts with new in-house AI models, including "MAI-Transcribe-1". It's an advanced transcription model designed to deliver state-of-the-art speech-to-text accuracy across 25 of the world's most spoken languages, making it a strong candidate for meetings, closed captioning, and other forms of dictation.
MAI-Transcribe-1 will be available on Microsoft Foundry alongside MAI-Voice-1 and MAI-Image-2: "With this launch, MAI models will become broadly available for commercial use for the first time, enabling customers to evaluate and build with models across transcription, voice, and image generation," Microsoft says.
Microsoft says MAI-Voice-1 offers hyper-realistic speech generation, with emotional range, that preserves the speaker's identity across long-form content. It also includes a new voice-prompting feature that can create custom brand voices from just one minute of audio.
Plus, MAI-Image-2 is Microsoft's new text-to-image generation model, which excels at natural lighting, accurate skin tones, and clear in-image text. What's more, it has ranked among the top three on the Arena.ai text-to-image leaderboard.
So, is Microsoft building its own AI camp?
It's no secret that Microsoft relies heavily on OpenAI's AI technology, which it has integrated across its tech stack. However, the tech giant has openly criticized the ChatGPT maker's GPT-4 technology, saying it's too expensive and slow to meet consumer needs.
Last year, Microsoft started developing its own in-house AI models and testing third-party ones for Copilot, potentially freeing itself from an overdependence on OpenAI for its AI efforts. Microsoft's AI CEO, Mustafa Suleyman, confirmed that the company is developing "off-frontier" AI models, but admitted that they'd play a close second to OpenAI's sophisticated technology.
Last month, Microsoft made some major changes to its Copilot leadership structure, splitting the division into four pillars: Copilot experience, Copilot platform, Microsoft 365 apps, and AI models.
Ex-Snap exec Jacob Andreou will lead Copilot experiences, both consumer and commercial, as an executive vice president reporting to Microsoft CEO Satya Nadella. Consequently, Microsoft's AI CEO, Mustafa Suleyman, will now double down on building in-house AI models for the company.
I guess Salesforce CEO Marc Benioff was onto something when he predicted that Microsoft wouldn't use OpenAI's technology in the future, following the announcement of the ChatGPT maker's now-abandoned $500 billion Stargate project designed to facilitate the construction of data centers across the United States.

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.
