Microsoft Releases New AI Models for Speech, Audio and Video Generation

Senior AI Reporter
April 2, 2026
Microsoft Releases New AI Models for Speech, Audio and Video Generation

Microsoft has introduced three new multimodal AI models—MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2—marking an expansion of its in-house AI development as it builds out its own model stack alongside its partnership with OpenAI.

The models, announced Thursday by Microsoft AI, span speech-to-text, voice generation, and visual content creation. MAI-Transcribe-1 supports transcription across 25 languages and, according to the company, operates 2.5 times faster than its Azure Fast alternative. MAI-Voice-1 can generate up to 60 seconds of audio in one second and includes tools for creating custom voices. MAI-Image-2, a video-generating model, had previously launched on March 19 through the company’s MAI Playground and is now being rolled out more broadly.

All three models are being made available through Microsoft Foundry, while the transcription and voice models are also accessible in MAI Playground, a testing environment for large language models.

The release comes from Microsoft’s MAI Superintelligence team, led by Microsoft AI CEO Mustafa Suleyman and established in November 2025. The move reflects Microsoft’s effort to deepen its capabilities across multiple AI modalities while continuing to compete with other major AI developers.

“At Microsoft AI, we’re building Humanist AI. We have a distinct view when creating our AI models — putting humans at the center, optimizing for how people actually communicate, training for practical use,” Suleyman wrote in a blog post. “You’ll see more models from us soon in Foundry and directly in Microsoft products and experiences.”

Microsoft is positioning the models in part on cost, with pricing set below some competing offerings. MAI-Transcribe-1 starts at $0.36 per hour, MAI-Voice-1 at $22 per 1 million characters, and MAI-Image-2 at $5 per 1 million input tokens and $33 per 1 million output tokens.

Despite the internal push, the company said it remains committed to its long-standing relationship with OpenAI, which has included more than $13 billion in investment.

This analysis is based on reporting from TechCrunch.

Image courtesy of Unsplash.

This article was generated with AI assistance and reviewed for accuracy and quality.

Last updated: April 2, 2026

About this article: This article was generated with AI assistance and reviewed by our editorial team to ensure it follows our editorial standards for accuracy and independence. We maintain strict fact-checking protocols and cite all sources.

Word count: 329Reading time: 0 minutes

AI Tools for this Article

📧 Stay Updated

Get the latest AI news delivered to your inbox every morning.

Browse All Articles
Share this article:
Next Article

AI News Daily

Breaking Intelligence • Since 2023

Join hundreds of thousands of AI professionals who start their day with our curated newsletter. Get breaking news, expert analysis, and exclusive insights.

Articles Published Daily
Thousands of Monthly Readers

Stay Ahead of AI

Get the latest AI breakthroughs, tools, and insights delivered to your inbox every week.

Free forever Unsubscribe anytime No spam guarantee

Go Premium

Unlock unlimited AI tools and an ad-free reading experience designed for AI professionals.

• Ad-free experience
• Premium AI tools
Start Free Trial

14-day free trial • Cancel anytime
Plus $9/mo • Pro $90/yr (2 months free)

AI News Hub

Breaking Intelligence

Your daily briefing on what matters in AI. Trusted by developers, researchers, executives, and AI enthusiasts worldwide.

Follow Our Community

© 2026 ChatAI. All rights reserved.