Q2 2026 Market Update

Text-to-Speech Models

Explore our curated collection of Text-to-Speech resources. Compare performance, ELO ratings, and stability across top Speech AI providers.

Speech Arena Trends

The landscape of generative audio shifted dramatically in mid-2025. Inworld has claimed the top spot with its "TTS 1 Max" model, which pushes the boundaries of conversational realism and holds an ELO of 1189.

MiniMax continues to be a dominant force, securing the 2nd and 3rd positions with its "Speech-02" series. Meanwhile, ElevenLabs remains the volume leader with five models in the top 10, showing consistent results across its Turbo, Flash, and Multilingual model lines.

A notable entry is Kokoro 82M, an open-weights model that punches significantly above its weight class. With an ELO of 1058, it outperforms many proprietary enterprise solutions, marking a pivotal moment for open-source speech synthesis.
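
To put the gap between 1189 and 1058 in perspective, here is a minimal sketch of the standard Elo expected-score formula. It assumes the arena uses the conventional logistic Elo model with a 400-point scale, which this page does not confirm; the function name `expected_score` and its `scale` parameter are illustrative only.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by A against B.

    Uses the conventional Elo logistic curve. The 400-point scale is an
    assumption; the leaderboard may use a different rating system.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the text above.
inworld_tts_1_max = 1189
kokoro_82m = 1058

p = expected_score(inworld_tts_1_max, kokoro_82m)
print(f"Expected preference rate for TTS 1 Max over Kokoro 82M: {p:.2f}")
```

Under these assumptions, the 131-point gap corresponds to the top model being preferred in roughly two out of three pairwise comparisons, so the open-weights model would still win a meaningful share of matchups.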

Current Champion: Inworld TTS 1 Max (1189 ELO)
Top Open Weights: Kokoro 82M v1.0 (1058 ELO)
Fastest Rise: MiniMax Speech-02 (+14 ELO)
Legacy Leader: OpenAI TTS-1 (Rank #5)
Chart: Top 10 Models by ELO Rating
Chart: Market Presence (Top 30 Models)

Global Leaderboard

Updated: June 2026
Rank | Model Name | ELO Score | Release Date