Text-to-Speech Arena Leaderboard

Speech Arena Trends

The generative audio landscape shifted dramatically in mid-2025. Inworld has claimed the top spot with its "TTS 1 Max" model, which pushes conversational realism further and holds an Elo rating of 1189.

MiniMax remains a dominant force, holding the 2nd and 3rd positions with its "Speech-02" series. Meanwhile, ElevenLabs leads on volume with five models in the top 10, showing consistent results across its Turbo, Flash, and Multilingual model lines.

A notable entry is Kokoro 82M, an open-weights model that performs well above what its size would suggest. With an Elo rating of 1058, it outperforms many proprietary enterprise offerings, a milestone for open-source speech synthesis.
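
To put the rating gap in perspective, the sketch below converts two ratings into an expected head-to-head preference rate. It assumes the arena uses the standard Elo logistic formula with a scale of 400; the page does not state its exact rating method, so the function name `expected_score` and the `scale` parameter are illustrative assumptions rather than the leaderboard's own implementation.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic formula.

    Assumption: scale=400 is the conventional Elo constant, not a value
    confirmed by the leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings quoted in the text above.
tts_1_max = 1189
kokoro_82m = 1058

p = expected_score(tts_1_max, kokoro_82m)
print(f"Expected preference rate for TTS 1 Max over Kokoro 82M: {p:.2f}")
```

Under these assumptions, the 131-point gap means listeners would be expected to prefer TTS 1 Max in roughly two out of three direct comparisons, so the open-weights model would still win a meaningful share of matchups.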

Current Champion: Inworld TTS 1 Max (1189 Elo)

Top Open Weights: Kokoro 82M v1.0 (1058 Elo)

Fastest Rise: MiniMax Speech-02 (+14 Elo)

Legacy Leader: OpenAI TTS-1 (Rank #5)

Text to Speech Models

Global Leaderboard