Generative media
Ranked by human-preference arena ELO (text-to-speech) · via Artificial Analysis · verified Jun 15, 2026
| # | Model | Creator | ELO | ±95% | Votes | Released |
|---|---|---|---|---|---|---|
| 1 | Fun-Realtime-TTS | Alibaba | 1226 | -16/16 | — | — |
| 2 | Gemini 3.1 Flash TTS | Google | 1215 | -13/13 | — | — |
| 3 | Realtime TTS-2 - Research Preview | Inworld | 1208 | -14/14 | — | — |
| 4 | Sonic 3.5 | Cartesia | 1197 | -15/15 | — | — |
| 5 | xAI Text to Speech | xAI | 1196 | -14/14 | — | — |
| 6 | Realtime TTS 1.5 Max | Inworld | 1195 | -14/14 | — | — |
| 7 | Async Flash v1.5 | async | 1181 | -15/15 | — | — |
| 8 | StepAudio 2.5 TTS | StepFun | 1177 | -16/16 | — | — |
| 9 | Eleven v3 | ElevenLabs | 1176 | -12/12 | — | — |
| 10 | Speech 2.8 HD | MiniMax | 1170 | -12/12 | — | — |
| 11 | Inworld TTS 1 Max | Inworld | 1161 | -13/13 | — | — |
| 12 | Async Pro v1.0 | async | 1159 | -15/15 | — | — |
| 13 | Step TTS 2 (Mar 2026) | StepFun | 1155 | -14/14 | — | — |
| 14 | Speech 2.8 Turbo | MiniMax | 1152 | -12/12 | — | — |
| 15 | Realtime TTS 1.5 Mini | Inworld | 1151 | -12/12 | — | — |
| 16 | Speech 2.6 Turbo | MiniMax | 1137 | -11/11 | — | — |
| 17 | Speech 2.6 HD | MiniMax | 1136 | -12/12 | — | — |
| 18 | SIMBA 3.0 | Speechify | 1129 | -13/13 | — | — |
| 19 | Inworld TTS 1 | Inworld | 1121 | -12/12 | — | — |
| 20 | Azure HD 2.5 | Microsoft | 1121 | -13/13 | — | — |
| 21 | Fish Audio S2 Pro | Fish Audio | 1117 | -14/14 | — | — |
| 22 | Turbo v2.5 | ElevenLabs | 1109 | -10/10 | — | — |
| 23 | Speech-02-HD | MiniMax | 1108 | -12/12 | — | — |
| 24 | Step Audio EditX (Mar 2026) | StepFun | 1107 | -14/14 | — | — |
| 25 | Multilingual v2 | ElevenLabs | 1107 | -10/10 | — | — |
| 26 | Gradium TTS | Gradium | 1098 | -15/15 | — | — |
| 27 | TTS-1 HD | OpenAI | 1096 | -12/12 | — | — |
| 28 | Flash v2.5 | ElevenLabs | 1095 | -11/11 | — | — |
| 29 | Speech-02-Turbo | MiniMax | 1090 | -12/12 | — | — |
| 30 | Lightning V3.1 Pro | Smallest.ai | 1090 | -14/14 | — | — |
| 31 | Chatterbox HD | Resemble AI | 1086 | -13/13 | — | — |
| 32 | TTS-1 | OpenAI | 1086 | -10/10 | — | — |
| 33 | Sonic 3 | Cartesia | 1083 | -12/12 | — | — |
| 34 | Gemini 2.5 Flash Lite TTS | Google | 1082 | -12/12 | — | — |
| 35 | Studio | Google | 1081 | -11/11 | — | — |
| 36 | MiMo-V2.5-TTS | Xiaomi | 1078 | -14/14 | — | — |
| 37 | Voxtral TTS | Mistral | 1077 | -14/14 | — | — |
| 38 | Polly Generative | Amazon | 1072 | -12/12 | — | — |
| 39 | Azure Neural | Microsoft | 1067 | -12/12 | — | — |
| 40 | Polly Long-Form | Amazon | 1064 | -13/13 | — | — |
| 41 | T2A-01-HD | MiniMax | 1064 | -11/11 | — | — |
| 42 | Kokoro 82M v1.0 | Kokoro | 1061 | -11/11 | — | — |
| 43 | Magpie-Multilingual 357M (Feb 2026) | NVIDIA | 1060 | -14/14 | — | — |
| 44 | OpenAudio S1 | Fish Audio | 1060 | -11/11 | — | — |
| 45 | SIMBA 1.6 | Speechify | 1060 | -13/13 | — | — |
| 46 | Octave 2 | Hume AI | 1060 | -12/12 | — | — |
| 47 | Async Flash v1.0 | async | 1059 | -11/11 | — | — |
| 48 | Chirp 3: HD | Google | 1056 | -12/12 | — | — |
| 49 | Maya1 | Maya Research | 1051 | -12/12 | — | — |
| 50 | Journey | Google | 1050 | -13/13 | — | — |
| 51 | Sonic English (Oct 2024) | Cartesia | 1048 | -12/12 | — | — |
| 52 | Coda | Rime | 1044 | -14/14 | — | — |
| 53 | Gemini 2.5 Pro (Dec 2025) | Google | 1037 | -12/12 | — | — |
| 54 | SIMBA 1.0 | Speechify | 1034 | -11/11 | — | — |
| 55 | Gemini 2.5 Flash TTS (Dec 2025) | Google | 1032 | -12/12 | — | — |
| 56 | Lightning v3.1 | Smallest.ai | 1030 | -13/13 | — | — |
| 57 | T2A-01-Turbo | MiniMax | 1028 | -11/11 | — | — |
| 58 | Octave TTS | Hume AI | 1027 | -11/11 | — | — |
| 59 | MiMo-V2-TTS | Xiaomi | 1012 | -14/14 | — | — |
| 60 | Fish Speech 1.5 | Fish Audio | 1008 | -11/11 | — | — |
| 61 | Magpie-Multilingual 357M | NVIDIA | 1008 | -11/11 | — | — |
| 62 | Arcana v3 | Rime | 1006 | -14/14 | — | — |
| 63 | Chatterbox | Resemble AI | 1004 | -11/11 | — | — |
| 64 | MAI-Voice-1 | Microsoft | 1004 | -13/13 | — | — |
| 65 | Zonos-v0.1 | Zyphra | 1000 | 0/0 | — | — |
| 66 | VibeVoice 7B | Microsoft | 968 | -13/13 | — | — |
| 67 | LMNT | LMNT | 967 | -12/12 | — | — |
| 68 | Murf Speech Gen 2 | Murf AI | 967 | -11/11 | — | — |
| 69 | VibeVoice 1.5B | Microsoft | 961 | -13/13 | — | — |
| 70 | OpenVoice v2 | OpenVoice | 960 | -13/13 | — | — |
| 71 | Magpie Multilingual | NVIDIA | 939 | -14/14 | — | — |
| 72 | Neuphonic TTS | Neuphonic | 939 | -13/13 | — | — |
| 73 | Qwen3 TTS Flash | Alibaba | 930 | -14/14 | — | — |
| 74 | Qwen3 TTS | Alibaba | 917 | -13/13 | — | — |
| 75 | XTTS v2 | Coqui | 914 | -15/15 | — | — |
| 76 | WaveNet | Google | 895 | -11/11 | — | — |
| 77 | StyleTTS 2 | StyleTTS | 889 | -15/15 | — | — |
| 78 | Mist V2 | Rime | 881 | -15/15 | — | — |
| 79 | Polly Neural | Amazon | 881 | -14/14 | — | — |
| 80 | Standard | Google | 876 | -11/11 | — | — |
| 81 | Neural2 | Google | 875 | -12/12 | — | — |
| 82 | Noiz TTS | Noiz | 853 | -17/17 | — | — |
| 83 | MetaVoice v1 | MetaVoice | 838 | -18/18 | — | — |
| 84 | Falcon (Beta) | Murf AI | 826 | -15/15 | — | — |
| 85 | Polly Standard | Amazon | 812 | -15/15 | — | — |
ELO ratings come from blind pairwise human votes in the Artificial Analysis arenas; they measure preference, not capability. The ±95% column gives each rating's 95% confidence interval, which shows how settled that rating is. Votes = arena appearances.
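
To make the rating gaps easier to read, here is a minimal Python sketch that turns two ratings into an expected head-to-head preference rate and checks whether their ±95% intervals overlap. It assumes the conventional Elo logistic curve with a 400-point scale; the page does not state which formula Artificial Analysis uses, so treat the numbers as illustrative. The function names `expected_win_rate` and `intervals_overlap` are hypothetical helpers, and interval overlap is only a rough heuristic, not a formal significance test.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise votes won by A over B.

    Assumption: the standard Elo logistic curve with a 400-point scale.
    The leaderboard's exact rating method is not stated in the source.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def intervals_overlap(rating_a: float, half_width_a: float,
                      rating_b: float, half_width_b: float) -> bool:
    """Return True if the two symmetric confidence intervals overlap.

    Overlap suggests the ordering of the two models may not be settled;
    it is a rule of thumb, not a statistical test.
    """
    low_a, high_a = rating_a - half_width_a, rating_a + half_width_a
    low_b, high_b = rating_b - half_width_b, rating_b + half_width_b
    return low_a <= high_b and low_b <= high_a


if __name__ == "__main__":
    # Ranks 1 and 2 from the table: 1226 (±16) vs 1215 (±13).
    p = expected_win_rate(1226, 1215)
    print(f"Expected preference rate for rank 1 over rank 2: {p:.3f}")
    print("Intervals overlap:", intervals_overlap(1226, 16, 1215, 13))

    # Rank 1 vs rank 85: 1226 (±16) vs 812 (±15).
    p = expected_win_rate(1226, 812)
    print(f"Expected preference rate for rank 1 over rank 85: {p:.3f}")
    print("Intervals overlap:", intervals_overlap(1226, 16, 812, 15))
```

Under this assumption, an 11-point gap at the top of the table corresponds to a near coin-flip preference, which is why adjacent ranks with overlapping intervals should not be read as a firm ordering.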