MODELYST
Capability × cost × speed

Frontier

Capability = Modelyst cross-benchmark percentile · price & speed = Artificial Analysis medians · verified Jun 15, 2026 · methodology
The efficient frontier — capability vs price
140 models
[Scatter plot: capability score (y-axis, 20 to 100) against blended price in USD per 1M tokens (x-axis, log scale), one point per model. Labeled points: Qwen3.7 Max, DeepSeek V4 Flash, Gemma 4 12B, Qwen3.5 4B, Qwen3.5 2B, Gemma 3n E4B Instruct, Qwen3.5 0.8B.]
Legend: on the frontier (nothing is both cheaper and better) · everything else
Up and to the left wins: the gold staircase is the set of models nothing beats on both price and capability at once.
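The frontier rule above (a model stays only if nothing is both cheaper and better) can be sketched as a single sweep over the models sorted by price. This is a minimal illustration, not Modelyst's actual code: the names `Model` and `efficient_frontier` are hypothetical, and the tie handling (at equal price, only the more capable model is kept) is an assumption. The sample points are a small subset of the values shown on this page.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float       # blended price, USD per 1M tokens
    capability: float  # capability score, 0-100


def efficient_frontier(models):
    """Return the models that no other model beats on both price and capability."""
    frontier = []
    best_capability = float("-inf")
    # Sweep from cheapest to most expensive; at equal price, most capable first
    # (assumed tie rule, not confirmed by the source).
    for m in sorted(models, key=lambda m: (m.price, -m.capability)):
        # Keep a model only if it beats every model that costs the same or less.
        if m.capability > best_capability:
            frontier.append(m)
            best_capability = m.capability
    return frontier


# A subset of the points listed on this page (price, capability).
sample = [
    Model("Qwen3.7 Max", 3.75, 96.1),
    Model("GLM-4.7", 1.0, 91),
    Model("DeepSeek V4 Flash", 0.175, 91.4),
    Model("Qwen3.5 4B", 0.06, 65.3),
    Model("Gemma 3 4B Instruct", 0.05, 30),
    Model("Llama 3.2 Instruct 1B", 0.05, 16),
    Model("Qwen3.5 2B", 0.04, 36.6),
    Model("Gemma 3n E4B Instruct", 0.025, 32.2),
    Model("Qwen3.5 0.8B", 0.02, 19.3),
]

frontier_names = [m.name for m in efficient_frontier(sample)]
```

On this subset, GLM-4.7, Gemma 3 4B Instruct and Llama 3.2 Instruct 1B drop out because a cheaper model already scores higher; the rest form the staircase from cheapest to most capable.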
The race — capability over time
gold = set a new record on release
[Scatter plot: capability score (y-axis, 20 to 100) against release date (x-axis, 2024.06 to 2026.06), one point per model. Labeled points: Qwen3.7 Max, GLM-5, Qwen3 235B A22B 2507, DeepSeek R1 0528 (May '25), QwQ 32B, Llama 3.2 Instruct 90B (Vision), Qwen2.5 Instruct 72B.]
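The "set a new record on release" rule for this chart can be sketched as a running maximum over models ordered by release date. This is an illustrative sketch under assumptions: the function name `record_setters` is hypothetical, dates are treated as `YYYY.MM` strings as they appear in the chart labels, and a model must strictly exceed the previous best to count. The page shows rounded scores, so ties in the displayed values may resolve differently on the real data.

```python
def record_setters(releases):
    """Return names of models that exceeded every earlier model's capability.

    `releases` is an iterable of (name, release_date, capability) tuples,
    with release_date as a zero-padded "YYYY.MM" string so that string
    order matches chronological order.
    """
    records = []
    best_so_far = float("-inf")
    # sorted() is stable, so models released in the same month keep input order.
    for name, _date, capability in sorted(releases, key=lambda r: r[1]):
        if capability > best_so_far:
            records.append(name)
            best_so_far = capability
    return records


# A subset of the points listed on this page (release date, capability).
sample_releases = [
    ("Mistral Large (Feb '24)", "2024.02", 44),
    ("Command-R+ (Apr '24)", "2024.03", 36),
    ("Llama 3.1 Instruct 405B", "2024.07", 45),
    ("Llama 3.2 Instruct 90B (Vision)", "2024.09", 52),
    ("DeepSeek V3 (Dec '24)", "2024.12", 51),
    ("QwQ 32B", "2025.02", 65),
    ("Llama 4 Maverick", "2025.03", 54),
    ("DeepSeek R1 0528 (May '25)", "2025.05", 75),
]

gold = record_setters(sample_releases)
```

On this subset, Command-R+, DeepSeek V3 and Llama 4 Maverick are not records because an earlier release already scored higher.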
Throughput value — output speed vs price
114 models
[Scatter plot: output tokens per second (y-axis, 100 to 400) against blended price in USD per 1M tokens (x-axis, log scale), one point per model. Labeled points: Granite 3.3 8B, Granite 4.1 8B, Llama 3.2 Instruct 1B, Gemma 3n E4B Instruct.]

On the frontier · 13

cheapest → most capable
Model                           Maker     Cap    $/1M     tok/s  Weights
Qwen3.5 0.8B                    Alibaba   19.3   $0.02    20     open
Gemma 3n E4B Instruct           Google    32.2   $0.025   40     open
Qwen3.5 2B                      Alibaba   36.6   $0.04    21     open
Qwen3.5 4B                      Alibaba   65.3   $0.06    23     open
gpt-oss-20b                     openai    72.5   $0.088   252    open
NVIDIA Nemotron 3 Nano 30B A3B  NVIDIA    73.1   $0.096   85     open
Qwen3.5 9B                      Alibaba   73.6   $0.113   65     open
Gemma 4 12B                     Google    73.8   $0.15    161    open
DeepSeek V4 Flash               DeepSeek  91.4   $0.175   114    open
MiniMax-M3                      MiniMax   94.6   $0.525   59     open
DeepSeek V4 Pro                 DeepSeek  95.0   $0.544   89     open
Kimi K2.6                       Kimi      96.0   $1.71    46     open
Qwen3.7 Max                     Alibaba   96.1   $3.75    199    open

Every point links to the model's page — scores with sources, latency, and the research behind it. Compare any of them head-to-head on Compare.

Workload cost calculator

Tokens per task × tasks per day → what each model actually costs to run, and how long a task takes.

Model                       Maker                 Cap  $ / task  $ / day  time / task
Qwen3.5 0.8B                Alibaba               19   $0        $0.175   15s
Gemma 3n E4B Instruct       Google                32   $0.0001   $0.26    8.0s
Qwen3.5 2B                  Alibaba               37   $0.0001   $0.35    15s
Sarvam 30B                  Sarvam                39   $0.0001   $0.425   2.4s
LFM2 24B A2B                Liquid AI             29   $0.0001   $0.48    2.5s
Gemma 3 4B Instruct         Google                30   $0.0001   $0.52
Qwen3.5 4B                  Alibaba               65   $0.0001   $0.525   14s
Nova Micro                  Amazon                34   $0.0001   $0.56    1.6s
Llama 3.2 Instruct 1B       Meta                  16   $0.0001   $0.575   4.0s
HyperNova 60B 2605          Multiverse Computing  70   $0.0001   $0.61    1.4s
NVIDIA-Nemotron-Nano-9B-v2  nvidia                49   $0.0001   $0.64    19s
Granite 4.1 8B              IBM                   35   $0.0001   $0.65    2.9s

Price arithmetic from live per-token prices (Artificial Analysis medians) — not a measured task benchmark. Time per task = latency + output tokens ÷ speed; ignores caching, rate limits and retries. For reasoning models, latency includes thinking time.
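The arithmetic described above can be sketched as follows. The time formula (latency + output tokens ÷ speed) is taken from the note; the cost formula assumes separate input and output prices per 1M tokens, which the page does not spell out. All names (`Workload`, `ModelPricing`, and the functions) and the numbers in the example are hypothetical and do not correspond to any model listed here.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    input_tokens: int    # prompt tokens per task
    output_tokens: int   # generated tokens per task
    tasks_per_day: int


@dataclass
class ModelPricing:
    input_usd_per_1m: float   # assumed: price per 1M input tokens
    output_usd_per_1m: float  # assumed: price per 1M output tokens
    latency_s: float          # time before output starts, in seconds
    output_tok_per_s: float   # output speed, tokens per second


def cost_per_task(m: ModelPricing, w: Workload) -> float:
    """USD per task from per-token prices (no caching, retries or rate limits)."""
    total = w.input_tokens * m.input_usd_per_1m + w.output_tokens * m.output_usd_per_1m
    return total / 1_000_000


def cost_per_day(m: ModelPricing, w: Workload) -> float:
    """USD per day for the whole workload."""
    return cost_per_task(m, w) * w.tasks_per_day


def time_per_task(m: ModelPricing, w: Workload) -> float:
    """Seconds per task: latency + output tokens / output speed."""
    return m.latency_s + w.output_tokens / m.output_tok_per_s


# Illustrative values only.
workload = Workload(input_tokens=2000, output_tokens=500, tasks_per_day=1000)
model = ModelPricing(input_usd_per_1m=0.10, output_usd_per_1m=0.40,
                     latency_s=0.5, output_tok_per_s=100.0)

per_task = cost_per_task(model, workload)   # about 0.0004 USD
per_day = cost_per_day(model, workload)     # about 0.40 USD
seconds = time_per_task(model, workload)    # 0.5 + 500 / 100 = 5.5 s
```

Because output tokens are divided by speed, a slow model can cost almost nothing per day and still take many seconds per task, which is the pattern visible in the cheapest rows of the table.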