Capability × cost × speed

Frontier

Capability = Modelyst cross-benchmark percentile · price & speed = Artificial Analysis medians · verified Jun 15, 2026 · methodology

League tables, best at: Agents · Code · Instruction Following · Knowledge & QA · Language & Instruction · Long Context · Math · Reasoning · Vision & Multimodal

The efficient frontier — capability vs price

112 models

Legend: on the frontier (nothing is both cheaper and better) · everything else

Up and to the left wins: the gold staircase is the set of models nothing beats on both price and capability at once.
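The staircase rule can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual code: the `Model` type and `efficient_frontier` function are hypothetical names, the seven real rows are copied from the frontier table on this page, and the two "Example" rows are made-up dominated points added only to show the filter at work. Ties are treated as dominated, which is an assumption.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    capability: float  # higher is better
    price: float       # $ per 1M tokens, lower is better


def efficient_frontier(models: List[Model]) -> List[Model]:
    """Keep the models that no other model beats on both price and capability."""
    # Walk from cheapest to most expensive; on a price tie, most capable first.
    ordered = sorted(models, key=lambda m: (m.price, -m.capability))
    frontier: List[Model] = []
    best_capability = float("-inf")
    for m in ordered:
        # A model stays only if it is more capable than everything cheaper.
        if m.capability > best_capability:
            frontier.append(m)
            best_capability = m.capability
    return frontier


models = [
    Model("Sarvam 30B", 38.9, 0.047),
    Model("HyperNova 60B 2605", 69.8, 0.065),
    Model("GPT-5 nano", 72.6, 0.138),
    Model("MiMo-V2-Flash", 87.7, 0.15),
    Model("GPT-5.4 nano", 88.8, 0.463),
    Model("MiMo-V2.5-Pro", 95.5, 0.544),
    Model("Gemini 3.1 Pro Preview", 97.9, 4.5),
    # Hypothetical points, not from the page: each is pricier AND weaker
    # than some model above, so the filter should drop them.
    Model("Example A (made up)", 50.0, 0.20),
    Model("Example B (made up)", 90.0, 1.00),
]

frontier = efficient_frontier(models)
for m in frontier:
    print(f"{m.name}: cap {m.capability}, ${m.price}/1M")
```

Sorting by price first means one pass with a running capability maximum is enough, so the sketch avoids comparing every pair of models.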

The race — capability over time

gold = set a new record on release
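One plausible reading of the gold rule is a running maximum over release dates: a model is marked gold if its capability exceeds that of every model released before it. The sketch below assumes that reading. The `record_setters` function is a hypothetical name, and the release dates and scores are invented placeholders, since this page does not list them.

```python
from datetime import date
from typing import List, Tuple

Release = Tuple[str, date, float]  # (name, release date, capability)


def record_setters(releases: List[Release]) -> List[str]:
    """Return names of models that beat every earlier release on capability."""
    records: List[str] = []
    best_so_far = float("-inf")
    # Process in release order so "earlier" is well defined.
    for name, _released, capability in sorted(releases, key=lambda r: r[1]):
        if capability > best_so_far:  # a tie does not count as a new record
            records.append(name)
            best_so_far = capability
    return records


# Invented placeholder data, for illustration only.
releases = [
    ("model-a", date(2025, 1, 10), 60.0),
    ("model-b", date(2025, 3, 2), 55.0),
    ("model-c", date(2025, 6, 20), 72.0),
    ("model-d", date(2025, 9, 1), 72.0),
    ("model-e", date(2026, 1, 15), 80.0),
]

gold = record_setters(releases)
print(gold)
```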

Google OpenAI Xiaomi Anthropic xAI StepFun

Throughput value — output speed vs price

89 models

On the frontier · 7

cheapest → most capable

Model	Provider	Cap	$/1M tokens	tok/s	Weights
Sarvam 30B	Sarvam	38.9	$0.047	240	—
HyperNova 60B 2605	Multiverse Computing	69.8	$0.065	347	—
GPT-5 nano	OpenAI	72.6	$0.138	162	—
MiMo-V2-Flash	Xiaomi	87.7	$0.15	154	—
GPT-5.4 nano	OpenAI	88.8	$0.463	159	—
MiMo-V2.5-Pro	Xiaomi	95.5	$0.544	43	—
Gemini 3.1 Pro Preview	Google	97.9	$4.5	142	—

Every point links to the model's page, which lists scores with sources, latency, and the research behind it. Any of the models can be compared head-to-head on the Compare page.

Workload cost calculator

Tokens per task × tasks per day → what each model actually costs to run, and how long a task takes.

Input tok / task · Output tok / task · Tasks / day

Model	Provider	Cap	$ / task	$ / day	time / task
Qwen3.5 0.8B	Alibaba	19	$0	$0.175	15s
Gemma 3n E4B Instruct	Google	32	$0.0001	$0.26	8.0s
Qwen3.5 2B	Alibaba	37	$0.0001	$0.35	15s
Sarvam 30B	Sarvam	39	$0.0001	$0.425	2.4s
LFM2 24B A2B	Liquid AI	29	$0.0001	$0.48	2.5s
Gemma 3 4B Instruct	Google	30	$0.0001	$0.52	—
Qwen3.5 4B	Alibaba	65	$0.0001	$0.525	14s
Nova Micro	Amazon	34	$0.0001	$0.56	1.6s
Llama 3.2 Instruct 1B	Meta	16	$0.0001	$0.575	4.0s
HyperNova 60B 2605	Multiverse Computing	70	$0.0001	$0.61	1.4s
NVIDIA-Nemotron-Nano-9B-v2	NVIDIA	49	$0.0001	$0.64	19s
Granite 4.1 8B	IBM	35	$0.0001	$0.65	2.9s

Price arithmetic from live per-token prices (Artificial Analysis medians) — not a measured task benchmark. Time per task = latency + output tokens ÷ speed; ignores caching, rate limits and retries. For reasoning models, latency includes thinking time.
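The arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not the calculator's actual code: the `Pricing` class and `workload` function are hypothetical names, the split into separate input and output prices per 1M tokens is assumed, and the example prices, latency and speed are made up. Only the time formula (latency + output tokens ÷ speed) comes from the note above. Returning `None` when speed or latency is missing is a guess at why some rows show "—".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Pricing:
    input_per_m: float          # $ per 1M input tokens (assumed split)
    output_per_m: float         # $ per 1M output tokens (assumed split)
    latency_s: Optional[float]  # seconds before output starts
    speed_tps: Optional[float]  # output tokens per second


def workload(
    p: Pricing, input_tok: int, output_tok: int, tasks_per_day: int
) -> Tuple[float, float, Optional[float]]:
    """Return ($ per task, $ per day, seconds per task or None)."""
    cost_task = (input_tok * p.input_per_m + output_tok * p.output_per_m) / 1_000_000
    cost_day = cost_task * tasks_per_day
    if p.latency_s is None or not p.speed_tps:
        time_task = None  # no speed or latency data available
    else:
        # Time per task = latency + output tokens / speed.
        # Caching, rate limits and retries are ignored, as on the page.
        time_task = p.latency_s + output_tok / p.speed_tps
    return cost_task, cost_day, time_task


# Made-up prices and speeds, for illustration only.
example = Pricing(input_per_m=0.10, output_per_m=0.40, latency_s=0.5, speed_tps=200.0)
cost_task, cost_day, time_task = workload(
    example, input_tok=1000, output_tok=500, tasks_per_day=1000
)
print(f"${cost_task:.4f} per task, ${cost_day:.2f} per day, {time_task:.1f}s per task")
```

With these placeholder numbers, a task costs (1000 × 0.10 + 500 × 0.40) ÷ 1,000,000 = $0.0003, so 1,000 tasks a day cost $0.30, and each task takes 0.5 + 500 ÷ 200 = 3.0 seconds.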