The score ledger
Changes
Every benchmark score on Modelyst is tracked in an append-only ledger. When a number moves between refreshes — a re-evaluation, a silent endpoint update, a harness change upstream — it lands here, with the old value, the new value, and the date. Last refresh: Jun 15, 2026.
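As a rough illustration of what "append-only" means here, the sketch below records an entry only when a score is seen for the first time or differs from the last recorded value, and never edits or deletes earlier entries. It is not Modelyst's implementation: the `LedgerEntry` and `ScoreLedger` names, their fields, and the model and benchmark names in the usage lines are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    """One immutable ledger row (hypothetical shape, not Modelyst's schema)."""
    model: str
    benchmark: str
    old_value: Optional[float]  # None marks a first observation
    new_value: float
    observed_on: date


class ScoreLedger:
    """Append-only log: entries are added, never edited or removed."""

    def __init__(self):
        self._entries = []   # full history, in insertion order
        self._latest = {}    # (model, benchmark) -> last recorded value

    def record(self, model, benchmark, value, observed_on):
        """Append an entry if the score is new or has moved; otherwise do nothing."""
        key = (model, benchmark)
        old = self._latest.get(key)
        if old is not None and old == value:
            return None  # unchanged between refreshes, nothing to log
        entry = LedgerEntry(model, benchmark, old, value, observed_on)
        self._entries.append(entry)
        self._latest[key] = value
        return entry

    def entries(self):
        """Return a read-only snapshot of the history."""
        return tuple(self._entries)


# Usage with made-up names and values
ledger = ScoreLedger()
ledger.record("example-model", "example-bench", 41.2, date(2026, 6, 1))   # first observation
ledger.record("example-model", "example-bench", 41.2, date(2026, 6, 8))   # no change, skipped
ledger.record("example-model", "example-bench", 40.8, date(2026, 6, 15))  # change logged
```

Keeping the old value on each entry means a reader can see the size of a move without replaying the whole history.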
Capability · price · speed drift — last 30 days
LFM2.5-8B-A1B · cap -3.7 · +227 tok/s
Molmo2-8B · cap -0.4
Olmo 3.1 32B Instruct · cap +0.4
LFM2.5-1.2B-Instruct · cap +0.3 · +606 tok/s
LFM2 24B A2B · cap -0.2 · +23 tok/s
Qwen3 4B · cap +0.2
LFM2.5-VL-1.6B · cap +0.2 · +553 tok/s
Qwen3 1.7B · cap -0.2
Olmo 3 7B Instruct · cap -0.2
Granite 4.1 30B · cap -0.2
Qwen3 VL 30B A3B Instruct · cap -0.2 · +10 tok/s
LFM2 8B A1B · cap -0.2
Jun 10, 2026
2 changes · 20 first observations
Source of each value: see the score's provenance on its model page. Scores via Artificial Analysis are medians across providers; changes can reflect re-evaluation, endpoint updates, or methodology changes upstream.
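To see why a median across providers can move when only one endpoint changes, here is a small sketch. The provider names and scores are invented for illustration, and `median_score` is a hypothetical helper, not part of any Modelyst or Artificial Analysis API.

```python
from statistics import median


def median_score(provider_scores):
    """Median of one model's score across providers (hypothetical helper)."""
    if not provider_scores:
        raise ValueError("need at least one provider score")
    return median(provider_scores.values())


# Made-up scores for one model on one benchmark
before = {"provider-a": 61.0, "provider-b": 63.0, "provider-c": 70.0}
# Same providers after provider-c's endpoint changes
after = {"provider-a": 61.0, "provider-b": 63.0, "provider-c": 50.0}

delta = median_score(after) - median_score(before)
```

In this made-up case the reported score drops from 63.0 to 61.0 even though two of the three providers returned identical numbers, which is the kind of movement the note above attributes to endpoint updates.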